Concentration of the multinomial in Kullback-Leibler divergence near the ratio of alphabet and sample sizes
We bound the moment generating function of the Kullback-Leibler divergence between the empirical distribution of n independent samples from a distribution over a finite alphabet of size k (e.g. a multinomial distribution) and the underlying distribution via a simple reduction to the case of a binary alphabet (e.g. a binomial distribution). The resulting concentration inequality becomes meaningful (less than 1) once the deviation ε exceeds (k-1)/n by a constant factor, whereas the standard method of types bound requires ε > (k-1)/n · log(1 + n/(k-1)).
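A quick Monte Carlo sketch (not from the paper; the uniform source distribution and all parameter values are illustrative assumptions) shows the scale at which this concentration kicks in: the KL divergence between the empirical and true distributions is typically on the order of (k-1)/(2n), consistent with bounds that become meaningful once ε exceeds (k-1)/n by a constant factor.

```python
import numpy as np

# Illustrative sketch: empirically, KL(empirical || true) for n samples
# from a k-letter alphabet concentrates on the order of (k-1)/n.
rng = np.random.default_rng(0)
k, n, trials = 10, 1000, 2000
p = np.full(k, 1.0 / k)  # assumed uniform source distribution

kls = []
for _ in range(trials):
    counts = rng.multinomial(n, p)  # multinomial empirical counts
    q = counts / n                  # empirical distribution
    mask = q > 0                    # terms with q_i = 0 contribute 0
    kls.append(np.sum(q[mask] * np.log(q[mask] / p[mask])))

mean_kl = np.mean(kls)
ratio = (k - 1) / n
print(mean_kl, ratio)  # mean KL is near (k-1)/(2n), below (k-1)/n
```

The comparison with (k-1)/n reflects the chi-squared heuristic that 2n·KL is approximately χ² with k-1 degrees of freedom, so the typical KL value is about (k-1)/(2n).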