In mathematics and information theory, Sanov's theorem gives a bound on the probability of observing an atypical sequence of samples from a given probability distribution. In the language of large deviations theory, Sanov's theorem identifies the rate function for large deviations of the empirical measure of a sequence of i.i.d. random variables.

Let A be a set of probability distributions over an alphabet X, and let q be an arbitrary distribution over X (where q may or may not be in A). Suppose we draw n i.i.d. samples from q, represented by the vector . Then, we have the following bound on the probability that the empirical measure of the samples falls within the set A:

,

where

  • is the joint probability distribution on , and
  • is the information projection of q onto A.

In words, the probability of drawing an atypical distribution is bounded by a function of the KL divergence from the true distribution to the atypical one; in the case that we consider a set of possible atypical distributions, there is a dominant atypical distribution, given by the information projection.

Furthermore, if A is a closed set, then

Technical statement

edit

Define:

  •   is a finite set with size  . Understood as “alphabet”.
  •   is the simplex spanned by the alphabet. It is a subset of  .
  •   is a random variable taking values in  . Take   samples from the distribution  , then   is the frequency probability vector for the sample.
  •   is the space of values that   can take. In other words, it is

 Then, Sanov's theorem states:[1]

  • For every measurable subset  , 
  • For every open subset  , 

Here,   means the interior, and   means the closure.

References

edit
  1. ^ Dembo, Amir; Zeitouni, Ofer (2010). "Large Deviations Techniques and Applications". Stochastic Modelling and Applied Probability. 38: 16–17. doi:10.1007/978-3-642-03311-7. ISBN 978-3-642-03310-0. ISSN 0172-4568.
  • Sanov, I. N. (1957) "On the probability of large deviations of random variables". Mat. Sbornik 42(84), No. 1, 11–44.
  • Санов, И. Н. (1957) "О вероятности больших отклонений случайных величин". МАТЕМАТИЧЕСКИЙ СБОРНИК' 42(84), No. 1, 11–44.