Typical set
In information theory, the typical set is a set of sequences whose probability is close to <math>2^{-nH(X)}</math>, where <math>n</math> is the sequence length and <math>H(X)</math> is the entropy of the source distribution. That this set has total probability close to one is a consequence of the asymptotic equipartition property (AEP), which is a kind of law of large numbers.
This is of great use in compression theory, as it provides a theoretical means for compressing data: any sequence <math>X^n</math> can be represented using <math>nH(X)</math> bits on average, which justifies the use of entropy as a measure of information from a source.
The AEP can also be proven for a large class of stationary ergodic processes, allowing the typical set to be defined in more general cases.
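As a numerical illustration of the i.i.d. case (the Bernoulli parameter and sample counts below are arbitrary choices, not from any source), the following sketch draws random binary sequences and checks that the per-symbol surprisal <math>-\tfrac{1}{n}\log_2 p(x_1, \ldots, x_n)</math> concentrates around <math>H(X)</math> as <math>n</math> grows:
<syntaxhighlight lang="python">
# A minimal sketch of AEP concentration for an i.i.d. Bernoulli(p)
# source; p and the sample counts are illustrative assumptions.
import math
import random

p = 0.3
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)   # entropy H(X)

def empirical_rate(n):
    """Return -(1/n) log2 p(x_1, ..., x_n) for one random sequence."""
    x = [1 if random.random() < p else 0 for _ in range(n)]
    log_p = sum(math.log2(p) if xi else math.log2(1 - p) for xi in x)
    return -log_p / n

for n in (10, 100, 10_000):
    rates = [empirical_rate(n) for _ in range(1000)]
    print(f"n={n:6d}  mean={sum(rates)/len(rates):.4f}  H(X)={H:.4f}")
</syntaxhighlight>
As <math>n</math> grows, the per-symbol surprisal clusters ever more tightly around <math>H(X)</math>; the sequences in that cluster are exactly the typical ones.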
(Weakly) typical sequences
If a sequence <math>x_1, \ldots, x_n</math> is drawn i.i.d. from a distribution <math>X</math>, then the typical set <math>{A_\epsilon}^{(n)}</math> is defined as the set of sequences satisfying:
- <math>2^{-n(H(X)+\epsilon)} \leq p(x_1, x_2, \ldots, x_n) \leq 2^{-n(H(X)-\epsilon)}</math>
Equivalently, the probability of the sequence must lie within a factor of <math>2^{n\epsilon}</math> of <math>2^{-nH(X)}</math>.
For sufficiently large <math>n</math>, <math>\epsilon</math> can be chosen arbitrarily small so that the typical set has the following properties (checked numerically in the sketch after the list):
- The probability of a sequence drawn from <math>X</math> belonging to <math>{A_\epsilon}^{(n)}</math> is greater than <math>1-\epsilon</math>
- <math>\left| {A_\epsilon}^{(n)} \right| \leq 2^{n(H(X)+\epsilon)}</math>
- <math>\left| {A_\epsilon}^{(n)} \right| \geq (1-\epsilon)2^{n(H(X)-\epsilon)}</math>
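For a toy source these three properties can be checked exhaustively. The sketch below (assuming a Bernoulli(0.3) source and a block length small enough to enumerate all <math>2^n</math> sequences; all parameters are illustrative) builds <math>{A_\epsilon}^{(n)}</math> directly from the definition and prints each bound:
<syntaxhighlight lang="python">
# Exhaustive check of the typical set properties for a tiny source.
import itertools
import math

p, n, eps = 0.3, 12, 0.25
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def prob(x):
    k = sum(x)                        # number of ones in the block
    return p ** k * (1 - p) ** (n - k)

lo, hi = 2 ** (-n * (H + eps)), 2 ** (-n * (H - eps))
typical = [x for x in itertools.product((0, 1), repeat=n)
           if lo <= prob(x) <= hi]

print(f"P(A) = {sum(map(prob, typical)):.3f}  (want > 1 - eps = {1 - eps})")
print(f"|A| = {len(typical)} <= 2^(n(H+eps)) = {2 ** (n * (H + eps)):.0f}")
print(f"|A| = {len(typical)} >= (1-eps) 2^(n(H-eps)) = {(1 - eps) * 2 ** (n * (H - eps)):.0f}")
</syntaxhighlight>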
For a general stochastic process <math>\{X(t)\}</math> satisfying the AEP, the (weakly) typical set can be defined similarly, with <math>p(x_1, x_2, \ldots, x_n)</math> replaced by <math>p(x_0^\tau)</math> (i.e. the probability of the sample restricted to the time interval <math>[0,\tau]</math>), <math>n</math> being the number of degrees of freedom of the process in that interval, and <math>H(X)</math> being the entropy rate. If the process is continuous-valued, differential entropy is used instead.
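A sketch of this more general setting (the two-state transition matrix and stationary distribution are arbitrary illustrative choices): for a stationary Markov chain, the entropy rate <math>h = -\sum_i \pi_i \sum_j P_{ij} \log_2 P_{ij}</math> plays the role of <math>H(X)</math>, and the same concentration holds along sample paths:
<syntaxhighlight lang="python">
# Concentration of -(1/n) log2 p for a stationary two-state Markov chain.
import math
import random

P = [[0.9, 0.1],
     [0.4, 0.6]]              # transition probabilities P[i][j]
pi = [0.8, 0.2]               # stationary distribution (pi P = pi)
h = -sum(pi[i] * P[i][j] * math.log2(P[i][j])
         for i in range(2) for j in range(2))

def sample_rate(n):
    """-(1/n) log2 p(x_1, ..., x_n) along one stationary sample path."""
    s = 0 if random.random() < pi[0] else 1
    log_p = math.log2(pi[s])
    for _ in range(n - 1):
        t = 0 if random.random() < P[s][0] else 1
        log_p += math.log2(P[s][t])
        s = t
    return -log_p / n

for n in (100, 10_000):
    mean = sum(sample_rate(n) for _ in range(200)) / 200
    print(f"n={n:6d}  mean={mean:.4f}  entropy rate={h:.4f}")
</syntaxhighlight>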
Strongly typical sequences
Jointly typical sequences
Two sequences <math>x_1^n</math> and <math>y_1^n</math> are jointly ε-typical if the pair <math>(x_1^n,y_1^n)</math> is ε-typical with respect to the joint distribution <math>p(x_1^n,y_1^n)</math> and both <math>x_1^n</math> and <math>y_1^n</math> are ε-typical with respect to their marginal distributions <math>p(x_1^n)</math> and <math>p(y_1^n)</math>. The set of all such pairs of sequences <math>(x_1^n,y_1^n)</math> is denoted by <math>A_{\epsilon}^n(X,Y)</math>. Jointly ε-typical n-tuple sequences are defined similarly.
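This definition translates directly into a test. The sketch below (function names and the binary joint distribution in the usage example are illustrative assumptions, not from the article) checks all three typicality conditions against a joint pmf and its marginals:
<syntaxhighlight lang="python">
# Joint eps-typicality test for a pair of sequences under p(x, y).
import math

def is_typical(seq_prob, n, H, eps):
    """Weak eps-typicality: 2^{-n(H+eps)} <= p(seq) <= 2^{-n(H-eps)}."""
    return 2 ** (-n * (H + eps)) <= seq_prob <= 2 ** (-n * (H - eps))

def entropy(pmf):
    """Entropy in bits of a pmf given as a dict of probabilities."""
    return -sum(q * math.log2(q) for q in pmf.values() if q > 0)

def jointly_typical(xs, ys, p_xy, eps):
    """Check joint eps-typicality of (xs, ys) w.r.t. p_xy = {(x, y): prob}."""
    n = len(xs)
    p_x, p_y = {}, {}                 # marginal distributions
    for (x, y), q in p_xy.items():
        p_x[x] = p_x.get(x, 0.0) + q
        p_y[y] = p_y.get(y, 0.0) + q
    # probability of each sequence under the joint and marginal laws
    pxy_seq = math.prod(p_xy[(x, y)] for x, y in zip(xs, ys))
    px_seq = math.prod(p_x[x] for x in xs)
    py_seq = math.prod(p_y[y] for y in ys)
    return (is_typical(pxy_seq, n, entropy(p_xy), eps)
            and is_typical(px_seq, n, entropy(p_x), eps)
            and is_typical(py_seq, n, entropy(p_y), eps))

# toy usage: a correlated binary pair
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
print(jointly_typical([0, 1, 0, 1, 1, 0], [0, 1, 0, 0, 1, 0], p_xy, eps=0.4))
</syntaxhighlight>
The marginal checks matter: a pair can satisfy the joint condition while one coordinate alone is atypical, so all three conditions are tested.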
Applications of typicality
Typical set encoding
In communication, typical set encoding encodes only the typical set of a stochastic source with fixed-length block codes. By the AEP, such encoding is asymptotically lossless and achieves the minimum rate, equal to the entropy rate of the source.
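A minimal sketch of this scheme for the same toy Bernoulli source (the flag-bit framing for atypical sequences is one common textbook choice, assumed here for concreteness): typical sequences are sent as fixed-length indices of roughly <math>n(H(X)+\epsilon)</math> bits, atypical ones verbatim, so the code is lossless and its rate approaches <math>H(X)</math> as <math>n</math> grows:
<syntaxhighlight lang="python">
# Typical set encoding for a toy Bernoulli(p) source; all parameters
# are illustrative. Typical sequences get a fixed-length index; rare
# atypical sequences are sent verbatim behind a flag bit.
import itertools
import math

p, n, eps = 0.3, 20, 0.1
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def prob(x):
    k = sum(x)                        # number of ones in the block
    return p ** k * (1 - p) ** (n - k)

lo, hi = 2 ** (-n * (H + eps)), 2 ** (-n * (H - eps))
typical = [x for x in itertools.product((0, 1), repeat=n)
           if lo <= prob(x) <= hi]
index = {x: i for i, x in enumerate(typical)}
bits = math.ceil(math.log2(len(typical)))   # ~ n(H + eps) index bits

def encode(x):
    # flag '0': typical, send the index; flag '1': atypical, send raw.
    # Lossless by construction; the rate tends to H(X) as n grows.
    if tuple(x) in index:
        return "0" + format(index[tuple(x)], f"0{bits}b")
    return "1" + "".join(map(str, x))

print(f"{bits} index bits (+1 flag) vs {n} raw bits; H(X) = {H:.3f}")
</syntaxhighlight>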
Typical set decoding
In communication, typical set decoding is used in conjunction with random coding to estimate the transmitted message as the one whose codeword is jointly ε-typical with the observation; i.e.,
- <math>\hat{w}=w \iff (\exists!w)( (x_1^n(w),y_1^n)\in A_{\epsilon}^n(X,Y)) </math>
where <math>\hat{w},x_1^n(w),y_1^n</math> are the message estimate, the codeword of message <math>w</math>, and the observation, respectively. <math>A_{\epsilon}^n(X,Y)</math> is defined with respect to the joint distribution <math>p(x_1^n)p(y_1^n|x_1^n)</math>, where <math>p(y_1^n|x_1^n)</math> is the transition probability characterizing the channel statistics and <math>p(x_1^n)</math> is some input distribution used to generate the codewords in the random codebook.
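A toy sketch of this rule over a binary symmetric channel (the crossover probability, block length, and codebook size are illustrative assumptions; it reuses jointly_typical from the sketch above): the decoder searches a random codebook for the unique message whose codeword is jointly ε-typical with the observation:
<syntaxhighlight lang="python">
# Jointly typical decoding over a binary symmetric channel (BSC).
import random

q, n, eps = 0.1, 200, 0.2     # channel flip probability, block length
M = 4                         # number of messages in the random codebook
random.seed(0)
codebook = {w: [random.randint(0, 1) for _ in range(n)] for w in range(M)}

# joint law p(x) p(y|x): uniform input through the BSC
p_xy = {(x, y): 0.5 * (q if x != y else 1 - q)
        for x in (0, 1) for y in (0, 1)}

def transmit(x):
    """Pass codeword x through the BSC, flipping each bit w.p. q."""
    return [xi ^ (random.random() < q) for xi in x]

def decode(y):
    # w-hat = w iff w is the *unique* message whose codeword is jointly
    # eps-typical with the observation y; otherwise declare an error.
    hits = [w for w in codebook
            if jointly_typical(codebook[w], y, p_xy, eps)]
    return hits[0] if len(hits) == 1 else None

w = 2
print("sent:", w, "decoded:", decode(transmit(codebook[w])))
</syntaxhighlight>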
Universal null-hypothesis testing
Universal channel code
See also
- algorithmic complexity theory