
Entropy

Think of entropy as a measure of disorder:
1.
If you have no idea what will happen, it should be maximal.
2.
If you know exactly what is going to happen, it should be zero.
3.
If events are independent, the entropy should be the sum of the individual entropies.

\begin{displaymath}H(X)=-\sum_{i=1}^N p_i\log p_i \mbox{\qquad or \qquad} \overbrace{-\sum_{i=1}^{\infty} p_i\log p_i}^{\mbox{countably infinite}}\end{displaymath}
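A quick numerical sanity check of the three properties above (a rough Python sketch; the helper name entropy is just for illustration):

import math

def entropy(p, base=2):
    """H(X) = -sum_i p_i log p_i for a probability vector p."""
    return -sum(pi * math.log(pi, base) for pi in p if pi > 0)

# 1. No idea what happens: the uniform distribution maximizes H (log2 4 = 2 bits)
print(entropy([0.25, 0.25, 0.25, 0.25]))        # 2.0
# 2. Certainty gives zero entropy (0 log 0 is taken as 0)
print(entropy([1.0, 0.0]))                      # 0 bits
# 3. Independent events add: H of the product distribution equals H(X) + H(Y)
px, py = [0.5, 0.5], [0.25, 0.75]
pxy = [a * b for a in px for b in py]
print(entropy(pxy), entropy(px) + entropy(py))  # both about 1.811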

For the three cases from the thought experiment:
I.
$H(X)=s$ bits (use $\log_2$)
II.
$H(X)=1$ bit
III.

\begin{eqnarray*}H(X)&=&-\sum_{n=0}^{\infty}(1-\alpha)\alpha^n\log\left[(1-\alpha)\alpha^n\right]\\
&=&-\log(1-\alpha)\sum_{n=0}^{\infty}(1-\alpha)\alpha^n-\log\alpha\sum_{n=0}^{\infty}(1-\alpha)\alpha^n n\\
&=& -\log(1-\alpha)-\frac{\alpha}{1-\alpha}\log\alpha
\end{eqnarray*}


$\alpha=1/2\qquad\rightarrow\qquad H(X)=2$ bits
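A quick numerical check of this closed form for the geometric source $p_n=(1-\alpha)\alpha^n$ (a rough Python sketch; the function names are just for illustration):

import math

def H_direct(alpha, terms=5000):
    """Sum -p_n log2 p_n directly for p_n = (1 - alpha) * alpha^n."""
    total = 0.0
    for n in range(terms):
        p = (1 - alpha) * alpha ** n
        if p > 0:
            total -= p * math.log2(p)
    return total

def H_closed(alpha):
    """Closed form: -log2(1 - alpha) - alpha/(1 - alpha) * log2(alpha)."""
    return -math.log2(1 - alpha) - alpha / (1 - alpha) * math.log2(alpha)

for a in (0.25, 0.5, 0.9):
    print(a, round(H_direct(a), 6), round(H_closed(a), 6))
# At alpha = 1/2 both give 2.0 bits, matching the value quoted above.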

Idea
Entropy might have something to do with how compactly you can code the output of a source.

Def.
The self-information of source digit $a_j$ is $I(p_j)=-\log p_j$.

Def.
Entropy is the expected self-information: $H(X)=E[I(p_j)]=-\sum_j p_j \log p_j$
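To make the expected-self-information reading concrete, a small Python sketch with illustrative probabilities:

import math

p = [0.5, 0.25, 0.25]                       # p_j for source digits a_j
I = [-math.log2(pj) for pj in p]            # self-information I(p_j) = -log2 p_j
H = sum(pj * Ij for pj, Ij in zip(p, I))    # H(X) = E[I(p_j)]
print(I)   # [1.0, 2.0, 2.0] bits
print(H)   # 1.5 bits

Rarer digits carry more self-information, which is why giving common digits short codewords is attractive for compact coding.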

Def.
Joint entropy

\begin{displaymath}H(X,Y)=-\sum_{x,y} p(x,y)\log p(x,y) \end{displaymath}

In general

\begin{displaymath}H(\underline{X}) = -\sum_{\underline{x}} p(x_1,x_2,\ldots,x_n)\log p(x_1,\ldots,x_n)\end{displaymath}
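A direct computation of joint entropy from a small joint pmf (a Python sketch; the numbers are made up for illustration):

import math

# Made-up joint pmf p(x,y) over X in {0,1}, Y in {0,1}
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

def joint_entropy(p):
    """H(X,Y) = -sum_{x,y} p(x,y) log2 p(x,y)."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)

print(joint_entropy(p_xy))   # about 1.85 bits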

Show using $\log x\le x-1 $ that $H(X)\le \log N$

\begin{eqnarray*}H(X)-\log N &=& \sum_r p_r \log\frac{1}{p_r N}\\
&\le& \sum_r p_r\left(\frac{1}{p_r N}-1\right)\\
&=& \sum_r\frac{1}{N}-\sum_r p_r \;=\; 1-1 \;=\; 0
\end{eqnarray*}

so $H(X)\le\log N$, with equality iff $p_r=1/N$ for all $r$.
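A quick numerical spot check of the bound $H(X)\le\log N$ on randomly drawn distributions (a Python sketch):

import math
import random

def entropy(p):
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

N = 8
for _ in range(5):
    w = [random.random() for _ in range(N)]
    p = [wi / sum(w) for wi in w]
    print(round(entropy(p), 4), "<=", math.log2(N), entropy(p) <= math.log2(N) + 1e-12)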

Def.
Conditional entropy

\begin{displaymath}H(X\vert Y)=-\sum_{x,y} p(x,y) \log p(x\vert y) \end{displaymath}

and from the decomposition $p(x,y)=p(x)p(y\vert x)$

\begin{displaymath}H(X\vert Y)=-\sum_{x,y} p(x)p(y\vert x)\log p(x\vert y)\end{displaymath}

In general

\begin{displaymath}H(X_n\vert X_{n-1}\ldots X_1)=-\sum_{\underline{x}}p(\underline{x})\log p(x_n\vert x_{n-1}\ldots x_1)\end{displaymath}

which implies that

\begin{displaymath}H(\underline{X})=H(X_1)+H(X_2\vert X_1)+\cdots+H(X_n\vert X_{n-1}\ldots X_1)\end{displaymath}

Usually seen as

H(X,Y)=H(X)+H(Y|X)=H(Y)+H(X|Y)
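A numerical check of this decomposition on a small made-up joint pmf (a Python sketch; H, p_xy, etc. are illustrative names):

import math

p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

def H(p):
    """Entropy of a pmf given as a dict of probabilities."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)

# Marginals p(x) and p(y)
p_x, p_y = {}, {}
for (x, y), v in p_xy.items():
    p_x[x] = p_x.get(x, 0.0) + v
    p_y[y] = p_y.get(y, 0.0) + v

# Conditional entropies, directly from the definitions:
# H(Y|X) = -sum p(x,y) log2 p(y|x) with p(y|x) = p(x,y)/p(x), similarly H(X|Y)
H_Y_given_X = -sum(v * math.log2(v / p_x[x]) for (x, y), v in p_xy.items())
H_X_given_Y = -sum(v * math.log2(v / p_y[y]) for (x, y), v in p_xy.items())

print(H(p_xy))                 # H(X,Y)
print(H(p_x) + H_Y_given_X)    # H(X) + H(Y|X), same value
print(H(p_y) + H_X_given_Y)    # H(Y) + H(X|Y), same value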

Notice that if X and Y are independent, then H(X,Y)=H(X)+H(Y). Why? (Independence gives p(x|y)=p(x), so H(X|Y)=H(X).)


Christopher Rose
1999-02-24