Entropy

Entropy is the amount of uncertainty of a random variable, expressed in bits. Imagine Alice has a random variable and she needs to communicate the outcome over a digital binary channel to Bob. In the previous article I discussed the case of a fair coin, so let's make it a bit more complicated here and use a 4-sided dice, i.e. let's say it's a fair tetrahedron, so each side comes up with $p = \frac{1}{4}$. What is a good encoding that minimizes the average number of bits she sends? Since all four outcomes are equally likely, she can do no better than two bits per roll (e.g. 00, 01, 10, 11). In general, suppose the probabilities of a finite sequence of events are given by the probability distribution $P = (p_1, \ldots, p_n)$; the entropy is then $H(P) = -\sum_i p_i \log_2 p_i$, which for the fair tetrahedron gives $\log_2 4 = 2$ bits.

Relative Entropy

The relative entropy, also known as the Kullback-Leibler divergence, between two probability distributions on a random variable is a measure of the distance between them. Formally, given two probability distributions $p(x)$ and $q(x)$ over a discrete random variable $X$, the relative entropy $D(p \| q)$ is defined as follows:

$D(p \| q) = \sum_x p(x) \log \frac{p(x)}{q(x)}$

Mutual Information

The mutual information of two random variables $X$ and $Y$ is defined as

$I(X, Y) = \sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(x) p(y)}$

If you compare this to the relative entropy formula above, it's the same with $p = p(x, y)$ and $q = p(x) p(y)$. It follows trivially from the definition that mutual information is symmetric, $I(X, Y) = I(Y, X)$.

What if $X$ and $Y$ are independent? In that case $I(X, Y) = 0$, because $p(x, y) = p(x) p(y)$ and $\log \frac{p(x, y)}{p(x) p(y)} = \log 1 = 0$.

If $X$ and $Y$ completely determine one another, e.g. $X = Y + 1$, then one contains all the information about the other, so $I(X, Y) = H(X) = H(Y)$. This is because in such a case only certain $p(x, y)$ combinations are non-zero (e.g. when $x = y + 1$), and for these non-zero cases $p(x, y) = p(x) = p(y)$, so $I(X, Y) = -\sum_x p(x) \log p(x) = H(X)$.

The following image explains the relationship between entropy, conditional entropy, joint entropy and mutual information.

![Relationship between entropy, conditional entropy, joint entropy and mutual information]()

Mutual information does not have a useful interpretation in terms of channel coding.

Conclusion

Everything I cover here is introductory information theory, mostly found in the first chapter of the classic Cover, Thomas: Elements of Information Theory or the wikipedia pages linked above. I plan to write a follow-up post to give examples of using these metrics in Data Science and Machine Learning.
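In the meantime, here is a minimal Python sketch of the three quantities discussed above, using the fair tetrahedron and the independence / $X = Y + 1$ examples from this post. The function names `entropy`, `kl_divergence` and `mutual_information` are just illustrative, not from any particular library.

```python
import numpy as np

def entropy(p):
    """H(P) = -sum_i p_i * log2(p_i), in bits (terms with p_i = 0 are dropped)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def kl_divergence(p, q):
    """Relative entropy D(p||q) = sum_x p(x) * log2(p(x) / q(x)), in bits."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def mutual_information(pxy):
    """I(X, Y) = D(p(x, y) || p(x) p(y)), in bits, from a joint table pxy[i, j] = P(X=i, Y=j)."""
    pxy = np.asarray(pxy, dtype=float)
    px = pxy.sum(axis=1, keepdims=True)  # marginal distribution of X, shape (n, 1)
    py = pxy.sum(axis=0, keepdims=True)  # marginal distribution of Y, shape (1, m)
    return kl_divergence(pxy.ravel(), (px * py).ravel())

# Fair tetrahedron: the entropy is log2(4) = 2 bits per roll.
print(entropy([0.25, 0.25, 0.25, 0.25]))                 # 2.0

# Relative entropy between the fair tetrahedron and a loaded one.
print(kl_divergence([0.25] * 4, [0.7, 0.1, 0.1, 0.1]))

# Independent X and Y: the joint is the product of the marginals, so I(X, Y) = 0.
px, py = np.array([0.5, 0.5]), np.array([0.25, 0.75])
print(mutual_information(np.outer(px, py)))               # 0.0

# X and Y determine one another (a one-to-one pairing, as in X = Y + 1):
# I(X, Y) = H(X) = H(Y) = 2 bits.
print(mutual_information(np.diag([0.25] * 4)))            # 2.0
```

The last two calls mirror the two cases above: the mutual information is 0 bits when the variables are independent, and equals the entropy (2 bits) when one variable fully determines the other.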