
Information Entropy


Introduction

Entropy is a measure of disorder, or more precisely unpredictability. For example, a series of coin tosses with a fair coin has maximum entropy, since there is no way to predict what will come next. A string of coin tosses with a coin with two heads and no tails has zero entropy, since the coin will always come up heads. Most collections of data in the real world lie somewhere in between. It is important to realize the difference between the entropy of a set of possible outcomes, and the entropy of a particular outcome. A single toss of a fair coin has an entropy of one bit, but a particular result (e.g. "heads") has zero entropy, since it is entirely "predictable".
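As a small illustration of the coin examples above, the following sketch computes the entropy, in bits, of a single toss. The helper name `entropy_bits` and the biased-coin probabilities are assumptions introduced here for illustration; they do not come from the original text.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits of a discrete distribution given as probabilities."""
    # Outcomes with probability 0 contribute nothing to the sum.
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

fair_coin = [0.5, 0.5]        # heads and tails equally likely
two_headed_coin = [1.0, 0.0]  # always comes up heads
biased_coin = [0.9, 0.1]      # hypothetical coin that mostly comes up heads

print(entropy_bits(fair_coin))        # 1.0
print(entropy_bits(two_headed_coin))  # 0.0
print(entropy_bits(biased_coin))      # strictly between 0 and 1
```

The biased coin shows the "somewhere in between" case: it is more predictable than a fair coin but not fully predictable.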

English text has fairly low entropy. In other words, it is fairly predictable. Even if we don't know exactly what is going to come next, we can be fairly certain that, for example, there will be many more e's than z's, that the combination 'qu' will be much more common than any other combination with a 'q' in it, and that the combination 'th' will be more common than any of them. English text is estimated to carry roughly one bit of entropy per character, even though uncompressed text typically stores each character in one byte (eight bits).

If a compression scheme is lossless—that is, you can always recover the entire original message by uncompressing—then a compressed message has the same total entropy as the original, but in fewer bits. That is, it has more entropy per bit. This means a compressed message is more unpredictable, which is why messages are often compressed before being encrypted. Roughly speaking, Shannon's source coding theorem says that a lossless compression scheme cannot compress messages, on average, to have more than one bit of entropy per bit of message. The entropy of a message is in a certain sense a measure of how much information it really contains.
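The "more entropy per bit" claim can be observed roughly with the standard-library `zlib` compressor. This is only a sketch under stated assumptions: the byte-frequency estimate below is a crude order-0 proxy for entropy per byte, and the input is a hypothetical stand-in for English text built from a small made-up vocabulary.

```python
import math
import random
import zlib
from collections import Counter

def byte_entropy_bits(data: bytes) -> float:
    """Empirical order-0 entropy in bits per byte, estimated from byte frequencies."""
    counts = Counter(data)
    total = len(data)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# Hypothetical stand-in for English text: random words from a small vocabulary.
vocabulary = ["the", "of", "and", "entropy", "message", "information", "bit",
              "coin", "source", "random", "code", "theorem", "quick", "than"]
rng = random.Random(0)  # fixed seed so the run is repeatable
original = " ".join(rng.choice(vocabulary) for _ in range(5000)).encode("ascii")
compressed = zlib.compress(original, 9)

# Lossless: the original is recovered exactly.
assert zlib.decompress(compressed) == original

original_bits_per_byte = byte_entropy_bits(original)
compressed_bits_per_byte = byte_entropy_bits(compressed)
print(len(original), round(original_bits_per_byte, 2))
print(len(compressed), round(compressed_bits_per_byte, 2))
```

The compressed output is shorter, and its bytes look much closer to uniformly random, which is the behavior the paragraph above describes.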

Shannon's theorem also implies that no lossless compression scheme can compress all messages. If some messages come out smaller, at least one must come out larger. In the real world, this is not a problem, because we are generally only interested in compressing certain messages, for example English documents as opposed to random bytes, or digital photographs rather than noise, and don't care if our compressor makes random messages larger.
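The counting argument behind this claim can be sketched as follows, where $n$ is a message length in bits introduced here for illustration:

```latex
\#\{\text{bit strings of length } n\} = 2^{n},
\qquad
\#\{\text{bit strings of length} < n\} = \sum_{k=0}^{n-1} 2^{k} = 2^{n} - 1 < 2^{n}.
```

A lossless compressor must map distinct messages to distinct outputs, so it cannot send all $2^{n}$ messages of length $n$ to strictly shorter strings: at least one of them does not shrink.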




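Original definition

Claude E. Shannon, one of the founders of information theory, defined information (entropy) in terms of the probabilities with which discrete random events occur. Information entropy is a rather abstract mathematical concept; as a first intuition, it can be thought of as tied to how likely a particular piece of information is to occur.


For any random variable $X$, the entropy is defined as

$$H(X) = -\sum_{i=1}^{n} p(x_i) \log p(x_i).$$

The greater the uncertainty of the variable, the greater its entropy, and the more information is needed to determine its value.


Information entropy is a concept in information theory used to measure the amount of information. The more ordered a system is, the lower its information entropy; conversely, the more disordered a system is, the higher its information entropy. Information entropy can therefore also be regarded as a measure of how ordered a system is.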


Shannon denoted the entropy of a discrete random variable $X$ with possible values $\{x_1, \ldots, x_n\}$ by $H$, a symbol named after Boltzmann's H-theorem:

$$H(X) = \operatorname{E}[I(X)].$$

Here $\operatorname{E}$ is the expected value, and $I$ is the information content of $X$.

$I(X)$ is itself a random variable. If $p$ denotes the probability mass function of $X$, then the entropy can be written explicitly as

$$H(X) = \sum_{i=1}^{n} p(x_i)\, I(x_i) = \sum_{i=1}^{n} p(x_i) \log_b \frac{1}{p(x_i)} = -\sum_{i=1}^{n} p(x_i) \log_b p(x_i),$$

where $b$ is the base of the logarithm used. Common values of $b$ are 2, Euler's number $e$, and 10, and the unit of entropy is the bit for $b = 2$, the nat for $b = e$, and the dit (or digit) for $b = 10$. [3]

If $p_i = 0$ for some $i$, the value of the corresponding summand $0 \log_b 0$ is taken to be 0, which is consistent with the limit

$$\lim_{p \to 0^{+}} p \log_b p = 0.$$

This limit can be proved quickly by applying l'Hôpital's rule.
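For readers who want the intermediate steps, the l'Hôpital argument can be written out as below. Rewriting the product as a quotient is the standard first step; the derivative $\frac{d}{dp}\log_b p = \frac{1}{p \ln b}$ is used in the second equality.

```latex
\lim_{p \to 0^{+}} p \log_b p
  = \lim_{p \to 0^{+}} \frac{\log_b p}{1/p}
  = \lim_{p \to 0^{+}} \frac{1/(p \ln b)}{-1/p^{2}}
  = \lim_{p \to 0^{+}} \left( -\frac{p}{\ln b} \right)
  = 0.
```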


Formula

$$H(X) = \operatorname{E}[I(x_i)] = \operatorname{E}\!\left[\log \frac{1}{p(x_i)}\right] = -\sum_{i=1}^{n} p(x_i) \log p(x_i)$$



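Application examples

1. Consider the question of which of 32 teams will win a championship. Shannon pointed out that the exact amount of information in the answer is $H = -(p_1 \log_2 p_1 + p_2 \log_2 p_2 + \cdots + p_{32} \log_2 p_{32})$, where $p_1, p_2, \ldots, p_{32}$ are the probabilities that each of the 32 teams wins. Shannon called this quantity "information entropy" (Entropy); it is usually denoted $H$ and measured in bits. Interested readers can verify that when all 32 teams are equally likely to win, the entropy equals five bits. Readers with a mathematical background can also prove that the value of the formula above can never exceed five.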
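The five-bit figure can be checked numerically. The sketch below is illustrative only: the helper name `entropy_bits` and the skewed distribution (one favorite with a 50% chance, the rest sharing the remainder equally) are assumptions made here, not values from the original text.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits: sum of p * log2(1/p) over outcomes with p > 0."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

# All 32 teams equally likely to win.
uniform = [1 / 32] * 32
print(entropy_bits(uniform))  # 5.0

# Hypothetical skewed case: one favorite at 50%, the other 31 teams share the rest.
skewed = [0.5] + [0.5 / 31] * 31
print(entropy_bits(skewed))   # less than 5 bits
```

Any departure from equal probabilities lowers the entropy, in line with the claim that five bits is the upper bound.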


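2. In many cases we do not know the probability distribution of a random event; all we know is the mean of one or several random variables associated with it. For example, suppose we only know that exam scores in a class fall into three bands, 80, 90, and 100 points, and that the average score is 90. Clearly the probability distribution over the three bands is not unique, because the known constraints $80 p_1 + 90 p_2 + 100 p_3 = 90$ and $p_1 + p_2 + p_3 = 1$ have infinitely many solutions. Which solution should we choose? That is, how do we pick the "best" or "most reasonable" distribution from all the compatible ones? The selection criterion is the principle of maximum entropy.

According to the principle of maximum entropy, from all compatible distributions we choose the one that maximizes the information entropy under the given constraints (usually the given means of certain random variables). This principle was proposed by E. T. Jaynes. The justification is that the distribution at which the entropy attains its maximum is overwhelmingly the most likely to occur, which can be shown theoretically. Once we accept entropy as the most suitable yardstick of uncertainty, we have essentially agreed to choose, under the given constraints, the distribution with the greatest uncertainty as the distribution of the random variable. This distribution is the most random one: it contains the least subjective content and makes the most generous allowance for what is unknown.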
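For the exam-score example, the two constraints force $p_1 = p_3$, so every compatible distribution has the form $(t, 1 - 2t, t)$ with $0 \le t \le 0.5$. The sketch below finds the maximum-entropy member of that family by a plain grid search. The grid search, the function names, and the step count are illustrative assumptions, not the method of the original text.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

def max_entropy_scores(steps=30000):
    """Search distributions (t, 1 - 2t, t) over scores (80, 90, 100).

    80*p1 + 90*p2 + 100*p3 = 90 together with p1 + p2 + p3 = 1 implies p1 = p3,
    so a single parameter t in [0, 0.5] covers every compatible distribution.
    """
    best_t, best_h = 0.0, -1.0
    for k in range(steps + 1):
        t = 0.5 * k / steps
        h = entropy_bits([t, 1 - 2 * t, t])
        if h > best_h:
            best_t, best_h = t, h
    return best_t, best_h

t, h = max_entropy_scores()
print(t, h)  # t is close to 1/3 and h is close to log2(3)
```

Under these constraints the search lands on the uniform distribution, the least committal choice that still has a mean of 90.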

3. Data as a Markov process

A common way to define entropy for text is based on the Markov model of text. For an order-0 source (each character is selected independently of the preceding characters), the binary entropy is

$$H(\mathcal{S}) = -\sum_{i} p_i \log_2 p_i,$$

where $p_i$ is the probability of $i$. For a first-order Markov source (one in which the probability of selecting a character depends only on the immediately preceding character), the entropy rate is

$$H(\mathcal{S}) = -\sum_{i} p_i \sum_{j} p_i(j) \log_2 p_i(j),$$

where $i$ is a state (certain preceding characters) and $p_i(j)$ is the probability of $j$ given $i$ as the previous character.

For a second-order Markov source, the entropy rate is

$$H(\mathcal{S}) = -\sum_{i} p_i \sum_{j} p_i(j) \sum_{k} p_{i,j}(k) \log_2 p_{i,j}(k).$$
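The order-0 and first-order quantities can be estimated from a sample string by counting characters and adjacent character pairs. This is a minimal sketch: the function names are made up for illustration, and estimating probabilities from raw counts is an assumption that only makes sense for reasonably long samples.

```python
import math
from collections import Counter, defaultdict

def order0_entropy(text):
    """Estimate sum_i p_i * log2(1/p_i) from single-character frequencies."""
    counts = Counter(text)
    total = len(text)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def order1_entropy_rate(text):
    """Estimate sum_i p_i * sum_j p_i(j) * log2(1/p_i(j)) from adjacent pairs."""
    followers = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        followers[prev][nxt] += 1
    total_pairs = len(text) - 1
    rate = 0.0
    for nxt_counts in followers.values():
        n_prev = sum(nxt_counts.values())
        p_prev = n_prev / total_pairs   # p_i: how often this state starts a pair
        for c in nxt_counts.values():
            p_next = c / n_prev         # p_i(j): next character given the state
            rate += p_prev * p_next * math.log2(1 / p_next)
    return rate

sample = "abababababababab"
print(order0_entropy(sample))       # 1.0
print(order1_entropy_rate(sample))  # 0.0
```

The sample string looks like a fair coin to the order-0 model, yet it is perfectly predictable once the previous character is known, so the first-order estimate drops to zero.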

4. b-ary entropy

In general the $b$-ary entropy of a source $\mathcal{S} = (S, P)$ with source alphabet $S = \{a_1, \ldots, a_n\}$ and discrete probability distribution $P = \{p_1, \ldots, p_n\}$, where $p_i$ is the probability of $a_i$ (say $p_i = p(a_i)$), is defined by

$$H_b(\mathcal{S}) = -\sum_{i=1}^{n} p_i \log_b p_i.$$

Note: the $b$ in "$b$-ary entropy" is the number of different symbols of the "ideal alphabet" used as the standard yardstick to measure source alphabets. In information theory, two symbols are necessary and sufficient for an alphabet to be able to encode information, so the default is $b = 2$ ("binary entropy"). Thus the entropy of the source alphabet, with its given empirical probability distribution, is the number (possibly fractional) of symbols of the "ideal alphabet", with an optimal probability distribution, needed to encode each symbol of the source alphabet. Here "optimal probability distribution" means a uniform distribution: a source alphabet with $n$ symbols has the highest possible entropy (for an alphabet with $n$ symbols) when its probability distribution is uniform. This optimal entropy turns out to be $\log_b n$.
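The definition and the $\log_b n$ maximum can be tried out with a short sketch. The function name `b_ary_entropy`, the alphabet size, and the skewed distribution are assumptions chosen for illustration.

```python
import math

def b_ary_entropy(probs, b=2):
    """Entropy of a distribution measured with logarithms to base b."""
    return sum(p * math.log(1 / p, b) for p in probs if p > 0)

n = 8
uniform = [1 / n] * n
skewed = [0.5, 0.2, 0.1, 0.1, 0.04, 0.03, 0.02, 0.01]  # hypothetical source

# Uniform distribution over n symbols: entropy is log_b(n).
print(b_ary_entropy(uniform, 2))  # approximately 3 (bits)
print(b_ary_entropy(uniform, 8))  # approximately 1 (one symbol of an 8-symbol alphabet)

# A non-uniform distribution over the same alphabet has lower entropy.
print(b_ary_entropy(skewed, 2))

# Changing the base only rescales the value: H_10 * log2(10) equals H_2.
print(b_ary_entropy(skewed, 10) * math.log2(10))
```

The last two lines print nearly the same number, which shows that the choice of $b$ changes the unit but not the underlying quantity.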






