Information Entropy & Redundancy Calculator
Calculate Shannon entropy, maximum entropy, redundancy, and efficiency for a probability distribution or text input.
Formulas
Shannon Entropy:
H(X) = −Σ_{i=1}^{N} p_i · log_b(p_i)
where p_i is the probability of symbol i, b is the logarithm base, and by convention 0 · log(0) = 0.
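The entropy sum translates directly into code. A minimal sketch in Python (the function name `shannon_entropy` is illustrative, not part of the calculator):

```python
import math

def shannon_entropy(probs, base=2):
    """H(X) = -sum(p_i * log_b(p_i)); zero-probability terms are
    skipped, matching the convention 0 * log(0) = 0."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A fair coin is maximally uncertain: exactly 1 bit per toss.
print(shannon_entropy([0.5, 0.5]))   # 1.0
# A biased coin is more predictable, so its entropy is lower.
print(shannon_entropy([0.9, 0.1]))   # ~0.469
```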
Maximum Entropy (Uniform Distribution):
H_max = log_b(N)
where N is the number of symbols with non-zero probability. Achieved when all symbols are equally likely.
Absolute Redundancy:
R = H_max − H(X)
Relative Redundancy:
r = R / H_max = 1 − H(X) / H_max
Efficiency:
η = H(X) / H_max = 1 − r
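The four quantities above can all be derived from one pass over the distribution. A sketch (the helper name `redundancy_stats` is an assumption, not the calculator's API):

```python
import math

def redundancy_stats(probs, base=2):
    """Return (H, H_max, R, r, eta), following the formulas above."""
    h = -sum(p * math.log(p, base) for p in probs if p > 0)
    n = sum(1 for p in probs if p > 0)            # effective alphabet size N
    h_max = math.log(n, base)                     # maximum entropy log_b(N)
    r_abs = h_max - h                             # absolute redundancy R
    r_rel = r_abs / h_max if h_max > 0 else 0.0   # relative redundancy r
    return h, h_max, r_abs, r_rel, 1.0 - r_rel    # last item is efficiency η
```

For example, [0.5, 0.25, 0.25] gives H = 1.5 bits and H_max = log_2(3) ≈ 1.585, so the source operates at about 94.6% efficiency.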
Self-Information of symbol i:
I(x_i) = −log_b(p_i)
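Self-information captures how surprising a single symbol is: halving a symbol's probability adds one bit. A one-line sketch (`self_information` is an illustrative name):

```python
import math

def self_information(p, base=2):
    """I(x_i) = -log_b(p_i): the rarer a symbol, the more surprising it is."""
    return -math.log(p, base)

# p = 1/8 -> 3 bits of surprise.
print(self_information(0.125))   # 3.0
# A certain event (p = 1) carries no information.
print(self_information(1.0))
```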
Logarithm Base Units: Base 2 → bits | Base e → nats | Base 10 → hartleys (bans)
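The three unit systems measure the same uncertainty and differ only by a constant factor: H_nats = H_bits · ln(2) and H_hartleys = H_bits · log_10(2). A brief check, using an arbitrarily chosen example distribution:

```python
import math

def entropy(probs, base):
    """Shannon entropy in the units determined by the log base."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

probs = [0.5, 0.3, 0.2]          # arbitrary example distribution
bits = entropy(probs, 2)         # base 2  -> bits
nats = entropy(probs, math.e)    # base e  -> nats
hartleys = entropy(probs, 10)    # base 10 -> hartleys (bans)
# One bit is the largest of the three units' "value", so the numeric
# entropy is smallest in hartleys and largest in bits.
```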
Assumptions & References
- Symbols are assumed to be independent and identically distributed (i.i.d.).
- Probabilities must be non-negative and sum to 1 (for direct probability input).
- Symbols with probability 0 are excluded from effective alphabet size N but contribute 0 to entropy (limit: 0 · log 0 = 0).
- For text input, each unique character is treated as a distinct symbol; relative frequencies are used as probability estimates.
- Shannon's source coding theorem: the expected length L of any uniquely decodable code satisfies L ≥ H(X) (bits per symbol when using base 2).
- Redundancy measures how far a source is from the theoretical maximum entropy; higher redundancy implies more compressibility.
- English text has an estimated entropy of ~1.0–1.5 bits/character and relative redundancy of ~50–75%.
- Reference: Shannon, C.E. (1948). "A Mathematical Theory of Communication." Bell System Technical Journal, 27(3), 379–423.
- Reference: Cover, T.M. & Thomas, J.A. (2006). Elements of Information Theory (2nd ed.). Wiley.
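The text-input path noted in the assumptions (each unique character is a symbol; relative frequencies serve as probability estimates) can be sketched as follows. `text_entropy` is an illustrative name, not the calculator's actual code:

```python
from collections import Counter
import math

def text_entropy(text, base=2):
    """Estimate per-character entropy from relative character frequencies."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())

# Two equally frequent characters -> 1 bit per character.
print(text_entropy("abab"))   # 1.0
# A single repeated character is fully predictable: 0 bits per character.
print(text_entropy("aaaa"))
```

Note this is only an estimate: it treats characters as i.i.d., so it ignores the inter-character dependencies that push real English text down toward the ~1.0–1.5 bits/character cited above.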