Information Entropy & Redundancy Calculator

Calculate Shannon entropy, maximum entropy, redundancy, and efficiency for a probability distribution or text input.


Formulas

Shannon Entropy:
H(X) = −∑_{i=1}^{N} p_i · log_b(p_i)

where p_i is the probability of symbol i, b is the logarithm base, and by convention 0 · log(0) = 0.
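A minimal sketch of this formula in Python (the function name and signature are illustrative, not part of the calculator):

```python
import math
from typing import Sequence

def shannon_entropy(probs: Sequence[float], base: float = 2.0) -> float:
    """Shannon entropy H(X) = -sum(p_i * log_b(p_i)).

    Zero-probability symbols are skipped, implementing the 0 * log(0) = 0 convention.
    """
    return -sum(p * math.log(p, base) for p in probs if p > 0)
```

For a fair coin, `shannon_entropy([0.5, 0.5])` gives 1 bit; a certain outcome, `shannon_entropy([1.0])`, gives 0.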

Maximum Entropy (Uniform Distribution):
H_max = log_b(N)

where N is the number of symbols with non-zero probability. Achieved when all symbols are equally likely.

Absolute Redundancy:
R = H_max − H(X)

Relative Redundancy:
r = R / H_max = 1 − H(X) / H_max

Efficiency:
η = H(X) / H_max = 1 − r

Self-Information of symbol i:
I(x_i) = −log_b(p_i)

Logarithm Base Units: Base 2 → bits  |  Base e → nats  |  Base 10 → hartleys (bans)
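The quantities above can be combined into one helper; this is a sketch under the document's conventions (zero-probability symbols are dropped from the effective alphabet), with illustrative names:

```python
import math

def entropy_stats(probs, base=2.0):
    """Return H, H_max, absolute redundancy R, relative redundancy r, and efficiency eta.

    base=2 gives bits, base=math.e nats, base=10 hartleys.
    """
    nz = [p for p in probs if p > 0]          # effective alphabet: non-zero symbols only
    h = -sum(p * math.log(p, base) for p in nz)
    h_max = math.log(len(nz), base)           # entropy of the uniform distribution on N symbols
    r_abs = h_max - h
    r_rel = r_abs / h_max if h_max > 0 else 0.0
    return {"H": h, "H_max": h_max, "R": r_abs, "r": r_rel, "eta": 1.0 - r_rel}
```

For example, `entropy_stats([0.5, 0.25, 0.25])` yields H = 1.5 bits against H_max = log2(3) ≈ 1.585 bits, so the source is about 94.6% efficient.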

Assumptions & References

  • Symbols are assumed to be independent and identically distributed (i.i.d.).
  • Probabilities must be non-negative and sum to 1 (for direct probability input).
  • Symbols with probability 0 are excluded from effective alphabet size N but contribute 0 to entropy (limit: 0 · log 0 = 0).
  • For text input, each unique character is treated as a distinct symbol; relative frequencies are used as probability estimates.
  • Shannon's source coding theorem states that the minimum average code length per symbol satisfies L ≥ H (in bits when using base 2).
  • Redundancy measures how far a source is from the theoretical maximum entropy; higher redundancy implies more compressibility.
  • English text has an estimated entropy of ~1.0–1.5 bits/character and relative redundancy of ~50–75%.
  • Reference: Shannon, C.E. (1948). "A Mathematical Theory of Communication." Bell System Technical Journal, 27(3), 379–423.
  • Reference: Cover, T.M. & Thomas, J.A. (2006). Elements of Information Theory (2nd ed.). Wiley.
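The text-input convention described above (each unique character a symbol, relative frequencies as probability estimates) can be sketched as follows; the function name is illustrative:

```python
import math
from collections import Counter

def text_entropy(text: str, base: float = 2.0) -> float:
    """Plug-in entropy estimate from character frequencies (per character)."""
    counts = Counter(text)                    # frequency of each unique character
    n = len(text)
    return -sum((c / n) * math.log(c / n, base)
                for c in counts.values())
```

Note this treats characters as i.i.d., so it ignores inter-character dependencies; for real English text it overestimates the true per-character entropy rate.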
