Information Entropy & Redundancy Calculator
Calculate Shannon entropy, maximum entropy, redundancy, and efficiency for a probability distribution or text input.
Formulas
Shannon Entropy:
H(X) = −Σ_{i=1}^{N} p_i · log_b(p_i)
where p_i is the probability of symbol i, b is the logarithm base, and by convention 0 · log(0) = 0.
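The entropy sum translates directly into code. A minimal sketch in Python (the function name `shannon_entropy` is illustrative, not part of the calculator):

```python
import math

def shannon_entropy(probs, base=2):
    """H(X) = -sum(p_i * log_b(p_i)); zero-probability terms are
    skipped, matching the convention 0 * log(0) = 0."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A fair coin is maximally uncertain: exactly 1 bit per toss.
print(shannon_entropy([0.5, 0.5]))   # 1.0
# A biased coin is more predictable, so its entropy is lower.
print(shannon_entropy([0.9, 0.1]))   # ~0.469
```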
Maximum Entropy (Uniform Distribution):
H_max = log_b(N)
where N is the number of symbols with non-zero probability. Achieved when all symbols are equally likely.
Absolute Redundancy:
R = H_max − H(X)
Relative Redundancy:
r = R / H_max = 1 − H(X) / H_max
Efficiency:
η = H(X) / H_max = 1 − r
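The four quantities above can all be derived from one pass over the distribution. A sketch (the helper name `redundancy_stats` is an assumption, not the calculator's API):

```python
import math

def redundancy_stats(probs, base=2):
    """Return (H, H_max, R, r, eta), following the formulas above."""
    h = -sum(p * math.log(p, base) for p in probs if p > 0)
    n = sum(1 for p in probs if p > 0)            # effective alphabet size N
    h_max = math.log(n, base)                     # maximum entropy log_b(N)
    r_abs = h_max - h                             # absolute redundancy R
    r_rel = r_abs / h_max if h_max > 0 else 0.0   # relative redundancy r
    return h, h_max, r_abs, r_rel, 1.0 - r_rel    # last item is efficiency η
```

For example, [0.5, 0.25, 0.25] gives H = 1.5 bits and H_max = log_2(3) ≈ 1.585, so the source operates at about 94.6% efficiency.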
Self-Information of symbol i:
I(x_i) = −log_b(p_i)
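Self-information captures how surprising a single symbol is: halving a symbol's probability adds one bit. A one-line sketch (`self_information` is an illustrative name):

```python
import math

def self_information(p, base=2):
    """I(x_i) = -log_b(p_i): the rarer a symbol, the more surprising it is."""
    return -math.log(p, base)

# p = 1/8 -> 3 bits of surprise.
print(self_information(0.125))   # 3.0
# A certain event (p = 1) carries no information.
print(self_information(1.0))
```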
Logarithm Base Units: Base 2 → bits | Base e → nats | Base 10 → hartleys (bans)
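The three unit systems measure the same uncertainty and differ only by a constant factor: H_nats = H_bits · ln(2) and H_hartleys = H_bits · log_10(2). A brief check, using an arbitrarily chosen example distribution:

```python
import math

def entropy(probs, base):
    """Shannon entropy in the units determined by the log base."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

probs = [0.5, 0.3, 0.2]          # arbitrary example distribution
bits = entropy(probs, 2)         # base 2  -> bits
nats = entropy(probs, math.e)    # base e  -> nats
hartleys = entropy(probs, 10)    # base 10 -> hartleys (bans)
# One bit is the largest of the three units' "value", so the numeric
# entropy is smallest in hartleys and largest in bits.
```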
Assumptions & References
- Symbols are assumed to be independent and identically distributed (i.i.d.).
- Probabilities must be non-negative and sum to 1 (for direct probability input).
- Symbols with probability 0 are excluded from effective alphabet size N but contribute 0 to entropy (limit: 0 · log 0 = 0).
- For text input, each unique character is treated as a distinct symbol; relative frequencies are used as probability estimates.
- Shannon's source coding theorem: the expected length L of any uniquely decodable code satisfies L ≥ H(X) (bits per symbol when using base 2).
- Redundancy measures how far a source is from the theoretical maximum entropy; higher redundancy implies more compressibility.
- English text has an estimated entropy of ~1.0–1.5 bits/character and relative redundancy of ~50–75%.
- Reference: Shannon, C.E. (1948). "A Mathematical Theory of Communication." Bell System Technical Journal, 27(3), 379–423.
- Reference: Cover, T.M. & Thomas, J.A. (2006). Elements of Information Theory (2nd ed.). Wiley.
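The text-input path noted in the assumptions (each unique character is a symbol; relative frequencies serve as probability estimates) can be sketched as follows. `text_entropy` is an illustrative name, not the calculator's actual code:

```python
from collections import Counter
import math

def text_entropy(text, base=2):
    """Estimate per-character entropy from relative character frequencies."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())

# Two equally frequent characters -> 1 bit per character.
print(text_entropy("abab"))   # 1.0
# A single repeated character is fully predictable: 0 bits per character.
print(text_entropy("aaaa"))
```

Note this is only an estimate: it treats characters as i.i.d., so it ignores the inter-character dependencies that push real English text down toward the ~1.0–1.5 bits/character cited above.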