Compute entropy H(P) and cross-entropy H(P, Q) for discrete probability distributions. Entropy measures the average information content of P; cross-entropy measures the average number of nats (natural-log units, since the code uses np.log) needed to encode samples from P using a code optimized for Q.
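import numpy as np


def entropy(p: np.ndarray) -> float:
    """Shannon entropy H(P) in nats; p is normalized to sum to 1."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()  # accept unnormalized weights
    mask = p > 0  # skip zero entries: 0 * log(0) is taken as 0
    return float(-np.sum(p[mask] * np.log(p[mask])))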
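

def cross_entropy(p: np.ndarray, q: np.ndarray) -> float:
    """Cross-entropy H(P, Q) in nats; p and q are each normalized to sum to 1."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    # Clip q to avoid log(0). Where P(x) > 0 but Q(x) = 0 the true value is
    # infinite; clipping returns a large finite number instead.
    q = np.clip(q, 1e-12, 1.0)
    return float(-np.sum(p * np.log(q)))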


def kl_from_entropy(p: np.ndarray, q: np.ndarray) -> float:
    """KL divergence KL(P || Q) in nats, computed as H(P, Q) - H(P)."""
    return cross_entropy(p, q) - entropy(p)

H(P) = -sum(P(x) * log(P(x))) for all x where P(x) > 0, with log the natural logarithm. Measures the intrinsic uncertainty of distribution P.

H(P, Q) = -sum(P(x) * log(Q(x))). Measures the expected number of nats needed to encode samples from P using a code built for distribution Q.

H(P, Q) >= H(P), with equality if and only if P = Q.

H(P, Q) - H(P) equals the KL divergence KL(P || Q), which measures the inefficiency of using Q to represent P.
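The relationships above can be checked numerically. The following is a minimal, self-contained sketch rather than a definitive implementation: the helper names `entropy_nats`, `cross_entropy_nats`, and `kl_direct` are hypothetical and only mirror the formulas stated above, and the example distributions are chosen purely for illustration. Unlike the clipped version, this sketch assumes Q(x) > 0 wherever P(x) > 0.

```python
import numpy as np


def entropy_nats(p):
    # H(P) = -sum P(x) log P(x), skipping zero-probability entries
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))


def cross_entropy_nats(p, q):
    # H(P, Q) = -sum P(x) log Q(x); assumes Q(x) > 0 wherever P(x) > 0
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    nz = p > 0  # terms with P(x) = 0 contribute nothing
    return float(-np.sum(p[nz] * np.log(q[nz])))


def kl_direct(p, q):
    # KL(P || Q) = sum P(x) log(P(x) / Q(x)), computed without H(P, Q)
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / q[nz])))


# Illustrative distributions: a skewed P and a uniform Q over three outcomes
p = [0.5, 0.25, 0.25]
q = [1 / 3, 1 / 3, 1 / 3]

h_p = entropy_nats(p)            # 1.5 * ln(2) nats
h_pq = cross_entropy_nats(p, q)  # ln(3) nats, since Q is uniform
kl = kl_direct(p, q)

print(f"H(P)      = {h_p:.4f} nats = {h_p / np.log(2):.4f} bits")
print(f"H(P, Q)   = {h_pq:.4f} nats = {h_pq / np.log(2):.4f} bits")
print(f"KL(P||Q)  = {kl:.4f} nats")
print(f"H(P,Q) - H(P) matches KL: {np.isclose(h_pq - h_p, kl)}")
```

Dividing a value in nats by ln(2) converts it to bits, so H(P) here is 1.5 bits and H(P, Q) is log2(3) bits. The gap between them is the KL divergence, which is nonnegative as the inequality H(P, Q) >= H(P) requires.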