#260 · Machine Learning · Medium
Calculate the Expected Calibration Error (ECE) for a classification model. ECE measures how well predicted probabilities match actual outcomes by binning predictions and comparing average confidence to average accuracy in each bin.
Partition predictions into equal-width confidence bins, compute the absolute difference between accuracy and confidence in each bin, and take the average of these differences weighted by the fraction of samples in each bin.
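Written as a formula, the procedure above looks roughly as follows. The symbols are notation introduced here for illustration, not taken from the problem statement: $B$ is the number of bins (`n_bins`), $S_b$ is the set of samples whose predicted probability $p_i$ falls in bin $b$, $y_i$ is the 0/1 label, and $n$ is the total number of samples. Empty bins contribute nothing to the sum.

```latex
\mathrm{ECE} = \sum_{b=1}^{B} \frac{|S_b|}{n}\,
  \bigl|\, \mathrm{acc}(S_b) - \mathrm{conf}(S_b) \,\bigr|,
\qquad
\mathrm{acc}(S_b) = \frac{1}{|S_b|} \sum_{i \in S_b} y_i,
\quad
\mathrm{conf}(S_b) = \frac{1}{|S_b|} \sum_{i \in S_b} p_i
```

Here "accuracy" is the mean label in the bin and "confidence" is the mean predicted probability, which matches how the reference solution computes `avg_accuracy` and `avg_confidence`.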
def expected_calibration_error(
    y_true: list[int],
    y_pred_prob: list[float],
    n_bins: int = 10,
) -> float:
    n = len(y_true)
    if n == 0:
        return 0.0
    bin_boundaries = [i / n_bins for i in range(n_bins + 1)]
    ece = 0.0
    for b in range(n_bins):
        lo = bin_boundaries[b]
        hi = bin_boundaries[b + 1]
        # Collect samples in this bin
        indices = []
        for i in range(n):
            if b == n_bins - 1:
                # The last bin is closed on the right so a probability of 1.0 is counted
                if lo <= y_pred_prob[i] <= hi:
                    indices.append(i)
            else:
                if lo <= y_pred_prob[i] < hi:
                    indices.append(i)
        if not indices:
            # Empty bins contribute nothing to the sum
            continue
        bin_size = len(indices)
        avg_confidence = sum(y_pred_prob[i] for i in indices) / bin_size
        avg_accuracy = sum(y_true[i] for i in indices) / bin_size
        ece += (bin_size / n) * abs(avg_accuracy - avg_confidence)
    return round(ece, 6)

Partition the confidence range into n_bins equal-width intervals.
Average |accuracy - confidence| across bins, weighted by bin size.
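To see where the weighted sum comes from, it can help to look at the per-bin numbers. The sketch below uses a hypothetical helper, `ece_bin_table` (a name invented here, not part of the problem statement), that follows the same binning rule as `expected_calibration_error` but also returns one row per non-empty bin. The four-sample input is made up for illustration.

```python
def ece_bin_table(y_true, y_pred_prob, n_bins=10):
    """Return (rows, ece); each row is (lo, hi, count, avg_conf, avg_acc).

    Hypothetical helper for illustration only. It mirrors the binning rule
    of the solution above: bins are half-open [lo, hi), except the last
    bin, which also includes its upper edge.
    """
    n = len(y_true)
    rows = []
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        last = b == n_bins - 1
        idx = [
            i for i, p in enumerate(y_pred_prob)
            if lo <= p < hi or (last and p == hi)
        ]
        if not idx:
            continue  # empty bins add nothing
        conf = sum(y_pred_prob[i] for i in idx) / len(idx)
        acc = sum(y_true[i] for i in idx) / len(idx)
        rows.append((lo, hi, len(idx), conf, acc))
        ece += len(idx) / n * abs(acc - conf)
    return rows, ece


# Made-up example: two low-confidence negatives, two high-confidence positives.
labels = [0, 0, 1, 1]
probs = [0.1, 0.3, 0.7, 0.9]

rows, ece = ece_bin_table(labels, probs, n_bins=2)
for lo, hi, count, conf, acc in rows:
    print(f"bin {lo:.1f}-{hi:.1f}: n={count} conf={conf:.2f} acc={acc:.2f}")
print("ECE:", round(ece, 6))
```

With two bins, each bin holds half of the samples. The lower bin has mean confidence 0.2 against accuracy 0, the upper bin has mean confidence 0.8 against accuracy 1, so both gaps are 0.2 and the weighted average comes to about 0.2 (up to floating-point rounding).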