Calculate Batch Prediction Health Metrics

#249 · MLOps · Easy

Problem

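Compute health metrics for a batch prediction job so that an ML pipeline can be monitored. Given the predictions from one batch, a historical baseline (expected mean and standard deviation), the total number of inputs, and the number of failed inputs, compute the completion rate, the success rate, prediction distribution statistics (mean and standard deviation), drift indicators relative to the baseline, and an overall health flag.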

Solution

def batch_prediction_health(
    predictions: list[float],
    expected_mean: float,
    expected_std: float,
    total_input: int,
    failed_count: int = 0,
) -> dict:
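    """
    Compute health metrics for one batch of predictions.

    predictions: model prediction values produced by the batch
    expected_mean, expected_std: historical baseline statistics
    total_input: total number of inputs in the batch
    failed_count: number of inputs that failed prediction

    Returns a dict of rates, distribution statistics, drift indicators,
    and an is_healthy flag.
    """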
    n = len(predictions)
    completed = n + failed_count

    # Completion rate
    completion_rate = completed / total_input if total_input > 0 else 0.0
    success_rate = n / total_input if total_input > 0 else 0.0

    if n == 0:
        return {
            "completion_rate": round(completion_rate, 6),
            "success_rate": round(success_rate, 6),
            "mean_prediction": 0.0,
            "std_prediction": 0.0,
            "mean_drift": 0.0,
            "std_drift_ratio": 0.0,
            "is_healthy": False,
        }

    # Prediction distribution (population variance: divides by n)
    mean_pred = sum(predictions) / n
    var_pred = sum((p - mean_pred) ** 2 for p in predictions) / n
    std_pred = var_pred ** 0.5

    # Drift detection
    mean_drift = abs(mean_pred - expected_mean)
    # Without a positive baseline std the ratio is undefined; inf fails the health check
    std_drift_ratio = std_pred / expected_std if expected_std > 0 else float('inf')

    # Health check: mean drift below 2 * expected_std, std ratio in [0.5, 2.0], success rate > 0.95
    is_healthy = (
        mean_drift < 2 * expected_std
        and 0.5 <= std_drift_ratio <= 2.0
        and success_rate > 0.95
    )

    return {
        "completion_rate": round(completion_rate, 6),
        "success_rate": round(success_rate, 6),
        "mean_prediction": round(mean_pred, 6),
        "std_prediction": round(std_pred, 6),
        "mean_drift": round(mean_drift, 6),
        "std_drift_ratio": round(std_drift_ratio, 6),
        "is_healthy": is_healthy,
    }

Explanation

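  1. Completion rate: fraction of inputs that were processed, whether they succeeded or failed: (len(predictions) + failed_count) / total_input.
  2. Success rate: fraction of inputs that produced a prediction: len(predictions) / total_input. Both rates are reported as 0.0 when total_input is 0.
  3. Mean drift: absolute difference between the batch's mean prediction and the historical expected_mean.
  4. Std drift ratio: the batch's population standard deviation divided by expected_std. A ratio far from 1 means the spread of predictions has changed, which signals data or model issues.
  5. Health check: a simple threshold rule. The batch is healthy only if the mean drift is below 2 * expected_std, the std drift ratio lies in [0.5, 2.0], and the success rate is above 0.95. An empty batch is always reported as unhealthy.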
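As a worked example of the steps above, the sketch below recomputes the same quantities for a small batch using the standard-library `statistics` module (`pstdev` divides by n, matching the solution). The batch values and baseline are hypothetical and chosen only for illustration. The `checks` dictionary is not part of the original solution; it just makes visible which threshold passes or fails.

```python
from statistics import fmean, pstdev

# Hypothetical batch and baseline, for illustration only.
predictions = [0.4, 0.5, 0.6, 0.5]
expected_mean, expected_std = 0.45, 0.1
total_input, failed_count = 4, 0

# Rates
completion_rate = (len(predictions) + failed_count) / total_input
success_rate = len(predictions) / total_input

# Distribution statistics (population std, i.e. divide by n)
mean_pred = fmean(predictions)
std_pred = pstdev(predictions)

# Drift indicators relative to the baseline
mean_drift = abs(mean_pred - expected_mean)
std_drift_ratio = std_pred / expected_std

# The three conditions of the health rule, kept separate for inspection
checks = {
    "mean_drift < 2 * expected_std": mean_drift < 2 * expected_std,
    "0.5 <= std_drift_ratio <= 2.0": 0.5 <= std_drift_ratio <= 2.0,
    "success_rate > 0.95": success_rate > 0.95,
}
is_healthy = all(checks.values())

for name, passed in checks.items():
    print(f"{name}: {passed}")
print("is_healthy:", is_healthy)
```

Here the mean drift is about 0.05 (below 0.2), the std drift ratio is about 0.707 (inside [0.5, 2.0]), and every input succeeded, so all three conditions hold. Note that the success-rate comparison is strict: a batch with exactly 95% success would fail it.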

Complexity

  • Time: O(n) where n is the number of predictions
  • Space: O(1) beyond the input
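
The solution makes two passes over the list: one for the mean and one for the squared deviations. If predictions arrive as a stream and cannot be held in memory, the same mean and population standard deviation can be accumulated in a single pass with Welford's online algorithm. The sketch below is a swapped-in technique, not the original solution, and `RunningStats` is a hypothetical helper name.

```python
class RunningStats:
    """Single-pass mean and population std via Welford's online algorithm."""

    def __init__(self) -> None:
        self.n = 0       # number of values seen so far
        self.mean = 0.0  # running mean
        self.m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the deviation from the old mean times the deviation from the new mean
        self.m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        # Population std (divide by n), matching the solution's definition
        return (self.m2 / self.n) ** 0.5 if self.n else 0.0


stats = RunningStats()
for p in (0.4, 0.5, 0.6, 0.5):  # hypothetical prediction stream
    stats.update(p)

print(stats.n, stats.mean, stats.std)
```

Each update takes constant time and the accumulator holds three numbers, so the overall cost stays O(n) time and O(1) space regardless of batch size.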