Calculate Matthews Correlation Coefficient

#279 · Machine Learning · Medium

Problem

Calculate the Matthews Correlation Coefficient (MCC) for binary classification. MCC is a balanced measure that accounts for true/false positives and negatives, producing a value between -1 and +1.

Solution

Compute the confusion matrix components (TP, TN, FP, FN) and apply the MCC formula.

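import math

def matthews_correlation_coefficient(
    y_true: list[int],
    y_pred: list[int],
) -> float:
    """Return the MCC of binary labels (0 or 1), rounded to 6 decimals."""
    tp = tn = fp = fn = 0

    # Tally the four confusion-matrix cells in a single pass.
    for true, pred in zip(y_true, y_pred):
        if true == 1 and pred == 1:
            tp += 1
        elif true == 0 and pred == 0:
            tn += 1
        elif true == 0 and pred == 1:
            fp += 1
        else:
            # Remaining case: true == 1 and pred == 0.
            # Labels are assumed to be 0 or 1; anything else also lands here.
            fn += 1

    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

    # A zero denominator means a row or column of the confusion matrix is
    # empty, so the ratio is undefined; return 0.0 in that case.
    if denominator == 0:
        return 0.0

    return round(numerator / denominator, 6)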
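
To make the formula concrete, here is a small case that can be checked by hand. It is only a sketch: the helper name mcc_from_counts and the sample labels are made up for illustration and are not part of the original problem.

```python
import math

def mcc_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC from the four confusion-matrix counts (hypothetical helper)."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # undefined ratio, reported as 0.0
    return (tp * tn - fp * fn) / denominator

# Illustrative labels, chosen so the arithmetic is easy to follow.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

pairs = list(zip(y_true, y_pred))
tp = pairs.count((1, 1))  # 3
tn = pairs.count((0, 0))  # 3
fp = pairs.count((0, 1))  # 1
fn = pairs.count((1, 0))  # 1

score = mcc_from_counts(tp, tn, fp, fn)
print(score)  # (3*3 - 1*1) / sqrt(4*4*4*4) = 8 / 16 = 0.5
```

With two errors out of eight samples, split evenly between the classes, the score lands halfway between random (0) and perfect (+1).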

Explanation

  1. Count the four confusion matrix components: TP, TN, FP, FN.
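  2. MCC = (TP * TN - FP * FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). If any of the four sums is zero, the denominator is zero and the ratio is undefined; the solution returns 0.0 in that case.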
  3. MCC = +1 indicates perfect prediction, 0 is no better than random, -1 is total disagreement.
  4. Unlike accuracy, MCC is informative even with highly imbalanced classes.
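  5. Because it uses all four confusion matrix cells, MCC is high only when the classifier does well on both the positive and the negative class, which is why it is often recommended as a single-number summary of binary classification quality.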
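
The contrast with accuracy in point 4 can be illustrated with a short sketch. The data set (5 positives, 95 negatives), both sets of predictions, and the helper names accuracy and mcc are assumptions invented for this example.

```python
import math

def accuracy(y_true: list[int], y_pred: list[int]) -> float:
    """Fraction of samples where the prediction matches the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def mcc(y_true: list[int], y_pred: list[int]) -> float:
    """Unrounded MCC for 0/1 labels; returns 0.0 when undefined."""
    pairs = list(zip(y_true, y_pred))
    tp, tn = pairs.count((1, 1)), pairs.count((0, 0))
    fp, fn = pairs.count((0, 1)), pairs.count((1, 0))
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0

# Hypothetical imbalanced data: 5 positives followed by 95 negatives.
y_true = [1] * 5 + [0] * 95

# Classifier A never predicts the positive class.
always_negative = [0] * 100

# Classifier B finds 4 of the 5 positives at the cost of 10 false alarms.
finds_positives = [1, 1, 1, 1, 0] + [1] * 10 + [0] * 85

print(accuracy(y_true, always_negative), mcc(y_true, always_negative))
print(accuracy(y_true, finds_positives), mcc(y_true, finds_positives))
```

Classifier A scores 0.95 on accuracy yet gets an MCC of 0.0, since it carries no information about the positive class. Classifier B has lower accuracy (0.89) but a clearly positive MCC, which better reflects that it actually detects positives.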

Complexity

  • Time: O(n) where n is the number of samples
  • Space: O(1) — only four counters