Implement Label Smoothing for Multi-Class Cross-Entropy

#194 · Machine Learning · Medium

Problem

Implement Label Smoothing for multi-class cross-entropy loss. Instead of using hard one-hot targets, blend the true label with a uniform distribution over all classes using a smoothing parameter epsilon.
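
As a small illustration of the blending step, the sketch below smooths a single one-hot target for K = 3 classes with epsilon = 0.1. The helper name smooth_one_hot is hypothetical and only used here for illustration; it is not part of the problem statement.

```python
import numpy as np

def smooth_one_hot(one_hot: np.ndarray, epsilon: float) -> np.ndarray:
    """Blend one-hot rows with a uniform distribution over the classes."""
    n_classes = one_hot.shape[-1]
    return one_hot * (1.0 - epsilon) + epsilon / n_classes

target = np.array([0.0, 1.0, 0.0])          # hard one-hot target, true class = 1
smoothed = smooth_one_hot(target, epsilon=0.1)

# True class:    1 - 0.1 + 0.1 / 3  (about 0.9333)
# Other classes: 0.1 / 3            (about 0.0333)
print(smoothed)
```

The smoothed target is still a valid probability distribution: its entries sum to 1.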

Solution

import numpy as np

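def label_smoothing_cross_entropy(y_true: np.ndarray, y_pred: np.ndarray,
                                   epsilon: float = 0.1) -> float:
    """Mean cross-entropy against label-smoothed targets.

    y_true: class indices of shape (n,) or one-hot rows of shape (n, K).
    y_pred: predicted probabilities of shape (n, K).
    epsilon: fraction of probability mass spread uniformly over the K classes.
    """
    y_true = np.array(y_true)
    y_pred = np.array(y_pred, dtype=float)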

    if y_true.ndim == 1:
        # Convert class indices to one-hot
        n_classes = y_pred.shape[-1]
        n_samples = len(y_true)
        one_hot = np.zeros((n_samples, n_classes))
        one_hot[np.arange(n_samples), y_true.astype(int)] = 1.0
        y_true = one_hot

    n_classes = y_true.shape[-1]
    smoothed = y_true * (1.0 - epsilon) + epsilon / n_classes

    # Clip predictions to avoid log(0)
    y_pred = np.clip(y_pred, 1e-12, 1.0)

    # Cross-entropy with smoothed labels
    loss = -np.sum(smoothed * np.log(y_pred), axis=-1)
    return float(np.mean(loss))

Explanation

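  1. If y_true is a 1-D array of class indices, convert it to one-hot encoding; a 2-D y_true is assumed to be one-hot already.
  2. Apply label smoothing: smoothed = (1 - epsilon) * one_hot + epsilon / K, where K is the number of classes.
  3. This redistributes a fraction epsilon of the probability mass uniformly across all classes: the true class gets 1 - epsilon + epsilon / K, every other class gets epsilon / K, and each row still sums to 1.
  4. Clip the predictions to [1e-12, 1] to avoid log(0), compute the per-sample cross-entropy against the smoothed targets, -sum(smoothed * log(y_pred)), and average over the samples.
  5. Label smoothing acts as a regularizer: because the target for the true class is below 1, the model is discouraged from becoming overconfident.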
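
The steps above can be traced by hand on a tiny batch. The following is a minimal sketch with made-up inputs (two samples, three classes); the variable names and values are illustrative assumptions, not taken from the problem's test cases.

```python
import numpy as np

epsilon = 0.1
y_true = np.array([0, 2])                  # class indices for two samples
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.3, 0.6]])       # predicted probabilities, rows sum to 1

n_samples, n_classes = y_pred.shape

# Step 1: class indices -> one-hot rows
one_hot = np.eye(n_classes)[y_true]

# Steps 2-3: blend each one-hot row with the uniform distribution
smoothed = (1.0 - epsilon) * one_hot + epsilon / n_classes

# Step 4: clip, take the per-sample cross-entropy, then average
per_sample = -np.sum(smoothed * np.log(np.clip(y_pred, 1e-12, 1.0)), axis=-1)
loss = float(np.mean(per_sample))

# Ordinary cross-entropy with hard targets, for comparison
hard_loss = float(np.mean(-np.log(y_pred[np.arange(n_samples), y_true])))

print(f"smoothed: {loss:.4f}  hard: {hard_loss:.4f}")
```

In this example the smoothed loss comes out higher than the hard-target loss, because the low-probability classes now carry a small target weight and their large negative log-probabilities contribute to the sum.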

Complexity

  • Time: O(n * K) where n is the number of samples and K is the number of classes
  • Space: O(n * K) for the smoothed label matrix
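
When y_true holds class indices, the smoothed loss can also be written as (1 - epsilon) * NLL + epsilon * (mean of -log p over the K classes), which follows from expanding the sum in the cross-entropy. The sketch below uses that identity to skip the one-hot and smoothed target matrices. The function name label_smoothing_ce_from_indices is hypothetical, and this is an alternative formulation rather than the solution above. Note that np.log still allocates one n x K temporary, so the asymptotic space bound is unchanged.

```python
import numpy as np

def label_smoothing_ce_from_indices(labels, y_pred, epsilon=0.1):
    """Label-smoothed cross-entropy from class indices, without one-hot targets."""
    labels = np.asarray(labels, dtype=int)
    # Clip before the log, matching the solution's guard against log(0)
    log_p = np.log(np.clip(np.asarray(y_pred, dtype=float), 1e-12, 1.0))
    n_samples, n_classes = log_p.shape

    # Negative log-likelihood of the true class, shape (n,)
    nll = -log_p[np.arange(n_samples), labels]
    # Cross-entropy against the uniform distribution, shape (n,)
    uniform_ce = -log_p.sum(axis=-1) / n_classes

    return float(np.mean((1.0 - epsilon) * nll + epsilon * uniform_ce))
```

With epsilon = 0 this reduces to the ordinary mean negative log-likelihood.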