← back

Implement Score Matching for Score-Based Diffusion

#404 · Deep Learning · Medium

⊣ Solve on deep-ml.com

Problem

Implement the score matching objective for score-based diffusion models. Instead of predicting noise, score-based models learn the score function (gradient of the log probability). Implement denoising score matching where the model predicts the score at various noise levels.

Solution

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
import numpy as np

def denoising_score_matching_loss(
    model_fn,
    x0: np.ndarray,
    sigmas: np.ndarray
) -> float:
    batch_size = x0.shape[0]

    # Sample random noise levels
    sigma_idx = np.random.randint(0, len(sigmas), size=batch_size)
    sigma = sigmas[sigma_idx]

    # Reshape sigma for broadcasting
    sigma_shaped = sigma.copy()
    while sigma_shaped.ndim < x0.ndim:
        sigma_shaped = sigma_shaped[..., np.newaxis]

    # Add noise
    noise = np.random.randn(*x0.shape)
    x_noisy = x0 + sigma_shaped * noise

    # True score: -noise / sigma (gradient of log p(x_noisy | x0))
    target_score = -noise / sigma_shaped

    # Model predicts score
    predicted_score = model_fn(x_noisy, sigma)

    # Weighted MSE loss: sigma^2 * ||predicted - target||^2
    weights = sigma_shaped ** 2
    loss = np.mean(weights * (predicted_score - target_score) ** 2)
    return float(loss)


def langevin_sampling(score_fn, sigmas: np.ndarray, x_init: np.ndarray, steps_per_sigma: int = 100, step_size: float = 0.01) -> np.ndarray:
    x = x_init.copy()
    for sigma in reversed(sigmas):
        alpha = step_size * (sigma / sigmas[-1]) ** 2
        for _ in range(steps_per_sigma):
            score = score_fn(x, sigma)
            noise = np.random.randn(*x.shape)
            x = x + 0.5 * alpha * score + np.sqrt(alpha) * noise
    return x

Explanation

  1. Score function: the score s(x) = grad_x log p(x) points toward higher-density regions.
  2. Denoising score matching: add noise at level sigma, then the optimal score is -(x_noisy - x0) / sigma^2 = -noise / sigma.
  3. Weighting: multiply loss by sigma^2 so each noise level contributes equally (without weighting, high noise levels dominate).
  4. Langevin sampling: generate samples by iterating x <- x + alpha/2 * score(x) + sqrt(alpha) * noise, starting from high noise and annealing to low noise.

Complexity

  • Time: O(B d) per training step, O(S L * d) for sampling (S noise levels, L steps each)
  • Space: O(B * d) for the noisy samples and scores