Forward Diffusion Process

#303 · Deep Learning · Medium

Problem

Implement the forward diffusion process for a diffusion model. Given clean data x_0, add noise progressively according to a noise schedule to produce noisy samples at any timestep t.

Solution

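from typing import Optional

import numpy as np

def linear_beta_schedule(timesteps: int, beta_start: float = 1e-4,
                          beta_end: float = 0.02) -> np.ndarray:
    # Linearly spaced per-step noise variances, one entry per timestep.
    return np.linspace(beta_start, beta_end, timesteps)

def compute_alpha_bars(betas: np.ndarray) -> np.ndarray:
    # alpha_bar_t is the running product of alpha_s = 1 - beta_s up to step t.
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return alpha_bars

def forward_diffusion(x_0: np.ndarray, t: int, alpha_bars: np.ndarray,
                      noise: Optional[np.ndarray] = None) -> dict:
    """
    Sample x_t from
    q(x_t | x_0) = N(x_t; sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)

    t is a 0-based index into alpha_bars, so t = 0 is one noising step.
    """
    # Reject out-of-range t; a negative t would otherwise silently wrap around.
    if not 0 <= t < len(alpha_bars):
        raise IndexError(f"t={t} is outside [0, {len(alpha_bars) - 1}]")
    if noise is None:
        # Draw epsilon ~ N(0, I) with the same shape as x_0.
        noise = np.random.standard_normal(x_0.shape)

    alpha_bar_t = alpha_bars[t]
    sqrt_alpha_bar = np.sqrt(alpha_bar_t)                  # scales the signal
    sqrt_one_minus_alpha_bar = np.sqrt(1.0 - alpha_bar_t)  # scales the noise

    x_t = sqrt_alpha_bar * x_0 + sqrt_one_minus_alpha_bar * noise

    return {"x_t": x_t, "noise": noise, "alpha_bar_t": float(alpha_bar_t)}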

Explanation

The beta schedule defines the variance of the noise added at each step. The linear schedule increases from a small beta_start to a larger beta_end.
Alpha bar is the cumulative product of (1 - beta). At timestep t, sqrt(alpha_bar_t) scales the original signal and 1 - alpha_bar_t is the variance of the accumulated noise.
Forward diffusion uses the closed-form expression: x_t = sqrt(alpha_bar_t) x_0 + sqrt(1 - alpha_bar_t) epsilon, where epsilon ~ N(0, I).
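This allows sampling x_t at any timestep directly from x_0 without iterating through all intermediate steps, because each step adds independent Gaussian noise and the composition of the steps is again Gaussian.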
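
As a sanity check on the closed form, the single-step update x_s = sqrt(1 - beta_s) x_{s-1} + sqrt(beta_s) epsilon_s can be unrolled while tracking only the coefficient on x_0 and the variance of the accumulated noise. The sketch below is illustrative: the helper name stepwise_coefficients is hypothetical, and the schedule values are assumed to be the defaults from the solution.

```python
from typing import Tuple

import numpy as np

# Assumed schedule: the default values used in the solution (T = 1000 is an assumption).
betas = np.linspace(1e-4, 0.02, 1000)
alpha_bars = np.cumprod(1.0 - betas)

def stepwise_coefficients(betas: np.ndarray, t: int) -> Tuple[float, float]:
    """Apply steps 0..t one at a time, tracking x_0's coefficient and the noise variance."""
    signal_coef, noise_var = 1.0, 0.0
    for beta in betas[: t + 1]:
        alpha = 1.0 - beta
        # One step scales everything so far by sqrt(alpha) and adds noise of variance beta.
        signal_coef *= np.sqrt(alpha)
        noise_var = alpha * noise_var + beta
    return float(signal_coef), float(noise_var)

t = 499
signal_coef, noise_var = stepwise_coefficients(betas, t)

# Both comparisons should agree with the closed form up to floating-point error.
print(np.isclose(signal_coef, np.sqrt(alpha_bars[t])))
print(np.isclose(noise_var, 1.0 - alpha_bars[t]))
```

The loop costs O(t) per query, while the closed form reads a single precomputed entry of alpha_bars.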

Complexity

Time: O(T) to compute the schedule once, O(d) for a single forward step, where T is the number of timesteps and d is the number of elements in x_0
Space: O(T + d) for the schedule and the noisy sample
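
Because the schedule is computed once, it can be reused across many samples, each at its own timestep. The following is a minimal sketch of a batched variant under that assumption. The function name forward_diffusion_batch, the batch shape, and the use of a seeded NumPy Generator are illustrative choices and not part of the original solution.

```python
import numpy as np

def forward_diffusion_batch(x_0: np.ndarray, t: np.ndarray,
                            alpha_bars: np.ndarray,
                            rng: np.random.Generator):
    """x_0 has shape (B, ...); t is a (B,) integer array of 0-based timestep indices."""
    noise = rng.standard_normal(x_0.shape)
    # Reshape (B,) -> (B, 1, ..., 1) so each sample's scalar broadcasts over its features.
    per_sample_shape = (-1,) + (1,) * (x_0.ndim - 1)
    alpha_bar_t = alpha_bars[t].reshape(per_sample_shape)
    x_t = np.sqrt(alpha_bar_t) * x_0 + np.sqrt(1.0 - alpha_bar_t) * noise
    return x_t, noise

rng = np.random.default_rng(0)
# Schedule computed once in O(T); values assumed to match the solution's defaults.
alpha_bars = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))

x_0 = rng.standard_normal((4, 3, 8, 8))        # hypothetical batch of 4 samples
t = rng.integers(0, len(alpha_bars), size=4)   # one timestep index per sample
x_t, noise = forward_diffusion_batch(x_0, t, alpha_bars, rng)
print(x_t.shape)
```

Each call does O(B · d) work for a batch of B samples and needs no extra pass over the schedule.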
