← back

Implement Gaussian Mixture Model (GMM) E-step

#365 · Machine Learning · Medium

⊣ Solve on deep-ml.com

Problem

Implement the E-step (Expectation step) of the Gaussian Mixture Model (GMM) EM algorithm. Given current parameters (mixing coefficients, means, covariances), compute the responsibility of each component for each data point.

Solution

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
import numpy as np

def gmm_e_step(
    X: np.ndarray,
    pi: np.ndarray,
    mu: np.ndarray,
    sigma: np.ndarray
) -> np.ndarray:
    n = X.shape[0]
    k = len(pi)
    d = X.shape[1]
    resp = np.zeros((n, k))

    for j in range(k):
        diff = X - mu[j]
        inv_cov = np.linalg.inv(sigma[j])
        det_cov = np.linalg.det(sigma[j])
        norm_const = np.sqrt((2 * np.pi) ** d * det_cov) + 1e-300
        exponent = -0.5 * np.sum(diff @ inv_cov * diff, axis=1)
        resp[:, j] = pi[j] * np.exp(exponent) / norm_const

    # Normalize responsibilities
    resp /= resp.sum(axis=1, keepdims=True) + 1e-300
    return resp

Explanation

  1. For each Gaussian component j, compute the multivariate Gaussian density for all data points using N(x | mu_j, sigma_j).
  2. Weight each density by the mixing coefficient pi_j.
  3. Normalize across all components for each data point so that responsibilities sum to 1. The responsibility r(i,j) represents the posterior probability that point i belongs to component j.

Complexity

  • Time: O(n k d^2) for matrix operations per component
  • Space: O(n * k) for the responsibility matrix