Implement Xavier/Glorot Weight Initialization

#289 · Deep Learning · Medium

Problem

Implement Xavier (Glorot) weight initialization for neural networks. Given the number of input units (fan_in) and output units (fan_out), generate a weight matrix sampled from the appropriate distribution to maintain variance across layers.

Solution

import numpy as np

def xavier_uniform(fan_in: int, fan_out: int) -> np.ndarray:
    # U[-limit, limit] has variance limit^2 / 3 = 2 / (fan_in + fan_out)
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return np.random.uniform(-limit, limit, size=(fan_in, fan_out))

def xavier_normal(fan_in: int, fan_out: int) -> np.ndarray:
    # Same target variance as the uniform variant: std^2 = 2 / (fan_in + fan_out)
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return np.random.normal(0.0, std, size=(fan_in, fan_out))

Explanation

  1. Xavier uniform samples weights from U[-limit, limit], where limit = sqrt(6 / (fan_in + fan_out)). A uniform distribution on [-a, a] has variance a^2 / 3, so the weights have variance 2 / (fan_in + fan_out). This keeps the variance of a layer's outputs approximately equal to the variance of its inputs.
  2. Xavier normal samples from N(0, std^2), where std = sqrt(2 / (fan_in + fan_out)), which gives the same weight variance as the uniform variant.
  3. Both approaches help keep the signal from exploding or vanishing as it passes through layers, which makes training more stable. They are best suited to layers with tanh or sigmoid activations.
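
The claim that both variants share the weight variance 2 / (fan_in + fan_out) can be checked empirically. The following is a minimal sketch rather than part of the reference solution: the layer sizes, the seed, and the use of a seeded np.random.default_rng generator are assumptions chosen only to make the check reproducible.

```python
import numpy as np

# Assumed, illustrative layer sizes and seed (not from the original problem).
fan_in, fan_out = 256, 128
rng = np.random.default_rng(0)

# Variance that both Xavier variants aim for.
target_var = 2.0 / (fan_in + fan_out)

# Uniform variant: U[-limit, limit] has variance limit^2 / 3.
limit = np.sqrt(6.0 / (fan_in + fan_out))
w_uniform = rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Normal variant: N(0, std^2) with std^2 equal to the target variance.
std = np.sqrt(target_var)
w_normal = rng.normal(0.0, std, size=(fan_in, fan_out))

print("target variance: ", target_var)
print("uniform variance:", w_uniform.var())
print("normal variance: ", w_normal.var())
```

With fan_in * fan_out samples per matrix, both empirical variances should land close to the target. The match is statistical, not exact, and tightens as the matrix grows.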

Complexity

  • Time: O(fan_in * fan_out)
  • Space: O(fan_in * fan_out)