Implement Xavier/Glorot Weight Initialization

#369 · Deep Learning · Easy

Problem

Implement Xavier/Glorot weight initialization for a neural network layer with fan_in inputs and fan_out outputs. The method draws weights from a distribution whose scale depends on the number of input and output neurons, which keeps the variance of activations and gradients roughly stable across layers.

Solution

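import numpy as np

def xavier_init(fan_in: int, fan_out: int, mode: str = "uniform") -> np.ndarray:
    """Return a (fan_in, fan_out) weight matrix with Xavier/Glorot scaling."""
    if mode == "uniform":
        # U[-limit, limit] has variance limit^2 / 3 = 2 / (fan_in + fan_out)
        limit = np.sqrt(6.0 / (fan_in + fan_out))
        return np.random.uniform(-limit, limit, (fan_in, fan_out))
    elif mode == "normal":
        # Zero-mean normal; the second argument is the standard deviation
        std = np.sqrt(2.0 / (fan_in + fan_out))
        return np.random.normal(0, std, (fan_in, fan_out))
    else:
        raise ValueError(f"Unknown mode: {mode}")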
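
A quick sanity check, offered as a sketch rather than part of the reference solution: both modes should return a matrix of shape (fan_in, fan_out) whose empirical variance is close to 2 / (fan_in + fan_out). The function is repeated so the snippet runs on its own; the seed and the layer sizes are arbitrary values chosen for illustration.

```python
import numpy as np

def xavier_init(fan_in: int, fan_out: int, mode: str = "uniform") -> np.ndarray:
    if mode == "uniform":
        limit = np.sqrt(6.0 / (fan_in + fan_out))
        return np.random.uniform(-limit, limit, (fan_in, fan_out))
    elif mode == "normal":
        std = np.sqrt(2.0 / (fan_in + fan_out))
        return np.random.normal(0, std, (fan_in, fan_out))
    else:
        raise ValueError(f"Unknown mode: {mode}")

np.random.seed(0)              # arbitrary seed, only for repeatability
fan_in, fan_out = 400, 600     # illustrative layer sizes (assumption)

target_var = 2.0 / (fan_in + fan_out)
limit = np.sqrt(6.0 / (fan_in + fan_out))

w_uniform = xavier_init(fan_in, fan_out, "uniform")
w_normal = xavier_init(fan_in, fan_out, "normal")

# Both empirical variances should sit near the shared target variance
print("target variance :", target_var)
print("uniform variance:", w_uniform.var())
print("normal variance :", w_normal.var())
# Uniform samples never exceed the limit in absolute value
print("max |w| uniform :", np.abs(w_uniform).max(), "limit:", limit)
```

With 240,000 samples per matrix, the two empirical variances should agree with the target to within a few percent.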

Explanation

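  1. Xavier uniform draws weights from U[-limit, limit], where limit = sqrt(6 / (fan_in + fan_out)). The variance of this distribution is limit^2 / 3 = 2 / (fan_in + fan_out).
  2. Xavier normal draws weights from a zero-mean normal distribution with standard deviation std = sqrt(2 / (fan_in + fan_out)), so both modes share the same variance.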
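  3. A weight variance of 2 / (fan_in + fan_out) is a compromise between 1 / fan_in, which preserves the variance of activations in the forward pass, and 1 / fan_out, which preserves the variance of gradients in the backward pass. Keeping both approximately constant across layers helps prevent vanishing or exploding values.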
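  4. Works best with activations that are symmetric and roughly linear around zero, such as tanh. It is also commonly used with sigmoid, although sigmoid outputs are not zero-centered.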
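
The variance argument above can be illustrated with a small experiment. This is a minimal sketch under simplifying assumptions: the layers are purely linear (no activation), every layer has the same width, and the helper names `xavier_uniform`, `naive_normal`, and `final_variance` are hypothetical, introduced only for this example. It pushes unit-variance inputs through a stack of layers and compares Xavier scaling against a fixed small standard deviation of 0.01.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability

def xavier_uniform(fan_in, fan_out):
    # Same scaling as the uniform mode of the solution
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, (fan_in, fan_out))

def naive_normal(fan_in, fan_out, std=0.01):
    # Fixed standard deviation that ignores the layer size (for comparison)
    return rng.normal(0.0, std, (fan_in, fan_out))

def final_variance(init_fn, width=256, depth=10, batch=1000):
    # Unit-variance inputs passed through `depth` linear layers
    h = rng.normal(0.0, 1.0, (batch, width))
    for _ in range(depth):
        h = h @ init_fn(width, width)
    return h.var()

xavier_var = final_variance(xavier_uniform)
naive_var = final_variance(naive_normal)

print("variance after 10 layers, Xavier:", xavier_var)
print("variance after 10 layers, naive :", naive_var)
```

With equal fan_in and fan_out, each Xavier layer multiplies the variance by roughly width * (1 / width) = 1, so the output variance should stay on the order of 1. The fixed 0.01 standard deviation multiplies it by about 256 * 0.0001 per layer, so the signal collapses toward zero after a few layers.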

Complexity

  • Time: O(fan_in * fan_out) to generate the weight matrix
  • Space: O(fan_in * fan_out) for the weight matrix