Implement the SELU Activation Function

#103 · Deep Learning · Easy

Problem

Implement the SELU (Scaled Exponential Linear Unit) activation function. SELU is defined as f(x) = scale * x for x > 0 and f(x) = scale * alpha * (exp(x) - 1) otherwise, where alpha and scale are fixed constants chosen to give neural networks self-normalizing properties.

Solution

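import numpy as np

def selu(x: np.ndarray) -> np.ndarray:
    # Fixed SELU constants (fixed point: zero mean, unit variance).
    alpha = 1.6732632423543772
    scale = 1.0507009873554805
    # np.where evaluates both branches, so clamp to x <= 0 before exponentiating;
    # otherwise large positive inputs overflow in exp and emit a RuntimeWarning.
    negative = alpha * np.expm1(np.minimum(x, 0))
    return scale * np.where(x > 0, x, negative)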
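
A quick sanity check on a few inputs, as a minimal sketch. The function is restated so the snippet runs on its own, with the negative branch clamped to avoid overflow warnings. The uppercase constant names and the sample inputs are illustrative choices, not part of the problem statement.

```python
import numpy as np

ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x):
    x = np.asarray(x, dtype=float)
    # Clamp before exponentiating so large positive inputs never reach expm1.
    return SCALE * np.where(x > 0, x, ALPHA * np.expm1(np.minimum(x, 0.0)))

x = np.array([-1000.0, -1.0, 0.0, 1.0, 2.0])
y = selu(x)

# Very negative inputs saturate at -SCALE * ALPHA (about -1.758).
print(y[0])
# Positive inputs are simply multiplied by SCALE (about 1.0507).
print(y[3], y[4])
```

The value at x = -1000 shows the lower bound of the function, and the positive inputs show the linear branch.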

Explanation

  1. SELU uses mathematically derived constants: alpha = 1.6732... and scale = 1.0507....
  2. For positive inputs, the output is scale * x (slightly scaled up).
  3. For negative inputs, the output is scale * alpha * (exp(x) - 1), which saturates at -scale * alpha ≈ -1.758.
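  4. With these constants, zero mean and unit variance form a stable fixed point: when the weights have zero mean and variance 1/fan_in, activations are pulled toward that fixed point from layer to layer during forward propagation, so the network self-normalizes without batch normalization.
  5. This property relies on LeCun normal initialization and was derived for fully connected layers, which is where SELU works best.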
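
The self-normalizing behaviour described in points 4 and 5 can be observed empirically. The sketch below is an illustration under stated assumptions, not a proof: the width, depth, batch size, seed, and the deliberately off-target input distribution are all arbitrary choices made here, and the layers are plain bias-free matrix multiplications with LeCun normal weights (zero mean, variance 1/fan_in).

```python
import numpy as np

ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x):
    # Clamp before exponentiating so positive inputs never overflow in expm1.
    return SCALE * np.where(x > 0, x, ALPHA * np.expm1(np.minimum(x, 0.0)))

rng = np.random.default_rng(0)
width, depth, batch = 512, 20, 2048  # hypothetical sizes for the demo

# Inputs start away from zero mean and unit variance on purpose.
h = rng.normal(loc=0.5, scale=2.0, size=(batch, width))
print("input :", h.mean(), h.var())

for _ in range(depth):
    # LeCun normal initialization: std = sqrt(1 / fan_in).
    W = rng.normal(0.0, np.sqrt(1.0 / width), size=(width, width))
    h = selu(h @ W)

# After several layers the statistics should sit near mean 0 and variance 1.
print("output:", h.mean(), h.var())
```

Swapping the weight standard deviation for something noticeably larger or smaller than sqrt(1 / fan_in) is a simple way to see the statistics drift away from that fixed point.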

Complexity

  • Time: O(n) where n is the number of elements
  • Space: O(n) for the output array, plus O(n) temporaries for the mask and the exponential branch