Implement the SELU (Scaled Exponential Linear Unit) activation function. SELU is defined as f(x) = scale * (x if x > 0 else alpha * (exp(x) - 1)) with specific constants that enable self-normalizing properties in neural networks.
import numpy as np
def selu(x: np.ndarray) -> np.ndarray:
alpha = 1.6732632423543772
scale = 1.0507009873554805
return scale * np.where(x > 0, x, alpha * (np.exp(x) - 1))alpha = 1.6732... and scale = 1.0507....scale * x (slightly scaled up).scale * alpha * (exp(x) - 1), which saturates at -scale * alpha ≈ -1.758.