#113 · Deep Learning · Easy
Implement a simple residual block with a shortcut (skip) connection. Given an input, apply two transformations (e.g., linear layers with activation) and add the original input to the output. This is the fundamental building block of ResNet architectures.
import numpy as np

def residual_block(x: np.ndarray, W1: np.ndarray, b1: np.ndarray, W2: np.ndarray, b2: np.ndarray) -> np.ndarray:
    def relu(z):
        return np.maximum(0, z)

    # First transformation: linear + ReLU
    out = relu(x @ W1 + b1)
    # Second transformation: linear (no activation before adding residual)
    out = out @ W2 + b2
    # Shortcut connection: add the input (the second layer's output must match the shape of x)
    out = out + x
    # Apply ReLU after addition
    out = relu(out)
    return out

The residual block adds the input x to the output of the second layer. This creates a shortcut that allows gradients to flow directly through the network.

The block learns the residual F(x) = H(x) - x rather than the full mapping H(x). Learning residuals is easier because the identity mapping is already provided by the shortcut.
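As a rough illustration of why the identity mapping comes for free, the sketch below runs the block twice: once with the second layer zeroed out, so F(x) = 0, and once with small non-zero weights. The two-dimensional input and the hand-picked weights are assumptions chosen for readability, not values from the original problem, and the function is repeated only so the snippet runs on its own.

```python
import numpy as np

def residual_block(x, W1, b1, W2, b2):
    # Same computation as the solution above, repeated so this sketch is self-contained.
    hidden = np.maximum(0, x @ W1 + b1)         # linear + ReLU
    return np.maximum(0, hidden @ W2 + b2 + x)  # linear, add shortcut, final ReLU

# Hypothetical example values (assumed for illustration only).
x = np.array([1.0, 2.0])
W1 = np.eye(2)
b1 = np.zeros(2)

# Zeroed second layer: F(x) = 0, so the block returns relu(x),
# which equals x because every entry of x is non-negative.
identity_out = residual_block(x, W1, b1, np.zeros((2, 2)), np.zeros(2))

# Non-zero second layer: the block returns relu(F(x) + x).
W2 = 0.5 * np.eye(2)
out = residual_block(x, W1, b1, W2, np.zeros(2))

print(identity_out)  # [1. 2.]
print(out)           # [1.5 3. ]
```

With the weights at zero the block already reproduces its non-negative input, so training only has to learn the correction F(x) on top of it. The final ReLU means this pass-through holds only for non-negative inputs.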