Learned Positional Embeddings

#375 · Deep Learning · Easy

Problem

Implement learned positional embeddings for a transformer model. Given a maximum sequence length and embedding dimension, create a learnable embedding table that maps each position to a vector that is added to token embeddings.

Solution

import numpy as np

class LearnedPositionalEmbedding:
    def __init__(self, max_len: int, d_model: int):
        self.max_len = max_len
        self.d_model = d_model
        # Initialize with small random values
        self.embedding = np.random.normal(0, 0.02, (max_len, d_model))

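    def forward(self, seq_len: int) -> np.ndarray:
        # Return the first seq_len rows: shape (seq_len, d_model)
        if seq_len > self.max_len:
            raise ValueError(f"seq_len {seq_len} exceeds max_len {self.max_len}")
        return self.embedding[:seq_len]

    def __call__(self, token_embeddings: np.ndarray) -> np.ndarray:
        # token_embeddings: (..., seq_len, d_model); the positional rows
        # broadcast over any leading batch axes
        seq_len = token_embeddings.shape[-2]
        return token_embeddings + self.forward(seq_len)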

Explanation

Initialize an embedding table of shape (max_len, d_model) with small random values.
During the forward pass, slice the table to match the input sequence length.
Add positional embeddings element-wise to token embeddings so each position gets a unique learned offset.
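Unlike fixed sinusoidal embeddings, the table entries are trainable parameters updated by gradient descent along with the rest of the model. The trade-off is that positions at or beyond max_len have no embedding, so the model cannot handle sequences longer than max_len.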
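
The solution only covers the forward pass, so it may help to see what "learned" means in practice. The sketch below is an illustration, not part of the reference solution: the `backward` and `step` methods, the `grad` buffer, and the `seed` argument are hypothetical additions, and the all-ones upstream gradient is made up for the demo. Because the output is a plain sum, the gradient with respect to the table is the upstream gradient summed over the batch, and only the rows that were actually used receive an update.

```python
import numpy as np

class LearnedPositionalEmbedding:
    """Same table as the solution, plus a hypothetical backward/step pair."""

    def __init__(self, max_len: int, d_model: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.embedding = rng.normal(0.0, 0.02, (max_len, d_model))
        self.grad = np.zeros_like(self.embedding)

    def __call__(self, token_embeddings: np.ndarray) -> np.ndarray:
        seq_len = token_embeddings.shape[-2]
        return token_embeddings + self.embedding[:seq_len]

    def backward(self, grad_output: np.ndarray) -> None:
        # out = tokens + table[:seq_len], so the gradient for each used row
        # is the upstream gradient summed over all leading (batch) axes.
        seq_len, d_model = grad_output.shape[-2:]
        self.grad[:seq_len] += grad_output.reshape(-1, seq_len, d_model).sum(axis=0)

    def step(self, lr: float) -> None:
        # Plain SGD update, then clear the accumulated gradient.
        self.embedding -= lr * self.grad
        self.grad[:] = 0.0

pos = LearnedPositionalEmbedding(max_len=8, d_model=4)
tokens = np.zeros((2, 5, 4))        # (batch, seq_len, d_model)
out = pos(tokens)                   # table rows broadcast over the batch
before = pos.embedding.copy()
pos.backward(np.ones_like(out))     # made-up upstream gradient of all ones
pos.step(lr=0.1)
```

With a batch of 2 and a learning rate of 0.1, rows 0 to 4 each move by -0.2, while rows 5 to 7 stay at their initial values because no input reached those positions.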

Complexity

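Time: O(seq_len * d_model) per sequence for the addition (the slice itself is a view and costs O(1)); multiply by the batch size for batched input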
Space: O(max_len * d_model) for the embedding table
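
To make the space bound concrete, here is a quick check with illustrative sizes (max_len = 512 and d_model = 768 are assumptions chosen for the example, not values from the problem). NumPy's default float64 dtype, which `np.random.normal` also returns, uses 8 bytes per entry.

```python
import numpy as np

max_len, d_model = 512, 768              # illustrative sizes, not from the problem
table = np.zeros((max_len, d_model))     # float64 by default, like the solution's table

n_params = table.size                    # number of learnable values
n_bytes = table.nbytes                   # memory used by the table
print(n_params, n_bytes)                 # prints "393216 3145728"
```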
