Learned Positional Embeddings

#375 · Deep Learning · Easy

Problem

Implement learned positional embeddings for a transformer model. Given a maximum sequence length and embedding dimension, create a learnable embedding table that maps each position to a vector that is added to token embeddings.

Solution

import numpy as np

class LearnedPositionalEmbedding:
    def __init__(self, max_len: int, d_model: int):
        self.max_len = max_len
        self.d_model = d_model
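        # Learnable table of shape (max_len, d_model), initialized with small random values
        self.embedding = np.random.normal(0, 0.02, (max_len, d_model))

    def forward(self, seq_len: int) -> np.ndarray:
        # Return the embeddings for positions 0..seq_len-1, shape (seq_len, d_model)
        if seq_len > self.max_len:
            raise ValueError(f"seq_len {seq_len} exceeds max_len {self.max_len}")
        return self.embedding[:seq_len]

    def __call__(self, token_embeddings: np.ndarray) -> np.ndarray:
        # token_embeddings has shape (..., seq_len, d_model); the positional
        # slice broadcasts over any leading (batch) axes
        seq_len = token_embeddings.shape[-2]
        return token_embeddings + self.forward(seq_len)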
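
The addition in __call__ relies on NumPy broadcasting when the input carries a batch axis. The sketch below illustrates this with a plain array standing in for the embedding table; the sizes, the seeded generator, and the variable names are assumptions chosen for illustration, not part of the problem statement.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the example is reproducible
max_len, d_model = 8, 4         # illustrative sizes (assumption)

# Stand-in for the embedding table built in __init__
table = rng.normal(0, 0.02, (max_len, d_model))

# A batch of 2 sequences, each 5 tokens long
token_embeddings = rng.normal(size=(2, 5, d_model))

seq_len = token_embeddings.shape[-2]   # 5
pos_emb = table[:seq_len]              # shape (5, 4)
out = token_embeddings + pos_emb       # (5, 4) broadcasts to (2, 5, 4)

print(out.shape)  # (2, 5, 4)
# Every sequence in the batch receives the same offset at a given position
print(np.allclose(out[0] - token_embeddings[0], out[1] - token_embeddings[1]))  # True
```

Because the slice has shape (seq_len, d_model), it lines up with the last two axes of the input, so the same code path handles both a single sequence and a batch.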

Explanation

  1. Initialize an embedding table of shape (max_len, d_model) with small random values.
  2. During the forward pass, slice the table to match the input sequence length.
  3. Add positional embeddings element-wise to token embeddings so each position gets a unique learned offset.
  4. Unlike fixed sinusoidal embeddings, the table is a trainable parameter updated by gradient descent, so the model learns its own position representations. The trade-off is that no embedding exists for positions at or beyond max_len.
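
The NumPy solution only covers the forward pass, so here is a rough sketch of what "learned" could look like in practice: one manual gradient step on the table. The toy squared-error loss, the target array, the learning rate, and all names below are hypothetical and serve only to make the gradient flow visible; a real model would get this gradient from backpropagation through the whole network.

```python
import numpy as np

rng = np.random.default_rng(0)
max_len, d_model, lr = 8, 4, 0.1       # illustrative values (assumptions)

table = rng.normal(0, 0.02, (max_len, d_model))
token_embeddings = rng.normal(size=(2, 5, d_model))   # (batch, seq_len, d_model)
target = rng.normal(size=(2, 5, d_model))             # hypothetical regression target

seq_len = token_embeddings.shape[-2]
out = token_embeddings + table[:seq_len]

# Toy loss: 0.5 * sum((out - target)^2), so dL/d(out) = out - target
loss_before = 0.5 * np.sum((out - target) ** 2)
grad_out = out - target

# Each table row is shared across the batch, so its gradient sums over the batch axis
grad_table = np.zeros_like(table)
grad_table[:seq_len] = grad_out.sum(axis=0)

before = table.copy()
table -= lr * grad_table               # plain SGD step

loss_after = 0.5 * np.sum((token_embeddings + table[:seq_len] - target) ** 2)

# Rows for unused positions (seq_len..max_len-1) receive no gradient and stay unchanged
print(np.array_equal(before[seq_len:], table[seq_len:]))  # True
print(loss_after < loss_before)                           # True
```

Since the output is a plain sum, the gradient with respect to each used row is just the upstream gradient summed over the batch, and positions never seen during training are never updated.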

Complexity

  • Time: O(seq_len * d_model) per sequence for the addition (multiply by the batch size for batched input); the slice itself is an O(1) view
  • Space: O(max_len * d_model) for the embedding table
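
To put the space bound in concrete terms, the following sketch measures a table built the same way as in the solution. The sizes 512 and 768 are illustrative assumptions, not values from the problem.

```python
import numpy as np

max_len, d_model = 512, 768   # illustrative sizes (assumption)
table = np.random.normal(0, 0.02, (max_len, d_model))

print(table.size)    # 393216 learnable parameters (max_len * d_model)
print(table.dtype)   # float64, the dtype np.random.normal returns
print(table.nbytes)  # 3145728 bytes, i.e. 3 MiB at 8 bytes per value
```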