← back

Label Encoding for Ordinal Variables

#356 · Data Preprocessing · Easy

⊣ Solve on deep-ml.com

Problem

Implement label encoding for ordinal categorical variables. Given a list of category strings and a specified ordering, map each category to its integer rank. Categories not in the ordering should be assigned -1.

Solution

1
2
3
def label_encode(data: list[str], ordering: list[str]) -> list[int]:
    label_map = {cat: idx for idx, cat in enumerate(ordering)}
    return [label_map.get(val, -1) for val in data]

Explanation

  1. Build a dictionary mapping each category in the specified ordering to its integer index (0-based).
  2. For each value in the input data, look up its encoded integer from the map. If not found, return -1.
  3. This preserves the ordinal relationship: categories earlier in the ordering list get smaller integer values.

Complexity

  • Time: O(n + k) where n is the data length and k is the number of categories
  • Space: O(k) for the label map