Bradley-Terry Model for Pairwise Rankings

#322 · NLP · Medium

Problem

Implement the Bradley-Terry model for pairwise rankings. Given a set of pairwise comparison results (win/loss between items), estimate the strength parameter for each item using iterative maximum likelihood estimation.

Solution

from typing import Dict, List, Tuple

def bradley_terry(
    comparisons: List[Tuple[str, str]],
    max_iter: int = 100,
    tol: float = 1e-6
) -> Dict[str, float]:
    # comparisons: list of (winner, loser)
    players = set()
    for w, l in comparisons:
        players.add(w)
        players.add(l)

    # Initialize strengths
    strength = {p: 1.0 for p in players}

    # Count wins and build matchup pairs
    wins = {p: 0 for p in players}
    for w, l in comparisons:
        wins[w] += 1

    for iteration in range(max_iter):
        new_strength = {}
        for p in players:
            numerator = wins[p]
            denominator = 0.0
            for w, l in comparisons:
                if p == w:
                    denominator += 1.0 / (strength[p] + strength[l])
                elif p == l:
                    denominator += 1.0 / (strength[w] + strength[p])

            if denominator > 0:
                new_strength[p] = numerator / denominator
            else:
                new_strength[p] = strength[p]

        # Normalize so strengths sum to number of players
        total = sum(new_strength.values())
        n = len(players)
        for p in new_strength:
            new_strength[p] *= n / total

        # Check convergence
        diff = max(abs(new_strength[p] - strength[p]) for p in players)
        strength = new_strength
        if diff < tol:
            break

    return strength

Explanation

The Bradley-Terry model assigns a strength parameter to each player such that P(i beats j) = strength_i / (strength_i + strength_j).
Using the MM (minorization-maximization) algorithm, iteratively update each player's strength as: wins_i / sum(1 / (strength_i + strength_j)) over all games involving player i.
Normalize strengths after each iteration to prevent divergence.
Repeat until convergence (max change below tolerance).

Complexity

Time: O(I * (P + M)) per iteration, where I is iterations, P is players, M is comparisons
Space: O(P + M)

← #321 #323 →