← back

Math

Game Theory (Nim / Sprague-Grundy)

For Nim-like games, the first player loses if and only if the position is a losing position (often a power of some number). Compute the Grundy value (nimber) for each state to determine optimal play.

Game Theory: Nim and the Sprague-Grundy Theorem

Two-player combinatorial games show up surprisingly often in competitive programming. The surface framing varies, whether stones in piles, matches on a table, or subtraction games, but the underlying mathematics is nearly always the same. Once you recognize the pattern, a problem that looks impossibly complex collapses into a single XOR operation or a small memoized computation.

What Makes a Game "Impartial"

A game is impartial when both players have exactly the same set of legal moves from any position. Neither player has a private piece color or a restricted direction of movement. Nim is the canonical example: a collection of piles of stones where each player, on their turn, removes any positive number of stones from exactly one pile. The player who cannot move (all piles empty) loses. This is called normal play convention.

The key insight is that we never need to enumerate the full game tree. Instead, we classify every position as either a P-position (Previous player wins, meaning the player who just moved wins and the player to move loses) or an N-position (Next player wins). Terminal positions with no moves are P-positions by definition.

Nim: The XOR Rule

For classic Nim, the winning criterion is elegantly simple: XOR all the pile sizes together. If the result is nonzero, the player to move wins; if it is zero, they lose.

1
2
3
4
5
def can_first_player_win_nim(piles: list[int]) -> bool:
    xor = 0
    for p in piles:
        xor ^= p
    return xor != 0

Walked example, piles [3, 4, 5]:

In binary: 3 = 011, 4 = 100, 5 = 101. XOR them column by column:

  • Bit 2: 0 ^ 1 ^ 1 = 0
  • Bit 1: 1 ^ 0 ^ 0 = 1
  • Bit 0: 1 ^ 0 ^ 1 = 0

Result: 010 = 2, which is nonzero, so the first player wins. A winning move must leave XOR = 0. The first player can remove 2 stones from pile 3 (leaving pile sizes [1, 4, 5]): 001 ^ 100 ^ 101 = 000. Now whatever the second player does, the first player can always restore XOR = 0, guaranteeing the second player eventually faces all-empty piles.

Why XOR works: the balance argument. When XOR = 0, every move must change at least one bit in the modified pile, which necessarily makes XOR nonzero. You cannot fix this with a single-pile change, because touching only one pile cannot return all bit columns to even parity when they already were. Conversely, from any nonzero XOR state, a winning move always exists: find the highest set bit in the XOR, pick any pile that has that bit set, and reduce it to the value that zeroes out the XOR. That reduced value is strictly smaller than the original pile size, making it a legal move.

The Sprague-Grundy Theorem

The Sprague-Grundy theorem generalizes Nim to any impartial game: every game position is equivalent to a single Nim pile of some size, called its Grundy number (or nimber). This means two things in practice:

  • You can replace any sub-game with a virtual Nim pile of the corresponding Grundy size.
  • In a compound game (several independent sub-games played simultaneously), XOR the Grundy numbers of all sub-games. If the composite XOR is nonzero, the current player wins.

The Grundy number of a position is defined recursively as the mex (minimum excludant) of the Grundy numbers of all positions reachable in one move. The mex of a set is the smallest non-negative integer not in that set. For example, mex({0, 1, 3}) = 2.

Terminal positions have no moves, so their reachable set is empty and mex({}) = 0. This confirms that terminal positions are P-positions (Grundy 0), consistent with the earlier classification.

Computing Grundy Values

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
from functools import lru_cache

def solve_game(initial_state):
    @lru_cache(maxsize=None)
    def grundy(state):
        reachable = set()
        for next_state in get_moves(state):
            reachable.add(grundy(next_state))
        # mex: smallest non-negative integer not in reachable
        g = 0
        while g in reachable:
            g += 1
        return g

    return grundy(initial_state) != 0

The get_moves function is problem-specific. For a subtraction game where you can remove 1, 2, or 3 stones from a pile of size n, you would yield states n-1, n-2, and n-3 (clamped at zero).

Example, subtraction game with allowed moves {1, 2, 3}:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
from functools import lru_cache

def subtraction_grundy(n: int, allowed: frozenset = frozenset({1, 2, 3})) -> int:
    @lru_cache(maxsize=None)
    def g(x: int) -> int:
        if x == 0:
            return 0
        reachable = {g(x - k) for k in allowed if x - k >= 0}
        mex = 0
        while mex in reachable:
            mex += 1
        return mex
    return g(n)

# g(0)=0, g(1)=1, g(2)=2, g(3)=3, g(4)=0, g(5)=1, ...
# The sequence is periodic with period 4: Grundy(n) = n % 4

Many subtraction games produce periodic Grundy sequences, which lets you handle pile sizes too large to enumerate directly.

Multi-Game Composition

When a problem presents several independent sub-games played in parallel, compute a Grundy number for each sub-game and XOR them all together. This is the most powerful application of the theorem: it decomposes a complex board into small, independently solvable pieces.

1
2
3
4
5
def multi_game_winner(sub_game_states: list) -> bool:
    composite_xor = 0
    for state in sub_game_states:
        composite_xor ^= grundy(state)
    return composite_xor != 0

Stone Game (LeetCode 877)

Two players take turns picking stones from either end of a row. Each stone has a value, and the player with the higher total wins. LeetCode 877. Stone Game guarantees an even number of piles and that the total is odd (so no ties).

This is not a standard impartial game. It is a two-player zero-sum game with optimal play. While you can solve it with interval DP (dp[i][j] = max score advantage the current player can achieve from piles i..j), there is also a clever game theory insight: the first player can always win by choosing to take all even-indexed or all odd-indexed piles.

The DP approach is more general and works when the clever trick does not apply (e.g., when the number of piles is odd, or in Stone Game II/III variants):

1
2
3
4
5
6
7
8
9
10
11
12
def stoneGame(piles):
    n = len(piles)
    # dp[i][j] = max score difference (current player - opponent) for piles[i..j]
    dp = [[0] * n for _ in range(n)]
    for i in range(n):
        dp[i][i] = piles[i]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            dp[i][j] = max(piles[i] - dp[i + 1][j],
                           piles[j] - dp[i][j - 1])
    return dp[0][n - 1] > 0

The key recurrence: if you take the left pile, you gain piles[i] but then your opponent plays optimally on the remaining range, achieving advantage dp[i+1][j], which is against you, hence the subtraction. This "negamax" formulation is a common pattern in two-player game DP.

Can I Win (LeetCode 464)

LeetCode 464. Can I Win asks: two players take turns choosing from integers 1 to maxChoosableInteger, without replacement. The first player to push the running total to desiredTotal or above wins. Can the first player force a win?

This combines game theory with bitmask DP. The state is a bitmask of which integers have been used, and each position is analyzed as a game state:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
from functools import lru_cache

def canIWin(maxChoosableInteger, desiredTotal):
    if maxChoosableInteger * (maxChoosableInteger + 1) // 2 < desiredTotal:
        return False  # impossible to reach total
    if desiredTotal <= 0:
        return True

    @lru_cache(maxsize=None)
    def can_win(used_mask, remaining):
        for i in range(1, maxChoosableInteger + 1):
            if used_mask & (1 << i):
                continue
            if i >= remaining:
                return True  # this pick wins immediately
            # If the opponent cannot win after this move, then this move wins
            if not can_win(used_mask | (1 << i), remaining - i):
                return True
        return False  # no winning move exists

    return can_win(0, desiredTotal)

This is a direct application of game state analysis: a position is winning if there exists at least one move that leads to a losing position for the opponent.

Complexity

Grundy computation runs in O(S×M)O(S \times M) time, where SS is the number of distinct game states and MM is the maximum number of moves from any state. Space is O(S)O(S) for the memoization table. For classic Nim, the XOR shortcut runs in O(n)O(n) time and O(1)O(1) space, where nn is the number of piles, with no state enumeration needed.

The practical challenge is recognizing when Sprague-Grundy applies (impartial game, normal play convention) and correctly defining the state representation. Once those two pieces are in place, the algorithm is mechanical.

Recognizing this pattern

You're probably looking at game theory / Nim / Sprague-Grundy when:

  • Two players alternate, both play optimally, and the question is who wins (or who takes the last move).
  • The game is impartial, so both players have the same set of legal moves from any state.
  • The position decomposes into independent sub-games whose outcomes XOR together (multi-pile Nim, stone-game variants).
  • Brute-force minimax is exponential, but the state space has a clean numeric structure (a Grundy number per state).

Common templates:

  • Single-pile / parity classification: winner determined by n mod k or similar. Example: Nim Game.
  • Classic Nim XOR: XOR pile sizes; first player wins iff the XOR is non-zero. Example: Stone Game IV.
  • Memoized minimax over states: DP on (remaining, turn) for small state spaces. Example: Stone Game.
  • Grundy number computation: mex of reachable positions, then XOR across independent components. Example: Cat and Mouse.