Backtracking

Word Search on Grid

DFS on a 2D grid following the characters of a target word. Mark cells as visited before recursing and restore them after to allow other paths to use the cell.

Word Search on a Grid

You are given a 2D grid of characters and a word. Can you find the word by starting at any cell and following a path through adjacent cells (up, down, left, right), using each cell at most once? This is LeetCode 79, Word Search, and it is one of the cleanest examples of backtracking on a grid.

The problem feels like it should be simple, just walk through the grid matching characters, but the constraint that each cell can only be used once per path turns it into a genuine backtracking problem. You cannot just scan; you must explore, hit dead ends, undo your steps, and try again.

DFS on a Grid

The approach starts with a scan of the entire grid looking for cells that match the first character of the word. Each such cell is a potential starting point. From each starting point, you launch a depth-first search: if the current cell matches the current character of the word, move to a neighbor and try to match the next character. If you match the entire word, return true. If you reach a dead end, meaning no neighbor matches the next character or all neighbors have already been visited on this path, backtrack.

The DFS function takes the current cell coordinates (r, c) and the index k into the word. The base case is k == len(word), meaning you have matched every character. Before doing any work, you check three conditions: the cell is within bounds, the cell has not been visited on the current path, and the character at this cell equals word[k]. If any condition fails, return false immediately.

The Visited Constraint and the In-Place Marking Trick

Each cell can be used at most once in a single path. The standard way to track this is a visited set, but there is a slicker approach: temporarily replace the character in the grid with a sentinel value like "#". When the DFS function enters a cell, it saves the character, overwrites it with "#", recurses into neighbors, and then restores the original character on the way back.

This avoids allocating a separate visited set and is slightly faster in practice because you are checking board[r][c] != word[k] anyway, and a cell marked with "#" will automatically fail the character match for any real word character. Two birds, one stone.

Walking Through an Example

Consider this grid and the word "ABCCED":

1

2

3

A B C E
S F C S
A D E E

We scan for cells matching 'A'. There are two: (0,0) and (2,0). Start DFS at (0,0) with k=0.

Cell (0,0) = 'A' matches word[0]. Mark it as '#'. Now we need 'B' (k=1). Check neighbors: (1,0) = 'S', no. (0,1) = 'B', yes.

Cell (0,1) = 'B' matches word[1]. Mark it. Need 'C' (k=2). Neighbors: (0,0) = '#' (visited), no. (1,1) = 'F', no. (0,2) = 'C', yes.

Cell (0,2) = 'C' matches word[2]. Mark it. Need 'C' (k=3). Neighbors: (0,1) = '#', no. (1,2) = 'C', yes. (0,3) = 'E', no.

Cell (1,2) = 'C' matches word[3]. Mark it. Need 'E' (k=4). Neighbors: (0,2) = '#', no. (2,2) = 'E', yes. (1,1) = 'F', no. (1,3) = 'S', no.

Cell (2,2) = 'E' matches word[4]. Mark it. Need 'D' (k=5). Neighbors: (2,1) = 'D', yes.

Cell (2,1) = 'D' matches word[5]. k+1 = 6 = len(word). All characters matched. Return true.

The path is (0,0) → (0,1) → (0,2) → (1,2) → (2,2) → (2,1), spelling "ABCCED". Each cell was used exactly once.

Code: Word Search

def exist(board, word):
    rows, cols = len(board), len(board[0])

    def dfs(r, c, k):
        if k == len(word):
            return True
        if r < 0 or r >= rows or c < 0 or c >= cols:
            return False
        if board[r][c] != word[k]:
            return False

        tmp = board[r][c]
        board[r][c] = "#"
        found = any(
            dfs(r + dr, c + dc, k + 1)
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
        )
        board[r][c] = tmp
        return found

    return any(
        dfs(r, c, 0)
        for r in range(rows)
        for c in range(cols)
    )

A few details worth noting. The bounds check and character check happen before any marking, so you never mark a cell that does not match. The any() generator short-circuits: as soon as one neighbor path finds the word, it stops exploring the others. And the character restoration on line board[r][c] = tmp happens regardless of whether the search succeeded, because you are restoring after the recursive call returns.

Why $O(m \cdot n \cdot 3^L)$ ?

The time complexity is $O(m \cdot n \cdot 3^L)$ , where m and n are the grid dimensions and L is the length of the word. Here is the reasoning.

You try starting from each of the $m \times n$ cells. From each cell, the DFS branches into at most 4 neighbors at the first level, but at every subsequent level it branches into at most 3, because you came from one neighbor and that cell is now marked as visited. So the search tree from each starting cell has at most $4 \cdot 3^{L-1}$ nodes, which is $O(3^L)$ . Multiplied across all starting cells, the total is $O(m \cdot n \cdot 3^L)$ .

In practice, it is much faster. The character match check prunes most branches immediately. If the word starts with 'Z' and your grid has only two Z's, you launch DFS from just two cells instead of m * n.

Pruning: Check Before You Recurse

A subtle but important optimization: check the character match before recursing, not after. The code above does this correctly: the board[r][c] != word[k] check is the third line of the DFS function, before any recursive calls. This means you never push a call onto the stack for a cell that does not match. In a grid where most characters differ from the next character you need, this eliminates the vast majority of work.

Some implementations check the character match inside the loop over neighbors. That also works but is slightly less clean because you have to handle the starting cell separately. The "check at the top of DFS" pattern is simpler and equally efficient.

Word Search II: Searching for Multiple Words

Word Search II (LeetCode 212) asks you to find all words from a given list that appear in the grid. A naive approach would run the single-word search for each word, but if the word list is large, that is wasteful, because each DFS from a given cell re-explores the same paths.

The key insight is to search for all words simultaneously using a trie. You build a trie from the word list, then run a single DFS through the grid. At each cell, instead of matching against a specific word index, you check whether the current character leads to a child in the trie. If it does not, you prune that branch, because no word in the list could possibly extend through this cell along this path.

When you reach a trie node that marks the end of a word, you add that word to the results. You can continue searching from that node, because longer words might share the same prefix.

class TrieNode:
    def __init__(self):
        self.children = {}
        self.word = None

def find_words(board, words):
    root = TrieNode()
    for w in words:
        node = root
        for ch in w:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.word = w

    rows, cols = len(board), len(board[0])
    result = []

    def dfs(r, c, node):
        if r < 0 or r >= rows or c < 0 or c >= cols:
            return
        ch = board[r][c]
        if ch not in node.children:
            return

        child = node.children[ch]
        if child.word is not None:
            result.append(child.word)
            child.word = None  # avoid duplicates

        board[r][c] = "#"
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            dfs(r + dr, c + dc, child)
        board[r][c] = ch

        # Prune: remove leaf nodes from the trie
        if not child.children:
            del node.children[ch]

    for r in range(rows):
        for c in range(cols):
            dfs(r, c, root)

    return result

The trie pruning line at the bottom (if not child.children: del node.children[ch]) is a powerful optimization. Once a trie branch has yielded all its words and has no remaining children, there is no reason to ever traverse it again. Removing it from the trie prevents future DFS paths from wasting time on exhausted branches. Without this pruning, Word Search II can time out on large inputs.

Setting child.word = None after finding a word prevents the same word from being added to the result multiple times, in case the word appears along different paths in the grid.

Complexity Analysis

For the single word search, time complexity is $O(m \cdot n \cdot 3^L)$ as discussed above. Space complexity is $O(L)$ for the recursion stack, where L is the length of the word. The in-place marking avoids additional space for a visited set.

For Word Search II, let W be the number of words and L be the maximum word length. Building the trie takes $O(W \cdot L)$ time. The DFS visits each cell and can explore paths of length up to L, so the search is $O(m \cdot n \cdot 3^L)$ in the worst case, but the trie pruning makes it dramatically faster in practice. Space complexity is $O(W \cdot L)$ for the trie, plus $O(L)$ for the recursion stack.

The trie-based approach shines when the word list is large. Instead of running $m \cdot n \cdot 3^L$ work per word, you run the search once and let the trie guide you to all words simultaneously.

Recognizing this pattern

You're probably looking at grid word search backtracking when:

You need to find a word (or words) by walking in 4 directions on a 2D board.
A cell cannot be reused within the same path; re-visit is prohibited.
The board is moderate in size, but path length L is small relative to total cells.
Phrasing mentions "adjacent cells", "horizontally or vertically neighboring", or "sequence of letters on the board".

Common templates:

DFS with in-place marking: temporarily overwrite the cell, recurse in 4 directions, restore. Example: Word Search.
Trie-guided multi-word DFS: walk the board and trie together; prune exhausted branches. Example: Word Search II.
Path enumeration with visited set: collect every cell along the way and record full paths. Example: Longest Increasing Path in a Matrix.
Boundary-anchored flood: start DFS only from edges to handle "reach the border" variants. Example: Surrounded Regions.

Practice Problems (2)

⊣ 79 Word SearchMedium

⊣ 212 Word Search IIHard