← back

Tree

Lowest Common Ancestor (LCA)

Recursively search both subtrees for the two target nodes. If the current node is one of the targets, or if one target is in each subtree, the current node is the LCA.

Given two nodes in a binary tree, find their lowest common ancestor: the deepest node that is an ancestor of both. This problem appears frequently in interviews, both on its own and as a building block for other tree problems like computing distances between nodes.

The classic version assumes you have a pointer to the root and references to the two target nodes p and q, but the nodes do not have parent pointers. You need to find the LCA using only top-down traversal.

The recursive insight

The algorithm for LeetCode 236 is elegant. At each node, ask: is p or q here, or in my left subtree, or in my right subtree? The base case handles two situations: if the current node is null, there is nothing to find, so return null. If the current node is p or q itself, return it. You have found one of the targets, and there is no need to look deeper (if the other target is below this node, then this node is the LCA anyway).

For the recursive case, search both subtrees:

1
2
3
4
5
6
7
8
def lowest_common_ancestor(root, p, q):
    if not root or root == p or root == q:
        return root
    left = lowest_common_ancestor(root.left, p, q)
    right = lowest_common_ancestor(root.right, p, q)
    if left and right:
        return root
    return left or right

The logic after the recursive calls is where the real work happens. If both left and right return non-null values, it means p was found in one subtree and q was found in the other. The current node is the point where they split, so it must be the LCA. If only one side returns a non-null value, it means both p and q are on that side, so you propagate the result upward. The LCA is somewhere deeper, and it has already been identified by a recursive call below.

Walking through a concrete example

Consider this tree, and suppose we want the LCA of nodes 5 and 1:

1
2
3
4
5
6
7
         3
        / \
       5   1
      / \ / \
     6  2 0  8
       / \
      7   4

We call lca(3, 5, 1). The root is 3, which is neither 5 nor 1, so we recurse. The left call is lca(5, 5, 1). Node 5 matches p, so it immediately returns 5. We do not even look at its children. The right call is lca(1, 5, 1). Node 1 matches q, so it returns 1. Back at node 3, both left (5) and right (1) are non-null. The current node, 3, is the LCA.

Now suppose we want the LCA of 5 and 4. We call lca(3, 5, 4). The left call is lca(5, 5, 4) which returns 5 immediately (it matches p). The right call is lca(1, 5, 4), which searches the subtree rooted at 1 and finds neither 5 nor 4, so it returns null. Back at node 3, left is 5 and right is null. We return left, which is 5. And indeed, 5 is the LCA of 5 and 4, because 4 is a descendant of 5.

This example highlights a subtle but important point: a node can be its own ancestor. The LCA of a node and one of its descendants is the node itself.

LCA in a binary search tree

For LeetCode 235, when the tree is a BST, you can exploit the ordering to find the LCA more efficiently. The BST property guarantees that for any node, all values in the left subtree are smaller and all values in the right subtree are larger. This means you can navigate directly to the LCA without searching both subtrees.

The logic is simple: if both p and q have values less than the current node, the LCA must be in the left subtree. If both have values greater, it must be in the right subtree. The moment p and q "split", meaning one is less and the other is greater, or one of them equals the current node, you have found the LCA.

1
2
3
4
5
6
7
8
def lca_bst(root, p, q):
    while root:
        if p.val < root.val and q.val < root.val:
            root = root.left
        elif p.val > root.val and q.val > root.val:
            root = root.right
        else:
            return root

This version does not even need recursion. It follows a single path from root to the LCA, making it O(h)O(h) time with O(1)O(1) space, compared to the general version which is O(n)O(n) time and O(h)O(h) space. On a balanced BST, this is O(logn)O(\log n), significantly faster.

LCA with parent pointers

LeetCode 1650 gives you nodes that have a parent pointer instead of (or in addition to) the usual left and right pointers. With parent pointers, you do not need access to the root at all. The problem transforms into something familiar: finding the intersection of two linked lists.

Starting from p and q, you can walk upward to the root, forming two "paths." These paths converge at the LCA. The classic linked list intersection technique applies: advance two pointers, and when one reaches the root (null), redirect it to the other starting node. They will meet at the intersection point.

1
2
3
4
5
6
def lca_with_parents(p, q):
    a, b = p, q
    while a != b:
        a = a.parent if a else q
        b = b.parent if b else p
    return a

Why does this work? Suppose the path from p to the root has length m, and the path from q to the root has length k. Pointer a travels m steps to the root, then switches to q and travels k steps to the LCA. Pointer b travels k steps to the root, then switches to p and travels m steps to the LCA. Both travel m + k total steps, so they arrive at the LCA at the same time.

An alternative approach is to collect the ancestors of one node into a set, then walk upward from the other node until you find one in the set. This is O(h)O(h) time and O(h)O(h) space, while the two-pointer approach is O(h)O(h) time and O(1)O(1) space.

LCA of deepest leaves

LeetCode 1123 presents a variation: find the LCA of all nodes at the maximum depth. Instead of being given specific target nodes, you need to first determine the deepest level and then find the LCA of all the leaf nodes at that level.

The clean approach combines depth computation with LCA identification in a single traversal. For each node, return the depth of its deepest descendant. If the left and right subtrees have the same deepest depth, the current node is the LCA of the deepest leaves (they are split across both sides). If one side is deeper, the LCA of the deepest leaves must be on that side.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
def lca_deepest_leaves(root):
    def dfs(node):
        # Returns (depth, lca_of_deepest)
        if not node:
            return (0, None)
        left_depth, left_lca = dfs(node.left)
        right_depth, right_lca = dfs(node.right)
        if left_depth == right_depth:
            return (left_depth + 1, node)
        elif left_depth > right_depth:
            return (left_depth + 1, left_lca)
        else:
            return (right_depth + 1, right_lca)

    return dfs(root)[1]

When both sides tie in depth, the current node becomes the new LCA candidate because the deepest leaves are spread across both subtrees. When one side is strictly deeper, the deepest leaves are all on that side, so the LCA must be there too.

Distance between two nodes

A common follow-up question asks for the distance (number of edges) between two nodes p and q. The LCA is the key to solving this. The path from p to q must pass through their LCA: it goes up from p to the LCA, then down from the LCA to q. So the distance is:

distance(p, q) = depth(p) + depth(q) - 2 * depth(LCA(p, q))

You compute the depth of each node from the root, find the LCA, compute its depth, and plug into the formula. The factor of 2 accounts for the fact that the LCA's depth is counted in both paths (from root to p and from root to q), so you subtract it twice.

Alternatively, once you have found the LCA, you can compute the distance from the LCA to p and from the LCA to q directly, and add them. Both approaches give the same result.

Complexity analysis

General binary tree LCA: O(n)O(n) time, because in the worst case you visit every node (imagine p and q are both deep in the left subtree, so you still explore the right subtree before learning it returns null). The space is O(h)O(h) for the recursion stack.

BST LCA: O(h)O(h) time, O(1)O(1) space. You follow a single root-to-node path. On a balanced BST this is O(logn)O(\log n); on a skewed BST it degrades to O(n)O(n).

Parent pointer LCA: O(h)O(h) time, O(1)O(1) space with the two-pointer technique, or O(h)O(h) time and O(h)O(h) space with the ancestor set approach.

LCA of deepest leaves: O(n)O(n) time and O(h)O(h) space, since you must visit every node to determine the maximum depth.

In all variants, the LCA problem is fundamentally about understanding how two nodes relate to each other within the tree structure. The recursive approach decomposes this relationship beautifully: either the nodes are split across subtrees (making the current node the answer) or they are both on the same side (and the answer lies deeper). Once you internalize this split-or-propagate logic, every LCA variant becomes a straightforward extension.

Recognizing this pattern

You're probably looking at lowest common ancestor when:

  • The problem explicitly asks for the lowest (or nearest) common ancestor of two nodes in a tree.
  • You need the distance between two tree nodes, where depth(p) + depth(q) - 2·depth(lca) is the standard formula.
  • The problem asks for the deepest node satisfying a property across all leaves, or the smallest subtree containing a given set of nodes.
  • Many queries hit the same tree and you want to answer LCA in O(logn)O(\log n) each after preprocessing (binary lifting / Euler tour + RMQ).

Common templates: