← back

Graph

Tarjan's SCC / Bridges / Articulation Points

DFS with discovery time and low-link values to find strongly connected components, bridges (critical edges), and articulation points (critical nodes) in a single O(V+E) pass.

Tarjan's Algorithm: Bridges, Articulation Points, and Strongly Connected Components

Critical Connections in a Network

You are given a network of n servers numbered 0 to n-1 and a list of connections between them. A critical connection is an edge whose removal would disconnect some pair of servers. Find all such critical connections.

This is LeetCode 1192 (Critical Connections in a Network), and it is the defining problem for bridge-finding in undirected graphs. A bridge is an edge that, if removed, increases the number of connected components. In the server network analogy, removing a bridge means at least one server can no longer reach another. The naive approach would be to remove each edge, run a BFS/DFS to check connectivity, and put it back, giving O(E(V+E))O(E \cdot (V + E)) time. Tarjan's algorithm finds every bridge in a single O(V+E)O(V + E) pass.

The closely related concept is an articulation point (also called a cut vertex): a node whose removal, along with all its incident edges, disconnects the graph. Bridges and articulation points are both about identifying structural vulnerabilities in a graph, and Tarjan's algorithm handles both using the same underlying machinery.

The DFS Tree, Discovery Time, and Low-Link Values

When you run DFS on an undirected graph starting from any node, the traversal visits every node exactly once and uses a subset of the edges to reach them. These edges form a spanning tree of the graph, called the DFS tree. The remaining edges, the ones not used by the traversal, connect a node to one of its ancestors in the DFS tree. These are called back edges.

This is a fundamental property of undirected graphs: every non-tree edge in a DFS is a back edge. There are no cross edges (edges connecting nodes in different subtrees) because if such an edge existed, DFS would have used it to discover the other subtree before backtracking.

Tarjan's algorithm assigns two values to each node during the DFS:

Discovery time (disc[u]): a counter that records when node u was first visited. The first node visited gets disc 0, the next gets 1, and so on. Discovery times give a total ordering of when nodes were reached.

Low-link value (low[u]): the smallest discovery time reachable from u's subtree using tree edges going down and at most one back edge going up. In other words, low[u] tells you how far back toward the root of the DFS tree you can reach from the subtree rooted at u by following any number of tree edges downward and then taking a single back edge upward.

Initially, when we first visit node u, we set low[u] = disc[u]. As the DFS proceeds, low[u] gets updated in two ways:

When we traverse a tree edge from u to an unvisited child v, we recurse into v, and after returning, we set low[u] = min(low[u], low[v]). This propagates information upward: if v's subtree can reach far back via a back edge, then u inherits that reachability.

When we encounter a back edge from u to an already-visited node v (and v is not u's parent in the DFS tree), we set low[u] = min(low[u], disc[v]). This is the direct case: u itself has a back edge reaching up to v.

The Bridge Condition

An edge (u, v) is a bridge if and only if low[v] > disc[u], where v is a child of u in the DFS tree.

Think about what this means. If low[v] > disc[u], then the smallest discovery time reachable from v's entire subtree (via back edges) is strictly greater than u's discovery time. That means no node in v's subtree has a back edge reaching u or anything above u. The only way to get from v's subtree to u is through the edge (u, v) itself. Remove that edge, and v's subtree is completely cut off from the rest of the graph.

Conversely, if low[v] <= disc[u], then some node in v's subtree has a back edge reaching u or above. Even if you remove (u, v), there is still a path from v's subtree to u through that back edge. The edge (u, v) is not a bridge.

Walking Through a 5-Node Example

Consider a graph with 5 nodes (0 through 4) and edges: 0-1, 1-2, 2-0, 1-3, 3-4.

Nodes 0, 1, 2 form a triangle, and then there is a chain from 1 to 3 to 4 branching off.

Start DFS at node 0:

Visit node 0: disc[0] = 0, low[0] = 0. Go to neighbor 1 (tree edge).

Visit node 1: disc[1] = 1, low[1] = 1. Go to neighbor 2 (tree edge).

Visit node 2: disc[2] = 2, low[2] = 2. Neighbor 0 is visited and is not parent (1), so this is a back edge. Update low[2] = min(2, disc[0]) = min(2, 0) = 0. Neighbor 1 is the parent, skip. Return to node 1.

Back at node 1: update low[1] = min(low[1], low[2]) = min(1, 0) = 0. Check bridge: low[2] = 0, disc[1] = 1. Since 0 <= 1, edge 1-2 is not a bridge. Correct, because the triangle provides an alternate path.

Go to neighbor 3 (tree edge).

Visit node 3: disc[3] = 3, low[3] = 3. Go to neighbor 4 (tree edge).

Visit node 4: disc[4] = 4, low[4] = 4. No unvisited neighbors. Return to node 3.

Back at node 3: update low[3] = min(low[3], low[4]) = min(3, 4) = 3. Check bridge: low[4] = 4, disc[3] = 3. Since 4 > 3, edge 3-4 is a bridge. Return to node 1.

Back at node 1: update low[1] = min(low[1], low[3]) = min(0, 3) = 0. Check bridge: low[3] = 3, disc[1] = 1. Since 3 > 1, edge 1-3 is a bridge. Return to node 0.

Back at node 0: update low[0] = min(low[0], low[1]) = min(0, 0) = 0. Check bridge: low[1] = 0, disc[0] = 0. Since 0 <= 0, edge 0-1 is not a bridge.

Final result: bridges are (3, 4) and (1, 3). These are the two edges whose removal would disconnect the graph. The triangle 0-1-2 is redundant (each edge has an alternate path), but the chain 1-3-4 has no redundancy.

Code for Finding Bridges

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
def find_bridges(n, edges):
    from collections import defaultdict
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        graph[v].append(u)

    disc = [-1] * n
    low = [-1] * n
    bridges = []
    timer = [0]

    def dfs(u, parent):
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        for v in graph[u]:
            if disc[v] == -1:
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:
                    bridges.append((u, v))
            elif v != parent:
                low[u] = min(low[u], disc[v])

    for i in range(n):
        if disc[i] == -1:
            dfs(i, -1)
    return bridges

The structure is a standard DFS with the two-update rule for low values. After recursing into a child v, we check the bridge condition low[v] > disc[u]. The parent check v != parent ensures we do not treat the edge we arrived on as a back edge.

Articulation Points

A node u is an articulation point if removing u (and all its edges) disconnects the graph. The condition is similar to bridges but subtly different.

For a non-root node u with DFS-tree child v: u is an articulation point if low[v] >= disc[u]. Notice the >= rather than the strict > used for bridges. If low[v] == disc[u], that means v's subtree can reach back to u itself via a back edge but cannot reach above u. Removing u still disconnects v's subtree from u's ancestors, so u is an articulation point. For bridges, the edge (u, v) is not a bridge in that case because v's subtree can still reach u through the back edge, meaning the edge (u, v) has a redundant path.

The root of the DFS tree is a special case. The root has no parent, so the low-link condition does not apply in the usual way. Instead, the root is an articulation point if and only if it has two or more children in the DFS tree. If the root has only one child, removing it just detaches itself. If it has two or more children, those children are in separate subtrees (there cannot be edges between subtrees in a DFS tree), so removing the root disconnects them.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
def find_articulation_points(n, edges):
    from collections import defaultdict
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        graph[v].append(u)

    disc = [-1] * n
    low = [-1] * n
    is_ap = [False] * n
    timer = [0]

    def dfs(u, parent):
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        children = 0
        for v in graph[u]:
            if disc[v] == -1:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if parent == -1 and children > 1:
                    is_ap[u] = True
                if parent != -1 and low[v] >= disc[u]:
                    is_ap[u] = True
            elif v != parent:
                low[u] = min(low[u], disc[v])

    for i in range(n):
        if disc[i] == -1:
            dfs(i, -1)
    return [i for i in range(n) if is_ap[i]]

The code counts DFS-tree children for the root and applies the low[v] >= disc[u] condition for non-root nodes.

Strongly Connected Components in Directed Graphs

In directed graphs, the analogous concept is strongly connected components (SCCs). A strongly connected component is a maximal set of vertices such that there is a directed path from every vertex in the set to every other vertex. Tarjan's SCC algorithm uses the same disc/low machinery but adapted for directed graphs.

We maintain a stack of nodes currently on the DFS path. Push every node onto the stack when it is first visited. When we finish processing a node u and find that low[u] == disc[u], u is the root of an SCC: pop everything off the stack down to and including u, and those nodes form one component.

The key difference from the undirected case is how cross/back edges work. In directed graphs, we only update low[u] = min(low[u], disc[v]) if v is currently on the stack. If v has already been assigned to a completed SCC, the edge crosses into a finished component and must be ignored. Otherwise low values from finished SCCs would leak into the current one.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
def tarjan_scc(n, graph):
    disc = [-1] * n
    low = [0] * n
    on_stack = [False] * n
    stack = []
    sccs = []
    timer = [0]

    def dfs(u):
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        stack.append(u)
        on_stack[u] = True
        for v in graph[u]:
            if disc[v] == -1:
                dfs(v)
                low[u] = min(low[u], low[v])
            elif on_stack[v]:
                low[u] = min(low[u], disc[v])
        if low[u] == disc[u]:
            comp = []
            while True:
                w = stack.pop()
                on_stack[w] = False
                comp.append(w)
                if w == u:
                    break
            sccs.append(comp)

    for i in range(n):
        if disc[i] == -1:
            dfs(i)
    return sccs

Kosaraju's algorithm is an alternative that is often easier to remember: run DFS on the original graph and push nodes onto a stack in finish-time order, then pop nodes off the stack and run DFS on the reversed graph from each unvisited pop. Each second-pass DFS discovers exactly one SCC. Both algorithms run in O(V+E)O(V + E).

The Parallel Edges Gotcha

When the graph can have multiple edges between the same pair of nodes (parallel or multi-edges), tracking the parent by node identity breaks the algorithm. Consider two edges between nodes u and v. When DFS goes from u to v using the first edge, it should be allowed to consider the second edge from v to u as a back edge (since removing one edge still leaves the other). But the simple v != parent check would skip both edges, incorrectly treating the second edge as the parent edge too.

The fix is to track parent by edge index rather than by node. Give each undirected edge a unique index. When you traverse from u to v using edge i, pass i as the parent edge identifier. At v, skip edge i specifically, but allow any other edge to u to be treated normally. This correctly handles parallel edges.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
def find_bridges_multi(n, edges):
    from collections import defaultdict
    graph = defaultdict(list)
    for idx, (u, v) in enumerate(edges):
        graph[u].append((v, idx))
        graph[v].append((u, idx))

    disc = [-1] * n
    low = [-1] * n
    bridges = []
    timer = [0]

    def dfs(u, parent_edge):
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        for v, edge_idx in graph[u]:
            if edge_idx == parent_edge:
                continue
            if disc[v] == -1:
                dfs(v, edge_idx)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:
                    bridges.append((u, v))
            else:
                low[u] = min(low[u], disc[v])

    for i in range(n):
        if disc[i] == -1:
            dfs(i, -1)
    return bridges

Complexity Analysis

All variants of Tarjan's algorithm run in O(V+E)O(V + E) time. The DFS visits each node once and examines each edge once (twice for undirected graphs, once from each endpoint). The space complexity is O(V+E)O(V + E) for the adjacency list, the disc and low arrays, and the recursion stack. In the worst case (a graph shaped like a long chain), the recursion depth is O(V)O(V), so the call stack can be that deep. For very large graphs, an iterative DFS may be needed to avoid stack overflow.

The key insight is that a single DFS pass extracts all the structural vulnerability information about the graph. Bridges, articulation points, biconnected components, and strongly connected components can all be found in linear time by maintaining discovery times and low-link values during the traversal.

Recognizing this pattern

You're probably looking at Tarjan's algorithm when:

  • The problem asks which edges are "critical" so that removing them disconnects the network
  • You must identify articulation points or cut vertices whose removal breaks connectivity
  • The graph is directed and you need to collapse it into strongly connected components, often as a preprocessing step
  • The wording involves single points of failure, mutual reachability, or 2-edge-connected components

Common templates:

  • Bridge-finding DFS: track discovery time and low-link, classify edge (u, v) as a bridge when low[v]>disc[u]low[v] > disc[u]. Example: Critical Connections in a Network.
  • Articulation point DFS: same disc/low machinery, with a child count check at the root and low[v]disc[u]low[v] \geq disc[u] elsewhere. Example: Critical Connections in a Network.
  • SCC via Tarjan with stack: push nodes on a stack during DFS, pop a full SCC when low[u]=disc[u]low[u] = disc[u]. Example: Number of Good Paths.
  • Condensation DAG: collapse each SCC into a single super-node, then run topological logic on the condensed DAG. Example: Largest Color Value in a Directed Graph.