Graph

Minimum Spanning Tree (Kruskal's / Prim's)

Connect all vertices with minimum total edge weight. Kruskal's sorts edges and greedily adds the lightest non-cycle edge using Union-Find. Prim's grows the tree from a vertex using a min-heap.

Minimum Spanning Tree (Kruskal's / Prim's)

The Motivating Problem

You are given an array of points in the 2D plane. The cost of connecting two points is their Manhattan distance. Find the minimum cost to connect all points so that there is a path between every pair. This is LeetCode 1584. Min Cost to Connect All Points.

At first glance you might consider connecting every point to its nearest neighbor, but that can leave disconnected clusters. You might try connecting everything to a single central hub, but that ignores cheaper local connections. What you actually need is a Minimum Spanning Tree.

What Is a Minimum Spanning Tree?

Given a connected, weighted, undirected graph with $V$ vertices and $E$ edges, a spanning tree is a subset of edges that connects all vertices without forming any cycles. It always has exactly $V - 1$ edges. A minimum spanning tree (MST) is the spanning tree whose total edge weight is as small as possible.

Key properties of an MST:

It connects all $V$ vertices using exactly $V - 1$ edges.
It contains no cycles.
The sum of edge weights is minimized over all possible spanning trees.
The MST is not necessarily unique. If multiple edges share the same weight, there may be several valid MSTs with equal total cost.

There are two classic algorithms for finding an MST: Kruskal's and Prim's. Both rely on the cut property: for any cut that partitions the vertices into two groups, the minimum-weight edge crossing the cut is safe to include in the MST.

Kruskal's Algorithm: Sort and Greedily Add Edges

Kruskal's algorithm takes an edge-centric approach. Sort all edges by weight from lightest to heaviest. Then iterate through them in order: for each edge, add it to the MST if it connects two previously disconnected components. If both endpoints are already in the same component, skip it: adding it would create a cycle.

The key question is how to efficiently check whether two nodes are already connected. This is exactly the problem Union-Find solves. Each union operation merges two components, and each find operation determines which component a node belongs to, both in nearly $O(1)$ amortized time.

Walked Example

Consider 4 points: A(0,0), B(2,2), C(3,1), D(1,3). The edges with Manhattan distances are:

Edge	Distance
A-B	$	0-2	+	0-2	= 4$
A-C	$	0-3	+	0-1	= 4$
A-D	$	0-1	+	0-3	= 4$
B-C	$	2-3	+	2-1	= 2$
B-D	$	2-1	+	2-3	= 2$
C-D	$	3-1	+	1-3	= 4$

Sorted edges by weight: B-C (2), B-D (2), A-B (4), A-C (4), A-D (4), C-D (4).

Start with 4 separate components: {A}, {B}, {C}, {D}.

Step 1: Edge B-C, weight 2. B and C are in different components. Add it. Components: {A}, {B, C}, {D}. MST cost: 2.

Step 2: Edge B-D, weight 2. B is in {B, C}, D is in {D}. Different components. Add it. Components: {A}, {B, C, D}. MST cost: 4.

Step 3: Edge A-B, weight 4. A is in {A}, B is in {B, C, D}. Different components. Add it. Components: {A, B, C, D}. MST cost: 8.

We now have $V - 1 = 3$ edges and all vertices are connected. Done. Total MST cost: 8.

The remaining edges (A-C, A-D, C-D) are all skipped: they would connect vertices already in the same component.

Code: Kruskal's Algorithm

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        px, py = self.find(x), self.find(y)
        if px == py:
            return False
        if self.rank[px] < self.rank[py]:
            px, py = py, px
        self.parent[py] = px
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1
        return True

def minCostConnectPoints(points):
    n = len(points)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            dist = abs(points[i][0] - points[j][0]) + abs(points[i][1] - points[j][1])
            edges.append((dist, i, j))

    edges.sort()
    uf = UnionFind(n)
    cost = 0
    edges_used = 0

    for weight, u, v in edges:
        if uf.union(u, v):
            cost += weight
            edges_used += 1
            if edges_used == n - 1:
                break

    return cost

We generate all $\binom{n}{2}$ edges, sort them, and greedily add them using Union-Find. The early termination when edges_used == n - 1 is an optimization: the MST is complete once we have $V - 1$ edges.

Time complexity: $O(E \log E)$ where $E$ is the number of edges. Sorting dominates. For the complete graph in the points problem, $E = O(V^2)$ , so the total is $O(V^2 \log V)$ .

Space complexity: $O(E + V)$ for storing the edge list and Union-Find.

Prim's Algorithm: Grow the Tree from a Seed

Prim's algorithm takes a vertex-centric approach. Start from any vertex and maintain a growing tree. At each step, find the cheapest edge that connects a vertex in the tree to a vertex outside the tree, and add that edge. Repeat until all vertices are included.

This is implemented with a min-heap. Push all edges from the starting vertex onto the heap. Pop the cheapest edge. If it leads to a vertex already in the tree, skip it. Otherwise, add that vertex to the tree and push all its edges to non-tree vertices onto the heap.

Walked Example (Same Graph)

Using the same 4 points: A(0,0), B(2,2), C(3,1), D(1,3).

Start from vertex A. Mark A as visited. Push A's edges onto the heap: (4, B), (4, C), (4, D).

Step 1: Pop (4, B). B is not visited. Mark B as visited. MST cost: 4. Push B's edges to unvisited vertices: (2, C), (2, D). Heap now contains: (2, C), (2, D), (4, C), (4, D).

Step 2: Pop (2, C). C is not visited. Mark C as visited. MST cost: 6. Push C's edges to unvisited vertices: (4, D). Heap now contains: (2, D), (4, D), (4, C), (4, D).

Step 3: Pop (2, D). D is not visited. Mark D as visited. MST cost: 8. All vertices visited. Done.

Total MST cost: 8, the same result as Kruskal's, as expected.

Code: Prim's Algorithm

import heapq

def minCostConnectPoints(points):
    n = len(points)
    if n <= 1:
        return 0

    visited = [False] * n
    heap = [(0, 0)]  # (cost, node)
    total_cost = 0
    edges_used = 0

    while heap and edges_used < n:
        cost, u = heapq.heappop(heap)
        if visited[u]:
            continue
        visited[u] = True
        total_cost += cost
        edges_used += 1

        for v in range(n):
            if not visited[v]:
                dist = abs(points[u][0] - points[v][0]) + abs(points[u][1] - points[v][1])
                heapq.heappush(heap, (dist, v))

    return total_cost

We start by pushing the source node with cost 0. Each time we pop a node and mark it visited, we push edges to all unvisited neighbors. Stale heap entries (where the node is already visited) are skipped.

Time complexity: $O(V^2 \log V)$ for the complete graph case. Each of the $V$ vertices, when added to the tree, pushes up to $V - 1$ edges onto the heap. Each heap operation is $O(\log(V^2)) = O(\log V)$ .

Space complexity: $O(V^2)$ for the heap in the worst case, plus $O(V)$ for the visited array.

Prim's with Adjacency List (Sparse Graphs)

For sparse graphs given as adjacency lists, Prim's adapts naturally:

def prim_mst(graph, n):
    visited = [False] * n
    heap = [(0, 0)]  # (weight, node)
    total_cost = 0
    edges_used = 0

    while heap and edges_used < n:
        weight, u = heapq.heappop(heap)
        if visited[u]:
            continue
        visited[u] = True
        total_cost += weight
        edges_used += 1

        for v, w in graph[u]:
            if not visited[v]:
                heapq.heappush(heap, (w, v))

    return total_cost

For sparse graphs, the time complexity is $O(E \log V)$ since we push at most $E$ edges onto the heap.

When to Use Which

Kruskal's works best for sparse graphs where $E$ is close to $V$ . It sorts all edges upfront and processes them one by one. If the graph is given as an edge list, Kruskal's is natural: no need to build an adjacency list. The $O(E \log E)$ sorting step dominates, and with few edges this is fast.

Prim's works best for dense graphs where $E$ is close to $V^2$ . In a dense graph, Kruskal's spends $O(V^2 \log V)$ just sorting all the edges. Prim's with an adjacency matrix and no heap can run in $O(V^2)$ by scanning for the minimum edge at each step, which is optimal for dense graphs.

For the Min Cost to Connect All Points problem specifically, the graph is complete ( $E = V^2 / 2$ ), making it a dense graph. Both algorithms work, but Prim's avoids the memory cost of storing all $O(V^2)$ edges explicitly.

Factor	Kruskal's	Prim's
Best graph type	Sparse ( $E \approx V$ )	Dense ( $E \approx V^2$ )
Time complexity	$O(E \log E)$	$O(E \log V)$ with heap
Input format	Edge list	Adjacency list/matrix
Data structure	Union-Find	Min-heap
Can detect cycles	Yes (Union-Find)	No

Applications and Related Problems

Connecting Cities With Minimum Cost

For LeetCode 1135. Connecting Cities With Minimum Cost, you are given $n$ cities and a list of weighted edges. Find the minimum cost to connect all cities. This is a direct MST problem: apply Kruskal's or Prim's and return the total weight. If the MST cannot connect all cities (the graph is disconnected), return -1.

Critical and Pseudo-Critical Edges in MST

A harder variant is LeetCode 1489. Find Critical and Pseudo-Critical Edges in Minimum Spanning Tree, which asks you to classify each edge. A critical edge appears in every MST: removing it increases the MST weight (or disconnects the graph). A pseudo-critical edge appears in at least one MST but not all.

The approach: first compute the MST weight. For each edge, test if removing it raises the MST weight (critical). If not, test if forcing the edge in (include it first, then complete the MST with Kruskal's) still achieves the same MST weight (pseudo-critical).

Connection to Other Algorithms

MST algorithms share DNA with other graph algorithms. Prim's is structurally identical to Dijkstra's: both use a min-heap to greedily expand a frontier. The difference is what they minimize: Dijkstra minimizes distance from source, while Prim's minimizes edge weight to the tree. Kruskal's relies on Union-Find, the same structure used for dynamic connectivity and cycle detection.

Summary

The Minimum Spanning Tree connects all vertices at minimum total cost. Kruskal's sorts edges and greedily adds them using Union-Find. Prim's grows the tree from a seed vertex using a min-heap. Both produce optimal results. Choose Kruskal's for sparse graphs or edge-list inputs, Prim's for dense graphs or adjacency representations. Both run efficiently enough for interview problems: $O(E \log E)$ for Kruskal's, $O(E \log V)$ for Prim's with a heap.

Recognizing this pattern

You're probably looking at a minimum spanning tree when:

The problem asks for the minimum total cost to connect all cities, all houses, or all nodes into a single network
You need to wire up every node exactly once with no cycles, choosing the cheapest set of edges
The input is an undirected weighted graph and the output is a single number (or set of chosen edges)
The problem hints at "connect all points" or "minimum cost to merge clusters"

Common templates:

Kruskal's with Union-Find: sort edges by weight, add an edge if its endpoints are in different components. Example: Min Cost to Connect All Points.
Prim's with a min-heap: grow the tree from a seed, repeatedly pop the cheapest edge crossing the cut. Example: Connecting Cities With Minimum Cost.
MST on a complete geometric graph: derive edge weights from pairwise distances on points, then run Kruskal's or Prim's on the implicit graph. Example: Min Cost to Connect All Points.
MST with mandatory edges: pre-union the forced edges, then run Kruskal's on the rest to extend the partial tree. Example: Optimize Water Distribution in a Village.

Practice Problems (4)

⊣ 1489 Find Critical and Pseudo-Critical Edges in Minimum Spanning TreeHard

⊣ 1584 Min Cost to Connect All PointsMedium

⊣ 3547 Maximum Sum of Edge Values in a GraphHard

⊣ 3600 Maximize Spanning Tree Stability with UpgradesHard