Graph
Connect all vertices with minimum total edge weight. Kruskal's sorts edges and greedily adds the lightest non-cycle edge using Union-Find. Prim's grows the tree from a vertex using a min-heap.
You are given an array of points in the 2D plane. The cost of connecting two points is their Manhattan distance. Find the minimum cost to connect all points so that there is a path between every pair. This is LeetCode 1584. Min Cost to Connect All Points.
At first glance you might consider connecting every point to its nearest neighbor, but that can leave disconnected clusters. You might try connecting everything to a single central hub, but that ignores cheaper local connections. What you actually need is a Minimum Spanning Tree.
Given a connected, weighted, undirected graph with vertices and edges, a spanning tree is a subset of edges that connects all vertices without forming any cycles. It always has exactly edges. A minimum spanning tree (MST) is the spanning tree whose total edge weight is as small as possible.
Key properties of an MST:
There are two classic algorithms for finding an MST: Kruskal's and Prim's. Both rely on the cut property: for any cut that partitions the vertices into two groups, the minimum-weight edge crossing the cut is safe to include in the MST.
Kruskal's algorithm takes an edge-centric approach. Sort all edges by weight from lightest to heaviest. Then iterate through them in order: for each edge, add it to the MST if it connects two previously disconnected components. If both endpoints are already in the same component, skip it: adding it would create a cycle.
The key question is how to efficiently check whether two nodes are already connected. This is exactly the problem Union-Find solves. Each union operation merges two components, and each find operation determines which component a node belongs to, both in nearly amortized time.
Consider 4 points: A(0,0), B(2,2), C(3,1), D(1,3). The edges with Manhattan distances are:
| Edge | Distance | ||||
|---|---|---|---|---|---|
| A-B | $ | 0-2 | + | 0-2 | = 4$ |
| A-C | $ | 0-3 | + | 0-1 | = 4$ |
| A-D | $ | 0-1 | + | 0-3 | = 4$ |
| B-C | $ | 2-3 | + | 2-1 | = 2$ |
| B-D | $ | 2-1 | + | 2-3 | = 2$ |
| C-D | $ | 3-1 | + | 1-3 | = 4$ |
Sorted edges by weight: B-C (2), B-D (2), A-B (4), A-C (4), A-D (4), C-D (4).
Start with 4 separate components: {A}, {B}, {C}, {D}.
Step 1: Edge B-C, weight 2. B and C are in different components. Add it. Components: {A}, {B, C}, {D}. MST cost: 2.
Step 2: Edge B-D, weight 2. B is in {B, C}, D is in {D}. Different components. Add it. Components: {A}, {B, C, D}. MST cost: 4.
Step 3: Edge A-B, weight 4. A is in {A}, B is in {B, C, D}. Different components. Add it. Components: {A, B, C, D}. MST cost: 8.
We now have edges and all vertices are connected. Done. Total MST cost: 8.
The remaining edges (A-C, A-D, C-D) are all skipped: they would connect vertices already in the same component.
class UnionFind:
def __init__(self, n):
self.parent = list(range(n))
self.rank = [0] * n
def find(self, x):
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x])
return self.parent[x]
def union(self, x, y):
px, py = self.find(x), self.find(y)
if px == py:
return False
if self.rank[px] < self.rank[py]:
px, py = py, px
self.parent[py] = px
if self.rank[px] == self.rank[py]:
self.rank[px] += 1
return True
def minCostConnectPoints(points):
n = len(points)
edges = []
for i in range(n):
for j in range(i + 1, n):
dist = abs(points[i][0] - points[j][0]) + abs(points[i][1] - points[j][1])
edges.append((dist, i, j))
edges.sort()
uf = UnionFind(n)
cost = 0
edges_used = 0
for weight, u, v in edges:
if uf.union(u, v):
cost += weight
edges_used += 1
if edges_used == n - 1:
break
return costWe generate all edges, sort them, and greedily add them using Union-Find. The early termination when edges_used == n - 1 is an optimization: the MST is complete once we have edges.
Time complexity: where is the number of edges. Sorting dominates. For the complete graph in the points problem, , so the total is .
Space complexity: for storing the edge list and Union-Find.
Prim's algorithm takes a vertex-centric approach. Start from any vertex and maintain a growing tree. At each step, find the cheapest edge that connects a vertex in the tree to a vertex outside the tree, and add that edge. Repeat until all vertices are included.
This is implemented with a min-heap. Push all edges from the starting vertex onto the heap. Pop the cheapest edge. If it leads to a vertex already in the tree, skip it. Otherwise, add that vertex to the tree and push all its edges to non-tree vertices onto the heap.
Using the same 4 points: A(0,0), B(2,2), C(3,1), D(1,3).
Start from vertex A. Mark A as visited. Push A's edges onto the heap: (4, B), (4, C), (4, D).
Step 1: Pop (4, B). B is not visited. Mark B as visited. MST cost: 4. Push B's edges to unvisited vertices: (2, C), (2, D). Heap now contains: (2, C), (2, D), (4, C), (4, D).
Step 2: Pop (2, C). C is not visited. Mark C as visited. MST cost: 6. Push C's edges to unvisited vertices: (4, D). Heap now contains: (2, D), (4, D), (4, C), (4, D).
Step 3: Pop (2, D). D is not visited. Mark D as visited. MST cost: 8. All vertices visited. Done.
Total MST cost: 8, the same result as Kruskal's, as expected.
import heapq
def minCostConnectPoints(points):
n = len(points)
if n <= 1:
return 0
visited = [False] * n
heap = [(0, 0)] # (cost, node)
total_cost = 0
edges_used = 0
while heap and edges_used < n:
cost, u = heapq.heappop(heap)
if visited[u]:
continue
visited[u] = True
total_cost += cost
edges_used += 1
for v in range(n):
if not visited[v]:
dist = abs(points[u][0] - points[v][0]) + abs(points[u][1] - points[v][1])
heapq.heappush(heap, (dist, v))
return total_costWe start by pushing the source node with cost 0. Each time we pop a node and mark it visited, we push edges to all unvisited neighbors. Stale heap entries (where the node is already visited) are skipped.
Time complexity: for the complete graph case. Each of the vertices, when added to the tree, pushes up to edges onto the heap. Each heap operation is .
Space complexity: for the heap in the worst case, plus for the visited array.
For sparse graphs given as adjacency lists, Prim's adapts naturally:
def prim_mst(graph, n):
visited = [False] * n
heap = [(0, 0)] # (weight, node)
total_cost = 0
edges_used = 0
while heap and edges_used < n:
weight, u = heapq.heappop(heap)
if visited[u]:
continue
visited[u] = True
total_cost += weight
edges_used += 1
for v, w in graph[u]:
if not visited[v]:
heapq.heappush(heap, (w, v))
return total_costFor sparse graphs, the time complexity is since we push at most edges onto the heap.
Kruskal's works best for sparse graphs where is close to . It sorts all edges upfront and processes them one by one. If the graph is given as an edge list, Kruskal's is natural: no need to build an adjacency list. The sorting step dominates, and with few edges this is fast.
Prim's works best for dense graphs where is close to . In a dense graph, Kruskal's spends just sorting all the edges. Prim's with an adjacency matrix and no heap can run in by scanning for the minimum edge at each step, which is optimal for dense graphs.
For the Min Cost to Connect All Points problem specifically, the graph is complete (), making it a dense graph. Both algorithms work, but Prim's avoids the memory cost of storing all edges explicitly.
| Factor | Kruskal's | Prim's |
|---|---|---|
| Best graph type | Sparse () | Dense () |
| Time complexity | with heap | |
| Input format | Edge list | Adjacency list/matrix |
| Data structure | Union-Find | Min-heap |
| Can detect cycles | Yes (Union-Find) | No |
For LeetCode 1135. Connecting Cities With Minimum Cost, you are given cities and a list of weighted edges. Find the minimum cost to connect all cities. This is a direct MST problem: apply Kruskal's or Prim's and return the total weight. If the MST cannot connect all cities (the graph is disconnected), return -1.
A harder variant is LeetCode 1489. Find Critical and Pseudo-Critical Edges in Minimum Spanning Tree, which asks you to classify each edge. A critical edge appears in every MST: removing it increases the MST weight (or disconnects the graph). A pseudo-critical edge appears in at least one MST but not all.
The approach: first compute the MST weight. For each edge, test if removing it raises the MST weight (critical). If not, test if forcing the edge in (include it first, then complete the MST with Kruskal's) still achieves the same MST weight (pseudo-critical).
MST algorithms share DNA with other graph algorithms. Prim's is structurally identical to Dijkstra's: both use a min-heap to greedily expand a frontier. The difference is what they minimize: Dijkstra minimizes distance from source, while Prim's minimizes edge weight to the tree. Kruskal's relies on Union-Find, the same structure used for dynamic connectivity and cycle detection.
The Minimum Spanning Tree connects all vertices at minimum total cost. Kruskal's sorts edges and greedily adds them using Union-Find. Prim's grows the tree from a seed vertex using a min-heap. Both produce optimal results. Choose Kruskal's for sparse graphs or edge-list inputs, Prim's for dense graphs or adjacency representations. Both run efficiently enough for interview problems: for Kruskal's, for Prim's with a heap.
You're probably looking at a minimum spanning tree when:
Common templates: