Graph
O(V³) DP where for each intermediate node k, update the shortest path between every pair (i, j) through k. Used when all-pairs shortest paths are needed and V is small.
Sometimes you do not just need the shortest path from one source. You need shortest paths between every pair of nodes simultaneously. Given a network of cities, you might want to precompute the travel time between all pairs so that any query can be answered in . Or you might need to find which city can reach the fewest other cities within a distance threshold. These are all-pairs shortest path problems, and Floyd-Warshall is the classic algorithm for solving them.
An obvious approach to all-pairs shortest paths is to run Dijkstra from every node. With a binary heap, each run takes , so runs take . For dense graphs where is close to , this becomes .
Floyd-Warshall runs in with no log factor. For dense graphs, it is both simpler and faster. It also handles negative edge weights, which Dijkstra cannot. The trade-off is that Floyd-Warshall always takes regardless of how sparse the graph is. For sparse graphs with non-negative weights, running Dijkstra times can be faster.
In practice, Floyd-Warshall shines when is small (up to around 400), the graph may have negative edges, or you want clean, simple code. For interview problems with small node counts asking about all-pairs distances, Floyd-Warshall is usually the right tool.
Floyd-Warshall uses dynamic programming. The key idea is to gradually expand the set of allowed intermediate nodes.
Define dist[i][j] as the shortest path from i to j using only nodes 0 through k-1 as intermediates. Initially (k = 0), no intermediate nodes are allowed, so dist[i][j] is simply the direct edge weight from i to j (or infinity if no direct edge exists, or 0 if i equals j).
When we introduce node k as a potential intermediate, we ask: is it cheaper to go from i to j through k? The path from i to j through k consists of the path from i to k and then from k to j. So we check whether dist[i][k] + dist[k][j] is less than dist[i][j] and update accordingly.
After considering all possible intermediate nodes (k from 0 to V-1), dist[i][j] contains the true shortest path between every pair.
The most common mistake when implementing Floyd-Warshall is putting k in one of the inner loops instead of the outer one. This produces incorrect results.
The reason k must be outermost is that the recurrence depends on the subproblem structure. When we process intermediate node k, we need the distance values to already reflect the optimal paths using intermediates 0 through k-1. If k were an inner loop, we would be updating dist[i][j] for a fixed i across all intermediates in a single pass of i. This mixes up the subproblem boundaries: when we consider intermediate k for pair (i, j), the values dist[i][k] and dist[k][j] might not yet have been computed with all intermediates 0 through k-1 for those pairs.
With k as the outer loop, by the time we reach iteration k, the entire dist matrix correctly represents shortest paths using intermediates 0 through k-1. This guarantees that dist[i][k] and dist[k][j] are correct when we use them to update dist[i][j].
Think of it this way: each iteration of k is a complete "upgrade" of the distance matrix, incorporating one more possible waypoint. You must finish one upgrade before starting the next.
Consider 4 nodes (0, 1, 2, 3) with these directed edges: 0 to 1 with weight 3, 0 to 2 with weight 6, 1 to 2 with weight 2, 2 to 3 with weight 1, 3 to 0 with weight 2, 1 to 3 with weight 8.
Initialize the distance matrix. Use 0 for diagonal entries, edge weights for direct edges, and infinity (shown as INF) otherwise.
Initial matrix:
Row 0: [0, 3, 6, INF]. Direct edges to 1 and 2 exist. Row 1: [INF, 0, 2, 8]. Direct edges to 2 and 3. Row 2: [INF, INF, 0, 1]. Direct edge to 3. Row 3: [2, INF, INF, 0]. Direct edge to 0.
Iteration k = 0 (allow node 0 as intermediate). For each pair (i, j), check if going through 0 is cheaper. The relevant paths through 0 require a finite dist[i][0] and dist[0][j]. Node 3 can reach 0 (dist = 2), so paths from 3 through 0 are considered. dist[3][1] = min(INF, dist[3][0] + dist[0][1]) = min(INF, 2 + 3) = 5. dist[3][2] = min(INF, 2 + 6) = 8.
After k = 0: Row 3 becomes [2, 5, 8, 0]. All other rows unchanged.
Iteration k = 1 (allow node 1 as intermediate). Node 0 can reach 1 (dist = 3). dist[0][2] = min(6, 3 + 2) = 5. dist[0][3] = min(INF, 3 + 8) = 11. Node 3 can reach 1 (dist = 5). dist[3][2] = min(8, 5 + 2) = 7. dist[3][3] is already 0, no change.
After k = 1: Row 0 becomes [0, 3, 5, 11]. Row 3 becomes [2, 5, 7, 0].
Iteration k = 2 (allow node 2 as intermediate). Paths through 2 use dist[i][2] + dist[2][j]. dist[2][3] = 1, so any node that can reach 2 can potentially reach 3 more cheaply. dist[0][3] = min(11, 5 + 1) = 6. dist[1][3] = min(8, 2 + 1) = 3. dist[3][3] = min(0, 7 + 1) = 0, no change.
After k = 2: Row 0 becomes [0, 3, 5, 6]. Row 1 becomes [INF, 0, 2, 3].
Iteration k = 3 (allow node 3 as intermediate). dist[3][0] = 2, so paths through 3 back to 0 are considered. dist[1][0] = min(INF, 3 + 2) = 5. dist[2][0] = min(INF, 1 + 2) = 3. dist[2][1] = min(INF, 1 + 5) = 6. Checking others: dist[0][0] = min(0, 6 + 2) = 0, no change.
Final matrix:
Row 0: [0, 3, 5, 6]. Row 1: [5, 0, 2, 3]. Row 2: [3, 6, 0, 1]. Row 3: [2, 5, 7, 0].
Every entry dist[i][j] now holds the shortest distance from node i to node j. For example, the shortest path from 1 to 0 is 1 -> 2 -> 3 -> 0 with cost 2 + 1 + 2 = 5.
def floyd_warshall(n, edges):
INF = float('inf')
dist = [[INF] * n for _ in range(n)]
# Distance from every node to itself is 0
for i in range(n):
dist[i][i] = 0
# Initialize direct edge weights
for u, v, w in edges:
dist[u][v] = min(dist[u][v], w) # handle parallel edges
# k MUST be the outer loop
for k in range(n):
for i in range(n):
if dist[i][k] == INF:
continue # optimization: skip if i can't reach k
for j in range(n):
if dist[i][k] + dist[k][j] < dist[i][j]:
dist[i][j] = dist[i][k] + dist[k][j]
return distThe early continue when dist[i][k] is infinity is a practical optimization. If node i cannot reach intermediate node k, then routing through k cannot help for any destination j. This does not change the asymptotic complexity but avoids unnecessary additions and comparisons.
Note the min when initializing edge weights. If there are multiple edges between the same pair of nodes (parallel edges), we only want the lightest one.
Unlike Dijkstra, Floyd-Warshall correctly handles negative edge weights. The algorithm considers all possible intermediate paths and does not rely on the greedy assumption that closer nodes are finalized first. It will discover that a path through a negative edge offers a shorter route regardless of the order in which intermediates are considered.
However, Floyd-Warshall does not handle negative cycles, cycles whose total weight is negative. If a negative cycle exists, you can loop around it indefinitely and drive the distance to negative infinity. The algorithm will not detect this automatically during its main loop.
After running Floyd-Warshall, check the diagonal of the distance matrix. If dist[i][i] is negative for any node i, then node i lies on a negative cycle. In a graph without negative cycles, the shortest path from any node to itself is 0 (just stay put). A negative diagonal entry means the algorithm found a path from i back to i with negative total weight, which is exactly a negative cycle.
def has_negative_cycle(dist):
for i in range(len(dist)):
if dist[i][i] < 0:
return True
return FalseIf you need to identify which pairs of nodes are affected by negative cycles (i.e., have effectively negative-infinity distances), you can do a second pass: if dist[i][k] is finite and dist[k][j] is finite and dist[k][k] < 0, then the shortest path from i to j is negative infinity because you can detour through the negative cycle at k as many times as you like.
A related problem is transitive closure: given a directed graph, determine for every pair (i, j) whether there exists a path from i to j (of any length). This is a reachability problem rather than a shortest-path problem.
Floyd-Warshall adapts to this by replacing arithmetic with logic. Instead of a distance matrix, maintain a boolean matrix reach[i][j] indicating whether i can reach j. Initialize it to True where a direct edge exists and on the diagonal. The update becomes: reach[i][j] = reach[i][j] OR (reach[i][k] AND reach[k][j]).
def transitive_closure(n, edges):
reach = [[False] * n for _ in range(n)]
for i in range(n):
reach[i][i] = True
for u, v in edges:
reach[u][v] = True
for k in range(n):
for i in range(n):
if not reach[i][k]:
continue
for j in range(n):
if reach[k][j]:
reach[i][j] = True
return reachThe structure is identical to Floyd-Warshall. The outer loop over k is still essential for correctness, for exactly the same reason as before.
LeetCode 399 gives you a list of equations like a / b = 2.0, b / c = 3.0, and asked to evaluate queries like a / c or c / a. This is elegantly modeled as a weighted graph problem.
Each variable is a node. An equation a / b = v creates a directed edge from a to b with weight v, and a reverse edge from b to a with weight 1/v. The query a / c asks for the product of weights along a path from a to c in this graph.
Floyd-Warshall fits naturally, but with multiplication instead of addition. Initialize dist[a][b] = v for each equation. The update becomes dist[i][j] = dist[i][k] * dist[k][j] if both dist[i][k] and dist[k][j] are known. After running the algorithm, dist[a][c] directly gives the answer to a / c.
def calcEquation(equations, values, queries):
# Map variables to indices
variables = set()
for a, b in equations:
variables.add(a)
variables.add(b)
idx = {v: i for i, v in enumerate(variables)}
n = len(variables)
# Initialize distance matrix
dist = [[float('inf')] * n for _ in range(n)]
for i in range(n):
dist[i][i] = 1.0 # a / a = 1
for (a, b), v in zip(equations, values):
dist[idx[a]][idx[b]] = v
dist[idx[b]][idx[a]] = 1.0 / v
# Floyd-Warshall: propagate ratios through intermediates
for k in range(n):
for i in range(n):
if dist[i][k] == float('inf'):
continue
for j in range(n):
if dist[k][j] == float('inf'):
continue
if dist[i][j] == float('inf'):
dist[i][j] = dist[i][k] * dist[k][j]
# Answer queries
results = []
for a, b in queries:
if a not in idx or b not in idx:
results.append(-1.0)
elif dist[idx[a]][idx[b]] == float('inf'):
results.append(-1.0)
else:
results.append(dist[idx[a]][idx[b]])
return resultsIn practice, this problem is more commonly solved with union-find or DFS/BFS since the number of variables is small and we only need specific pairs, not all pairs. But it illustrates how Floyd-Warshall generalizes: any operation that is associative and has an identity element can replace addition, and any compatible comparison can replace min.
Floyd-Warshall has three nested loops, each iterating times, giving time complexity. This is independent of the number of edges. Space complexity is for the distance matrix.
The cubic time makes Floyd-Warshall practical only for small graphs, typically up to around 400 nodes. Beyond that, the term becomes prohibitive. For larger graphs, running Dijkstra from each node or using Johnson's algorithm (which combines Bellman-Ford reweighting with Dijkstra) is preferable.
Despite its simplicity, Floyd-Warshall is remarkably versatile. By changing the "relax" operation, it solves transitive closure, widest path (maximum bottleneck), and multiplicative path problems with the same elegant triple loop.
You're probably looking at Floyd-Warshall when:
Common templates: