MAKE-SET(x) 1 p[x] ← x 2 rank[x] ← 0 UNION(x, y) 1 LINK(FIND-SET(x), FIND-SET(y)) LINK(x, y) 1 if rank[x] > rank[y] 2 then p[y] ← x 3 else p[x] ← y 4 if rank[x] = rank[y] 5 then rank[y] ← rank[y] 1
The FIND-SET procedure with path compression is quite simple.
FIND-SET(x) 1 if x ≠ p[x] 2 then p[x] ← FIND-SET(p[x]) 3 return p[x]
The FIND-SET procedure is a two-pass method: it makes one pass up the find path to find the root, and it makes a second pass back down the find path to update each node so that it points directly to the root. Each call of FIND-SET(x) returns p[x] in line 3. If x is the root, then line 2 is not executed and p[x] = x is returned. This is the case in which the recursion bottoms out. Otherwise, line 2 is executed, and the recursive call with parameter p[x] returns a pointer to the root. Line 2 updates node x to point directly to the root, and this pointer is returned in line 3.
Separately, either union by rank or path compression improves the running time of the operations on disjoint-set forests, and the improvement is even greater when the two heuristics are used together. Alone, union by rank yields a running time of O(m lg n), and this bound is tight. Although we shall not prove it here, if there are n MAKE-SET operations (and hence at most n - 1 UNION operations) and f FIND-SET operations, the path-compression heuristic alone gives a worst-case running time of Θ (n f · (1 log2 f / n n)).
When we use both union by rank and path compression, the worst-case running time is O(m α (n)), where α(n) is a very slowly growing function, which we define in Analysis of union by rank with path compression. In any conceivable application of a disjoint-set data structure, α(n) ≤ 4; thus, we can view the running time as linear in m in all practical situations. In Analysis of union by rank with path compression, we prove this upper bound.