Minimum Spaning Trees

 

Task: In a connected, undirected and weighted graph G=(V,E), find a subset T of E such that the edges in T create a path between any two nodes in G, and the sum of the

weights of the edges in T is minimized.

 

Observation: (V,T) must be acyclic. If there is a cycle in it, removing the edge with highest weight from the cycle will still preserve the connectedness property,

and the sum of the weights will be reduced.

 

Observation: (V,T) must be a tree, since it is acyclic and it connects all the vertices.

 

We call the solution to this problem a minimum spanning tree.

 

A cut of the graph G is a partition (S,V-S) of the vertices. An edge (u,v) crosses the cut if u is in S and v is in V-S or vice-versa. A cut respects a subset A of E if

none of the edges in A cross the cut. An edge is a light edge satisfying a given property if it is the edge with the minimal weight among all the edges satisfying that property.

 

Theorem 1: Let G=(V,E) be a undirected, connected, weighted graph. Let A be a subset of E that is included in some minimum spanning tree for G. Let (S,V-S) be a cut

of G that respects A, and (u,v) be a light edge that crosses the cut. Then A + {(u,v)} is also included in some minimum spanning tree.

 

Proof. See CLRS Chapter 23.1

 

We call the (u,v) edge in theorem 1 a safe edge for A. This theorem can give us a very simple generic algorithm for computing a minimum spaning tree:

 

GENERIC-MST(G,w)
1.    A = {}
2.    while A does not form a spanning tree
3.        find an edge (u,v) that is safe for A
4.        A = A + {(u,v)}
5.    return A
 

Of course, the above algorithm is not very useful. The trick is really in finding a safe edge. We’ll look at two greedy algorithms for doing so

 

Kruskal’s algorithm

 

MST-KRUSKAL(G,w)
1.    A = {}
2.    for each vertex v in V
3.        MAKE-SET(v)
4.    sort the edges of E into nondecreasing order by their weight w
5.    for each edge (u,v) in E, taken in nondecreasing order by weight
6.        if FIND-SET(u) != FIND-SET(v)
7.            A = A + {(u,v))}
8.            UNION(u,v)
9.    return A
 

For this algorithm, we will need a disjoint set implementation that implements the MAKE-SET (creating a set out of a single element),

FIND-SET (finding the set to which the given element belongs)  and UNION (creating a union of two sets) efficiently. One such implementation

is the union-by-rank with path-compression set implementation, which is explained in CLRS 21.3 and which we explained in class.

 

An example of Kruskal’s minimum spanning tree algorithm can be found here.

 

The algorithm creates disjoint sets out of each vertex. In the loop, it adds an edge to the tree being constructed by finding a light edge that connects two disjoint sets.

It flows directly from the generic algorithm. When the edge (u,v) is chosen, vertex u belongs to a tree A (component) of the graph G, where A is part of the final minimum

spanning tree, to some other component. Since cut (A, V-A) respects A, and (u,v) is the light edge crossing the cut (A,V-A), then (u,v) is a safe edge for A.

 

The running time is computed as follows: MAKE-SET will take O(V) time for all vertices. Sorting edges takes O(E log(E)). Looking at every edge and finding sets

takes O(E a(V)). Union takes O(1). So the total time is O(E log(E)) = O(E log(V)).

 

Prim’s algorithm

 

Prim takes a different approach. He starts with an arbitrary root of the tree, and then in a loop adds a light edge that connects the current tree with a vertex that is not currently in the tree.

He uses a min-priority queue for all the vertices that are not currently connected to the tree that is being constructed

 

MST-PRIM(G,w,r)
1.    for each u in V
2.       key[u] = INFINITY
3.       p[u] = NULL
4.    key[r] = 0
5.    Q = CREATE-MIN-QUEUE(V)
6.    while Q != {}
7.       u = EXTRACT-MIN(Q)
8.       for each v adjacent to u
9.          if v is in Q and w(u,v) < key[v]
10.         p[v] = u
11.         key[v] = w(u,v)
12. A = {(v,p[v]): v is in V – {r}}
13. return A
 
 
 

Prim’s algorithm also follows the generic algorithm closely. It maintains implicitly a current tree A of all the vertices NOT in the queue Q.

A is a part of the final minimum spanning tree. key[v] for every vertex v in the Q denotes the currently computed minimum distance from that

vertex to the closest vertex (p[v]) in A. Line 7 ensures that the vertex removed from the queue (and added to the implicit current solution A)

is safe for A, since the cut (A, V-A) respects A, and edge (p[u], u) is the light edge that crosses the cut (A, V-A).

 

An example of Prim’s minimal spaning tree construction can be found here.

 

The running time for Prim’s algorithm is computed as follows:

 

Initialization on lines 1-5 takes O(V) time, since all keys except for the root one are INFINITY. The while loop is executed |V| times. EXTRACT-MIN

takes O(log(V)) time, and the for loop in lines 8-11 is executed total 2|E| times.

Key assignment on line 11 involves an implicit DECREASE-KEY operation which takes O(log(V)) time.

So the total execution time is O(V log(V) + E log(V)) = O(E log(V)). You can get an even better asymptotic time if you use Fibonacci’s heaps

for queue implementation, which have O(log (V)) time for EXTRACT-MIN and O(1) for DECREASE-KEY, for a total of O(E + V log(V)) time.