Kruskal's Algorithm: Analysis

  • Compare various approaches to checking for cycles and the resulting time/space tradeoffs between them for Kruskal's Algorithm.

Exercise Based on your understanding of Kruskal's algorithm, how can we efficiently implement the step which involves finding the next min-weight edge in GG?

Solution
  • Keep a sorted array of edges. Keep a pointer to the next position (edge).
  • Keep edges in a (min-heap) priority queue.

With an optimal sorting algorithm (to sort edges of the input graph by increasing weight), both approaches are O(MlgM)\Omicron(M \lg M) runtime.

We would spend O(MlgM)\Omicron(M \lg M) to sort the edges and then get the next edge in O(1)\Omicron(1) time. Whereas, we can build the PriorityQueue in O(M)\Omicron(M) time and remove the next "best" edge in O(lgM)\Omicron(\lg M). We would have to do the "remove" O(M)\Omicron(M) times because some edges may have to be disregarded (they cause cycle).

Exercise Once the next min-weight edge (v,w)(v, w) is found, how can we efficiently check if adding it to the MST would create a cycle?

Solution

We cannot check for a cycle by simply checking if the endpoints are already in TT (why?). We can run BFS/DFS on TT, start at vv and check if ww is reachable.

Exercise Based on your answers to the previous questions, analyze the asymptotic complexity of Kruskal's algorithm.

Solution
OperationFrequencyCost per operation
build PQ11O(M)\Omicron(M)
extract minO(M)\Omicron(M)O(lgM)\Omicron(\lg M)
run BFS/DFSO(M)\Omicron(M)O(N+M)\Omicron(N+M)

From the table, it can be seen that Kruskal's algorithm is quadratic. However, we can improve the performance by using another data structure called Union-Find for efficiently checking/preventing cycles. We will explore Union-Find in the next chapter!