Kruskal’s algorithm is used to find the minimum spanning tree. Kruskal's Algorithm creates the spanning tree by gradually adding edges to an expanding spanning tree. Let’s first understand the meaning of a spanning tree and a minimum spanning tree.
What is a Spanning Tree?
Suppose you are given a connected and undirected graph G where graph G has V vertices and E edges. The tree which spans all the vertices of graph G is called the spanning tree. In contrast, the spanning tree is a subgraph of graph G. Given that a graph doesn't involve cycles, it is referred to as a tree. A tree is similar to a graph with N nodes and N-1 edges.
What is the Minimum Spanning Tree?
The weights of all the edges in the tree are added to determine the cost of the spanning tree. Many spanning trees may exist. The spanning tree with the lowest cost among all other spanning trees is known as a minimum spanning tree. There may be a lot of minimum spanning trees as well.
Algorithm
In order to find the minimum spanning tree of the undirected graph, we can use two algorithms, i.e., Kruskal's and Prim’s. In this article, we will learn about Kruskal’s algorithm.
Kruskal’s Algorithm
Kruskal's Algorithm creates the spanning tree by gradually adding edges to an expanding spanning tree. The greedy strategy used by Kruskal's algorithm is to find the edge with the least weight in each iteration and add it to the expanding spanning tree.
Algorithm
Sort the graph edges according to their relative weights.
So after sorting, our adjacency list will look something like this(the above example is taken):
Weight
Initial node
Final node
1
1
2
2
2
5
2
3
4
3
1
5
5
5
4
7
2
3
10
3
5
From the edge with the smallest weight up until the edge with the heaviest weight is added to the MST.
Add only edges that don't form cycles or connect only detached parts.
What To Do?
How can I determine whether vertices are connected or not, then?
DFS Algorithm, which starts at the first vertex and determines if the second vertex is visited or not, might be used to do this. However, because DFS has an order of where is the number of vertices and the number of edges, it will have a high time complexity. So "Disjoint Sets" is the ideal solution.
Disjoint Sets have no elements in common because their intersection is the empty set.
Implementation
C++
#include <bits/stdc++.h>
using namespace std;
// DSU data structure
// path compression + rank by union
class DSU {
int* parent;
int* rank;
public:
DSU(int n)
{
parent = new int[n];
rank = new int[n];
for (int i = 0; i < n; i++)
{
parent[i] = -1;
rank[i] = 1;
}
}
int find(int i)
{
if (parent[i] == -1)
return i;
return parent[i] = find(parent[i]);
}
// Union function
void unite(int x, int y)
{
int s1 = find(x);
int s2 = find(y);
if (s1 != s2)
{
if (rank[s1] < rank[s2])
{
parent[s1] = s2;
rank[s2] += rank[s1];
}
else
{
parent[s2] = s1;
rank[s1] += rank[s2];
}
}
}
};
class Graph {
vector<vector<int> > adjlist;
int V;
public:
Graph(int V)
{ this->V = V; }
void addEdge(int x, int y, int w)
{
adjlist.push_back({ w, x, y });
}
void kruskals()
{
sort(adjlist.begin(), adjlist.end());
// Initialize the DSU
DSU s(V);
int ans = 0;
cout << "Following are the edges in the constructed MST"<< endl;
for (auto x : adjlist) {
int w = x[0];
int x = x[1];
int y = x[2];
// We will take this edge in MST if it does
// not form a cycle
if (s.find(x) != s.find(y)) {
s.unite(x, y);
ans += w;
cout << x << "->" << y << " == " << w<< endl;
}
}
cout << "Minimum Cost Spanning Tree: " << ans;
}
};
int main()
{
Graph g(4);
g.addEdge(0, 1, 10);
g.addEdge(1, 3, 15);
g.addEdge(2, 3, 4);
g.addEdge(2, 0, 6);
g.addEdge(0, 3, 5);
// Function call
g.kruskals();
return 0;
}
You can also try this code with Online C++ Compiler
from collections import defaultdict
class Graph:
def __init__(self, vertices):
self.V = vertices
self.graph = []
# function to add an edge to the graph
def addEdges(self, u, v, w):
self.graph.append([u, v, w])
# A utility function to find set of an element i.
def find(self, parent, i):
if parent[i] == i:
return i
return self.find(parent, parent[i])
# A function that does the union of two sets of x and y
# (uses union by rank)
def union(self, parent, rank, x, y):
xroot = self.find(parent, x)
yroot = self.find(parent, y)
# Attach a smaller rank tree under the root of
# high-rank tree (Union by Rank)
if rank[xroot] < rank[yroot]:
parent[xroot] = yroot
elif rank[xroot] > rank[yroot]:
parent[yroot] = xroot
# If ranks are the same, then make one as a root
# and increment it's rank by one
else:
parent[yroot] = xroot
rank[xroot] += 1
def KruskalMST(self):
result = [] # This will store the resultant MST
# An index variable used for sorted edges
i = 0
# An index variable used for result[]
e = 0
self.graph = sorted(self.graph,
key=lambda item: item[2])
parent = []
rank = []
# Create V subsets with single elements
for x in range(self.V):
parent.append(x)
rank.append(0)
# Number of edges to be taken is equal to V-1
while e < self.V - 1:
u, v, w = self.graph[i]
i = i + 1
x = self.find(parent, u)
y = self.find(parent, v)
/*Include this edge in the result and raise the result's index for the following
edge if including it doesn't result in a cycle.*/
if x != y:
e = e + 1
result.append([u, v, w])
self.union(parent, rank, x, y)
# Else, discard the edge
mincost = 0
print("Edges in the constructed MST")
for u, v, wt in result:
mincost += wt
print("%d -- %d == %d" % (u, v, wt))
print("Minimum Spanning Tree is:", mincost)
# Driver's code
if __name__ == '__main__':
g = Graph(4)
g.addEdges(0, 1, 10)
g.addEdges(0, 2, 6)
g.addEdges(0, 3, 5)
g.addEdges(1, 3, 15)
g.addEdges(2, 3, 4)
# Function call
g.KruskalMST()
You can also try this code with Online Python Compiler
In Kruskal’s algorithm, the most time-consuming operation is sorting because the total complexity of the Disjoint-Set operations will be O(E log V), which is the overall Time Complexity of the algorithm.
Space Complexity
In Kruskal's algorithm, we store the parent and rank of each node in an array. So the space complexity reduces to O(N). Check out this problem - No of Spanning Trees in a Graph
Frequently Asked Questions
What are a tree and a spanning tree?
One sort of graph is a tree. A subgraph of the graph that is a tree and spans every vertex is called a spanning tree.
What are the properties of a spanning tree?
There is the same number of edges and vertices in each of a graph's potential spanning trees. A cycle can never be found inside a spanning tree. If we remove one edge from the spanning tree, it will become unconnected since spanning trees are always minimally linked.
Which is faster Prim or Kruskal?
The Prim algorithm returns connected components and solely utilizes connected graphs. In dense graphs, Prim's method operates more quickly. In sparse graphs, Kruskal's algorithm operates more quickly.
Why does Kruskal's algorithm work?
The Kruskal algorithm is greedy in graph theory since it adds increasing cost arcs at each step while it searches for the least spanning tree for a linked weighted network. The overall weight of all the edges in the tree is reduced, and as a result, it discovers a subset of the edges that creates a tree with every vertex.
Why is the Kruskal algorithm called greedy?
It is a greedy method since you decided to union two sets of vertices at each stage based on the smallest weight available. You also decided to utilize the edge that appeared to be the best option at the time. The algorithm is said to be greedy since this phase is greedy.
Conclusion
In the above article, we have discussed the following topics: