Code360 powered by Coding Ninjas X Naukri.com. Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1.
Introduction 
2.
Disjoint Set Data Structure
2.1.
Representative Member
3.
Disjoint Set Operations in DAA
3.1.
Union/Merge
3.2.
Find
4.
Find a Cycle in an Undirected Graph using Disjoint Set Data Structure
4.1.
Algorithm
4.2.
Walkthrough
4.3.
Code in C++
4.4.
Complexities for Disjoint Set Operations in DAA
5.
Frequently Asked Questions
5.1.
Is there a way to decrease the time complexity of the disjoint set operations in daa?
5.2.
Are disjoint set operations in daa used practically?
5.3.
Why do we merge the parents instead of the actual nodes in the merge operation for a disjoint set?
6.
Conclusion
Last Updated: Mar 27, 2024
Medium

What are operations on Disjoint set in DAA?

Author Satvik Gupta
0 upvote

Introduction 

Disjoint sets in mathematics are two sets that don't have any element in common. Sets can contain any number of elements, and those elements can be of any type. We can have a set of cars, a set of integers, a set of colors, etc. Sets also have various operations that can be performed on them, such as union, intersection, difference, etc. 

Operations on Disjoint Set In DAA

In computer science and DAA, we focus on Disjoint Set Data Structure and the operations we can perform. In this article, we will go through what a disjoint set data structure is, the disjoint set operations in daa, as well as an example application of it.

Disjoint Set Data Structure

Before we look at disjoint set operations in daa, let us understand what the disjoint set data structure itself is.

A disjoint set data structure is a data structure that stores a list of disjoint sets. In other words, this data structure divides a set into multiple subsets - such that no 2 subsets contain any common element. They are also called union-find data structures or merge-find sets.

For example: if the initial set is [1,2,3,4,5,6,7,8].

A Disjoint Set Data Structure might partition it as - [(1,2,4), (3,5), (6,8),(7)].

This contains all of the elements of the original set, and no 2 subsets have any element in common.

The following partitions would be invalid:

[(1,2,3),(3,4,5),(6,8),(7)] - invalid because 3 occurs in 2 subsets.

[(1,2,3),(5,6),(7,8)] -invalid as 4 is missing.

Representative Member

Each subset in a Disjoint Set Data Structure has a representative member. More commonly, we call this the parent. This is simply the member responsible for identifying the subset. 

Let's understand this with an example. 

Suppose this is the partitioning of sets in our Disjoint Set Data structure.

Example partitioning in Disjoint Set Data Structure

We wish to identify each subset, so we assign each subset a value by which we can identify it. The easiest way to do this is to choose a member of the subset and make it the representative member. This representative member will become the parent of all values in the subset. 

Assigning representative in Disjoint Set Data Structure

Here, we have assigned representative members to each subset. Thus, if we wish to know which subset 3 is a part of, we can simply say - 3 is a part of the subset with representative member 1. More simply, we can say that the parent of 3 is 1

Here, 1 is its own parent. 

Also see, Longest Common Substring and Rabin Karp Algorithm

Disjoint Set Operations in DAA

There are 2 major disjoint set operations in daa: 

  1. Union/Merge - this is used to merge 2 subsets into a single subset.
  2. Find - This is used to find which subset a particular value belongs to.

We will take a look at both of them. To make things easier, we will change our sets from integers to Cities. 

Example

Let's say this Data Structure represents the places connected via express trains/ metro lines or any other means. 

Union/Merge

Suppose, in our example - a new express train started from Delhi to Agra. This would mean that all the cities connected to Delhi are now connected to all the cities connected to Agra. 

This simply means that we can merge the subsets of Agra and Delhi into a single subset. It will look like this. 

Merge Delhi and Agra subsets

One key issue here is which out of Agra and Delhi will be the new Parent? For now, we will just choose any of them, as it will be sufficient for our purposes. Let's take Delhi for now. 

Further optimizations can be made to this by choosing the parent based on which subset has a larger number of values.

Choose representative element for the merged subsets

This is the new state of our Disjoint Set Data Structure. Delhi is the parent of Delhi, Gurgaon, Noida, Agra, Vrindavan, and Mathura.

This would mean that we would have to change the parent of Agra, Vrindavan, and Mathura (basically, all the children of Agra), to Delhi. This would be linear in time. If we perform the union operation m times, and each union takes O(n) times, we would get a quadratic O(mn) time complexity. This is not optimized enough.

Instead, we structure it like this:

Structure of new merged subsets

Earlier, Agra was its own parent. Now, Delhi is the parent of Agra. 

Now, you might be wondering - The parent of Vrindavan is still Agra. It should have been Delhi now. 

This is where the next operation helps us - the find operation.

Find

The find operation helps us find the parent of a node. As we saw above, the direct parent of a node might not be its actual (logical) parent. E.g., the logical parent of Vrindavan should be Delhi in the above example. But its direct parent is Agra. 

So, how do we find the actual parent?

The find operation helps us to find the actual parent. In pseudocode, the find operation looks like this:

find(node):

if (parent(node)==node) return node;

else return find(parent(node));

We don't return the direct parent of the node. We keep going up the tree of parents until we find a node that is its own parent. 

Let's understand this via our example. 

find example

Suppose we have to find the parent of Mathura. 

  1. Check the direct parent of Mathura. It's Agra. Is Agra == Mathura?
    The answer is false. So, now we call the find operation on Agra. 
  2. Check the direct parent of Agra. It's Delhi. Is Delhi ==Agra?
    The answer is false. So now we call the find operation on Delhi.
  3. Check the direct parent of Delhi. It's Delhi. Is Delhi==Delhi.
    The answer is true! So, our final answer for the parent of Mathura is Delhi.

This way, we keep going "up" the tree until we reach a root element. A root element is a node with its own parent - in our case, Delhi.

Example of Disjoint Set Data Structure with different parents

Let's try to find the parent of K in this example. (A and D are their own parents)

  1. ParentOf(K) is I. Is I == K? 
    False. So, we move upward, now using I.
  2. ParentOf(I) is C. Is C== I? 
    False. Move upward using C.
  3. ParentOf(C) is A. Is A==C?
    False. Move upward using A.
  4. ParentOf(A) is A. Is A==A?
    True! Our final answer is A.

Let's understand the use of a Disjoint Set Data Structure through a simple application - Finding a cycle in an undirected graph.

 

Also see, Morris Traversal for Inorder, hash function in data structure

Find a Cycle in an Undirected Graph using Disjoint Set Data Structure

Let us understand the disjoint set operations in daa more clearly with the help of an example. Here, we will use a disjoint set data structure to find whether an undirected graph contains a cycle or not. We are assuming that our graph does not contain any self-loops. 

Example graph

Let this be our graph. Each node is initialized as its own parent. The list of edges is shown on the top right, and the parents are shown at the bottom. We have shown each edge only once for simplicity (i.e., 1-2 and 2-1 aren't shown separately)

Algorithm

We will go through each edge and do the following. Let the edge be u-v.

  1. Find the parent of u and v. If both parents are the same, it means u and v are part of the same subset, and we have found a cycle.v
  2. If they are not the same, merge u and v. We don't merge u and v directly. Instead, we merge the parents of u and v. This ensures that all the siblings of u and all the siblings of v have the same common parent. 

We do this for all edges. 

Walkthrough

1. 1-2 Edge. 

Parent of 1 is 1, parent of 2 is 2. They are different, so we will merge them. Set the parent of 2 as (this is chosen randomly, setting the parent of 1 as 2 would also have been correct ).

Go through first edge

2. 1-4 Edge. 
Parent of 1 is 1. Parent of 4 is 4. They are not the same, so merge 1 and 4. Set parent of 4 as 1. 

Go through second edge

3.  2-5 Edge. 
The parent of 2 is 1, and the parent of 5 is 5. They aren't the same, so we'll merge them. Set the parent of 5 as 1.

Go through third edge

4. 3-6 Edge. 
The parent of 3 is 3, and the parent of 6 is 6. They aren't the same, so we'll merge them. Set the parent of 6 as 3.
 

Go through 4th edge

5. 5-6 Edge. 
The parent of 5 is 1, and the parent of 6 is 3. They aren't the same, so we'll merge them. Set the parent of 3 as 1.

Go through 5th edge

Note here - the direct parent of 6 is still 3. But, the actual parent becomes 1, because the parent of 3 is 1. (The dotted line is not an edge in the graph, it shows that the parent of 3 is 1)
 

6. 6-2 Edge. 
The parent of 6 is 1 (actual parent), and the parent of 2 is also 1. These are both the same. Thus, our graph contains a cycle.

Code in C++

#include <iostream>
#include <cstdio>
#include <vector>
#include <algorithm>
using namespace std;

// An edge in the graph
struct Edge {
    // the endpoints of the edge
    int u, v;
    // constructor for Edge
    Edge(int u, int v) : u(u), v(v) {}
};

// This class represents a graph using edge list representation
class Graph {
  public:
    int V;              // no of vertices in the graph
    vector<Edge> edges; // list of edges in the graph
 
    Graph(int v) { // constructor
        V = v;
    }

    // adds an edge (u,v) to the graph
    void addEdge(int u, int v) { edges.push_back(Edge(u, v)); }
};

// A disjoint set data structure that supports merge and find operations
class DisjointSet {
//The array that stores the parents of each node
    vector<int> parent;

  public:
    //the number of nodes
    int V;
    // constructor
    DisjointSet(int V) : V(V) {
        //initialise with V+1 values so our nodes can be 1-indexed
        parent = vector<int>(V + 1);
        for (int i = 1; i <= V; i++) {
            parent[i] = i;
        }
    }
    //find operation of Disjoint Set Data Structure
    int find(int u) {
        if (u == parent[u])
            return u;
        return find(parent[u]);
    }
    //merge or union operation of Disjoint Set Data Structure
    void merge(int u, int v) {
        int parentU = find(u);
        int parentV = find(v);
        if (parentU != parentV) {
            parent[parentU] = parentV;
        }
    }
};

// function to detect whether cycle exists in graph, using Disjoint Set Data Structure
void cycleExists(Graph g) {
    DisjointSet ds(g.V);
    for (Edge e : g.edges) {
        int u = e.u;
        int v = e.v;
        if (ds.find(u) == ds.find(v)) {
            cout << "Cycle exists between " << u << " and " << v << endl;
            return;
        }
        ds.merge(u, v);
    }
    cout << "No cycle exists" << endl;
}

//main driver function
int main() {
    Graph g(6);
    g.addEdge(1, 2);
    g.addEdge(1, 4);
    g.addEdge(2, 5);
    g.addEdge(3, 6);
    g.addEdge(5, 6);
    g.addEdge(6, 2);

    cycleExists(g);

    return 0;
}

 

Output:

output

Complexities for Disjoint Set Operations in DAA

Regardless of the example or the purpose in which they are used, disjoint set operations in daa always have the same space and time complexities. The complexities for both find and union operations are:

The space complexity is O(n), where n is the number of vertices. The parent array stores the parent of each vertex. Hence we get O(n) space complexity. We aren't considering the space complexity of the graph in this because the graph is the input here.

The time complexity is O(log N) in the average case because the find operation will have to go up a tree with n nodes. The height of a tree in the average case will be log N. In the worst case, we can have a skewed tree, and our time complexity will become O(n). 
Check out this problem - Largest BST In Binary Tree

Frequently Asked Questions

Is there a way to decrease the time complexity of the disjoint set operations in daa?

Yes, we can perform union by path-compression and rank operations to reduce the time complexity. With this, we don't change how the tree is traversed - but we make sure the skew case can never occur. 

Are disjoint set operations in daa used practically?

Disjoint set operations are optimal and efficient in practical uses. They have a lot of uses in real-life applications - e.g., finding connected components, Kruskal's algorithm, etc.

Why do we merge the parents instead of the actual nodes in the merge operation for a disjoint set?

We merge parents instead of nodes so that all the children of both parents come under a single parent. E.g., let's say we need to merge B and Y. A is the parent of B, and X is the parent of Y. Another node C, also has its parent as A. And another node, Z, also has its parent as Y. If we merge B and Y directly by setting the parent of B as Y - then we will disconnect B from A and from C. However, if we merge A and X, by setting the parent of X as A, then all the nodes (A, B, C, X, Y, Z) will have a single parent A. This is what we want.

Conclusion

In this article, we have discussed the disjoint sets, disjoint set operations in daa, methods to detect a cycle in a graph, and implementation using union-find method in detail. 

After reading about how to find the Closest element to a target value in Binary Search Tree, are you not feeling excited to read/explore more articles on the topic of BST? Don't worry; Coding Ninjas has you covered. If you want to practice questions on the binary tree then follow these links: disjoint set data-structure, Check if two sets are disjointunion-find algorithm.

Refer to our Guided Path on Coding Ninjas Studio to upskill yourself in Data Structures and AlgorithmsCompetitive ProgrammingJavaScriptSystem Design, and many more! If you want to test your competency in coding, you may check out the mock test series and participate in the contests hosted on Coding Ninjas Studio! But suppose you have just started your learning process and are looking for questions asked by tech giants like Amazon, Microsoft, Uber, etc. In that case, you must look at the problemsinterview experiences, and interview bundle for placement preparations.

Nevertheless, you may consider our paid courses to give your career an edge over others!

Do upvote our blogs if you find them helpful and engaging!

Happy Learning!

Live masterclass