
Introduction
This blog will discuss the “subset (ii)” problem in which we have to find all the non-empty unique subsets of the set of elements of a given array that may contain duplicate elements. Before jumping into the problem and the approaches to solve it, let’s discuss “what is a set and a subset?”.
A set is a collection of unique elements.
A subset of a set A is defined as a set B whose each element is also an element of set A.
Now, let’s understand the “subsets (ii)” problem. In this problem, we have to print all the non-empty unique subsets of the set formed by the elements of a given array of integers. The array may contain duplicate elements, but the repeated subset should be considered only once while printing the unique subsets.
Now, let’s take an example to make it more clear. Let us assume the given array containing duplicate elements is arr = [0, 1, 1].
We have to print a 2-dimensional array of all the unique subsets of the set formed by the elements of "arr".
All the subsets of {0, 1, 1} are { }, {0}, {1}, {1}, {0, 1}, {0, 1}, {1, 1}, {0, 1, 1}. But {1} and {0, 1} are repeated and {} is an empty subset.
So, all the required unique subsets of the set {0, 1, 1} are - {0}, {1}, {0, 1}, {1, 1}, {0, 1, 1}
and we have to print them.
Also see, Data Structures
Recommended: Try the Problem yourself before moving on to the Solution
Solution Intuition and Approach
If an array contains “n” elements, then the total number of subsets will be “2 ^ n” as for each element of the array, we have two choices - either take it into the subset or not take it. But as we know in this question, the given array may contain duplicate elements, so some subsets will be repeated.
For finding all the subsets, we can generate binary number representations from 0 to (2 ^ n - 1) and create a subset corresponding to each number. If the ith bit of the binary number is set, then take the ith element in the subset, else not take it in the subset. We can understand this in a more precise way by taking an example.
Let’s consider the same example as above.
arr = [0, 1, 1]
It has 3 elements and so the total number of subsets will be ‘8’. But as the array is containing duplicate elements, there will be some repeated subset.
Numbers from o to 8 | Binary Representation | Subset formed |
0 | 000 | Empty subset |
1 | 001 | {1} |
2 | 010 | {1} |
3 | 011 | {1, 1} |
4 | 100 | {0} |
5 | 101 | {0, 1} |
6 | 110 | {0, 1} |
7 | 111 | {0, 1, 1} |
Here we can see that {1} and {0, 1} are repeating, but we have to print them only once. We can create an unordered set of strings to store the subsets by generating them in the form of strings. We can use a unique character to separate two elements of the subset in their corresponding string form. If we insert the same string multiple times, the unordered set will store it only once. In this way, we will get all the unique subsets. So, we will finally print all the unique subsets after recovering them from their string form.
Let’s demonstrate this idea on our example which we have taken-
So, let S be the unordered set. It will be initially empty.
Numbers from 0 to 8 |
Binary representation | Subset formed | String form for each subset by separating each element with ‘@’ | Set |
0 | 000 | Empty subset | “” | An empty subset, so don’t add it into our solution set |
1 | 001 | {1} | “1@” | {“1@”} |
2 | 010 | {1} | “1@” | This string is already present in the set. So, no change |
3 | 011 | {1, 1} | “1@1@” | { “1@”, “1@1@”} |
4 | 100 | {0} | “0@” | { “1@”, “1@1@”, “0@1@”} |
5 | 101 | {0, 1} | “0@1@” | { “1@”, “1@1@”, “0@1@”, “0@1@”} |
6 | 110 | {0, 1} | “0@1@” | This string is already present in the set. So, no change |
7 | 111 | {0, 1, 1} | “0@1@1@” | {“1@”, “1@1@”, “0@1@”, “0@1@”, “0@1@1@”} |
Thus, we can see that in this way, we have got the set containing all the unique subsets in their string form. So, now we can print them by recovering them.