Last Updated: 9 Oct, 2020

Distinct Subsets Count

Moderate
Asked in companies
MicrosoftPocket FMLowe's India

Problem statement

Given an array arr of N integers that may contain duplicate integers. The task is to return the count of subsets of the given array such that each subset contains only distinct elements.

Note:
As the answer can be large, return your answer modulo 10^9  + 7. 
Input format:
The first line contains an integer 'T' which denotes the number of test cases or queries to be run. Then, the T test cases follow.

The first line of each test case or query contains an integer 'N' representing the size of the array (arr).

The second line contains 'N' single space-separated integers, representing the elements in the array.
Output format:
For each test case, print the count of subsets modulo 10^9+7 in a separate line.
Note:
You do not need to print anything, it has already been taken care of. Just implement the given function.
Constraints:
1 <= T <= 10 
1 <= N <= 10^5 
1 <= arr[i] <= 10^5 

Time limit: 1 second

Approaches

01 Approach

  • The idea is to call a helper function so as to create all possible subsets of the array.
    • vector<int> current = the vector storing the current possible subset of the array.
    • Recursive calls for generating all possible subsets.
  • We maintain a count variable to count the subsets which have no duplicate elements. Thus, in addition to the base case, we maintain a map so as to check if the frequency of any element exceeds 1.
  • We return the count of distinct subsets received from the helper function.

02 Approach

  • The idea is to take an unordered map and maintain the frequency of each integer that is a part of the array.
     
  • We maintain a count variable to count the subsets step by step. Initialize the count to 0.
     
  • As we now know the frequency of each element. We can work on the first element initially and then update it for the second element and so on.
     
  • Now, for the very first element in the map, the number of subsets that can be formed by using only that element will be the frequency of that element. Thus, we update count with the frequency of that element. 
     
  • Moving on to the second element in the map, the number of subsets that can be made by using only the second element will be the frequency of the second element but, when we move to the second element, we also need to take care of the subsets that can be formed by using the first element as well as the second element, one way of doing this is to add the second element into the previously made subsets. If the second element occurs multiple times, we too add multiple times. 

    For example: If the first element is 1 having frequency 2, the count will be 2 which comprises subsets {1A}, {1B}. 

    Now, if the second element is 3 having frequency 1, the temporary count will be 1(if we take subsets having the second element as the only element) + 2(if we take subsets having first element as well as a second element).

    The temporary count will be 1 + 2 where 1 is the count for {3} and 2 is the count for {1A, 3} and {1B, 3}. 
     
  • Let us make the above picture more clear. 

    For example: If the first element is 1 having frequency 2, the count will be 2 which consists of subsets {1A}, {1B}. 

    Now, if the second element is 5 having frequency 3, the temporary count will be 3 (if we take subsets having second element i.e 5 as the only element) + 6 (if we take subsets having first element (1) as well as the second element(5)).

    The temporary count will be 3 + 6 where ‘3’ is the count for {5A}, {5B}, {5C}  and ‘6’ is the count for:-
    {1A,5A}, {1B, 5A}, 
    {1A,5B}, {1B, 5B}, 
    {1A,5C}, {1B, 5C}.  
     
  • We can now add the previously calculated count with the newly calculated temporary count. 
     
  • We may infer that the formulae become:-
    temporaryCount= frequencyOfElement + frequencyOfElement*count.
    Count +=temporaryCount