Last Updated: 12 Nov, 2021

Subarrays With Equal 0’s, 1’s and 2’s

Hard
Asked in company
Oracle

Problem statement

You are given an array containing ‘N’ integers. In the array, the elements are 0, 1 and 2. You have a simple task, find the count of non-empty subarrays containing an equal number of 0’s, 1’s and 2’s.

The subarray of ARR is a contiguous part of the array ARR, i. e. the array ARRi, ARRi+1, . . . . . , ARRj for some 0 ≤ i ≤ j ≤ N - 1.

Follow Up :
Can you solve it in linear time and space complexities?
For Example :
If ‘N’ = 5, ‘ARR’ = {1, 1, 0, 2, 1}

There are exactly two subarrays that contain an equal number of 0’s, 1’s and 2’s. 

The first subarray is from ‘ARR[1]’ to ‘ARR[3]’, ie: {1, 0, 2}.

The second subarray is from ‘ARR[2]’ to ‘ARR[4]’, ie: {0, 2, 1}.
Input Format :
The first line contains a single integer ‘T’ denoting the number of test cases, then each test case follows:

The first line of each test case contains a single integer ‘N’, denoting the number of elements in the array.

The second line of each test case contains N integers ‘ARR’, denoting the array elements.
Output Format :
For each test case, print the maximum number of subarrays that contain equal number of 0’s, 1’s and 2’s.

Output for each test case will be printed in a separate line.
Note :
You are not required to print anything; it has already been taken care of. Just implement the function.
Constraints :
1 ≤ T ≤ 10      
1 ≤ N ≤ 1000
ARR[i] = {0, 1, 2}

Time limit: 1 sec

Approaches

01 Approach

A naive solution is to check for all the subarrays one by one, generating all the subarrays take quadratic (~O(N2)) time, and then for each generated subarray we need to iterate all the elements to calculate for the count of the number of 0’s, 1’s and 2’s. This method takes a ~O(N3) time in total.

 

We can optimize the above solution by removing the need to count the number of 0’s, 1’s and 2’s in each subarray generated, this can be done by precalculating prefix array to store the count of 0’s from ARR[0] to ARR[i] in PREF0[i], and also storing similar information for 1’s and 2’s.

 

The steps are as follows :

  1. Initialize ‘ans’ equal to 0, this variable stores the count of subarrays with an equal number of 0’s, 1’s and 2’s.
  2. Create three prefix arrays, prefix0, prefix1 and prefix2, and fill them up.
  3. Run an outer FOR loop for variable ‘i’ from 0 to N - 1, and run an inner FOR loop for variable ‘j’ from to i - 1. Each time calculate the count of 0’s, 1’s and 2’s in the subarray from j + 1 to i using the prefix arrays.
    • Note that: Count of 0’s in subarray from j + 1 to i equal to prefix0[i] - prefix0[j]. Similarly, calculate the count of 1's and 2's also.
  4. Finally, return the value of ‘ans’.

02 Approach

If we have precalculated the prefix sum arrays, then for a subarray from j + 1 to i, the following condition should hold true: (Prefix0[i] - Prefix0[j]) = (Prefix1[i] - Prefix1[j]) = (Prefix2[i] - Prefix2[j]).

 

Rearranging the above equation gives us: Prefix0[i] - Prefix1[i] = Prefix0[j] - Prefix1[j] and Prefix0[i] - Prefix2[i] = Prefix0[j] - Prefix2[j], this equation clearly tells us that for an jth index and an ith index, the difference between prefix sums of 0’s and 1’s at that ith index should be equal to the difference between prefix sums of 0’s and 1’s at that jth index. Additionally, the difference between prefix sums of 0’s and 2’s at that ith should be equal to the difference between prefix sums of 0’s and 2’s at that SAME jth index.

 

We are done with the logic, now comes the tricky part, ie: implementation. One may use two hash-maps and use Prefix0[i] - Prefix1[i] as the key for the first hash-map and store a container like C++’s std::set<int>, and similarly for second hash-map using Prefix0[i] - Prefix2[i] as the key, but it’s easy to notice that the time complexity might end up being O(N2*log(N)) in a worst-case scenario, this is because we will have to iterate all the elements of the set data-structure inside the hash-map, ignore this if you are a little confused, we will discuss an elegant yet simple approach to handle this problem more efficiently.

 

Often we can solve some hard implementation problems of hash-maps by choosing the key appropriately, in general: using a string as the key of the hash-map comes to a great rescue in many of the problems that require storing x-coordinate and y-coordinates together, as a standard hash-map won’t allow us to use a pair of integers as the key, we can rather use a string with x-coordinate and y-coordinate separated by some random variable (let’s say ‘$’) as our custom hash function for the key value of that hash-map. On a similar note, we can avoid using two different hash maps in our problem if we use a custom hash string as a key for our hash-map. This is done by using “(Prefix0[i]-Prefix1[i])” + “$” + “(Prefix0[i]-Prefix2[i])” as our key, note that: we used a random variable ‘$’, you may use any other variable. Also, a clever observation is that one may avoid the need for calculation prefix array, as we no longer need to calculate the count the number of 0’s for some particular subarray, we are just using the count of 0’s till ith index, this can be easily stored in a variable.

 

The steps are as follows :

  1. Initialize ‘ans equal to 0, this variable stores the count of subarrays with an equal number of 0’s, 1’s and 2’s.
  2. Initialize three variables count0, count1 and count2 equal to 0, these variables store the count of 0’s, 1’s and 2’s so far.
  3. Create a hash-map ‘mp’ with the key of type string and the value of type integer. And set the value of mp[“0$0”] equal to 1, as this will help us to consider subarrays starting from the 0th index later on.
  4. Run a FOR loop for variable ‘i’ from 0 to N - 1. In each iteration:
    • Increment the count of the variable corresponding to the current element.
    • Set the value of ‘diff01’ equal to count0 - count1, and set the value of ‘diff02’ equal to count0 - count2.
    • Create a hash key ‘hashKey’ of type string equal to “diff01” + “$” + “diff02”.
    • Increment the ‘ans’ by the value stored in mp[hashKey], this is because this value stores the count of previous indices that had the same ‘diff01’ and ‘diff02’ values. This means that we have now included all the valid subarrays that end at the ith index.
    • Increment the value of mp[hashKey] by 1, so that the subarrays that start at the ith index are considered in the future calculation.
  5. Finally, return the value of ‘ans’.