Last Updated: 25 Feb, 2021

Find Duplicates Using Bit Array

Moderate
Asked in companies
MicrosoftD.E.Shaw

Problem statement

You are given an array/list ‘ARR’ of N elements where N <= 32000. ‘ARR’ can have duplicate entries. Your task is to find duplicate elements, provided you can use a maximum of 4 KiloBytes of extra memory.

Note :
Any element can have a maximum of two entries, or we can say a duplicate element will have two entries.

You need to return an array/list of duplicate elements where the elements in the array/list will be in increasing order.
For example:
Given ‘ARR’  = [1, 3, 2, 7, 4, 2, 1] 

In this ‘ARR’, 1 and 2 have duplicate entries.

So, the output array/list = [1, 2]
Input Format:
The first line of input contains a single integer T, representing the number of test cases.
Then the T test cases follow.

The first line of each test case contains a number N denoting the size of the array/list ‘ARR’.

The second line contains N space-separated distinct integers representing the ‘ARR’ elements.
Output format:
For each test case print the output array/list where elements are separated by a single space.

The output of every test case will be printed in a separate line. 
Note :
You don’t have to print anything, it has already been taken care of. Just implement the given function.
Constraints
1<= T <=100
1 <= N <= 32000
0 <= ‘ARR[i]’ <= 32000

Time limit: 1 second

Approaches

01 Approach

The idea is very simple. For each element, we will iterate over the whole ‘ARR’ to check whether duplicate exits or not.

 

  • Initialize an output array/list of integers say ‘ANS’ which will store duplicates.
  • Iterate over the ‘ARR’ for 0 <= i < ‘N’ and do :
    • Iterate over the ‘ARR’ for i+1 <= j < ‘N’ and do:
      • If ‘ARR[i]’ is equal to ‘ARR[j]’ :
        • Append ‘ARR[i]’ into ‘ANS’
  • Return ‘ANS’.

02 Approach

The idea is to use a bit array as we can have a maximum of 32000 elements but with given memory constraints 4KB which will evaluate to 32778 bits ( 4*8*1024 ), we can address up to 32778 addresses.

 

So we will make a Bits array for the purpose of hashing. Actually, an integer has a size of 4 Byte or we can say 32 Bits. So we will make an integer array/list say ‘BIT_ARRAY’ of size 1000 and each ‘BIT_ARRAY’ index will be used to address 32 elements so now we can address up to 32000 elements with help of ‘BIT_ARRAY’.

 

  • Initialize an array/list of integers  ‘BIT_ARRAY’  of size 1000 with 0.
  • Initialize an output array/list of integers ‘ANS’ which will store duplicates.
  • Iterate over the ‘ARR’ for 0 <= i < ‘N’ and do :
    • Find the index of ‘BIT_ARRAY’ which has the address for ‘ARR[i]’ as :
      • The index will be equal to ‘ARR[i]’ divided by 32.
    • Find the bit of index (BIT_NO) which have the address for ‘ARR[i]’ as :
      • Taking AND operation of ‘ARR[i]’ with 31(as each element has 32 Bits).
    • Let ‘VAL’ = ‘BIT_ARRAY[ 2^BIT_NO]’ .
    • If ‘VAL’ is equal to 1 :
      • It means the current element is duplicate so append ‘ARR[i]’ in ‘ANS’.
    • If ‘VAL’ is equal to 1 :
      • Set ‘BIT_ARRAY[ 2^BIT_NO]’ = 1.
  • Return ‘ANS’.