Last Updated: 6 Apr, 2021

Online Majority Element In Subarray

Moderate
Asked in companies
SamsungAmdocs

Problem statement

You are given an array ‘ARR’, and you are supposed to answer 'Q' queries. In each query you have three integers, ‘L’, ‘R’, and ‘X’. You are supposed to find if there exists an element in the subarray [ARR[L], ARR[L+1], …, ARR[R-1], ARR[R] that occurs at least ‘X’ number of times under the following conditions:

1. 0 <= L <= R < N, where L is the left-most index of the subarray, R is the right-most index of the subarray, and N is the number of elements in the array ‘ARR’.

2. 2 * X> R - L + 1 i.e. the threshold is always greater than the majority length of the subarray.

Input Format:
The first line of input contains an integer ‘T’, denoting the number of test cases. The test cases follow.

The first line contains an integer ‘N’, which denotes the number of elements in the array ‘ARR’.

The second line contains 'N' integers denoting the elements of the array ‘ARR’.

The third line contains an integer ‘Q’, which denotes the number of queries for the given array.

The next 'Q' lines contain three integers, ‘L’, ‘R’, and ‘X’, where 'L' is the left-most index of the subarray, 'R' is the right-most index of the subarray, and ‘X’ is the threshold value for the given query.
Output Format:
For each test case, return the majority elements if it exists. Otherwise, return -1.
Note:
You are not required to print the expected output, it has already been taken care of. Just implement the function.
Constraints:
1<= T <= 50
1 <= N <= 2* 10^4
1 <= ARR[i] <= 2* 10^4 
1 <= Q <= 10^4
0 <= L <= R < N

Time Limit: 1 sec

Approaches

01 Approach

The idea is to traverse over the given sub-array and store the count of every element in a hash map. Then, traverse over the hashmap, if the count of any element is greater than the threshold, return that element.

  1. Preprocessing(arr):
  • Declare an array 'NUMS'.
  • Equate 'NUMS' equals to 'ARR' that is passed into this function.
  1. findMajorityElement(L, R, X):
  • Declare a hashmap 'COUNT_OCCURENCES' that stores the count of all integers in the sub-array ‘L’ to ‘R’.
  • Iterate over the array from i = L to R:
    • Increment the count of the current element in the hashmap 'COUNT_OCCURENCES'.
  • Iterate over the hashmap 'COUNT_OCCURENCES' and if the occurrence of any integer is more than ‘X’, return the integer as the final answer.
  • Otherwise, return -1.

02 Approach

The idea is to use Moore’s Voting Algorithm

As we are given the condition 2 * X > r - l + 1, it means that we have to return the majority element if the count of the majority element is greater than ‘X’.

An additional check we will have at the last will be if the count of the majority element is greater than or equal to ‘X’. Otherwise, return -1.

Moore’s Voting Algorithm: It is based on the fact that since the majority element occurs more than floor(N/2) times, its frequency will be greater than the combined frequencies of all other elements. 

The algorithm gives the correct answer only if the majority element exists in the ARRay. So, in the end, we have to check the frequency of the majority element to confirm.


 

 

  1. Preprocessing/ Checker(ARR):
  • Declare an array 'NUMS'.
  • Equate 'NUMS' equals to ‘ARR’ that is passed into this function.
  1. findMajorityElement(l, r, X):
  • We will maintain 'MAJORITY_ELEMENT' to keep track of the possible candidate of the majority element.
  • We will initialize ‘count’ to 0 to store the count of 'MAJORITY_ELEMENT'.
  • Iterate over the ARRay from i = l to r:
    • If count is 0: Set 'MAJORITY_ELEMENT' to the current element, set count to 1, and continue iterating.
    • If the current element is equal to the 'MAJORITY_ELEMENT', we increment the count by 1.
    • Else, we decrement the count by 1.
  • Now, we again traverse through the array and find the count of 'MAJORITY_ELEMENT'.
  • If the count is greater than ‘X’, we return 'MAJORITY_ELEMENT' as the answer.
  • Otherwise, return -1.

03 Approach

The idea is to pre-process the given array and break it down and represent it in the form of a tree.

 

The information which will be stored in the nodes of the segment tree will be the majority element ie. a given node will store the information about the majority element in the given subtree. The answer for a particular node will depend on its left and right child node.

 

The data structure which provides these functionalities is known as a segment tree. It is also used for a number of queries like finding the minimum, maximum, GCD, etc in log(N) time.

 

How to create the segment tree?

The segment tree is represented in the form of an array. Each node contains information answers for a range or subarray. For any index ‘i’(not a leaf node), its children are at index 2*i and 2*i + 1 respectively. The index will be starting from 1 for normalization(ie. Index 0 will be unused).

We are eliminating some of the recursive calls by checking if the majority element exists in either of its left child or right child, then only we will find the majority element for the current node. 

  1. Build Segment tree/ Pre-processing:
  • Declare an array 'SEGMENT_TREE' for the segment tree that we are creating for storing the majority of elements of the subtrees.
  • Store the position of every element in a hash map 'POSITIONS' with key as an integer and value as an array of integers(storing the indices).
  • Iterate over all the array 'ARR':
    • Insert ‘i’ as one of the values behind arr[i] as the key.
  • Declare an integer ‘n’ for the class which stores the size of the given array 'ARR'.
  • The bottom-up approach is used to build a segment tree. An iterative approach can also be used but generally, the bottom-up approach is used. It takes O(N) time.
  • Define a recursive function 'BUILD_SEGMENT_TREE', that takes 3 arguments - 'ARR', 'CURRENT_INDEX', 'LOW', 'HIGH' that denotes the given array 'ARR',  the index at which we are storing the sum of the range 'LOW' to 'HIGH', and 'LOW' and 'HIGH' are the ranges of the indexes of the current subtree respectively.
    • Base Condition is when low and high indexes are the same, then make the segmentTree[currentIndex] equal to arr[l].
    • Find the middle 'MID' of 'LOW' and 'HIGH' as the average of 'LOW' and 'HIGH'.
    • Recursively call 'BUILD_SEGMENT_TREE' for the left subtree.i.e buildSegmentTree(arr, currentIndex*2+1,low,mid).
    • Recursively call 'BUILD_SEGMENT_TREE' for the right subtree.i.e buildSegmentTree(arr, currentIndex*2+2,mid+1,high).
    • Now, if the majority element exists in its left node, find the majority element for the current node. For finding the majority element for the current node - Define a function ‘findMajorityElement’ which takes 3 arguments - 'CURRENT_VALUE', 'LOW', 'HIGH' that denotes the current value in the segment tree(value o, the lowest and highest indices in which we are looking for the majority element. It returns the count of the majority element in the given subarray.
      • If we can’t find any occurrences of the 'CURRENT_VALUE' in the hash map 'POSITIONS', return 0 from the function as the number of occurrences of the majority element will be 0.
      • If we found the occurrence of 'CURRENT_VALUE', we find an iterator 'FIRST_IT' pointing to an element that has a value not less than ‘left’(lower_bound in C++).
      • If 'FIRST_IT' points to the end of the vector, return 0.
      • Now, we find an iterator 'SECOND_IT' pointing to an element that has a value greater than ‘right’.
      • Return secondIt - firstIt as the count of number of occurrences of the majority element.
         
  1. findMajorityElement(l, r, x):
  • Define a function 'RANGE_MINIMUM_QUERY' which takes 5 arguments - 'LOW', 'HIGH', 'REQUIREDL', 'REQUIREDH', 'CURRENT_INDEX' that denotes the lowest and highest indices of the current for the current recursive call, lowest and highest indices that we are looking for in the array, the current index that we are iterating for in the segment tree respectively. It returns a pair of integers that stores the majority element and its count respectively. If the majority element doesn’t exist, store {-1, -1}.
    • If the range completely lies inside the desired range, then check if it is a majority element and return the element with its count. Otherwise, return {-1, -1}.
    • If the range completely lies outside the desired range, return {-1, -1}.
    • Find the middle 'MID' of 'LOW' and 'HIGH' as the average of 'LOW' and 'HIGH'.
    • Recursively, find the majority element for the left sub tree ie. RANGE_MIN_QUERY(LOW, MID, REQUIREDL, REQUIREDH, 2* CURRENT_INDEX + 1).
    • Recrusively find the majority element for the right sub tree ie. RANGE_MIN_QUERY(MID + 1, HIGH, REQUIREDL, REQUIREDH, 2 * CURRENT_INDEX + 1).
    • If the majority element exists in the left subtree, then return the answer we got from the left recursive call.
    • If the majority element exists in the right subtree, then return the answer we got from the right recursive call.
    • Otherwise, return {-1, -1}.
  • Store it in a variable ‘TEMP’.
  • Check if the occurrences of the majority element are more than the threshold, return it as the majority element. Otherwise, return -1.