Code360 powered by Coding Ninjas X Naukri.com. Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1.
Introduction 
1.1.
Example
1.1.1.
Input 
1.1.2.
Output
1.1.3.
Explanation
1.1.4.
Input 
1.1.5.
Explanation
2.
Solution Approach
2.1.
Naive Approach 
2.2.
Efficient Approach 
2.3.
Idea
2.4.
Algorithm 
2.5.
C++ Code implementation
2.6.
Output
2.7.
Complexity Analysis
2.7.1.
Time complexity: O(Q * NlogN)
2.7.2.
Space Complexity: O(N ^ 2)
3.
FAQs
4.
Key Takeaways 
Last Updated: Mar 27, 2024

Merge Sort Tree For Range Order Statistics

Author Ayush Prakash
1 upvote

Introduction 

The problem statement is: Given an input array ‘input[]’ of size ‘n and ‘q’ queries. Every query has three arguments, ‘l’, ‘r’ and ‘k’. We need to print the Kth smallest element in the range from l to r. l and r mark the start and end of the subarray, respectively. 

Example

Input 

INPUT[] = {3, 4, 0, 6, 8, 1} 
Q = [4, 5, 2] 

Output

8

Explanation

The 2nd smallest element in the subarray {8, 1} is 8.

Input 

INPUT[] = {1, 5, 9, 0, 9, 1, 8, 3, 3, 2} 
Q = [0, 5, 3] 

Output

1

Explanation

Clearly, the 3rd smallest element in the subarray {1, 5, 9, 0, 9, 1} is 1.

Also see, Euclid GCD Algorithm

Solution Approach

Naive Approach 

For any query(l, r, k), we first create a temporary array ‘temp’. We copy the elements of the array from i = l to r into temp. Then we sort temp. Finally, we print temp[k - 1] as the answer. This approach is quite easy to implement. 

The time complexity of this approach is O(NlogN) as we are sorting. The space complexity is O(N) as we are using temp as an auxiliary array. 

Efficient Approach 

Idea

The idea is to build a merge sort tree(segment tree) in which every node contains an array of sorted elements of the subarray that is represented by that node. For more clarity, refer to the image attached below. For querying, the same traditional methodology is followed. However, everything is discussed in the next section. 

Algorithm 

  1. Declare a global 2D array ‘seg’, that refers to the merge sort tree. As already mentioned, every node contains a sorted subarray, ranging from ‘low’ to ‘high’.
  2. Define a function buildST(idx, low, high, input), that builds the merge sort tree recursively:
    1. Understanding the parameters: A node is defined on the range ‘low’ to ‘high’. The node's data is stored in seg[idx] or, essentially, the node = seg[idx]. ‘input’ is the input array.
    2. Base case: when low = high, this means, that the range consists of only one element. Hence, simply set seg[i] = input[low]. 
    3. Else, calculate mid = (low + high) / 2.
    4. Recursively, call ‘buildST(2 * idx + 1, low, mid, input)’ and buildST(2 * idx + 2, mid + 1, high, input) to build the left and right subtrees. If the current node is seg[idx], the left node is seg[2 * idx + 1] and the right node is seg[2 * idx + 2]. This is an important property of merge sort trees.  
    5. After recursively building the left and right subtrees, the values of left and right nodes are used to build the current node. FYI, the left and right node store sorted subarrays in a range [low, mid] and [mid + 1, high], which are merged to construct a new bigger subarray that is stored in seg[idx], i.e. the current node. 
  3. Define a wrapper function ‘query(l, r, k, n)’ as: 
    1. ‘l’ is the start, and ‘r’ is the end of the query range. ‘k’ denotes that the Kth smallest number in the given range has to be returned. ‘n’ is the total number of elements in the ‘input’ array. 
    2. This function calls ‘queryHelper(idx, l, r, low, high)’, which does all the heavy lifting and returns a sorted subarray in the range [l, r]. ‘idx’ points to the node corresponding to the range [low, high], a query range is bounded by [l, r]. 
    3. This function is defined as: 
      1. When the node lies completely inside the range [l, r] return seg[idx].
      2. When the node lies completely outside the range [l, r], return an empty array.
      3. Else, recursively call for the left child i.e. queryHelper(2 * idx + 1, l, r, low, mid) and queryHelper(2 * idx + 2, l, r, mid + 1, high), store the answers in ‘left’ and ‘right’ arrays, respectively, merge them and return the merger array. 
    4. The wrapper function makes a call to queryHelper(0, l, r, 0, n - 1), stores the answer of the call into an array and returns the element at the (k - 1)th index. 

C++ Code implementation

#include <bits/stdc++.h>
using namespace std;
 
// Globally declared
vector<vector<int>> seg;
 
// This function mergers two sorted arrays
vector<int> merge(vector<int> &a1, vector<int> &a2)
{
  int n = a1.size(), m = a2.size();
  int i = 0, j = 0;
  vector<int> res;
  while (i < n and j < m)
  {
      if (a1[i] < a2[j])
      {
          res.push_back(a1[i]);
          i += 1;
      }
      else
      {
          res.push_back(a2[j]);
          j += 1;
      }
  }
  while (i < n)
  {
      res.push_back(a1[i++]);
  }
  while (j < m)
  {
      res.push_back(a2[j++]);
  }
  return res;
}
 
// This function builds the segment tree
void buildST(int idx, int low, int high, vector<int> &input)
{

  // Base case: when there is a single elment in the range [LOW, HIGH]
  if (low == high)
  {
      seg[idx].push_back({input[low]});
      return;
  }
  int mid = (low + high) / 2;

  // Recursive calls to build the left and right subtrees
  buildST(2 * idx + 1, low, mid, input);
  buildST(2 * idx + 2, mid + 1, high, input);

  /*
  The current node's value is calculated by merging
  the values of the left and the right nodes.
  */
  seg[idx] = merge(seg[2 * idx + 1], seg[2 * idx + 2]);
}
 
vector<int> queryHelper(int idx, int l, int r, int low, int high)
{

  // When a node is completely inside
  if (low >= l and high <= r)
  {
      return seg[idx];
  }

 
   // When a node is completely outside
  if (r < low or l > high)
  {
      return vector<int>();
  }

  int mid = (low + high) / 2;

  /*
  Fetch sorted subarray from the left and the right subtrees
  and merge them to get a sorted array in the range [l, r]
  */
  vector<int> left = queryHelper(2 * idx + 1, l, r, low, mid);
  vector<int> right = queryHelper(2 * idx + 2, l, r, mid + 1, high);
  return merge(left, right);
}
 
/*
Wrapper function which triggers queryHelper,
returns the Kth smallest element in the given range [l, r]
*/
int query(int l, int r, int k, int n)
{
  vector<int> res = queryHelper(0, l, r, 0, n - 1);
  return res[k - 1];
}
 
int main()
{
  vector<int> arr{3, 4, 0, 6, 8, 1};
  int n = arr.size();

  // Setting up globals
  seg.clear();
  seg.resize(4 * n);

  // Building the segment tree
  buildST(0, 0, n - 1, arr);

 
   // Querying and returning the answer
  cout << query(4, 5, 2, n) << endl;
}

Output

8

Complexity Analysis

Time complexity: O(Q * NlogN)

For a single query, the most that we can go down in the merge sort tree is the height. The height of the tree ~logN, and at every call, we perform a merge operation that takes O(N) time in the worst case. If there are Q queries to answer, the overall time complexity becomes O(Q * NlogN).

Space Complexity: O(N ^ 2)

We need some auxiliary space to store the algorithm's merge sort tree that we build. The size of the merge sort tree is 4 * N, and every element in the tree is an array itself. The size of this array at max can be N because the nodes in the merge sort tree store a subarray of the given input array. Therefore the overall space complexity is O(N ^ 2).

Read More - Time Complexity of Sorting Algorithms

FAQs

  1. What is a merge sort tree?
    A merge sort tree is a special kind of data structure that is used to store information about intervals or segments of an array. It is represented by an array.
     
  2. Is a merge sort tree a complete tree or a full tree?
    The merge sort tree is a full binary tree as we always divide segments of the given array into halves at every level.
     
  3. What is the time taken for query and update operations on a merge sort tree? 
    O(logN) for both operations.

Key Takeaways 

In this blog, we discussed a very interesting problem: Merge Sort tree for range order statistics. We learned about merge sort trees. How does this data structure optimize query and update operations? We learned to build a merge sort tree based on the requirement and write the query function to answer the queries efficiently. We also analyzed the space and time complexity of the algorithm thoroughly. 

Recommended Problem - Merge K Sorted Arrays

Application of merge sort trees can be found a lot in competitive programming. To practice such questions, you can check out this amazing platform Coding Ninjas Studio. Also, to master competitive programming, enrol in the Best Competitive Programming Course Online.

If you learned anything new or you feel that this blog is informative, please do share this with your friends. Happy coding!

Live masterclass