Introduction
In this blog, we will discuss how to find the number of overlapping intervals to be removed to maximize the Non-Overlapping Intervals. It is an important problem to get a strong grip over greedy algorithms and has also been asked in many interviews.
A common prerequisite is having a basic knowledge of sorting, dp, and how comparators are written. Without knowing this, it is recommended to learn sorting, dp, and comparators. Nevertheless, we’ll try to cover each point in-depth that is required to solve the problem, Non-Overlapping Intervals.
What do we mean by the Overlapping Intervals?
Well, it’s a trivial question, but still, for the sake of clarity, we’ll define that let
Two intervals be [a, b], and [c, d] where a<=b and c <= d, are said to be overlapping iff their intersection is not empty. In the mathematical form, we can write as:
Two intervals [a, b] and [c, d] are said to be overlapping if,
→ a <= c <= b <= d
Or,
→ [a, b] ⋂ [c, d] ≠ 𝟇
Let’s look at the problem statement:
You are given a set of intervals [ai, bi ] of length N, where i 𝞊 { 1, 2, ….. N }. Return the minimum number of intervals you need to remove to make the rest of the intervals non-overlapping.
Let’s understand the problem using an example.
Input: N and a set of intervals of length N,
Intervals = {[1,2], [2,3], [3,4], [1,3]}
Output: 1
Explanation: The minimum number of intervals that can be removed is 1 i.e, [1, 3].
Approach(Naive)
First, let’s look at the naive approach that would probably strike anybody’s mind before jumping to an optimal approach.
So, the basic approach considers all subsets of the given set of intervals and then finds the maximum length of the subset of non-overlapping intervals using the definition mentioned earlier.
Hence the minimum number of intervals removed = total number of intervals - the maximum size of the subset of non-overlapping intervals
So the naive/basic approach could be formulated as:
- For each subset in the given set of intervals, check if the subset contains only non-overlapping intervals.
- If the subset contains only non-overlapping intervals then maximize the size.
- Finally, return the answer as the total number of intervals - the maximum size of the subset of non-overlapping intervals.
Now let’s look at the PseudoCode.
PseudoCode
Algorithm
___________________________________________________________________
procedure removeMinimumOverlappingIntervals(intervals):
___________________________________________________________________
1. subsets ← computeSubsets(intervals) #compute all subsets of intervals
2. maxLength ← 0 # initially set to 0 in case the intervals size is 0
3. for subset in subsets do
4. if containsNonOverlappingIntervals(subset) == True do
5. maxLength ← max(maxLength, subset.size())
6. end if
7. return intervals.size() - maxLength
8. end procedure
___________________________________________________________________
CODE IN C++
//C++ program to find the minimum number of overlapping intervals to be removed
#include <bits/stdc++.h>
using namespace std;
vector<vector<vector<int>>> computeSubsets(vector<vector<int>> & intervals, int N){
vector<vector<vector<int>>> subsets;
// we know that subsets can be represented using an integer
// and there are at most 1<<N subsets
// hence iterating over all integers from 0 to 1<<N
for(int i=0; i<(1<<N); ++i){
vector<vector<int>> v; // current subset
int j = 0; // variable to check if the bit at a position is set in i
while((1<<j)<=i){
if(((1<<j)&i)>0){ // if bit at current position is set in both i and j
v.push_back(intervals[j]); // push it in the current subset
}
j++; // shift to the next bit
}
if(v.size()>0) // if the subset is not empty push it
subsets.push_back(v); // in the subsets vector
}
// return the subsets
return subsets;
}
// boolean function that finds whether the set
// of intervals contain the set of non-overlapping intervals
bool containsNonOverlappingIntervals(vector<vector<int>> subset){
sort(subset.begin(), subset.end());
for(int i=0;i<subset.size()-1;++i){
// if intersection is not null
if(subset[i][1]>=subset[i+1][0] or subset[i][1]>=subset[i+1][0]){
return false; // this subset doesn't contain non-overlapping intervals
}
}
return true;
}
// finds the minimum number of overlapping intervals to be removed
int removeMinimumOverlappingIntervals(vector<vector<int>> & intervals, int N){
int maxLength = 0; // maximum length of the non-overlapping intervals set
// subsets
vector<vector<vector<int>>> subsets = computeSubsets(intervals, N);
// iterate over all subsets
for(vector<vector<int>> subset : subsets){
// if it contains non-overlapping Intervals
if(containsNonOverlappingIntervals(subset)){
//maximise the length
maxLength = max(maxLength, (int)subset.size());
}
}
// return maxi_length
return N - maxLength;
}
int main() {
// number of intervals
int N = 4;
vector<vector<int>> v{{1, 2}, {2, 3}, {3, 4}, {1, 3}};
cout<<"The minimum number of intervals to be removed: ";
cout<<removeMinimumOverlappingIntervals(v, N);
return 0;
}
Output
The minimum number of intervals to be removed: 1
Complexity Analysis
Time Complexity: O((N*logN) *2N)
This is because we are computing all the subsets, and for each subset, we are checking if it contains all non-overlapping intervals. So to compute the number of subsets, it takes O(2N) time. The maximum length can be N. Sorting the N length subset takes O(NlogN) time. Hence in total, it takes O((NlogN) *2N) time.
Space complexity: O(2N) at the worst case as we are computing all the subsets.
The above algorithm has some loopholes that we need to tackle.
The foremost problem is the exponential time complexity the algorithm is taking, and then, the space complexity is also worse, i.e., exponential.
This gives us the motivation to improve our algorithm.
So let’s think about an efficient approach to solve the problem.