Solution Approach
The brute force algorithm for the given problem is to traverse the complete array using two loops and check the number of distinct elements for each subarray and then output the longest subarray.
The efficient solution is to use hashing and two-pointers. The two pointers are used to track the subarray we are considering at any moment. We will keep track of the frequency of the occurrence of items using hashing. Hashing will help us remove all the occurrences of the element from the subarray to ensure that there are at most K distinct elements in the subarray at any time.
Algorithm
-
Initialize two pointers first and last that mark the subarray's starting and ending point being considered. Also, create a dictionary for hashing. This dictionary will store the frequency of elements occurring between first and last. Initialize a counter to zero to store the number of distinct elements between first and last.
-
Loop over the given array considering each element at a time. Increment the frequency of the current element. If it is the first time the element has occurred, then increment the counter.
-
Continue until the counter becomes greater than K. If the counter becomes greater than K, remove the element from the left side. If the occurrence of any element becomes zero, decrement the counter.
- Print the longest such subarray.
Python Implementation
''' Collections library is used to create a dictionary to store the frequency of elements in the subarray being considered'''
import collections
''' Function to find the longest subarray having not more than K distinct elements '''
def LongestSubarray(arr, K):
size = len(arr)
frequency = collections.defaultdict(int)
first = 0
last = 0
counter = 0
temp =0
for idx in range(size):
''' Increment the frequency of arr[idx] by 1'''
frequency[arr[idx]] += 1
''' If it is the first time arr[idx] is occuring in the considered subarray then increment the counter '''
if frequency[arr[idx]] == 1:
counter += 1
''' Remove elements from the subarray until the counter becomes less than or equal to K'''
while counter > K:
frequency[arr[temp]] -= 1
''' If no other occurences remain for arr[temp] then decrement the counter'''
if frequency[arr[temp]] == 0:
counter -= 1
temp += 1
''' Update first and last for longest subarray'''
if (idx - temp + 1 >= last - first + 1):
last = idx
first = temp
''' Print the Longest subarray not having more than K distinct element'''
print(*arr[first: last+1])
''' Driver Function '''
def main():
arr = [1, 2, 3, 6, 3, 3, 2, 1 ,4, 5]
K = 3
LongestSubarray(arr, K)
''' Executing Main '''
main()
You can also try this code with Online Python Compiler
Run Code
Output
2 3 6 3 3 2
Complexities
Time Complexity
The time complexity of the given solution is O(N), where N is the number of elements in the given array.
Reason: We are traversing the entire array, hence the complexity of O(N).
Auxiliary space complexity
The Auxiliary space complexity of the given solution is O(N), where N is the number of elements in the given array.
Reason: We are creating a dictionary to store the frequency of occurrence of elements. Hence, the complexity of the given solution is O(N).
You can also read about the Longest Consecutive Sequence.
Frequently Asked Questions
What is an array?
An array is a container to store the elements in non-contiguous memory locations. In C++, an array can store only homogeneous elements, but in python, it is implemented using a list, and a list can store heterogeneous elements in python.
What is the subarray of an array?
A subarray of an array is a part or section of an array. It has to be continuous. For example if you are given an array arr[] = {1, 2, 3, 4, 5, 6, 7} then some of the subarrays of this array are {2, 3, 4}, {7}, {5, 6}. But {1, 3, 5} is not a subarray of the given array as the elements do not occur consecutively in the given array.
What is the significance of complexity calculation?
There are two main factors on which the performance depends time and space. If we have two solutions to a problem, then to determine the better solution, we can compare the time and space complexity of the two solutions.
What is the time complexity of a program?
The time complexity is the amount of run time of the program in terms of input data. While calculating the time complexity, the constant time is ignored, and the input data value is assumed to be huge.
What is the auxiliary space complexity of a program?
Auxiliary Space Complexity is the amount of extra space required by a program to solve a problem. All the constant amount of value is ignored. It is the space occupied by the program other than the input and output data.
Conclusion
This article discussed the problem of finding the longest subarray not having more than K Distinct Elements. We also discussed the implementation and complexities of the solution to the problem using hashing.
I hope you would have gained a better understanding of these topics now!
Recommended problems -
Are you planning to ace the interviews of reputed product-based companies like Amazon, Google, Microsoft, and more?
Attempt our Online Mock Test Series on Coding Ninjas Studio now!
Happy Coding!