Introduction
In this article, we are going to learn how to write a program to find top k numbers in a stream.
Problem Statement
The problem statement is that we are given an array of ‘n’ elements and an integer ‘K’.
Every time a new number is read, your objective is to keep at most K numbers at the top (as per their frequency in decreasing order). When the input stream has k different items, we just need to print the top k integers sorted by frequency; otherwise, we need to display all distinct elements sorted by frequency.
Sample Input 1
{5, 2, 1, 3, 4} |
Sample Output:
5 2 5 1 2 5 1 2 3 5 1 2 3 4 |
Explanation:
|
Sample Input 2
{5, 2, 1, 3, 2} |
Sample Output:
5 2 5 1 2 5 1 2 3 5 2 1 3 5 |
Explanation:
|
Idea
Here, the approach begins with saving the top k elements that have the highest frequency. They can be kept in a vector or an array. Create a hashmap to hold element-frequency pairings in order to keep track of element frequencies. When a new number is added to a stream of numbers, the frequency of that number is updated in a hash table, and the new number is added at the end of the list of K numbers (making a total of k+1 elements). Next, the adjacent elements of the list are compared, and if a higher frequency element is stored next to one, the higher frequency element is swapped out.
Let's see an example:

Algorithm
- Create a Hashmap hm, and an array of k + 1 length.
- Traverse the input array from start to end.
- Insert the element at k+1 th position of the array, and update the frequency of that element in HashMap.
- Now, traverse the temp array from start to end – 1.
- For every element, compare the frequency and swap if a higher frequency element is stored next to it; if the frequency is the same, then swap if the next element is greater.
-
Print the top k element in each traversal of the original array.
Implementation in Java:
import java.io.*;
import java.util.*;
class HelloWorld {
// This function is created to find in top vector for element.
static int look(int[] arr, int ele)
{
for (int i=0; i<arr.length; i++)
if (arr[i] == ele)
return i;
return -1;
}
// This function is created to print top k integers.
static void topKint(int[] a, int n, int k)
{
// This vector is of size 'k+1' for storing elements.
int[] top = new int[k + 1];
// This array is created to keep track of the frequency.
HashMap<Integer, Integer> fre = new HashMap<>();
for (int i=0; i<k+1; i++)
fre.put(i, 0);
// Iteration till the end of the stream.
for (int m=0; m<n; m++) {
// Incrementing the frequency.
if (fre.containsKey(a[m]))
fre.put(a[m], fre.get(a[m]) + 1);
else
fre.put(a[m], 1);
top[k] = a[m];
// find in top vector for the same element.
int i = look(top, a[m]);
i -= 1;
// iteration from the position of the element to zero.
while (i>=0) {
/* comparing the frequency and then swapping
if higher frequency element is stored next to it.
*/
if (fre.get(top[i]) < fre.get(top[i + 1])) {
int temp = top[i];
top[i] = top[i + 1];
top[i + 1] = temp;
}
/* if the frequency is same, compare the elements and
swap if next element is higher.
*/
else if ((fre.get(top[i]) == fre.get(top[i + 1])) && (top[i] > top[i + 1])) {
int temp = top[i];
top[i] = top[i + 1];
top[i + 1] = temp;
}
else
break;
i -= 1;
}
// Now print the top k elements.
for (int j=0; j<k && top[j]!=0; ++j)
System.out.print(top[j] + " ");
}
System.out.println();
}
public static void main(String args[])
{
int k = 4;
int[] arr = {5, 2, 1, 3, 4};
int n = arr.length;
topKint(arr, n, k);
}
}
Input:
int arr[] = {5, 2, 1, 3, 4}; |
Output:
5 2 5 1 2 5 1 2 3 5 1 2 3 4 |
Implementation in C++:
// C++ program to find top_arr k elements in a stream
#include <bits/stdc++.h>
using namespace std;
// This function is created to print top_arr k integers.
void topKint(int a[], int n, int k)
{
// This vector is of size 'k+1' for storing elements.
vector<int> top_arr(k + 1);
// This array is created to keep track of the frequency.
unordered_map<int, int> fre;
// Iteration till the end of the stream.
for (int m=0; m<n; m++) {
// incrementing the frequency.
fre[a[m]]++;
// storing that element in the top_arr vector.
top_arr[k] = a[m];
auto it = find(top_arr.begin(), top_arr.end() - 1, a[m]);
// iteration from the position of the element to zero.
for (int i=distance(top_arr.begin(), it)-1; i>=0; --i) {
/* comparing the frequency and then swapping
if higher frequency element is stored just next to it. */
if (fre[top_arr[i]]<fre[top_arr[i + 1]]) {
swap(top_arr[i], top_arr[i + 1]);
}
/* if the frequency is same, compare the elements and
swap if next element is higher. */
else if ((fre[top_arr[i]]==fre[top_arr[i+1]])&&(top_arr[i]>top_arr[i+1])) {
swap(top_arr[i], top_arr[i + 1]);
}
else {
break;
}
}
// print top_arr k elements
for (int i = 0; i < k && top_arr[i] != 0; ++i)
cout << top_arr[i] << ' ';
}
cout << endl;
}
// Driver program to test above function
int main()
{
int k = 4;
int arr[] = {5, 2, 1, 3, 4};
int n = sizeof(arr) / sizeof(arr[0]);
topKint(arr, n, k);
return 0;
}
Output:
5 2 5 1 2 5 1 2 3 5 1 2 3 4 |
Complexity Analysis
Time Complexity: O(n*k), the temp array of size k is traversed once for each traversal.
Space Complexity: O(n), for storing the elements in the HashMap.
You can also read about the Longest Consecutive Sequence.
Frequently Asked Questions
What is a hashtable?
A Hash table is one of the essential data structures that uses a particular function which is known as a hash function that maps a given value with a key to retrieve the elements faster.
How do we store data in a hashmap?
In hashmap, data is stored in the form of the key-value pair. This means every key will have some value or data mapped to it.
What is reduced form?
The reduced form is replacing the elements in an array with 0 to n-1 according to the order of the elements means the smallest element will be 0, and the largest element will be n-1.
Why is HashMap used?
Hashmaps can be said to as the most commonly used implementation of the concept of a map. Hashmaps allow arbitrary objects to be associated with other arbitrary objects. This can be very useful for doing things like grouping or joining data together by some common attribute.
Which is better: HashMap or ArrayList?
ArrayList stores the elements only as values and maintains the indexing for every element. While HashMap stores elements with key and value pairs, that means two objects. So HashMap takes more memory comparatively.