Code360 powered by Coding Ninjas X Naukri.com. Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1.
Introduction
1.1.
Problem Statement
1.2.
Idea
1.3.
Algorithm
1.4.
Complexity Analysis 
2.
Frequently Asked Questions
2.1.
What is a hashtable?
2.2.
How do we store data in a hashmap?
2.3.
What is reduced form?
2.4.
Why is HashMap used?
2.5.
Which is better: HashMap or ArrayList?
3.
Conclusion
Last Updated: Mar 27, 2024

Find top k numbers in a stream

Author Amit Singh
0 upvote
gp-icon
Data structures & algorithms (Beginner to Intermediate)
Free guided path
13 chapters
99+ problems
gp-badge
Earn badges and level up

Introduction

In this article, we are going to learn how to write a program to find top k numbers in a stream. 

Problem Statement

The problem statement is that we are given an array of ‘n’ elements and an integer ‘K’. 

Every time a new number is read, your objective is to keep at most K numbers at the top (as per their frequency in decreasing order). When the input stream has k different items, we just need to print the top k integers sorted by frequency; otherwise, we need to display all distinct elements sorted by frequency.
 

Sample Input 1

{5, 2, 1, 3, 4}

Sample Output:

5 2 5 1 2 5 1 2 3 5 1 2 3 4

Explanation: 

  1. We will start by reading 5. So, there is only a single element 5 whose frequency is highest till now. 
    So print 5.
  2. As soon as we read 2, the number of elements is two, 2, and 5. Both have the same frequency. 
    Since 2 is smaller than 5 yet has the same frequency, we shall print 2 5.
  3. Again after we read 1, there will be 3 elements 1, 2, and 5. All of the elements of the array will have the same frequency, 
    So we will print 1 2 5. 
    In a similar way, print 1 2 3 5 after reading 3.
  4. After we read the last element, 4, the frequency of all the elements will be the same. 
    So we will print 1 2 3 4.

 

Sample Input 2

{5, 2, 1, 3, 2}

Sample Output:

5 2 5 1 2 5 1 2 3 5 2 1 3 5

Explanation: 

  1. We will start by reading 5. So, there is only a single element 5 whose frequency is highest till now. 
    So print 5.
  2. As soon as we read 2, the number of elements is two, 2, and 5. Both have the same frequency. 
    Since 2 is smaller than 5 yet has the same frequency, we shall print 2 5.
  3. Again after we read 1, there will be 3 elements 1, 2, and 5. All of the elements of the array will have the same frequency, 
    So we will print 1 2 5. 
    In a similar way, print 1 2 3 5 after reading 3.
  4. After we read the last element, 2, the frequency of all the elements will be the same except for 2. The frequency of 2 will become 2. So, it will come at the top and the rest that have same frequency will come in sorted order. 
    So, we will print 2 1 3 5. 

Idea

Here, the approach begins with saving the top k elements that have the highest frequency. They can be kept in a vector or an array. Create a hashmap to hold element-frequency pairings in order to keep track of element frequencies. When a new number is added to a stream of numbers, the frequency of that number is updated in a hash table, and the new number is added at the end of the list of K numbers (making a total of k+1 elements). Next, the adjacent elements of the list are compared, and if a higher frequency element is stored next to one, the higher frequency element is swapped out. 

Let's see an example:

Dry-run using Hashing

Algorithm

  1. Create a Hashmap hm, and an array of k + 1 length.
  2. Traverse the input array from start to end.
  3. Insert the element at k+1 th position of the array, and update the frequency of that element in HashMap.
  4. Now, traverse the temp array from start to end – 1.
  5. For every element, compare the frequency and swap if a higher frequency element is stored next to it; if the frequency is the same, then swap if the next element is greater.
  6. Print the top k element in each traversal of the original array.
     

Implementation in Java:

import java.io.*;
import java.util.*;

class HelloWorld {

    // This function is created to find in top vector for element.
    static int look(int[] arr, int ele)
    {
        for (int i=0; i<arr.length; i++)
        if (arr[i] == ele)
        return i;
        return -1;
    }

    // This function is created to print top k integers.
    static void topKint(int[] a, int n, int k)
    {
        // This vector is of size 'k+1' for storing elements.
        int[] top = new int[k + 1];

        // This array is created to keep track of the frequency.
        HashMap<Integer, Integer> fre = new HashMap<>();
        for (int i=0; i<k+1; i++)
            fre.put(i, 0);

        // Iteration till the end of the stream.
        for (int m=0; m<n; m++) {
            // Incrementing the frequency.
            if (fre.containsKey(a[m]))
                fre.put(a[m], fre.get(a[m]) + 1);
            else
                fre.put(a[m], 1);

            top[k] = a[m];

            // find in top vector for the same element.
            int i = look(top, a[m]);
            i -= 1;

            // iteration from the position of the element to zero.
            while (i>=0) {
                /*  comparing the frequency and then swapping
                    if higher frequency element is stored next to it.
                */

                if (fre.get(top[i]) < fre.get(top[i + 1])) {
                    int temp = top[i];
                    top[i] = top[i + 1];
                    top[i + 1] = temp;
                }

                /*  if the frequency is same, compare the elements and
                    swap if next element is higher.
                */
                else if ((fre.get(top[i]) == fre.get(top[i + 1])) && (top[i] > top[i + 1])) {
                    int temp = top[i];
                    top[i] = top[i + 1];
                    top[i + 1] = temp;
                }
                else
                    break;
                i -= 1;
            }

            // Now print the top k elements.
            for (int j=0; j<k && top[j]!=0; ++j)
                System.out.print(top[j] + " ");
        }
        System.out.println();
    }

    public static void main(String args[])
    {
        int k = 4;
        int[] arr = {5, 2, 1, 3, 4};
        int n = arr.length;
        topKint(arr, n, k);
    }
}

 

Input:

int arr[] = {5, 2, 1, 3, 4};

Output:

5 2 5 1 2 5 1 2 3 5 1 2 3 4

 

Implementation in C++:

// C++ program to find top_arr k elements in a stream
#include <bits/stdc++.h>
using namespace std;

// This function is created to print top_arr k integers.
void topKint(int a[], int n, int k)
{
    // This vector is of size 'k+1' for storing elements.
    vector<int> top_arr(k + 1);


    // This array is created to keep track of the frequency.
    unordered_map<int, int> fre;


    // Iteration till the end of the stream.
    for (int m=0; m<n; m++) {
        // incrementing the frequency.
        fre[a[m]]++;


        // storing that element in the top_arr vector.
        top_arr[k] = a[m];


        auto it = find(top_arr.begin(), top_arr.end() - 1, a[m]);


        // iteration from the position of the element to zero.
        for (int i=distance(top_arr.begin(), it)-1; i>=0; --i) {
            /*  comparing the frequency and then swapping
                if higher frequency element is stored just next to it. */
            if (fre[top_arr[i]]<fre[top_arr[i + 1]]) {
                swap(top_arr[i], top_arr[i + 1]);
            }
           
            /* if the frequency is same, compare the elements and
                swap if next element is higher. */
            else if ((fre[top_arr[i]]==fre[top_arr[i+1]])&&(top_arr[i]>top_arr[i+1])) {
                swap(top_arr[i], top_arr[i + 1]);
            }
            else {
                break;
            }
        }


        // print top_arr k elements
        for (int i = 0; i < k && top_arr[i] != 0; ++i)
            cout << top_arr[i] << ' ';
    }
    cout << endl;
}


// Driver program to test above function
int main()
{
    int k = 4;
    int arr[] = {5, 2, 1, 3, 4};
    int n = sizeof(arr) / sizeof(arr[0]);
    topKint(arr, n, k);
    return 0;
}

 

Output:

5 2 5 1 2 5 1 2 3 5 1 2 3 4

Complexity Analysis 

Time Complexity: O(n*k), the temp array of size k is traversed once for each traversal.

Space Complexity: O(n), for storing the elements in the HashMap.

You can also read about the Longest Consecutive Sequence.

Frequently Asked Questions

What is a hashtable?

A Hash table is one of the essential data structures that uses a particular function which is known as a hash function that maps a given value with a key to retrieve the elements faster.

How do we store data in a hashmap?

In hashmap, data is stored in the form of the key-value pair. This means every key will have some value or data mapped to it.

What is reduced form?

The reduced form is replacing the elements in an array with 0 to n-1 according to the order of the elements means the smallest element will be 0, and the largest element will be n-1.

Why is HashMap used?

Hashmaps can be said to as the most commonly used implementation of the concept of a map. Hashmaps allow arbitrary objects to be associated with other arbitrary objects. This can be very useful for doing things like grouping or joining data together by some common attribute.

Which is better: HashMap or ArrayList?

ArrayList stores the elements only as values and maintains the indexing for every element. While HashMap stores elements with key and value pairs, that means two objects. So HashMap takes more memory comparatively.

Get the tech career you deserve, faster!
Connect with our expert counsellors to understand how to hack your way to success
User rating 4.7/5
1:1 doubt support
95% placement record
Akash Pal
Senior Software Engineer
326% Hike After Job Bootcamp
Himanshu Gusain
Programmer Analyst
32 LPA After Job Bootcamp
After Job
Bootcamp

Conclusion

In this article, we have studied how to write a program to find top k numbers in a stream. We hope that this article has provided you with the help to enhance your knowledge regarding hashmap and if you would like to learn more, check out our articles on ArraysHashmap and Problems on hashmap.

Check out this problem - Next Smaller Element

Refer to our guided paths on Coding Ninjas Studio to learn more about DSA, Competitive Programming, JavaScript, System Design, etc. Enrol in our courses and refer to the mock test and problems available, Take a look at the interview experiences and interview bundle for placement preparations.

Do upvote our blog to help other ninjas grow.

Merry Learning!

Previous article
Find The Top Three Repeated Elements In An Array
Next article
Find elements which are present in one array but not in other
Guided path
Free
gridgp-icon
Data structures & algorithms (Beginner to Intermediate)
13 chapters
109+ Problems
gp-badge
Earn badges and level up
Live masterclass