Minimum Deletions To Make Character Frequencies Unique

Moderate
0/80
Average time to solve is 25m
profile
Contributed by
26 upvotes
Asked in companies
OlaSAP LabsMicrosoft

Problem statement

You are given a string ‘STR’ consisting of lowercase English alphabets. Your take is to find out the minimum number of character deletions required such that each character in the final string has a unique frequency.

Detailed explanation ( Input/output format, Notes, Images )
Input Format :
The first line contains an integer ‘T’, which denotes the number of test cases to be run. Then, the T test cases follow.

The first line of each test case contains a single string ‘STR’ denoting the initial string.
Output Format :
For each test case, print a single integer, denoting the minimum number of character deletions required such that each character in the final string has a unique frequency.

Output for each test case will be printed in a separate line.
Note :
You don’t have to print anything, it has already been taken care of. Just implement the given function.
Constraints :
1 <= T <= 10
2 <= |STR| <= 10^5

Where |STR| denotes the length of the initial string ‘STR’. 

Time limit: 1 second
Sample Input 1 :
2
babccbc
gttqtq
Sample Output 1 :
1
0
Explanation For Sample Input 1 :
Test Case 1 :
The frequency of ‘a’ is 1, of ‘b’ is 3 and of ‘c’ is 3. Therefore after deletion of one character of either ‘b’ or ‘c’ makes the frequencies 1,2 and 3.

Test Case 2 :
The frequency of 'g' is 1, 't' is 3 and 'q' is 2. Therefore there is no need for a delete operation.
Sample Input 2 :
2
abbccc
cbbd
Sample Output 2 :
0
1
Explanation For Sample Input 2 :
For the first test case, each character already has a unique frequency(1 for ‘a’, 2 for ‘b’ and 3 for ‘c’), hence we don’t need to do any deletions.

For the second test case frequency of ‘c’ and ‘d’ is equal hence we need to delete either of them. Therefore the answer is 1.
Hint

Without any loss of generality, if two characters have the same frequency(say ‘FREQ’) in the initial string, then the optimal step here is to delete one instance of either of these characters such that now their frequencies are ‘X’ and ‘FREQ’-1.

Approaches (1)
Greedy

 

  • We need to think greedily. For each character in the initial string ‘S’, count its frequency. Let’s call this frequency array as ‘FREQ[26]’, the size of this array will be 26 as the string only consists of lowercase English alphabets.
  • Then we need to count how many characters have the same frequency. To do that, create an array ‘COUNT[N+1]’, where ‘COUNT[i]’ denotes how many characters have frequency ‘i’. For i goes from 0 to 26 increase ‘count[freq[i]]’ by 1.
  • If ‘COUNT[i]’ is greater than 1, that means more than 1 characters, at this instant, have equal frequency(which is equal to i). Hence in those cases we need to delete ‘COUNT[i] - 1’ characters and increase ‘COUNT[i-1]’ by ‘COUNT[i] - 1’.
  • Repeat the above step for ‘N’ times.
  • The answer is the sum of the number of deletions we made in the ‘COUNT’ array.
Time Complexity

O(‘N’), where ‘N’ is the length of the string ‘STR’.

 

First, we need to count each character’s frequency, which takes O(‘N’) time according to the algorithm described above. Then, after knowing the count of each frequency, we make that count at max 1, which again takes O(‘N’); hence the final time complexity will be O(‘N’).

Space Complexity

O(‘N’), where ‘N’ is the length of the string ‘STR.’

 

The size of ‘freq’ array can be at max 26, and that of the ‘count’ array can be at max ‘N’; hence the net space complexity is O(’N’).

Code Solution
(100% EXP penalty)
Minimum Deletions To Make Character Frequencies Unique
Full screen
Console