Last Updated: 25 Dec, 2020

Bundles

Easy
Asked in companies
AmazonMicrosoftZoho Corporation

Problem statement

You are given integers ‘N’ and ‘K’, and a list 'STRLIST' of ‘N’ strings. You are supposed to divide the string into exactly ‘N/K’(N will be divisible by K) groups and each group must consist of exactly ‘K’ strings. Each string must belong to exactly one group. The score of each group is equal to the length of the longest prefix that is common to all strings of that group.

You are supposed to find the maximum sum of scores of the groups.

Example :
{ “abcdty”, “abcdqwse”, “abcfhjgt”, “abcdtt” } 
The score for this group of strings is 3, as all strings have the longest common prefix as “abc”.
Input Format :
The first line contains a single integer ‘T’ denoting the number of test cases. The test cases are as follows.

The first line of each test case contains two integers ‘N’ and ‘K’ separated by a single space. ‘N’ denotes the number of strings and ‘K’ denotes the size of each group.

The next ‘N’ lines contain a single string.
Output Format:
For each test case, print a single line containing a single integer denoting the maximum sum of the scores of all groups.

The output of each test case will be printed in a separate line.
Note :
You don’t need to print anything; It has already been taken care of. Just implement the given function.
Constraints :
1 <= T <= 5
1 <= K <= N <= 100
1 <= |STRLIST[i]| <= 1000
“STRLIST[i]” contains only lowercase english letters.

Where ‘T’ denotes the number of test cases, ‘N’ denotes the number of strings, ‘K’ denotes the size of each group of the strings, and |STRLIST[i]| denotes the length of the i’th string.

Time Limit: 1 sec

Approaches

01 Approach

The idea is to store all the prefixes into the HashMap and then calculate the contribution of each prefix in the answer. 

Consider a group of strings which has a common prefix of length ‘L’. The score for this group will be ‘L’, we can see that each prefix from 1 to ‘L’ contributes exactly 1 to the final answer. So if any prefix is present in ‘P’ groups then it will contribute exactly ‘P’ to the final answer. The problem is now reduced to finding the individual contribution of each prefix to the final answer.

 

The steps are as follows :

  1. Declare a HashMap for mapping each prefix to the number of times it occurs. The key of the HashMap will be a string and it’s value will be an integer that denotes the occurrence of the key.
  2. Iterate through the given strings
    1. For each prefix of the current string, increment the value in the HashMap by 1 corresponding to the prefix as the key.
  3. Declare a variable “finalScore” and initialize it to 0. We will use this variable to store the sum of scores of each group.
  4. Now iterate through the HashMap, let’s say the value of the current key is “count”.
    1. The contribution of the current key will be floor(count/K) because we need K strings with a common prefix to form one group.
    2. Add floor(count/k) to “finalScore”.
  5. Return “finalScore” as this will be the maximum sum of scores of all groups.

02 Approach

Since the problem is dealing with prefixes of multiple strings and trie is an efficient data structure to compare prefixes of different strings, we can use a trie to solve this problem. The idea is to store all the strings in a trie so that we can efficiently calculate for each prefix the number of times it occurs. 

Consider a group of strings which has a common prefix of length ‘L’. The score for this group will be ‘L’, we can see that each prefix from 1 to ‘L’ contributes exactly 1 to the final answer. So if any prefix is present in ‘P’ groups then it will contribute exactly ‘P’ to the final answer. The problem is now reduced to finding the individual contribution of each prefix to the final answer.

 

The steps are as follows :

 

1. Define a trie class which has pointers to the children nodes and an additional member “count” to store the number of times each prefix occurs.

2. The structure of the trie node class will be as follows:

class TrieNode
{
public:
    unordered_map<char, TrieNode*> children;
    int count;
    TrieNode()
    {
        count = 0;
    }
};

The HashMap “children” stores the pointers to the children nodes of the current node. 

 

3. Iterate through all the strings and insert each string into the trie and while inserting the string increment “count” member of the class of the current node of the trie.

 

4. Now we will traverse all the nodes of the trie.

 

     1. Define a function countPrefix(curr), here “curr” is the current node of the trie. This function will return the total score of all nodes in the subtree of “curr”.

 

     2. Initialize a variable “contribution” to the floor(curr->count/K). This will store the total contribution of all nodes in the subtree of “curr”.

 

     3. Now iterate through all the children nodes, let’s say the current child is ‘i’.

 

              1. Recur for the children node and add to “contribution” the value returned by countPrefix(curr->child[i]).

 

              2. Return “contribution”.

 

5. Let’s say “head” is the root node of the trie, then the final answer will be the value returned by countPrefix(head).

03 Approach

The approach is very similar to the previous approach, however, in this approach, we will use an array to store the children of a node to improve the access time of a node.

Each node can have a maximum of 26 children nodes as there are a maximum of 26 distinct characters. Hence we can use an array to store the child nodes.

Consider a group of strings that has a common prefix of length ‘L’. The score for this group will be ‘L’, we can see that each prefix from 1 to ‘L’ contributes exactly 1 to the final answer. So if any prefix is present in ‘P’ groups then it will contribute exactly ‘P’ to the final answer. The problem is now reduced to finding the individual contribution of each prefix to the final answer.

 

The steps are as follows :

 

1. Define a trie class that has pointers to the children nodes and an additional member “count” to store the number of times each prefix occurs.

2. The structure of the trie class will be

class TrieNode
{
public:
    TrieNode *children[26];
    int count;
    TrieNode()
    {
        count = 0;
        for (int i = 0; i < 26; i++)
        {
            children[i] = NULL;
        }
    }
};

The array “children” stores the pointers to the children nodes of the current node. 

 

3. Iterate through all the strings and insert each string into the trie and while inserting the string increment “count” member of the class of the current node of the trie.

 

4. Now we will traverse all the nodes of the trie.

 

       1. Define a function countPrefix(curr), here “curr” is the current node of the trie. This function will return the total score of all nodes in the subtree of “curr”.

 

       2. Initialize a variable “contribution” to the floor(curr->count/K). This will store the total contribution of all nodes in the subtree of “curr”.

 

       3. Now iterate through all the children nodes, let’s say the current child is ‘i’.

                1. Recur for the child node and add to “contribution” the value returned by countPrefix(curr->child[i]).

 

       4. Return “contribution”.

 

5. Let’s say “head” is the root node of the trie, then the final answer will be the value returned by countPrefix(head).