Baby Names

Moderate
0/80
Average time to solve is 15m
profile
Contributed by
9 upvotes
Asked in company
Google inc

Problem statement

You’re given a list “names” of people’s names with the number of times these names appear in the list. Some names have multiple spellings. For example, “John” and “Jon” are the same name but are listed separately. You’re given another list of “synonyms” containing the names that have a synonymic relationship.

Your task is to print a list of each name’s true frequency and, in the final list, for each synonym group, use the name that appears first in the original names list.

Note :
1. If John and Jon are synonyms, and Jon and Johnny are synonyms, then John and Johnny are also synonyms. 
2. While printing the true frequency, out of the names with synonymic relationships, use the names list first.

Example:

Names = 5, Synonyms = 3
5
john 15
jon 12
chris 13
kris 4
christopher 19
3
jon john
chris kris
chris Christopher

The output for the above example will be: 
john 27
chris 36

Since “jon” and “john” are synonyms, the final true frequency will be 15 + 12 = 27, and since john appears first in the first list, we’ll use “john” as the real name. Also, “chris” and “kris” are synonyms, and since “chris” and “christopher” are also synonyms, “chris” and “christopher” will also be synonyms, and the true frequency will be 13 + 4 + 19 = 36, and since “chris” appears first in the names list, we’ll use it as the real name.
Detailed explanation ( Input/output format, Notes, Images )
Input Format:
The first line of the input contains an integer 'T' denoting the number of test cases.

The first line of each test contains an integer 'N', representing the total number of names.

The next 'N' lines contain a string and an integer, representing a name and the number of times this name appears in the list.

The next line contains an integer 'M', representing the number of synonym relationships.

The next 'M' lines contain two strings that have a synonym relationship.
Output Format:
For every test case, print a new list of names with their true frequencies.

Print the output of each test case in a separate line.
Note :
You do not need to print anything; it has already been taken care of. Just implement the given function. 
Constraints:
1 <= T <= 5
1 <= N, M <= 10^4
1 <= |Name| <= 10
1 <= Frequency <= 100

Where “Name” contains only lowercase English alphabets.

Time limit: 1 sec
Sample Input 1:
1
5
john 15
jon 12
chris 13
kris 4
christopher 19
3
jon john
chris kris
chris christopher
Sample Output 1:
john 27
chris 36
Explanation Of Sample Input 1:
For Test Case 1: As “john” and “jon” are synonyms of each other. Hence, total frequency will be 15 + 12 = 27. Similarly, “chris,” “kris,” and “christopher” are synonyms, and their total count of frequency will be 13 + 4 + 19 = 36.
Sample Input 2:
1
5
krish 15
kriss 12
chris 13
kris 4
christopher 19
4
krish kriss
kriss kris 
chris kris
chris christopher
Sample Output 2:
krish 63
Explanation Of Sample Input 2:
For Test Case 1: All the given input names are synonyms of each other. Hence, the total count of frequency will be 15 + 12 + 13 + 4 + 19 = 63
Hint

Try to combine the synonyms of a particular name into a single group.

Approaches (1)
Connected Components

Approach: We can visualize this question as a graph where each name will represent a particular node, and synonymic relations will connect two nodes. Each connected component of this graph will represent synonyms of a specific name. We will traverse through each connected component using DFS.

 

Algorithm:

  1. For simplicity, we will represent names as numbers, i.e., 1 to ‘N’, and make an adjacency list ‘ADJ’ for storing the graph.
  2. We will maintain an array ‘VISITED’ to check whether the node has been visited or not.
  3. We will check each node from 1 to N whether it is visited or not, and if it is not visited, we’ll perform a DFS on it.
  4. While doing DFS at a particular node, we will mark that node as visited, and we will traverse all the unvisited nodes connected to this node.
  5. At each step of DFS, we will also consider the frequencies of the nodes (as given in the question - frequency of strings) that we are traversing. After traversing all the unvisited nodes connected to the current node at which we are, we will have a count of total frequencies.
  6. When we have visited all the nodes of a connected component, we will have a total count of frequencies.
  7. Hence, the answer will be the node from which are performing DFS, and the DFS will give the total count of the frequency.
  8. Similarly, we will find the answers of the other connected components also.
Time Complexity

O(N + M), where N is the number of different names and M is the number of relations.

 

Making an adjacency matrix will take O(N + M) time, and DFS takes O(N + M) time too. Hence, the overall complexity will be O(N + M).

Space Complexity

O(N + M), where N is the number of different names and M is the number of relations.

 

The adjacency matrix will take O(N + M). Hence, the space complexity will be O(N + M).

Code Solution
(100% EXP penalty)
Baby Names
Full screen
Console