Accounts Merge

Hard
0/120
Average time to solve is 15m
23 upvotes
Asked in companies
FacebookPhonePeAmazon

Problem statement

You have been given an array/list 'accounts' where each element, i.e. 'accounts'[i] contains a list of strings.


In 'accounts'[i], the first element is the name of the account holder, and the rest of the strings are emails representing the emails of the account.


Now, you are supposed to merge the accounts. Two accounts definitely belong to the same person if there is some email that is common to both accounts. Note that it may be possible that two accounts belong to the same name, but they may belong to different people, as people could have the same name.


A person could have any number of accounts initially, but all their accounts definitely have the same name.


After merging the accounts, you have to return an array/list of merged accounts where each account, the first element is the person's name, and the rest elements are the email addresses in sorted order (non-decreasing). Accounts themselves can be in any order.


Example:
 Input:  'n' = 4,
 'accounts' = [
    ["Rohan", "rohan123@gmail.com", "1279ro@gmail.com"],
    ["Rohit", "rohit101@yahoo.com", "hitman30487@gmail.com"],
    ["Rohan", "1279ro@gmail.com",  "niemann01@gmail.com"],
    ["Rohan", "kaushik@outlook.com"],
 ]
 Output: [
    ["Rohan", "rohan123@gmail.com", "1279ro@gmail.com", "niemann01@gmail.com"],
    ["Rohit", "rohit101@yahoo.com", "hitman30487@gmail.com"],
    ["Rohan", "kaushik@outlook.com"],
 ]

 Explanation: The first and third "Rohan" are the same person as they have a shared email address, “1279ro@gmail.com”. The rest of the accounts are of different persons, as they don’t share any shared email addresses. So, we merge the first and third accounts.
Detailed explanation ( Input/output format, Notes, Images )
Input Format :
The first input line contains an integer ‘n’ denoting the number of accounts.

Each of the next ‘n’ lines contains the length of 'accounts[i]' followed by space-separated strings where the first string denotes the account holder’s name, and the rest of the strings represent the email addresses.
Output Format :
In the first line, print the number of accounts after merging. 

Print the space-separated name and email address associated with each account, each in a separate line.
Note:
You are not required to print anything; it has already been taken care of. Just implement the function.
Sample Input 1 :
4
3 Rahul rahul1@gmail.com rahul2@gmail.com 
2 Amit amit1@gmail.com
2 Ankur ankur9773@yahoo.com
3 Rahul rahul1@gmail.com rahul1998@yahoo.com
Sample output 1 :
3
Amit amit1@gmail.com 
Ankur ankur9773@gmail.com 
Rahul rahul1998@yahoo.com rahul1@gmail.com rahul2@gmail.com 
Explanation of Sample Input 1 :
The first and fourth "Rahul" are the same person as they have a shared email address, “rahul1@gmail.com”. The rest of the accounts are of different persons, as they don’t share any shared email addresses. So, we merge the first and fourth accounts.
Sample Input 2 :
2
2 Atul atul@mail.com
2 Atul atul1@mail.com
Sample output 2 :
2
Atul atul1@mail.com 
Atul atul@mail.com 
Explanation of Sample Input 2 :
The first and second Atul are two different people since they don’t have any common email addresses.
Constraints :
1 <= 'n' <= 5000
1 <= |'accounts'[i]| <= 10
1 <= |'accounts'[i][j]| <= 30

Time limit: 1 sec
Hint

Can you think of converting this problem into a graph problem?

Approaches (2)
DFS based

The basic idea of this approach is to model this problem as a graph problem. Let us create a graph of emails with an edge between two emails if they occur in the same account. 

 

The figure below shows the graph constructed for the first test case of sample test case 1.

 

 

We can observe that the set of accounts which needs to be merged will be a part of connected components. So the problem is now reduced to finding the connected components in an undirected graph. We can easily get the connected component by doing a depth-first search of the graph. You can refer to this tutorial of cp-algorithms for a more detailed explanation. Search for connected components in a graph - https://cp-algorithms.com/graph/search-for-connected-components.html#:~:text=Given%20an%20undirected%20graph%20G,path%20exists%20between%20different%20groups.

 

Let unordered_map<string, string> ‘EMAIL_TO_NAME’ be HashMap to store the mapping of emails to the name of the person having it. This mapping will help us in getting the name of the person while creating the merged accounts.

 

And also let unorderd_map<string, vector<string>> ‘GRAPH’ stores the adjacency list of the constructed graph.

 

Now the consider the following steps :

  1. Iterate through each ‘ACCOUNT’ in the array/list of ‘ACCOUNTS'. And for each ‘ACCOUNT’:
    • Store the mapping of email to the name for each email address.
    • And also create an edge between the first email address and the rest of the email address.
  2. Now create a set<string> ‘VISITED’ which stores the email address which is visited during the depth-first search. Also, create a vector<vector<string>> ‘MERGED_ACCOUNTS’ to store the name and email addresses of the merged accounts.
  3. Now, start doing the depth-first search from each non-visited email address.
    • And store the email addresses belonging to the current connected component in a new array/list.
    • Insert the name of the person having the email addresses in a new array/list ‘ACCOUNT’ and then insert all the addresses. Then, insert the current account details into the ‘MERGED_ACCOUNTS’.
Time Complexity

O( sum(Mi * log(Mi)) ), where sum() denotes the summation and ‘Mi’ denotes the length of accounts[i].

 

Since we are doing a depth-first search to find the connected component of the graph, and the time complexity of doing that will be O(sum(Mi))  because there will be at max O(sum(Mi)) edges in the constructed graph and since it takes O(1) time on average to access the elements from the HashMap. Since we need to sort the emails of the merged accounts in non-decreasing order. So, the overall time complexity will be O(sum(Mi * log(Mi) ).

Space Complexity

 O( sum(Mi) ), where ‘Mi’ denotes the length of accounts[i].

 

Since we are using the extra space to store the graph and the merged accounts. The overall space complexity will be O( sum(Mi) ),

Code Solution
(100% EXP penalty)
Accounts Merge
Full screen
Console