Similar String Groups

Moderate
Average time to solve is 35m
Asked in companies: Apple, PayPal

Problem statement

Two strings ‘S1’ and ‘S2’ are considered similar if either ‘S1’ equals ‘S2’ or we can swap two letters of ‘S1’ (at different positions) so that it equals ‘S2’.

For Example :

“code” and “eodc” are similar because we can swap letters at positions 0 and 3 in ‘code’ to get “eodc”.
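
The definition above can be sketched as a small check. This is only an illustration: the function name `is_similar` is an assumption and not part of the original problem's interface.

```python
def is_similar(s1, s2):
    """Return True if s1 equals s2 or one swap of two positions in s1 yields s2."""
    if len(s1) != len(s2):
        return False
    # Positions at which the two strings disagree.
    diff = [i for i in range(len(s1)) if s1[i] != s2[i]]
    if not diff:
        return True  # identical strings are similar by definition
    if len(diff) != 2:
        return False
    i, j = diff
    # Swapping positions i and j in s1 must reproduce s2.
    return s1[i] == s2[j] and s1[j] == s2[i]


print(is_similar("code", "eodc"))  # True
print(is_similar("code", "ceod"))  # False
```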

A group of strings is called a Similar String Group if either it has only one string or each string is similar to at least one other string in the group.

For Example :

The group of strings [“code”, “eodc”, “edoc”] is a Similar String Group because “code” is similar to “eodc”, “eodc” is similar to both “code” and “edoc”, and “edoc” is similar to “eodc”, i.e., each string is similar to at least one other string of the group.

You are given an array/list ‘STRS’ consisting of ‘N’ strings. Every string in ‘STRS’ is an anagram of every other string in ‘STRS’. Your task is to find and return the number of similar string groups in ‘STRS’.

Note :
1. Two strings are anagrams of each other if they have the same characters, possibly arranged in a different order. For example, “LISTEN” and “SILENT” are anagrams.
For Example :
Consider ‘STRS’ = [“code”, “doce”, “code”, “ceod”, “eodc”, “edoc”, “odce”].

There are 2 Similar String Groups in ‘STRS’:
1. “code”, “code”, “doce”, “odce”, “eodc”, “edoc”
2. “ceod”

Thus, we should return 2 in this case.
Input Format :
The first line of input contains an integer ‘T’ denoting the number of test cases. Then ‘T’ test cases follow.

The first line of each test case consists of a single integer ‘N’ representing the size of the list/array ‘STRS’.

The second line of each test case consists of ‘N’ single space-separated strings representing strings in list/array ‘STRS’.
Output Format :
For each test case, print a single integer representing the number of Similar String Groups in ‘STRS’. 

Print output of each test case in a separate line.
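
For local testing outside the judge, the format above could be read with a sketch like the following. The names `run` and `count_groups` are hypothetical, and `count_groups` is only a placeholder solver that relies on the anagram guarantee (0 or 2 mismatched positions means similar).

```python
import io


def count_groups(strs):
    # Placeholder solver (hypothetical name): iterative flood fill over indices.
    seen = [False] * len(strs)
    groups = 0
    for start in range(len(strs)):
        if seen[start]:
            continue
        groups += 1
        seen[start] = True
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(len(strs)):
                # Anagrams with at most 2 mismatched positions are similar.
                if not seen[v] and sum(a != b for a, b in zip(strs[u], strs[v])) <= 2:
                    seen[v] = True
                    stack.append(v)
    return groups


def run(stream):
    # Read T, then for each test case read N followed by N strings.
    tokens = stream.read().split()
    pos = 0
    t = int(tokens[pos])
    pos += 1
    out = []
    for _ in range(t):
        n = int(tokens[pos])
        pos += 1
        strs = tokens[pos:pos + n]
        pos += n
        out.append(str(count_groups(strs)))
    return "\n".join(out)


sample = "2\n1\nninja\n7\ncode doce code ceod eodc edoc odce\n"
print(run(io.StringIO(sample)))  # prints 1, then 2 on the next line
```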

Note :

You do not need to print anything; it has already been taken care of. Just implement the given function.
Constraints :
1 <= T <= 50
1 <= N <= 100
1 <= |STRS[i]| <= 5
‘STRS[i]’ contains only lowercase English letters.

Time limit: 1 sec
Sample Input 1 :
2
1
ninja
7
code doce code ceod eodc edoc odce
Sample Output 1 :
1
2
Explanation Of Sample Input 1 :
Test case 1:
There is only one string “ninja”. A group having one string is also considered a Similar String Group. Thus, the output will be 1.

Test case 2:
See the problem statement for an explanation.
Sample Input 2 :
2
4
tars rats arts star
2
omv ovm
Sample Output 2 :
2
1
Hint

Can you reduce this problem to a graph problem?

Approaches (2)
Graph(I)

Consider each string in ‘STRS’ as a node in a graph, with an undirected edge between two nodes if the strings they represent are similar. The problem then reduces to finding the number of connected components in this graph, which can be solved by either Depth First Search or Breadth First Search.
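
As an illustration of this reduction, the sketch below builds the graph explicitly and counts components with Breadth First Search. The function names are assumptions, and the similarity test relies on the guarantee that all strings are anagrams of each other, so 0 or 2 mismatched positions means the strings are similar.

```python
from collections import deque


def differ_in_at_most_two(a, b):
    # For anagrams of equal length, 0 or 2 mismatches means "similar".
    return sum(x != y for x, y in zip(a, b)) <= 2


def count_groups_bfs(strs):
    n = len(strs)
    # Adjacency list over indices: edge i-j when strs[i] and strs[j] are similar.
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if differ_in_at_most_two(strs[i], strs[j]):
                adj[i].append(j)
                adj[j].append(i)

    seen = [False] * n
    groups = 0
    for start in range(n):
        if seen[start]:
            continue
        groups += 1  # a new connected component starts here
        seen[start] = True
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
    return groups
```

Working with indices rather than string values keeps duplicate strings as separate nodes; they simply end up in the same component.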

 

A naive algorithm based on Depth First Search (DFS), in which each string in ‘STRS’ is compared one by one against every other string to check whether they are similar, is described below:

 

Algorithm

  • Create a HashSet ‘VISITED’.
  • Create a recursive function DFS(‘CURSTR’), where ‘CURSTR’ is the current source/root string. In each recursive call, do the following:
    • Insert ‘CURSTR’ in ‘VISITED’.
    • Run a loop where ‘i’ ranges from 0 to ‘N’ - 1, and for each ‘i’ do the following:
      • If ‘STRS[i]’ is in ‘VISITED’, then skip the iteration.
      • Otherwise, iterate over the characters of ‘STRS[i]’. If there are exactly 2 positions where ‘STRS[i]’ differs from ‘CURSTR’, then recursively call DFS(‘STRS[i]’). Because all strings are anagrams of each other, two mismatched positions always correspond to a single swap, and a string equal to ‘CURSTR’ is already in ‘VISITED’.
  • Initialize an integer variable ‘RESULT’ := 0.
  • Run a loop where ‘i’ ranges from 0 to ‘N’ - 1, and for each ‘i’, if ‘STRS[i]’ is not in ‘VISITED’, then increment ‘RESULT’ by 1 and call DFS(‘STRS[i]’).
  • Return ‘RESULT’.
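
The steps above can be sketched in Python roughly as follows. The function name `count_similar_groups` is an assumption (the original does not give the required signature), and a Python `set` stands in for the HashSet ‘VISITED’.

```python
def count_similar_groups(strs):
    visited = set()  # plays the role of the 'VISITED' HashSet of strings

    def dfs(cur):
        visited.add(cur)
        for s in strs:
            if s in visited:
                continue  # also skips duplicates of already visited strings
            # All inputs are anagrams of equal length, so exactly two
            # mismatched positions means a single swap turns cur into s.
            if sum(a != b for a, b in zip(cur, s)) == 2:
                dfs(s)

    result = 0
    for s in strs:
        if s not in visited:
            result += 1  # s starts a new group
            dfs(s)
    return result


print(count_similar_groups(["code", "doce", "code", "ceod", "eodc", "edoc", "odce"]))  # 2
print(count_similar_groups(["tars", "rats", "arts", "star"]))  # 2
```

With ‘N’ at most 100, the recursion depth stays well within Python's default limit.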

 

Time Complexity

O((N ^ 2) * |S|), where ‘N’ is the size of array/list ‘STRS’ and |S| is the length of a string in ‘STRS’

 

Here, there will be at most ‘N’ recursive calls to ‘DFS()’ and in each function call, it takes O(N * |S|) to check for all strings in ‘STRS’ whether ‘CURSTR’ is similar to it or not. Insertion in HashSet takes O(1) time. Thus overall complexity will be O((N ^ 2) * |S|).

Space Complexity

O(N * |S|),  where ‘N’ is the size of array/list ‘STRS’ and |S| is the maximum length of a string in ‘STRS’

 

Recursion depth can go up to ‘N’, so the extra space used by the recursion stack is O(N). The HashSet can hold up to ‘N’ strings of length |S|, so the space used by it is O(N * |S|). Thus, the space complexity will be O(N * |S|) + O(N) = O(N * |S|).
