Similar String Groups

The group of strings [“code”, “eodc”, “edoc”] is a Similar String Group because “code” is similar to “eodc”, “eodc” is similar to both “code” and “edoc”, and “edoc” is similar to “eodc”; i.e., each string is similar to at least one other string in the group.

Consider ‘STRS’ = [“code”, “doce”, “code”, “ceod”, “eodc”, “edoc”, “odce”]. There are 2 Similar String Groups in ‘STRS’:
1. “code”, “code”, “doce”, “odce”, “eodc”, “edoc”
2. “ceod”
Thus, we should return 2 in this case.
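The pairwise relation behind these groups can be sketched in code. This section does not spell out the definition of "similar", so the sketch assumes what the examples suggest: two strings of equal length are similar when they are identical or become identical after swapping two positions in one of them. The name is_similar is hypothetical.

```python
def is_similar(a: str, b: str) -> bool:
    """Assumed rule: identical strings, or strings that match after one swap."""
    if len(a) != len(b):
        return False
    # Collect the positions where the two strings disagree.
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    if not diff:
        return True  # identical strings, e.g. the two copies of "code"
    if len(diff) != 2:
        return False
    i, j = diff
    # The two mismatches must mirror each other for a single swap to fix them.
    return a[i] == b[j] and a[j] == b[i]


print(is_similar("code", "eodc"))  # True
print(is_similar("eodc", "edoc"))  # True
print(is_similar("code", "ceod"))  # False
```

Under this assumption “ceod” differs from every other string in the example in at least 3 positions, which is why it forms a group of its own.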

The first line of input contains an integer ‘T’ denoting the number of test cases. Then ‘T’ test cases follow. The first line of each test case consists of a single integer ‘N’ representing the size of the list/array ‘STRS’. The second line of each test case consists of ‘N’ single space-separated strings representing the strings in the list/array ‘STRS’.
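One possible way to read this format is shown below as a minimal sketch. The function name parse_input is hypothetical, and the sample text is an illustrative input assembled from the strings mentioned on this page, not an official sample.

```python
from typing import List


def parse_input(text: str) -> List[List[str]]:
    """Split the whole input into tokens and rebuild one string list per test case."""
    tokens = text.split()
    pos = 0
    t = int(tokens[pos])  # number of test cases 'T'
    pos += 1
    cases = []
    for _ in range(t):
        n = int(tokens[pos])  # size 'N' of 'STRS' for this test case
        pos += 1
        cases.append(tokens[pos:pos + n])
        pos += n
    return cases


# Illustrative input (assumption): two test cases built from this page's examples.
sample = "2\n1\nninja\n7\ncode doce code ceod eodc edoc odce\n"
for strs in parse_input(sample):
    print(len(strs), strs)
```

Reading by tokens rather than by lines keeps the sketch tolerant of extra blank lines or trailing spaces.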

Test case 1: There is only one string “ninja”. A group having one string is also considered a Similar String Group. Thus, the output will be 1.
Test case 2: See the problem statement for an explanation.

Graph(I)

Consider that each string in ‘STRS’ represents a node in a graph, and that there is an undirected edge between two nodes if the strings they represent are similar. The problem then reduces to counting the connected components of this graph, which can be done with either Depth-First Search (DFS) or Breadth-First Search (BFS).
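The graph view can be made explicit with a sketch that builds the adjacency lists first and then counts components with BFS. It assumes that "similar" means identical or one swap apart, which this section does not define; the names similar and count_groups_bfs are hypothetical.

```python
from collections import deque


def similar(a, b):
    """Assumed rule: identical, or equal after swapping two positions."""
    if len(a) != len(b):
        return False
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    if not diff:
        return True
    return len(diff) == 2 and a[diff[0]] == b[diff[1]] and a[diff[1]] == b[diff[0]]


def count_groups_bfs(strs):
    n = len(strs)
    # One node per index; an undirected edge joins every similar pair.
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if similar(strs[i], strs[j]):
                adj[i].append(j)
                adj[j].append(i)

    seen = [False] * n
    groups = 0
    for start in range(n):
        if seen[start]:
            continue
        groups += 1  # an unseen node starts a new connected component
        seen[start] = True
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
    return groups


print(count_groups_bfs(["code", "doce", "code", "ceod", "eodc", "edoc", "odce"]))  # 2
```

Working with indices instead of the strings themselves means duplicate strings are separate nodes joined by an edge, so they still land in the same group.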

A naive algorithm based on Depth-First Search (DFS) is described below. For each string in ‘STRS’, it checks every other string, one by one, to see whether the two are similar.

Algorithm

Create a HashSet ‘VISITED’.
Create a recursive function DFS(‘CURSTR’), where ‘CURSTR’ is the current source/root string. In each recursive call, do the following:
- Insert ‘CURSTR’ in ‘VISITED’.
- Run a loop where ‘i’ ranges from 0 to ‘N’ - 1, and for each ‘i’ do the following:
  - If ‘STRS[i]’ is in ‘VISITED’, then skip the iteration.
  - Otherwise, compare ‘STRS[i]’ with ‘CURSTR’ character by character. If there are exactly 2 positions where the characters of ‘STRS[i]’ differ from those of ‘CURSTR’, then recursively call DFS(‘STRS[i]’).
Initialize an integer variable ‘RESULT’ := 0.
Run a loop where ‘i’ ranges from 0 to ‘N’ - 1, and for each ‘i’, if ‘STRS[i]’ is not in ‘VISITED’, increment ‘RESULT’ by 1 and call DFS(‘STRS[i]’).
Return ‘RESULT’.
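The steps above can be sketched as follows. This is an illustration rather than a reference solution: the function names are hypothetical, and the helper additionally checks that the two mismatched positions form a swap, an extra assumption that only matters if the input strings are not guaranteed to be rearrangements of one another.

```python
import sys


def count_similar_groups(strs):
    # Recursion depth can reach N, so raise the limit if needed (never lower it).
    sys.setrecursionlimit(max(sys.getrecursionlimit(), len(strs) + 100))
    visited = set()  # the 'VISITED' HashSet of strings

    def differs_by_one_swap(a, b):
        if len(a) != len(b):
            return False
        diff = [i for i in range(len(a)) if a[i] != b[i]]
        # Exactly 2 differing positions, and (assumption) they mirror each other.
        return len(diff) == 2 and a[diff[0]] == b[diff[1]] and a[diff[1]] == b[diff[0]]

    def dfs(cur):
        visited.add(cur)
        for s in strs:
            if s in visited:
                continue  # also skips duplicates of an already visited string
            if differs_by_one_swap(cur, s):
                dfs(s)

    result = 0
    for s in strs:
        if s not in visited:
            result += 1  # every unvisited string starts a new group
            dfs(s)
    return result


print(count_similar_groups(["code", "doce", "code", "ceod", "eodc", "edoc", "odce"]))  # 2
print(count_similar_groups(["ninja"]))  # 1
```

Because ‘VISITED’ stores the strings themselves, a duplicate such as the second “code” is treated as already visited and never starts a group of its own.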

Time Complexity

O((N ^ 2) * |S|), where ‘N’ is the size of array/list ‘STRS’ and |S| is the maximum length of a string in ‘STRS’.

Here, there will be at most ‘N’ recursive calls to ‘DFS()’, and each call takes O(N * |S|) time to check every string in ‘STRS’ for similarity with ‘CURSTR’. Insertion into the HashSet takes O(1) time on average. Thus the overall complexity will be O((N ^ 2) * |S|).

Space Complexity

O(N * |S|), where ‘N’ is the size of array/list ‘STRS’ and |S| is the maximum length of a string in ‘STRS’.

Recursion depth can go up to ‘N’, so the extra space used by the recursion stack is O(N). The HashSet can hold up to ‘N’ strings of length |S|, so the space used by it is O(N * |S|). Thus the space complexity will be O(N * |S|) + O(N) = O(N * |S|).
