Last Updated: 3 Apr, 2021

Identical sentences

Hard
Asked in companies
UberVisaExpedia Group

Problem statement

You are given two sentences, ‘word1’ and ‘word2’, represented as an array of strings of size ‘n’ and ‘m’, respectively. You are also given an array called ‘pairs’. Each element in ‘pairs’ is of the form ‘[u, v]’ where ‘u’ and ‘v’ are strings.

Properties of the array ‘pairs’:
  1. If ‘[u, v]’ or ‘[v, u]’ is present in ‘pairs’, then the words (or strings) ‘u’ and ‘v’ are identical.
  2. If ‘u’ and ‘x’ are identical, and ‘v’ and ‘x’ are identical, then the words ‘u’ and ‘v’ are identical.
  3. Every word is identical to itself, i.e., the word ‘u’ and ‘u’ are always identical.

For two sentences, ‘word1’ and ‘word2’ to be identical:
  1. For every word (‘word1[i]’) in ‘word1’, the words ‘word1[i]’ and ‘word2[i]’ must be identical.
  2. ‘word1’ and ‘word2’ must have the same number of words.

Using the array ‘pairs’, you have to determine if ‘word1’ and ‘word2’ are identical.

Example :
‘word1’ = [“exceptional”, “coding”, “skills”]
‘word2’ = [“great”, “coding”, “talent”]
‘pairs’ = [ [“exceptional”, “good”], [“great”, “good”], [“skills”, “talent”] ]

For each word in ‘word1’, we have:
1. “exceptional” = “great”, because “exceptional” = “good” and “good” = “great”
2. “coding” = “coding”, as every word is identical to itself.
3. “skills” = “talent”, because [“skills”, “talent”] is present in ‘pairs’.

As every word in ‘word1’ is identical to the corresponding word in ‘word2’, the given sentences are identical.
Note :
1. The array ‘pairs’ can have words that are not present in ‘word1’ and ‘word2’.
2. Each pair ‘[u, v]’ in ‘pairs’ is unique, and if a pair ‘[u, v]’ is present, then there will never be a pair ‘[v, u]’.
3. You do not need to print anything; it has already been taken care of. Just implement the function.
Input format :
The first line of input contains an integer ‘T’ which denotes the number of test cases. Then, the ‘T’ test cases follow.

The first line of each test case contains three integers, ‘n’, ‘m’, and ‘p’ denoting the number of elements in array ‘word1’, ‘word2’, and ‘pairs’.

The second line of each test case contains ‘n’ space-separated words denoting ‘word1’.

The third line of each test case contains ‘m’ space-separated words denoting ‘word2’.

The following ‘p’ lines of each test case contain two space-separated words denoting an element in ‘pairs’.

For more clarity, please refer to the sample inputs. 
Output format:
For every test case, return true if the sentences ‘word1’ and ‘word2’ are identical, else return false.
Constraints:
1 <= T <= 1
1 <= n, m <= 1000
0 <= p <= 2000
Each element of ‘pairs’ contains exactly two words.
Length of each word in ‘word1’, ‘word2’ and ‘pairs[i]’ is in the range [1, 20].
Value of each word in ‘word1’, ‘word2’ and ‘pairs[i]’ varies from [a, z].

Where ‘T’ is the number of test cases, ‘n’ is the number of words in ‘word1’, ‘m’ is the number of words in ‘word2’, and ‘p’ is the number of elements in ‘pairs’.

Time limit: 1 second

Approaches

01 Approach

Consider each element ‘[u, v]’ in ‘pairs’ as an edge of an undirected graph (as ‘[u, v]’ and ‘[v, u]’ are the same). If a node ‘v’ is reachable from a node ‘u’, then the string ‘u’ and ‘v’ are identical. Now, traverse the graph using DFS (or BFS) traversal starting from ‘word1[i]’, and if we can reach the node ‘word2[i]’ then ‘word1[i]’ and ‘word2[i]’ are identical. Following is a solution using DFS traversal:

 

  • If ‘n’ is not equal to ‘m’, then return false.
  • Create a hash map ‘graph’ that maps strings to a list of strings. Use ‘graph[str]’ to store the nodes connected to string ‘str’.
  • Run a loop where ‘i’ ranges from 0 to ‘p-1’ and create an edge in ‘graph’ for each element in ‘pairs’:
    • ‘u = pairs[i][0]’ and ‘v = pairs[i][1]’.
    • Insert the string ‘u’ into ‘graph[v]’.
    • Insert the string ‘v’ into ‘graph[u]’.
  • Run a loop where ‘i’ ranges from 0 to ‘n-1’:
    • If ‘word1[i] = word2[i]’, then the two strings are identical:
      • Continue the loop from next iteration.
    • Create a hash map ‘visited’. It’s used to keep track of visited nodes in the DFS traversal.
    • If ‘dfs(word1[i], word2[i])’ is false, the two strings are not identical:
      • Return false for the answer.
  • Return true for the answer (All words in ‘word1’ and ‘word2’ are identical).

 

The ‘dfs(string s1, string s2)’ function:

  • Add the string ‘s1’ into ‘visited’.
  • Run a loop where ‘i’ ranges from 0 to ‘length(graph[s1])’:
    • If ‘graph[s1][i] = s2’, then the strings are identical:
      • Return true.
    • If we have not visited ‘graph[s1][i]’ and ‘dfs(graph[s1][i], s2)’ is true, then:
      • Return true.
  • Return false, as we have not found the string ‘s2’.

02 Approach

After constructing a graph with the ‘pairs’ array as an edge, the problem reduces to finding if the node with ‘word2[i]’ is reachable from the node with ‘word2[i]’, i.e., to check if both these nodes belong to the same connected component.

Connected component: A subgraph in which each pair of nodes is connected via a path.

We can find the connected components by performing a DFS (or BFS) traversal starting from every unvisited node. Each DFS traversal will mark each node in a single connected component as visited. Here, the visited array will store the component number of each node. If two nodes have the same component number (they belong to the same component), they are connected through a path. Following is a solution using DFS traversal:

 

  • If ‘n’ is not equal to ‘m’, then return false.
  • Create a hash map ‘graph’ that maps strings to a list of strings. Use ‘graph[str]’ to store the nodes connected to string ‘str’.
  • Run a loop where ‘i’ ranges from 0 to ‘p-1’ and create an edge in ‘graph’ for each element in ‘pairs’:
    • ‘u = pairs[i][0]’ and ‘v = pairs[i][1]’.
    • Insert the string ‘u’ into ‘graph[v]’.
    • Insert the string ‘v’ into ‘graph[u]’.
  • Set ‘num = 1’. Use it to mark each node with its corresponding component number.
  • Create a hash map ‘comp’ that maps strings to integers. Use it to keep track of visited nodes in the DFS traversal and store each node’s component number.
  • Run a loop where ‘i’ ranges from 0 to ‘n-1’:
    • If ‘word1[i]’ is not present in ‘comp’, then:
      • Call the function ‘dfs(word1[i], num)’. Mark all nodes that are reachable from ‘word1[i]’ into ‘comp’ with the value ‘num’.
      • ‘compNum += 1’
  • Run a loop where ‘i’ ranges from 0 to ‘n’:
    • ‘u = word1[i]’ and ‘v = word2[i]’.
    • If (‘u’ is not equal to ‘v’) and (‘comp[u]’ is not equal to ‘comp[v]’), then ‘u’ is not reachable from ‘v’:
      • Return false as the answer.
  • Return true for the answer (All words in ‘word1’ and ‘word2’ are identical).

 

The ‘dfs(string s1, integer num)’ function:

  • Add ‘(s1, num)’ into ‘comp’.
  • Run a loop where ‘i’ ranges from 0 to ‘length(graph[s1])’:
    • If ‘graph[s1][i]’ is not present in ‘comp’ then we have not visited this word:
      • Call the function ‘dfs(graph[s1][i], num)’.

03 Approach

Disjoint-set (or Union-find) is a data structure that stores a collection of disjoint (non-overlapping) sets. This data structure consists of three basic operations:

  • makeSet(v): Create a new set containing the element ‘v’.
  • unionSet(a, b): Merge the two specified sets, the set containing element ‘a’ and the set containing element ‘b’.
  • findSet(v): Return the representative (also called leader) of the set containing the element ‘v’.

 

The idea is to use a Disjoint-set ‘ds’ to store all nodes in a connected component into a single set. If two nodes are in the same set, they are identical (as they are reachable from one another). Following is the implementation:

 

  • If ‘n’ is not equal to ‘m’, then return false.
  • Initialize the Disjoint-set ‘ds’ with ‘2*p+1’ sets. In the worst-case, ‘pairs’ can have ‘2*p’ different string.
  • Create a hash map ‘words’ that maps strings to integers.
  • Map each string in ‘pairs’ to a unique integer, starting from ‘1’ (this will allow us to use the Disjoint-set data structure’s standard template).
  • Set ‘id = 1’
  • Run a loop where ‘i’ ranges from ‘0’ to ‘p-1’:
    • If ‘pairs[i][0]’ is not present in ‘words’, then:
      • Insert ‘(pairs[i][0], id)’ into ‘words’
      • ‘id += 1’
    • If ‘pairs[i][1]’ is not present in ‘words’, then:
      • Insert ‘(pairs[i][1], id)’ into ‘words’
      • ‘id += 1’
    • ‘u = words[pairs[i][0]]’ and ‘v = words[pairs[i][1]]’
    • ‘ds.unionSet(u, v)’. We merge the sets containing the words ‘u’ and ‘v’, as each node reachable from ‘u’ is also reachable from ‘v’ and vice-versa.
  • Run a loop where ‘i’ ranges from ‘0’ to ‘n-1’:
    • ‘u = word1[i]’ and ‘v = word2[i]’.
    • If ‘u = v’, then the strings are identical:
      • Continue to the next loop iteration.
    • If (‘u’ or ‘v’ is not present in ‘words’) or (‘ds.findSet(words[u]) != ds.findSet(words[v])’), then the strings are not identical:
      • Return false as the answer.
  • Return true as the answer.