Last Updated: 18 Dec, 2020

Match Specific Pattern

Easy
Asked in company
HSBC

Problem statement

Ninja has given you a list of WORDS and a PATTERN string. Your task is to find all such words in the list which match the given pattern. A valid match is found if and only if every character in the pattern is uniquely mapped to a character in a word.

Example:

Let the list of words be {cod, zcz} and the pattern be ‘nin’.
For each word in the list, we will check whether the word matches the pattern or not:

For the word ‘cod’:
Letter ‘n’ in the pattern maps to letter ‘c’ in the word.
Letter ‘i’ in the pattern maps to letter ‘o’ in the word.
Letter ‘n’ in the pattern maps to letter ‘d’ in the word.

As the same letter ‘n’ in the pattern, maps to two different letters ‘c’ and ‘d’ in the word. Hence, ‘cod’ is not a valid match.

For the word 'zcz':
Letter ‘n’ in the pattern maps to letter ‘z’ in the word.
Letter ‘i’ in the pattern maps to letter ‘c’ in the word.
Letter ‘n’ in the pattern maps to letter ‘z’ in the word.

As every letter in the pattern maps uniquely to a corresponding letter in the word. Hence 'zcz' is a valid match.
Input format:
The very first line of input contains an integer ‘T’ denoting the number of test cases. 

The first line of every test case contains an integer ‘N’ denoting the number of words present in the list. 

The second line of every test case contains ‘N’ space-separated strings denoting the words in the list.

The third line of every test case contains the pattern string.
Output format:
For each test case, print space-separated words which are a valid match with the given pattern. 
Note:
Print the words in the same order in which they occur in the list.

You do not need to print anything, it has already been taken care of. Just implement the given function.
Constraints:
1 <= T <= 10 
1 <= N <= 10 ^ 2
1 <= Length of pattern, Length of each word <= 10 ^ 3

Time Limit: 1 sec

Approaches

01 Approach

The idea is to generate hashes of each word present in the list and compare this hash value with the hash of the pattern string. 

  • Firstly, It should be noted that a word can match the pattern only if the length of the word is equal to the length of the pattern. So, before generating the hash value, we must compare the lengths to check whether they are equal or not to get the answer in ease.
  • Otherwise, If the hash value of the pattern as well as the word matches, it tells us that the pattern and the word have the same structure i.e. it’s a valid word.

How do we generate a hash for a given word or pattern?

  • A simple way to generate the hash would be to assign a unique integer to every distinct character in the word (and pattern) one by one, thereby generating a string of integers. This string of integers acts as a hash value for the word (or pattern).
  • If the hash value of the word matches with the pattern, the word added to the list of valid words. Hence, we can use this check for the similarity between the word and the pattern.

Let us understand this by an example

Let the word be “aabaddeeecf” and the pattern be “ppqpsstttvu”.

Generating the string of integers (hash) for the word:

a -> 1

b -> 2

d -> 3

e -> 4

c -> 5

f -> 6

Hash for the word: 11213344456

Generating the string of integers (hash) for the pattern:

p -> 1

q -> 2

s -> 3

t -> 4

v -> 5

u -> 6

Hash for the pattern: 11213344456

As hashes for word and pattern match each other. Hence, the word is a valid match.

02 Approach

Instead of generating a hash for every word and comparing it with the hash of pattern, a better approach is to map the letters of pattern and the corresponding letters in the word, to each other.

Then, check whether every character in the pattern maps uniquely to a character in a word. And also check whether every character in the word maps uniquely to a character in the pattern.


During the process of mapping three cases may arise:
 

  • Case 1: The character in the pattern, as well as the character in the word, are not mapped to any other character.
    • In such a case map these characters to each other.
  • Case 2: The character in the word is already mapped to some character in the pattern. 
    In such a case, check whether the new value to which it is being mapped is the same as the value to which it is already mapped.
    • If it's the same, then no problem, we can continue.
    • Otherwise, the same character in the word is getting mapped to two different characters of the pattern, this is not possible. Hence the word is not a valid match.
  • Case 3: The character in the pattern is already mapped to some character in the word. 
    In such a case, check whether the new value to which it is being mapped is the same as the value to which it is already mapped.
    • If it's the same, then no problem.
    • Otherwise, the same character in the pattern is getting mapped to two different characters of the word, this is not possible. Hence the word is not a valid match.


The algorithm for the approach is as follows:
 

  1. Create two maps, mapPat to store the mapping between the character of the pattern to character of the word and another map, mapWord, to store the mapping between the character of the word to character of the pattern.
  2. Loop 1: For every word in the list:
    • Check if length(word) = length(pattern):
      • If true: move to step 2.b
      • Otherwise, the current word is not a valid match, move to the next word.
    • Loop: For i = 0 to length(pattern):
      • Let chW = word[i] and chP = pattern[i].
      • Check for Case 1: If mapPat[chP] = NULL and mapWord[chW] = NULL: Then set mapPat[chP] = chW and mapWord[chW] = chP.
      • Otherwise, check for Case 2: If mapPat[chP] != NULL and mapPat[chP] != chW: Then the current word is not a valid match, break.
      • Otherwise, check for Case 3: If mapWord[chW] != NULL and mapWord[chW] != chP: Then the current word is not a valid match, break.
    • If the word is valid, add it to the list of valid words.
  3. Return the list of valid words.

 

Let us understand this by an example:

Let’s assume the list of words is {abb, pqr} and the pattern given is “zxx”.

Mapping the pattern and the word “abb”:

z <-> a

x <-> b

x <-> b

As every letter in the pattern maps uniquely to a corresponding letter in the word. Hence “abb” is a valid match.

Mapping the pattern and the word “pqr”:

z <-> p

x <-> q

x <-> r

As the same letter ‘x’, in the pattern is getting mapped to two different letters ‘q’ and ‘r’ in the word. Hence “pqr” is not a valid match.