Spell Checker

Hard
0/120
Average time to solve is 50m
profile
Contributed by
19 upvotes
Asked in companies
AtlassianFlipkart limitedGoogle inc

Problem statement

You are given a list of strings, ‘DICTIONARY[]’ that represents the correct spelling of words and a query string ‘QUERY’ that may have incorrect spelling. You have to check whether the spelling of the string ‘QUERY’ is correct or not. If not, then return the list of suggestions from the list ‘DICTIONARY’. Otherwise, return an empty list which will be internally interpreted as the correct string.

Note:

1) The suggested correct strings for the string  ‘QUERY’ will be all those strings present in the ‘DICTIONARY[]’ that have the prefix same as the longest prefix of string ‘QUERY’.

2) The ‘DICTIONARY[]’ contains only distinct strings.

For example:

Given 'DICTIONARY[]' = {“ninja”, “ninjas”, “nineteen”, “ninny”} and query = “ninjk”. Since “ninjk” is not present in the ‘DICTIONARY[]’, it is an incorrect string. The suggested current spellings for “ninjk” are “ninja” and “ninjas”. It is because “ninj” is the longest prefix of “ninjk” which is present as the prefix in ‘DICTIONARY’.
Detailed explanation ( Input/output format, Notes, Images )
Input Format
The first line of input contains an integer 'T' representing the number of test cases.

The first line of each test case contains an integer ‘N’ representing the number of strings in the list, ‘DICTIONARY[].’

The second line of each test case contains ‘N’ space separated strings present in the list, ‘DICTIONARY[]’.

The last line of each test case contains the ‘QUERY’ string.
Output Format:
For each test case, return a list of strings containing all suggested correct spellings from the list, ‘DICTIONARY[]’, if the spelling of the ‘query’ string is incorrect. Otherwise, return an empty list that will be internally interpreted as the correct string and will print “CORRECT”.

The output of each test case will be printed in a separate line.
Note:
You do not need to print anything, it has already been taken care of. Just implement the given function.
Constraints:
1 <= T <= 5
1 <= N <= 100
1 <= |DICTIONARY[i]| <= 100
1 <= |QUERY| <= 100
‘DICTIONARY[i]’ and ‘QUERY’ contains only lowercase english letters.

Where ‘|DICTIONARY[i]|’ is the length of the string in the list ‘DICTIONARY' and ‘|QUERY|’ is the length of the ‘QUERY’ string.

Time limit: 1 sec
Sample Input 1:
2
5
class coder coding college ninjas
spell
4
code with coding ninjas
ninjas
Sample Output 1:
class coder coding college ninjas 
CORRECT
Explanation for Sample Output 1:
In test case 1, Given 'DICTIONARY[]' = {"class", "coder", "coding", "college", "ninjas"} and 'QUERY' = “spell”. ‘QUERY’ string is incorrect as it is not present in the ‘DICTIONARY[]’. The longest prefix which is also present as the prefix in ‘DICTIONARY’ is “”, i.e., empty string.

So, all the strings of ‘DICTIONARY’ will be considered as the correct suggested spellings. 

Therefore, the output is "class", "coder", "coding", "college" and "ninjas".

In test case 2, Given 'DICTIONARY[]' = {“code”, “with”, “coding”, “ninjas”} and 'QUERY' = “ninjas”. Since the ‘QUERY’ string is present in the dictionary, its spelling is correct. Thus, at the output, ‘CORRECT’ is printed.

Sample Input 2:

2
6
abcde abfd abcxyp mnbs aaaa pkl
abc
5
abc def ghi jkl mnop
pqr

Sample Output 2:

abcde abcxyp
abc def ghi jkl mnop
Explanation for Sample Output 2:
In test case 1, Given 'DICTIONARY[]' = {"abcde ", "abfd", "abcxyp", "mnbs", "aaaa", "pkl"} and 'QUERY' = “abc”. ‘QUERY’ string is matched with "abcde" and "abcxyp" present in the ‘DICTIONARY[]’. 

So, "abcde" and "abcxyp" strings of ‘DICTIONARY’ will be considered as the correct suggested spellings. 

Therefore, the output is  "abcde" and "abcxyp".

In test case 2, Given 'DICTIONARY[]' = {"abc", "def", "ghi", "jkl", "mnop"} and 'QUERY' = “pqr”. ‘QUERY’ string is incorrect as it is not present in the ‘DICTIONARY[]’. The longest prefix which is also present as the prefix in ‘DICTIONARY’ is “”, i.e., empty string.

So, all the strings of ‘DICTIONARY’ will be considered as the correct suggested spellings. 

Therefore, the output is "abc", "def", "ghi", "jkl", "mnop".
Hint

This is a typical search problem where you have to check whether the ‘QUERY’ string matches with any other string in the ‘DICTIONARY’ or prefix of the string in the ‘DICTIONARY’ using TRIE.

Approaches (1)
Trie

The entire problem is reduced to a prefix matching problem when the ‘QUERY' string is not present in the ‘DICTIONARY’. The idea is to use trie data structure as it is the most suitable and fastest data structure in prefix matching.

 

First, we will build a trie and insert all the strings of ‘DICTIONARY’ in the trie. Then, we will traverse the string, ‘QUERY and match it with the character present in the trie. At any moment, the character does not match, the substring before that character will be the longest prefix of string, ‘QUERY. So, we will store all the strings from the ‘DICTIONARY’ that matches with the longest prefix of string, ‘QUERY, in a list and then return the list.

 

If the string, ‘QUERY matches completely with any string in the trie. It suggests that ‘QUERY is spelt correctly. So, we will return an empty list.

 

Structure of ‘Trie’:

 

  • A pointer array, ‘CHILD’ of type ‘Trie’ of size 26 to point to the string’s next character.
  • A boolean variable, ‘isEnd’, to mark the end of the string.
  • Constructor of ‘Trie’ initializes all the elements of ‘CHILD’ by NULL and ‘isEnd’ by false.

 

The steps are as follows:

 

  • Create a node ‘ROOT' of type ‘Trie’.
  • Run a loop from ‘i’ = 0 to the size of ‘DICTIONARY’:
    • Call the ‘INSERT’ function and pass ‘ROOT’ and string at ‘i’th position in the dictionary to insert the string in the ‘Trie’.
  • Declare a vector of string, ‘RESULT’ to store all possible spelling suggestions.
  • Run a loop from ‘i’ = 0 to the length of ‘QUERY':
    • Declare an integer variable, ‘INDEX’ and assign ‘QUERY[i]’ - ‘a’, that stores the index of the character in the ‘Trie’.
      • If 'ROOT'->‘CHILD’[index] == NULL that shows that character at index ‘i’ in ‘QUERY' is not present in the ‘Trie’.
        • Declare a string, ‘prefixQuery’, and store the substring of ‘QUERY' up to index ‘i’ - 1 in ‘prefixQuery’.
        • Pass ‘ROOT’, ‘prefixQuery’ and ‘RESULT’ to the ‘findSuggestions’ function to store all the suggested correct spellings starting with prefix ‘prefixQuery’ in the ‘RESULT’.
        • Return ‘RESULT’.
  • If ‘ROOT’->isEnd = true, that shows that the string, ‘QUERY' is present in the ‘Trie’ (i.e., ‘query’ is present in the ‘dictionary’). So, ‘QUERY is correctly spelt. Therefore, return ‘RESULT’, i.e., an empty list.
  • Otherwise, the longest prefix of ‘QUERY' to be searched in ‘DICTIONARY’ is ‘QUERY’ itself. So, pass ‘ROOT’, ‘QUERY’ and ‘RESULT’ to the ‘findSuggestions’ function to store all the suggested correct spellings starting with the prefix ‘QUERY' in the ‘RESULT’.
  • Return the ‘RESULT’.

 

Description of ‘INSERT’ function:

 

This function will take two parameters :

  • ‘ROOT’: A pointer to the root node of the ‘Trie’.
  • ‘STR’: A string variable to store the string from ‘DICTIONARY’.

 

void insert(‘ROOT’, ‘STR’):

  • Declare an integer variable, ‘i’ and initialize with 0 to traverse the string, ‘STR’.
  • Run a loop while i < length of ‘STR’
    • Declare an integer variable, ‘INDEX’ and assign ‘STR[i]’ - ‘a’, that stores the index of the current character of ‘STR’ in the ‘Trie’.
      • If ‘ROOT’->‘CHILD’[INDEX] == NULL that shows that the current character of ‘STR’ is not present in the ‘Trie’. So, create a new node and update ‘‘ROOT’->‘CHILD’[i]’ by address of the new node.
    • Update ‘ROOT’ by ‘‘ROOT’->‘CHILD’[i]’.
    • Increment ‘i’ by 1.
  • Update ‘ROOT’->'isEnd' by true.

 

Description of ‘findSuggestions’ function:

 

This function will take three parameters :

  • ‘ROOT’: A pointer to the root node of the ‘Trie’.
  • ‘possibleAnswer’: A string variable to store the possible answer prefixed with the longest prefix of the string, ‘QUERY’.
  • ‘RES’: A vector of string to store all the suggested correct spellings.

 

void findSuggestions(‘ROOT’, ‘possibleAnswer', ‘RES’):

  • If ‘ROOT’->'isEnd' = true, that shows we have reached the end of the string. So, insert ‘possibleAnswer’ in ‘RES’ and return.
  • Otherwise, run a loop from ‘i’ = 0 to 26.
    • If ‘ROOT’->‘CHILD’[i] != NULL that shows that the character at index ‘i’ in ‘Trie’ is present.
      • Insert the character at the end of the string, ‘possibleAnswer’.
      • Recursively call ‘findSuggestions’ function with updated ‘possibleAnswer’.
      • Pop the last character of the string, ‘possibleAnswer’.
Time Complexity

O(N * M), Where ‘N’ is the number of strings in the ’DICTIONARY’ and ‘M’ is the maximum length of the string present in the ‘DICTIONARY’.

 

Since the longest prefix of ‘QUERY’ is an empty string, we have to traverse all the strings present in the ‘Trie’ that will lead to O(N * M) time complexity.

Space Complexity

O(26 * N), Where ‘N’ is the number of strings in the ’dictionary’. 

 

Since we are storing all the ‘N’ strings in ‘Trie’ and at every ‘Trie' node, we have 26 pointers. So, overall space complexity will be O(26 * N).

Code Solution
(100% EXP penalty)
Spell Checker
Full screen
Console