Design Search Autocomplete System

Hard
0/120
Average time to solve is 25m
profile
Contributed by
11 upvotes
Asked in companies
UberTwitterMicrosoft

Problem statement

Ninja has enrolled in a system design course at Coding Ninjas, and his first assignment is to create a search autocomplete system for a search engine.

Ninja is given ‘N’ sentences ‘SENTENCES’ and ‘N’ integers ‘TIMES’, where ‘TIMES’[i] is the number of times the ‘SENTENCES’[i] was typed. Now, the instructor will input a test sentence(at least one word and end with ‘#’). As Ninja’s best friend, can you help him with the assignment?

Your task is to implement the AutocompleteSystemclass:

  • AutocompleteSystem(String[] sentences, int[] times)Initializes the object with the sentences and times arrays.
  • List input(char c)This indicates that the user typed the character c.
    • Returns an empty array [], if c == '#' and stores the inputted sentence in the system.
    • Returns the top 3 historical hot sentences with the same prefix as the sentence already typed. If there are fewer than 3matches, return them all.

Here are the specific rules:

  • A sentence’s hot degree is defined as the number of times a user typed the same sentence before.
  • The returned top 3 hot sentences should be sorted by hot degree (The first is the hottest one). If several sentences have the same hot degree, use ASCII-code order (smaller one appears first).
  • If less than three hot sentences exist, return as many as possible.
  • When the input is a special character, it means the sentence ends, and in this case, you need to return an empty list.

For Example:

Input:
["AutocompleteSystem","input","input","input"]
[[["i want to join faang","icecream","i love coding ninjas"],[4,3,1],["i"],["s"],["#"]]

Output:
[null,["i want to join faang","icecream","i love coding ninjas"],[],[]]

Explanation:
* AutocompleteSystem obj = new AutocompleteSystem(["i want to join faang","icecream","i love coding ninjas"],[4,3,1]); 
* obj.input(“i”); // return ["i want to join faang","icecream","i love coding ninjas"]. There are three sentences that have prefix "I".
* obj.input(“s”); // return []. There is no sentence with prefix “is”
* obj.input(“#”); // return [].
Detailed explanation ( Input/output format, Notes, Images )
Input Format :
The first line of the input contains a single integer 'T', representing the number of test cases.

The first line contains an integer ‘N’, representing the number of sentences.

The next line of each test case contains ‘N’ + 1 string where the first string is always “AutocompleteSystem” and the following ‘N’ strings are “input”.

The next ‘N’ lines contain a String, representing a sentence.

The next line contains ‘N’ space-separated integers representing the elements of the ‘TIMES’ array.

The next line contains a sentence inputted by the user for the testing purpose.
Output Format:
For each test case, print all pairs of the distinct indices (i, j) in ‘WORDS’, such that the concatenation of WORDS[i] and WORDS[j] is a palindrome.

Print the output of each test case in a separate line.
Note:
You do not need to print anything. It has already been taken care of. Just implement the given function.
Constraints:
1 <= T <= 10
1 <= N <= 100
1 <= SENTENCES[i].LENGTH <= 100
1 <= TIMES[i] <= 50
SENTENCES[i] consists of lowercase English letters
‘C’ is a lowercase English letter, a hash '#', or space ' '.
Each tested sentence will be a sequence of characters c that end with the character '#'.
Each tested sentence will have a length in the range [1, 200].
The words in each input sentence are separated by single spaces.
At most 5000 calls will be made to input.


Time Limit: 1 sec
Sample Input 1:
2
3
i want to join faang
icecream
i love coding ninjas
4 3 1
is#
3
batman
superman
spiderman
4 3 2
sp#
Sample Output 1:
[i want to join faang, icecream, i love coding ninjas]
[]
[]
[superman, spiderman]
[spiderman]
[]
Explanation of Sample Input 1:
For the first test case, 
AutocompleteSystem obj = new AutocompleteSystem(["i want to join faang","icecream","i love coding ninjas"],[4,3,1]); 
obj.input(“i”); // return ["i want to join faang","icecream","i love coding ninjas"]. There are three sentences that have prefix "i".
obj.input(“s”); // return []. There is no sentence with prefix “is”
obj.input(“#”); // return [].

For the second test case, 
AutocompleteSystem obj = new AutocompleteSystem([["batman","superman","spiderman"],[4,3,1]],["s"],["p"],["#"]]); 
obj.input(“s”); // return [superman, spiderman]. There are two sentences that have prefix "s".
obj.input(“p”); // return [spiderman]. There is one sentence with prefix “sp”
obj.input(“#”); // return [].
Sample Input 2:
2
2
stack and queues
trees and grpahs
3 1
t#
4
aaa
aab
abb
aac
3 2 2 1
abc#
Sample Output 2:
[trees and graphs]
[]
[aaa, aab, abb]
[aab]
[]
[]
Hint

Use a data structure to store the sentences efficiently.

Approaches (1)
Trie

Prerequisite: Trie data structure

 

In this approach, we will use a Trie. Trie is an efficient data retrieval data structure mostly used for string manipulations. It provides a way to store strings efficiently and also to search for them in a lot lesser time complexity. We will store the top 3 hot sentences directly on each node. When we get the input character, we will simply read off the answer from the node. Whenever we add a sentence at each node of the corresponding trie, we will update that node’s info appropriately if its top three answer changes. We will use a list to store our answers, which will keep track of only the top three hot sentences.

 

Each Trie Node will have four fields:

  1. TrieNode[] children: Denoting next layers.
  2. String s: Any sentence.
  3. int times: Hot degree of this sentence ‘S’.
  4. List<TrieNode> hot: To store the top 3 hot sentences at this node.

 

Comparing two Trie Nodes:

A Trie Node ‘A’ is hotter than Trie Node ‘B’ if A.TIMES is greater than B.TIMES. If A.TIMES is equal to B.TIMES, then ‘A’ is hotter than ‘B’ if A.S is lexicographically smaller than B.S. Otherwise, ‘B’ is hotter and will be placed ahead of ‘A’.
 

How to get the top 3 hot sentences for any testing sentence?

Suppose if the testing sentence is “abc” and there exits a Trie Node ‘T’, where T.s = “abc” then the top 3 hot sentences for this query will be the sentences stored in T.hot.

 

Algorithm :

Function AUTOCOMPLETESYSTEM(‘SENTENCES’, ‘TIMES’);

  • Initialize a TrieNode ‘ROOT’ and ‘CUR’.
  • Initialize a StringBuilder ‘SB’.
  • Iterate ‘I’ in ‘TIMES.LENGTH’:
    • Add ‘SENTENCES’[‘i’] and ‘TIMES’[‘I’] in the Trie.
       

Function INPUT(‘C’);

  • Initialize a list ‘RESULT’.
  • If ‘C’ is ‘#’:
    • Add ‘SB’ to Trie.
    • Re-initialize ‘SB’.
    • Set ‘CUR’ as ‘ROOT’.
    • Return ‘RESULT’.
  • Append ‘C’ to ‘SB’.
  • If ‘CUR’ is not null, set ‘CUR’ as ‘CUR.CHILDREN’[‘C’].
  • If ‘CUR’ is null, return ‘RESULT’.
  • Iterate ‘NODE’ in ‘CUR.HOT’:
    • Add ‘NODE.S’ to ‘RESULT’.
  • Return ‘RESULT’.

 

Maintain a class TrieNode:

  • Initialize a TrieNode array ‘CHILDREN’ of size 128, a string ‘S’ as null, an integer ‘TIMES’ as 0, and an empty TrieNode list ‘HOT’.
  • Inside class TrieNode, maintain a function ‘UPDATE’(TrieNode ‘NODE’):
    • Add ‘NODE’ to ‘HOT’ of this TrieNode.
    • Sort ‘HOT’.
    • If ‘HOT’ contains more than three TrieNodes, remove the extra ones and just keep the top 3.
       

Maintain a function ADD(‘SENTENCE’, ‘T’):

  • Set ‘TEMP’ as ‘ROOT’.
  • Initialize a list ‘VISITED’.
  • Iterate ‘C’ in ‘SENTENCE’:
    • Set ‘TEMP’ as ‘TEMP.CHILDREN’[‘C]’.
    • Add ‘TEMP’ to ‘VISITED’,
  • Set ‘TEMP.S’ as ‘SENTENCE’.
  • Add ‘T’ to ‘TEMP.TIMES’.
  • Iterate ‘NODE’ in ‘VISITED’:
    • Call ‘NODE.UPDATE’(‘TEMP’).
Time Complexity

O(N * L + C), where 'N' is the number of sentences, ‘L’ is the length of the longest sentence, and ‘C’ is the number of characters in the testing sentence.

 

The time for building Trie is O(N * L), and the time for answering queries is O(C). Hence, the total time complexity is O(N * K).

Space Complexity

O(N * L), where 'N' is the number of sentences and ‘L’ is the length of the longest sentence.
 

We may have to store ‘N’ sentences of length ‘L’ in Trie. Hence, the total space complexity is O(N * L).

Code Solution
(100% EXP penalty)
Design Search Autocomplete System
Full screen
Console