Last Updated: 15 Jan, 2021

Shortest Unique Prefix

Moderate
Asked in companies
OlaJUSPAYTata1mg

Problem statement

You are given an array containing ‘N’ words. For each word, you need to find its shortest prefix which can uniquely identify it. For example “abcd” and “abdc” both have the prefix “ab” in common so we can’t uniquely find a word using the prefix “ab”. To uniquely identify both the words we need the prefix “abc” from “abcd” and “abd” from “abdc”.

Note:
You can assume that the words are unique. It means that it is always possible to find a unique prefix for each word.
Input Format:
The first line of the input contains an integer ‘T’ denoting the number of test cases.
The next ‘2*T’ lines describe the ‘T’ test cases.

The first line of each test case contains single positive integers ‘N’ denoting the number of words.
The next ‘N’ lines contain a string of lower case characters.
Output Format:
The output of each test case should contain 'N' lines, in the ith line you need to print the shortest unique prefix for ith word.

Print the output of each test case in a separate line.
Note:
You do not need to print anything, it has already been taken care of. Just implement the given function.
Constraints:
1 <= T <= 50
1 <= N <= 10^4  

Where ‘T’ is the number of test cases, ‘N’ is the number of words and, the sum of the lengths of all the words in a test case is less than 10^4.

Time Limit: 1 sec

Approaches

01 Approach

  • For each word, we will try each prefix one by one to check if this prefix can be the shortest unique prefix or not.
  • We will check the prefixes of length shortest to longest and for each prefix, we will find if this is already a prefix of some word or not. If this prefix is not a prefix of some other word then it is the unique prefix for our current word. Also, it will be the shortest prefix because we are checking the prefixes from length shortest to longest.
  • We will store each prefix of the word in an array and return it as the answer.

02 Approach

  • In the previous approach, we are comparing a prefix from all the strings. If we carefully observe then we can notice that we don’t need to check a prefix from all the strings we just need to compare it with the strings having similar prefixes.
  • To find the strings which are most similar to a particular string, we can sort the array and strings lying adjacent to each other are similar ones. For example, if we have strings { “any”, “are”, “we”, “and”} so after sorting the array will look like {“and”,”any”,”are”,”we”} so let’s say for the string “any” the strings lying adjacent to it are “and” and “are” which is similar to “any” but the string “we” is far different ( or less similar ) from the string “any”.
  • So if we choose a prefix that is unique from the prefixes of adjacent strings it will be unique from the prefixes of the rest of the strings too. Now our comparisons are reduced and we will just check the adjacent string in the sorted array to find our answer for a specific word.
  • For implementation, if we are finding the answer of ith string in the sorted array of strings we find the position of the first different character of string[i] and string[i-1] let’s say that the position is pos1 and similarly we find the position of the first different character of string[i] and string[i+1] let’s say that position is pos2. So the prefix of length max(pos1,pos2) will be our answer to the string[i].

03 Approach

  • In the previous approach, We sort the array to compare a string with its similar strings. We can directly compare a string with its similar using Trie data structure. It is a prefix tree in which we can insert a string and perform prefix search in linear time (more precisely total operations are equal to the length of the string).
  • We will insert all the strings in a trie and also maintain a count at each trie node. The count represents the number of times this node is visited during the insertion process. It will be used to know how many strings are there having this character at a position equal to the level of that node in the trie.
  • We have to find a node having a count equal to 1 which means that there exists only one string having this character at a position equal to the level of that node in the trie.
  • To do the above process we will traverse the Trie according to the next available character in our current string. For every node, we will check its count if it’s equal to 1 which means we find our unique prefix. As this was the first node having a count equal to 1 while traversing so this guarantees the prefix to be shortest as well.
  • So our answer for the current string will be the prefix of length equal to the level of that node in the trie.
  • Similarly, we find unique prefixes for all the strings and store them in an array to return this as our answer.