Shortest Substring with all characters

Easy
0/40
profile
Contributed by
22 upvotes
Asked in companies
ZSAmazonUnthinkable Solutions

Problem statement

You have been given a string 'S' which only consists of lowercase English-Alphabet letters. Your task is to find the shortest (minimum length) substring from 'S' which contains all the characters of 'S' at least once.

Note:

If there are more than one substring with the shortest length, then find one which appears earlier in the string ‘S’ i.e. substring whose starting index is lowest.
For example:
If the given string ‘S’ = "abcba", then the possible substrings having all the characters of ‘S’ at least once and of minimum length are "abc" and "cba". 

As "abc" starts with a lower index (i.e. 0, "cba" starts with index 2), we will return the string "abc" as our shortest substring that contains all characters of 'S'.
Detailed explanation ( Input/output format, Notes, Images )
Input Format :
The first line of input contains an integer 'T' which denotes the number of test cases or queries to be run. Then each test case follows.

The first and the only line of each test case contains a single string 'S'.
Output Format :
For each test case, print a single line containing a string i.e. the shortest substring of 'S' which contains all the characters of 'S' at least once.

Note:

You don’t need to print the output, it has already been taken care of. Just implement the given function.

Constraints:

1 <= T <= 10 ^ 2
1 <= N <= 5 * (10 ^ 3)
S[i] = [a - z]

Where 'T' denotes the number of test cases and ‘N’ is the length of the given string ‘S’.
The given string 'S' consists of only English lowercase alphabets.

Time Limit: 1 sec 
Sample Input 1:
2
aabcabb
abcdcda
Sample Output 1:
abc
abcd
Explanation for Sample Output 1:
Test Case 1 :

Some of the possible substrings are "aabcabb", "aabc", "abcab", "abc" having all the different characters of ‘S’ at least once. Out of all these substrings, we will have "abc", "bca" and "cab" with the shortest length. 

As "abc" appears earliest in the string, we will return "abc" in the output.

Test Case 2 :

Some of the possible substrings are “abcd”, “bcdcda” having all the different characters of string ‘S’ at least once. “abcd” is the only substring with the shortest length. 
Sample Input 2:
2
cddeyys
bbbb
Sample Output 2:
cddeyys
b
Hint

Count distinct characters in the string and check all possible substrings.

Approaches (2)
Brute Force

The problem boils down to counting distinct characters that are present in the string and then finding the minimum length substring that contains distinct characters at least once.

 

We can check for all substrings with two nested iterations and maintain a 'COUNT' for distinct characters for each substring with the help of a hashmap. There are many methods for maintaining this 'COUNT' with a hashmap, like the size of the hashmap or maintaining the 'COUNT' variable, etc. 

 

Whenever the count of distinct characters for a substring equals the count of distinct characters in the given string, we process this substring. We will choose the minimum length substring out of all these substrings.

 

Approach :

 

  • First, we will take an unordered set 'HASHSET' to store the distinct characters of the string ‘S.’
  • Make an integer variable say 'STARTINDEX', which will store the starting index of the substring with minimum length and having all the distinct characters of string ‘S.’
  • Make two nested iterations, one with iterator pointer ‘i’, which will iterate throughout the length of the string ‘S’ and another for nested iteration with iterator pointer ‘j' iterates from ‘i’ till the length of string ‘S'
  • Make a counter 'COUNT' which will store the distinct characters (with the help of size another unordered set 'VISITED')  from the substring starting from ‘ith’ index and ending at ‘jth’ index.
  • When ‘COUNT’ becomes equal to count of all distinct characters of string ‘S’ then store the length of this substring in integer variable say, 'MINLENGTH' and updating 'STARTINDEX' with ‘i’.
  • Every time we encounter a 'COUNT' less than the previous ‘COUNT’, we will update 'MINLENGTH'.
  • After the iterations, return the substring from string ‘S’  starting from 'STARTINDEX' and with the length of 'MINLENGTH'.
Time Complexity

O(N ^ 2), where ‘N’ is the length of the given string ‘S’.

 

Since we are running two nested iterations for checking all the N * (N + 1) / 2 substrings of the given string and hashmap operations take constant time, so overall time complexity will become O(N ^ 2).

Space Complexity

O(K), where ‘K’ is the number of distinct characters in the given string ‘S’.

 

Since we are using two hashmaps and their size will not exceed the count of distinct characters in the string, so overall space complexity will become O(K). 

Code Solution
(100% EXP penalty)
Shortest Substring with all characters
Full screen
Console