Shortest Substring with all characters

If the given string ‘S’ = "abcba", then the possible substrings having all the characters of ‘S’ at least once and of minimum length are "abc" and "cba". As "abc" starts with a lower index (i.e. 0, "cba" starts with index 2), we will return the string "abc" as our shortest substring that contains all characters of 'S'.

The first line of input contains an integer 'T' which denotes the number of test cases or queries to be run. Then each test case follows. The first and the only line of each test case contains a single string 'S'.

1 <= T <= 10 ^ 2 1 <= N <= 5 * (10 ^ 3) S[i] = [a - z] Where 'T' denotes the number of test cases and ‘N’ is the length of the given string ‘S’. The given string 'S' consists of only English lowercase alphabets. Time Limit: 1 sec

Test Case 1 : Some of the possible substrings are "aabcabb", "aabc", "abcab", "abc" having all the different characters of ‘S’ at least once. Out of all these substrings, we will have "abc", "bca" and "cab" with the shortest length. As "abc" appears earliest in the string, we will return "abc" in the output. Test Case 2 : Some of the possible substrings are “abcd”, “bcdcda” having all the different characters of string ‘S’ at least once. “abcd” is the only substring with the shortest length.

Brute Force

The problem boils down to counting distinct characters that are present in the string and then finding the minimum length substring that contains distinct characters at least once.

We can check for all substrings with two nested iterations and maintain a 'COUNT' for distinct characters for each substring with the help of a hashmap. There are many methods for maintaining this 'COUNT' with a hashmap, like the size of the hashmap or maintaining the 'COUNT' variable, etc.

Whenever the count of distinct characters for a substring equals the count of distinct characters in the given string, we process this substring. We will choose the minimum length substring out of all these substrings.

Approach :

First, we will take an unordered set 'HASHSET' to store the distinct characters of the string ‘S.’
Make an integer variable say 'STARTINDEX', which will store the starting index of the substring with minimum length and having all the distinct characters of string ‘S.’
Make two nested iterations, one with iterator pointer ‘i’, which will iterate throughout the length of the string ‘S’ and another for nested iteration with iterator pointer ‘j' iterates from ‘i’ till the length of string ‘S'
Make a counter 'COUNT' which will store the distinct characters (with the help of size another unordered set 'VISITED') from the substring starting from ‘ith’ index and ending at ‘jth’ index.
When ‘COUNT’ becomes equal to count of all distinct characters of string ‘S’ then store the length of this substring in integer variable say, 'MINLENGTH' and updating 'STARTINDEX' with ‘i’.
Every time we encounter a 'COUNT' less than the previous ‘COUNT’, we will update 'MINLENGTH'.
After the iterations, return the substring from string ‘S’ starting from 'STARTINDEX' and with the length of 'MINLENGTH'.

Time Complexity

O(N ^ 2), where ‘N’ is the length of the given string ‘S’.

Since we are running two nested iterations for checking all the N * (N + 1) / 2 substrings of the given string and hashmap operations take constant time, so overall time complexity will become O(N ^ 2).

Space Complexity

O(K), where ‘K’ is the number of distinct characters in the given string ‘S’.

Since we are using two hashmaps and their size will not exceed the count of distinct characters in the string, so overall space complexity will become O(K).

Shortest Substring with all characters

Problem statement

Sample Input 1:

Sample Output 1:

Explanation for Sample Output 1:

Sample Input 2:

Sample Output 2: