Last Updated: 4 Dec, 2020

Longest Sub-string With K Distinct Characters

Easy
Asked in companies
AmazonCodenationLifesight

Problem statement

You are given a string 'S' of length 'N' consisting of lowercase English alphabet letters. You are also given a positive integer 'K'.

Now, a substring of this string is good if it contains at most 'K' distinct characters. A string 'X' is a substring of string 'Y' if it can be obtained by deletion of several continuous elements(possibly zero) from the beginning and the end from the string 'Y'.

Your task is to return the maximum size of any good substring of the string 'S'.

Example:
‘S’ = “bacda” and ‘K’ = 3.

So, the substrings having at most ‘3’ distinct characters are called good substrings. Some possible good substrings are:
1. “bac”
2. “acd”
3. “acda”

The substring “acda” is the largest possible good substring, as we cannot get any other substring of length 5 or more having distinct characters less than or equal to ‘3’. Thus, you should return ‘4’ as the answer.
Input Format:
The first line of input contains an integer ‘T’ denoting the number of test cases to run. Then the test case follows.

The first line of each test case contains a single integer ‘K’, denoting the maximum number of distinct characters.

The second line of each test case contains a string ‘S’
Output Format:
For each test case, print a single line containing an integer denoting the length of the longest substring that contains at most 'K' distinct characters.

The output of each test case will be printed in a separate line.
Note:
You do not need to print anything; it has already been taken care of. Just implement the given function.
Constraints:
1 <= T <= 10
1 <= K <= 26
1 <= N <= 10^4

All the characters of the string are lowercase English alphabet letters.

Time Limit: 1 sec

Approaches

01 Approach

  • We will generate all the substrings using two nested loops and have a ‘check’ function which returns ‘true’ if the number of distinct characters in the substring is less than equal to K, otherwise ‘false’.
  • We will have a ‘longestLength’ variable initialized to 0. We will call the ‘check’ function with every substring, and if it returns true, then:
    • ‘longestLength = max(longestLength, currentSubstring.size())’
  • To implement the ‘check’ function, we will create a frequency array of size 26 ‘freq’ with each element initialized to zero.
    • Increase the frequency of every character in the substring S.
      • ‘freq[S[i] - ’a’]++’
    • Have a variable ‘distinct = 0’.
      • Then run a loop where ‘i’ ranges from ‘0’ to ‘25’ and increase the ‘distinct’ if freq[i] is non-zero.
    • Return ‘true’ if ‘distinct <= K’. Otherwise, return ‘false’.

Finally, return ‘longestLength’, which has our required answer.

02 Approach

  • We will run a loop where ‘i’ ranges from ‘0’ to ‘N - 1’ where ‘i’ will be the starting position of the substring and ‘j = i’ to ‘j = N - 1’ where ‘j’ will be the end position of the substring.
  • Before starting the second loop, we will create a hash table or an array/list of size 26. Let’s call this array ‘freq’.
  • Inside the inner loop increase the frequency of the ‘j-th’ character.
  • Now, 'freq' array will have the frequency of the substring starting from the index ‘i’ and ending at the index ‘j’.
    • We will check the number of distinct characters, which will be the number of the non-zero valued position in the 'freq' array.
  • Initialize a variable ‘longestLength = 0’, and if the number of distinct characters is less than equal to ‘K’, then:
    • ‘longestLength = max(longestLength, j - i + 1)’
  • Return ‘longestLength’, which is the required answer.

03 Approach

  • We will be doing the binary search on the answer that is the length of the substring.
    • ‘low = 1’ and ‘high = N’ will be the range of our answer. We will have a variable ‘longestLength = 0’ to store the required answer.
  • Now, we will loop till ‘low <= high’, and we will find the ‘mid’.
    • We will try to find if there is a substring of length equal to ‘mid’, which has distinct characters less than equal to ‘K’.
    • To check the number of distinct characters in a substring of length equal to 'mid', we will use the sliding window with a hash table or an array/list of size 26.
    • If it is possible to find such substring with length ‘mid’, then we can try for a bigger length substring, so:
      • ‘low = mid + 1’
      • ‘longestLength = max(longestLength, mid)’
    • Otherwise, we need to decrease the substring size, so
      • ‘high = mid - 1’

Now, after the binary search, we will have our required answer in the variable ‘longestLength’.

04 Approach

  • Create a hash table or a frequency array, and let's name it ‘freq’ and initialize all the positions with 0.
  • We will have a variable ‘start = 0’, which will be the left boundary of the sliding window.
  • We will run a loop where ‘i’ ranges from ‘0’ to ‘N - 1’, which will be the right boundary of the sliding window.
    • Increase the frequency of ‘i-th’ character i.e freq[S[i] - ‘a’]++’.
    • Now, we need to ensure that the substring has less than or equal to ‘K’ distinct characters.
      • So we will run a  loop and check if the ‘freq’ array has greater than ‘K’ distinct characters. If it has greater than ‘K’ distinct characters we do ‘freq[S[start] - ’a’]--’.
    • Once we come out of the loop, our current window (bounded by index ‘start’ and ‘i’) has less than or equal to ‘K’ distinct characters.
    • We will maintain the variable ‘longestLength’, which will be our required answer.
  • We have our required answer in ‘longestLength’, so we return it.