Last Updated: 19 Nov, 2021

Longest Substring with At Most Two Distinct Characters

Moderate
Asked in companies
Flipkart limitedUKG (Ultimate Kronos Group)

Problem statement

You are given a string ‘S’, you need to find the length of the longest substring that contains at most two distinct characters.

Note:

A string ‘B’ is a substring of a string ‘A’ if ‘B’ that can be obtained by deletion of, several characters(possibly none) from the start of ‘A’ and several characters(possibly none) from the end of ‘A’. 
Follow up :
Can you try to solve this problem in O(N) time and O(1) space.
Example :
If ‘S’ = “ninninja”

Then, “ninnin” is the longest substring that contains at most two distinct characters. We will print the length of this substring which is equal to 6.
Input Format :
The first line contains a single integer ‘T’ denoting the number of test cases, then each test case follows:

The first line of each test case contains a string ‘S’, denoting the input string.
Output Format :
For each test case, print the length of the longest substring containing at most two distinct characters.

Output for each test case will be printed in a separate line.
Note :
You are not required to print anything; it has already been taken care of. Just implement the function.
Constraints :
1 ≤ T ≤ 10      
1 ≤ |S| ≤ 1000
Where 'S' contains lowercase English alphabets

Time limit: 1 sec

Approaches

01 Approach

A standard brute force approach is to generate all the substrings, this takes ~O(N^2) time, then for each of the substring, we will check how many distinct characters it contains, therefore the total time complexity turns out to be of the order ~O(N^3).

 

We can further optimise the above approach by generating substrings starting that start from a particular index (let’s say ‘i’) and stopping the search when the number of characters is greater than two, we will require an additional data structure like a hash-set to keep a track of distinct characters for the substrings starting from index ‘i’. We can repeat this process for all the indices, and return the final answer.

 

The steps are as follows :

  1. Initialize a variable ‘ans’ equal to 0, this will store the length of the longest substring.
  2. Run a for loop for variable ‘i’ from 0 to N - 1, inside the loop:
    • Declare a hash-set ‘distinctChars’, to store the distinct characters in the substrings starting from the index ‘st’.
    • Run an inner for loop for variable ‘j’ from i to N - 1, for each iteration:
      • Insert the character at s[j] in the hash-set, if the size of the hash-set is less than 2, then continue the process, else: update the value of ‘ans’ if needed and break the inner loop.
  3. Finally, return the value stored in ‘ans’.

02 Approach

Use two pointers, the first pointer marks the start of the current substring under observation and the second pointer marks the end of that substring. Now we can move these pointers according to the constraint mentioned in the question.

 

If the substring currently marked by the pointers contains greater than two distinct characters, then we will move the first pointer to remove the characters from the beginning of the current substring until we are left with just two distinct characters, else we will move the second pointer to now include the next character in our substring. Over our entire search, we can return the maximum substring length that we were able to achieve. This is a standard two-pointer technique and people call this with many other names like window sliding etc. 

 

The steps are as follows :

  1. Initialize a variable ‘ans’ equal to 0, this will store the length of the longest substring.
  2. Initialize two variables ‘i’ and ‘j’ both equal to 0. The current substring under observation is from ‘i’ to ‘j’ - 1.
  3. Declare a hash-map ‘disticntChar’.
  4. Run a WHILE loop till ‘j’ is less than S.length, inside the loop:
    • Increment the frequency of the character at S[j] in the hash-map, we have now included the character at the jth position in our search space.
    • Now increment the value of ‘j’.
    • Check if the hash-map contains greater than two distinct characters, if so then run an inner WHILE loop, in each iteration:
      • Decrement the frequency of the character at S[i] in the hash-map, if the frequency is equal to 0 then remove that character from the hash-map, Also increment the value of ‘i’.
    • Update the ‘ans’ if needed so that it stores the maximum substring length.
  5. Finally, return the value stored in ‘ans’.