Last Updated: 16 Dec, 2020

Distinct Subsequences

Easy
Asked in companies
MakeMyTripInnovaccerMicrosoft

Problem statement

Given two strings S and T consisting of lower case English letters. The task is to count the distinct occurrences of T in S as a subsequence.

A subsequence is a sequence generated from a string after deleting some or no characters of the string without changing the order of the remaining string characters. (i.e. “ace” is a subsequence of “abcde” while “aec” is not).

For example, for the strings S = “banana” and T=”ban”, output is 3 as T appears in S as below three subsequences.

[ban], [ba n], [b an ]

Input Format:
The first line of input contains an integer ‘T’ representing the number of test cases. Then the test cases follow.

The first line of each test case contains string S consisting of lowercase English letters.

The second line of each test case contains string T consisting of lowercase English letters.
Output Format:
For each test case, print a single integer denoting the count of distinct occurrences of T in S as a subsequence.

The output for each test case is in a separate line.
Note:
You do not need to print anything; it has already been taken care of.
Constraints:
1 <= T <= 10
1 <= N <= 100

Where ‘T’ is the number of test cases, and ‘N’ is the length of the strings.

Time Limit: 1sec

Approaches

01 Approach

If we carefully analyze the given problem, we can see that it can be easily divided into sub-problems which can be solved using recursion. The idea is to process all characters of both strings one by one starting from either from left or right side. Let us traverse from the right corner, there are two possibilities for every pair of characters being traversed.

 

Base Cases:

  1. Given the string T is an empty string, return 1 as an empty string can be the subsequence of all.
  2. Given the string S is an empty string, return 0 as no string can be the subsequence of an empty string.

 

Recursive Cases:

  1. If the last character of S and T do not match, then remove the last character of S and call the recursive function again. Because the last character of S cannot be a part of the subsequence or remove it and check for other characters.
  2. If the last character of S matches then there can be two possibilities, first there can be a subsequence where the last character of S is a part of it and second where it is not a part of the subsequence. So the required value will be the sum of both. Call the recursive function once with the last character of both the strings removed and again with only the last character of S removed.

02 Approach

Dynamic Programming approach can be applied to solve this problem. The idea is to store the subproblems in a HashMap or an array and return the value when the function is called again.

 

Below is the algorithm:

  1. Create a 2-D array dp[m+1][n+1] where m is length of string T and n is length of string S. dp[i][j] denotes the number of distinct subsequence of substring S(1..i) and substring T(1..j), so dp[m][n] contains our solution.
  2. Initialize the first column with all 0s. An empty string can’t have another string as a subsequence.
  3. Initialize the first row with all 1s. An empty string is a subsequence of all.
  4. Fill the matrix in bottom up manner, i.e. all the sub problems of the current string are calculated first.
  5. Traverse the string T from start to end. (counter is i)
  6. For every iteration of the outer loop, traverse the string S from start to end. (counter is j)
  7. If the character at ith index of string T matches with jth character of string S, the value is obtained considering two cases. First, is all the substrings without last character in S and second is the substrings without last characters in both, i.e. dp[i+1][j] + dp[i][j].
  8. Else the value will be the same even if jth character of S is removed, i.e. dp[i+1][j].
  9. Return the value of dp[m][n] as the answer.