Distinct Subsequences

Moderate
0/80
Average time to solve is 10m
55 upvotes
Asked in companies
UberGoogleMicrosoft

Problem statement

You have been given string 'S' of length 'N' that may contain duplicate alphabets. Your task is to return the count of distinct subsequences of it.

For example:

For the given string “deed” :
The possible subsequences are {“”}, {“d”}, {“e”}, {“de”}, {“e”}, {“de”}, {“ee”}, {“dee”}, {“d”}, {“dd”}, {“ed”}, {“ded”}, {“ed”}, {“ded”}, {“eed”} and {“deed”}.

As, {“d”}, {“e”}, {“de”}, {“ed”} and {“ded”} are repeated. 

The distinct subsequences are {“”}, {“d”}, {“e”}, {“de”}, {“ee”}, {“dee”}, {“dd”}, {“ed”}, {“ded”}, {“eed”} and {“deed”}

Thus, the output will be 11. 

Note:

As the answer can be large, return your answer modulo 10^9  + 7.  
Detailed explanation ( Input/output format, Notes, Images )
Input format:
The first line contains an integer 'T' which denotes the number of test cases or queries to be run. Then, the T test cases follow.

The first and only line of each test case or query contains the string 'S'. 
Output format:
For each test case, print a single line containing the count of distinct subsequences modulo 10^9+7.

The output of each test case will be printed in a separate line.

Note:

You do not need to print anything, it has already been taken care of. Just implement the given function.
Constraints:
1 <= T <= 10 
1 <= N <= 10 ^ 4 

where 'T' is the number of test cases and 'N' is the length of the given string 'S'.

Time limit: 1 sec.
Sample input 1:
2
abcd
dad 
Sample output 1:
16
7
Explanation
For test case 1: 


Distinct subsequence are: {“”}, {a}, {b}, {ab}, {c}, {ac}, {bc}, {abc}, {d}, {ad}, {bd}, {abd}, {cd}, {acd}, {bcd} and {abcd}. Thus, the answer will be 16. 

For test case 2:

Distinct subsequences are: {“”}, {d}, {a}, {da}, {dd}, {ad} and {dad}. Thus, the answer is 7.    

For better understanding, let us take all the subsequences of string dad. They will be: 

{“ ”}, {d}, {a}, {da}, {d}, {dd}, {ad} and {dad}
Now, {d} occurs twice and thus we will count it only once. 
Sample input 2:
2
xyzpqrs
abba
Sample output 2:
128
11
Hint

Try creating all the possible subsequences. Along with creating all the subsequences and maintain the count of all the distinct subsequences. 

Approaches (3)
Brute force
  • The IDea is to call a helper function so as to create all possible subsequences of the string which will contain:
    • String CUR = the possible subsequence of the string.
    • Int ID = the index of character which will either be included or excluded.
  • We maintain a set to count the distinct subsequences, each time a new subsequence is created, we add it into the set. Remember, the set does not contain duplicate elements.
  • We return the size of the set which is the count of distinct subsequences received from the helper function.

Algorithm:

void helper(STR, set, ID, CUR) :

  • Base Case: if ID >= STR.size()’, this means that CUR is a possible subsequence, thus, add CUR to set and return.
  • Call helper(STR, set, ID+1, CUR), this is where you exclude the current character.
  • Call helper(STR, set, ID+1, CUR + STR[ID]), this is where you include the current character.

 

int distinctSubsequences(S) :

  • Initialise an empty set.
  • Call helper function, helper(STR, set, 0, “”).
  • Return the size of the set.
Time Complexity

O(2 ^ N), where ‘N’ is the size of the given string. 

 

In the worst case, we are generating all the subsequences of a string. The possible number of subsequences can be 2 ^ N. Therefore, overall time complexity will be O(2 ^ N). 

Space Complexity

O(2 ^ N), where ‘N’ is the size of the given string.

 

In the worst case, extra space is required to store all the subsequences in a set and there can a maximum of 2 ^ N distinct subsequences when all characters of the input string are diffetrent and also an extra space is required used by a recursive stack which goes to a maximum depth of N.

Code Solution
(100% EXP penalty)
Distinct Subsequences
Full screen
Console