Table of contents
1.
Problem
1.1.
Example
2.
Naive Approach
2.1.
Implementation
2.2.
Time Complexity
2.3.
Space Complexity
3.
Efficient Approach
3.1.
Implementation
3.2.
Time Complexity
3.3.
Space Complexity
4.
FAQs
5.
Key Takeaways
Last Updated: Mar 27, 2024

Maximum Number Of Occurrences Of A String By Replacing All The Question Marks

Career growth poll
Do you think IIT Guwahati certified course can help you in your career?

Problem

Given a string S of lowercase letters and question marks, find the maximum number of occurrences of the string T containing lowercase letters in S after replacing all the question marks in S.

Example

Input
S = “????b?b”
T = abab

Output
2

Explanation
We can update the string S as “zababab”. The number of occurrences of T in S:

  1. From 2nd index to 5th index.
  2. From 4th index to 7th index.

Therefore, the maximum number of occurrences of T in S is 2.

Input
S = “abc??c”
T = “a”

Output
3

Explanation
We can update the string S as “abcaac”. The maximum number of occurrences of T in S is 3.

Naive Approach

One way to solve this problem is to generate all the possible strings by replacing question marks in S. Then we can iterate over all those strings and check the maximum number of occurrences of T. We will take maximum over all the generated strings to get our final answer.

To generate all the possible strings, we can use a recursive function that will replace S[i] with all the 26 letters if it was a question mark. To find the maximum number of occurrences of T in a string, we can check for each substring of size equal to the size of T. If the substring is equal to T, we will increment the count.

Implementation

#include <bits/stdc++.h>
using namespace std;

// Vector to store strings
vector<string>v;

// Recursive function to generate all strings
void rec(string s, int pos){
   
    if (pos==-1){

        // This means we have fully replaced ?'s.
        v.push_back(s);
        return;
    }

    // If the current element is not ?
    if (s[pos]!='?') {rec(s, pos-1);}
    else{
       
        // If it is ?, replace it with all 26 letters
        for(int i=0; i<26; i++){
            s[pos] = 'a'+i;
            rec(s, pos-1);
        }
    }
}

/*
Function to count the number of
occurrences of t in s
*/
int count(string s, string t){
    int n = s.size();
    int m = t.size();
    int cnt = 0;
    for (int i=0; i<n; i++){
        for(int j=0; j<m; j++){
            if (i+j==n) break;
            if (s[i+j]!=t[j]) break;
            if (j==m-1)cnt++;
        }
    }
    return cnt;
}

void solve(string s, string t){

    // Generating all possible strings.
    rec(s, s.size()-1);
    int ans = 0;

    /*
    Iterating over each string to find the
    maximum number of occurrences of t in s
    */
    for(string i: v){

        // Updating answer
        ans = max(ans, count(i, t));
    }

    // Printing the maximum number of occurrences of t in s
    cout<<ans<<endl;
}

// Driver function
int main(){
    string s = "abc??c";
    string t = "a";
    solve(s,t);
}

Output

3

Time Complexity

In the above approach, we generated all the possible strings, and then for each string, we calculated the maximum number of occurrences of T.

Generating all possible strings will take O(26 ^ N) time complexity (when all the characters are question marks). Calculating the maximum number of occurrences of T in each string will take O(N * M) time. (Here, N is the size of S and M is the size of T).

Therefore, the total time complexity will be O(M * N * (26^N)).

Space Complexity

In the above code, we have generated O(26 ^ N) strings, and each string is of size N. Therefore, the total space complexity is O(N * (26^N)).

Efficient Approach

Prerequisite: KMP Algorithm

Let C be the string obtained by concatenating T + # + S (where # is some dividing character that isn’t part of either T or S). In the KMP algorithm, we build a prefix function for this string. Now let us define Dp[i][j] on C, where i is the position in this string and j is the value of the prefix function at this position. The value of Dp[i][j] will denote the maximum number of occurrences of T found so far.

If the (i+1)th character is a letter, we can just recalculate the prefix function for this position. Otherwise, if the (i+1)th character is a question mark, we will check all 26 alphabets, recalculate the prefix function, and update the corresponding Dp values. 

We can do this recalculation in O(1) time by precalculating the values of next[i][j], where i is the value of prefix function and j is a new character and the value of next[i][j] is the value of prefix function after adding this character.

Implementation

#include <bits/stdc++.h>
using namespace std;

#define N 100100
#define Inf 0x3f3f3f3f
using namespace std;
 
int n, m, nxt[N], to[N][26];
string s, t;
vector<vector<int> > dp;

// KMP Algorithm
void kmp(){
    nxt[1] = 0;
    int j = 0;
    for(int i=2; i<=m; i++){
        while(j && t[i] != t[j+1]) j = nxt[j];
        if(t[i] == t[j+1]) j++;
        nxt[i] = j;
    }
    for (int i=0; i<=m; i++)
        for (int j=0; j<=25; j++){
            if(i <= m && t[i+1] == 'a'+j) to[i][j] = i+1;
            else to[i][j] = to[nxt[i]][j];
    }
}

void solve(){
    s = "$"+s, t = "$"+t;
    n = s.size()-1, m = t.size()-1;

    // Calling KMP
    kmp();

    // Initialising DP
    dp.resize(n+1, vector<int>(m+1, -Inf));
    dp[0][0] = 0;

    // Iterating over S
    for(int i=0; i<=n-1; i++)

        // Iterating over T
        for(int j=0; j<=m; j++)
            if(dp[i][j] >= 0){
                if(j == m) dp[i][j]++;
                for(int c='a'; c<='z'; c++) if(s[i+1] == '?' || char(c) == s[i+1]){

                    // Updating DP
                    dp[i+1][to[j][c-'a']] = max(dp[i+1][to[j][c-'a']], dp[i][j]);
                }
            }
    int ans = 0;
   
    // Calculating the maximum number of occurrences of t in s.
    for(int j=0; j<=m; j++) ans = max(ans, dp[n][j] + (j == m));
    cout<< ans <<endl;
    return;
}

// Driver function
int main(){
    s = "abc??c";
    t = "a";
    solve();
}

Output

3

Time Complexity

In the above code, we used a nested for loop to iterate over S and T, and then for each i and j, we iterated over 26 letters to calculate dp[i][j]. Therefore, the total time complexity will be O(26 * N * M). Since 26 is a constant, we can also write the complexity as O(N * M). Here N is the size of S and M is the size of T.

Space Complexity

We have used a Dp matrix of size O(N * M) and the Next matrix of size O(N * 26). Therefore, the overall space complexity will be O(N * M).

FAQs

  1. What is the use of the KMP algorithm?
    The KMP algorithm is used to find patterns in long strings. It can be used to search a substring in a string, find the number of unique substrings in a string, find all occurrences of the pattern in the string, etc.
     
  2. What are other similar algorithms?
    Z Algorithm also searches for the given pattern in the string. Both these algorithms have the same space and time complexity, but the Z algorithm is simpler to understand. There is another algorithm called Rabin Karp, which uses O(1) space.
     
  3. What is the LPS array?
    LPS stands for Longest proper Prefix, which is also a Suffix. As the name suggests, LPS[i] stores the longest proper prefix and a substring pat suffix [0:i]. A string’s proper prefix is any other than the entire string itself.

Key Takeaways

In this article, we discussed the approach to count the maximum number of occurrences of a string in another string by replacing all the question marks. We also discussed the time and space complexity of our solution. If you want to solve more such problems, you can visit Coding Ninjas Studio.

Recommended problems -

 

If you think that this blog helped you share it with your friends!. To be more confident in data structures and algorithms, try out our DS and Algo Course.

Until then, All the best for your future endeavors, and Keep Coding.

Live masterclass