Code360 powered by Coding Ninjas X Naukri.com. Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1.
Problem
1.1.
Example - 
2.
Naive Approach
2.1.
Implementation
2.2.
Output
2.3.
Time Complexity
2.4.
Space Complexity
3.
Efficient Approach
3.1.
Algorithm:
4.
Visualization
5.
Implementation
5.1.
Time Complexity
5.2.
Space Complexity
6.
FAQs
7.
Key Takeaways
Last Updated: Mar 27, 2024

Find All Occurrences of the Pattern in the String

Author Jaglike Makkar
4 upvotes
Master Python: Predicting weather forecasts
Speaker
Ashwin Goyal
Product Manager @

Problem

Given a string ‘str’ and a pattern ‘pat’, you have to find all occurrences of the pattern in the string. You have to print the starting positions of all occurrences of the pattern in the string.

Example - 

Input:

‘str’ = “heyhihey”

‘pat’ = “hey”

Output:

0 5

Explanation:

The pattern in the example is of length 3. Therefore, for every substring of the original string starting from index ‘i’, we have to check if the first three characters of the substring equal the pattern. 

  1. The string starting from index 0 is = “heyhihey”. The first 3 characters of this string are “hey” and are equal to the pattern. Therefore 0 is a valid index.
  2. The string starting from index 1 is = “eyhihey”. The first 3 characters of this string are “eyh” and are not equal to the pattern. Therefore 1 is not a valid index.
  3. The string starting from index 2 is = “yhihey”. The first 3 characters of this string are “yhi” and are not equal to the pattern. Therefore 2 is not a valid index.
  4. The string starting from index 5 is = “hey”. The first 3 characters of this string are “hey” and are equal to the pattern. Therefore 5 is a valid index.

Naive Approach

Let the size of the string be ‘N’ and the size of the pattern be ‘M’. For every index ‘i’ from 0 to ‘N’-1, we iterate over the pattern starting from 0 to ‘M’-1. If every character of the pattern matches the corresponding character of the string, index ‘i’ will be the valid index. After iterating over all ‘i’ from 0 to ‘N’-1, we will have found all occurrences of the pattern in the string.

Implementation

#include<bits/stdc++.h>
using namespace std;

/*
Function to find
All occurrences of the pattern in the string
*/
void solve(string str, string pat){

    // Initialising N and M
    int n = str.size();
    int m = pat.size();

    // Iterating over the string
    for (int i = 0; i < n; i++){

        // Iterating over the pattern
        for(int j = 0; j < m; j++){

            // If any character mismatches, break
            if (i + j >= n || pat[j] != str[i + j]) break;

            /*
            If we are at the last index,
            Print the occurrences of the pattern in the string
            */
            if (j == m - 1) cout<<i<<' ';
        }
    }
    cout<<endl;
}

// Driver Code
int main(){
    string str = "heyhihey";
    string pat = "hey";
    solve(str, pat);
}

Output

0 5

Time Complexity

We are using a nested loop to first iterate over the string (size = N) and then for each index, we are iterating over the pattern (size = M). Therefore the time complexity of the approach is O(N * M).

Space Complexity

We are just using a few extra variables therefore the space complexity of the above approach is O(1).

Get the tech career you deserve, faster!
Connect with our expert counsellors to understand how to hack your way to success
User rating 4.7/5
1:1 doubt support
95% placement record
Akash Pal
Senior Software Engineer
326% Hike After Job Bootcamp
Himanshu Gusain
Programmer Analyst
32 LPA After Job Bootcamp
After Job
Bootcamp

Efficient Approach

Prerequisites - KMP Algorithm

The idea is to use the KMP string matching algorithm to find all occurrences of the pattern in the string efficiently. The main logic of the KMP algorithm is that whenever we detect a mismatch, we already know that some of the characters of the pattern have matched with the string. We can use this information to avoid matching the characters we know have already matched.

To solve this problem with the KMP algorithm, we first need to find the LPS array of the pattern used in KMP. Here, LPS[i] is the longest proper prefix which is also a suffix for the substring pat[0:i]. If you don’t know how to find an LPS array, please refer to the above link.

Algorithm:

  1. We will first create the LPS array.
  2. Initialize two variables - ‘strIdx’ and ‘patIdx’ to iterate over the string and the pattern, respectively.
  3. If ‘pat[patIdx]’ equals ‘str[strIdx]’, we will increment both the indexes.
  4. When ‘patIdx’ equals the length of the pattern, this means that the pattern is found in the string. Therefore we print the index and set ‘patIdx’ = LPS[patIdx-1].
  5. If ‘pat[patIdx]’ is not equal to ‘str[strIdx]’, we update the patIdx witht he last index that matches with ‘str[strIdx]’ using the LPS array.

Doing this, we will find all occurrences of the pattern in the string.

Also check out - Substr C++ and Rabin Karp Algorithm

Visualization

Implementation

#include<bits/stdc++.h>
using namespace std;

vector<int> getLps(string pat){
    int m = pat.size();

    // Vector to store the LPS array.
    vector<int>lps(m);

    /*
    Prev is the last longest proper prefix
    which is also a suffix
    */
    int prev = 0;
    int ind = 1;
    while (ind < m){
 
        // If both are equal
        if (pat[ind]==pat[prev]){
            prev++;
            lps[ind]=prev;
            ind++;
        }
 
        /*
        If the current elements are unequal
        And LPS is 0
        */
        else if (prev==0){
            lps[ind]=0;
            ind++;
        }
 
        /*
        If the current elements are unequal
        But LPS is not 0
        */
        else{
            prev = lps[prev-1];
        }
    }
    return lps;
}
 
/*
Function to find
All occurrences of the pattern in the string
*/
void solve(string str, string pat){
    vector<int> lps = getLps(pat);
   
    // Initializing variables
    int n = str.size();
    int m = pat.size();
    int patIdx = 0;
    int strIdx = 0;
 
    while (strIdx < n){
 
        // If both the elements match
        if (str[strIdx] == pat[patIdx]){
            patIdx++;
            strIdx++;
        }
        if (patIdx == m){
 
            /*
            This means that the entire pattern has matched
            Printing all occurrences of the pattern in the string
            */
            cout<<strIdx - m<<' ';
 
            // Updating patIdx to the last element matched.
            patIdx = lps[patIdx-1];
        }
        else if (strIdx < n){
            if (str[strIdx]!=pat[patIdx]){
                if (patIdx != 0)
 
                    // Updating lps to the last element matched
                    patIdx = lps[patIdx-1];
                else
 
                    // If no element matched, we move to next index
                    strIdx++;
            }
        }
    }
    cout<<endl;
}
 
// Driver Code
int main(){
    string str = "heyhihey";
    string pat = "hey";
    solve(str, pat);
}

Time Complexity

Computing the LPS array takes O(M) time. The time complexity of the KMP algorithm is O(N + M). Here ‘N’ is the length of the string, and ‘M’ is the length of the pattern.

Therefore, the total time complexity of the above approach is O(N + M).

Space Complexity

We are creating a new LPS array of size O(M). Therefore the space complexity of the above approach is O(M).

 

Also see, Morris Traversal for Inorder and Application of Graph in Data Structure

FAQs

  1. What is the use of the KMP algorithm?
    The KMP algorithm is used where we have to find patterns in long strings. It can be used to search a substring in a string, find the number of unique substrings in a string, find all occurrences of the pattern in the string, etc.
     
  2. What are other similar algorithms?
    Z Algorithm also searches for the given pattern in the string. Both these algorithms have the same space and time complexity, but the Z algorithm is simpler to understand. There is another algorithm called Rabin Karp, which uses O(1) space.
     
  3. What is the LPS array?
    LPS stands for Longest proper Prefix, which is also a Suffix. As the name suggests, LPS[i] stores the longest proper prefix, also a suffix for the substring pat[0:i]. The proper prefix of a string is any prefix other than the entire string itself.

Key Takeaways

In this article, we discussed the approach to finding all occurrences of the pattern in the string using KMP Algorithm. We also discussed the time and space complexity of our solution. If you want to solve more such problems, you can visit Coding Ninjas Studio.

Check out this article - C++ String Concatenation

Recommended problems -

 

If you think that this blog helped you share it with your friends!. To be more confident in data structures and algorithms, try out our DS and Algo Course.

Until then, All the best for your future endeavors, and Keep Coding.

Previous article
Shortest Palindrome
Next article
Car Fleet
Live masterclass