Table of contents
1.
Introduction
2.
Brute Force Approach   
3.
Algorithm 
4.
C++ code:
5.
Algorithm Complexity
6.
Efficient Approach Using Rabin-Karp Algorithm   
7.
Algorithm
8.
C++ code:
9.
Algorithm Complexity: 
10.
FAQs-
11.
Key takeaways-
Last Updated: Mar 27, 2024

Substring Queries

Author Riya
1 upvote
Career growth poll
Do you think IIT Guwahati certified course can help you in your career?

Introduction

This section will discuss the problem "Substring Queries." First, let's see the problem statement.

Problem statement-

We will be given a string 'S' of length 'N,' followed by 'Q' queries in this problem. Each query is described by a string 'str .'We have to print 'YES' for each query if 'str' is a substring of 'S,' else print 'NO .'Before moving further, let's recall what a substring is.

Let's take an example to understand the problem:

Suppose N = 11 and S = “abcbbaaccbb”

Next, we have Q = 3, and the strings in these three queries are:

Query 1: "cbba"

Query 2: "abb"

Query 3: "baa"

Let's find whether the string in the query is a substring of S or not-

Query 1:

We can see that "cbba" occurs in "abcbbaaccbb" (occurrence is shown by highlighting), so "cbba" is a substring of "abcbbaaccbb". Hence print "YES" for this query.

Query 2:

We can check that "abb" doesn't occur in "abcbbaaccbb," so "abb" is not a substring of "abcbbaaccbb ."Hence print "NO" for this query.

Query 3:

We can see that "bba" occurs in "abcbbaaccbb" (occurrence is shown by highlighting), so "bba" is a substring of "abcbbaaccbb ."Hence print "YES" for this query.

Now that we understand the problem, let's understand the approach to solve it in the next section.

Brute Force Approach   

This section will discuss the approach to solving the problem described in the previous section. For each query, we have to find whether 'str' is a substring of 'S' or not. The brute force approach is to check for each index 'i' in 'S' whether a substring starting from index 'i' can be equal to 'str' or not. In this way, we will check for all possible substrings that can be equal to 'str' and find out whether 'str' is a substring of 'S' or not.

Algorithm 

Step 1. First is the main function. Inside the main function, create variables to store the string whose substrings are to be checked and the given queries. Call the function "answerQueries()" to find whether the strings in the queries are substrings of the given string or not.

Step 2. Next, create the "answerQueries()" function to solve the substring queries. Inside the function, call the function "isSubstring()" for each query and print "YES" if the string in the query is a substring of the given string 'S,' else print "NO."

Step 3. Now create the function "isSubstring()," which takes the following inputs:

  1. S - The string whose substring is to be checked
  2. str - The string which is to be checked whether it is a substring of 'S' or not
  3. N - Length of the string 'S.'
  4. M - Length of the string ' str.'

It returns true if 'str' is a substring of 'S,' else return false. 

Step 4. Inside the function "isSubstring()," check whether 'str' is a substring of 'S' or not by sliding the string 'str' over 'S' by one character at each iteration.

 

C++ code:

// C++ program to find the answer of substring queries
#include <bits/stdc++.h>
using namespace std;


// Function to check whether 'str' is a substring of 'S'
bool isSubstring(string S, string str, int N, int M)
{

	// Fix the starting index in S
	for (int i = 0; i <= N - M; i++) 
	{

		// Move to the right and check for the substring
		int j = 0;
		while(j < M)
		{
			if (S[i + j] != str[j]) {
				break;
			}
			j++;
		}

		// If all the characters of S[i..i+M] is equal to str, it means str is a substring of S
		if (j == M) {
			return true;
		}
	}
 	
	return false;
}


// Function to print the answer for each of the substring queries
void answerQueries(string S, int N, vector<string> queries, int Q)
{

	// Loop for finding answer of each of the substring queries
	for(int i=0; i < Q; i++)
	{

		// String in the query
		string str = queries[i];
    		
		// Length of string in the query
		int M = str.length();
    		
		// Call the function to check whether string in the query is a substring of S or not
		bool ans = isSubstring(S, str, N, M);
    		
		// print YES if the string in the query is a substring of S
		if(ans == true) {
			cout<<"Query "<<i+1<<": YES"<<endl;
		}
		else {
			cout<<"Query "<<i+1<<": NO"<<endl;
		}
	}
}

// Main function
int main()
{
	// Given string whose substrings to be checked
	string S = "abcbbaaccbb";

	// Length of the given string whose substrings to be checked
	int N = S.length();

	// Given number of queries
	int Q = 3;

	// Vector to store the strings used for querying whether they are substring of S or not
	vector<string> queries;

	queries.push_back("cbba");
	queries.push_back("abb");
	queries.push_back("baa");

	// Call the function to answer the substring queries
	answerQueries(S, N, queries, Q);
}
You can also try this code with Online C++ Compiler
Run Code

 

Output
Query 1: YES
Query 2: NO
Query 3: YES

 

Algorithm Complexity

Time Complexity: O( Q * N * M )

In the above program to answer substring queries, for each query, two nested loops are run, which takes O(N * M) time complexity. So, the overall time complexity is O(Q * N * M), where 'Q' is the number of queries, 'N' is the length of the given string which is to be queried for the substrings, and 'M' is the length of the longest string among the query strings.

Space Complexity: O(1) 

In the above program to answer substring queries, constant space is used. So, the space complexity is O(1).

Efficient Approach Using Rabin-Karp Algorithm   

This section will discuss the approach using the Rabin-Karp algorithm to solve the above problem. The approach is similar to the brute force approach, but here we compare the strings by comparing their hash values instead of comparing each character one by one. Precompute the hash values and then move the pattern by one character at each iteration and compare their hash values. The code will be the same as the previous approach; just replace the "isSubstring()" with a new function implementing the Rabin Karp Search algorithm.

Algorithm

Step 1. Create a function "rabinKarpSearch()' for implementing the Rabin Karp algorithm that will accept the two parameters - the given string 'S' and the given pattern 'P .'First, calculate the lengths of 'S' and 'P.'

Step 2. Now, we have to choose a prime number and a value for taking modulus while calculating the hash values. For minimizing the hashing collision, we have to take the value of the prime number close to the number of characters used in the string and pattern. Assuming that the given 'S' and 'P' consist of only lower alphabets, the total number of characters will be 26, so take prime = 31. Now a value for taking modulus should be very large and prime, so take mod = 1e +9.

Step 3.  The hash function that we have used here is:

     hash(S) = (Σ((S[i] - ‘a’ + 1) * (P^(i)))) % mod

Step 4. Create a vector to store the powers of "prime" and store (prime ^ 0) to (prime ^ Ns). Now calculate the hash value of the given pattern and the first window of the string 'S.'

Step 5. Now one by one, slide the given pattern and calculate the hash value of the corresponding substring and compare it with the hash value of the pattern. If found the same, print the occurrence of the pattern at that index.

C++ code:

//C++ code for implementation of Rabin Karp algorithm
#include <bits/stdc++.h>
using namespace std;



// Function to find whether str is a substring of S or not
bool rabinKarpSearch(string S, string str)
{

   // Calculating the length of S and P 
   int N = S.length();
   int M = str.length();
    
   // Initialize the value of prime number and mod for calculating hash values
   int prime = 31;
   int mod = 1e9 + 9;
    
   // Calculating the power raise to the taken prime
   vector<long long> p_pow(N);
   p_pow[0] = 1; 
   for (int i = 1; i < N; i++) 
    {
         p_pow[i] = (p_pow[i-1] * prime) % mod;
    }
       
    vector<long long> h(N + 1, 0); 
    for (int i = 0; i < N; i++)
     {
        h[i+1] = (h[i] + (S[i] - 'a' + 1) * p_pow[i]) % mod;
     }
    
    // Calculating the hash value of P 
    long long hash_P = 0; 
    for (int i = 0; i < M; i++) 
      {
        hash_P = (hash_P + (str[i] - 'a' + 1) * p_pow[i]) % mod; 
      }
      
    /*
       Now slide the pattern by one character and check for the corresponding
       hash value to match with the hash value of the given pattern
    */
    for (int i = 0; i + M - 1 < N; i++) 
      { 
        long long curr_hash = (h[i+M] + mod - h[i]) % mod; 
        if (curr_hash == hash_P * p_pow[i] % mod)
            return true;
      }
      
    return false;
}
  
// Function to print the answer for each of the substring queries
void answerQueries(string S, int N, vector<string> queries, int Q)
{

	// Loop for finding answer of each of the substring queries
	for(int i=0; i < Q; i++)
	{
		// String in the query
		string str = queries[i];
    		
		// Length of string in the query
		int M = str.length();
    		
		// Call the function to check whether string in the query is a substring of S or not
		bool ans = rabinKarpSearch(S, str);
    		
		// print YES if the string in the query is a substring of S
		if(ans == true) {
			cout<<"Query "<<i+1<<": YES"<<endl;
		}
		else {
			cout<<"Query "<<i+1<<": NO"<<endl;
		}
	}
}


// Main function
int main()
{
	// Given string whose substrings to be checked
	string S = "abcbbaaccbb";

	// Length of the given string whose substrings to be checked
	int N = S.length();

	// Given number of queries
	int Q = 3;

	// Vector to store the strings used for querying whether they are substring of S or not
	vector<string> queries;

	queries.push_back("cbba");
	queries.push_back("abb");
	queries.push_back("baa");

	// Call the function to answer the substring queries
	answerQueries(S, N, queries, Q);
}
You can also try this code with Online C++ Compiler
Run Code
Output
Query 1: YES
Query 2: NO
Query 3: YES

Also check out - Substr C++

Algorithm Complexity: 

Time Complexity: O(Q*(N + M))

In the Rabin Karp algorithm, we have calculated the hash value of the pattern in O(N) time and traversed the given string for calculating the hash value and comparing the corresponding hash value with that of the pattern in O(N) time. So, the time complexity is O(N + M), where 'N' and 'N' are the lengths of the given string and pattern, respectively. For each query, we have called the "rabinKarpSearch()," so overall time complexity is O(Q*(N + M)), where N and M are the lengths of the given string 'S' and the string the query, respectively.

 Space Complexity: O(1) 

 We have used constant space. So, the space complexity is O(1).

FAQs-

  1. What is a substring?
    A string ‘s1’ is said to be a substring of ‘s2’ if there is an occurrence of ‘s1’ in ‘s2’.
     
  2. How string hashing has helped in reducing the time complexity of finding the occurrences of a string in another string using the Rabin Karp algorithm?
    We can find an equivalent integer of a string using string hashing, and using the hash value of the previous substring of a string; we can calculate the hash value of the next substring of that string in O(1) time only. Additionally, comparing the values of string take O(1) time, whereas to compare two strings, it will take O(N) time, where 'N' is the length of the string. In this way, we string hashing has reduced the time complexity of finding the occurrences of a string in another string using the Rabin Karp algorithm.

 

Key takeaways-

This article discussed the problem “substring queries'', the solution approach to the problem, its C++ implementation, and its time and space complexity. If you want to solve more problems for practice, you can visit Coding Ninjas Studio.


If you think that this blog helped you, then share it with your friends!. Refer to the DSA C++ course for more information.

 

Until then, All the best for your future endeavours, and Keep Coding.

Live masterclass