Table of contents
1.
Introduction
2.
Problem Statement
2.1.
Example
3.
Solution Approach
3.1.
Approach 1: Naive Approach
3.2.
Python Implementation
3.2.1.
Input 
3.2.2.
Output
3.3.
Complexities
3.3.1.
Time complexity
3.3.2.
Space Complexity
3.4.
Approach 2: KMP Algorithm
3.5.
Python Implementation
3.5.1.
Input
3.5.2.
Output
3.6.
Complexities
3.6.1.
Time complexity
3.6.2.
Space Complexity
3.7.
Approach 3: Rabin Karp Algorithm
3.8.
Python Implementation
3.8.1.
Output
3.9.
Complexities
3.9.1.
Time Complexity
3.9.2.
Space Complexity
4.
FAQs
5.
Key Takeaways
Last Updated: Mar 27, 2024

Password Substring

Author Teesha Goyal
0 upvote

Introduction

A string is given that is the password to the door you want to open, but the password is not the entire string but a substring of the given string. You need to write a program to find the password.

Problem Statement

You are given a string as input. The given string contains the password to the door as a substring. This substring is a prefix and suffix of the string, and it also comes somewhere between the string. You need to find the output of the string to open the door.
The output of the program is the password. If more than one substring is found, then return the longest substring. If none of the substrings follows all the criteria, return “Password not found”.

Example

Input

codewithcodeandcode


Output

code


Explanation
Here, code is the largest substring, a prefix, suffix, and comes in between the given string codewithcodeandcode.

Solution Approach

Approach 1: Naive Approach

The brute force approach is very straightforward. Here we discuss the brute force algorithm to solve the problem. In the approach first, we take in the string and start checking for prefix and suffix match, and only we check whether the substring is present in between the given string or not. 

The steps of implementation are:

  • Take in the input string in instring.
  • Find the string length and store it in the length variable and the substring, excluding the first and last character in the midcheck variable.
  • Initialise ans to -1 and loop over the instring and check if the prefix is equal to suffix or not.
  • If equal, check whether the substring is in midcheck or not. If present update ans to current substring and move to next iteration.
  • If ans is not equal to -1 print answer, else print “Password not found”.

Python Implementation

def main():
    #input string
    instring = input()
    length = len(instring)

    #to check whether the substringoccursr in between
    midcheck = instring[1:-1]
    ans = '-1'

    #range is 1 to length to only check for proper prefix
    for i in range(1, length):
        #check for prefix and suffix
        if instring[:i] == instring[length-i :]:
            temp = instring[0:i]
            if temp in midcheck :
                ans = temp

    #printing the output
    if ans != '-1':
        print(ans)
    else:
        print("Password not found")

#execute main
main()
You can also try this code with Online Python Compiler
Run Code


Input 

codewithcodeforcode


Output

code

Complexities

Time complexity

O(l*l), where l is the length of instring

Reason: for all the prefixes, we have to iterate over the string and then for all the matches of prefix and suffix, check for a substring in between the string takes on another iteration over the string 

Space Complexity

O(n), where n is the length of the string

Reason: The string is taken as an input; therefore, it is dynamic and hence the complexity of O(n).

Approach 2: KMP Algorithm

We have seen that the brute force approach is not good enough. KMP is the abbreviation for Knuth Morris Pratt. It makes use of the degenerating property, which reduces the algorithm's time complexity and optimizes the program. The algorithm takes into consideration the fact that while looking for a pattern, we also look at part of the characters in the following window, so this algorithm uses this information to avoid checking the windows which we already know are not going to match.

The steps of implementation are:

  • Take the input string 
  • Call the KMPalgo function, which creates the temporary array arr to store the values which are used to find prefix and suffix.
  • In KMPalgo, declare a variable tmp and initialize it to -1
  • Loop over the input string to fill in the values in the temporary array arr. For each index from 1 to length of string-1, run a while loop until either tmp is equal to -1 or string[i] is equal to string[tmp+1] and assign the value of arr[tmp] to tmp
  • If string[i] is equal to string[tmp+1], update arr[i] to tmp+1 and increment tmp
  • Else update arr[i] to -1 and exit the KMPalgo function
  • In the main function, if max is -1 update found to -1
  • Traverse over the arr from 1 to length of the input string to find the longest substring and change found to value of max
  • If found is -1, print “Password not found”, else print the longest substring that fulfills the criteria.

Python Implementation

def KMPalgo(instring, arr):
    arr[0] = -1
    tmp = -1

    #traverse over the input string
    for i in range(1,len(instring)):

        while (tmp !=-1 and instring[i] != instring[tmp+1]):
            tmp = arr[tmp]

        if (instring[i] == instring[tmp+1]):
            tmp += 1
            arr[i] = tmp
        else:
            arr[i] = -1

def main():
    instring = input()
    arr = [0] * 1000005

    #call KMPalgofunctionn to create the temporary array
    KMPalgo(instring, arr)

    l = len(instring)
    max = arr[l-1]
    found = arr[max]

    if max == -1:
        found = -1

    #to find the largest substring
    for i in range(1,l-1):
        if arr[i] == max:
            found = max

    #Printis not foundd when the no string fulfill the criteria
    if found == -1:
        print("Just a legend")
    else:
        print(instring[0:found+1])

#execute main
main()
You can also try this code with Online Python Compiler
Run Code


Input

codewithcodeforcode


Output

code

Complexities

Time complexity

O(l*(l-2)), where l is the length of string

Reason: In the KMPalgo function, there are two loops, where the outer loop runs for the length of the string, the inner loop runs for the substring, and the maximum size of the substring is l-2.

Space Complexity

O(n), where n is the length of the string

Reason: The string is taken as an input; therefore, it is dynamic and hence the complexity of O(n).

Approach 3: Rabin Karp Algorithm

The Rabin Karp approach to pattern search is based on the hash function notion. The hash value of the pattern is compared to the hash value of the substrings of the text to be checked. For all the substrings whose hash value matches the pattern, individual characters are traversed and checked.

The steps of implementation are:

  • Take the input string in s.
  • Initialize mod to 1000000007 which will be used to reduce integer overflow.
  • Initialize dp and pa to be lists of length equal to the length of s.
  • Call preprocess function to calculate the hash values for prefixes.
  • In preprocess function, set p to 31 which can be any prime number, and power to p then calculate the hash of index 0, use a for loop to calculate the hash array for index 1 to the length of s -1.
  • Initialize ans to 0 which is the variable used to store the length of the longest substring fulfilling the criteria.
  • Traverse over the s to match prefix and suffix by using the hash values
  • To calculate the hash value of suffix call hashquery function.
  • If the hash of prefix is equal to the hash of suffix then we check if the hash value of any substring in between the string s matches that of prefix or suffix. 
  • If matches, then increment ans by 1.
  • At last print “Password not found if ans is equal to 0 else print the substring s[0:ans].

Python Implementation

mod = 1000000007 #mode to reduce integer overflow

def preprocess(s, pa, dp):
    p = 31  #any prime number
    power = p

    #calculate hash value of index 0
    dp[0] = ord(s[0]) - ord('a') + 1  
    pa[0] = 1
   
    #to generate the prefix hash array for index 1 to len(s)-1
    for i in range(1, len(s)):
        dp[i] = (dp[i-1]+ (ord(s[i]) - ord('a') + 1)* power) % mod
       
        pa[i] = power
        power = (power*p)% mod

def hashquery(l,r,dp):
    #This function gives the hash of any substring in the given string
    ans = dp[r]
    if(l>0):
        ans = (ans- dp[l-1] + mod) % mod
    return ans
   
def main():
    #input string
    s = input()
   
    dp = [0] * len(s)
    pa = [0] * len(s)
   
    #to calculate hash values for prefix
    preprocess(s, pa, dp)
    n = len(s)
    ans = 0
   
    #traverse over the string to match prefix and suffix
    for i in range(n-1):
        pref = dp[i]
        #find hash of suffix
        suff = hashquery(n-1-i, n-1, dp)
       
        if((pref*pa[n-1-i]) % mod == suff):
            j=1
            #check for the prefix in between the string
            for k in range(i+1, n-1):
                mid = hashquery(j,k,dp)
                if(pref*pa[j]) % mod == mid:
                    ans = i+1
                    break
               
                j += 1
   
    #printing output
    if(ans ==0):
        print("Password not found")
    else:
        print(s[0:ans])      
       
#execute main
main()        
You can also try this code with Online Python Compiler
Run Code


Input

codewithcodeforcode


Output

code

Complexities

Time Complexity

O(l*(l-2)), where l is the length of string

Reason: In the main function, there are two loops, where the outer loop runs for the length of the string, the inner loop runs for the substring, and the maximum size of the substring is l-2. 

Space Complexity

O(n), where n is the length of the string

Reason: The string is taken as an input; therefore, it is dynamic and hence the complexity of O(n).

Also check out - Substr C++

FAQs

  1. What is hashing?
    The process of converting a given key into another value is known as hashing. Here we have converted the string into numeric value to reduce the computation cost of the run time.
     
  2. What is String matching algorithms?
    It is a type of algorithm where we find one, several, or all occurrences of a pattern in the given string. Knuth Morris Pratt and Rabin Karp are examples of such algorithms.
     
  3. How is the hash value of the next window calculated?
    The hash value of the next window is calculated by adding the hash value of the leading character to the current hash value and then removing the hash value of the trailing character from the obtained value.
     
  4. What are spurious hits in Rabin-Karp?
    Spurious hits in Rabin Karp occur when the hash values of the pattern and the window match, but the actual window does not match the pattern.
     
  5. What are the Rabin Karp algorithm's worst case and best case complexity?
    The best-case complexity of the Rabin Karp algorithm is O(m+n), and the worst-case complexity is O(nm).

Key Takeaways

In this article, we learned how to solve the problem of "Password Substring". The brute force approach is unsuitable for larger strings, so we use optimized algorithms. The KMP and Rabin Karp are the two algorithms that optimize the problem and give a better and more efficient solution.

Check out this problem - Check If A String Is Palindrome

Are you planning to ace the interviews of reputed product-based companies like Amazon, Google, Microsoft, and more? 
Attempt our Online Mock Test Series on Coding Ninjas Studio now!

Happy Coding!

Live masterclass