Solution Approach
Approach 1: Naive Approach
The brute force approach is very straightforward. Here we discuss the brute force algorithm to solve the problem. In the approach first, we take in the string and start checking for prefix and suffix match, and only we check whether the substring is present in between the given string or not.
The steps of implementation are:
- Take in the input string in instring.
- Find the string length and store it in the length variable and the substring, excluding the first and last character in the midcheck variable.
- Initialise ans to -1 and loop over the instring and check if the prefix is equal to suffix or not.
- If equal, check whether the substring is in midcheck or not. If present update ans to current substring and move to next iteration.
- If ans is not equal to -1 print answer, else print “Password not found”.
Python Implementation
def main():
#input string
instring = input()
length = len(instring)
#to check whether the substringoccursr in between
midcheck = instring[1:-1]
ans = '-1'
#range is 1 to length to only check for proper prefix
for i in range(1, length):
#check for prefix and suffix
if instring[:i] == instring[length-i :]:
temp = instring[0:i]
if temp in midcheck :
ans = temp
#printing the output
if ans != '-1':
print(ans)
else:
print("Password not found")
#execute main
main()
You can also try this code with Online Python Compiler
Run Code
Input
codewithcodeforcode
Output
code
Complexities
Time complexity
O(l*l), where l is the length of instring
Reason: for all the prefixes, we have to iterate over the string and then for all the matches of prefix and suffix, check for a substring in between the string takes on another iteration over the string
Space Complexity
O(n), where n is the length of the string
Reason: The string is taken as an input; therefore, it is dynamic and hence the complexity of O(n).
Approach 2: KMP Algorithm
We have seen that the brute force approach is not good enough. KMP is the abbreviation for Knuth Morris Pratt. It makes use of the degenerating property, which reduces the algorithm's time complexity and optimizes the program. The algorithm takes into consideration the fact that while looking for a pattern, we also look at part of the characters in the following window, so this algorithm uses this information to avoid checking the windows which we already know are not going to match.
The steps of implementation are:
- Take the input string
- Call the KMPalgo function, which creates the temporary array arr to store the values which are used to find prefix and suffix.
- In KMPalgo, declare a variable tmp and initialize it to -1
- Loop over the input string to fill in the values in the temporary array arr. For each index from 1 to length of string-1, run a while loop until either tmp is equal to -1 or string[i] is equal to string[tmp+1] and assign the value of arr[tmp] to tmp
- If string[i] is equal to string[tmp+1], update arr[i] to tmp+1 and increment tmp
- Else update arr[i] to -1 and exit the KMPalgo function
- In the main function, if max is -1 update found to -1
- Traverse over the arr from 1 to length of the input string to find the longest substring and change found to value of max
- If found is -1, print “Password not found”, else print the longest substring that fulfills the criteria.
Python Implementation
def KMPalgo(instring, arr):
arr[0] = -1
tmp = -1
#traverse over the input string
for i in range(1,len(instring)):
while (tmp !=-1 and instring[i] != instring[tmp+1]):
tmp = arr[tmp]
if (instring[i] == instring[tmp+1]):
tmp += 1
arr[i] = tmp
else:
arr[i] = -1
def main():
instring = input()
arr = [0] * 1000005
#call KMPalgofunctionn to create the temporary array
KMPalgo(instring, arr)
l = len(instring)
max = arr[l-1]
found = arr[max]
if max == -1:
found = -1
#to find the largest substring
for i in range(1,l-1):
if arr[i] == max:
found = max
#Printis not foundd when the no string fulfill the criteria
if found == -1:
print("Just a legend")
else:
print(instring[0:found+1])
#execute main
main()
You can also try this code with Online Python Compiler
Run Code
Input
codewithcodeforcode
Output
code
Complexities
Time complexity
O(l*(l-2)), where l is the length of string
Reason: In the KMPalgo function, there are two loops, where the outer loop runs for the length of the string, the inner loop runs for the substring, and the maximum size of the substring is l-2.
Space Complexity
O(n), where n is the length of the string
Reason: The string is taken as an input; therefore, it is dynamic and hence the complexity of O(n).
Approach 3: Rabin Karp Algorithm
The Rabin Karp approach to pattern search is based on the hash function notion. The hash value of the pattern is compared to the hash value of the substrings of the text to be checked. For all the substrings whose hash value matches the pattern, individual characters are traversed and checked.
The steps of implementation are:
- Take the input string in s.
- Initialize mod to 1000000007 which will be used to reduce integer overflow.
- Initialize dp and pa to be lists of length equal to the length of s.
- Call preprocess function to calculate the hash values for prefixes.
- In preprocess function, set p to 31 which can be any prime number, and power to p then calculate the hash of index 0, use a for loop to calculate the hash array for index 1 to the length of s -1.
- Initialize ans to 0 which is the variable used to store the length of the longest substring fulfilling the criteria.
- Traverse over the s to match prefix and suffix by using the hash values
- To calculate the hash value of suffix call hashquery function.
- If the hash of prefix is equal to the hash of suffix then we check if the hash value of any substring in between the string s matches that of prefix or suffix.
- If matches, then increment ans by 1.
- At last print “Password not found if ans is equal to 0 else print the substring s[0:ans].
Python Implementation
mod = 1000000007 #mode to reduce integer overflow
def preprocess(s, pa, dp):
p = 31 #any prime number
power = p
#calculate hash value of index 0
dp[0] = ord(s[0]) - ord('a') + 1
pa[0] = 1
#to generate the prefix hash array for index 1 to len(s)-1
for i in range(1, len(s)):
dp[i] = (dp[i-1]+ (ord(s[i]) - ord('a') + 1)* power) % mod
pa[i] = power
power = (power*p)% mod
def hashquery(l,r,dp):
#This function gives the hash of any substring in the given string
ans = dp[r]
if(l>0):
ans = (ans- dp[l-1] + mod) % mod
return ans
def main():
#input string
s = input()
dp = [0] * len(s)
pa = [0] * len(s)
#to calculate hash values for prefix
preprocess(s, pa, dp)
n = len(s)
ans = 0
#traverse over the string to match prefix and suffix
for i in range(n-1):
pref = dp[i]
#find hash of suffix
suff = hashquery(n-1-i, n-1, dp)
if((pref*pa[n-1-i]) % mod == suff):
j=1
#check for the prefix in between the string
for k in range(i+1, n-1):
mid = hashquery(j,k,dp)
if(pref*pa[j]) % mod == mid:
ans = i+1
break
j += 1
#printing output
if(ans ==0):
print("Password not found")
else:
print(s[0:ans])
#execute main
main()
You can also try this code with Online Python Compiler
Run Code
Input
codewithcodeforcode
Output
code
Complexities
Time Complexity
O(l*(l-2)), where l is the length of string
Reason: In the main function, there are two loops, where the outer loop runs for the length of the string, the inner loop runs for the substring, and the maximum size of the substring is l-2.
Space Complexity
O(n), where n is the length of the string
Reason: The string is taken as an input; therefore, it is dynamic and hence the complexity of O(n).
Also check out - Substr C++
FAQs
-
What is hashing?
The process of converting a given key into another value is known as hashing. Here we have converted the string into numeric value to reduce the computation cost of the run time.
-
What is String matching algorithms?
It is a type of algorithm where we find one, several, or all occurrences of a pattern in the given string. Knuth Morris Pratt and Rabin Karp are examples of such algorithms.
-
How is the hash value of the next window calculated?
The hash value of the next window is calculated by adding the hash value of the leading character to the current hash value and then removing the hash value of the trailing character from the obtained value.
-
What are spurious hits in Rabin-Karp?
Spurious hits in Rabin Karp occur when the hash values of the pattern and the window match, but the actual window does not match the pattern.
-
What are the Rabin Karp algorithm's worst case and best case complexity?
The best-case complexity of the Rabin Karp algorithm is O(m+n), and the worst-case complexity is O(nm).
Key Takeaways
In this article, we learned how to solve the problem of "Password Substring". The brute force approach is unsuitable for larger strings, so we use optimized algorithms. The KMP and Rabin Karp are the two algorithms that optimize the problem and give a better and more efficient solution.
Check out this problem - Check If A String Is Palindrome
Are you planning to ace the interviews of reputed product-based companies like Amazon, Google, Microsoft, and more?
Attempt our Online Mock Test Series on Coding Ninjas Studio now!
Happy Coding!