Regular Expression Matching

Hard
0/120
Average time to solve is 25m
30 upvotes
Asked in companies
SwiggyAdobeFlipkart

Problem statement

Given an input string 'S' and a pattern 'P', implement a regular expression matching with the support of two special characters ‘.’ (dot) and ‘*’(asterisk) where

1. ‘.’ matches to any single character.
2. ‘*’ matches to zero or more of the preceding element.

If the input string 'S' matches the pattern 'P', then return true else, return false.

Note:
1. You have to match the entire string with the pattern given.

2. Both the strings, 'S' and 'P' contain only lower-case alphabets.

3. Only the pattern will contain additional characters ‘*’ and ‘.’ along with alphabets.
Detailed explanation ( Input/output format, Notes, Images )
Input Format:
The first line contains a single integer T representing the number of test cases.

The one and only line of each test case contains two space-separated strings 'S' and 'P' respectively. 
Output Format :
For each test case, print a single line containing “true” if the string matches the pattern, else print ”false”.

The output of every test case is printed in a separate line.
Note:
You do not need to print anything. It has already been taken care of. Just implement the given function.
Constraints:
1 <= T <= 10
1 <= N <= 1000
1 <= M <= 1000    

Where 'T' is the number of test cases, ‘N’ is the length of the input string to be matched, and ‘M’ is the length of the pattern 'P'.

Time limit: 1 sec.
Sample Input 1:
2
aa a
aa a*
Sample Output 1:
false
true
Explanation of Sample output 1
For the first test case, it is clearly visible that “aa” is not equal to “a”.

For the second test case, ‘*’ means that we can replace zero or more of the preceding element. Hence we can replace it with another ‘a’. So, the given string can be formed with the pattern.
Sample Input 2:
3
ab .*
aab c*a*b
mississippi mis*is*p*
Sample output 2:
true
true 
false 
Explanation of Sample output 2:
For the first test case, we can replace the dot with any character so “.*” means zero or more (*) of any character (.). So we can replace (*) with a (.). i.e. .* --> .. --> ab. 

For the second test case, c can be repeated zero times, a can be repeated 1 time. Therefore it matches “aab”.

For the third test case, there is no way possible to bring an ‘i’ in the end, hence the answer is false.
Hint

 Can you do this recursively? Try to solve the problem by solving its subproblems first.

Approaches (3)
Recursion
  • Since there will be a lot of possibilities to explore, the best way is to try to solve the problem recursively.
  • Let isMatch be the function which takes the string S and pattern P as input and returns true if the string matches the pattern else returns false.
  • The base case of this recursion will be if the length of P is 0, then the length of S should also be 0.
  • One thing to note is that ‘*’ will be only present in the pattern if there are some previously occurring characters in the pattern, i.e. ‘*’ cannot be the first character.
  • So if the second character is ‘*’ in the pattern, we can just remove the first character from the string and check from the second character onwards, i.e return isMatch(S.substr(1), P.substr(2)).
  • Another case which is possible is that if the first character in the pattern is ‘.’ or both the characters are equal, we can match it with the character in the string. So we can just return isMatch(S.substr(1), P), where substr is a function used to find the substring from 2nd to last character of the string.
  • If the second character is not ‘*’, so we cannot copy any preceding character. Hence we just need to check whether the characters are equal in both or not, or the first character in the pattern is ‘.’. So we can return isMatch(S.substr(1), P.substr(1)).
Time Complexity

O((M + N) * (2 ^ (N + M/2))), where N is the length of string and M is the length of the pattern. 

 

In the worst case, the function isMatch(S[i:], pattern[2*j:]) will be called (i+j)C(i) times (where C is the combination of the form nCr). Thus it can be proven that the overall complexity of this code will be O((M + N) * 2 ^ (N + M/2)).

Space Complexity

O((M + N) * (2 ^ (N + M/2))), where N is the length of string and M is the length of the pattern. 

 

For every call to isMatch, we will be creating the strings as shown above and some of them will be duplicated. So in the worst case it will require O((M + N) * 2 ^ (N + M/2)) stack space.

Code Solution
(100% EXP penalty)
Regular Expression Matching
Full screen
Console