Do you think IIT Guwahati certified course can help you in your career?
No
Introduction
“Those who cannot remember the past are condemned to repeat it.”
-Dynamic Programming.
As scary as it sounds, Dynamic Programming is a competitive topic to learn and requires brain and intensive practice. And once you can solve these problems, people think of you as a pro, but companies long for you.
The problem we’re going to solve today is an advanced DP problem.
The Central Bureau of Investigation encrypts some papers for security reasons. The document is first turned into lower case letter strings in this process. This string is now being encoded.
Encoding Procedures
The string can be broken down into a list of non-empty substrings at random.
Now that we have a list of strings, randomly pick some substrings and change them with their lengths.
After that, combine the list of substrings to make an encoded string.
You are given two encoded strings, 'S1' and 'S2', consisting of lower case English letters and digits 1 to 9, and you have to tell if both strings are encoded from the same parent string. In other words, return true if a parent string 'S' exists, which can be encoded as 'S1' and 'S2' both. Else return false.
Example
Note: The number of consecutive digits in ‘S1’ and ‘S2’ does not exceed 3.
Intuition
When you first read the problem, your mind might have blown off, but don't worry, that's normal because this problem is an Advanced DP problem. Let's break down the problem bit by bit and see how we can solve each part.
As you can see, we've been given two strings and must determine if both strings can be turned into a parent string "S" by substituting any character for the numbers in the strings.
For example
One thing is clear, we need to compare these two strings somehow, and comparing two strings reminds me of a well-known DP problem called Longest Common Substring.
One major change in this problem is that we have wildcard characters here.
So, if two characters in the string are not the same, we may be able to correct the mismatch moving forward, as seen in the example above.
S1[0] = 'g' and S2[0] ='3' are not the same, and because s2[0] is a number, we can replace 3 in the future with any three wildcard characters.
To summarize up till now, we will compare the encoded string character by character, and if one of the characters is a digit, we know that this string can in future concatenate digits times of wildcard characters.
Let us say we’ll be comparing characters at ‘POS1’ from ‘S1’ and ‘POS2’ in ‘S2’ and at the beginning POS1 = 0 and POS2 = 0. Since both encoded strings only contain lowercase characters and digits 1 to 9. There are four cases to occur.
S1[POS1] and S2[POS2] both are alphabets.
S1[POS1] and S2[POS2] both are digits.
S1[POS1] is a digit, and S2[POS2] is an alphabet.
S1[POS1] is an alphabet, and S2[POS2] is a digit.
Now that we've broken down the problem, let's solve it.
Approach 1- Map 4D
In this approach, the key point is every time we compare two characters, we can have wildcard characters space in strings from our past comparison, so we’ve to maintain the count of wildcard characters in both strings and let ‘WC1’ be wildcard count for ‘S1’ and ‘WC2’ be wildcard count for ‘S2’.
So for each string, we have two variables ‘POS’ and ‘WC’, and at each level of comparison FUNCTION(STRING, POS, WC), we know we compare any number of WC characters to the second string first and then start comparing the rest of the string by position POS + 1.
Since we have two strings, we’ll have the following function prototype. Let’s call this function SOLVER.
Different cases to consider in this function are
If both strings ‘S1’ and ‘S2’ have been completed, i.e., POS1 = S1.length(), POS2 = S2.length(), WC1 = 0, and WC2 = 0. Then we know we’ve found a parent string, hence we return TRUE.
One string is completed, but the other is not, then we return FALSE. Both 1 and 2 cases are the base cases.
Both have some wildcard characters that can be inserted, i.e., WC1 and WC2 are positive, then we decrease the minimum of WC1 and WC2 from both and continue.
One of them has wildcard characters, i.e., either WC1 > 0 or WC2 > 0. Then, in that case, we create every possible number of lengths one, two, and three and continue(since the length of possible numbers in a string is three).
If in both strings, count of wildcard characters is zero, i.e., WC1 = 0 and WC2 = 0.
The first case is both characters are alphabets, so we compare these characters. If they are the same, we continue.
The second case is when one character is an alphabet, and the other is a digit then we enumerate possible numbers and continue. Same as 4th case.
Program
#include <iostream>
#include <string>
#include <unordered_map>
#include <algorithm>
using namespace std;
// Function to tell if the string is in a complete state, i.e., remaining wildcard characters are zero, and we've reached the end of the string.
bool close(string &s, int pos, int debt)
{
return pos == s.length() && debt == 0;
}
// Function that does a dfs to try every possible case for both encoded strings to have the same parent string.
bool solver(string &s1, string &s2, int pos1, int pos2, int wc1, int wc2, unordered_map<string, bool> &dp)
{ // Create the key string to memoize the results.
string key = to_string(pos1) + " " + to_string(pos2) + " " + to_string(wc1) + " " + to_string(wc2);
// If we've previously explored this case, return the previously computed answer.
if (dp.find(key) != dp.end())
{
return dp[key];
}
// Boolean variables that tell whether string1 and string2 are in complete state.
bool end1 = close(s1, pos1, wc1);
bool end2 = close(s2, pos2, wc2);
// If both strings have a complete state, store the answer in the map and return true.
if (end1 && end2)
{
return dp[key] = true;
}
// If one of the strings has a complete state, but the other is not, then return false.
if (end1 != end2)
{
return dp[key] = false;
}
// If both strings have wildcard characters, then cut the first min(WC1, Wc2) characters and recurse.
if (wc1 > 0 && wc2 > 0)
{
return dp[key] = solver(s1, s2, pos1, pos2, wc1 - min(wc1, wc2), wc2 - min(wc1, wc2), dp);
}
// If 'S1' have wildcard characters.
if (wc1 > 0)
{
// If the current charater in 'S2' is an alphabet.
if (!isdigit(s2[pos2]))
{
return dp[key] = solver(s1, s2, pos1, pos2 + 1, wc1 - 1, wc2, dp);
}
else
{
// Now create all possible numbers of length 1, 2, and 3 and recurse.
int number = 0;
for (int i = 0; i + pos2 < s2.length(); i++)
{
if (!isdigit(s2[pos2 + i]))
{
break;
}
number = number * 10 + (s2[pos2 + i] - '0');
// Since these are S2 wildcard characters, WC2=number.
if (solver(s1, s2, pos1, pos2 + i + 1, wc1, number, dp))
{
return dp[key] = true;
}
}
return dp[key] = false;
}
}
// Same is done if 'S2' has wildcard characters.
if (wc2 > 0)
{
if (!isdigit(s1[pos1]))
{
return dp[key] = solver(s1, s2, pos1 + 1, pos2, wc1, wc2 - 1, dp);
}
else
{
int number = 0;
for (int i = 0; i + pos1 < s1.length(); i++)
{
if (!isdigit(s1[pos1 + i]))
{
break;
}
number = number * 10 + (s1[pos1 + i] - '0');
// Since these are 'S1' wildcard characters, WC1 = number.
// If we've found a parent string from this branch return true;
if (solver(s1, s2, pos1 + i + 1, pos2, number, wc2, dp))
{
return dp[key] = true;
}
}
return dp[key] = false;
}
}
// If both current characters are alphabets.
if (!isdigit(s1[pos1]) && !isdigit(s2[pos2]))
{
// If characters are not equal return false.
if (s1[pos1] != s2[pos2])
{
return dp[key] = false;
}
// Match the current character and recurse.
return dp[key] = solver(s1, s2, pos1 + 1, pos2 + 1, wc1, wc2, dp);
}
// If the current character of S1 is a digit, then try all possible numbers of length 1,2, and3. And recurse.
if (isdigit(s1[pos1]))
{
int number = 0;
for (int i = 0; i + pos1 < s1.length(); i++)
{
if (!isdigit(s1[pos1 + i]))
{
break;
}
number = number * 10 + (s1[pos1 + i] - '0');
if (solver(s1, s2, pos1 + i + 1, pos2, number, wc2, dp))
{
return dp[key] = true;
}
}
return dp[key] = false;
}
// Same for S2.
if (isdigit(s2[pos2]))
{
int number = 0;
for (int i = 0; i + pos2 < s2.length(); i++)
{
if (!isdigit(s2[pos2 + i]))
{
break;
}
number = number * 10 + (s2[pos2 + i] - '0');
if (solver(s1, s2, pos1, pos2 + i + 1, wc1, number, dp))
{
return dp[key] = true;
}
}
return dp[key] = false;
}
return false;
}
// Main Function to call solver().
bool encodedString(string &s1, string &s2)
{
unordered_map<string, bool> dp;
return solver(s1, s2, 0, 0, 0, 0, dp);
}
int main()
{
string s1, s2;
cout << "Enter the first encoded string: ";
cin >> s1;
cout << "Enter the second encoded string: ";
cin >> s2;
bool haveSameParentString = encodedString(s1, s2);
if (haveSameParentString)
{
cout << "Both the encoded strings have same parent string.";
}
else
{
cout << "Both the encoded strings do not have any parent string.";
}
return 0;
}
You can also try this code with Online C++ Compiler
Enter the first encoded string: “112s”
Enter the second encoded string: “g841”
Output
Both the encoded strings have the same parent string.
Time Complexity
O(N1 * N2 * WC1 * WC2), where N1 = S1.length(), N2 = S2.length(), WC1 = Greatest three digit number, i.e., 999 and WC2 = 999(given in the constraints).
In the worst case, we’ll recurse at every possible combination of POS1, POS2, WC1, and WC2.
Space Complexity
O(N1 * N2 * WC1 * WC2), where N1 = S1.length(), N2 = S2.length(), WC1 = Greatest three digit number, i.e., 999 and WC2 = 999(given in the constraints).
In the recursive function four variables ‘N1’, ‘N2’, ‘WC1’, and ‘WC2’ are changing hence we need a four-dimensional array for Dynamic Programming.
Approach 2- 3D Array
The main concept in the last problem was wildcard characters, but as wildcards cancel each other out, we will only have to maintain the difference DIFF = WC2 - WC1, where the negative value of DIFF indicates string S1 has wildcard characters and a positive value indicates that string S2 has wildcards.
In this approach, our function prototype will look something like this.
Let’s see what cases will form considering three variables.
If both S1[POS1] and S2[POS2] are digits, then we will construct possible one, two, and three digits numbers and update the value of DIFF, then continue.
If both S1[POS1] and S2[POS2] are alphabets, and DIFF != 0.
If DIFF is negative, we know S1 has wildcards, so we'll match one wildcard of S1 with one S2 character and recurse.
If DIFF is positive, then S2 has wildcards and hence does the same for S2.
If both S1[POS1] and S2[POS2] are alphabets and DIFF=0, then we match the two characters.
With this idea, let's code the solution.
Program
#include <iostream>
using namespace std;
// Three dimensional array to track which state have been visited.
bool dp[41][41][2000] = {};
// Function that performs DFS to try every possible state for both encoded strings to have the same parent string.
bool solver(string &s1, string &s2, int pos1, int pos2, int diff)
{
// If we've traversed both the strings completely, then if DIFF is zero means no extra characters are to be matched; hence return TRUE else, return FALSE.
if (pos1 == s1.length() && pos2 == s2.length())
{
return diff == 0;
}
// Now we check if we've visited this state before, then return false.
if (dp[pos1][pos2][1000 + diff])
{
return false;
}
// Next we mark this state visited.
dp[pos1][pos2][1000 + diff] = true;
// If the current character in ‘S1’ is a digit, we'll make all possible three digits numbers and recurse.
if (pos1 < s1.length() && isdigit(s1[pos1]))
{
// Initializing the variable, ‘NUMBER’.
int number = 0;
for (int i = 0; (i + pos1) < s1.length() && isdigit(s1[pos1+i]); i++)
{
// If the current character is not a digit, break.
if (!isdigit(s1[pos1 + i]))
{
break;
}
// Update the ‘NUMBER’.
number = number * 10 + (s1[pos1 + i] - '0');
// Now, since we are comparing any ‘NUMBER’ characters from ‘S’1. we'll decrease the ‘DIFF’ by ‘NUMBER’ and recurse. If by doing this, we can find the parent string existing then return true.
if (solver(s1, s2, pos1 + i + 1, pos2, diff - number))
{
return true;
}
}
return false;
}
// Same is done for ‘S2’.
if (pos2 < s2.length() && isdigit(s2[pos2]))
{
int number = 0;
for (int i = 0; (i + pos2) < s2.length() && isdigit(s2[pos2+i]); i++)
{
number = number * 10 +(s2[pos2 + i] - '0');
if (solver(s1, s2, pos1, pos2+i+1, diff + number))
{
return true;
}
}
return false;
}
// If ‘DIFF’ is negative, we know ‘S1’ has wildcards, so we'll match one wildcard of ‘S1’ with one ‘S2’ character and recurse.
if (diff < 0)
{
return pos2 < s2.length() && solver(s1, s2, pos1, pos2 + 1, diff + 1);
}
// If ‘DIFF’ is positive, then ‘S2’ has wildcards and hence does the same for ‘S2’.
if (diff > 0)
{
return pos1 < s1.length() && solver(s1, s2, pos1 + 1, pos2, diff - 1);
}
// Next, if both current characters of ‘S1’ and ‘S2’ are alphabets and the debt of both strings is zero, then we compare both. If they are equal, we recurse ahead.
return pos1 < s1.size() && pos2 < s2.size() && s1[pos1] == s2[pos2] && solver(s1, s2, pos1 + 1, pos2 + 1, diff);
}
// Function from where main DFS will be called.
bool encodedString(string &s1, string &s2)
{
return solver(s1, s2, 0, 0, 0);
}
int main()
{
string s1, s2;
cout << "Enter the first encoded string: ";
cin >> s1;
cout << "Enter the second encoded string: ";
cin >> s2;
bool haveSameParentString = encodedString(s1, s2);
if (haveSameParentString)
{
cout << "Both the encoded strings have the same parent string.";
}
else
{
cout << "Both the encoded strings do not have any parent string.";
}
return 0;
}
You can also try this code with Online C++ Compiler
There’s a story behind solving any such advanced problem. I personally enjoyed a lot building the solution and documenting it for you guys. And I hope you had a good time solving this problem. Head over to Coding Ninjas Studio to read other such exciting blogs. Coding Ninja has recently announced a course in DSA practice for Interviews. Do check it out.