Last Updated: 14 Mar, 2021

Longest Common Substring

Moderate
Asked in companies
Wells FargoShareChatAdobe

Problem statement

You are given two strings, 'str1' and 'str2'. You have to find the length of the longest common substring.


A substring is a continuous segment of a string. For example, "bcd" is a substring of "abcd", while "acd" or "cda" are not.


Example:
Input: ‘str1’ = “abcjklp” , ‘str2’ = “acjkp”.

Output: 3

Explanation:  The longest common substring between ‘str1’ and ‘str2’ is “cjk”, of length 3.
Input Format:
The First line of the test case contains the string ‘str1’.

The Second line contains the string ‘str2’.
Output format:
Return an integer representing the length of the longest common substring.
Note:
You don’t need to print anything. Just implement the given function.

Approaches

01 Approach

The basic idea is to recursively try and match characters from ‘str1’ to characters of ‘str2’ and vice versa also.

 

The steps are as follows:

 

  • Create a ‘count’ variable initialized to 0 to get the length of the longest common substring.
  • Try to match the last characters of both strings. Let ‘i’ be the last index for ‘str1’ and ‘j’ be the last index of ‘str2’, so match ‘str1’[i] and ‘str2’[j].
  • If they match, increment ‘count’ by 1 and try to match the preceding characters.
  • If they do not match, make a recursive call to compare ‘str1’[i -1] with ‘str2’[j] and compare ‘str1’[i] with ‘str2’[j -1], and we should choose the option which maximizes our answer i.e. return max(“previous ‘count’”, ‘count’ obtained by matching ‘str1’[i-1] with ‘str2’[j] and ‘count’ obtained by matching 'str1’[i-1] with ‘str2’[j]).
  • Keep repeating these steps until we reach the start of the two strings.

02 Approach

The basic idea is to start slicing the first string into the substring and match those substrings in the second string.

 

The steps are as follows:

 

  • Take a variable ‘ans’ initialized to 0, which keeps track of the longest common substring among the two strings.
  • Run a loop to traverse the first string, 'str1’. In this way, we can get the starting index of the substrings.
  • Run another loop to traverse the second string 'str2’ to match the characters of string 'str2’.
  • Take a variable, say ‘k’, initialized to 0, that will act as an iterator to match the strings.
  • Now start matching characters of strings 'str1’ and 'str2’ i.e, from the index ['i' + ‘k’] and index ['j' + ‘k’] where ‘i’ is the starting index of the substring of the first string and ‘j’ is the starting index of the substring of the second string.
  • If they match, increment the value of ‘k’ and check for the next characters of both strings and update the value of ‘ans’ variable.
  • If they do not match, change the value of ‘k’ to zero and move a step ahead in the second string.

03 Approach

The basic idea of this approach is to solve the problem iteratively.

 

Let ‘dp’[i] [j] be our dynamic programming matrix to store the length of LCS of substring ‘str1’[0: i-1] and ‘str2’[0:j-1].

 

Now, consider the following steps:

Start traversing the ‘str1’ using a variable 'i'.
Create a nested loop and start traversing the ‘str2' using a variable ‘j’.

 

If either ‘i’ or ‘j’ is zero then ‘dp’[i][j] = 0 because one of the strings is empty.
If ( ‘str1'[i-1] == ‘str2'[j-1] ), which means the current character of both strings matches, then it is optimal to include the current character in the LCS. Therefore, find the LCS of the remaining characters of both strings. ‘dp’[i][j] = ‘dp’[i-1][j-1] + 1


Otherwise, we can not have both ‘str1'[i-1] and ‘str2'[j-1] at the end of the LCS, which means one of them must be ignored. Now there are two possibilities, either we ignore ‘str1'[i-1] and get the LCS of the rest of the characters and vice versa. We should choose the option which maximizes the LCS i.e. ‘dp’[i][j] = max(‘dp’[i-1][j] , ‘dp’[i][j-1])

 

Now we can get the length of LCS from ‘dp’[n][m] where ‘n’ and ‘m’ are the length of ‘str1’ and ‘str2’ respectively.