Last Updated: 1 Mar, 2021

Longest Common Substring

Hard
Asked in companies
DunzoCommvault

Problem statement

You have been given two strings “str1” and “str2”. You have to find the length of the longest common substring.

A string “s1” is a substring of another string “s2” if “s2” contains the same characters as in “s1”, in the same order and in continuous fashion also.

For example :
If “str1” = “abcjklp” and “str2” = “acjkp”  then the output will be 3.

Explanation : The longest common substring is “cjk” which is of length 3.
Input Format:
The first line contains an integer ‘T’ which denotes the number of test cases or queries to be run. Then the test cases are as follows.

The first and only line of each test case contains two space-separated strings str1 and str2, respectively.

Output Format:
For each test case, print the length of the longest common substring among the two strings.

Print the output of each test case in a separate line.
Constraints:
1 <= T <= 100
1 <= |str1|, |str2| <= 100

where ‘T’ is the number of test cases and |str| is the length of the string str.

Approaches

01 Approach

The basic idea is to start slicing the first string into the substring and match those substrings in the second string.

 

  • Take a variable “ans” initialized to 0, which keeps track of the longest common substring among the two strings.
  • Run a loop to traverse the first string “str1”. In this way,we can get the starting index of the substrings.
  • Run another loop to traverse the second string “str2” to match characters of string “str2”.
  • Take a variable, say ‘k’ initialized to 0 that will act as an iterator to match the strings.
  • Now start matching characters of strings “str1” and “str2” i.e from index [i + k] and index [j + k] where ‘i’ is the starting index of the substring of first string and ‘j’ is the starting index of the substring of the second string.
  • If they match, increment the value of ‘k’ and check for next characters of both strings and update the value of “ans” variable.
  • If they do not match, change the value of ‘k’ to zero and move a step ahead in the second string.

02 Approach

The basic idea is to recursively try and match characters from str1 to characters of str2 and vice versa also.

  • Create a “count” variable initialized to 0 to get the length of the longest common substring.
  • Try to match the last characters of both strings. Let ‘i’ be the last index for “str1” and ‘j’ be the last index of “str2”, so match str1[i] and str2[j].
  • If they match, increment “count” by 1 and try to match the preceding characters.
  • If they do not match, make a recursive call to compare str1[i -1] with str2[j] and compare str1[i] with str2[j -1] and we should choose the option which maximises our answer i.e. return max(“count” obtained by matching str1[i-1] with str2[j] and “count” obtained by matching str1[i-1] with str2[j]).
  • Keep repeating these steps until we reach the starting of the two strings.

03 Approach

Let us observe the recursion tree for the case where str1 = “xyz” and str2 = “xbyz”

![Example](https://ninjasfiles.s3.amazonaws.com/asset_0000000000000265_1614421037_recursion tree.png)

The repetition of such sub-problems suggests that we can use dynamic programming to optimize our approach.

The key idea behind a dynamic programming approach is to use memoization, i.e. we’ll save the result of our sub-problem in a matrix so that it can be used later on.

Let dp[ i ][ j ] be our dynamic programming matrix to store the length of longest common substring of string “str1” and string “str2”.

  1. Before making a call to match index (i,j), we should check if we already have a solution to that set of the index (i,j) to avoid any duplicate function calls.
  2. If we have already a solution then we return the solution else we will do: 
  3. If ( i == 0  or j == 0) it means both strings are empty then, the answer will be 0.
  4. Now if ( str1[i-1] == str2[j-1] ) which means the last character of both string matches. Make a recursive call to match the preceding characters and increment the answer by 1.
  5. Now, we should maximize our answer and check max( matching str1[i-1] and str2[j] , matching str1[i] and str2[j-1]).
  6. Store the answer in the “dp” table.

04 Approach

The idea is to find the length of the longest common suffix for all substrings of both strings and store these lengths in a table (LCSuff) using the relation :

 

LCSuff[i][j] = | LCSuff[i-1][j-1] + 1       (if str1[i-1] == str2[j-1])

                     | 0                                   (otherwise)

 

where,

 0 <= i – 1 < N,    where N is the length of string str1

0 <= j – 1 < M,    where M is the length of string str2

 

  • Take a variable “ans” initialized to 0.
  • At each step of filling the cells of the table “LCSuff”, update result as : 

result = max(result, LCSuff[i][j])