Code360 powered by Coding Ninjas X Naukri.com. Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1.
Introduction
2.
Problem Statement
2.1.
Example
3.
1. Simple Approach
3.1.
Implementation
3.2.
C
3.3.
C++
3.4.
Java
3.5.
Python
3.6.
Javascript
3.7.
PHP
3.8.
C#
4.
2. Recursive Approach
4.1.
Algorithm
4.2.
Implementation
4.3.
C
4.4.
C++
4.5.
Java
4.6.
Python
4.7.
Javascript
4.8.
PHP
4.9.
C#
5.
3. Dynamic Programming Approach
5.1.
Algorithm
5.2.
Implementation
5.3.
C
5.4.
C++
5.5.
Java
5.6.
Python
5.7.
Javascript
5.8.
PHP
5.9.
C#
6.
Frequently asked questions
6.1.
How do you find the longest common substring?
6.2.
What is the difference between LCS and longest common substring?
6.3.
What is the recursive method for the longest common substring?
6.4.
What is the complexity of finding longest common substring?
6.5.
What is the naive approach to the longest common substring?
7.
Conclusion
Last Updated: Jul 5, 2024
Medium

Longest Common Substring

Author Nikunj Goel
4 upvotes

Introduction

In string processing and algorithmic design, the concept of finding commonalities between strings holds profound significance. One such fundamental problem is that of identifying the longest common substring (LCS). An LCS between two or more strings is a sequence that appears in the same order in each string, yet it may not be contiguous.

Longest Common Substring

Problem Statement

In this problem, we will be given two strings. We have to find the length of the longest substring, which is common in the two given strings. 

Example

Assume the two given strings are:

s1 = “qdef”

s2 = “defq”

“q”, “d”,  “e”, “f”, “de”, “ef”, and “def” are the common substrings between s1 and s2. “def” is the longest common substring. So, the length of the longest common substring of s1 and s2 will be “3”.

1. Simple Approach

The simplest approach is to generate all possible substrings of one string. Then check if each of them is a substring of the other string. This approach has a time complexity of O(n^3). n is the length of the longest input string. It is not efficient for large input sizes. It is mainly used for small instances or as a baseline for comparison.

Implementation

  • C
  • C++
  • Java
  • Python
  • Javascript
  • PHP
  • C#

C

#include <stdio.h>
#include <string.h>

// Simple approach
int LCSubstr(const char* str1, const char* str2) {
int result = 0;
int len1 = strlen(str1);
int len2 = strlen(str2);

for (int i = 0; i < len1; i++) {
for (int j = 0; j < len2; j++) {
int k = 0;
while ((i + k) < len1 && (j + k) < len2 && str1[i + k] == str2[j + k]) {
k++;
}
if (k > result) {
result = k;
}
}
}
return result;
}

// Driver code
int main() {
const char* X = "qdef";
const char* Y = "defq";

printf("Length of Longest Common Substring is %d\n", LCSubstr(X, Y));
return 0;
}
You can also try this code with Online C Compiler
Run Code

C++

#include <iostream>
#include <string.h>
using namespace std;

//simple approach
int LCSubstr(string str1 ,string str2){
int result = 0;

/*
loop to find all substrings of string 1.
check if current substring matches.
maximize the lenght of common substring
*/
for (int i = 0; i < str1.length(); i++) {
for (int j = 0; j < str2.length(); j++) {
int k = 0;
while ((i + k) < str1.length() &&
(j + k) < str2.length() && str1[i + k] == str2[j + k]){
k = k + 1;
}

result = max(result, k);
}
}
return result;
}

// Driver code
int main(){
string X = "qdef";
string Y = "defq";

cout << "Length of Longest Common Substring is " << LCSubstr(X, Y);
return 0;
}
You can also try this code with Online C++ Compiler
Run Code

Java

public class Main {
// Simple approach
public static int LCSubstr(String str1, String str2) {
int result = 0;

for (int i = 0; i < str1.length(); i++) {
for (int j = 0; j < str2.length(); j++) {
int k = 0;
while (i + k < str1.length() && j + k < str2.length() && str1.charAt(i + k) == str2.charAt(j + k)) {
k++;
}
result = Math.max(result, k);
}
}
return result;
}

// Driver code
public static void main(String[] args) {
String X = "qdef";
String Y = "defq";

System.out.println("Length of Longest Common Substring is " + LCSubstr(X, Y));
}
}
You can also try this code with Online Java Compiler
Run Code

Python

def LCSubstr(str1, str2):
result = 0

for i in range(len(str1)):
for j in range(len(str2)):
k = 0
while i + k < len(str1) and j + k < len(str2) and str1[i + k] == str2[j + k]:
k += 1
result = max(result, k)

return result

# Driver code
if __name__ == "__main__":
X = "qdef"
Y = "defq"

print(f"Length of Longest Common Substring is {LCSubstr(X, Y)}")
You can also try this code with Online Python Compiler
Run Code

Javascript

function LCSubstr(str1, str2) {
let result = 0;

for (let i = 0; i < str1.length; i++) {
for (let j = 0; j < str2.length; j++) {
let k = 0;
while (i + k < str1.length && j + k < str2.length && str1[i + k] === str2[j + k]) {
k++;
}
result = Math.max(result, k);
}
}
return result;
}

// Driver code
const X = "qdef";
const Y = "defq";

console.log("Length of Longest Common Substring is " + LCSubstr(X, Y));
You can also try this code with Online Javascript Compiler
Run Code

PHP

<?php
// Simple approach
function LCSubstr($str1, $str2) {
$result = 0;

for ($i = 0; $i < strlen($str1); $i++) {
for ($j = 0; $j < strlen($str2); $j++) {
$k = 0;
while (($i + $k) < strlen($str1) && ($j + $k) < strlen($str2) && $str1[$i + $k] == $str2[$j + $k]) {
$k++;
}
$result = max($result, $k);
}
}
return $result;
}

// Driver code
$X = "qdef";
$Y = "defq";

echo "Length of Longest Common Substring is " . LCSubstr($X, $Y);
?>
You can also try this code with Online PHP Compiler
Run Code

C#

using System;

public class Program
{
// Simple approach
public static int LCSubstr(string str1, string str2)
{
int result = 0;

for (int i = 0; i < str1.Length; i++)
{
for (int j = 0; j < str2.Length; j++)
{
int k = 0;
while (i + k < str1.Length && j + k < str2.Length && str1[i + k] == str2[j + k])
{
k++;
}
result = Math.Max(result, k);
}
}
return result;
}

// Driver code
public static void Main()
{
string X = "qdef";
string Y = "defq";

Console.WriteLine("Length of Longest Common Substring is " + LCSubstr(X, Y));
}
}


Output

C++ code of simple approach

2. Recursive Approach

This problem can be solved using Recursion. We can consider one by one each character in the two strings. The following two cases will arise:

1. The two characters from the two strings are the same: 

In this case, increase the length of the longest common substring and move to the next characters in both strings.

2. The two characters from the two strings are different:

In this case, take the maximum of the current length of the longest common string and the two lengths of the longest common substring from two cases - one by moving to the next character in only the first string and another by moving to the next character in only the second string.

Let’s understand this algorithm in detail and its C++ code.

Algorithm

Step 1. Create a recursive function “longestCommonSubstringLength()” which takes input

  1. s1 - First given string
  2. s2 - Second given string
  3. i - It denotes the current length of s1
  4. j - It denotes the current length of s2
  5. maxLen - For storing the length of the longest common substring
     

Step 2. Write the base case of the recursive function. If i <= 0 or j <= 0 i.e. if the current length of any of the string becomes 0, then simply return ‘maxLen’.

Step 3.  Now check if the characters at index ‘i-1’ in the first string and ‘j-1’ in the second string are the same, then increase the value of ‘maxLen’ by one and call the function for the smaller sub-problem.

Step 4.  Else If the two characters are not the same, then update the value as the maximum of ‘maxLen’ and the value returned by calling the recursive function for two smaller sub-problems.

Step 5. Finally, return ‘maxLen,’ i.e., the length of the longest common substring.

Implementation

  • C
  • C++
  • Java
  • Python
  • Javascript
  • PHP
  • C#

C

#include <stdio.h>
#include <string.h>

int longestCommonSubstringLength(char *s1, char *s2, int i, int j, int maxLen) {
// Base Case
if (i <= 0 || j <= 0) {
return maxLen;
}

int maxLen1 = maxLen;

// If the characters are same in both the string then increase maxLen by 1
if (s1[i - 1] == s2[j - 1]) {
maxLen1 = longestCommonSubstringLength(s1, s2, i - 1, j - 1, maxLen + 1);
}

int maxLen2 = longestCommonSubstringLength(s1, s2, i, j - 1, 0);
int maxLen3 = longestCommonSubstringLength(s1, s2, i - 1, j, 0);

return (maxLen1 > maxLen2) ? ((maxLen1 > maxLen3) ? maxLen1 : maxLen3) : ((maxLen2 > maxLen3) ? maxLen2 : maxLen3);
}

// Driver code
int main() {
char s1[] = "qdef";
char s2[] = "defq";
int maxLen = 0;

printf("The Length of Longest Common Substring is %d\n", longestCommonSubstringLength(s1, s2, strlen(s1), strlen(s2), maxLen));

return 0;
}
You can also try this code with Online C Compiler
Run Code

C++

#include <bits/stdc++.h>
using namespace std;

int longestCommonSubstringLength(string s1, string s2, int i, int j, int maxLen){
// Base Case
if (i <= 0 || j <= 0) {
return maxLen;
}

int maxLen1 = maxLen;

// If the characters are same in both the string then increase maxLen by 1
if (s1[i - 1] == s2[j - 1]) {
maxLen1 = longestCommonSubstringLength(s1, s2, i - 1, j - 1, maxLen + 1);
}

int maxLen2 = longestCommonSubstringLength(s1, s2, i, j - 1, 0);
int maxLen3 = longestCommonSubstringLength(s1, s2, i - 1, j, 0);

return max(maxLen1, max(maxLen2, maxLen3));
}

int main(){
string s1 = "qdef";
string s2 = "defq";

int maxLen=0;

cout<<"The Length of Longest Common Substring is "<<
longestCommonSubstringLength(s1, s2, s1.length(), s2.length(), maxLen)<<endl;

return 0;
}
You can also try this code with Online C++ Compiler
Run Code

Java

public class Main {
public static int longestCommonSubstringLength(String s1, String s2, int i, int j, int maxLen) {
// Base Case
if (i <= 0 || j <= 0) {
return maxLen;
}

int maxLen1 = maxLen;

// If the characters are same in both the string then increase maxLen by 1
if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
maxLen1 = longestCommonSubstringLength(s1, s2, i - 1, j - 1, maxLen + 1);
}

int maxLen2 = longestCommonSubstringLength(s1, s2, i, j - 1, 0);
int maxLen3 = longestCommonSubstringLength(s1, s2, i - 1, j, 0);

return Math.max(maxLen1, Math.max(maxLen2, maxLen3));
}

// Driver code
public static void main(String[] args) {
String s1 = "qdef";
String s2 = "defq";
int maxLen = 0;

System.out.println("The Length of Longest Common Substring is " + longestCommonSubstringLength(s1, s2, s1.length(), s2.length(), maxLen));
}
}
You can also try this code with Online Java Compiler
Run Code

Python

def longestCommonSubstringLength(s1, s2, i, j, maxLen):
# Base Case
if i <= 0 or j <= 0:
return maxLen

maxLen1 = maxLen

# If the characters are same in both the string then increase maxLen by 1
if s1[i - 1] == s2[j - 1]:
maxLen1 = longestCommonSubstringLength(s1, s2, i - 1, j - 1, maxLen + 1)

maxLen2 = longestCommonSubstringLength(s1, s2, i, j - 1, 0)
maxLen3 = longestCommonSubstringLength(s1, s2, i - 1, j, 0)

return max(maxLen1, maxLen2, maxLen3)

# Driver code
if __name__ == "__main__":
s1 = "qdef"
s2 = "defq"
maxLen = 0

print("The Length of Longest Common Substring is", longestCommonSubstringLength(s1, s2, len(s1), len(s2), maxLen))
You can also try this code with Online Python Compiler
Run Code

Javascript

function longestCommonSubstringLength(s1, s2, i, j, maxLen) {
// Base Case
if (i <= 0 || j <= 0) {
return maxLen;
}

let maxLen1 = maxLen;

// If the characters are same in both the string then increase maxLen by 1
if (s1[i - 1] === s2[j - 1]) {
maxLen1 = longestCommonSubstringLength(s1, s2, i - 1, j - 1, maxLen + 1);
}

const maxLen2 = longestCommonSubstringLength(s1, s2, i, j - 1, 0);
const maxLen3 = longestCommonSubstringLength(s1, s2, i - 1, j, 0);

return Math.max(maxLen1, maxLen2, maxLen3);
}

// Driver code
const s1 = "qdef";
const s2 = "defq";
const maxLen = 0;

console.log("The Length of Longest Common Substring is", longestCommonSubstringLength(s1, s2, s1.length, s2.length, maxLen));
You can also try this code with Online Javascript Compiler
Run Code

PHP

<?php
function longestCommonSubstringLength($s1, $s2, $i, $j, $maxLen) {
// Base Case
if ($i <= 0 || $j <= 0) {
return $maxLen;
}

$maxLen1 = $maxLen;

// If the characters are same in both the string then increase maxLen by 1
if ($s1[$i - 1] == $s2[$j - 1]) {
$maxLen1 = longestCommonSubstringLength($s1, $s2, $i - 1, $j - 1, $maxLen + 1);
}

$maxLen2 = longestCommonSubstringLength($s1, $s2, $i, $j - 1, 0);
$maxLen3 = longestCommonSubstringLength($s1, $s2, $i - 1, $j, 0);

return max($maxLen1, $maxLen2, $maxLen3);
}

// Driver code
$s1 = "qdef";
$s2 = "defq";
$maxLen = 0;

echo "The Length of Longest Common Substring is " . longestCommonSubstringLength($s1, $s2, strlen($s1), strlen($s2), $maxLen);
?>
You can also try this code with Online PHP Compiler
Run Code

C#

using System;

public class Program {
public static int longestCommonSubstringLength(string s1, string s2, int i, int j, int maxLen) {
// Base Case
if (i <= 0 || j <= 0) {
return maxLen;
}

int maxLen1 = maxLen;

// If the characters are same in both the string then increase maxLen by 1
if (s1[i - 1] == s2[j - 1]) {
maxLen1 = longestCommonSubstringLength(s1, s2, i - 1, j - 1, maxLen + 1);
}

int maxLen2 = longestCommonSubstringLength(s1, s2, i, j - 1, 0);
int maxLen3 = longestCommonSubstringLength(s1, s2, i - 1, j, 0);

return Math.Max(maxLen1, Math.Max(maxLen2, maxLen3));
}

// Driver code
public static void Main() {
string s1 = "qdef";
string s2 = "defq";
int maxLen = 0;

Console.WriteLine("The Length of Longest Common Substring is " + longestCommonSubstringLength(s1, s2, s1.Length, s2.Length, maxLen));
}
}


Output

Longest Common Substring in C++


Time complexity: O(3^(M+N))

As the function is called 3 times, and it checks for all the characters, i.e. M + N, 

where M and N are the lengths of the respective strings. A recursive tree is created.

Space complexity: O(M+N)

The recursive function takes a space of N+M for auxiliary stack space during recursion.

3. Dynamic Programming Approach

In the recursive approach, there are a lot of duplicate calls for the same state i.e. for the same values of i, j, and maxLen, which leads to exponential time complexity.

Dynamic Programming Approach

The above solution can be modified, and the results can be stored in a table to avoid the repetitive calling of recursive functions for the same state. Thus dynamic programming approach will significantly reduce the time complexity as compared to the recursive solution.

The algorithm will be similar to the above recursive approach. The only difference is that instead of repetitively calling the recursive function, we will store the length of the longest common substring for each state in a table.

Algorithm

  • Create a matrix dp of size (m+1) x (n+1).
  • Initialize the first row and the first column of the matrix dp with zeros.
  • Initialize two variables: maxLength to store the length of the longest common substring and endIndex to store the ending index of the longest common substring.
  • Start a nested loop.
    • If i == 0 or j == 0, then dp[i][j] = 0.
    • If str1[i] == str2[j], then dp[i][j] = 1 + dp[i – 1][j – 1].
  • Keep track of the maximum value and return it.

Implementation

  • C
  • C++
  • Java
  • Python
  • Javascript
  • PHP
  • C#

C

#include <stdio.h>
#include <string.h>

int lcsDP(char* string_1, char* string_2, int M, int N) {
int dp[M + 1][N + 1];
int count = 0;

for (int i = 0; i <= M; i++) {
for (int j = 0; j <= N; j++) {
if (i == 0 || j == 0)
dp[i][j] = 0;
else if (string_1[i - 1] == string_2[j - 1]) {
dp[i][j] = dp[i - 1][j - 1] + 1;
count = (count > dp[i][j]) ? count : dp[i][j];
} else
dp[i][j] = 0;
}
}
return count;
}

// Driver code
int main() {
char str1[] = "qdef";
char str2[] = "defq";

printf("Length of Longest Common Substring is %d\n", lcsDP(str1, str2, strlen(str1), strlen(str2)));
return 0;
}
You can also try this code with Online C Compiler
Run Code

C++

#include <iostream>
#include <string.h>
using namespace std;

int lcsDP(string string_1, string string_2, int M, int N) {
// create a 2-D array named dp and intitialise all the values as zero.

int dp[M + 1][N + 1];

int count = 0;
/*
intitialise a nested loop
Check the base Case
*/
for (int i = 0; i <= M; i++) {
for (int j = 0; j <= N; j++) {
if (i == 0 || j == 0)
dp[i][j] = 0;
else if (string_1[i - 1] == string_2[j - 1]) {
dp[i][j] = dp[i - 1][j - 1] + 1;
count = max(count, dp[i][j]);
}
else
dp[i][j] = 0;
}
}
return count;
}

// Driver code
int main() {
string str1 = "qdef";
string str2 = "defq";

cout << "Length of Longest Common Substring is " << lcsDP(str1, str2, str1.length(),str2.length());
return 0;
}
You can also try this code with Online C++ Compiler
Run Code

Java

public class Main {
public static int lcsDP(String string_1, String string_2, int M, int N) {
int[][] dp = new int[M + 1][N + 1];
int count = 0;

for (int i = 0; i <= M; i++) {
for (int j = 0; j <= N; j++) {
if (i == 0 || j == 0)
dp[i][j] = 0;
else if (string_1.charAt(i - 1) == string_2.charAt(j - 1)) {
dp[i][j] = dp[i - 1][j - 1] + 1;
count = Math.max(count, dp[i][j]);
} else
dp[i][j] = 0;
}
}
return count;
}

// Driver code
public static void main(String[] args) {
String str1 = "qdef";
String str2 = "defq";

System.out.println("Length of Longest Common Substring is " + lcsDP(str1, str2, str1.length(), str2.length()));
}
}
You can also try this code with Online Java Compiler
Run Code

Python

def lcsDP(string_1, string_2, M, N):
dp = [[0] * (N + 1) for _ in range(M + 1)]
count = 0

for i in range(1, M + 1):
for j in range(1, N + 1):
if string_1[i - 1] == string_2[j - 1]:
dp[i][j] = dp[i - 1][j - 1] + 1
count = max(count, dp[i][j])
else:
dp[i][j] = 0

return count

# Driver code
if __name__ == "__main__":
str1 = "qdef"
str2 = "defq"

print("Length of Longest Common Substring is", lcsDP(str1, str2, len(str1), len(str2)))
You can also try this code with Online Python Compiler
Run Code

Javascript

function lcsDP(string_1, string_2, M, N) {
let dp = new Array(M + 1).fill(0).map(() => new Array(N + 1).fill(0));
let count = 0;

for (let i = 1; i <= M; i++) {
for (let j = 1; j <= N; j++) {
if (string_1[i - 1] === string_2[j - 1]) {
dp[i][j] = dp[i - 1][j - 1] + 1;
count = Math.max(count, dp[i][j]);
} else {
dp[i][j] = 0;
}
}
}
return count;
}

// Driver code
const str1 = "qdef";
const str2 = "defq";

console.log("Length of Longest Common Substring is", lcsDP(str1, str2, str1.length, str2.length));
You can also try this code with Online Javascript Compiler
Run Code

PHP

<?php
function lcsDP($string_1, $string_2, $M, $N) {
$dp = array_fill(0, $M + 1, array_fill(0, $N + 1, 0));
$count = 0;

for ($i = 1; $i <= $M; $i++) {
for ($j = 1; $j <= $N; $j++) {
if ($string_1[$i - 1] === $string_2[$j - 1]) {
$dp[$i][$j] = $dp[$i - 1][$j - 1] + 1;
$count = max($count, $dp[$i][$j]);
} else {
$dp[$i][$j] = 0;
}
}
}

return $count;
}

// Driver code
$str1 = "qdef";
$str2 = "defq";

echo "Length of Longest Common Substring is " . lcsDP($str1, $str2, strlen($str1), strlen($str2)) . "\n";
?>
You can also try this code with Online PHP Compiler
Run Code

C#

using System;

public class Program {
public static int lcsDP(string string_1, string string_2, int M, int N) {
int[,] dp = new int[M + 1, N + 1];
int count = 0;

for (int i = 0; i <= M; i++) {
for (int j = 0; j <= N; j++) {
if (i == 0 || j == 0)
dp[i, j] = 0;
else if (string_1[i - 1] == string_2[j - 1]) {
dp[i, j] = dp[i - 1, j - 1] + 1;
count = Math.Max(count, dp[i, j]);
} else
dp[i, j] = 0;
}
}
return count;
}

// Driver code
public static void Main() {
string str1 = "qdef";
string str2 = "defq";

Console.WriteLine("Length of Longest Common Substring is " + lcsDP(str1, str2, str1.Length, str2.Length));
}
}


Output

Dynamic Programming Approach in C++

Time Complexity: O(M*N) 

where M and N are the lengths of the given string1 and string2. The nested loop for filling the entries of the 2-D dp table requires a time on O(M*N).

Space Complexity: O(M*N) 

where M and N are the lengths of the given string1 and string2. 2-D dp array of size [M+1][N+1] is required.

Frequently asked questions

How do you find the longest common substring?

By using algorithms like dynamic programming, iterate through possible substrings of two strings to find the longest sequence present in both.

What is the difference between LCS and longest common substring?

LCS (Longest Common Subsequence) considers non-contiguous elements, while longest common substring requires elements to be contiguous.

What is the recursive method for the longest common substring?

Recursively compare characters of two strings. If they match, recursively check the next characters; otherwise, start checking from the next positions.

What is the complexity of finding longest common substring?

Finding the longest common substring can be done in O(m*n) time using dynamic programming. For each substring of both strings, determine the length of the longest common suffix and record that length in a table. Use the previously computed values to determine the answer of larger suffixes in O(1) time.

What is the naive approach to the longest common substring?

The naive approach for finding the longest common substring between two strings involves comparing all possible substrings of both strings. It has a time complexity of O(m^2 * n), making it impractical for large inputs.

Conclusion

In this article, we discussed the Longest Common Substring Problem, the various approaches to solve this problem programmatically, and the time and space complexities. We find out how to use extra space and drastically reduce the time complexity from exponential to polynomial. If you want to practice this problem, then you can visit Code360.
Recommended problems -

If you think this blog helped you share it with your friends! Refer to the DSA C++ course for more information.

Live masterclass