Code360 powered by Coding Ninjas X Naukri.com. Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1.
Introduction
2.
Brute Force Approach
2.1.
Algorithm
2.2.
C++
2.2.1.
Algorithm Complexity: 
3.
Top-Down Approach
3.1.
Algorithm 
3.2.
C++
3.2.1.
Algorithm Complexity:
4.
Bottom-Up Approach
4.1.
Algorithm 
4.2.
C++
4.2.1.
Algorithm Complexity: 
5.
Frequently Asked Questions
5.1.
What is the Longest Repeating Subsequence Problem?
5.2.
How do you find the longest repeating substring?
5.3.
What is the longest repeated subsequence recurrence?
5.4.
Which can be solved using the longest common subsequence problem?
6.
Conclusion
Last Updated: Mar 27, 2024
Medium

Longest Repeating Subsequence

Author Riya
1 upvote
gp-icon
Data structures & algorithms (Beginner to Intermediate)
Free guided path
13 chapters
99+ problems
gp-badge
Earn badges and level up

Introduction

This blog will discuss the various approaches to solve the Longest Repeating Subsequence problem. Before jumping into the problem, let’s first recall what a subsequence is and then understand the Longest Repeating Subsequence problem.

longest repeating subsequence

A subsequence of a string is generated by deleting zero, one, or more characters of the original string without disturbing the relative positions of the remaining characters.

In this problem, we will be given a string. We have to find the length of the longest subsequence occurring at least two times (i.e., the subsequence is repeated in the given string)  in the given string such that the two subsequences shouldn’t contain any character with the same index in the original string.

Let’s take an example to understand it further:

Let the given string be S= “XXYZYVWW” and assume 0-based indexing. 

In the above string, we can see that the repeating subsequences are ‘X,’ ‘Y,’ and ‘W,’ “XYW.” So, the longest repeating subsequence will be “XYW,” and the length of the longest Repeating Subsequence will be 3. We can see that there are two occurrences of the subsequence “XYZ,” one will be the subsequence formed by the 0th, 2nd, and 6th characters (shown in blue color in the table below), and the other will be the subsequence formed by 1st,  4th, and 7th characters (shown in red color in the table below).

X

X

Y

Z

Y

V

W

W

 

If we understand this question carefully, we can see that this question is just a variation of the Longest Common Subsequence Problem (LCS). Here, we have to find the length of the longest common subsequence of and where is the given string, with the condition that while checking for the same characters in the two strings, we have to take care that the two same characters shouldn’t be at the same index in the given string.

Brute Force Approach

Let’s assume that the given string is S, and we have to find the Longest Repeating Subsequence in S. As discussed above, this problem is a modification of the Longest Common Subsequence Problem. Now this problem is reduced to find the LCS (SS) with the condition that not any character in the resulting subsequence should be at the same index in both the S, where LCS (s1, s2) is the function to find the length of the longest common subsequence of the two strings s1 and s2.

So, longestRepeatingSubsequence(S) = LCS (SS) with a restriction that the resulting subsequence shouldn’t contain any character with the same index in their corresponding original string.

We can understand this by taking an example:

Let S= “XXYZYVWW”

LCS(S, S) = S, as the whole string is the same but if we consider the restriction that is written above then, the two ‘X’s shown in red, two ‘Y’s shown in blue and two ‘W’s shown in orange will be in the common subsequence from the two strings. Thus the LCS(S, S) with the condition that not any character in the resulting subsequence should be at the same index in both the S,  will be “XYW”. We can see that each character in “XYW” is at different indexes in both the S as shown below.

S= “XXYZYVWW”

S= “XXYZYVWW

 

Thus the code for this problem will be similar to the LCS problem. We will find LCS(S, S) with the above described condition. So, we need to consider each character in the two strings one by one. (Please note that the two strings are same here)

So, on considering one by one each character in the two strings, two situations will arise:

  1. The two characters are identical and are not occurring at the same index in their respective strings.
  2. The two characters either occur at the same index in their respective strings or differ.

If the first condition arises, we will increase the longest repeating subsequence by one and move to the next character of both strings. If the second condition occurs, we will take the maximum of the two cases - one by moving to the next character in the first string and the other by moving to the next character in the second string.

Algorithm

Step 1. Create a recursive function ‘longestRepeatingSubsequence()’ that will accept the below parameters -

  • ‘str’: The given string for which we have to calculate the length of the longest repeating subsequence.
  • ‘n’: Length of the given string 
  • ‘pos1’: The variable denoting the current character of ‘str1’.
  • ‘pos2’: The variable denoting the current character of ‘str2’.
     

Step 2. Then write the base case of the recursive function i.e., if ‘pos1’ or ‘pos2’ reaches the end of the string, then we can’t add more characters to the longest repeating subsequence, so return 0.

Step 3. Now check for the two conditions described in the brute force approach section and do accordingly.

  • C++

C++

#include <bits/stdc++.h>

using namespace std;

int longestRepeatingSubsequence(string str, int pos1, int pos2, int n)

{



        /*

           base case

           if pos1 or pos2 reaches n, it means there are no further characters

           to consider, so return 0

       */       

        if(pos1 == n || pos2 == n) return 0;



        /*

           If the two characters are equal and not occurring at the same index,

           then increase the length of the longest repeating subsequence by 1

        */

        if(pos1 != pos2 && str[pos1] == str[pos2]) return 1+longestRepeatingSubsequence(str, pos1 + 1, pos2 + 1, n);

        else return max(longestRepeatingSubsequence(str, pos1 + 1, pos2, n), longestRepeatingSubsequence(str, pos1, pos2 + 1, n));

}



int main()

 {

string str="XXYZYVWW";

cout<<"The length of the Longest Repeating Subsequence in the given  string is: "<<longestRepeatingSubsequence(str, 0, 0, 8)<<endl;

return 0;

 }


Output:

The length of the Longest Repeating Subsequence in the given string is: 3

Algorithm Complexity: 

Time Complexity: O(2 ^ N)

In every recursive call ‘longestRepeatingSubsequence()’, we are making two more calls to the same function, so calls are increasing at a rate of 2 until ‘pos1’ or ‘pos2’ reaches N. There will be, at most, ‘2 ^ N’ calls in this code. Therefore the overall time complexity will be O(2 ^ N), where ‘N’ is the length of the given string.

Space Complexity: O(1) 

As we are using constant space, the overall space complexity will be O(1).

Also check out - Substr C++

Get the tech career you deserve, faster!
Connect with our expert counsellors to understand how to hack your way to success
User rating 4.7/5
1:1 doubt support
95% placement record
Akash Pal
Senior Software Engineer
326% Hike After Job Bootcamp
Himanshu Gusain
Programmer Analyst
32 LPA After Job Bootcamp
After Job
Bootcamp

Top-Down Approach

In the above approach, there are a lot of duplicate calls for the same state ‘pos1’ and ‘pos2’, which leads to exponential complexity. So, we will use a dynamic programming algorithm to store the states to reduce the time complexity.

We can avoid calling a recursive function again for the same state by using a two-dimensional vector and storing the state in that vector. If we encounter the same state during execution, we will return from the vector instead of recomputing. This approach will drastically decrease the time complexity.  We will need to make some changes in the above recursive code to make it a top-down dynamic programming solution.

Top-Down Approach

Algorithm 

Step 1. Create a two-dimensional vector ‘dp’ of size (N + 1) x (N + 1), where ‘N’ is the length of the given string. Initialize all the elements of the vector with ‘-1’ and call the ‘longestRepeatingSubsequence()’ function, which will now have one extra argument - two-dimensional vector ‘dp’ other than those that were there in the recursive solution.

We have taken the size N x N as pos1 and pos2 are varying from 0 to N.

Step 2. Base case will be if pos1 = N or pos2 = N, then return 0. 

Step 3. Check if the value is present in the vector’ dp’. If yes, then return it.

Step 4. If the value is not present in the ‘dp’ vector, then call the recursive function as in the brute force solution, but here we have to do one more thing: at each step, we have to store the calculated result in our ‘dp’ vector.

  • C++

C++

#include <bits/stdc++.h>

using namespace std;

int longestRepeatingSubsequence(string str, int pos1, int pos2, int n, vector<vector<int>>& dp)

     {



         /*

             base case

             if pos1 or pos2 reaches n, it means there are no further

             characters to consider, so return 0

         */

         if(pos1 == n || pos2 == n) return 0;



         /*

             If this state is already computed, then directly return it from

             the “dp” vector

         */

         if(dp[pos1][pos2] != -1) return dp[pos1][pos2];

 

         if(pos1 != pos2 && str[pos1] == str[pos2]) return dp[pos1][pos2] = 1 + longestRepeatingSubsequence(str, pos1 + 1, pos2 + 1, n, dp);

         else return dp[pos1][pos2] = max(longestRepeatingSubsequence(str, pos1 + 1, pos2, n, dp),longestRepeatingSubsequence(str,pos1,pos2+1,n,dp));

     }



int main()

 {

string str = "XXYZYVWW";

vector<vector<int>> dp(9,vector<int>(9,-1));

cout<<"The length of the Longest Repeating Subsequence in the given string is: " <<longestRepeatingSubsequence(str, 0, 0, 8, dp)<<endl;

return 0;

 }


Output:

The length of the Longest Repeating Subsequence in the given string is: 3

Algorithm Complexity:

Time Complexity: O(N ^ 2)

There will be ‘N * N’ calls in a recursive function as we are using a 2D vector, and if the value is present in this array, then we are not making duplicate calls, so there will be utmost N * N combinations. Therefore the overall time complexity is O(N ^ 2), where ‘N’ is the length of the given string.

Space Complexity: O(N ^2)

As we are using a 2D vector of size ‘N x N’, the overall space complexity is O(N^2), where N is the length of the given string.

Bottom-Up Approach

In the above approach, the overhead of the internal stack is being used in recursion, and we can eliminate it by using an iterative approach. We will take a 2D array and will start populating it from the bottom ( 0th index ) to top ( Nth index ); hence by using this approach, we can make our algorithm more memory efficient. 

Algorithm 

Step 1. Create a function that will accept the below parameters:

  1. ‘str’: The given string for which we have to calculate the length of the longest repeating subsequence.
  2. ‘n’: Length of the given string.

Step 2. Create a 2D array ‘dp’ of size (N+1) x (N+1). 

Step 3. Run two for loops to consider all the characters one by one to find LCS(str, str) with the condition that no two characters in the resulting subsequence will have the same index in their corresponding original string, resulting subsequence will be our longest repeating subsequence.

Step 4: If i=0 or j=0, i.e if the length of the string will be 0, then dp[i][j]=0

Step 4: Then check if the two characters are not at the same index and are identical, then increase the length of the resulting subsequence by 1, else take the maximum of the two cases -  one by moving to the next character in the first string and the another by moving to the next character in the second string.

Step 5:  Return dp[N][N]

  • C++

C++

#include <bits/stdc++.h>

using namespace std;

int longestRepeatingSubsequence(string str, int n)

{

   int dp[n+1][n+1];

   for(int i = 0; i <= n; i++)

     {

         for(int j = 0; j <= n; j++)

          {

/*

                  if length of the string will be 0, then length

                  of longest repeating subsequence will be 0  

              */              

              if(i == 0 || j == 0) dp[i][j] = 0;



              /*

                  If the two characters are not at same position and are equal,

                  then we will increase the length of the longest repeating

                  subsequence by one, else we will take maximum of the two

                  cases as described in the algorithm section

              */                   

              else if(i != j && str[i-1] == str[j-1]) dp[i][j] = 1 + dp[i-1][j-1];

              else dp[i][j] = max(dp[i-1][j], dp[i][j-1]);

          }

     }

   return dp[n][n];

}



int main()

 {

string str="XXYZYVWW";

vector<vector<int>> dp(9, vector<int>(9, -1));

cout<<"The length of the Longest Repeating Subsequence in the given string is: " <<longestRepeatingSubsequence(str, 8) <<endl;

return 0;

 }


Output:

The length of the Longest Repeating Subsequence in the given string is: 3

Algorithm Complexity: 

Time Complexity: O(N ^ 2)

There will be ‘N * N’ calls in a 2D array using the nested ‘for’ loop. Therefore the overall time complexity is O(N ^2), where N is the length of the given string.

Space Complexity: O(N ^ 2)

As we are using a 2D array of size ‘N * N’, the overall space complexity is O(N^2), where N is the length of the given string.

Frequently Asked Questions

What is the Longest Repeating Subsequence Problem?

 The longest repeating subsequence is to find the length of the longest repeating subsequence in a given string such that the two subsequences don’t have the same original string character at the same position.

How do you find the longest repeating substring?

We can simply find the longest repeating substring by taking all the possible substrings and checking each one to see if the remaining string has an identical substring. Dynamic programming can address this substring problem in O(n^2) time. 

What is the longest repeated subsequence recurrence?

The longest repeated subsequence (LRS) recurrence is a dynamic programming formula used to find the length of the longest repeated subsequence within a given sequence.

Which can be solved using the longest common subsequence problem?

Data comparison tools like the diff-utility are built on the longest common subsequence problem, which also has uses in bioinformatics. The longest palindromic subsequence in a given string can be computed using the LCS. We only reverse the supplied string before looking for the LCS in both the supplied and the reversed string.

Conclusion

In this article, we discussed what the Longest Repeating Subsequence Problem is, discussed the various approaches to solve this problem programmatically, and discussed the time and space complexities. We find out how to use extra space and drastically reduce the time complexity from exponential to polynomial.  

Recommended Readings: 

You can refer to our Guided Path on Coding Ninjas Studio to upskill yourself in Data Structures and AlgorithmsCompetitive ProgrammingSystem Design, and many more!

Start practising Longest Palindromic Subsequence problems to know frequently asked interview questions and land your dream job.

Also, check out some other topics such as,Operating SystemsComputer Networks, DBMS, etc. as well as some Contests and Test Series only on Coding Ninjas Studio. Enrol in our courses and refer to the mock test and problems available. Take a look at the interview experiences and interview bundle for placement preparations.

Previous article
Longest Common Substring
Next article
LCS of 3 strings
Guided path
Free
gridgp-icon
Data structures & algorithms (Beginner to Intermediate)
13 chapters
109+ Problems
gp-badge
Earn badges and level up
Live masterclass