Last Updated: 3 Jan, 2021

Longest Repeating Subsequence

Moderate
Asked in company
Amazon

Problem statement

Given a string 'STR', you are supposed to return the length of the longest repeating subsequence such that the two subsequences don’t have the same string character at the same position which means that any ith character in the two subsequences shouldn’t have the same index in the original string.

For example :
In a string “abaacbc”, the length of the longest repeating subsequence is 3 considering the subsequence “abc”. The subsequence “abc” which is present at positions [1,2,5] and [4,6,7] are the longest repeating subsequence in the given string.
Input format :
The first line contains a single integer ‘T’ denoting the number of test cases.

Each test case contains a single string “STR” denoting the given string.
Output format :
For each test case, print the length of the longest repeating subsequence.
Note :
You do not need to print anything; it has already been taken care of. Just complete the function.
Constraints :
1 <= T <= 5
1 <= N <= 100

Where ‘T’ is the number of test cases.
'N' is the length of the given string.

Time Limit: 1 sec.

Approaches

01 Approach

This problem is a little bit modification of the same problem as LCS (longest common subsequence) where we are provided with two strings. And now to use that same concept in our approach we have to consider both the strings equal like in the case of LCS we will use LCS(STR, STR) because we want to find the longest common subsequence among these strings with just one constraint when both the characters are same, they shouldn’t be on the same index in the two strings.

 

The naive approach will be to create a recursion function that keeps a check whether the characters of the strings at position let’s say i and j are equal (while considering i != j ) then return LRS(STR, i-1, j-1) + 1 and if they are not the same characters return max (LRS(STR, i-1, j), LRS(STR, i, j-1)).

02 Approach

This problem is a little bit modification of the same problem as LCS (longest common subsequence) where we are provided with two strings. And now to use that same concept in our approach we have to consider both the strings equal like in the case of LCS we will use LCS(STR, STR) because we want to find the longest common subsequence among these strings with just one constraint when both the characters are same, they shouldn’t be on the same index in the two strings.

 

The key idea behind a dynamic programming approach is to use memoization, i.e. we’ll save the result of our sub-problem in a matrix so that it can be used later on.

 

Let DP[ i ][ j ] be our dynamic programming matrix to store the length of LRS of substring STR[0 : i-1] and STR[0 : j-1]. Now, consider the following steps:

 

  1. Before calling the function for any valid (I, J), we will first check whether we have already a solution corresponding to the (I, J), if we have already a solution then we return the solution else we will solve it with the following steps :
  2. If (I == 0 ) i.e. “STR” is empty then, the length of the LRS will be 0. So the answer is 0.
  3. Similarly, if ( J == 0 ) the answer is 0.
  4. Now if ( STR[I-1] == STR[J-1] ) which means the last character of both string matches then it is optimal to include the current character in the LRS. Therefore, find the LRS of the remaining characters of both strings. So return LCS(I-1, J-1, STR, STR) + 1
  5. Otherwise, we can not have both str[I-1] and str[J-1] at the end of the LRS which means either of them can be ignored. Now there are two possibilities, either we ignore STR[i-1] and get the LRS of the rest of the characters and vice versa. We should choose the option which maximizes the LRS i.e. the answer will be max( LCS(I-1, J, STR, STR), LCS(I, J-1, STR, STR).
  6. Store the answer in the “DP” table for later use.

03 Approach

Our intuition is to find out those subsequences which have common characters of the given string. It is nothing but an approach similar to that of the longest common subsequence (LCS) where we generate all sequences of the given strings and try to find out the longest common subsequence among those generated subsequences.

 

This problem is a little bit of modification of the same problem as LCS where we are provided with two strings. And now to use that same concept in our approach we have to consider both the strings equal like in the case of LCS we will use LCS(STR, STR) because we want to find the longest common subsequence among these strings with just one constraint when both the characters are same, they shouldn’t be on the same index in the two strings.

 

So we will be using DP(Dynamic Programming) to solve this problem where :

  1. If the characters are the same and don't possess the same index then DP[i][j] = 1 + DP[i-1][j-1].
  2. Else if the characters are not same then, DP[i][j] = max(DP[i][j-1], DP[i-1][j]).

04 Approach

Our intuition is to find out those subsequences which have common characters of the given string. It is nothing but an approach similar to that of the longest common subsequence (LCS) where we generate all sequences of the given strings and try to find out the longest common subsequence among those generated subsequences.

This problem is a little bit of modification of the same problem as LCS where we are provided with two strings. And now to use that same concept in our approach we have to consider both the strings equal like in the case of LCS we will use LCS(STR, STR) because we want to find the longest common subsequence among these strings with just one constraint when both the characters are same, they shouldn’t be on the same index in the two strings.

 

Here as in the previous approach we were using the DP table of size of N x N. We can optimize that too. Since we only require the previous row details in order to calculate the values for current row, so we do not need to use a table of N x N size, rather we need a matrix of size 2 x N so that it may contain a current row and a previous row required for calculations of LRS (Longest Repeating Subsequence).

 

So we will be using DP(Dynamic Programming) to solve this problem where :

  1. If the characters are the same and don't possess the same index then DP[i%2][j] = 1 + DP[(i-1)%2][j-1].
  2. Else if the characters are not same then, DP[i%2][j] = max(DP[i%2][j-1], DP[(i-1)%2][j]).