Last Updated: 4 Mar, 2022

Shortest Common Supersequence

Hard
Asked in companies
HCL TechnologiesDream11Thought Works

Problem statement

Given two strings, ‘A’ and ‘B’. Return the shortest supersequence string ‘S’, containing both ‘A’ and ‘B’ as its subsequences. If there are multiple answers, return any of them.

Note: A string 's' is a subsequence of string 't' if deleting some number of characters from 't' (possibly 0) results in the string 's'.

For example:
Suppose ‘A’ = “brute”, and ‘B’ = “groot”

The shortest supersequence will be “bgruoote”. As shown below, it contains both ‘A’ and ‘B’ as subsequences.

A   A A     A A
b g r u o o t e
  B B   B B B  

It can be proved that the length of supersequence for this input cannot be less than 8. So the output will be bgruoote.

Input Format:

The first line of the input contains a single integer ‘T’ representing the no. of test cases.

The first line of each test case contains a single string ‘A’, denoting the first string described in the problem.

The second line of each test case contains a single string, ‘B’, denoting the second string described in the problem.

Output Format:

For each test case, print the shortest string ‘S’, which contains ‘A’ and ‘B’ as its subsequences.

Print a separate line for each test case.
Note:
You are not required to print anything; it has already been taken care of. Just implement the function and return the answer.

Constraints:

1 ≤ T ≤ 100
1 ≤ |A|, |B| ≤ 1000
Both strings consist of only lowercase English letters.
1 ≤ Σ(|A|+|B|) ≤ 3000

Time limit: 1 Sec

Approaches

01 Approach

Let’s consider two substrings of length m and n, respectively. Now we can have two situations:-

 

Both the strings end with the same element:

So for finding their shortest common supersequence, what we have to do is we will shorten each string by removing the last element. Then we will find the shortest common supersequence of shortened lines, and then the removed element will be appended to the shortest common supersequence.

SCS(A[1…..m], B[1…...n]) = SCS(A[1….m-1], B[1...n-1]) + A[m]

 

Suppose two given strings are not ending at the same character:

In this situation the shortest common supersequence will be the shorter of the two sequences SCS(A[1...m-1], B[1..n]) + A[m] and SCS(A[1...m], B[1...n-1]) + B[n].


 

Algorithm:

 

Function shortestSupersequenceRec

Function arguments - string ‘A’, string ‘B’, integer ‘N’ representing the length of A’s prefix to be considered, integer ‘M’ representing the length of B’s prefix to be considered.
Returns - Shortest supersequence of A[1,2,...N] and B[1,2,...M].

  • If n == 0
    • Return prefix of B of length m
  • If m == 0
    • Return prefix of A of length n
  • If a[n-1] == b[n-1]
    • Return shortestSupersequenceRec(a, b, n-1, m-1) + a[n-1]
  • Declare and initialize string sa = shortestSupersequenceRec(a, b, n-1, m) + a[n-1]
  • Declare and initialize string sb = shortestSupersequenceRec(a, b, n, m-1) + b[m-1]
  • Return the shorter string out of sa and sb

 

Given Function

  • Declare and initialize integers n = length of A and m = length of B
  • Return shortestSupersequenceRec(a, b, n, m)

02 Approach

We are doing a lot of duplicate calls to the recursive function and calculating the same thing multiple times. We can eliminate this by using a table containing answers for all the previous calls to the function. We will return the value saved in the lookup table if we have already solved it for a particular prefix pair. Implementation details are as per the following algorithm.

 

Algorithm:

 

Function shortestSupersequenceRec

Function arguments - string ‘A’, string ‘B’, integer ‘N’ representing the length of A’s prefix to be considered, integer ‘M’ representing the length of B’s prefix to be considered.
Returns - Length of the shortest supersequence of A[1,2,...N] and B[1,2,...M].

  • If n == 0
    • Return m
  • If m == 0
    • Return n
  • If table[n][m] is not equal to infinity
    • Return table[n][m]
  • If a[n-1] == b[n-1]
    • Set table[n][m] = shortestSupersequenceRec(a, b, n-1, m-1) + 1
    • Return table[n][m]
  • Set table[n][m] = 1 + min(shortestSupersequenceRec(a, b, n-1, m), shortestSupersequenceRec(a, b, n, m-1))
  • Return table[n][m]

 

Function backtrack

Function arguments - 2D lookup table, string ‘A’, string ‘B’

Returns - Shortest supersequence by backtracking lookup table.

  • Declare and initialize string ans to be empty
  • Declare and initialize integers n = length of A and m = length of B
  • While n > 0 AND m > 0
    • If a[n-1] == b[m-1]
      • Append a[n-1] to string ans
      • Decrement n and m
    • Else if table[n-1][m] < table[n][m-1]
      • Append a[n-1] to string ans
      • Decrement n
    • Else
      • Append b[m-1] to string ans
      • Decrement m
  • While n > 0
    • Append a[n-1] to string ans
    • Decrement n
  • While m > 0
    • Append b[m-1] to string ans
    • Decrement m
  • Reverse ans
  • Return ans

 

Given Function

  • Declare and initialize integer n = length of A and m = length of B
  • Declare 2D lookup table of size n+1 * m+1 initialized to infinity.
  • shortestSuperSequence(a, b, table, n, m)
  • Return backtrack(dp, a, b)

03 Approach

We are doing a lot of duplicate calls to the recursive function and calculating the same thing multiple times. We can eliminate this by using an iterative Dynamic Programming approach.

We will iterate on all the prefixes of strings in increasing order of their lengths and store the minimum length of supersequence for each of the prefix values, which will be used to calculate the minimum length for coins for bigger subarrays. After we fill the DP table, we backtrack our decisions by traversing in DP from (n,m) to (0,0) and retrieving the actual shortest supersequence. We will be retrieving supersequence in reverse order. We need to reverse it before returning. Implementation details are as per the following algorithm.

 

Algorithm:

 

Function backtrack

Function arguments - 2D DP table, string ‘A’, string ‘B’

Returns - Shortest supersequence by backtracking DP table.

  • Declare and initialize string ans to be empty
  • Declare and initialize integers n = length of A and m = length of B
  • While n > 0 AND m > 0
    • If a[n-1] == b[m-1]
      • Append a[n-1] to string ans
      • Decrement n and m
    • Else if dp[n-1][m] < dp[n][m-1]
      • Append a[n-1] to string ans
      • Decrement n
    • Else
      • Append b[m-1] to string ans
      • Decrement m
  • While n > 0
    • Append a[n-1] to string ans
    • Decrement n
  • While m > 0
    • Append b[m-1] to string ans
    • Decrement m
  • Reverse ans
  • Return ans

 

Given Function

  • Declare and initialize integer n = length of A and m = length of B
  • Declare 2D DP table of size n+1 * m+1
  • Iterate over i from 0 to n
    • Set dp[i][0] = i
  • Iterate over i from 0 to m
    • Set dp[0][i] = i
  • Iterate over i from 1 to n
    • Iterate over j from 1 to m
      • If a[i-1] == b[j-1]
        • Set dp[i][j] = dp[i-1][j-1] + 1
      • Else
        • Set dp[i][j] = min(dp[i-1][j], dp[i][j-1]) + 1
  • Return backtrack(dp, a, b)