Last Updated: 4 Mar, 2022

Minimum Number of Deletions and Insertions

Moderate
Asked in companies
PhonePeMicrosoftCoding Ninjas

Problem statement

You are given 2 non-empty strings 's1' and 's2' consisting of lowercase English alphabets only.


In one operation you can do either of the following on 's1':

(1) Remove a character from any position in 's1'.

(2) Add a character to any position in 's1'.


Find the minimum number of operations required to convert string 's1' into 's2'.


Example:
Input: 's1' = "abcd", 's2' = "anc"

Output: 3

Explanation:
Here, 's1' = "abcd", 's2' = "anc".
In one operation remove 's1[3]', after this operation 's1' becomes "abc".    
In the second operation remove 's1[1]', after this operation 's1' becomes "ac".
In the third operation add 'n' in 's1[1]', after this operation 's1' becomes "anc".

Hence, the minimum operations required will be 3. It can be shown that there's no way to convert s1 into s2 in less than 3 moves.


Input Format :
The first line of the input contains a string 's1'.

The second line of the input contains a string 's2'.


Output Format :
Return the minimum number of operations required to convert string 's1' into 's2'.


Note :
You do not need to print anything, it has already been taken care of. Just implement the given function.

Approaches

01 Approach

Approach: 

 

The basic idea to solve this problem is that we have to minimize the insertion and deletion operation. For minimizing these operations we want the maximum part of the string unchanged so, we want a longest subsequnce which is common in both of the strings this is known as longest common subsequence(LCS), in the strings “aeaea” and “aaae”, “aae” is the lcs. After we know the lcs we can say that (n-lcs) characters we will remove from str and then m-lcs character we will add so the final answer will be n+m-2*lcs. Now, the only task remain is to find the lcs of the two strings. We will sove it recursively, If we are at ith index in str and at jth index in ptr then we have 2 cases, one is when str[i] is equal to ptr[j], in this case we can say that ith and jth characters are included in the LCS and we can move further. 

 

If str[i] is not equal to ptr[j], in this case we have 2 options whether to take the ith character or jth character in the LCS and take the max of both of the cases.

 

Lets define a function “lcs(i, j, n, m, str, ptr) where ‘i’ is the index we are at in str, ‘j’ is the index we are at in ptr, ‘n’ is the length of the string str, ‘m’ is the length of the string ptr.

This function returns the LCS of string str[i..n-1] and ptr[j..m-1]. 

 

Here is the algorithm:

 

  • lcs function:
    • If ‘i’ is equal to ‘n’ or ‘j’ is equal to ‘m’
      • Return 0
    • Declare an integer variable “ans” and initialize it with 0.
    • If str[i] is equal to ptr[j]
      • Update ans as ans = 1 + lcs(i + 1, j + 1, n, m, str, ptr).
    • Else
      • Update ans as ans = max(lcs(i, j + 1, n, m, str, ptr), lcs(i + 1, j, n, m, str, ptr)).
    • Return ans.


 

  • given function:
    • Declare a variable ‘n’ and initialize it with str.length.
    • Declare a variable ‘m’ and initialize it with ptr.length.
    • Declare a variable “len” denoting the length of the longest common subsequence.
    • len = lcs(0, 0, n, m, str, ptr).
    • ans= n + m - 2 * len.
    • Return ans.

02 Approach

Approach: 

 

The basic idea is to store the results of the recursive calls to avoid repetition. We can memorize the results of our function calls in an array (say, ‘DP’).

So,let's first define Dp[i][j]

 

Dp[i][j] =  Longest common subsequence of string str[i..n-1] and ptr[j….m-1].

All the transition are same as discussed in brute force approach.

 

Here is the algorithm:

 

  • lcs function:
    • If ‘i’ is equal to ‘n’ or ‘j’ is equal to ‘m’
      • Return 0
    • If dp[i][j] is not equal to -1
      • Return dp[i][j].
    • Declare an integer variable “ans” and initialize it with 0.
    • If str[i] is equal to ptr[j]
      • Update ans as ans = 1 + lcs(i + 1, j + 1, n, m, str, ptr).
    • Else
      • Update ans as ans = max(lcs(i, j + 1, n, m, str, ptr), lcs(i + 1, j, n, m, str, ptr)).
    • Update dp as dp[i][j] = ans.
    • Return ans.


 

  • given function:
    • Declare a variable ‘n’ and initialize it with str.length.
    • Declare a variable ‘m’ and initialize it with ptr.length.
    • Declare an array “dp” of size n*m and initialize it with -1.
    • Declare a variable “len” denoting the length of the longest common subsequence.
    • len = lcs(0, 0, n, m, str, ptr).
    • ans= n + m - 2 * len.
    • Return ans.

03 Approach

Approach: 

 

The idea is similar to the previous approach. In this approach, we use an iterative way to find longest common subsequence.

 

Dp[i][j] =  Longest common subsequence of string str[0…i] and ptr[0…..j].

 

If we are at ith index in str and at jth index in ptr then we have 2 cases, one is when str[i] is equal to ptr[j], in this case we can say that ith and jth characters are included in the LCS and we have already calculated the LCS of str[0…i-1] and ptr[0…j-1] then:

Dp[i][j] =1 + dp[i-1][j-1], if str[i] is equal to ptr[j].

If str[i] is not equal to ptr[j], in this case we have 2 options whether to take the ith character or jth character in the LCS and take the max of both of the cases.

Dp[i][j] = max(dp[i-1][j], dp[i][j-1]), if str[i] is not equal to ptr[j].


 

Here is the algorithm :

 

  • Declare a variable ‘n’ and initialize it with str.length.
  • Declare a variable ‘m’ and initialize it with ptr.length.
  • Declare an array “dp” of size n*m and initialize it with 0.
  • Run a loop from ‘i’ =0 to ‘i’ = n-1
    • Run a loop from ‘j’ =0 to ‘j’ = m-1
      • If str[i] is equal to ptr[j]
        • If i-1 and j-1 is greater than or equal to 0
          • Update dp[i][j] as dp[i][j] = 1 + dp[i - 1][j - 1].
        • Else
          • Set dp[i][j] as 1.
      • Else
        • Declare two variables ‘a’ and ‘b’ and initialize them to 0.
        • If i-1 is greater than or equal to 0
          • Set a to dp[i-1][j].
        • If j-1 is greater than or equal to 0
          • Set b to dp[i-1][j].
        • Set dp[i][j] to max(a,b).
  • Set ans to ans = n + m - 2 * dp[n - 1][m - 1].
    • Declare a variable “ans” and initialize it with 1.
    • Run another loop for iterating over all the elements of the array, say iterator ‘j’
      • If i-num[j] >= 0
        • Update “ans” ans ans=min(ans, dp[i-num[j]]).
    • Update dp[i] as 1-ans.
  • Return ans.