Minimum Swaps To Make Two Strings Equal

Hard
0/120
Average time to solve is 30m
profile
Contributed by
15 upvotes
Asked in companies
MicrosoftEPAM SystemsVMware Inc

Problem statement

You are given two strings, ‘s1’ and ‘s2’, each of length ‘N’. The strings ‘s1’ and ‘s2’ are anagrams (i.e., they contain the same characters). You can swap the position of any two characters in ‘s1’ any number of times. The task is to find the minimum number of swaps we need to perform to make ‘s1’ equal to ‘s2’.

Example :
s1 = “bac”, s2 = “acb”, n = 3

We need to perform at least two swaps to make ‘s1 = s2’:
1. Swap ‘b’ and ‘a’, so:
    s1 = “abc”
2. Swap ‘b’ and ‘c’, so:
    s1 = “acb”
So, the answer is ‘2’.
Detailed explanation ( Input/output format, Notes, Images )
Input Format :
The first line of input contains an integer ‘T’ which denotes the number of test cases. Then, the ‘T’ test cases follow.

Each test case’s first line contains an integer ‘N’ denoting the size of strings ‘s1’ and ‘s2’.

The second line of each test case contains the string ‘s1’.

The third line of each test case contains the string ‘s2’.
Output Format :
For every test case, return the minimum number of swaps we need to perform to make ‘s1’ equal to ‘s2’.
Note :
You do not need to print anything; it has already been taken care of. Just implement the function.
Constraints :
1 <= T <= 1
1 <= N <= 20

Values in string ‘s1’ and ‘s2’ = {‘a’, ‘b’, ‘c’, ‘d’, ‘e’, ‘f’}.
‘s1’ is an anagram of ‘s2’.


Time limit: 1 sec

Sample input 1 :

2
4
dabc
abcd
5
abaac
acaab

Sample output 1 :

3
1

Explanation Of Sample Input 1 :

Test Case 1:
s1 = “dabc”, s2 = “abcd”, n = 4

We need to perform at least three swaps to make ‘s1 = s2’:
1. Swap ‘d’ and ‘a’, so:
    s1 = “adbc”
2. Swap ‘d’ and ‘b’, so:
    s1 = “abdc”
3. Swap ‘d’ and ‘c’, so:
    s1 = “abcd”
So, the answer is ‘3’.

Test Case 2:
s1 = “abaac”, s2 = “acaab”, n = 5

We need to perform at least one swap to make ‘s1 = s2’:
1. Swap ‘b’ and ‘c’, so:
    s1 = “acaab”
So, the answer is ‘1’.

Sample Input 2 :

2
3
fde
def
4
babc
abcb

Sample Output 2 :

2
2
Hint

Try to find all possible solutions maybe we can use backtracking to generate all possible solutions.

Approaches (2)
Backtracking (DFS)

As ‘s1’ and ‘s2’ are anagrams, we can always make ‘s1’ equal to ‘s2’ by swapping the characters of ‘s1’. We can do ‘n - 2’ swaps to place ‘n - 2’ unmatched characters of ‘s1’ to their correct position (according to ‘s2’), and the last swap will put the remaining two characters at their correct position. Hence, we can always make ‘s1’ equal to ‘s2’ in ‘n - 1’ swaps.

 

Intuitively, the best swapping algorithm should follow the following:

  • Each swap should at least make one more ‘s1[i] == s2[i]’.
  • Don’t swap the characters in ‘s1’, which are already in their correct position.

 

For any two positions ‘i’ and ‘j’ such that ‘s1[i] != s2[i]’ and ‘s1[j] != s2[j]’ the optimal swap should make both ‘s1[i] == s2[i]’ and ‘s1[j] == s2[j]’. But this is not always possible. In such a scenario, we have to swap with such a ‘j’ that makes one more ‘s1[i] == s2[i]’, call this semi-optimal swap. If there is no optimal swap available, keep performing semi-optimal swaps until we reach an optimal swap condition (The last swap will always be an optimal swap). 

 

For some ‘i’ (‘s[i] != s2[i]’), there is no way of knowing which semi-optimal position ‘j’ will lead to the minimum number of total swaps. So, among all semi-optimal ‘j’ positions, we recursively try to find which ‘(i, j)’ swap will give the minimum number of total swaps. There will be ‘O(n^2)’ pairs of ‘(i, j)’. So, each node in the recursion tree makes ‘O(n^2)’ recursive calls. Traversing this tree will require a large amount of taking, making this approach infeasible.

 

But, it can be observed that instead of searching for all ‘O(n^2)’ pairs of ‘(i, j)’, only those pairs with ‘i’ as the leftmost unmatched ‘s1[i]’ will also produce the same result. Now, for a single ‘i’ value, we have ‘O(n)’ pairs of ‘(i, j)’. So, each node in the recursion tree makes ‘O(n)’ recursive calls. To understand why this is true, consider the following example:

The above cyclic graph represents the swaps we need to perform if we chose ‘i’ as ‘A’. But, if we chose ‘i’ as ‘B’, ‘C’ or ‘D’, we will get the same graph. Hence, all positions of ‘i’ give the same final answer. We chose the first unmatched ‘s1[i]’ as ‘i’, so ‘j’ ranges from ‘i< j < n’.

 

Consider the following example where the first unmatched character is ‘A’ so ‘i = 0’ and the next unmatched characters are (‘B’, ‘C’, etc.) so ‘j’ can be {1, 3, etc.}. We try to find which swapping which ‘(i, j)’ will lead to the minimum total swap count. For each ‘(i, j)’, swap ‘(s1[i], s1[j])’ and recursively find the minimum total swap count for the new ‘s1’.

 

There are two optimizations that we can do to prune the recursion tree:

  • Keep a hashmap to check if we have already found the answer for the current ‘s1’.
  • Once we swap ‘(s1[i], s[j])’, ‘s1[i]’ becomes equal to ‘s2[i]’. So, for each subsequent recursive call, the first unmatched character will be after ‘i’, i.e., from ‘i + 1’.

Note(optional): If there is a ‘j’ such that ‘(i, j)’ is an optimal swap, then only find swap count for this ‘(i, j)’ pair, no need to check for semi-optimal ‘j’ values. 

Such a problem-solving method is called Backtracking, similar to a pruned DFS traversal of the recursion tree. Following is the implementation:

  • Create a global hashmap ‘done’ that maps a string to an integer. Use it to store the minimum swap count of ‘s1’ for each ‘dfs’ call.
  • ‘res = dfs(s1, s2, 0)’. Here, ‘0’ is the starting position of ‘i’.
  • Return ‘res’ as the answer.

The ‘dfs(string s1, string s2, int i)’ helper function:

  • If ‘s1’ is present in ‘done’, end the recursion as we already know the answer for ‘s1’:
    • Return ‘done[s1]’.
  • If ‘s1’ is equal to ‘s2’, end the recursion as we don’t need to do any more swaps:
    • Return ‘0’.
  • Initialize ‘res = INT_MAX’. Use it to find the smallest swaps count from possible ‘j’ values.
  • Run a loop while ‘s1[i]’ is equal to ‘s2[i]’ (find the first unmatched index):
    • ‘i += 1’.
  • If there is an optimal value of ‘j’ available, such that (i < j < n) and  (‘s1[j] != s2[j]’ and ‘s1[j] = s2[i]’ and ‘s1[i] = s2[j]’), then:
    • ‘swap(s1[i], s1[j])’.
    • ‘res = dfs(s1, s2, id + 1)’. The next unmatched character will be after ‘i’.
    • ‘swap(s1[i], s1[j])’. Revert back to the original ‘s1’ string.
    • ‘done[i] = res’
    • Return ‘res’.
  • Run a loop where ‘j’ varies from ‘i+1’ to ‘n - 1’ to find semi-optimal ‘j’ values:
    • If ‘s1[j] != s2[j]’ and ‘s1[j] = s2[i]’, then:
      • ‘swap(s1[i], s1[j])’.
      • ‘res = min(res, dfs(s1, s2, i + 1))’. The next unmatched character will be after ‘i’.
      • ‘swap(s1[i], s1[j])’. Revert back to the original ‘s1’ string.
  • ‘done[s1] = res’. Store the smallest ‘res’ as the swap count for ‘s1’.
  • Return ‘res’.
Time Complexity

O(N!/(NA! * NB! * NC! * ND! * NE! * NF!)), where ‘N’ is the size of string ‘s1’ and ‘Ni’ is the number of characters of type ‘i’ in ‘s1’.

 

In the worst-case scenario, every character after ‘i’ is a semi-optimal ‘j’ (so, each node in the tree has ‘n - i - 1’ children), and the answer is in the rightmost node ( so, every DFS call not containing the answer can have depth from ‘r + 1’ to ‘n - 1’). If maximum calls reach level ‘n-1’, time complexity(T) will be ‘O(Number of nodes in the tree)’.

Number of nodes per level will be, 1: 1, 2: n-1, 3: (n-1)*(n-2), … , n-1: (n-1)*(n-2)* … *2.

T = ‘O(Number of nodes in tree) = O((n-1)!*(1 + 1/2! + 1/3! … +1/(n-1)!)) = O(n!)’

(Because, (1 + 1/2! + 1/3! … + 1/INF!) = e ~ 2.71)

As there are only ‘6’ distinct characters, many nodes will be the same. 

So, T = ‘O(n! / (na! * nb! * nc! * nd! * ne! * nf!))’.

On top of that, not all nodes will reach level ‘n-1’, and the nodes having optimal ‘j’ value have only one child, so the actual time consumed will be far less.

Space Complexity

O((N * N!)/(NA! * NB! * NC! * ND! * NE! * NF!)), where ‘N’ is the size of string ‘s1’ and ‘Ni’ is the number of characters of type ‘i’ in ‘s1’.

 

The maximum swaps performed can be ‘n-1’ so the maximum size of the DFS stack will be ‘n - 1’. The size of the hashmap will be ‘O(n * No. of unique nodes)’, because each node stores a string of size ‘n’.

Video Solution
Unlock at level 3
(75% EXP penalty)
Code Solution
(100% EXP penalty)
Minimum Swaps To Make Two Strings Equal
Full screen
Console