Code360 powered by Coding Ninjas X Naukri.com. Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1.
Introduction
2.
Problem Statement
3.
Approach
3.1.
Algorithm
3.2.
Implementation
3.3.
Complexity Analysis
4.
Frequently Asked Questions
4.1.
Can more than one string yield the same minimized inversion from the given array of strings?
4.2.
Can I sort the array with a comparator as the count of “b” in the strings?
5.
Conclusion
Last Updated: Mar 27, 2024
Hard

Minimum count of Inversion Pairs possible by concatenating N binary strings in any order

Author HET FADIA
0 upvote

Introduction

Strings are a popular Data Structure with various usages depending upon the need. A Binary String is the same as any other String Data Structure but with the special condition that it only contains two types of characters. In this problem, we are provided with some Binary Strings and we have to find a way to concatenate them in such a manner that the inversion pairs found after concatenation are minimum. 

To brush up on your knowledge of inversions, you can read the article Count Inversions on Coding Ninjas Studio.

Let’s get right into the problem next.

Problem Statement

An array consisting of N strings, each of length M is given.  Each character of the string is either “a” or “b”.

We have to return the minimum inversion count possible after arranging the strings in any order and concatenating them.

Inversion Count

The pair (arr[i], arr[j]) is an inversion pair if arr[i]> arr[j] and i<j.

The total number of pairs in the array is the total inversion count.

For example inversion count in “ba” is 1 as arr[0]=b>arr[1]=a but 0<1.

Similarly, the inversion count in “abaa” is 2 for the pairs at indices: (1,2) and (1,3).

 

Now, let us see a few of the examples of the problem statement:

1. arr=["aab","aba","bbb","aaa"]

Output: 3

Explanation: We can arrange the strings in the following way ['aaa', 'aab', 'aba', 'bbb'].

After we concatenate the array, we get the string “aaaaabababbb”.

The inversion count in the string is 3 as there are three pairs (b,a), (b,a) and (b,a) as seen in the below figure:

Inversion count

Figure 1: inversion count=3 in the string “aaaaabababbb”

2. arr=['aa', 'aa', 'ba', 'bb']

Output: 1

Explanation: We can arrange the string in the following way ['aa', 'aa', 'ba', 'bb'] and after concatenating them we get the string “aaaababb”.

We can see that the inversion count in the string “aaaababb” is one because there is only one pair (b, a), highlighted in red.

Also see, Euclid GCD Algorithm

Approach

The approach to the problem is simple. To minimize the inversions, we must have the string having more b’s towards the last.

As we can see in the sample examples, the strings having more “b” will be towards the end of the array to minimize the inversion count.  This is because if we concatenate string1 + string2, the increase in the number of inversions will be the number of b in string1 * the number of “a” in string2.

Now both have length m. So the number of a+b in string2 is m, so the number of “a” in string2 is (m-(number of “b” in string2)).

Thusthe number of inversions in the new string will increase by (the number of b in string1) x (m-number of b in string2).

Thus having more b in string1 will increase the number of inversions. Similarly, having more b in string2 will decrease the number of inversions. Also, mathematically, the value of the expression (number of b’s in string1) x (m-number of b’s in string2) will me minimum when the number of b’s in string1 is minimum and in string2 is maximum.

Thus we conclude that only the count of b determines the order of concatenating the strings.

So our approach is to sort the array of strings based on the number of b in the string. As the maximum count of b can be m, we can sort by storing them in the array of length m+1.

Let us count the number of inversions in the string

1. First, we initialize the b_count to 0. The idea behind the b_count is that when we are at ith index and if we encounter “a”, there were b_count b before it. So we can increment the inversion_count by b_count.

2. We make another variable inversion_count and set it to 0.

3. Whenever we encounter “a” we add b_count to the inversion_count. There were b_count “b” before a, so each b makes an inversion pair with this “a”. 

4. When we encounter b, we increment the b_count by 1.

5. Finally, after traversing the string, we get the inversion_count. We return the inversion_count then.

Algorithm

DEFINE FUNCTION count_inversion(s):
    inversions=0
    b_count=0

    FOR i IN s:
        IF i is equal to 'a':
            inversions+=b_count
        ELSE:
            increment b_count by 1
    RETURN inversions

DEFINE FUNCTION get_min_inversion(arr):
    n=length of (arr)
    m=length of (arr[0])

    d=[[] FOR i IN range(m+1)]
    FOR i IN arr:
        d[i.count("b")].append(i)
    arr_of_string=[]
    FOR i IN d:
        FOR j IN i:
            arr_of_string.append(j)
    answer="".join(arr_of_string)
    inversions=count_inversion(answer)
    RETURN inversions

arr=["aab","aba","bbb","aaa"]
answer=we call get_min_inversion with argument (arr)
OUTPUT(answer)

Implementation

Here is the code for the problem in python3.

def count_inversion(s):
    inversions=0
    b_count=0
    """
    When we encounter a this means that there were b_count b before it. So we can increment the inversion_count by b_count.
    When we encounter b we increment the b
    """
    for i in s:
        if i=='a':
            inversions+=b_count
        else:
            b_count+=1
    return inversions
    
def get_min_inversion(arr):
    n=len(arr)
    m=len(arr[0])
    """
    We sort the arr on the basis of b count in it.
    We store the strings on d[i] where i is the count of b in the array.
    Thus using the array d we sort them
    """
    
    d=[[] for i in range(m+1)]
    for i in arr:
        d[i.count("b")].append(i)
    arr_of_string=[]
    for i in d:
        for j in i:
            arr_of_string.append(j)
            
    # print(arr_of_string)
    answer="".join(arr_of_string)
    # print(answer)
    inversions=count_inversion(answer)
    # finally we return the inversions
    return inversions
    
arr=["aab","aba","bbb","aaa"]
answer=get_min_inversion(arr)
print(answer)

Output

3

Complexity Analysis

Time Complexity

The time complexity of the above algorithm is O(N*M) where N is the length of the array and M is the length of strings in it.

For making the array d, we count the number of “b” in every string of the array. This takes O(N*M) time complexity.

Then we join the string, which again takes O(N*M) time.

Then we count inversions in the string. For that, we traverse the joined string once. Thus it again takes O(N*M) time.

Thus final time complexity = max(O(N*M),O(N*M),O(N*M))= O(N*M).

Space Complexity

The space complexity of the above code is O(N*M).

The array d is of size M. We store the pointers of N strings in the array. So the space complexity for making the array d and storing the N string pointer is O(N+M).

We make a string “answer” by concatenating the N strings in the array d. The answer string has size N*M, and so it takes O(N*M) space complexity.

Then we traverse the answer string and find the inversion count, which takes O(1) space.

Thus final space complexity = max(O(N+M),O(N*M),O(1))= O(N*M).

Frequently Asked Questions

Can more than one string yield the same minimized inversion from the given array of strings?

Yes, there can be multiple strings that yield the same minimized result.

Consider arr=["ab","ba"].

The string “abba” and “baab” both have an inversion count of 2.

Can I sort the array with a comparator as the count of “b” in the strings?

Yes, but sorting in this way makes our implementation inefficient.

Here in this article, we have sorted the strings based on the count of “b” by storing the strings in an array of arrays. This way, the time complexity is O(N*M). The time complexity would have been O(N*log N * M) if we sorted by the comparator. Thus because of sorting, there would have been N*log N comparisons, and in each comparison, the program would calculate the count of “b”, so each comparison would take O(M). Thus the time complexity would have been O(N*log N * M). Thus sorting them using a comparator function would increase the time complexity, and the program would become inefficient in terms of speed.

Conclusion

The article helps us understand the inversion operations on strings in python. We also understand how to sort the strings efficiently. You can also copy the code and run it on your system on multiple inputs to understand the approach better.

Recommended Readings:

 

Recommended problems -

 

Check out some of the amazing Guided Paths on topics such as Data Structure and Algorithms, Competitive Programming, Basics of C, etc. along with some Contests and Interview Experiences only on Coding Ninjas Studio

You can also check out some of the Basic String Problems to test your understanding of the String Data Structure.

Happy Learning!!!

Live masterclass