Last Updated: 12 Dec, 2021

Largest Sub-Matrix With Equal 0’s And 1’s

Hard
Asked in companies
AmazonCodenationGoogle inc

Problem statement

You are given a matrix ‘ARR’ of size N * M. Each entry in this matrix is either 0 or 1.

Find the largest area of a rectangular sub-matrix that has an equal number of 0’s and 1’s in it.

Example :
If ‘N’ = 4 and ‘M’ = 5, and the matrix is:
0 1 0 1 1
0 1 0 1 0
0 1 0 1 1
0 1 0 1 1

Then clearly, the submatrix containing all the elements of the first four columns contains has equal number of 0’s and 1’s, the area of this submatrix is equal to 4 * 4 = 16, therefore we will print 16.
Follow Up :
Try to solve it in less than O( N^2 * M^2 ) time complexity.
Input Format :
The first line contains a single integer ‘T’ denoting the number of test cases, then each test case follows:

The first line of each test case contains two integers ‘N’ and ‘M’, denoting the number of rows and columns in the matrix.

The next N lines, each contain M integers ‘ARR’, denoting the elements of the current row.
Output Format :
For each test case, print the area of the largest sub-matrix that contains equal 0’s and 1’s.

Output for each test case will be printed in a separate line.
Note :
You are not required to print anything; it has already been taken care of. Just implement the function.
Constraints :
1 ≤ ‘T’ ≤ 10      
1 ≤ N, M ≤ 10
0 ≤ ARR[i][j] ≤ 1 

Time limit: 1 sec

Approaches

01 Approach

A naive approach is to generate all the submatrices. We can fix the top left and bottom right ends of the submatrices to generate all the submatrices of the original input matrix. Note that each end has ~O(N*M) options, so in total just to generate the matrices takes approx ~O((N*M)*(N*M)) operations. Now for each submatrix generated we can count the number of 0’s and 1’s in it, and we can finally output the area of the submatrix that had the largest size and also had an equal number of 0's and 1's.

 

We may simply check for the sum of the submatrix equal to half of the total elements in the submatrix, or a clever approach can be to replace 0’s by -1’s and we now need to simply check if the submatrix sum is equal to zero, to check for the condition of equal 0’s and 1’s.

 

Note that this approach can be further optimised using the prefix sum of a 2D matrix. This approach involves precomputing the prefix matrix. This further helps us to find the sum of a submatrix in constant time, though the precomputation takes O(N*M) time, all the following followups to calculate the sum of the submatrix require only constant time. We have not discussed the implementation of this approach here as finding all the submatrices in this approach still takes ~O((N*M)*(N*M)) computations.

 

The steps are as follows :

  1. Initialize a variable ‘maxArea’ equal to 0, this variable stores our final answer.
  2. Runner a for loop for variable ‘TopR’ from 0 to N-1, run an inner for loop for the variable ‘TopC’ from 0 to M-1, run the third loop for variable ‘BottomR’ from TopR to N-1, and run the fourth loop for variable ‘BottomC’ from TopC to M-1. Here (TopR, TopC) denotes the top-left cell of our submatrix and (BottomR, BottomC) denotes the bottom-right cell. Inside this:
    • Iterate all the elements of the submatrix to count the number of 0’s and 1’s. Store these counts in two variables and finally check if they are equal then update the value of ‘maxArea’ if needed.
  3. Finally, return the value stored in ‘maxArea’.

02 Approach

This approach can be a little difficult to figure out, but one may still be able to figure it out if he somehow manages to observe that we can replace 0’s by -1’s and then find the largest submatrix with a sum equal to zero.

 

Also one must know the way to find the largest subarray with a sum equal to zero from the input 1D array.

 

This approach starts by fixing the number of columns we are currently going to use. Let's suppose the input array is:

(not replaced 0’s by -1’s right now so that the columns are indented properly)

1 1 1 1 1 1

1 0 1 0 1 1

1 1 0 1 1 1

1 1 0 1 1 1

1 0 1 0 1 1

1 1 1 1 1 1

We will explore all the options of the columns being used. Let's suppose currently we are using the 1’st, 2’nd and 3’rd column (0-based indexing), ie: the search space now becomes:

1 1 1 1 1 1

1 0 1 0 1 1

1 1 0 1 1 1

1 1 0 1 1 1

1 0 1 0 1 1

1 1 1 1 1 1

Suppose we can directly find the sum of the elements of some particular columns lying in the same row using some precalculated prefix array calculated for that row.

 

In this way, we can consider this sum of elements of the row as a single element, and our new problem now reduces to a problem of finding the largest subarray of sum zero in a 1D input here (here the 1D array elements correspond to the elements storing the sum of elements in the row, remember to only consider the elements that are present in the columns currently being considered).

 

In this way, if it’s compulsory to use some selected columns, then we are able to calculate the largest rectangle in ~O(N) operations. We can simply check this for all the possible combinations of columns by fixing the left and right bound of the columns being selected.

 

The steps are as follows :

  1. Initialize a variable ‘maxArea’ equal to 0, this variable stores our final answer.
  2. Initialize a matrix ‘prefixRow’ to store the prefix sum of each row, also fill this matrix.
  3. Run an outer loop for variable ‘en’ from M to 0. Run an inner for loop for variable ‘st’ from en - 1 to 0. Our search space is narrowed down to (st, en] now, so inside the for loop:
    • Initialize a variable ‘length’ equal to en - st, this stores the length of our current submatrix.
    • Also, initialize a variable ‘cur’ equal to 0.
    • Declare a hash-map 'mp'. This map will store sum v/s row-index information. Update the value corresponding to 0 equal to -1.
    • Run a for loop for variable ‘i’ to iterate through all the rows. Inside the loop:
      • Increment the value of ‘cur’ by prefixRow[i][en] - prefixRow[i][st].
      • Check if ‘cur’ is present in the ‘mp’, if it is present then:
        • Initialize a variable ‘height’ equal to i - mp[curVal], this stores the height of our current submatrix.
        • Initialize ‘area’ equal to length * height. Update the value of ‘maxArea’ to store the maximum area if needed.
      • If ‘cur’ is not present in the hash-map, then simply insert it in the hash-map and also insert the value ‘i’ with it.
  4. Finally, return the value stored in ‘maxArea’.