Do you think IIT Guwahati certified course can help you in your career?
No
Introduction
Strings are one of the most popular and important Data Structures as well as one of the most basic ones. A substring is a contiguous sequence that is found in a string. Strings are an important topic from the placement and interview point of view.
In this blog, we will discuss the problem of counting the total number of substrings in a binary string that contains more 1s than 0s. Before diving into the solution, let’s briefly discuss the problem statement.
Problem Statement
Given a binary string of length ‘N’, our task is to count the total number of possible substrings in which the count of 1s is strictly greater than that of 0s.
Note- The count of 1s must be greater than 0s.
Input
Length of the string (N) = 5
Given string = “11010”
Output
8
Explanation
The substrings which have a count of 1s are strictly greater than the count of 0s are “1”, “1”, ”1”, “11”, “110”, “11010”, “101”, and “1101”.
Brute Force Approach
In the brute force approach, the simplest way to do this is to generate all possible substrings and then count the number of 1s and 0s in each substring. Then print the count of those substrings which satisfy the above condition.
This is a straightforward approach and very easy to implement. Let’s take a look at the algorithm part of this approach.
Algorithm
Traverse the string ‘s’ from index 0 to ‘n - 1’ and consider each ‘i’ the starting point of the current substring.
Initialize two variables, ‘count_ones’ and ‘count_zeros’, to store the count of ones and zeroes in the substring that is currently considered.
Initialize a variable ‘total’, which stores the total count of the possible substrings whose count of 1s is greater than 0s.
Finally, after traversing through each substring of the given string. Return the ‘total’, which is displayed as the output.
Dry Run
Let’s look at the visualization over the brute force approach. Considering each substring and counting the total number of ones and zeros in the current substring.
Number of ones = 1, Number of zeroes = 0, Total number of substrings = 1
Number of ones = 2, Number of zeroes = 0, Total number of substrings = 2
Number of ones = 2, Number of zeroes = 1, Total number of substrings = 3
Number of ones = 3, Number of zeroes = 1, Total number of substrings = 4
Number of ones = 3, Number of zeroes = 2, Total number of substrings = 5
Number of ones = 1, Number of zeroes = 0, Total number of substrings = 6
Number of ones = 1, Number of zeroes = 1, Total number of substrings = 6 As the number of ones must be greater than zeros hence, the condition isn’t satisfied.
Number of ones = 2, Number of zeroes = 1, Total number of substrings = 7
Number of ones = 2, Number of zeroes = 2, Total number of substrings = 7
As the number of ones must be greater than zeros hence, the condition isn’t satisfied.
Number of ones = 0, Number of zeroes = 1, Total number of substrings = 7
As the number of ones must be greater than zeros hence, the condition isn’t satisfied.
Number of ones = 1, Number of zeroes = 1, Total number of substrings = 7
As the number of ones must be greater than zeros hence, the condition isn’t satisfied.
Number of ones = 1, Number of zeroes = 2, Total number of substrings = 7
As the number of ones must be greater than zeros hence, the condition isn’t satisfied.
Number of ones = 1, Number of zeroes = 0, Total number of substrings = 8
Number of ones = 1, Number of zeroes = 1, Total number of substrings = 8
As the number of ones must be greater than zeros hence, the condition isn’t satisfied.
The output will be 8 according to the eight valid substrings we got from the given string as seen above.
Implementation in C++
#include <iostream>
#include <string>
using namespace std;
// Function for calculating total substrings
int count_substrings(string s, int n){
int total = 0;
// Outer Loop for starting point of substring
for(int i = 0; i < n; i++){
// Variables for counting 1s and 0s
int count_ones = 0;
int count_zeros = 0;
// Inner Loop for ending point of substring
for(int j = i; j < n; j++){
// Incrementing if the current character is 1 or 0
if(s[j]=='1'){
++count_ones;
}
else{
++count_zeros;
}
// Checking count of 1s > count of 0s
if(count_ones>count_zeros){
++total;
}
}
}
return total;
}
int main() {
// Length of the given string
int n = 5;
// Given string
string s="11010";
// Calling the count_strings function
cout<<count_substrings(s,n);
}
You can also try this code with Online C++ Compiler
public class MyClass {
// Function for calculating total substrings
public static int count_substrings(String s, int n){
int total = 0;
// Outer Loop
for(int i = 0; i < n; i++){
// Variables for counting 1s and 0s
int count_ones = 0;
int count_zeros = 0;
// Inner Loop
for(int j = i; j < n; j++){
// Incrementing if current character is 1 or 0
if(s.charAt(j)=='1'){
++count_ones;
}
else{
++count_zeros;
}
// Checking count of 1s > count of 0s
if(count_ones>count_zeros){
++total;
}
}
}
return total;
}
// Driver Function
public static void main(String args[]) {
// Length of the given string
int n = 5;
// Given string
String s="11010";
// Calling the count_strings function
System.out.print(count_substrings(s,n));
}
}
You can also try this code with Online Java Compiler
Explanation: As there can be N*(N + 1) / 2 substrings possible, maintaining two variables and doing their comparison takes us O(1) time, so the total time complexity is O(N2).
Space Complexity
O(1)
Explanation: We initialized three variables that take up a constant space apart from the given input string. Hence, a space complexity of O(1).
Let’s find the optimized solution to this problem to find all substrings in which the count of 1s is more than the count of 0s.
We will first create one temporaryarray, let's say, pref[], and traverse the string, if for any ‘i’, s[i] == ‘1’, then we will put 1 into the temporary array, and if s[i] == ‘0’, we will put -1 into the temporary array. Then we will convert this temporary array into its prefix sum array.
For Example-
str = “11010”, then the corresponding prefix array will be
Pref[] = [1, 2, 1, 2, 1].
We have created the pref array according to the algorithm explained above.
Initially, Replacing ‘1’ with 1 and ‘0’ with -1.
Pref[] = [1, 1, -1, 1, -1].
Converting this array into prefix sum array.
Pref[] = [1, 2, 1, 2, 1].
Now, let’s understand why we created a prefix array above. Now if we take a closer look at the prefix array, and take any indexes, let's say ‘i’ and ‘j’,
If arr[j] < arr[i] for j < i, it means that the number of 1’s between indexes ‘j’ and ‘i’, is strictly greater than the number of 0’s, then only arr[i] is greater than arr[j] because on every character ‘1’, +1 was added in the prefix array and on every character ‘0’, -1 was added in the prefix array.
In the above example, let take i = 3, and j = 0, then pref[i] = 2, and pref[j] = 1,
Now between j and i, there are 1-0’s and 3-1’s, hence pref[i] > pref[j].
Now, this problem is similar to counting inversions in a given array. In our case, our task is reduced to counting inversions in the prefix array.
Algorithm
Create a temporary array, say ‘pref’ to store 1 and -1 corresponding to the value of characters ‘1’ and ‘0’, respectively.
Traverse the string str, and if s[i] == ‘1’, then pref[i] = 1, otherwise s[i] == ‘0’ and pref[i] = -1.
Update the ‘pref’ array to store the prefix sum of the array.
Now, this problem is reduced to finding a number of pairs i, and j, where arr[i]>arr[j] and i > j, which is very much similar to the count inversions problem.
Dry Run
Given input string
Assigning values 1 and -1 to the corresponding characters in the ‘pref’ array yields-
After taking cumulative sums the corresponding prefix array looks like this-
Now, we are ready with our prefix array and our task is to count the inversions in the prefix array obtained.
For all values of the ‘pref’ array, the positive values denote that the substring starting from the index 0 and ending at the index with a positive value is a valid string.
Hence, all the strings which start from the index 0 and end at the index with a positive value are:
Hence, the ‘total’ will be updated by a value of 5(two 2s and three 1s) as the positive values in the ‘pref’ array are five.
Now, we will move on to counting inversions in the ‘pref’ array.
When the current index = 0, all the values before the index = 0 and having values less than pref[0] are 0.
When the current index = 1, all the values before the index = 1 and having values less than pref[1] are 1, hence, the total is incremented by a value of one.
When the current index = 2, all the values before the index = 2 and having values less than pref[2] are 0. Hence, the total is incremented by a value of zero.
When the current index = 3, all the values before the index = 3 and having values less than pref[3] are 2. Hence, the total is incremented by a value of two.
When the current index = 4, all the values before the index = 4 and having values less than pref[4] are 0. Hence, the total is incremented by a value of zero.
Overall, our ‘total’ is incremented by a value of 3. So, the total value was five and is incremented by a value of 3. The total value of the ‘total’ = 5 + 3 = 8.
The above approach can be implemented using various techniques like merge sort, fenwick tree, etc. We will see the implementation of the above approach in both merge sort and fenwick tree techniques.
Implementation using Merge Sort in C++
// C++ program for all substrings in which the count of 1’s is more than the count of 0’s
#include <iostream>
#include <vector>
#include <string>
using namespace std;
/*
Function call to merge two partitions
Such that the merged array is sorted
*/
int merge(vector <int> &pref, int left, int mid, int right){
vector<int> ref(right - left + 1);
int i = left;
int j = mid + 1;
int k = 0;
int cnt = 0;
int inversion_count = 0;
while (i <= mid && j <= right) {
if (pref[i] <= pref[j]) {
ref[k++] = pref[i++];
}
else {
// Counting inversions
inversion_count += (mid - i + 1);
ref[k++] = pref[j++];
}
}
while (i <= mid){
ref[k++] = pref[i++];
}
while (j <= right){
ref[k++] = pref[j++];
}
k = 0;
for (int itr = left; itr <= right; itr++) {
pref[itr] = ref[k++];
}
return inversion_count;
}
/*
Function to calculate number of
Inversions in a given array using
the merge sort technique
*/
int count_inversions(vector <int> & pref, int left,int right){
int left_count = 0, right_count = 0, merge_count = 0;
if(left<right){
int mid = (left + right) / 2;
left_count = count_inversions(pref, left, mid);
right_count = count_inversions(pref, mid + 1, right);
merge_count = merge(pref, left, mid, right);
}
return left_count + right_count + merge_count ;
}
/*
Function to count the number of
Substrings that contains more 1s than 0s
*/
int count_substrings(string& s ,int n){
vector <int>pref(n,0);
// Putting 1 for '1' and -1 for '0' as explained in the approach
for(int i=0;i<n;i++){
if(s[i]=='0'){
pref[i] = -1;
}
else{
pref[i] = 1;
}
}
// Converting into prefix sum array
for(int i=1;i<n;i++){
pref[i] += pref[i-1];
}
// Stores the count of valid substrings
int total = 0;
for(int i=0;i<n;i++){
// If pref[i] > 0 means string from 0 to i is a valid one
if(pref[i]>0){
++total;
}
}
// Reversing the given array
int j = n-1;
for(int i=0;i<j;i++,j--){
swap(pref[i],pref[j]);
}
total += count_inversions(pref,0,n-1);
return total;
}
// Driver Code
int main(){
// Length of the given string
int n = 5;
// Given input string
string s = "11010";
// Function Call
int ans = count_substrings(s,n);
cout << ans << endl;
return 0;
}
You can also try this code with Online C++ Compiler
public class MyClass {
/*
Function call to merge two partitions
Such that the merged array is sorted
*/
static int merge(int pref[], int left, int mid, int right){
int ref[];
ref =new int [right - left + 1];
int i = left;
int j = mid + 1;
int k = 0;
int cnt = 0;
int inversion_count = 0;
while (i <= mid && j <= right) {
if (pref[i] <= pref[j]) {
ref[k++] = pref[i++];
}
else {
// Counting inversions
inversion_count += (mid - i + 1);
ref[k++] = pref[j++];
}
}
while (i <= mid){
ref[k++] = pref[i++];
}
while (j <= right){
ref[k++] = pref[j++];
}
k = 0;
for (int itr = left; itr <= right; itr++) {
pref[itr] = ref[k++];
}
return inversion_count;
}
/*
Function to calculate number of
Inversions in a given array using
the merge sort technique
*/
static int count_inversions(int pref[], int left,int right){
int left_count = 0, right_count = 0, merge_count = 0;
if(left<right){
int mid = (left + right) / 2;
left_count = count_inversions(pref, left, mid);
right_count = count_inversions(pref, mid + 1, right);
merge_count = merge(pref, left, mid, right);
}
return left_count + right_count + merge_count ;
}
/*
Function to count the number of
Substrings that contains more 1s than 0s
*/
static int count_substrings(String s ,int n){
// Initializing prefix array
int pref[];
pref=new int [n];
for(int i=0;i<n;i++){
pref[i]=0;
}
// Putting 1 for '1' and -1 for '0' as explained in the approach
for(int i=0;i<n;i++){
if(s.charAt(i)=='0'){
pref[i] = -1;
}
else{
pref[i] = 1;
}
}
// Converting into prefix sum array
for(int i=1;i<n;i++){
pref[i] += pref[i-1];
}
// Stores the count of valid substrings
int total = 0;
for(int i=0;i<n;i++){
// If pref[i] > 0 means string from 0 to i is a valid one
if(pref[i]>0){
++total;
}
}
// Reversing the given array
int j = n-1;
for(int i=0; i<j;i++,j--){
int x = pref[i];
pref[i] = pref[j];
pref[j] = x;
}
total += count_inversions(pref,0,n-1);
return total;
}
// Driver Function
public static void main(String args[]) {
// Length of the given string
int n = 5;
// Given string
String s="11010";
// Function Call
int ans = count_substrings(s,n);
System.out.print(ans);
}
}
You can also try this code with Online Java Compiler
Explanation: In the merge sort technique, the list of size N is divided into a max of log N parts, and the merging of all sublists into a single list takes O(N) time, the worst-case run time of this algorithm is O(NlogN). Therefore all substrings in which the count of 1’s is more than the count of 0’s can find out in O(NlogN) time complexity.
Space Complexity
O(N)
Explanation: In the merge sort technique, as a prefix array, ‘pref’ is initialized and used. Hence, it is taking O(N) space extra.
We saw the implementation of counting inversions using the merge sort technique. Now, let’s see the implementation of counting inversions using the Fenwick tree approach.
Implementation using Fenwick Tree in C++
// C++ program for all substrings in which the count of 1’s is more than the count of 0’s
#include <bits/stdc++.h>
using namespace std;
// Function for adding the number in the fenwick tree
void add(vector<int>& BIT, int idx, int val){
idx += 1;
while(idx < BIT.size()){
BIT[idx] += val;
idx += idx & -idx;
}
}
// Query for numbers < = idx
int query(vector<int>& BIT, int idx){
idx += 1;
int sum = 0;
while(idx){
sum += BIT[idx];
idx &= idx - 1;
}
return sum;
}
/*
Function to calculate the number of
Inversions in a given array using
the fenwick tree technique
*/
int count_inversions(vector<int>& pref, int n){
vector<int> BIT(n + 1, 0);
vector<int> b = pref;
sort(b.begin(), b.end());
for(int i = 0; i < n; i++){
pref[i] = lower_bound(b.begin(), b.end(), pref[i]) - b.begin();
}
// Now for all i, 0 <= a[i] < n
int inversion_count = 0;
for(int i = 0; i < n; i++){
// Increasing index a[i] by 1
add(BIT, pref[i], 1);
/*
Count of numbers greater than a[i] before it is
Being added to the inverse count
*/
inversion_count += query(BIT, pref[i] - 1) ;
}
return inversion_count;
}
/*
Function to count the number of
Substrings that contains more 1s than 0s
*/
int count_substrings(string& s ,int n){
vector <int>pref(n,0);
// Putting 1 for '1' and -1 for '0' as explained in the approach
for(int i=0;i<n;i++){
if(s[i]=='0'){
pref[i] = -1;
}
else{
pref[i] = 1;
}
}
// Converting into prefix sum array
for(int i=1;i<n;i++){
pref[i] += pref[i-1];
}
// Stores the count of valid substrings
int total = 0;
for(int i=0;i<n;i++){
// If pref[i] > 0 means string from 0 to i is a valid one
if(pref[i]>0){
++total;
}
}
total += count_inversions(pref,n);
return total;
}
// Driver Code
int main(){
// Length of the given string
int n = 5;
// Given input string
string s = "11010";
// Function Call
int ans = count_substrings(s,n);
cout << ans << endl;
return 0;
}
You can also try this code with Online C++ Compiler
import java.util.Arrays;
public class MyClass {
// Function for adding number in the fenwick tree
static void add(int BIT[], int idx, int val){
idx += 1;
while(idx < BIT.length){
BIT[idx] += val;
idx += idx & -idx;
}
}
// Query for numbers < = idx
static int query(int BIT[], int idx){
idx += 1;
int sum = 0;
while(idx>0){
sum += BIT[idx];
idx &= idx - 1;
}
return sum;
}
/*
Function to calculate number of
Inversions in a given array using
the fenwick tree technique
*/
static int count_inversions(int pref[], int n){
int BIT[];
BIT = new int[n + 1];
for(int i=0;i<=n;i++){
BIT[i]=0;
}
int b [] = Arrays.copyOf(pref, pref.length);
Arrays.sort(b);
for(int i = 0; i < n; i++){
int left = 0, right = n-1, idx = n;
while(left<=right){
int mid = (left + right)/2;
if(b[mid]>=pref[i]){
idx=mid;
right=mid-1;
}
else{
left=mid+1;
}
}
pref[i] = idx;
}
// Now for all i, 0 <= a[i] < n
int inversion_count = 0;
for(int i = 0; i < n; i++){
// Increasing index a[i] by 1
add(BIT, pref[i], 1);
/*
Count of numbers greater than a[i] before it is
Being added to the inverse count
*/
inversion_count += query(BIT, pref[i] - 1) ;
}
return inversion_count;
}
/*
Function to count the number of
Substrings that contains more 1s than 0s
*/
static int count_substrings(String s ,int n){
// Initializing prefix array
int pref[];
pref=new int [n];
for(int i=0;i<n;i++){
pref[i]=0;
}
// Putting 1 for '1' and -1 for '0' as explained in the approach
for(int i=0;i<n;i++){
if(s.charAt(i)=='0'){
pref[i] = -1;
}
else{
pref[i] = 1;
}
}
// Converting into prefix sum array
for(int i=1;i<n;i++){
pref[i] += pref[i-1];
}
// Stores the count of valid substrings
int total = 0;
for(int i=0;i<n;i++){
// If pref[i] > 0 means string from 0 to i is a valid one
if(pref[i]>0){
++total;
}
}
total += count_inversions(pref,n);
return total;
}
// Driver Function
public static void main(String args[]) {
// Length of the given string
int n = 5;
// Given string
String s="11010";
// Function Call
int ans = count_substrings(s,n);
System.out.print(ans);
}
}
You can also try this code with Online Java Compiler
Explanation: In the fenwick tree technique, the ‘pref’ array is traversed once taking a time of O(N). The ‘add’ and ‘query’ in each iteration take up a time of O(logN). Hence, the worst-case run time of this algorithm is O(NlogN). Therefore all substrings in which the count of 1’s is more than the count of 0’s can be solved in O(NlogN) time complexity.
Space Complexity
O(N)
Explanation: In the fenwick tree technique, a prefix array ‘pref’ and bit array ‘BIT’ is initialized and used. Hence, it is taking O(N) space extra.
Additional Information
In C++, this problem can be implemented using an ordered set. An ordered set is a data structure that stores all the elements in a sorted order ascending or descending. Two types of operations can be performed using an ordered set.
order_of_key( val ): Number of values in the set with a value less than ‘val’.
find_by_order( k ): kth element in the set(0 based indexing).
In our problem, we will use the 1st operation at each index and see how much easier the above problem implementation will become.
Implementation in C++
// C++ program for all substrings in which the count of 1’s is more than the count of 0’s
#include <bits/stdc++.h>
// Importing libraries for ordered set
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
using namespace __gnu_pbds;
#define ordered_set tree<int, null_type,less_equal<int>, rb_tree_tag,tree_order_statistics_node_update>
using namespace std;
/*
Function to count the number of
Substrings that contains more 1s than 0s
by counting inversions using ordered set
*/
int count_substrings(string& s ,int n){
vector <int>pref(n,0);
// Putting 1 for '1' and -1 for '0' as explained in the approach
for(int i=0;i<n;i++){
if(s[i]=='0')
pref[i] = -1;
else
pref[i] = 1;
}
// Converting into prefix sum array
for(int i=1; i<n;i++){
pref[i] += pref[i-1];
}
// Stores the count of valid substrings
int total = 0;
for(int i=0; i<n;i++){
// If pref[i] > 0 means string from 0 to i is a valid one
if(pref[i]>0){
++total;
}
}
ordered_set st;
for(int i=0; i<n;i++){
// Count of all elements which are less than pref[i]
total += st.order_of_key(pref[i]);
// Inserting current element into the set
st.insert(pref[i]);
}
return total;
}
// Driver Code
int main(){
// Length of the given string
int n = 5;
// Given input string
string s = "11010";
// Function Call
int ans = count_substrings(s,n);
cout << ans << endl;
return 0;
}
You can also try this code with Online C++ Compiler
Which sorting algorithm is used when we sort using STL?
The sorting algorithm used in STL sort() is IntroSort. Introsort is a hybrid sorting algorithm that uses three sorting algorithms to minimize the running time. These algorithms are quicksort, Heapsort, and Insertion Sort.
What is the worst, average, and best case time complexity of merge sort?
Merge sort always have O(NlogN) in all the cases, because we always divide the array into maximum LogN parts, and then again merge them in O(N) time complexity, so overall time complexity is constant, i.e, O(NlogN).
How many substrings are possible for a string of length N?
There can be a possibility of N*(N + 1)/2 substrings of a string of length N.
What is the time complexity of update and query operations in the fenwick tree?
The updates and query operations in the worst case take up to an O(log N) runtime.
What are the different ways to solve the counting inversion problem?
The different types of ways to solve the counting inversion problem are merge sort technique, segment tree, Fenwick tree, and using ordered set (in C++).
Conclusion
In this blog, we discussed the problem of finding all substrings in which the count of 1s is more than the count of 0s in a given binary string. We saw the various approaches to solving this problem programmatically and saw the implementation of 3 approaches in Java and C++. We also discussed the time and space complexity of both approaches and saw how efficiently the approach uses the count inversion problem technique.