Finding the Largest Balanced Subarray | Baeldung on Computer Science

1. Introduction

In this tutorial, we’ll see how to find the largest balanced subarray in a binary array.

2. Problem Description

Given a binary array $\mathbf {A}$ consisting of only zeros and ones, find the maximum length of a contiguous subarray with an equal number of zero and one. For example, if the array is $\{0, 1, 1, 0, 1, 1, 1, 0\}$ , then the maximum length of the balanced subarray is 4. Also, the related balanced subarray is $\{0, 1, 1, 0\}$ .

3. Brute Force Approach

Let’s suppose the number index of the array $A$ starts with $0$ , i.e., $A = \{A[0], A[1], ..., A[n-1]\}$ . We can explore every possible subarray and count the number of zeros and ones in each subarray:

algorithm MaxBalancedSubarrayLength(A, n):
    // INPUT
    //    A = a binary array
    //    n = the length of A
    // OUTPUT
    //    The length of the maximum balanced subarray of A

    maxLength <- 0

    for i <- 0 to n - 2:
        zeroCount <- 0
        oneCount <- 0

        for j <- i to n - 1:
            if A[j] = 0:
                zeroCount <- zeroCount + 1
            else:
                oneCount <- oneCount + 1

            if zeroCount = oneCount:
                maxLength <- max(maxLength, j - i + 1)

    return maxLength

This algorithm uses two nested loops to go through all possible subarrays of $A$ . The outer loop defines the start position, $i$ , of the subarray. The inner loop defines the end position, $j$ , of the subarray that starts at the position $i$ . In each iteration, we count the number of zeros and ones of the subarray. If we see an equal number of zeros and ones, we check the subarray size and record the maximum one.

Since we have two nested loops over the array $A$ , the overall time complexity of this algorithm is $O(n^2)$ , where $n$ is the number of elements in the array $A$ . Also, the space requirement is $O(1)$ as we only use a constant number of variables, such as $zeroCount$ and $oneCount$ .

4. Hash Table Approach

We can use a $counter$ variable to store the relative number of ones and zeros encountered so far while traversing the array. In each iteration we increase the $counter$ by $1$ if we see $1$ . Otherwise, we decrease the $counter$ by $1$ .

Suppose the $counter$ value is $m$ at position $i$ . Later, we encounter the same value at position $j$ . Then, we have the same amount of increments and decrements during the traversal. Therefore, the subarray from position $i+1$ to position $j$ contains the same number of zeros and ones. Also, the length of the corresponding subarray is $(j - i)$ .

For example, if the array is $\{0, 1, 1, 0, 1, 1, 1, 0\}$ , we can have the following $counter$ value changes:

index	-1	0	1	2	3	4	5	6	7	8
array		0	1	1	0	1	1	1	1	0
counter	0	-1	0	1	0	1	2	3	4	3

We start the $counter$ value with $0$ . If in the future, the $counter$ becomes $0$ again at position index $j$ , the subarray from beginning index $0$ till the current index $j$ is a balanced array. Also, the length of the subarray is $j+1$ . Therefore, we associate index $-1$ with $counter$ value $0$ . In this way, we can calculate the subarray length easily by subtracting two indexes.

In our example, we again encounter $0$ twice at indexes $1$ and $3$ , and the corresponding array lengths are $2$ and $4$ . Also, we see $counter$ value $1$ at index $2$ and meet it again at index $4$ . Therefore, the subarray from index $3$ to $4$ is also a balanced array with a length of $2$ . Similarly, we have another balanced array from index $7$ to $8$ as we encounter $3$ twice. Overall, the maximum balanced array length is $4$ . The corresponding subarray starts at index $0$ and ends at index $3$ , which is $\{0, 1, 1, 0\}$ .

4.1. Hash Table

Based on the above observations, we can use a hash table to store all possible $counter$ values. Each entry of a hash table contains a pair $(key, value)$ . We can use $counter$ as the key and the corresponding index position as the entry value. If we encounter the same $counter$ again, we can then use the indexes to calculate the length of the subarray:

algorithm HashTableApproach(A, n):
    // INPUT
    //    A = a binary array
    //    n = the size of array A
    // OUTPUT
    //    The length of the maximum balanced subarray of A

    hashTable <- construct an empty hash table
    maxLength <- 0
    counter <- 0
    Put (0, -1) into hashTable

    for i <- 0 to n - 1:
        if A[i] = 1:
            counter <- counter + 1
        else:
            counter <- counter - 1
        
        if hashTable contains counter as a key:
            start <- hashTable.getValue(counter)
            maxLength <- max(maxLength, i - start)
        else:
            Put (counter, i) into hashTable

    return maxLength

In this algorithm, we first initialize an empty hash table. Then, we create a $counter$ variable with the initial value of $0$ . Also, we add the hash table entry $(0, -1)$ for the $counter$ value 0. In each iteration, we increase or decrease the $counter$ value by 1 based on the array number value. Also, if we see the hash table contains the $counter$ value, we calculate the subarray size and record the maximum one.

Since we only go through the array once, the overall time complexity of this algorithm is $\mathbf{O(n)}$ . However, the maximum size of the hashtable is $O(n)$ if all the elements are either 1 or 0. Therefore, the space requirement is $O(n)$ .

5. Conclusion

In this tutorial, we showed two algorithms to find the largest balanced subarray in a binary array. The brute force approach takes $O(n^2)$ time to solve the problem with $O(1)$ space. We can also use a hash table approach to solve this problem with a better performance of $O(n)$ but with $O(n)$ space.

Core Concepts

Operating Systems

Neural Networks

Graph Theory

Latex

Full Archive

About Baeldung