1. Introduction

An array is a collection of items, each identified by a unique key. When the items are themselves arrays, we have a two-dimensional (2D) array.

In this tutorial, we’ll delve into the concept of 2D arrays in Bash under Linux. More specifically, we begin with the general structure of different array types. We then discuss how Bash deals with that structure. Next, we tackle different simulations of 2D arrays. In the end, we conclude with some general caveats.

We tested the code in this tutorial on Debian 10.10 (Buster) with GNU Bash 5.0.3. It is POSIX-compliant and should work in any such environment.

2. Arrays

A regular one-dimensional (1D) array is just a row or column of data with indices:

[0][1][2][3]
(a)(b)(c)(d)

Here, each item (element) is in parentheses. Above it is the unique index (key) in square brackets.

To access a specific element, we use its index and the base. The base is the name of the array. For example, item 0 of our array is equal to a.

Items can be of any supported data type.

2.1. Two-dimensional Arrays

We can create a 2D array like so:

   [0][1][2][3]
[0](a)(b)(c)(d)
[1](A)(B)(C)(D)

Basically, item 0 from the original array is now column aA – an array. We access that array’s elements with a second index. For example, item [0,1] is A. The situation is similar for the other items.

The lengths of rows and columns in 2D arrays are often fixed constants.

2.2. Hybrid Arrays

With fixed array dimensions, we can transform the 2D array from the previous section:

[0][1][2][3][4][5][6][7]
(a)(b)(c)(d)(A)(B)(C)(D)

We mapped the 2D array with 2 rows and 4 columns into a 1D list of 8 (2*4) items. To calculate an element’s index, we use the formula (COLUMNS*row+col). It “skips” row rows of COLUMNS elements. After that, it gets the col element of the current row.

Importantly, indices are also known as keys. In some programming languages, the key’s data type can vary like the item’s data type.

Almost all programming languages contain an array data type. Indeed, there is a way to declare an array in most weakly typed languages. Bash is no exception.

Array items in Bash can be anything… except arrays. How then do we create 2D arrays in Bash?

4. 2D Bash Arrays

For our purposes, we’ll minimize the definition of a 2D array. In this article, it is any array we can consistently index using two separate values.

Since Bash does not support 2D arrays out of the box, we must come up with our own way to use them. Let’s explore common implementations.

4.1. Item 2D Array Simulation

In Bash, arrays can be distinguished from strings only with separators. One of the simplest ways to have arrays as items is to convert them from strings on the spot:

$ sep=','
$ declare -a alpha=()
$ alpha+=("a${sep}b")
$ alpha+=("c${sep}d")
$ row=0
$ col=1
$ IFS="$sep" read -ra alpharow < <(printf '%s' "${alpha[$row]}")
$ echo "${alpharow[$col]}"
b

Initially, we choose a comma as a column separator $sep. Next, the array is defined, appending row by row. In this case, the first row is “a,b”, and the second – “c,d”.

Structurally, the IFS (Internal Field Separator) special shell variable comes into play. Its default value contains space, tab, and newline. IFS allows built-ins like read to delimit array items.

Finally, the above snippet uses $row and $col to index the main array, albeit in two steps. First, the whole row $row is split on $sep into a new array $alpharow. After that, $alpharow is indexed with column $col, which returns the requested element. For constant arrays, this is quite enough.

Obviously, we can’t directly use the separator $sep as a normal character. The effects of this can be minimized in multiple ways:

  • use rare or escape characters as separators
  • double/multiply or otherwise augment each actual non-separating separator character, processing accordingly
  • use a combination of characters as a separator

Next, we’ll discuss an implementation that does not suffer from this particular drawback.

4.2. Associative 2D Array Simulation

Here’s an example of an associative array in Bash:

$ declare -A alpha=([0,0]=a [0,1]=b [1,0]=c [1,1]=d)
$ echo "${alpha[@]}"
c d b a

Notice how we give the appearance of two indices. Therefore, we could theoretically imagine the array looking like this:

   [0][1]
[0](a)(b)
[1](c)(d)

In reality, these indices are just string keys:

$ for k in "${!alpha[@]}"; do
  printf "alpha[%s]=%s\n" "$k" "${alpha[$k]}"
done
alpha[1,0]=c
alpha[1,1]=d
alpha[0,1]=b
alpha[0,0]=a
$ echo "${alpha["1,0"]}"
c

We can generate the keys in a loop or calculate them just like numeric indices.

Even minor deviations, however, will lead to issues:

$ for (( col=0; col<2; col++ )); do
  printf "alpha["1,%s"]=%s\n" "$col" "${alpha["1,$col"]}"
done
alpha[1,0]=c
alpha[1,1]=d
$ for (( col=0; col<2; col++ )); do
  printf "alpha["1,%s"]=%s\n" "$col" "${alpha["1, $col"]}"
done
alpha[1,0]=
alpha[1,1]=

The difference between these two loops is very subtle – the space between the comma and $col.

When managing an already complex structure, such intricacies do not help. For this reason, a simpler and more universal solution exists.

4.3. Hybrid 2D Array Simulation

We can declare the above structure as a regular array:

$ ROWS=2
$ COLUMNS=2
$ declare -a alpha=()
$ alpha+=(a b)
$ alpha+=(c d)
$ echo "${alpha[@]}"
a b c d

So far, we have a normal 1D array with known exact dimensions (COLUMNS, ROWS). To fully “convert” it to 2D, the index calculation also needs to change:

$ row=0
$ col=1
$ index=$((COLUMNS*$row+$col))
$ echo "${alpha[$index]}"

We already explored the formula above. Unlike the strings in associative arrays, indices here remain numeric. This is an obvious advantage, as numeric operation syntax is not too sensitive. Spaces, leading zeroes, and other such artifacts don’t trip up calculations.

There are two conditions for the one-dimensional 2D array emulation to work:

  • the array should be mappable to an exact “rectangle”, i.e. COLUMNS*ROWS = ELEMENTS
  • we know the exact dimensions of the array in advance

Overcoming these limitations requires sacrificing some freedom. For that, we can combine two of the methods already discussed.

4.4. Complex 2D Array Implementation

We can potentially give up two characters for separators. For example comma – for rows, newline – for columns. Under these conditions, the whole 2D array can initially be a string:

$ alpharaw='a,b,c,d
e,f,g
h,i,,k
l,,n,o'

Next, we use readarray to separate the rows from the “array to be” string $alpharaw into a new array:

$ readarray -t -d $'\n' alpharows < <(printf '%s' "$alpharaw")

Finally, we create Bash functions for reading, writing, deleting and inserting elements. We’ll only provide an example for a read function. It uses the concepts discussed in the sections above:

$ get_element () {
  alpharaw="$1"
  row="$2"
  col="$3"
  sep="${4:-,}"
  local alpharow alpharows
  IFS=$'\n' readarray -t alpharows < <(printf '%s' "$alpharaw")
  if [[ $row -ge ${#alpharows[@]} ]]; then
    echo "Bad row."
    return
  fi
  IFS="$sep" read -ra alpharow < <(printf '%s' "${alpharows[$row]}")
  if [[ $col -ge ${#alpharow[@]} ]]; then
    echo "Bad column."
    return
  fi
  echo "${alpharow[$col]}"
}
$ get_element "$alpharaw" 3 2
n

While the implementation is more complex, its usage is much more robust. There is much less potential for issues.

5. Caveats

We already mentioned the drawbacks of the implementations in the previous section. For brevity, we enumerate the main ones:

  • implementation complexity
  • performance issues
  • sparse arrays
  • complex and fragile syntax
  • separator escaping

Of course, this list is not complete, as 2D arrays are complex structures.

When choosing an implementation, we must consider its pros and cons for our needs.

6. Summary

In this tutorial, we discussed 2D arrays under Linux.

First, arrays and their types were defined in general and in Bash. Additionally, we explored different Bash implementations for 2D arrays, along with their downsides. Finally, we listed some general caveats.

Despite Bash’s lack of support for 2D arrays, there are ways to achieve a similar effect. However, none of the implementations are entirely robust.

Comments are open for 30 days after publishing a post. For any issues past this date, use the Contact form on the site.