
Last updated: July 25, 2023
The JavaScript Object Notation (JSON) format has been around as an IETF Informational RFC since 2006. It's widely used for storing and exchanging structured data, especially in data analytics.
In this tutorial, we’ll explore various ways to convert a JSON dataset to a table using jq. First, we’ll download a public JSON dataset. Subsequently, we’ll use options of the jq command to transform that dataset into tables with different formats.
We'll use an Ubuntu 22.04 environment for running the examples.
First, we’ll download a sample JSON file:
$ curl 'http://api.worldbank.org/v2/countries/IND/indicators/SP.POP.TOTL?format=json' -o ind.json
$ cat ind.json | jq -r '.[1][1]'
{
  "indicator": {
    "id": "SP.POP.TOTL",
    "value": "Population, total"
  },
  "country": {
    "id": "IN",
    "value": "India"
  },
  "countryiso3code": "IND",
  "date": "2021",
  "value": 1407563842,
  "unit": "",
  "obs_status": "",
  "decimal": 0
}
As illustrated above, we use the curl command to download a public JSON dataset containing the yearly population of India since 1960. The curl command takes one argument and one option:
- the URL of the World Bank API endpoint as the argument
- the -o option, which saves the response to the file ind.json

Then, we inspect the downloaded file by piping it through cat into jq. The -r option makes jq print raw output, while the filter .[1][1] selects the second element of the second top-level array, which is a single yearly record.
We’ll use this sample dataset for exploring the various options of the jq command.
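Before diving in, here's a minimal sketch of how a jq filter drills into this nested structure. It uses an inline sample shaped like the World Bank response, not the real download:

```shell
# .[1] selects the data array, .[0] its first (most recent) record;
# string interpolation then combines two fields into one line
echo '[{"page":1},[{"date":"2021","value":1407563842}]]' |
  jq -r '.[1][0] | "\(.date): \(.value)"'
```

The -r option prints the interpolated string as raw text, without JSON quoting.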
The jq syntax supports extracting data and provides the @csv directive for transforming JSON into CSV output:
$ cat ind.json |
jq -r '.[1][] | [.country.id, .date, .value] | @csv' |
head
"IN","2022",1417173173
"IN","2021",1407563842
"IN","2020",1396387127
"IN","2019",1383112050
"IN","2018",1369003306
"IN","2017",1354195680
"IN","2016",1338636340
"IN","2015",1322866505
"IN","2014",1307246509
"IN","2013",1291132063
As we can see above, the command performs several actions:
- the filter .[1][] iterates over the list of yearly records
- [.country.id, .date, .value] builds a three-element array for each record
- the @csv directive formats each array as a comma-separated row, which the -r option emits as raw text
- head limits the output to the first ten rows
In summary, we created a CSV file from a list of JSON objects.
We’ll explore adding a header row to our table. This can be helpful to better understand the dataset.
First, we’ll add a header while converting to a CSV file:
$ cat ind.json |
jq -r '["country","year","population"], (.[1][] | [.country.id, .date,.value]) | @csv' |
head
"country","year","population"
"IN","2022",1417173173
"IN","2021",1407563842
"IN","2020",1396387127
"IN","2019",1383112050
"IN","2018",1369003306
"IN","2017",1354195680
"IN","2016",1338636340
"IN","2015",1322866505
"IN","2014",1307246509
"IN","2013",1291132063
Here, the jq command prepends a custom header ["country","year","population"] to the output.
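The comma operator is what makes this work: it runs both filters on the same input and concatenates their output streams. A minimal sketch with inline data:

```shell
# The comma operator emits the header array first and the input array second;
# @csv then turns each array into one CSV row
echo '[1,2]' | jq -r '["a","b"], . | @csv'
```

The first output row is the quoted header, followed by one row per data array.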
Next, we’ll use the tab character as a separator of the fields in the generated table:
$ cat ind.json |
jq -r '["country","year","population"], (.[1][] | [.country.id, .date,.value]) | @tsv' |
head
country year population
IN 2022 1417173173
IN 2021 1407563842
IN 2020 1396387127
IN 2019 1383112050
IN 2018 1369003306
IN 2017 1354195680
IN 2016 1338636340
IN 2015 1322866505
IN 2014 1307246509
The jq command uses the @tsv directive instead of @csv to separate the fields with a tab character.
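Under the hood, @tsv simply joins the array elements with literal tab characters, escaping any tabs or newlines inside the values. A small inline example:

```shell
# Each array element becomes one tab-separated field on a single line
echo '{"a":1,"b":2}' | jq -r '[.a,.b] | @tsv'
```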
Finally, let’s add dashes under the header row based on the length of each header field:
$ cat ind.json |
jq -r '(["country","year","population"] | (., map(length*"-"))), (.[1][] | [.country.id, .date,.value]) | @tsv' |
head
country year population
------- ---- ----------
IN 2022 1417173173
IN 2021 1407563842
IN 2020 1396387127
IN 2019 1383112050
IN 2018 1369003306
IN 2017 1354195680
IN 2016 1338636340
IN 2015 1322866505
In summary, the expression (., map(length*"-")) first passes the header array through unchanged (the . filter), then maps each header field to a string of dashes of the same length. Another key point is that map(f) and [.[] | f] are equivalent, so we could have written the dash row either way.
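We can see the dash-generation trick in isolation with a minimal sketch:

```shell
# "." passes the header array through unchanged; map(length*"-")
# replaces each field with a dash string of the same length
echo '["country","year"]' | jq -r '(., map(length*"-")) | @tsv'
```

In jq, multiplying a string by a number repeats the string that many times, so length*"-" produces one dash per character of the header field.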
Likewise, we can use the column command to separate fields using the pipe character:
$ cat ind.json |
jq -r '["country","year","population"], (.[1][] | [.country.id, .date,.value]) | @tsv' |
column -t -o '|' |
head
country|year|population
IN     |2022|1417173173
IN     |2021|1407563842
IN     |2020|1396387127
IN     |2019|1383112050
IN     |2018|1369003306
IN     |2017|1354195680
IN     |2016|1338636340
IN     |2015|1322866505
IN     |2014|1307246509
As shown above, the column command takes a couple of options:
- the -t option builds a table by aligning the columns
- the -o '|' option uses the pipe character as the output separator between columns
In summary, combining the jq and column commands provides a powerful way of formatting tables.
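The column part of the pipeline works on any delimited text, not only jq output. A minimal sketch, assuming a util-linux column recent enough to support -o:

```shell
# -s sets the input separator, -t aligns the columns,
# and -o sets the output separator
printf 'name,size\nalpha,10\nbeta,2\n' | column -t -s ',' -o '|'
```

Each column is padded to the width of its widest entry, so the pipe separators line up vertically.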
At this point, we’ll use the output of the df command to build a table from the disk usage information:
$ df -P |
jq -Rs 'split("\n")[1:-1] | map(split(" +";"")) | map({FileSystem: .[0], Blocks: .[1], Used: .[2], Available: .[3]})'
[
  {
    "FileSystem": "tmpfs",
    "Blocks": "1091488",
    "Used": "1856",
    "Available": "1089632"
  },
  {
    "FileSystem": "/dev/sda3",
    "Blocks": "97977620",
    "Used": "26216284",
    "Available": "66738208"
  }
]
In this case, the jq command transforms the output of the df command into an array of JSON objects in several steps:
- the -R option reads the input as raw text, and -s slurps all of it into a single string
- split("\n")[1:-1] splits the string into lines, dropping the header line and the trailing empty line
- map(split(" +";"")) splits each line into fields using the regular expression " +", which matches runs of spaces
- the final map builds a JSON object out of the first four fields of each line
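The same -Rs recipe works on any line-oriented command output. Here's a self-contained sketch on two inline lines of text (note the slice is [:-1] rather than [1:-1], since this input has no header line):

```shell
# -R reads raw text, -s slurps it into one string; we then split into
# lines, split each line on spaces, and build one object per line
printf 'alpha 1\nbeta 2\n' |
  jq -Rs 'split("\n")[:-1] | map(split(" ")) | map({name: .[0], count: .[1]})'
```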
Next, we’ll convert the above JSON to a table:
$ df -P |
jq -Rs 'split("\n")[1:-1] | map(split(" +";"")) | map({FileSystem: .[0], Blocks: .[1], Used: .[2], Available: .[3]})' > disk.json
$ cat disk.json | jq -r '["FileSystem","Blocks","Used"], (.[] | [.FileSystem,.Blocks,.Used]) | @tsv' |
column -t -o '|'
FileSystem|Blocks   |Used
tmpfs     |1091488  |1864
/dev/sda3 |97977620 |26216476
tmpfs     |5457432  |0
tmpfs     |5120     |4
/dev/sda2 |524252   |6216
Scratch   |497775612|131487316
tmpfs     |1091484  |108
/dev/sr0  |51806    |51806
Similar to our earlier example with pipe separators, we create a pipeline with several steps:
- first, we save the generated JSON array to the file disk.json
- then, jq prepends a header row and converts each object into a three-element array
- the @tsv directive formats each array as a tab-separated row
- finally, the column command aligns the columns and separates them with the pipe character
Thus, we transformed the line-oriented output of the df command into a sequence of JSON objects. Subsequently, we converted the list of JSON objects into a table.
In this article, we learned a few ways to transform a sequence of JSON objects into a table format using jq.
Firstly, we used the sample dataset to create a CSV file from the separate JSON objects. Secondly, we enhanced the table with a header, underlined with dashes. Lastly, we saw an advanced example of converting the Linux df command output into a table.
In summary, jq is a powerful text processing tool to reshape JSON objects to meet our use cases.