1. Introduction

Since its standardization in 2013, the JavaScript Object Notation (JSON) has been invaluable as part of programming languages, REST and other application programming interface (API) output and definitions, and many end-user processes and tools. Similarly, the YAML Ain’t Markup Language (YAML) has had a long journey since 2004, becoming one of the most useful languages for defining virtual machines, containers, resources, and similar. Both JSON and YAML have multiple stand-alone interpreters and libraries that support them.

In this tutorial, we explore YAML and JSON and ways to convert between the two when processing data. First, we go over YAML. After that, we briefly refresh our knowledge about JSON. Next, we discuss applications of both formats. Then, we check tools that can understand and work with JSON and YAML. Finally, we show conversions between the formats.

We tested the code in this tutorial on Debian 12 (Bookworm) with GNU Bash 5.2.15. Unless otherwise specified, it should work in most POSIX-compliant environments.

2. YAML Ain’t Markup Language (YAML)

First appearing in 2001, YAML originally stood for Yet Another Markup Language. However, its focus on data rather than documents made it stand out among the other new markup languages of the time.

2.1. HTML and YAML

As perhaps one of the most used languages of the type, HyperText Markup Language (HTML) is a good reference point for YAML:

$ cat basic.html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <title>HTML 5 Boilerplate</title>
    <link rel="stylesheet" href="style.css">
  </head>
  <body>
      <script src="index.js"></script>
  </body>
</html>

Although there isn’t a one-to-one mapping between HTML source code and YAML, we can create a custom interpretation:

$ cat basic-html.yaml
#!DOCTYPE html
- html:
  lang: en
  subs:
  - head:
    subs:
      meta:
      - meta:
        charset: UTF-8
      - meta:
        name: viewport
        content: width=device-width, initial-scale=1.0
      - meta:
        http-equiv: X-UA-Compatible
        content: ie=edge
      title: HTML 5 Boilerplate
      link:
      - link:
        rel: stylesheet
        href: style.css
  - body:
    subs:
    - script:
      src: index.js

Notably, there are several types of YAML elements.

2.2. YAML Elements

Let’s understand the basic structure of YAML:

  • # octothorpe comments like # comment
  • - dash list elements like - el
  • [] square bracket lists with , comma-separated elements like [el1, el2]
  • {} associative arrays with , comma-separated key: value pairs like {el1: val1, el2: val2}
  • --- document separator

Around these elements, YAML mainly relies on space-based indentation (tabs aren’t allowed) and on key: value pairs, with a colon followed by a space, for structure.

2.3. Advanced Constructs

More high-level syntax includes the & prefix, which defines an anchor: a named node, part, or object that we can reuse within the same definition:

$ cat ampersand.yaml
credentials: &creds
  user: x
  pass: password

connection:
  <<: *creds
  uri: db://192.168.6.66

In this case, &creds anchors the credentials object, the * asterisk in *creds references it as an alias, and the << merge key inserts its keys within the connection object.
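As a rough illustration, the following Python sketch (standard library only) builds by hand the data that a loader with merge-key support would be expected to produce for ampersand.yaml. The variable names are ours, and no YAML parser is involved. Note that << is a YAML 1.1 convention that not every parser honors:

```python
# Hand-built equivalent of ampersand.yaml; no YAML parser is involved here.
creds = {"user": "x", "pass": "password"}  # the node anchored by &creds

document = {
    "credentials": creds,
    # '<<: *creds' copies the anchored keys into the mapping, next to
    # the keys that the mapping spells out itself (here: uri).
    "connection": {**creds, "uri": "db://192.168.6.66"},
}

print(document["connection"])
```

The point of the anchor is that user and pass are written once, yet appear in both objects.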

Further, tags that start with a ! or !! exclamation mark can cast values to specific types, avoiding ambiguity and increasing precision:

$ cat exclamation.yaml
data: !!binary eA==

In this case, we simply denote that the value of data is binary data encoded as a Base64 string. There are a number of other type names that we can use in place of binary as defined in the official types specification.
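To see what the tagged value carries, we can decode the same Base64 text with the Python standard library. This is only a sketch of what a tag-aware loader would be expected to do with the payload, not a YAML parser:

```python
import base64

# The text that follows !!binary in exclamation.yaml
encoded = "eA=="

# A loader that honors the tag would hand back the decoded bytes.
decoded = base64.b64decode(encoded)
print(decoded)
```

Here, the payload decodes to the single byte x.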

2.4. Advanced Features

Unlike other markup and data definition languages, YAML has distinguishing features like data types, tags, anchors, aliases, and others. These make it unique and much more suitable for integration within the context of many programming languages such as Python and even JavaScript.

3. JavaScript Object Notation (JSON)

Originally developed around 2001, JSON has had a long journey through many iterations until reaching its first standard version in 2013. By that time, it had become the de facto standard not only in JavaScript environments but in many other languages and application configurations as well.

3.1. HTML and JSON

Using HTML as a reference, let’s see a basic conversion:

$ cat basic.html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <title>HTML 5 Boilerplate</title>
    <link rel="stylesheet" href="style.css">
  </head>
  <body>
    <script src="index.js"></script>
  </body>
</html>
$ cat basic.json
{
  "#comment": "DOCTYPE html",
  "html": {
    "@lang": "en",
    "head": {
      "meta": [
        {
          "@charset": "UTF-8"
        },
        {
          "@name": "viewport",
          "@content": "width=device-width, initial-scale=1.0"
        },
        {
          "@http-equiv": "X-UA-Compatible",
          "@content": "ie=edge"
        }
      ],
      "title": "HTML 5 Boilerplate",
      "link": {
        "@rel": "stylesheet",
        "@href": "style.css"
      }
    },
    "body": {
      "script": {
        "@src": "index.js"
      }
    }
  }
}

Again, this is one possible way to interpret HTML. Still, we can see the basic JSON structure and its charm as a data-definition language: simplicity.

3.2. JSON Elements

There are several basic elements or data types within JSON:

  • null
  • string (supports Unicode) within "" double quotes like "string"
  • number like 0, 0.1, or 6.022E23
  • boolean as either true or false
  • array of comma-separated elements within [] square brackets like [el1, el2]
  • object of comma-separated name-value pairs within {} curly brackets like {"el1": val1, "el2": val2}, where each name is a string

In JSON, whitespace in the form of space, horizontal tab, line feed, or carriage return is only significant within double-quoted strings, but not around elements.

Notably, there are no comments in JSON.
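To make these rules concrete, here is a minimal sketch with the Python json module. The sample document and variable names are made up for illustration:

```python
import json

text = '''
{
  "nothing": null,
  "text": "string",
  "numbers": [0, 0.1, 6.022E23],
  "flag": true,
  "nested": {"el1": "val1"}
}
'''

# Map each top-level name to the Python type that the parser chose.
data = json.loads(text)
types = {key: type(value).__name__ for key, value in data.items()}
print(types)

# Whitespace around elements doesn't change the parsed result.
compact = json.loads('{"flag":true}')
spaced = json.loads(' { "flag" : true } ')
print(compact == spaced)

# Comments aren't part of the grammar, so the parser rejects them.
try:
    json.loads('{"flag": true} // comment')
    comment_accepted = True
except json.JSONDecodeError:
    comment_accepted = False
print(comment_accepted)
```

Each JSON type maps to a native Python type, while the trailing comment causes a parse error.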

4. Applications of YAML and JSON

While JSON is still primarily an API language, YAML is most often employed as a way to define and configure resources and applications.

4.1. JSON API

To understand the difference better, let’s create a minimal Python API:

$ cat minapi.py
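from flask import Flask, Response

app = Flask(__name__)

@app.route('/')
def index():
  # send the body as-is, but label it as JSON instead of the default HTML
  return Response('{ "data": "value", "metadata": "info" }', mimetype='application/json')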

Here, we employ the Flask library to create the most basic interface with a single route that returns a JSON object.
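Writing a JSON body by hand works for a tiny example, but it’s easy to get the quoting wrong. As a hedged alternative, the sketch below builds an equivalent body with the standard json module. The payload variable is our own name and isn’t part of minapi.py:

```python
import json

# Hypothetical payload that mirrors the string minapi.py returns.
payload = {"data": "value", "metadata": "info"}

# json.dumps() takes care of quoting and escaping.
body = json.dumps(payload)
print(body)
```

The resulting string differs from the hand-written one only in whitespace.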

Now, let’s create a virtual environment in the current directory, install Flask, and run the example:

$ python3 -m venv .venv
$ . .venv/bin/activate
(.venv) $ pip install Flask
[...]
(.venv) $ FLASK_APP=minapi.py flask run --host 0.0.0.0
 * Serving Flask app 'minapi.py'
 * Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:5000
 * Running on http://192.168.6.66:5000
Press CTRL+C to quit

At this point, we can access the API endpoint at http://192.168.6.66:5000 and get the expected JSON response:

$ curl http://192.168.6.66:5000
{ "data": "value", "metadata": "info" }

Now, let’s create a microservice with this example.

4.2. YAML Microservice

To convert the Flask API above to a microservice, we use a Dockerfile definition and a Docker Compose file:

$ cat Dockerfile
# syntax=docker/dockerfile:1
FROM python:3.10-alpine
WORKDIR /minapi
ENV FLASK_APP=minapi.py
RUN pip install flask
EXPOSE 5000
COPY . .
CMD ["flask", "run", "--host", "0.0.0.0"]
$ cat compose.yaml
services:
  minapi:
    build: .
    ports:
      - "5000:5000"
$ docker compose up
[+] Building 2.7s (13/13) FINISHED                                                     docker:default
 => [minapi internal] load build definition from Dockerfile                                      0.0s
 => => transferring dockerfile: 32B                                                              0.0s
 => [minapi internal] load .dockerignore                                                         0.0s
[...]
 => => naming to docker.io/library/minapi-minapi                                                 0.0s
[+] Running 1/0
 ✔ Container minapi-minapi-1  Created                                                            0.0s
Attaching to minapi-1
minapi-1  |  * Serving Flask app 'minapi.py'
minapi-1  |  * Debug mode: off
minapi-1  | WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
minapi-1  |  * Running on all addresses (0.0.0.0)
minapi-1  |  * Running on http://127.0.0.1:5000
minapi-1  |  * Running on http://172.18.0.2:5000
minapi-1  | Press CTRL+C to quit

This way, we build an image that installs Flask and copies the minapi.py file, and then run the API in a container with port 5000 published on the host.

Notably, the compose.yaml definition is in the YAML format.

Yet, even Docker Compose can use JSON instead of YAML, while some API endpoints do return YAML. Either way, we might still have to convert between the two formats.

5. JSON and YAML Tools

Before delving into specific ways to convert between JSON and YAML, let’s briefly refresh our knowledge about tools that provide this functionality.

5.1. jq

One of the foremost tools for JSON manipulation is jq:

$ apt install jq

Using our earlier output JSON as an example, let’s extract a field with jq:

$ curl http://192.168.6.66:5000
{ "data": "value", "metadata": "info" }
$ curl --silent http://192.168.6.66:5000 | jq -r '.data'
value

Thus, we get the [-r]aw string value of the data field.

Since the jq language is very mature, we can define functions that provide custom processing:

$ cat suffix.jq
def suffix(suf):
  . + suf;

Here, we [def]ine a basic suffix() function that adds the suf suffix to its input. Notably, suf is an argument, while . represents any data from the previous pipe.

Let’s see this in action, assuming suffix.jq is in the current working directory:

$ echo '
  {
    "filename": "basic",
    "path": "/home/baeldung/jqscripts"
  }' |
jq '
  import "suffix" as suffix;
  .filename | suffix::suffix(".jq")
'
"basic.jq"

First, we import the suffix module under a namespace with the same name. After that, we get the value of the filename field and pipe it to the suffix() function to append .jq to the filename.
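For comparison, the same steps can be sketched in Python with only the standard library. The suffix() helper below is our own mirror of the jq function and not part of any package:

```python
import json

def suffix(value, suf):
    """Mirror of the jq suffix() function: append suf to the input."""
    return value + suf

document = json.loads('{"filename": "basic", "path": "/home/baeldung/jqscripts"}')

# Same idea as: .filename | suffix::suffix(".jq")
result = suffix(document["filename"], ".jq")

# jq prints JSON by default, hence the double quotes in its output.
print(json.dumps(result))
```

In jq, the input arrives implicitly through the pipe as ., whereas the Python version has to pass it as an explicit argument.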

Normally, modules are [import]ed from several main directories:

  • current working directory
  • $HOME/.jq/
  • $ORIGIN/../lib/jq/
  • $ORIGIN/../lib/

Here, $ORIGIN is the directory that contains the jq executable. Further, -L can supply custom search paths, which then take the place of the built-in list.

5.2. yq

Based on the success and code of jq, yq is a tool for both JSON and YAML processing:

$ apt install yq

In fact, yq is a wrapper around jq that adds extra functionality. This means we can use both commands interchangeably for the subset that the latter supports.

Since it’s based on Python, we can also install yq via pip, as long as we have jq.

5.3. Python

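In the Python programming language, the json module is part of the standard library, while the yaml module comes from the third-party PyYAML package, which we may have to install separately. Together, they provide the ability to process data in the respective formats: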

$ echo '---' |
  python3 -c 'import sys, yaml; yaml.safe_load(sys.stdin.read())'
$ echo '{ }' |
  python3 -c 'import sys, json; json.loads(sys.stdin.read())'

Here, we use the safe_load() method of the yaml module and the loads() method of json to attempt to safely read the respective format as piped in from the echo command. In case of issues, we see a raw Python exception.
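Instead of letting the raw exception surface, a script can catch it and report a simple verdict. The sketch below does so for JSON with the standard library only. The is_valid_json() name is hypothetical, and the same pattern should apply to yaml.safe_load() with yaml.YAMLError, assuming PyYAML is installed:

```python
import json

def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True

print(is_valid_json('{ }'))
print(is_valid_json('{ "data": '))
```

The first call succeeds, while the second fails because the object is truncated.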

5.4. gojq

Unlike yq, which is just a wrapper around jq, gojq is a separate Go implementation of jq that has YAML support built in. This way, we can again use the established syntax of jq, but without additional installations for YAML.

While there are other means to process both formats, the Linux shell is usually best equipped with the ones above.

6. Convert JSON API Response to YAML

Commonly, JSON to YAML conversions happen in a shell pipeline after an automated HTTP(S) request, usually via curl.

6.1. Custom jq YAML Function

Since jq doesn’t include YAML support directly, we can implement it as a custom function:

$ cat json2yaml.jq
def json2yaml:
  (objects | to_entries[] | (.value | type) as $type |
    if $type == "array" then
      "\(.key):", (.value | json2yaml)
    elif $type == "object" then
      "\(.key):", " \(.value | json2yaml)"
    else
      "\(.key): \(.value)"
    end
  )
  // (arrays | select(length > 0)[] | [json2yaml] |
      "  - \(.[0])", " \(.[1:][])"
  )
  // .
  ;

Notably, // is the alternative operator: it uses its right-hand operand only if the left-hand one produces no result other than null or false, which includes producing no result at all.
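Since this operator drives the whole function, a small model of its behavior may help. The Python sketch below treats each operand as a list of outputs, with None and False standing in for null and false. It’s a simplification of our own that ignores error handling, not jq’s actual implementation:

```python
def alternative(left, right):
    """Model jq's 'left // right' over lists of outputs."""
    # None and False stand in for jq's null and false.
    kept = [value for value in left if value is not None and value is not False]
    # Fall back to the right-hand outputs only when nothing was kept.
    return kept if kept else list(right)

print(alternative([None, False], ["fallback"]))
print(alternative([0, None], ["fallback"]))
print(alternative([], ["fallback"]))
```

Notably, 0 counts as a regular value here, just like in jq, so only the first and last calls fall back to the right-hand side.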

First, objects | to_entries[] | (.value | type) as $type extracts the $type of each value within the current object, so we can check it against array and object. For an array, we print the key and pass the value recursively to json2yaml. For an object, we do the same, but also indent the nested output. If $type is neither, we structure the key and value according to the YAML syntax.

In case the input isn’t an object, the first branch produces no output, so we get all non-empty arrays and process their elements one by one recursively via json2yaml.

If neither branch returns a result other than null or false, the function just returns what’s passed to it.

Once we have the definition, we can leverage that by importing it as a module:

$ printf '{
  "#comment": "DOCTYPE html",
  "html": {
    "@lang": "en",
    "head": {
      "meta": [
        {
          "@charset": "UTF-8"
        },
        {
          "@name": "viewport",
          "@content": "width=device-width, initial-scale=1.0"
        },
        {
          "@http-equiv": "X-UA-Compatible",
          "@content": "ie=edge"
        }
      ],
      "title": "HTML 5 Boilerplate",
      "link": {
        "@rel": "stylesheet",
        "@href": "style.css"
      }
    },
    "body": {
      "script": {
        "@src": "index.js"
      }
    }
  }
}' | jq -r 'import "json2yaml" as json2yaml; json2yaml::json2yaml'
#comment: DOCTYPE html
html:
    @lang: en
    head:
        meta:
          - @charset: UTF-8
          - @name: viewport
            @content: width=device-width, initial-scale=1.0
          - @http-equiv: X-UA-Compatible
            @content: ie=edge
        title:  HTML 5 Boilerplate
        link:
            @rel: stylesheet
            @href: style.css
    body:
        script:
            @src: index.js

Notably, we employ the -r switch to output only raw strings.

In general, this approach might be more tedious to set up and run. Further, it can miss some special cases such as the leading @ at sign and # octothorpe above: the former is a reserved indicator in YAML, while the latter starts a comment, so both keys require quoting to form valid YAML.

However, the benefit of having a function is that it enables potentially simpler further customization.
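To show how such customization could look outside of jq, here is a comparable minimal emitter in Python with only the standard library. It’s a sketch under our own assumptions rather than a full YAML serializer: the to_yaml_lines() name is hypothetical, and every key and scalar is written with JSON quoting, which sidesteps the special-character problem at the cost of noisier output:

```python
import json

def to_yaml_lines(value, indent=0):
    """Yield YAML lines for a value built from dicts, lists, and scalars."""
    pad = "  " * indent
    if isinstance(value, dict) and value:
        for key, item in value.items():
            # JSON-style quoting keeps characters such as '@' and '#'
            # from being read as YAML indicators.
            name = json.dumps(key)
            if isinstance(item, (dict, list)) and item:
                yield f"{pad}{name}:"
                yield from to_yaml_lines(item, indent + 1)
            else:
                yield f"{pad}{name}: {json.dumps(item)}"
    elif isinstance(value, list) and value:
        for item in value:
            if isinstance(item, (dict, list)) and item:
                nested = list(to_yaml_lines(item, indent + 1))
                # The dash takes the place of the first line's extra indent.
                yield f"{pad}- {nested[0].lstrip()}"
                yield from nested[1:]
            else:
                yield f"{pad}- {json.dumps(item)}"
    else:
        # Scalars and empty containers use their JSON form.
        yield f"{pad}{json.dumps(value)}"

sample = json.loads('{"html": {"@lang": "en", "meta": [{"@charset": "UTF-8"}]}}')
print("\n".join(to_yaml_lines(sample)))
```

Because the function is a generator, callers can stream the lines or join them, and rules such as the quoting style live in one place.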

6.2. yq

When it comes to the purpose-built wrapper yq, we can directly use the --yaml-output (--yml-output, -y) flag for conversion to YAML:

$ printf '{ "filename": "basic", "path": "/home/baeldung/jqscripts" }' |
  yq --yaml-output
filename: basic
path: /home/baeldung/jqscripts

Let’s check the output with the earlier example:

$ printf '{
  "#comment": "DOCTYPE html",
  "html": {
    "@lang": "en",
    "head": {
      "meta": [
        {
          "@charset": "UTF-8"
        },
        {
          "@name": "viewport",
          "@content": "width=device-width, initial-scale=1.0"
        },
        {
          "@http-equiv": "X-UA-Compatible",
          "@content": "ie=edge"
        }
      ],
      "title": "HTML 5 Boilerplate",
      "link": {
        "@rel": "stylesheet",
        "@href": "style.css"
      }
    },
    "body": {
      "script": {
        "@src": "index.js"
      }
    }
  }
}' | yq --yaml-output
'#comment': DOCTYPE html
html:
  '@lang': en
  head:
    meta:
      - '@charset': UTF-8
      - '@name': viewport
        '@content': width=device-width, initial-scale=1.0
      - '@http-equiv': X-UA-Compatible
        '@content': ie=edge
    title: HTML 5 Boilerplate
    link:
      '@rel': stylesheet
      '@href': style.css
  body:
    script:
      '@src': index.js

Here, we see some more sophistication in the output in the form of single quotes that prevent problems with special characters.

6.3. Python

The Python yaml and json libraries usually make quick work of conversions:

$ printf '{
  "#comment": "DOCTYPE html",
  "html": {
    "@lang": "en",
    "head": {
      "meta": [
        {
          "@charset": "UTF-8"
        },
        {
          "@name": "viewport",
          "@content": "width=device-width, initial-scale=1.0"
        },
        {
          "@http-equiv": "X-UA-Compatible",
          "@content": "ie=edge"
        }
      ],
      "title": "HTML 5 Boilerplate",
      "link": {
        "@rel": "stylesheet",
        "@href": "style.css"
      }
    },
    "body": {
      "script": {
        "@src": "index.js"
      }
    }
  }
}' | python3 -c 'import sys, yaml, json; print(yaml.dump(json.loads(sys.stdin.read())))'
'#comment': DOCTYPE html
html:
  '@lang': en
  body:
    script:
      '@src': index.js
  head:
    link:
      '@href': style.css
      '@rel': stylesheet
    meta:
    - '@charset': UTF-8
    - '@content': width=device-width, initial-scale=1.0
      '@name': viewport
    - '@content': ie=edge
      '@http-equiv': X-UA-Compatible
    title: HTML 5 Boilerplate

In summary, we print() the result of yaml.dump() as applied to the object that json.loads() parses from the input. Notably, yaml.dump() sorts keys alphabetically by default, which is why body now precedes head.

6.4. gojq

Similarly, gojq is well-equipped to handle the conversion, since it implements native YAML support:

$ printf '{
  "#comment": "DOCTYPE html",
  "html": {
    "@lang": "en",
    "head": {
      "meta": [
        {
          "@charset": "UTF-8"
        },
        {
          "@name": "viewport",
          "@content": "width=device-width, initial-scale=1.0"
        },
        {
          "@http-equiv": "X-UA-Compatible",
          "@content": "ie=edge"
        }
      ],
      "title": "HTML 5 Boilerplate",
      "link": {
        "@rel": "stylesheet",
        "@href": "style.css"
      }
    },
    "body": {
      "script": {
        "@src": "index.js"
      }
    }
  }
}' | gojq --yaml-output
'#comment': DOCTYPE html
html:
  '@lang': en
  body:
    script:
      '@src': index.js
  head:
    link:
      '@href': style.css
      '@rel': stylesheet
    meta:
      - '@charset': UTF-8
      - '@content': width=device-width, initial-scale=1.0
        '@name': viewport
      - '@content': ie=edge
        '@http-equiv': X-UA-Compatible
    title: HTML 5 Boilerplate

As expected, the --yaml-output switch works like the same flag of yq.

7. Convert YAML Definition to JSON

Now, let’s see how to take a simple YAML definition and convert it to JSON.

7.1. yq

Since yq is effectively jq that understands YAML, we don’t need any flags to convert from YAML to JSON:

$ printf '
services:
  minapi:
    build: .
    ports:
      - "5000:5000"
' | yq
{
  "services": {
    "minapi": {
      "build": ".",
      "ports": [
        "5000:5000"
      ]
    }
  }
}

As expected, we have the same definition in a different format.

7.2. Python

If we switch the sequence of methods in the Python code for JSON-to-YAML conversion, we can achieve the reverse:

$ printf '
services:
  minapi:
    build: .
    ports:
      - "5000:5000"
' | python3 -c 'import sys, yaml, json; print(json.dumps(yaml.safe_load(sys.stdin.read()), indent=2, sort_keys=False))'
{
  "services": {
    "minapi": {
      "build": ".",
      "ports": [
        "5000:5000"
      ]
    }
  }
}

In this case, we also add some formatting and indentation via json.dumps().
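To isolate the effect of those arguments, the sketch below serializes a hand-built copy of the same structure twice with the standard json module. No YAML parsing is involved, and the variable names are ours:

```python
import json

# Hand-built equivalent of the parsed compose.yaml content.
data = {"services": {"minapi": {"build": ".", "ports": ["5000:5000"]}}}

# Default output: everything on a single line.
compact = json.dumps(data)

# indent=2 pretty-prints, while sort_keys=False keeps the input key order.
pretty = json.dumps(data, indent=2, sort_keys=False)

print(compact)
print(pretty)
```

Both strings parse back to the same data, so the choice only affects readability.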

The result is the same as that from yq. This is expected since yq is implemented in Python.

Of course, jq and gojq can be part of the same conversion as well. However, jq has no YAML parser, so it needs an external conversion step, while gojq can read YAML directly via its --yaml-input flag.

8. Summary

In this article, we discussed YAML and JSON, what usually generates either format, and ways to convert between them in common scenarios.

In conclusion, both YAML and JSON are versatile and convenient formats with slightly different applications, but we can usually convert one to the other with similar tools.
