1. Overview
There are times when we want to compress data as we read it from stdin, without first writing it to a file. One reason is to save disk space.
In this short tutorial, we’ll discuss ways to compress data read from stdin with various compression tools.
2. Introduction to Reading Data from stdin
2.1. Compressing Files by Giving Filenames to Compressor Applications
Usually, we provide filenames as arguments for compressor applications. For example, to compress a file with zip:
$ zip archive.zip file1
Here, file1 is the name of the file to compress, and archive.zip is the archive that zip creates.
However, there may be times when we want to read data from stdin and compress it straight away without first writing our data to a file.
To understand how to do this, let’s review how we read data from stdin.
2.2. Reading Data from stdin
We don’t have to write data to a file before compressing it. This is useful when we compress the output of a command that produces a lot of data in a pipeline. For example, mysqldump writes a database backup as SQL statements to stdout. Most of the time, we want to compress that output directly rather than store the uncompressed version in a temporary file.
To see how a command reads from stdin, let’s use rev. The rev command reverses the characters of each line. It accepts a filename as an argument:
$ echo hello > /tmp/hello.txt
$ cat /tmp/hello.txt
hello
$ rev /tmp/hello.txt
olleh
But rev can also accept data from stdin with a pipe:
$ cat /tmp/hello.txt | rev
olleh
We can also read data from stdin by typing it in the console. First, we run rev without any arguments:
$ rev
Then we type hello and press the Enter key:
$ rev
hello
olleh
The rev command read the data that we typed manually from stdin and then processed it. We can press Ctrl+D to signal the end of the input.
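Besides a pipe and the keyboard, stdin can also be fed with input redirection (<). The following is a minimal sketch of that; the scratch directory and the variable names are our own choices for illustration:

```shell
# Scratch directory and sample file (names chosen for this sketch).
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/hello.txt"

# 1. stdin fed by a pipe from another command.
piped=$(cat "$workdir/hello.txt" | rev)

# 2. stdin fed by input redirection, without an extra cat process.
redirected=$(rev < "$workdir/hello.txt")

echo "$piped"
echo "$redirected"
```

In both cases, rev receives no filename and reads the data from stdin, so both lines of output should be the reversed string.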
Let’s now look at how to compress data from stdin with gzip, xz, bzip2, 7z, and zip.
3. Compressing Data from stdin with gzip
We can compress data using gzip with a pipe and >:
$ cat /tmp/hello.txt | gzip > hello.gz
Here, we first printed the content of /tmp/hello.txt using cat. Then, we sent that data to the gzip command with |. Next, gzip compressed the data from the pipe and wrote the result to stdout. Finally, we redirected stdout into hello.gz using >.
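Coming back to the database backup scenario, the same pattern works for any command that writes to stdout. Here is a rough sketch, where generate_dump is a hypothetical stand-in for a real producer such as mysqldump, and the scratch directory and file name are assumptions made for the example:

```shell
# Hypothetical stand-in for a real producer such as mysqldump:
# it only prints a few SQL-like lines to stdout.
generate_dump() {
    i=1
    while [ "$i" -le 100 ]; do
        echo "INSERT INTO demo VALUES ($i, 'row $i');"
        i=$((i + 1))
    done
}

workdir=$(mktemp -d)   # scratch directory for this sketch

# Stream the producer's stdout straight into gzip.
# No uncompressed temporary file is written.
generate_dump | gzip > "$workdir/dump.sql.gz"

# -t tests the integrity of the compressed file.
gzip -t "$workdir/dump.sql.gz" && echo "archive OK"

# -d decompresses and -c writes to stdout, leaving the .gz file in place.
lines=$(gzip -dc "$workdir/dump.sql.gz" | wc -l)
echo "decompressed lines: $lines"
```

Counting the decompressed lines is a quick way to check that nothing was lost on the way through the pipe.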
4. Compressing Data from stdin with xz
We can compress data using xz with a pipe and >:
$ cat /tmp/hello.txt | xz > hello.xz
This works the same as our previous example.
5. Compressing Data from stdin with bzip2
We can compress data using bzip2 with a pipe and >:
$ cat /tmp/hello.txt | bzip2 > hello.bz2
This also works the same as our previous examples.
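Since gzip, xz, and bzip2 all behave as filters here, we can check the round trip for each of them with one loop. This is only a sketch: the scratch directory and output file names are our own, and any tool that isn’t installed is skipped:

```shell
workdir=$(mktemp -d)   # scratch directory for this sketch
printf 'hello\n' > "$workdir/hello.txt"

for tool in gzip xz bzip2; do
    # Skip any compressor that is not installed on this machine.
    command -v "$tool" >/dev/null 2>&1 || continue

    # Compress from stdin; the tool writes the compressed stream to stdout.
    cat "$workdir/hello.txt" | "$tool" > "$workdir/hello.$tool"

    # -d decompresses and -c writes to stdout; the input comes from stdin again.
    echo "$tool: $("$tool" -dc < "$workdir/hello.$tool")"
done
```

Each installed tool should print its name followed by the original text. Decompressing from stdin as well means the tools never need to look at the file name or its suffix.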
6. Compressing Data from stdin with 7z
We can compress data using 7z with a pipe. However, we also need some additional command-line options:
$ cat /tmp/hello.txt | 7z a -si hello.7z
We printed the content of /tmp/hello.txt and sent the data to the 7z command with a pipe. The a command tells 7z to add data to an archive, and the -si option tells it to read that data from stdin. Unlike the previous tools, 7z takes the output archive, hello.7z, as an argument, so we don’t need >.
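To confirm that the archive really holds the data from stdin, we can extract it back to stdout with the -so option. The sketch below assumes the 7z binary is on the PATH and uses a scratch directory of our own; it simply skips the demo otherwise:

```shell
workdir=$(mktemp -d)   # scratch directory for this sketch
printf 'hello\n' > "$workdir/hello.txt"

if command -v 7z >/dev/null 2>&1; then
    # a   : add data to an archive
    # -si : read the data to add from stdin
    cat "$workdir/hello.txt" | 7z a -si "$workdir/hello.7z" >/dev/null

    # x   : extract
    # -so : write the extracted data to stdout
    restored=$(7z x -so "$workdir/hello.7z" 2>/dev/null)
    echo "$restored"
else
    echo "7z is not installed; skipping" >&2
fi
```

If the round trip works, the command prints the original text.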
7. Compressing Data from stdin with zip
zip requires us to use a dash “-” as the input filename to indicate that it should read from stdin:
$ cat /tmp/hello.txt | zip hello.zip -
We printed the content of /tmp/hello.txt and sent the data to the zip command with a pipe. The dash “-” told zip to read the data from stdin, and zip compressed it into hello.zip.
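Because the data has no filename, zip stores it in the archive under the entry name “-”. As a sketch, assuming both zip and unzip are installed and using a scratch directory of our own, we can list the archive and print the entry back to stdout:

```shell
workdir=$(mktemp -d)   # scratch directory for this sketch
printf 'hello\n' > "$workdir/hello.txt"

if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    # The trailing "-" tells zip to read the data from stdin.
    # -q suppresses zip's progress messages.
    cat "$workdir/hello.txt" | zip -q "$workdir/hello.zip" -

    # List the archive; the stdin data appears as an entry named "-".
    unzip -l "$workdir/hello.zip"

    # -p extracts the entry contents to stdout.
    restored=$(unzip -p "$workdir/hello.zip")
    echo "$restored"
else
    echo "zip or unzip is not installed; skipping" >&2
fi
```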
8. Conclusion
In this article, we saw how different compression tools allow us to compress data from stdin. We covered gzip, xz, bzip2, 7z, and zip.
We saw that gzip, xz, and bzip2 read from a pipe and write the compressed data to stdout, which we redirect to a file with >. In contrast, 7z and zip take the output file as an argument and need an extra option (-si) or a special filename (-) to tell them to read from stdin.