Compress Files Using Data from stdin

1. Overview

Sometimes we want to compress data as it arrives on stdin, without first writing it to a file. One reason is to save disk space.

In this short tutorial, we’ll discuss ways to compress files with various compressor applications while reading data from stdin.

2. Introduction to Reading Data from stdin

We can compress data by using a variety of compressor applications, like gzip, bzip2, zip, and so on. These tools can read their input either from files or directly from stdin.

2.1. Compressing Files by Giving Filenames to Compressor Applications

Usually, we provide filenames as arguments for compressor applications. For example, to compress a file with zip:

$ zip archive.zip file1

Here, file1 is the name of the file to compress, and archive.zip is the archive that zip creates.

However, there may be times when we want to read data from stdin and compress it straight away without first writing our data to a file.

To understand how to do this, let’s review how we read data from stdin.

2.2. Reading Data from stdin

We don’t have to write the data to files first before compressing it. This is useful when compressing the output of a program that produces a lot of data in a pipeline. For example, mysqldump writes a database backup as SQL statements to stdout. Most of the time, we want to compress that output directly rather than store the uncompressed version in a temporary file.

We can feed data to a command’s stdin with a pipe. Some commands also accept a dash “-” in place of a filename to mean stdin.

The rev command reverses the characters of each line. It accepts a filename as an argument:

$ echo hello > /tmp/hello.txt 
$ cat /tmp/hello.txt 
hello
$ rev /tmp/hello.txt 
olleh

But rev can also accept data from stdin with a pipe:

$ cat /tmp/hello.txt | rev
olleh

We can also read data from stdin by typing data in the console:

$ rev

Then we type hello and press the Enter key:

$ rev
hello
olleh

The rev command read the line we typed on stdin and printed it reversed. To end the input, we press Ctrl+D.
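To make the three input styles concrete, here is a minimal sketch that uses cat instead of rev, on the assumption that cat is available everywhere. The temporary directory and the variable names are only illustrative.

```shell
# Work in a throwaway directory so nothing is left behind.
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/hello.txt"

# 1. Filename argument: cat opens the file itself.
from_file=$(cat "$workdir/hello.txt")

# 2. Pipe: the shell connects printf's stdout to cat's stdin.
from_pipe=$(printf 'hello\n' | cat)

# 3. Dash: cat treats "-" as "read stdin here", so it can be mixed with real files.
mixed=$(printf 'world\n' | cat "$workdir/hello.txt" -)

echo "$from_file"
echo "$from_pipe"
echo "$mixed"

rm -r "$workdir"
```

The third form prints the file first and the piped line second, because cat processes its arguments in order.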

Let’s now look at how to compress data from stdin with gzip, xz, bzip2, 7z, and zip.

3. Compressing Data from stdin with gzip

We can compress data using gzip with a pipe and >:

$ cat /tmp/hello.txt | gzip > hello.gz

Here, cat writes the content of /tmp/hello.txt to stdout, and the pipe (|) connects that output to the stdin of gzip. Since we gave gzip no filename, it compresses what it reads from stdin and writes the result to stdout. Finally, > redirects that output into hello.gz.
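The same pattern works for a producer that generates far more data than a one-line file. The sketch below uses seq purely as a stand-in for a tool like mysqldump, since we can't assume a database is available. The file name numbers.gz is hypothetical.

```shell
workdir=$(mktemp -d)

# seq is only a stand-in for a real producer such as mysqldump:
# it writes 100000 lines to stdout and never creates a file itself.
seq 1 100000 | gzip > "$workdir/numbers.gz"

# Decompress to stdout (-d -c) and count the lines to confirm nothing was lost.
line_count=$(($(gzip -dc "$workdir/numbers.gz" | wc -l)))

# Compare the compressed size with the size of the raw stream.
raw_bytes=$(($(seq 1 100000 | wc -c)))
gz_bytes=$(($(wc -c < "$workdir/numbers.gz")))

echo "lines: $line_count, raw: $raw_bytes bytes, compressed: $gz_bytes bytes"

rm -r "$workdir"
```

Only the compressed file ever reaches the disk, which is the space saving we were after.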

4. Compressing Data from stdin with xz

We can compress data using xz with a pipe and >:

$ cat /tmp/hello.txt | xz > hello.xz

This works the same as our previous example.

5. Compressing Data from stdin with bzip2

We can compress data using bzip2 with a pipe and >:

$ cat /tmp/hello.txt | bzip2 > hello.bz2

This also works the same as our previous examples.
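Because gzip, xz, and bzip2 share the same stdin-to-stdout behavior, and all three accept -d (decompress) and -c (write to stdout), we can check a round trip for each with one loop. This is a rough sketch that skips any tool that isn't installed. The results variable exists only to collect what each tool returned.

```shell
workdir=$(mktemp -d)
results=""

for tool in gzip xz bzip2; do
    # Skip any compressor that isn't installed on this machine.
    command -v "$tool" > /dev/null 2>&1 || continue

    # Compress from stdin, then decompress back to stdout with -dc.
    printf 'hello\n' | "$tool" > "$workdir/hello.$tool"
    restored=$("$tool" -dc "$workdir/hello.$tool")

    echo "$tool: $restored"
    results="$results$tool=$restored;"
done

rm -r "$workdir"
```

Each installed tool should print the original text back, which confirms that the compressed file holds exactly what arrived on stdin.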

6. Compressing Data from stdin with 7z

We can compress data using 7z with a pipe. However, we also need some additional command-line options:

$ cat /tmp/hello.txt | 7z a -si hello.7z

Here, cat writes the content of /tmp/hello.txt to the pipe, and 7z reads it. The a command tells 7z to add data to an archive, and the -si option tells it to read that data from stdin. Unlike the previous tools, 7z takes the archive name, hello.7z, as an argument, so we don’t need >.
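Since data from stdin has no filename of its own, it can help to give the archive entry one. The sketch below assumes a 7z build that supports the -si{name} form of the option and the -so option for writing extracted data to stdout. The entry name hello.txt is our own choice.

```shell
if command -v 7z > /dev/null 2>&1; then
    workdir=$(mktemp -d)

    # -sihello.txt reads stdin and stores it under the name hello.txt (our choice).
    printf 'hello\n' | 7z a -sihello.txt "$workdir/hello.7z" > /dev/null

    # e extracts, and -so sends the extracted data to stdout instead of a file.
    restored=$(7z e -so "$workdir/hello.7z" 2> /dev/null)
    echo "$restored"

    rm -r "$workdir"
else
    echo "7z is not installed; skipping" >&2
fi
```

Reading the entry back through -so lets us verify the archive without writing the uncompressed data to disk either.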

7. Compressing Data from stdin with zip

zip requires us to use a dash “-” as the input filename to indicate that it should read from stdin:

$ cat /tmp/hello.txt | zip hello.zip -

Here, cat writes the content of /tmp/hello.txt to the pipe. The dash “-” takes the place of an input filename, so zip reads the data from stdin and stores it in hello.zip.
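To confirm what ended up in the archive, we can read it back with unzip. This is a small sketch, assuming the Info-ZIP versions of zip and unzip are installed. The -q option only silences zip's progress messages, and unzip -p writes the archived data to stdout.

```shell
if command -v zip > /dev/null 2>&1 && command -v unzip > /dev/null 2>&1; then
    workdir=$(mktemp -d)

    # The trailing "-" tells zip to take its input from stdin.
    printf 'hello\n' | zip -q "$workdir/hello.zip" -

    # -p extracts the archive's content to stdout without creating files.
    restored=$(unzip -p "$workdir/hello.zip")
    echo "$restored"

    rm -r "$workdir"
else
    echo "zip or unzip is not installed; skipping" >&2
fi
```

Running unzip -l on such an archive typically shows the entry listed under the name “-”, since zip had no real filename to record.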

8. Conclusion

In this article, we saw how different compression tools allow us to compress data from stdin. We covered gzip, xz, bzip2, 7z, and zip.

We saw that most of the commands use a pipe to send the data from the source to the stdin of the tool, with > to direct the output from stdout to a file. However, we saw that some tools allow us to specify the output file as an argument or require a parameter to tell them to read from stdin.
