1. Overview

In this tutorial, we’ll explore a few common strategies to redirect the output of a process to a file and to standard streams such as stdout and stderr simultaneously.

2. The tee Command

The tee command is one of the most popular Linux commands that we can use for redirecting a process’s output.

2.1. Redirect stdout

Let’s take a simple use case of redirecting the output of the ls command to stdout and a temporary file /tmp/out.log:

$ ls -C | tee /tmp/out.log
bin   dev  home  lib32	libx32	mnt  proc  run	 srv  tmp  var
boot  etc  lib	 lib64	media	opt  root  sbin  sys  usr

We can verify that the contents of the file are the same as the output generated from the executed command:

$ cat /tmp/out.log
bin   dev  home  lib32	libx32	mnt  proc  run	 srv  tmp  var
boot  etc  lib	 lib64	media	opt  root  sbin  sys  usr

Note that, by default, the tee command overwrites the contents of the file. If required, we can use the -a option to append the new content to the end of the file instead.
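
To make the overwrite and append modes concrete, here’s a minimal Python sketch of what tee does with each line it reads. It’s only an illustration, not how tee is actually implemented; the tee_lines function, its append parameter, and the temporary file location are names we’ve made up for this example:

```python
import io
import os
import tempfile


def tee_lines(source, sink, path, append=False):
    """Copy every line from source to sink and to the file at path.

    append=False truncates the file first (like plain tee);
    append=True adds to the end of the file (like tee -a).
    """
    mode = "a" if append else "w"
    with open(path, mode) as log:
        for line in source:
            sink.write(line)
            log.write(line)


# In-memory streams stand in for the stdin and stdout of tee
log_path = os.path.join(tempfile.mkdtemp(), "out.log")
screen = io.StringIO()

tee_lines(io.StringIO("first run\n"), screen, log_path)               # overwrites
tee_lines(io.StringIO("second run\n"), screen, log_path)              # overwrites again
tee_lines(io.StringIO("third run\n"), screen, log_path, append=True)  # appends

with open(log_path) as log:
    log_contents = log.read()
print(log_contents, end="")
```

After the three calls, the file holds only the second and third lines, because the second call truncated what the first one wrote, while the in-memory "screen" has received all three.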

2.2. Redirect stdout and stderr to the Same File

Internally, the tee command acts as a T-splitter: it copies whatever arrives on its stdin to stdout and to one or more files. Since a pipe carries only the stdout of the preceding command, tee never sees stderr unless we merge it into stdout first. Let’s use this understanding to redirect both the stdout and the stderr of a process to the terminal and a file:

$ (ls -C; cmd_with_err) 2>&1 | tee /tmp/out.log
bin   dev  home  lib32	libx32	mnt  proc  run	 srv  tmp  var
boot  etc  lib	 lib64	media	opt  root  sbin  sys  usr
bash: cmd_with_err: command not found

Here, cmd_with_err is an unknown command, so the shell prints an error message on stderr. To make this message available to the tee command, we redirect the stderr file descriptor (fd=2) to the stdout file descriptor (fd=1) with 2>&1.

Alternatively, in Bash 4 and later, we can use |& as shorthand for 2>&1 | to get the same result:

$ (ls -C; cmd_with_err) |& tee /tmp/out.log
bin   dev  home  lib32	libx32	mnt  proc  run	 srv  tmp  var
boot  etc  lib	 lib64	media	opt  root  sbin  sys  usr
bash: cmd_with_err: command not found

Now, let’s verify the content of the /tmp/out.log file:

$ cat /tmp/out.log
bin   dev  home  lib32	libx32	mnt  proc  run	 srv  tmp  var
boot  etc  lib	 lib64	media	opt  root  sbin  sys  usr
bash: cmd_with_err: command not found
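
The same merge can be reproduced from Python, which may help when the redirection has to live inside a script. In this sketch, stderr=subprocess.STDOUT plays the role of 2>&1; the child command and the merged.log file name are placeholders chosen for this example:

```python
import os
import subprocess
import sys
import tempfile

# A stand-in child process that writes one line to each stream
child = [
    sys.executable, "-c",
    "import sys; print('to stdout', flush=True); print('to stderr', file=sys.stderr)",
]

# stderr=subprocess.STDOUT points fd 2 at the same pipe as fd 1, like 2>&1
result = subprocess.run(
    child, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
)
merged = result.stdout

# Like tee: echo the merged stream and keep a copy in a file
log_path = os.path.join(tempfile.mkdtemp(), "merged.log")
sys.stdout.write(merged)
with open(log_path, "w") as log:
    log.write(merged)
```

Unlike tee, subprocess.run() waits for the child to exit before returning, so this version doesn’t echo anything while the child is still running.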

2.3. Redirect stdout and stderr to Separate Files

In some scenarios, we might need to redirect the stdout and stderr of a process to separate files. We can do this by using process substitution while invoking the tee command. Before that, let’s take a look at a template snippet that makes the tee command read from a specific file descriptor and write the data both to a file and back to the same stream:

fd> >(tee file_name >&fd)

Here, fd is a placeholder for the file descriptor of an output stream: 1 for stdout or 2 for stderr. Since tee writes its copy to its own stdout by default, the trailing >&fd sends that copy back to the original stream; for fd=1, it’s redundant and can be omitted.

Now, let’s use this understanding to redirect the stdout and stderr output of the process to /tmp/out.log and /tmp/err.log, respectively:

$ ((ls -C; cmd_with_err) 1> >(tee /tmp/out.log)) 2> >(tee /tmp/err.log >&2)
bin   dev  home  lib32	libx32	mnt  proc  run	 srv  tmp  var
boot  etc  lib	 lib64	media	opt  root  sbin  sys  usr
bash: cmd_with_err: command not found

We can verify that /tmp/out.log contains the valid stdout message, whereas /tmp/err.log contains the error message from stderr:

$ cat /tmp/out.log
bin   dev  home  lib32	libx32	mnt  proc  run	 srv  tmp  var
boot  etc  lib	 lib64	media	opt  root  sbin  sys  usr
$ cat /tmp/err.log
bash: cmd_with_err: command not found
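
For comparison, here’s a rough Python equivalent that keeps the two streams apart. It’s a sketch under simplifying assumptions: it collects the output only after the child exits instead of streaming it like tee, and the child command and file locations are placeholders:

```python
import os
import subprocess
import sys
import tempfile

# A stand-in child process that writes one line to each stream
child = [
    sys.executable, "-c",
    "import sys; print('normal output'); print('error output', file=sys.stderr)",
]

# capture_output=True gives stdout and stderr their own separate pipes
result = subprocess.run(child, capture_output=True, text=True)

log_dir = tempfile.mkdtemp()
out_path = os.path.join(log_dir, "out.log")
err_path = os.path.join(log_dir, "err.log")

# Write each stream to its own file and echo it on the matching stream
for text, path, stream in (
    (result.stdout, out_path, sys.stdout),
    (result.stderr, err_path, sys.stderr),
):
    with open(path, "w") as log:
        log.write(text)
    stream.write(text)
```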

3. Redirection Delays

In a few scenarios, invocation of the tee command to redirect the process’s output to a file and stdout can introduce delay. In this section, we’ll explore such a scenario and learn to mitigate it.

3.1. Scenario

Let’s look at a simple Python script that prints the current time every second:

$ cat time.py
#!/usr/bin/python
from datetime import datetime
import sys
import time

while True:
    # Write the current time; nothing here forces the output buffer to be flushed
    sys.stdout.write(datetime.today().strftime("%H:%M:%S %p\n"))
    time.sleep(1)

If we execute this script, we’ll observe a delay of one second between any two consecutive timestamps written on the stdout:

$ ./time.py
18:49:48 PM
18:49:49 PM
18:49:50 PM
18:49:51 PM
18:49:52 PM
18:49:53 PM

3.2. Delayed Redirection

Now, let’s use the tee command to redirect the output of this process to stdout and the time.out file:

$ ./time.py | tee time.out

Unlike earlier, we’ll notice that there’s no output written on stdout for a long time, after which a huge chunk of output will be dumped on stdout in a single go.

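The delay comes from output buffering. When stdout is connected to a terminal, it’s line-buffered, so each line shows up immediately. When stdout is a pipe, as it is here, the runtime switches to block buffering: writes accumulate in a buffer of a few kilobytes and reach the pipe only when the buffer fills up or the program flushes it. Python 2 inherits this policy from glibc’s stdio, which typically uses a 4096-byte buffer for pipes, while Python 3 applies an equivalent policy in its own I/O layer. Buffering reduces the number of I/O calls required to write to the stream, but with one short line per second, the buffer takes a long time to fill.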
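
We can check how Python configures stdout when it’s connected to a pipe rather than a terminal. The sketch below runs a child interpreter with its stdout attached to a pipe and asks it to describe its own sys.stdout; isatty() and line_buffering are standard attributes of Python 3 text streams, while probe and report are just names picked for this example:

```python
import json
import subprocess
import sys

# Code run by the child: describe how its own stdout is set up
probe = (
    "import json, sys; "
    "print(json.dumps({'isatty': sys.stdout.isatty(), "
    "'line_buffering': sys.stdout.line_buffering}))"
)

# stdout=PIPE puts the child in the same position as './time.py | tee time.out'
result = subprocess.run(
    [sys.executable, "-c", probe], stdout=subprocess.PIPE, text=True
)
report = json.loads(result.stdout)
print(report)
```

With a pipe on the other end, the child reports that stdout isn’t a terminal and that line buffering is off, which is why each timestamp stays in the buffer instead of reaching tee right away.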

For interactive applications, such delays in redirection are not acceptable. So, let’s find ways to mitigate the redirection delay issue.

3.3. Mitigation

Since the root cause is that data isn’t flushed to stdout promptly, one way to solve the issue is to flush the stream explicitly in the application code:

$ cat time.py
#!/usr/bin/python
from datetime import datetime
import sys
import time

while True:
    sys.stdout.write(datetime.today().strftime("%H:%M:%S %p\n"))
    # Push the line out immediately instead of waiting for the buffer to fill
    sys.stdout.flush()
    time.sleep(1)

Let’s verify that the delay is indeed gone:

$ ./time.py | tee time.out
19:29:12 PM
19:29:13 PM

In this scenario, we had direct access to the application code, so we were able to modify it.
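
On Python 3, the write-then-flush pair can also be expressed with the flush=True argument of print(). Here’s a small sketch that confirms a flush happens after every line; FlushCountingStream and emit are hypothetical helpers written only for this illustration:

```python
import io


class FlushCountingStream(io.StringIO):
    """In-memory stream that records how many times flush() is called."""

    def __init__(self):
        self.flush_calls = 0
        super().__init__()

    def flush(self):
        self.flush_calls += 1
        super().flush()


def emit(lines, stream):
    """Print each line and flush right away, so a pipe reader sees it at once."""
    for line in lines:
        print(line, file=stream, flush=True)


stream = FlushCountingStream()
emit(["19:29:12", "19:29:13"], stream)
print(stream.flush_calls)
```

In a real script, we’d pass sys.stdout (the default for print) instead of the counting stream, which exists here only to make the flushes observable.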

However, in many cases, the program is an executable binary that we can’t modify. In such a scenario, we can use the unbuffer command, which typically ships with the expect package, to avoid the delay caused by buffered writes to stdout.

Let’s remove the sys.stdout.flush() method call from our script and re-execute the redirection command using the unbuffer command:

$ unbuffer ./time.py | tee time.out
19:34:22 PM
19:34:23 PM

We can observe that there are no unexpected delays in the stdout writes now.
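
As background, unbuffer achieves this by running the program under a pseudo-terminal, so the program believes it’s writing to a terminal and keeps flushing line by line. The Python sketch below imitates that idea with the standard pty module; it’s a simplified illustration that assumes a Linux-like system, not a replacement for unbuffer, and run_under_pty is a name invented for this example:

```python
import os
import pty
import subprocess
import sys


def run_under_pty(argv):
    """Run argv with stdout attached to a pseudo-terminal and return its output."""
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.Popen(argv, stdin=subprocess.DEVNULL, stdout=slave_fd)
    finally:
        os.close(slave_fd)  # the child holds its own copy of the slave end

    chunks = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError:
            break  # Linux reports EIO once the child has closed the slave end
        if not data:
            break  # other systems signal end-of-file with an empty read
        chunks.append(data)

    os.close(master_fd)
    proc.wait()
    return b"".join(chunks).decode()


# Ask a child interpreter whether it thinks its stdout is a terminal
tty_report = run_under_pty(
    [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"]
)
print(tty_report.strip())
```

Because the child sees a terminal on stdout, it doesn’t switch to block buffering, even though our code is reading its output programmatically on the other side.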

4. Conclusion

In this article, we explored several ways of redirecting a program’s output to a file and to standard streams such as stdout and stderr simultaneously. Additionally, we learned a few strategies to solve the issue of delayed redirection caused by buffering.
