Authors Top

If you have a few years of experience in the Linux ecosystem, and you’re interested in sharing that experience with the community, have a look at our Contribution Guidelines.

1. Overview

The rsync tool is well-known for synchronizing files and directories. It’s also suitable for creating backups and mirrors in addition to providing many other features and options.

In this tutorial, we’ll examine the batch mode of the rsync command. Using batch mode is useful when we have to synchronize multiple copies of a source directory.

2. The Batch Mode

We assume that we have a source directory and several copies that we want to synchronize. In addition, let’s assume that all the copies have the same initial content.

Let’s preview our steps with the batch mode synchronization of the rsync command:

  • we choose one of the copies and run rsync between the source directory and the chosen copy
  • the rsync command finds the differences between the two directories and saves them in a file
  • we synchronize the rest of the copies using rsync by supplying the file that contains the differences

It’s worth noting that the differences file is only valid for copies that start from the same state as the chosen copy. If one copy is more outdated than the others, an rsync synchronization using the differences file will fail. In that case, we can run a regular rsync between the source and that copy to find and apply the differences.

3. Creating an Example Use Case

Let’s create a directory tree with files so that we can use it with the rsync command:

$ mkdir source
$ cd source
$ touch file{1..2}.txt
$ echo 'Hello world' | tee file1.txt file2.txt
$ head -c 100 /dev/random > binfile
$ mkdir subdir
$ cd subdir/
$ touch subfile{1..2}.txt
$ cd ../..

Here, we’ve created the source directory, which contains two text files and one binary file with random data. In addition, there’s a subdirectory named subdir with two more files.

Furthermore, we’ll create two destination directories that we’ll try to synchronize with the rsync command:

$ mkdir copy-a
$ ssh baeldung@remotehost mkdir copy-b

Here, we’ve created the two empty directories:

  • copy-a: local copy
  • copy-b: remote copy via ssh on host remotehost with user baeldung

We should note that we’ve set up SSH public key authentication between the hosts. So we won’t use any passwords. We intend to synchronize the directories copy-a and copy-b with the source directory using the rsync command. Again, the copy-a and copy-b directories are initially empty.

4. The --write-batch Option

The rsync command with the --write-batch option synchronizes the destination directory and saves a batch file with all the changes that are applied to the destination:

$ rsync -a --delete --write-batch=changelog source/ copy-a

Here, we start with the copy-a directory. In addition to the --write-batch option, we use several others:

  • -a: archive mode, which copies files and directories recursively including symbolic links, preserving permissions, modification times, group, and owner
  • --delete: deletes files of the destination directory that no longer exist in the source directory

As expected, rsync has successfully synchronized the copy-a directory:

$ ls copy-a
binfile    file1.txt  file2.txt  subdir

Moreover, the --write-batch option has created a batch file containing the changes that the command applied for the synchronization of the copy-a directory:

$ ls
changelog     changelog.sh  copy-a        source

In the listing above, the changelog batch file has the name we supplied to the --write-batch option earlier. The file’s contents are in an internal binary format.

5. The Batch Shell Script

In addition to the batch file, the --write-batch option creates a shell script file with the same name and the .sh extension:

$ cat changelog.sh
rsync -a --delete --read-batch=changelog ${1:-copy-a}

Here, changelog.sh contains the command that we should execute to synchronize other copies of the source directory. The -a and --delete options appear in the command because we used them in the initial execution, which generated the file.

Moreover, the third parameter is --read-batch, which defines the batch file to use. In essence, the --read-batch option of rsync applies all changes recorded in the batch file to the destination directory.

Finally, the last parameter is the destination directory. We can pass it as the first argument when we run the script. If we don’t pass a destination directory, it defaults to the copy-a directory thanks to the :- operator.
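To see the :- operator in isolation, here's a small sketch that doesn't involve rsync at all. The pick_destination function is a hypothetical helper that we introduce only for illustration; it isn't part of the generated script:

```shell
#!/bin/sh
# Hypothetical helper that mimics how changelog.sh chooses its destination.
pick_destination() {
    # ${1:-copy-a} expands to the first argument,
    # or to copy-a when that argument is unset or empty
    echo "${1:-copy-a}"
}

pick_destination          # prints: copy-a
pick_destination copy-b   # prints: copy-b
```

Because the fallback also applies to an empty argument, calling the script with "" as its first argument would still target copy-a.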

6. Using the Batch Shell Script

Next, we proceed to synchronize the copy-b directory. We copy both the batch file and the shell script to the remotehost host via SSH:

$ scp changelog changelog.sh baeldung@remotehost:/home/baeldung

Here, we’ve used the scp command to copy the two files to the remote home directory of the baeldung user. Next, we execute the changelog.sh script on the remote host:

$ ssh baeldung@remotehost ./changelog.sh copy-b

So, we’ve run the shell script from the local host via ssh. Of course, rsync must be installed on the remote host. Otherwise, the above command fails. We can verify that the script synchronized the copy-b directory by checking its contents:

$ ssh baeldung@remotehost ls copy-b
binfile
file1.txt
file2.txt
subdir

As expected, the rsync command copied the contents of the source directory to the remote system.

7. Synchronize Without Copying the Batch File

An alternative use of the --read-batch option is to set its value to the - character. When we use --read-batch=-, rsync receives the contents of the batch file on the standard input. This is convenient if we don’t want to copy the file itself to the remote host:

$ ssh baeldung@remotehost rsync -a --delete --read-batch=- /home/baeldung/copy-b <changelog

Here, we didn’t use the changelog.sh script. Instead, we executed the command from it directly on the remote host from the local host. The command received the batch file changes on the standard input with the help of the < redirection operator.

8. The --only-write-batch Option

The --only-write-batch option has the same effect as the --write-batch option, except that it doesn’t apply any changes to the destination directory.

Let’s execute the same scenario as before, but using --only-write-batch. For this purpose, we may remove the contents of the copy-a directory first and then run rsync:

$ rm -r copy-a/*
$ rsync -a --delete --only-write-batch=changelog source/ copy-a

Moreover, we can verify that the copy-a directory is still empty:

$ ls copy-a
$

Indeed, rsync didn’t synchronize the copy-a directory.

Let’s also check whether it created the batch and script files:

$ ls
changelog     changelog.sh  copy-a        source

As expected, the --only-write-batch option created the batch file and its corresponding script.

9. The --protocol Option

The --protocol option forces rsync to use an older version of its protocol, which also determines the format of the generated batch file. We can use it if a remote system has an older version of the rsync package.

We can find out the protocol version that the installed rsync command uses by executing the command either with no arguments or with the --version option:

$ rsync
rsync  version 3.1.3  protocol version 31
Copyright (C) 1996-2018 by Andrew Tridgell, Wayne Davison, and others.
Web site: http://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
    socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
    append, ACLs, xattrs, iconv, symtimes, prealloc
...

Notably, the command outputs the protocol version in the first line. Moreover, it prints the capabilities it supports. In this example, the protocol version is 31.
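If we need the protocol number in a script, we can extract it from that first line. The protocol_version function below is a hypothetical helper, and it assumes the first line has the same layout as the output shown above:

```shell
#!/bin/sh
# Hypothetical helper: read "rsync --version" output on stdin and
# print the protocol number found on the first line.
protocol_version() {
    sed -n '1s/.*protocol version \([0-9][0-9]*\).*/\1/p'
}

# Demonstration with the first line from the listing above
echo 'rsync  version 3.1.3  protocol version 31' | protocol_version

# Typical usage (not executed here):
#   rsync --version | protocol_version
#   ssh baeldung@remotehost rsync --version | protocol_version
```

Comparing the local and remote numbers tells us whether we need to lower the protocol version when creating the batch file.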

Let’s create the batch file with a previous protocol version:

$ rsync -a --delete --only-write-batch=changelog --protocol=30 source/ copy-a

Here, we set the protocol version to 30. Consequently, version 30 is also set in the generated script file:

$ cat changelog.sh
rsync -a --delete --read-batch=changelog --protocol=30 ${1:-copy-a}

As a result, if we use the changelog.sh script to perform the synchronization, the remote system will use protocol version 30. This can be very useful in case the remote host is an outdated system that we can’t update, and only supports older protocol versions.

10. Exclude or Include Files With the Batch Mode

When we want to exclude or include files according to a pattern, we can use the --exclude and --include options in batch mode.

Let’s set a pattern for both options:

$ rsync -a --delete --write-batch=changelog --exclude='*.txt' --include='file1.txt' source/ copy-a

In this example, we exclude all files with the .txt extension, but include all files with the name file1.txt. We quote the patterns so that the shell passes them to rsync unexpanded. Let’s see how the resulting shell script includes these rules:

$ cat changelog.sh
rsync --filter=._- -a --delete --read-batch=changelog ${1:-copy-a} <<'#E#'
- *.txt
+ file1.txt
#E#

In this case, rsync dropped the options --include and --exclude in favor of adding the rules within a here document. Let’s check the contents of the copy-a folder:

$ ls copy-a
binfile  subdir

As expected, the tool didn’t copy any text files.

Counterintuitively, although we named the file1.txt file in the --include option, the command didn’t copy it. This behavior is linked with the filtering rules and how the rsync command applies them.

The sequence of the rules impacts the result. For each file, rsync checks the rules in the order we supplied them, and the first matching rule decides the outcome. Since we defined the --exclude option first, its *.txt pattern matched file1.txt before rsync ever reached the --include rule. Let’s verify this by changing the order of the two options:

$ rsync -a --delete --write-batch=changelog --include='file1.txt' --exclude='*.txt' source/ copy-a
$ ls copy-a
binfile    file1.txt  subdir

As expected, we managed to copy the file1.txt file by placing the --include option first.
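The effect of the rule order can also be reproduced without batch mode. The following sketch runs the same two rule orders against fresh destinations in a temporary directory. The names src, exclude-first, and include-first are hypothetical:

```shell
#!/bin/sh
# Local sketch: the same two filter rules in both orders.
# Directory names are hypothetical.
work=$(mktemp -d)
cd "$work" || exit 1

mkdir src
touch src/file1.txt src/file2.txt src/binfile

# The exclude rule comes first, so it matches file1.txt before the include rule
rsync -a --exclude='*.txt' --include='file1.txt' src/ exclude-first

# The include rule comes first, so file1.txt is accepted before the exclude rule
rsync -a --include='file1.txt' --exclude='*.txt' src/ include-first

echo "exclude first:"; ls exclude-first
echo "include first:"; ls include-first
```

Only the second destination receives file1.txt, while file2.txt is excluded in both cases.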

11. Conclusion

In this article, we examined the batch mode of the rsync command and learned the options and capabilities useful in multiple scenarios:

  • saving the batch file that contains the changes between two directories
  • synchronizing a directory on a remote system
  • handling remote systems that run older versions of the rsync command
  • including and excluding files with a pattern

In conclusion, the batch mode can save us time when we want to synchronize multiple directories. Moreover, it can be useful in cases where the remote system isn’t accessible via a network connection. In these cases, we can manually copy the batch file to the remote system.
