1. Overview

In Linux, everything is a file, including directories. Linux has several GNU tools to compare the content of two (or more) files, but has no specific tool to compare directory structures.

In this tutorial, we’ll look at how to compare directory structures.

All commands have been tested on Debian 10 (Buster). Most of them are preinstalled on common Linux distributions, and the others (tree and meld, for example, may need to be installed) are available from the standard repositories of most distributions.

2. Introduction to the Problem

Linux has many tools to list the structure of a directory, such as ls, tree, find, and du. However, it has no specific tool to compare directory structures without also comparing the content of the files.

Therefore, we’ll follow these steps to implement a workaround:

  • firstly, list the structure of each directory that we want to compare
  • then store them in separate files
  • lastly, use the diff/comm/vimdiff/meld commands to compare the files
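The three steps above can be sketched as a pair of small shell functions. This is only an illustration: the names list_structure and compare_structure, the temporary files, and the demo directories a and b are assumptions made for this sketch, and find with -printf requires GNU find.

```shell
# Hypothetical helper: print the structure of a directory, one relative path per line.
list_structure() {
    # %P prints each path relative to the starting point;
    # LC_ALL=C keeps the sort order independent of the locale.
    find "$1" -printf '%P\n' | LC_ALL=C sort
}

# Hypothetical helper: compare the structure of two directories.
compare_structure() {
    list_a=$(mktemp)
    list_b=$(mktemp)
    list_structure "$1" > "$list_a"    # steps 1 and 2 for the first directory
    list_structure "$2" > "$list_b"    # steps 1 and 2 for the second directory
    status=0
    diff "$list_a" "$list_b" || status=$?    # step 3: 0 = same, 1 = different
    rm -f "$list_a" "$list_b"
    return "$status"
}

# Demo on two throwaway directories
demo=$(mktemp -d)
mkdir -p "$demo/a/sub" "$demo/b/sub" "$demo/b/extra"
touch "$demo/a/sub/f1" "$demo/b/sub/f1"
result=$(compare_structure "$demo/a" "$demo/b") || true
echo "$result"
```

The function returns the exit status of diff, so it can also be used in a condition such as if compare_structure a b; then ... fi.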

3. Directory Structure Sample

Let’s say we have the following directories:

$ tree dir1 dir2 dir3
dir1
├── file01.bin
├── file02.bin
└── subdir1
    ├── file03.bin
    └── file04.bin
dir2
├── file01.bin
├── file02.bin
├── subdir1
│   ├── file03.bin
│   └── file04.bin
└── subdir2
    └── file05.bin
dir3
├── file01.bin
├── file02.bin
├── subdir1
│   └── file03.bin
└── subdir2
    └── file05.bin

5 directories, 13 files

We’ll use these directories to try out different techniques and tools.
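To follow along, the sample layout can be recreated with a few commands. This is a sketch that assumes empty placeholder files are good enough, which holds here because only names and paths matter; the scratch location created by mktemp is arbitrary.

```shell
# Work in a scratch directory so nothing existing is touched
workdir=$(mktemp -d)
cd "$workdir"

# Create the directory skeleton for the three sample trees
mkdir -p dir1/subdir1 dir2/subdir1 dir2/subdir2 dir3/subdir1 dir3/subdir2

# Create empty files with the same names as in the tree output above
touch dir1/file01.bin dir1/file02.bin dir1/subdir1/file03.bin dir1/subdir1/file04.bin
touch dir2/file01.bin dir2/file02.bin dir2/subdir1/file03.bin dir2/subdir1/file04.bin \
      dir2/subdir2/file05.bin
touch dir3/file01.bin dir3/file02.bin dir3/subdir1/file03.bin dir3/subdir2/file05.bin

# Count the regular files that were created
file_count=$(find dir1 dir2 dir3 -type f | wc -l)
echo "$file_count files created"
```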

4. Capturing the Directory Structure

Several tools can print a directory structure together with its files, including tree, ls, find, and du.

4.1. tree

Let’s use the tree command with -i option to flatten the output structure:

$ tree -i dir1
dir1
file01.bin
file02.bin
subdir1
file03.bin
file04.bin

1 directory, 4 files

tree lists the files and directories recursively, and sorts them by default.

4.2. ls

We can also use ls with the -R option to crawl through directories recursively, and -1 to print one filename per line:

$ ls -R -1 dir1
dir1:
file01.bin
file02.bin
subdir1

dir1/subdir1:
file03.bin
file04.bin

ls sorts the entries alphabetically by default and groups them under a header line for each directory.

4.3. find

The find command also provides a way to list files and directories recursively:

$ find dir1 -printf "%P\n" | sort

file01.bin
file02.bin
subdir1
subdir1/file03.bin
subdir1/file04.bin

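The %P directive prints each path relative to the starting point, so the starting directory itself appears as the empty first line. The output of find is not sorted, so we pipe it through sort. Because the paths are relative, the sorted listings of different directory trees are directly comparable.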
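A plain list of paths doesn't say whether an entry is a file or a directory, so a file replaced by a directory of the same name would go unnoticed. One possible refinement, sketched below on a throwaway tree, is to prefix each path with its type using the %y directive of GNU find. The -mindepth 1 option skips the starting directory itself, and LC_ALL=C is an assumption added here to keep the sort order identical on every machine.

```shell
# Throwaway tree for the demo
demo=$(mktemp -d)
mkdir -p "$demo/dir1/subdir1"
touch "$demo/dir1/file01.bin" "$demo/dir1/subdir1/file03.bin"

# %y prints a one-letter type (d = directory, f = regular file),
# %P prints the path relative to the starting point.
listing=$(find "$demo/dir1" -mindepth 1 -printf '%y %P\n' | LC_ALL=C sort)
echo "$listing"
```

With the type in front, the sort groups directories before regular files, which is harmless as long as both listings are produced the same way.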

4.4. du

We can also utilize du to list the files and directory structure:

$ du -a dir1 | cut -f2 | sort
dir1
dir1/file01.bin
dir1/file02.bin
dir1/subdir1
dir1/subdir1/file03.bin
dir1/subdir1/file04.bin

The -a option makes du list every file, not just the directories.

We then pipe the output to cut with the -f2 option, which keeps only the second field (the path) and drops the size.

Lastly, we sort the output, as du doesn’t sort it.
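The du listing above prefixes every path with dir1, so comparing it with a listing of dir2 would flag every line as different. A possible workaround, sketched here on two throwaway trees, is to run du from inside each directory so that all paths start with ./ instead of the directory name.

```shell
# Two throwaway trees with identical structure
demo=$(mktemp -d)
mkdir -p "$demo/dir1/subdir1" "$demo/dir2/subdir1"
touch "$demo/dir1/file01.bin" "$demo/dir2/file01.bin"

# The subshell changes directory only for the duration of the command,
# so the caller's working directory is left untouched.
list1=$(cd "$demo/dir1" && du -a . | cut -f2 | LC_ALL=C sort)
list2=$(cd "$demo/dir2" && du -a . | cut -f2 | LC_ALL=C sort)

echo "$list1"
```

Both variables now hold the same text, since the directory names no longer appear in the paths.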

5. Capturing the Directory Structure Without the Files

Several tools can also list the directory structure without the files, including tree, find, and du.

5.1. tree

Let’s use the tree command with the -d option to only list directories, and the -i option to flatten the structure:

$ tree -d -i dir1
dir1
subdir1

1 directory

The tree command lists the directories recursively and sorts them by default.

5.2. find

The find command provides an option to list only the directories recursively:

$ find dir1 -type d -printf "%P\n" | sort

subdir1

The find command doesn’t sort its output, so we pipe it through sort. We should note that the top directory dir1 shows up only as the empty first line, because %P strips the starting point from each path.
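If the empty line for the top directory is unwanted, GNU find can skip the starting point with -mindepth 1. The sketch below uses a throwaway tree; the directory names are only examples.

```shell
# Throwaway tree with two subdirectories and one file
demo=$(mktemp -d)
mkdir -p "$demo/dir2/subdir1" "$demo/dir2/subdir2"
touch "$demo/dir2/file01.bin"

# -mindepth 1 excludes the starting directory, -type d keeps directories only
dirs=$(find "$demo/dir2" -mindepth 1 -type d -printf '%P\n' | LC_ALL=C sort)
echo "$dirs"
```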

5.3. du

To list directory structure without the files, we can also use the du command:

$ du dir1 | cut -f2 | sort
dir1
dir1/subdir1

Without the -a option, du reports only directories. As before, cut -f2 keeps only the path field, and sort orders the result.

6. Comparing Directory Structures

Now that we can list the directory structure, let’s move on to the next step: comparing the lists.

For this we’re going to use diff, comm, vimdiff, and meld.

6.1. diff

The diff command compares the content of the files, so we’ll list the directory structures that we want to compare, save them to separate files, then compare them.

Let’s first list and store the directory structure of dir1 and dir2 to file1 and file2 respectively:

$ find dir1 -printf "%P\n" | sort > file1
$ find dir2 -printf "%P\n" | sort > file2

Then we compare the files:

$ diff file1 file2
6a7,8
> subdir2
> subdir2/file05.bin

As we can see in the output above, the lines prefixed with ‘>’ exist only in the second file, which means they exist only in the second directory.

Let’s switch the order of the files:

$ diff file2 file1
7,8d6
< subdir2
< subdir2/file05.bin

The output has a similar meaning: the lines prefixed with ‘<’ exist only in the first argument, which is now file2, so they still belong to the second directory, dir2.

Now, let’s look at a one-liner that puts this all together:

$ diff <(cd dir1 && find * | sort) <(cd dir2 && find * | sort)
5a6,7
> subdir2
> subdir2/file05.bin

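The output above shows the same information: there’s a directory and a file that only exist in the second directory. The line numbers are shifted by one compared to the previous run because find * doesn’t print an entry for the top directory itself. The <(...) syntax is process substitution, which shells such as Bash provide.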
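One caveat of find * is that the shell expands * before find runs, and * doesn’t match names that start with a dot, so hidden entries at the top level are skipped. The sketch below demonstrates this on a throwaway tree (the file names .hidden and .nested are made up for the demo) and shows one possible alternative that starts from the directory itself.

```shell
# Throwaway tree with one hidden file at the top level and one nested
demo=$(mktemp -d)
mkdir -p "$demo/dir1/subdir1"
touch "$demo/dir1/file01.bin" "$demo/dir1/.hidden" "$demo/dir1/subdir1/.nested"

# The glob skips the top-level dotfile; nested ones are still found,
# because find itself descends into subdir1.
with_glob=$(cd "$demo/dir1" && find * | LC_ALL=C sort)

# Starting from . lets find see every entry; %P strips the leading ./
with_dot=$(cd "$demo/dir1" && find . -mindepth 1 -printf '%P\n' | LC_ALL=C sort)

echo "find * lists:"
echo "$with_glob"
echo "find . lists:"
echo "$with_dot"
```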

6.2. comm

comm compares two sorted files line by line.

Let’s list and store the directory structure of dir1 and dir2 to file1 and file2 respectively:

$ find dir1 -printf "%P\n" | sort > file1
$ find dir2 -printf "%P\n" | sort > file2

Then we compare the files:

$ comm file1 file2

        file01.bin
        file02.bin
        subdir1
        subdir1/file03.bin
        subdir1/file04.bin
    subdir2
    subdir2/file05.bin

The comm command displays its output in three tab-separated columns:

  • column 1: displayed without indent, contains lines unique to file1
  • column 2: indented by one tab (rendered above as four spaces), contains lines unique to file2
  • column 3: indented by two tabs (rendered above as eight spaces), contains lines that appear in both files

As we can see in the output above, there are two lines that are unique to file2 (second directory) – subdir2 and subdir2/file05.bin.

We can manipulate the output to make it easier to read. For example, we can suppress column 1, 2, or 3 by passing the -1, -2, or -3 option, respectively.

Let’s suppress column 3:

$ comm -3 file1 file2
    subdir2
    subdir2/file05.bin

In the output above, comm suppressed column 3, so only columns 1 and 2 remain. Column 1 is empty because no line is unique to file1, which leaves just the lines that are unique to file2.

Here’s a one-liner that brings these together:

$ comm -3 <(cd dir1 && find * | sort) <(cd dir2 && find * | sort)
    subdir2
    subdir2/file05.bin

The output shows the same information – there are two lines that only exist in the second directory.
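The column options can also be combined: -23 leaves only the lines unique to the first file, and -13 only those unique to the second. The sketch below applies this to a reduced copy of dir1 and dir3 in a scratch directory; setting LC_ALL=C is an assumption added here so that sort and comm agree on the ordering.

```shell
# Use the same collation for sort and comm
export LC_ALL=C

# Reduced copies of dir1 and dir3 in a scratch directory
demo=$(mktemp -d)
cd "$demo"
mkdir -p dir1/subdir1 dir3/subdir1 dir3/subdir2
touch dir1/file01.bin dir1/subdir1/file03.bin dir1/subdir1/file04.bin
touch dir3/file01.bin dir3/subdir1/file03.bin dir3/subdir2/file05.bin

find dir1 -printf "%P\n" | sort > file1
find dir3 -printf "%P\n" | sort > file3

# -23 hides columns 2 and 3: paths that exist only in dir1
only_in_dir1=$(comm -23 file1 file3)
# -13 hides columns 1 and 3: paths that exist only in dir3
only_in_dir3=$(comm -13 file1 file3)

echo "only in dir1: $only_in_dir1"
echo "only in dir3:"
echo "$only_in_dir3"
```

Because the preceding columns are suppressed, neither result is indented, which makes the output easy to feed into other commands.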

6.3. vimdiff

vimdiff is a vim-based tool to compare two (or more, up to four) files.

Let’s use it to compare files. Firstly, we list and store the directory structure of dir1, dir2, and dir3 in file1, file2, and file3 respectively:

$ find dir1 -printf "%P\n" | sort > file1
$ find dir2 -printf "%P\n" | sort > file2
$ find dir3 -printf "%P\n" | sort > file3

Then we can compare two files:

$ vimdiff file1 file2
2 files to edit

The command opens a vim window with two columns:

[Figure: Comparing two files with vimdiff]

The highlighted lines are the differences between the two files.

We can also compare three files:

$ vimdiff file1 file2 file3
3 files to edit

The command opens a vim window with three columns:

[Figure: Comparing three files with vimdiff]

vim highlighted the differences between the three files.

To exit vim, we press Esc to make sure we’re in Normal mode, then type ‘:qa’ and press Enter to close all windows at once. Typing ‘:q’ instead closes only the current window.

6.4. Meld

Meld is a visual diff and merge tool to compare two (or three) files or directories.

Let’s use it to compare three directory structures.

Firstly, we list and store the directory structure of dir1, dir2, and dir3 in file1, file2, and file3 respectively:

$ find dir1 -printf "%P\n" | sort > file1
$ find dir2 -printf "%P\n" | sort > file2
$ find dir3 -printf "%P\n" | sort > file3

Then we can compare the two directory structures – file1 and file2:

$ meld file1 file2

[Figure: Comparing two files with Meld]

The highlighted lines are the differences between the two files.

We can also compare three directory structures – file1, file2, and file3:

$ meld file1 file2 file3

[Figure: Comparing three files with Meld]

As we can see, Meld highlighted the differences between the files.
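When no graphical session is available, a similar three-way overview can be approximated in the terminal. The following is only a sketch, not a feature of Meld or vimdiff: it prints one line per path with a + or - flag for each of the three directories. The .list file names and the flag format are choices made for this example.

```shell
# Recreate the three sample trees in a scratch directory
demo=$(mktemp -d)
cd "$demo"
mkdir -p dir1/subdir1 dir2/subdir1 dir2/subdir2 dir3/subdir1 dir3/subdir2
touch dir1/file01.bin dir1/file02.bin dir1/subdir1/file03.bin dir1/subdir1/file04.bin
touch dir2/file01.bin dir2/file02.bin dir2/subdir1/file03.bin dir2/subdir1/file04.bin \
      dir2/subdir2/file05.bin
touch dir3/file01.bin dir3/file02.bin dir3/subdir1/file03.bin dir3/subdir2/file05.bin

# Store one sorted listing per directory
for d in dir1 dir2 dir3; do
    find "$d" -mindepth 1 -printf '%P\n' | LC_ALL=C sort > "$d.list"
done

# For every path seen anywhere, record in which listings it appears
summary=$(
    LC_ALL=C sort -u dir1.list dir2.list dir3.list | while IFS= read -r path; do
        flags=""
        for d in dir1 dir2 dir3; do
            # -F fixed string, -x whole-line match, -q quiet
            if grep -Fxq -- "$path" "$d.list"; then
                flags="${flags}+"
            else
                flags="${flags}-"
            fi
        done
        printf '%s %s\n' "$flags" "$path"
    done
)
echo "$summary"
```

In the result, a line such as ++- subdir1/file04.bin means the path exists in dir1 and dir2 but not in dir3.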

7. Conclusion

In this article, we learned how to compare directory structures without comparing the content of the files.

We did that by capturing the directory structures into files, and then comparing those files with a diffing tool.
