1. Overview

From time to time, we may need to count the number of files in each directory in a Linux system. There’s no single command to solve this problem. However, we can find solutions by combining a few basic commands that are available by default on most Linux distributions.

In this tutorial, we’ll explore a few solutions to count the number of files in each directory.

2. The Problem

For this example specifically, let’s look at a directory containing three subdirectories:

$ ls
Assignments Conference Projects

Each subdirectory contains files and additional directories with files in them:

$ ls *
Assignments:
1.txt  2.txt  3.txt  directory_1  directory_2

Conference:
1.txt  2.txt directory_3

Projects:
1.txt  2.txt  3.txt  directory_4  directory_5

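As we can see, there are five entries in the Assignments directory (three files and two directories), three entries in the Conference directory (two files and one directory), and five entries in the Projects directory (three files and two directories).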

Let’s also display the contents of the other subdirectories.

In Assignments/directory_1 and Assignments/directory_2, we have these files:

$ ls Assignments/*
directory_1:
1.txt    2.txt    3.txt    4.txt

directory_2:
1.txt    2.txt    3.txt

In Conference/directory_3, we have these files:

$ ls Conference/*
directory_3:
1.txt

Finally, in Projects/directory_4 and Projects/directory_5, we have these files:

$ ls Projects/*
directory_4:
1.txt    2.txt    3.txt

directory_5:
1.txt    2.txt    3.txt    4.txt
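
To follow along, the layout above can be recreated in a scratch directory. The following is a minimal sketch: the helper names populate and make_demo_tree and the directory name count-demo are assumptions made for illustration, and directory_3 is given a single file to match the counts reported by the commands in this article.

```shell
# populate DIR N: create DIR and fill it with the files 1.txt .. N.txt
populate() {
    mkdir -p "$1"
    i=1
    while [ "$i" -le "$2" ]; do
        touch "$1/$i.txt"
        i=$((i + 1))
    done
}

# make_demo_tree ROOT: build the example hierarchy below ROOT
make_demo_tree() {
    populate "$1/Assignments" 3
    populate "$1/Assignments/directory_1" 4
    populate "$1/Assignments/directory_2" 3
    populate "$1/Conference" 2
    populate "$1/Conference/directory_3" 1
    populate "$1/Projects" 3
    populate "$1/Projects/directory_4" 3
    populate "$1/Projects/directory_5" 4
}

make_demo_tree count-demo   # "count-demo" is an assumed scratch directory name
```

After running it, we can change into count-demo and try the commands from the following sections there.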

Putting it all together, we should expect our overall output to look something like this:

5 Assignments
  4 in ./Assignments/directory_1
  3 in ./Assignments/directory_2
3 Conference
  1 in ./Conference/directory_3
5 Projects
  3 in ./Projects/directory_4
  4 in ./Projects/directory_5

With this in mind, let’s explore the commands we’ll need to accomplish this.

3. Using find

The Linux find command is a flexible and powerful tool that searches for files and directories in a directory hierarchy. It can search for executable files, empty files, files owned by other users, and even files with a specific extension.

The find command descends into subdirectories by default. It can match names against glob or regex patterns, and it can filter files and directories by their access or modification times.

First, let’s look at how we can use the find command to list all the directories and subdirectories in our current working directory:

$ find . -type d -print0
../Assignments./Assignments/directory_1./Assignments/directory_2./Conference./Conference/directory_3./Projects./Projects/directory_4./Projects/directory_5

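Here, we’re using the “.” symbol to target the current working directory. We then use the -type d option to search for all directories and print their relative paths. The -print0 option ends each path with a NUL character instead of a newline. A terminal doesn’t display NUL characters, which is why the paths appear to run together on a single line. The benefit is that paths containing spaces or newlines can still be told apart reliably.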
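
Because the NUL separators are invisible, it can help to translate them into newlines when we only want to inspect the result. A minimal sketch, where list_dirs is a hypothetical helper name:

```shell
# list_dirs ROOT: print every directory below ROOT, one per line.
# find ends each path with a NUL byte; tr turns those bytes into
# newlines purely for display.
list_dirs() {
    find "$1" -type d -print0 | tr '\0' '\n'
}

list_dirs .
```

This is for viewing only. When the paths are passed on to another command, keeping the NUL separators is the safer choice.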

Next, we’ll pipe the output to a while loop to count the number of files in each directory:

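$ shopt -s nullglob   # an unmatched glob expands to nothing, so empty directories count as 0
$ find . -type d -print0 | while IFS= read -r -d '' dir; do
    files=("$dir"/*)   # every non-hidden entry directly inside $dir
    printf "%5d files in directory %s\n" "${#files[@]}" "$dir"
done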

On running this command, we get this output:

    3 files in directory .
    5 files in directory ./Assignments
    4 files in directory ./Assignments/directory_1
    3 files in directory ./Assignments/directory_2
    3 files in directory ./Conference
    1 files in directory ./Conference/directory_3
    5 files in directory ./Projects
    3 files in directory ./Projects/directory_4
    4 files in directory ./Projects/directory_5

The output from the find command is piped to a while loop. On each iteration, the read command uses the delimiter (-d) option with an empty argument to read up to the next NUL character, so every path printed by -print0 becomes one value.

We then use the -r option so that read treats backslash characters as-is. By default, read interprets a backslash as an escape character and removes it. With -r, a directory name that happens to contain a backslash reaches the loop body unchanged.
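
The effect of -r is easy to check in isolation. The sketch below feeds the same made-up name, back\slash, to read with and without the option. The helper names read_plain and read_raw are hypothetical.

```shell
# read_plain: read one line WITHOUT -r; backslashes act as escape characters
read_plain() {
    read line
    printf '%s\n' "$line"
}

# read_raw: read one line WITH -r; backslashes are kept literally
read_raw() {
    IFS= read -r line
    printf '%s\n' "$line"
}

printf '%s\n' 'back\slash' | read_plain   # prints: backslash
printf '%s\n' 'back\slash' | read_raw     # prints: back\slash
```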

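Finally, read stores each path in a variable named dir. We expand the glob "$dir"/* into an array and print the length of that array as the count. Because the glob matches every non-hidden entry, subdirectories are counted alongside regular files, and names that start with a dot are skipped.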

This method reports a count for every directory in the hierarchy, however deeply nested. Each count covers only the entries directly inside that directory.
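
If we’d rather count only regular files, leaving out subdirectories but including hidden files, one possible variant swaps the glob for a second find call. This is a sketch under stated assumptions rather than a drop-in replacement: count_files is a hypothetical name, and -maxdepth is an extension supported by GNU and BSD find rather than a POSIX option. It runs the loop body through find -exec instead of a while loop.

```shell
# count_files ROOT: for each directory below ROOT, print how many regular
# files it holds directly. Subdirectories are not counted; hidden files are.
count_files() {
    find "$1" -type d -exec sh -c '
        for dir do
            # Emit one NUL-terminated path per regular file, keep only the
            # NUL bytes, and count them. Odd characters in names cannot
            # skew the result this way.
            n=$(find "$dir" -maxdepth 1 -type f -print0 | tr -dc "\0" | wc -c)
            printf "%5d files in directory %s\n" "$n" "$dir"
        done
    ' sh {} +
}

count_files .
```

For the example hierarchy, this should report three files for ./Assignments instead of five, since directory_1 and directory_2 are no longer part of the count.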

4. Using awk

The awk command is a powerful Linux tool typically used for processing text files and generating reports based on that data.

Let’s use awk to find the total number of files in each directory and subdirectory. To count the files accurately, we need to combine it with the find, ls, grep, and wc commands:

$ find . -type d | awk '{print "echo -n \"" $0 " \"; ls -l \"" $0 "\" | grep -v ^total | wc -l"}' | sh

On running this command, we get this output:

. 3
./Assignments 5
./Assignments/directory_1 4
./Assignments/directory_2 3
./Conference 3
./Conference/directory_3 1
./Projects 5
./Projects/directory_4 3
./Projects/directory_5 4

The find command returns a list of all directories and subdirectories in our current working directory, one per line. For each path, awk prints a short shell command line: an echo -n that prints the path without a trailing newline, followed by an ls -l pipeline for that path. The final sh executes these generated commands.

For each directory, ls -l prints one line per entry, preceded by a summary line that starts with “total”. The grep -v command inverts the match and drops that summary line, and the wc -l command counts the lines that remain. The number of lines counted represents the number of entries within that directory.
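
To see exactly what sh receives, we can run the pipeline without its last stage. In this sketch, show_commands is a hypothetical helper name. As two small assumptions of the sketch, the path given to ls is wrapped in double quotes and the grep pattern is anchored as ^total.

```shell
# show_commands ROOT: print the shell commands the awk step generates
# for each directory below ROOT, without executing them.
show_commands() {
    find "$1" -type d |
        awk '{ print "echo -n \"" $0 " \"; ls -l \"" $0 "\" | grep -v ^total | wc -l" }'
}

show_commands .
```

Each output line is a complete command, for example echo -n "./Assignments "; ls -l "./Assignments" | grep -v ^total | wc -l.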

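This method also reports a count for every directory in the hierarchy, however deeply nested. However, it starts an ls, a grep, and a wc process for every directory, so it’s likely to be slower than the first method. Because the paths are pasted into generated shell commands, it’s also unreliable for directory names that contain quotes, newlines, or other characters that are special to the shell.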
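
If the per-directory processes become a bottleneck, one alternative is to let awk do the counting itself in a single pass. The sketch below is an assumption-laden variant, not the method described above: count_entries is a hypothetical name, -mindepth is a GNU and BSD find extension, names containing newlines would be miscounted, and empty directories don’t appear in the output at all. Unlike ls -l, it also counts hidden entries.

```shell
# count_entries ROOT: print "<directory> <count>" for every non-empty
# directory below ROOT, using one find process and one awk process.
count_entries() {
    find "$1" -mindepth 1 -print |
        awk '{
            sub(/\/[^\/]*$/, "")   # strip the last path component, leaving the parent directory
            count[$0]++            # one more entry seen in that directory
        }
        END {
            for (dir in count) print dir, count[dir]
        }' |
        sort
}

count_entries .
```

Here, find lists every entry once, and awk keeps a running total per parent directory in an associative array, so no generated commands are passed to sh.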

5. Conclusion

In this article, we’ve discussed some methods for counting the total number of files in each directory within a specific directory hierarchy.

The first method involves using the find command and a while loop that reads every directory and counts the number of files within.

We’ve also seen another way to solve the problem using the find, awk, grep, and wc commands. Because it starts several processes for every directory, we should consider using the first method if our target hierarchy is large.
