
Last updated: March 18, 2024
Listing files is a common operation when we work with the Linux command line. Usually, we’ll use two commands to list files: the ls command and the find command.
In this tutorial, we’ll explore how to sum up the size of listed files. Of course, we’ll cover both ls and find commands.
To demonstrate how to sum up file sizes, let’s create a directory and some files as an example:
$ tree -f myDir
myDir
├── myDir/001.txt
├── myDir/002.txt
├── myDir/003.txt
├── myDir/004.txt
├── myDir/005.txt
├── myDir/images
│ ├── myDir/images/image01.jpg
│ └── myDir/images/image02.jpg
├── myDir/picture01.jpg
└── myDir/picture02.jpg
1 directory, 9 files
As the tree command’s output above shows, under the myDir directory, we have some files and a subdirectory.
Next, let’s see how to calculate the total size of the listed files.
We know that using the ls command with the -l option lists files with detailed information. For example, let’s enter the myDir directory and list all files under it using ls -l:
$ ls -l *.*
-rw-r--r-- 1 kent kent 2 Dec 9 13:20 001.txt
-rw-r--r-- 1 kent kent 3 Dec 9 13:20 002.txt
-rw-r--r-- 1 kent kent 4 Dec 9 13:20 003.txt
-rw-r--r-- 1 kent kent 5 Dec 9 13:20 004.txt
-rw-r--r-- 1 kent kent 6 Dec 9 13:20 005.txt
-rw-r--r-- 1 kent kent 354 Dec 9 13:21 picture01.jpg
-rw-r--r-- 1 kent kent 131072 Dec 9 13:23 picture02.jpg
As the output above shows, the files directly under myDir are listed, with detailed information on each file shown in columns. It’s worth mentioning that ls -l *.* only lists entries in the current directory whose names contain a dot. The images subdirectory and the files inside it aren’t included.
Let’s take the 001.txt file as an example to understand the ls -l output:
-rw-r--r-- 1 kent kent 2 Dec 9 13:20 001.txt
_--------- - ---- ---- - ----------- -------
^   ^      ^  ^    ^   ^      ^       ^
|   |      |  |    |   |      |       +- The filename
|   |      |  |    |   |      |
|   |      |  |    |   |      +--------- The last modification time
|   |      |  |    |   +---------------- The file size in bytes
|   |      |  |    +-------------------- The file owner group
|   |      |  +------------------------- The file owner
|   |      +---------------------------- The number of hard links
|   +----------------------------------- File Permissions
+--------------------------------------- The file type flag, for example:
                                         '-': regular file, 'd': directory, etc.
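The field layout above can be checked with a short awk sketch. The describeRecord name is hypothetical and only serves this illustration, and the sketch assumes the filename contains no whitespace, so that it lands in the ninth field:

```shell
# Hypothetical helper: label a few fields of one `ls -l` record.
# Assumes the filename has no whitespace, so it is exactly field 9.
describeRecord() {
    awk '{ printf "type=%s links=%s owner=%s group=%s size=%s name=%s\n",
                  substr($1, 1, 1), $2, $3, $4, $5, $9 }'
}

# Feed the sample record from the article to the helper.
printf '%s\n' '-rw-r--r-- 1 kent kent 2 Dec 9 13:20 001.txt' | describeRecord
# prints: type=- links=1 owner=kent group=kent size=2 name=001.txt
```

The first character of the first field is the file type flag, and the fifth field is the size in bytes.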
Now that we understand the ls -l output, if we want to sum the file sizes in the ls -l list, we need to sum the fifth column (file size in bytes) in each file record. To achieve that, we can pipe the ls -l output to the awk command:
$ ls -l *.* | awk '{ sum += $5 } END{ print sum }'
131446
As we can see, the total size (in bytes) of the listed files is calculated and printed. A compact awk one-liner solves our problem. However, we need to type the awk command whenever we want to sum up the filesizes in the ls output. It’s a bit inconvenient.
So next, let’s turn the awk command into a generic shell function, to sum the values in a given column.
First, let’s look at the sumCol function:
# Sum the values in the column whose number is passed as the first argument
sumCol() {
    awk -v col="$1" '{ sum += $col } END{ print sum }'
}
As we can see, it looks pretty similar to the previous awk command. The only difference is that, instead of hard-coding the column number, the awk command in the sumCol function accepts the column number passed to the shell function.
Next, let’s source the function and see how to use it with the ls -l command:
$ ls -l *.* | sumCol 5
131446
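When the listing comes from a plain ls -l without a glob, it also contains a "total" header line and possibly directory records, whose sizes we usually don't want in the sum. A minimal sketch of a filtered variant follows. The sumRegularFiles name is hypothetical, and the sample input is invented for illustration:

```shell
# Hypothetical helper: sum column 5 of `ls -l` output, regular files only.
sumRegularFiles() {
    # A regular file's mode string starts with '-'; 'd' marks a directory.
    # "sum + 0" makes awk print 0 instead of an empty line for empty input.
    awk '$1 ~ /^-/ { sum += $5 } END { print sum + 0 }'
}

# Invented sample imitating `ls -l` output with a header and a directory.
printf '%s\n' \
    'total 148' \
    'drwxr-xr-x 2 kent kent 4096 Dec 9 13:23 images' \
    '-rw-r--r-- 1 kent kent 354 Dec 9 13:21 picture01.jpg' \
    '-rw-r--r-- 1 kent kent 131072 Dec 9 13:23 picture02.jpg' |
    sumRegularFiles
# prints: 131426
```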
We can also use the sumCol function to sum other columns. Let’s see another example:
$ cat numInCol.txt
1 2 3
4 5 6
7 8 9
$ cat numInCol.txt | sumCol 2
15
$ cat numInCol.txt | sumCol 3
18
In the examples above, we use the numInCol.txt file to simulate some column-based output. We see it’s pretty straightforward to use our sumCol function to sum numbers in a given column.
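Not all column-based output is separated by whitespace. As a sketch under that assumption, a variant of the function could accept the field separator as an optional second argument. The sumColSep name is hypothetical:

```shell
# Hypothetical variant of sumCol: the optional second argument is the
# field separator. It defaults to a single space, which awk treats as
# "split on runs of whitespace", the same as its default behavior.
sumColSep() {
    awk -F "${2:- }" -v col="$1" '{ sum += $col } END { print sum + 0 }'
}

# Sum the second column of comma-separated input.
printf '%s\n' '1,2,3' '4,5,6' '7,8,9' | sumColSep 2 ','
# prints: 15
```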
As we’ve mentioned, find is another popular way to search and list files. By default, find searches files recursively. For example, we can list all *.jpg image files recursively in the myDir directory:
$ find myDir -name '*.jpg'
myDir/images/image02.jpg
myDir/images/image01.jpg
myDir/picture02.jpg
myDir/picture01.jpg
So next, let’s figure out how to calculate the total size of these found files.
We know that the du command with the -b option reports the size of the given files or directories in bytes. The -b option is specific to GNU du and is equivalent to --apparent-size --block-size=1. For example:
$ du -b myDir/picture01.jpg
354 myDir/picture01.jpg
Additionally, we can add the -c option to make du sum up the file sizes for all files we pass to it:
$ du -bc myDir/*.jpg
354 myDir/picture01.jpg
131072 myDir/picture02.jpg
131426 total
Instead of passing filenames directly to the du command, we can use the --files0-from=F option to tell du to read filenames from the file F. It’s worth mentioning that when F is -, du reads filenames from stdin. Further, the filenames must be terminated by a null character. This is pretty useful if we pipe a bunch of filenames to the du command.
We’ve seen that the find command prints each filename followed by a newline character by default. So if we want du to process the filenames found by the find command, we can use the -print0 action instead. find’s -print0 action prints each filename followed by a null character, which is precisely the input format that du expects with the --files0-from=- option.
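To see why null-terminated names are safer than newline-terminated ones, here is a small sketch that counts found files in both ways. The countFound name is hypothetical, and the temporary directory exists only for this demonstration:

```shell
# Hypothetical helper: count the regular files find reports under a directory.
countFound() {
    # Keep only the NUL terminators and count them: one per filename.
    find "$1" -type f -print0 | tr -cd '\0' | wc -c
}

# Demonstration: one of the two filenames contains a newline character.
tmp=$(mktemp -d)
touch "$tmp/plain.txt" "$tmp/$(printf 'two\nlines.txt')"

find "$tmp" -type f | wc -l   # counts 3 lines for 2 files
countFound "$tmp"             # counts 2 NUL terminators

rm -rf "$tmp"
```

Counting lines overcounts here because a newline is a legal character in a filename, whereas a null character can never appear in one.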
Next, let’s pipe find’s output to du to get the total filesize:
$ find myDir -name '*.jpg' -print0 | du -bc --files0-from=-
6608 myDir/images/image02.jpg
3639 myDir/images/image01.jpg
131072 myDir/picture02.jpg
354 myDir/picture01.jpg
141673 total
As the output above shows, we’ve got a complete filesize report with a total value. In case we’re only interested in the total value, we can pipe the filesize report to the tail command:
$ find myDir -name '*.jpg' -print0 | du -bc --files0-from=- | tail -1
141673 total
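If a script needs only the number, without the "total" label, the last line can be trimmed further. The following is a minimal sketch that relies on du separating the size and the name with a tab, and on the GNU-specific -b and --files0-from options. The totalBytes name is hypothetical:

```shell
# Hypothetical helper: print the total size in bytes of the regular files
# under directory $1 whose names match the pattern $2.
totalBytes() {
    find "$1" -type f -name "$2" -print0 |
        du -bc --files0-from=- |
        tail -n 1 |
        cut -f1   # du separates size and name with a tab; keep the size
}

# Usage with the example directory from this tutorial:
# totalBytes myDir '*.jpg'
```

With the example tree above, this should print the same 141673 value that the tail command reported.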
We’ve seen that find’s -print0 action prints each filename followed by a null character. The find command supports other actions. For example, the -printf FORMAT action outputs various information through the given FORMAT.
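Next, let’s look at a few commonly used formats: %p prints the file’s path, %f prints the filename without its leading directories, and %s prints the file size in bytes.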
Our goal is to sum the sizes of the found files. Therefore, we can use the “%s” format to output each file’s size in bytes:
$ find myDir -name '*.jpg' -printf "%s\n"
6608
3639
131072
354
To calculate the sum of these filesizes, we can use our sumCol function again:
$ find myDir -name '*.jpg' -printf "%s\n" | sumCol 1
141673
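The -printf action can also emit the size and the path together, so that a single awk program produces a report similar to du -c. This is only a sketch: the sizeReport name is hypothetical, and it assumes that no filename contains a newline character:

```shell
# Hypothetical helper: print "size<TAB>path" for each regular file under
# directory $1 matching the pattern $2, followed by a "total" line.
# Assumes that no filename contains a newline character.
sizeReport() {
    find "$1" -type f -name "$2" -printf '%s\t%p\n' |
        awk -F '\t' '{ sum += $1; print } END { printf "%d\ttotal\n", sum }'
}

# Usage with the example directory from this tutorial:
# sizeReport myDir '*.jpg'
```

Unlike the du pipeline, this approach needs no null-terminated input, at the cost of the newline assumption stated above.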
In this article, we’ve learned how to sum up the size of files listed by the ls -l and the find commands. We saw that neither command reports the total size in bytes on its own, but awk and du can do the job easily.