1. Overview

The du command is a powerful tool for analyzing disk usage in Linux. Moreover, it efficiently calculates the amount of disk space consumed by files and directories within a specified path.

However, a common pitfall arises when we list a directory’s contents with du and a wildcard, as in du -sh *: hidden files and directories, whose names begin with a dot, don’t appear in the output. Overlooking them can make our disk usage look significantly smaller than it is, leading to inaccurate storage assessments.

In this tutorial, we’ll delve into the reasons behind this limitation and explain various methods to overcome it. We’ll explore different approaches, ranging from combining existing commands to crafting custom scripts, ensuring that we have the right tools for the job.

2. Understanding How du Works

The du command works by recursively traversing directory trees and adding up the size of each file it encounters, including files and directories whose names begin with a dot (.). By default, however, it prints only one line per directory, and when we pass it a shell wildcard such as *, the shell leaves dot-prefixed names out of the expansion, so du never receives them.

In Linux, a leading dot is the convention that marks files and directories as hidden. They often hold configuration and cache data, and most tools don’t show them by default. Consequently, leaving them out of a du listing can mislead us about our true storage consumption.

Let’s illustrate this through an example. Say we have a directory named Documents containing a visible file named report.txt and a hidden file named .history:

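# du -sh Documents
4.0K Documents

Here, du prints a single summary line for Documents. The space used by .history is included in that figure, but the file itself isn’t listed. In an attempt to list hidden entries as well, we can pass ./* to du, shown here for a home directory: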

# du -sh ./*
601M ./Desktop
598M ./Downloads
4.0K ./flash
4.0K ./Music
8.0M ./Pictures
4.0K ./Public

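As the output shows, hidden entries such as .history still don’t appear. The shell expands ./* before du runs, and by default that expansion skips names that begin with a dot.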
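The behavior described above can be checked with a small experiment. The following is a minimal sketch, assuming a typical Linux system where mktemp, head -c, and /dev/urandom are available; the directory layout and the variable names (demo, total_kib, glob_list) are made up for illustration:

```shell
#!/bin/sh
# Build a throwaway Documents directory with one visible and one hidden file.
demo=$(mktemp -d)
mkdir "$demo/Documents"
printf 'report\n' > "$demo/Documents/report.txt"
head -c 1048576 /dev/urandom > "$demo/Documents/.history"   # 1 MiB of random data

# du walks the whole tree, so the hidden file counts toward the total (in KiB).
total_kib=$(du -sk "$demo/Documents" | cut -f1)

# The shell expands ./* before du runs; by default the expansion skips dot names.
cd "$demo/Documents" || exit 1
glob_list=$(printf '%s\n' ./*)

echo "total for Documents: ${total_kib} KiB"
echo "entries matched by ./*: $glob_list"
```

On most filesystems, the total should come out a little above 1024 KiB, while ./* matches only ./report.txt.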

3. Examining Different Approaches

In this section, we’ll talk through various methods to tackle this problem. In addition, we’ll try out a combination of two or more commands, and explore scripts that meet our end goal.

3.1. Combining ls and du Commands

One approach to tackle this challenge involves combining the ls and du commands. In particular, the ls command lists the files and directories in a given path, including hidden ones when we use the -A flag. We can then pipe the output of ls -A into a loop that runs du on each listed item.

Let’s check out a real-life example:

$ ls -A Documents | while read -r file; do du -sh "Documents/$file"; done | sort -h
4.0K  Documents/file1.txt
8.0K  Documents/file2.pdf
12K  Documents/image.jpg
20K  Documents/document.docx
1.1M  Documents/large_file.zip

Before we dive into the above output, let’s break down what the command does:

  • ls -A Documents: lists all files and directories in the Documents directory, including hidden ones (those starting with a dot) but not the special . and .. entries
  • while read -r file; do du -sh "Documents/$file"; done: reads each name from the output of ls and runs du -sh on it, where -s prints a single summary line per item and -h prints sizes in human-readable format
  • sort -h: sorts the output of du -sh by human-readable size, from smallest to largest

The listing above is limited to one summary line per top-level entry. To list every file individually, including those inside hidden subdirectories, we can use the -ah options with du instead of -sh:

4.0K  Documents/file1.txt
8.0K  Documents/file2.pdf
12K  Documents/image.jpg
20K  Documents/document.docx
1.1M  Documents/large_file.zip
512K  Documents/software/app_installer.exe
1.1M  Documents/.large_file.pdf
2.5M  Documents/.videos/long_movie.mkv
10M   Documents/.data/database.db

From the output above, we can see that hidden files are present. This approach effectively includes hidden files, but we should keep its limitations in mind. Running du once per item can be slow for large directories. Additionally, parsing the output of ls is fragile in scripts, since unusual filenames (for example, names containing newlines) can break the loop.
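One way to avoid parsing ls output is to let the shell expand the names itself. The following is a sketch rather than the method used above: the patterns .[!.]* and ..?* are a common POSIX shell idiom for matching dot-prefixed names other than . and .., and the sample directory and file names are assumptions for illustration:

```shell
#!/bin/sh
# Sample directory with a visible file (its name contains a space) and hidden entries.
dir=$(mktemp -d)
printf 'visible\n' > "$dir/my report.txt"
printf 'hidden\n' > "$dir/.history"
mkdir "$dir/.cache"
printf 'cached\n' > "$dir/.cache/data"

# "$dir"/*       matches visible names
# "$dir"/.[!.]*  matches hidden names such as .history
# "$dir"/..?*    matches rarer names such as ..backup
listing=$(
    for entry in "$dir"/* "$dir"/.[!.]* "$dir"/..?*; do
        # An unmatched pattern stays literal, so skip anything that doesn't exist.
        [ -e "$entry" ] || continue
        du -sk "$entry"
    done | sort -n
)
printf '%s\n' "$listing"
```

Since -k prints plain KiB figures, sort -n is enough here. Swapping in du -sh and sort -h gives human-readable sizes on systems whose sort supports -h.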

3.2. Utilizing find and du Commands

For increased flexibility and control, we can leverage the power of the find command. This command facilitates locating files and directories based on specific criteria. By combining find with du, we can precisely target hidden files and calculate their disk usage.

Let’s illustrate this with an example:

$ find Downloads \( -type d -name '.*' -or -type f -name '.*' \) -print0 | xargs -0 du -sh
3.5M Downloads/.config
1.2M Downloads/.cache
4.0M Downloads/.hidden_dir
2.5M Downloads/.hidden_file.txt

This find command searches within the Downloads directory for directories (-type d) whose names begin with a dot (-name '.*'). In addition, the -type f test with the same name pattern matches hidden regular files.

Next, -print0 makes find separate its results with null characters, which is safer for filenames with spaces or special characters. Finally, xargs -0 reads the null-separated list and passes the entries as arguments to du -sh, which prints one human-readable summary line per entry.
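As a variation on the command above, find can stop descending into a hidden directory once it has matched it, since du totals that directory’s contents anyway. The following is one possible sketch, assuming GNU find and xargs (for -mindepth, -print0, -0, and -r); the sample tree is hypothetical:

```shell
#!/bin/sh
# Hypothetical sample tree with hidden entries at several levels.
dir=$(mktemp -d)
mkdir -p "$dir/.cache" "$dir/visible"
printf 'a\n' > "$dir/.cache/.nested"
printf 'b\n' > "$dir/.hidden file.txt"      # name with a space
printf 'c\n' > "$dir/visible/.deep"

# -mindepth 1 : never test the starting directory itself (it may be ".")
# -name '.*'  : match hidden files and directories alike
# -prune      : don't descend into a matched directory; du totals it anyway
# xargs -0 -r : read null-separated names; skip du entirely if nothing matched
report=$(find "$dir" -mindepth 1 -name '.*' -prune -print0 | xargs -0 -r du -sk)
printf '%s\n' "$report"
```

Here, .cache, .hidden file.txt, and visible/.deep each get one line, while .cache/.nested is only counted as part of .cache.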

3.3. Utilizing Bash Scripts

We can wrap the find and du combination in a script for easier reuse and automation. First, let’s look at the script, which we save as du_hidden.sh:

#!/bin/bash
# Print the disk usage of every hidden file and directory under the given directory
function du_hidden() {
    local dir="$1"
    find "$dir" \( -type d -name '.*' -or -type f -name '.*' \) -print0 | xargs -0 du -sh
}

# Call the function with the directory passed on the command line
du_hidden "$1"

Next, let’s ensure that the script is executable:

$ chmod +x du_hidden.sh

Finally, let’s run the script and check the output:

$ ./du_hidden.sh Logs
2.0M    Logs/.hidden_file1.txt
1.0M    Logs/.hidden_file2.log
500K    Logs/.hidden_folder

The function keyword defines a function named du_hidden that takes a directory as its argument and stores it in the local variable dir. The last line of the script calls the function with the first command-line argument, so we can pass the script any directory name we wish.

3.4. Leveraging the shopt -s dotglob Option

This approach utilizes shopt, a Bash builtin that toggles shell options, to alter how the shell expands filenames. By enabling the dotglob option, we instruct the shell to include files and directories starting with a dot when it expands wildcards (*). This allows us to use the du command with wildcards directly and include hidden items.

However, the wildcard only matches hidden files and directories that are directly within the specified directory. Hidden items inside subdirectories still count toward the size of their parent directory, but they aren’t listed on their own. Likewise, hidden items drop out of the listing again once we disable dotglob.

Let’s suppose we have a directory called Folders with the following contents:

Folders/
|-- file1.txt
|-- .hidden_file.txt
|-- folder/
|   |-- file2.txt
|   |-- .hidden_file2.txt
|-- .hidden_folder/
|   |-- hidden_file3.txt

Next, let’s enable dotglob and run du with a wildcard:

$ shopt -s dotglob
$ du -sh Folders/*
4.0K    Folders/file1.txt
4.0K    Folders/.hidden_file.txt
8.0K    Folders/folder
4.0K    Folders/.hidden_folder

There are some key takeaways from the output above:

  • Folders/file1.txt and Folders/.hidden_file.txt are both listed because dotglob allows the wildcard * to match hidden files
  • Folders/folder shows the total size of the folder directory, which includes the hidden file within it, but doesn’t list its contents
  • Folders/.hidden_folder shows the size of the .hidden_folder directory

Once we’re done, it’s a good idea to disable dotglob again:

$ shopt -u dotglob

It’s important to exercise caution when we use dotglob as it can affect other shell commands that rely on wildcard matching.
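One way to limit these side effects is to enable dotglob only inside a child Bash process, so the option never changes in the calling shell. The following is a minimal sketch under that assumption; it recreates the Folders tree from above in a temporary directory, and the variable names (demo, listing, visible_count) are illustrative:

```shell
#!/bin/bash
# Recreate the sample Folders tree in a temporary location.
demo=$(mktemp -d)
mkdir -p "$demo/Folders/folder" "$demo/Folders/.hidden_folder"
touch "$demo/Folders/file1.txt" "$demo/Folders/.hidden_file.txt" \
      "$demo/Folders/folder/file2.txt" "$demo/Folders/folder/.hidden_file2.txt" \
      "$demo/Folders/.hidden_folder/hidden_file3.txt"

# dotglob is set only in the child bash, which exits as soon as du finishes.
listing=$(bash -c 'shopt -s dotglob; cd "$1" && du -sk -- *' _ "$demo/Folders")
printf '%s\n' "$listing"

# Back in the calling shell, * still skips hidden names by default.
set -- "$demo"/Folders/*
visible_count=$#
echo "entries matched in the calling shell: $visible_count"
```

The child process lists all four top-level entries, while the wildcard in the calling shell still matches only file1.txt and folder.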

3.5. Using du -ahd1

After diving into more complex methods, it’s worth exploring a straightforward yet powerful du option. Particularly, the du -ahd1 command provides a concise overview of disk usage, including hidden files and directories:

$ du -ahd1 Documents/
4.0K Documents/file1.txt
8.0K Documents/file2.pdf
12K Documents/image.jpg
20K Documents/document.docx
1.1M Documents/large_file.zip
512K Documents/software/app_installer.exe
1.1M Documents/.large_file.pdf
2.5M Documents/.videos/long_movie.mkv
10M Documents/.data/database.db

Next, let’s understand the command in use:

  • -a: reports every file, not just directories; hidden entries are listed too because du itself doesn’t skip dot-prefixed names
  • -h: prints sizes in human-readable format (for example, K, M, or G)
  • -d 1: limits the listing to entries at most one level below the given directory

This command will list all files and directories within the Documents directory, including hidden ones, along with their sizes, in a human-readable format. Furthermore, the -d 1 option ensures that only the direct contents of Documents are listed. du still traverses subdirectories to compute their totals, but it doesn’t print their contents.

On the other hand, while this method offers a quick overview, it might not be sufficient for complex analysis or large directories. In such cases, the previously discussed methods provide more granular control.
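Because du -a -d 1 prints one line per direct entry, its output can be piped into sort to find the largest items, hidden or not. The following is a minimal sketch, assuming a du that supports -d (GNU coreutils and the BSDs do); the sample files are made up, and -k replaces -h so that a plain sort -n works:

```shell
#!/bin/sh
# Made-up sample directory: a large hidden file, a small visible file,
# and a hidden directory with one file inside it.
demo=$(mktemp -d)
head -c 2097152 /dev/urandom > "$demo/.big.bin"      # 2 MiB hidden file
head -c 65536 /dev/urandom > "$demo/small.bin"       # 64 KiB visible file
mkdir "$demo/.cache"
head -c 524288 /dev/urandom > "$demo/.cache/blob"    # 512 KiB inside .cache

# One line per direct entry plus one for the directory itself, smallest first.
report=$(du -ak -d 1 "$demo" | sort -n)
printf '%s\n' "$report"

# The last line is the total for the directory; the line before it is the
# largest direct entry.
largest=$(printf '%s\n' "$report" | tail -n 2 | head -n 1 | cut -f2)
echo "largest entry: $largest"
```

The .cache line reflects the size of blob even though blob itself isn’t listed, which is the effect of -d 1 described above.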

4. Comparing Discussed Methods

Now, let’s compare the different approaches we’ve explored:

Method | Advantages | Disadvantages
Combining ls and du | Simple to understand | Inefficient for large directories, verbose output
Using find and du | Flexible, can target specific files/directories | Requires more complex find expressions
Leveraging dotglob | Simple to use | Can affect other shell commands, potential side effects
Using Scripts | Highly customizable | Requires scripting knowledge
Using du -ahd1 | Provides a quick overview, includes hidden files | Limited to the single directory level, less detailed

The optimal method depends on specific requirements, such as directory size, desired level of detail, and scripting needs. For quick checks, du -ahd1 might suffice. For in-depth analysis or complex scenarios, custom scripts or combining find and du offer greater flexibility.

5. Conclusion

In this article, we’ve learned how to account for hidden files and directories when using du, which is vital for accurate disk usage analysis. With the methods presented here, we can work around the way wildcards leave out dot-prefixed entries and gain valuable insights into our disk space consumption.

A good starting point is to choose the method that best suits our specific needs while also considering the performance and accuracy implications. With the right tools and knowledge, we can manage our disk space efficiently.