1. Introduction

When writing a Bash script, we often encounter situations where we need to analyze or manipulate the contents of several files in multiple directories.

In this tutorial, we’ll learn about writing a shell script to walk through a directory structure and automate actions on files within that structure. Firstly, we’ll use the cd command with a for loop to go through a directory tree. After that, we’ll look at the find command to traverse a path and its subdirectories. Lastly, we’ll create a function for our purposes.

2. Sample Directory Tree

For our examples, we’ll use a sample directory tree:

walkthrough-directory/
|-- dir1/
|   |-- file1.txt
|   |-- file2.txt
|   `-- file3.txt
|-- dir2/
|   |-- file1.txt
|   |-- file2.txt
|   `-- file3.txt
`-- dir3/
    |-- file1.txt
    |-- file2.txt
    `-- file3.txt

Here, we have a structure with three subdirectories. Within each subdirectory, we have three files. The top-level directory is walkthrough-directory.

All scripts we create below start with a shebang.

3. Using a for Loop and cd

A basic way to walk through a directory hierarchy is to use the cd command within a for loop that uses globbing.

For instance, let’s consider a scenario where we want to traverse through a directory and count the number of words in each file. For this purpose, we’ll create the cdDirectory.sh script and display it via cat:

$ cat cdDirectory.sh
#!/bin/bash
for d in ./*/ ; do
    # Loop through each subdirectory 
    # and count the number of words in each file
    (cd "$d" && wc -w *.txt);
done

In this script, we use a for loop with ./*/ to iterate over the subdirectories of the current directory. In the ./*/ glob pattern, ./ refers to the current directory, and */ matches every directory within it, since the trailing slash restricts the matches to directories.

In addition, the variable d holds the subdirectory path during each iteration. We then use the cd command to switch the working directory to the one being iterated over. Because cd and wc run inside parentheses, they execute in a subshell, so the working directory of the script itself never changes.

Within the working directory, the script executes wc -w on all the files with the .txt extension to count the number of words.

Now, let’s make the script executable and run it:

$ chmod +x cdDirectory.sh
$ ./cdDirectory.sh
2 file1.txt
6 file2.txt
8 file3.txt
16 total
7 file1.txt
9 file2.txt
4 file3.txt
20 total
12 file1.txt
6 file2.txt
8 file3.txt
26 total

As we can see, each file is part of the output, although we don’t see the subdirectories.
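If we also want to see which subdirectory each group of counts belongs to, one option is to print the directory path before calling wc. The following is only a minimal sketch under that assumption; the count_words_per_dir function name and the nullglob handling are our own additions and not part of cdDirectory.sh:

```shell
#!/bin/bash
# Hypothetical helper: print each subdirectory, then the word counts of its .txt files
count_words_per_dir() {
    local top="${1:-.}"
    local d
    for d in "$top"/*/; do
        # Skip the unexpanded pattern when there are no subdirectories
        [ -d "$d" ] || continue
        echo "== $d"
        # The subshell keeps cd and shopt from affecting the rest of the script
        (
            cd "$d" || exit 1
            shopt -s nullglob
            files=(*.txt)
            if [ "${#files[@]}" -gt 0 ]; then
                wc -w "${files[@]}"
            else
                echo "no .txt files"
            fi
        )
    done
}
```

Calling count_words_per_dir walkthrough-directory from the parent directory would then print a header line such as == walkthrough-directory/dir1/ before each group of counts.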

4. Using find With exec

Another method to walk through a directory tree and process files is to use find with exec.

For example, to count the words in the .txt files of a directory and each of its subdirectories, we’ll use the find command with the -exec option to execute Bash commands. To demonstrate, we’ll create the findDirectory.sh script:

$ cat findDirectory.sh
#!/bin/bash
# assigning the top-level directory
top_dir="./walkthrough-directory"
# using find to search for directory and execute the bash commands
find "$top_dir" -type d -exec bash -c '
    echo "Entering directory: $0"
    cd "$0" || exit 1
    word_count=$(wc -w *.txt 2>/dev/null | tail -n1)
    if [ -n "$word_count" ]; then
        echo "Word count in $0: $word_count"
    else
        echo "No .txt files in $0"
    fi
    echo "Exiting directory: $0"
    cd - >/dev/null
' {} \;

Firstly, we use the variable top_dir to supply the path of the top-level directory. Notably, we should change the value of this variable to the actual path of the directory we want to start searching from.

Importantly, because of -type d, the find command only matches directories and runs the given command once for each of them. In particular, we employ the -exec option with the bash -c switch to run a whole script as supplied on the command line.

Within that script, $0 is a special parameter that usually contains the name of the shell or script being executed. However, bash -c assigns the first argument after the command string to $0. Since that argument is {}, which find replaces with a path, $0 holds the directory currently being processed.
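To see how these parameters are filled in, here is a small sketch that is independent of find. It relies only on the documented behaviour of bash -c, where the first argument after the command string becomes $0 and the following ones become $1, $2, and so on. The ./dir1 value and the _ placeholder are purely illustrative:

```shell
#!/bin/bash
# The first argument after the command string is assigned to $0
as_zero=$(bash -c 'echo "$0"' ./dir1)
echo "$as_zero"    # ./dir1

# A common convention passes a placeholder as $0 so the real argument lands in $1
as_one=$(bash -c 'echo "$1"' _ ./dir1)
echo "$as_one"     # ./dir1
```

With find, the second form could be written as -exec bash -c '...' _ {} \; so that $0 remains available as a name for error messages.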

To get a word count, we use a pipeline with wc:

  • wc -w *.txt counts the number of words in all the files with the .txt extension
  • 2>/dev/null discards any error messages, such as the one wc prints when no .txt file exists
  • tail -n1 extracts only the last line of the output, which shows the total number of words when there are several files
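One detail worth keeping in mind is that wc prints a total line only when it receives more than one file, so with a single .txt file tail -n1 returns that file’s own line instead. A possible way to always get a bare number, sketched here with a hypothetical total_words function, is to feed the concatenated files to wc on standard input:

```shell
#!/bin/bash
# Hypothetical helper: print the total word count of the .txt files in a directory
total_words() {
    local dir="$1"
    (
        cd "$dir" || exit 1
        shopt -s nullglob
        files=(*.txt)
        # With no matching files, report zero instead of letting cat wait on stdin
        if [ "${#files[@]}" -eq 0 ]; then
            echo 0
        else
            # Reading from stdin makes wc print the count without a file name;
            # awk strips the padding that some wc implementations add
            cat -- "${files[@]}" | wc -w | awk '{print $1}'
        fi
    )
}
```

This is a design alternative rather than a correction: the tail -n1 approach keeps the total label, while this variant trades it for output that is easier to use in arithmetic.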

Next, an if condition checks the word_count variable. If it isn’t empty, meaning that wc produced output for at least one .txt file, the script prints the word count. Otherwise, the script prints an alternative message.

Lastly, we use the cd - command to go back to the previous directory. Here, we again suppress the output, since cd - prints the directory it switches to.

The characters {} \; are part of the -exec option:

  • {} gets replaced by the path of the directory that find is currently processing
  • \; marks the end of the command passed to -exec, so find runs that command once per directory
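For comparison, find can also hand the matching files straight to a command without changing directories. The sketch below is an alternative rather than the approach used in findDirectory.sh: it ends -exec with + instead of \;, so find passes many paths to a single wc invocation. The count_txt_words function name is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: word counts for every .txt file below a directory
count_txt_words() {
    local top="${1:-.}"
    # -type f restricts the match to regular files;
    # {} + appends as many paths as fit onto one wc command line
    find "$top" -type f -name '*.txt' -exec wc -w {} +
}
```

Because wc receives all the files at once, it can print one overall total instead of one total per directory, which may or may not be what we want.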

Finally, we make the script executable and run it:

$ chmod +x findDirectory.sh
$ ./findDirectory.sh
Entering directory: ./walkthrough-directory
Word count in ./walkthrough-directory: 4 file.txt
Exiting directory: ./walkthrough-directory
Entering directory: ./walkthrough-directory/dir1
Word count in ./walkthrough-directory/dir1:  4 total
Exiting directory: ./walkthrough-directory/dir1
Entering directory: ./walkthrough-directory/dir2
Word count in ./walkthrough-directory/dir2:  4 total
Exiting directory: ./walkthrough-directory/dir2
Entering directory: ./walkthrough-directory/dir3
Word count in ./walkthrough-directory/dir3:  4 total
Exiting directory: ./walkthrough-directory/dir3

From the output, we can see the word count for the .txt files inside each directory.

5. Using a Function

Another possible approach we can take is to define a function that switches directories and executes the commands on files inside those directories.

In this example, we’ll create a function to traverse directories and subdirectories and search for a string in files. For this purpose, we’ll create the funcDir.sh script:

$ cat funcDir.sh
#!/bin/bash
# Function to process a directory
process_directory() {
    local dir="$1"
    echo "Entering directory: $dir"
    pushd "$dir" || return
    # Search for a specific string in files
    search_string="your_search_string"
    grep -r "$search_string" .
    echo "Exiting directory: $dir"
    popd >/dev/null
}
# Assuming the top-level directory is where the script is run
top_dir="./"
# Loop through each subdirectory
for sub_dir in "$top_dir"/*; do
    if [ -d "$sub_dir" ]; then
        process_directory "$sub_dir"
    fi
done

In this example, we create a function process_directory that enters a directory and searches for the string stored in search_string. We call this function inside a for loop that iterates over the entries directly under the top directory, while grep -r takes care of descending into deeper subdirectories.
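If we’d rather have the function itself descend through the tree instead of leaving that to a recursive tool, one possibility is to let it call itself for each subdirectory. This is only a sketch under that assumption; walk_tree is a hypothetical name, and the echo stands in for whatever per-directory action we need:

```shell
#!/bin/bash
# Hypothetical recursive walker: visit a directory, then each of its subdirectories
walk_tree() {
    local dir="$1"
    local entry
    echo "Visiting: $dir"
    # The * glob skips hidden entries, so dot-directories are not visited
    for entry in "$dir"/*; do
        # Recurse only into real directories, not symbolic links to them
        if [ -d "$entry" ] && [ ! -L "$entry" ]; then
            walk_tree "$entry"
        fi
    done
}
```

Declaring dir and entry as local matters here, because each recursive call would otherwise overwrite the values of its caller.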

Apart from that, we use the pushd and popd commands to change the directories. These commands are useful when we want to switch between multiple directories as well as maintain a history of directory switches:

  • pushd "$dir" changes the current directory to the directory stored in $dir
  • the previous directory stays on the directory stack, and the new directory is placed on top of it
  • the logical OR with return leaves the function, passing on the exit status of pushd, if the pushd command fails
  • the popd command removes the top entry from the directory stack and switches back to the directory we came from
  • >/dev/null discards the output that popd generates to keep our output clean
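To make the stack behaviour concrete, here is a minimal sketch that records the working directory and the stack depth before pushd, inside the new directory, and after popd. The scratch directory and the variable names are only for illustration; DIRSTACK is the Bash array that holds the directory stack:

```shell
#!/bin/bash
# Create a scratch directory to switch into (illustrative only)
scratch=$(mktemp -d)

start_dir=$PWD
depth_before=${#DIRSTACK[@]}

# pushd puts the new directory on top of the stack and switches to it
pushd "$scratch" >/dev/null || exit 1
inside_dir=$PWD
depth_inside=${#DIRSTACK[@]}

# popd drops the top entry and returns to the directory we started in
popd >/dev/null || exit 1
end_dir=$PWD

echo "stack depth before: $depth_before, inside: $depth_inside"
[ "$start_dir" = "$end_dir" ] && echo "back where we started"
rmdir "$scratch"
```

The stack grows by one entry while we are inside the scratch directory, and popd restores both the depth and the working directory.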

Now, let’s make the script executable and run it:

$ chmod +x funcDir.sh
$ ./funcDir.sh
Entering directory: .//walkthrough-directory 
baeldung-linux/walkthrough-directory /mnt/d/baeldung-linux
./cdDirectory2.sh:search_string="my file" 
./dir1/file1.txt:this is my file 
./dir2/file2.txt:this is my file
./dir3/file3.txt:this is my file
./file.txt:this is my file 
Exiting directory: .//walkthrough-directory

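In the output, the line after the Entering directory message is the directory stack printed by pushd, since the script only redirects the output of popd. Overall, the combination of pushd and popd inside a function provides an effective strategy for traversing the hierarchy.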

6. Conclusion

In this article, we learned how to walk through a directory structure in a shell script.

To begin with, we discussed the use of the cd command with a for loop to traverse a tree. We also saw how to use the find command with the exec option to process files inside subdirectories.

Lastly, we looked at using functions as a possible way to traverse a directory and its subdirectories.
