1. Introduction

In this tutorial, we'll explore a few Bash approaches to execute the same command in multiple directories.

2. A Closer Look

First of all, let's break down the goal into smaller steps:

  1. list the contents of the current folder
  2. filter out everything that is not a real sub-folder (files and symbolic links)
  3. for each sub-folder run our command
  4. for each sub-folder, repeat from step 1 inside it

The logic looks straightforward, but step 2 is actually very important, especially if the command we execute is irreversible.

In fact, without step 2, we could end up running a potentially dangerous command on a file we didn't mean to touch, or in an unexpected location due to a symbolic link.

We are going to show how we can deal with these problems depending on the approach we choose.

3. Prepare a Test Environment

And before we dig into the actual problem solving, let's prepare our environment:

# create some folders
for folder in 1 2 3
do
    mkdir folder_$folder
done

# create a sub-directory
mkdir folder_1/sub_folder

# create an empty file
touch my_file

# create a symbolic link pointing to folder_3
ln -s folder_3 my_symbolik_link_to_3

Let's check what we have created:

tree

.
├── folder_1
│   └── sub_folder
├── folder_2
├── folder_3
├── my_file
└── my_symbolik_link_to_3 -> folder_3

5 directories, 1 file

Everything is now ready for our scripts…

4. Loops

In Bash, a loop can be written with several built-in constructs. In this section, we're going to explore them one by one.

If at any point we need to check their documentation, we should remember that these constructs are described in the main Bash man page rather than in pages of their own. To access them, we run man bash in a terminal and search the page for the relevant keyword (for, while, or until).

To implement the logic described in section 2, we're going to use two test conditions:

  • -d, which returns true if the path considered is a directory
  • -h, which returns true if the path considered is a symbolic link

We need both conditions as -d doesn't filter out symbolic links pointing to folders.
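To see why both tests are needed, here's a minimal sketch that runs in a throwaway directory. The helper name is_real_dir and the sample names real_folder and link_to_folder are made up for illustration:

```shell
# Hypothetical helper: succeeds only for a real directory, not a symlink to one
is_real_dir() {
    [ -d "$1" ] && [ ! -h "$1" ]
}

# Work in a throwaway directory so nothing else is touched
demo_dir=$(mktemp -d)
mkdir "$demo_dir/real_folder"
ln -s real_folder "$demo_dir/link_to_folder"

# -d alone accepts both paths because it follows the link
for path in "$demo_dir/real_folder" "$demo_dir/link_to_folder"; do
    if [ -d "$path" ]; then echo "-d is true for ${path##*/}"; fi
    if is_real_dir "$path"; then echo "real directory: ${path##*/}"; fi
done
```

Only real_folder passes the combined check, while -d on its own is true for both entries.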

4.1. The for Loop

The for loop is very convenient, as we can use a glob pattern (*) to retrieve the contents of the current folder and loop through each item:

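function recursive_for_loop {
    local f
    for f in *; do
        # process real directories only: skip files and symbolic links
        if [ -d "$f" ] && [ ! -h "$f" ]; then
            # skip the folder if we cannot enter it
            cd -- "$f" || continue
            echo "Doing something in folder $(pwd)"

            # use recursion to navigate the entire tree
            recursive_for_loop
            cd ..
        fi
    done
}
recursive_for_loop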

The code above applies both the filters we mentioned earlier. As a result, no file or symlink is processed by our code:

# Result
Doing something in folder /home/user/workspace/folder_1
Doing something in folder /home/user/workspace/folder_1/sub_folder
Doing something in folder /home/user/workspace/folder_2
Doing something in folder /home/user/workspace/folder_3

We can observe that the conditions defined earlier successfully filtered out both the file and the symbolic link present in our path.
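One possible refinement, shown here as a sketch rather than as the approach used in this tutorial, is to run the loop body inside a subshell. That way the working directory is restored automatically, even if cd fails or the command aborts. The function name run_in_subdirs is hypothetical, and the command to run is passed as arguments:

```shell
# Hypothetical variant: run the given command in every real sub-directory
run_in_subdirs() {
    for f in *; do
        # skip files and symbolic links
        if [ -d "$f" ] && [ ! -h "$f" ]; then
            (
                # the subshell keeps this cd from leaking into the caller
                cd -- "$f" || exit
                "$@"
                run_in_subdirs "$@"
            )
        fi
    done
}

# Example: print the path of each sub-directory
# run_in_subdirs pwd
```

Because each level runs in its own subshell, no cd .. is needed, and the caller stays in the directory it started from.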

4.2. The while Loop

In the while case, we cannot iterate over a glob directly, so we pipe the output of another command into the loop instead:

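function recursive_while_loop {
    local f
    # read the folder content line by line; -r keeps backslashes intact
    ls -1 | while IFS= read -r f; do
        # process real directories only: skip files and symbolic links
        if [ -d "$f" ] && [ ! -h "$f" ]; then
            # skip the folder if we cannot enter it
            cd -- "$f" || continue
            echo "Doing something in folder $(pwd)"

            # use recursion to navigate the entire tree
            recursive_while_loop
            cd ..
        fi
    done
}
recursive_while_loop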

# Result
Doing something in folder /home/user/workspace/folder_1
Doing something in folder /home/user/workspace/folder_1/sub_folder
Doing something in folder /home/user/workspace/folder_2
Doing something in folder /home/user/workspace/folder_3
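
One thing to keep in mind with the piped form: by default, Bash runs each part of a pipeline in a subshell, so variables changed inside the loop are gone once it ends. The sketch below illustrates this with two hypothetical counters and shows one way around it, feeding the loop from a here-document. The sample folder names are made up:

```shell
demo_dir=$(mktemp -d)
mkdir "$demo_dir/folder_a" "$demo_dir/folder_b"
cd "$demo_dir" || exit 1

# Piped loop: the counter is incremented in a subshell and then lost
piped_count=0
ls -1 | while read -r f; do
    piped_count=$((piped_count + 1))
done
echo "after the piped loop: $piped_count"

# Here-document loop: runs in the current shell, so the counter survives
heredoc_count=0
while read -r f; do
    heredoc_count=$((heredoc_count + 1))
done <<EOF
$(ls -1)
EOF
echo "after the here-document loop: $heredoc_count"
```

With Bash's default settings, the first message reports 0 and the second reports 2. This doesn't affect the recursive functions in this section, since they only print messages, but it matters as soon as the loop needs to collect results.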

4.3. The until Loop

The until construct uses the same technique to read the list of folders, but it needs a negation on the loop condition.

This is due to its opposite logic: while keeps running the loop body as long as the condition is true, whereas until keeps running it as long as the condition is false:

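function recursive_until_loop {
    local f
    # loop until read fails, that is, until there are no more lines
    ls -1 | until ! IFS= read -r f; do
        # process real directories only: skip files and symbolic links
        if [ -d "$f" ] && [ ! -h "$f" ]; then
            # skip the folder if we cannot enter it
            cd -- "$f" || continue
            echo "Doing something in folder $(pwd)"

            # use recursion to navigate the entire tree
            recursive_until_loop
            cd ..
        fi
    done
}
recursive_until_loop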

# Result
Doing something in folder /home/user/workspace/folder_1
Doing something in folder /home/user/workspace/folder_1/sub_folder
Doing something in folder /home/user/workspace/folder_2
Doing something in folder /home/user/workspace/folder_3

5. The find Command

An alternative to loops is the find command, which has the main purpose of searching for files in a directory hierarchy.

We've seen in the article Find Files That Have Been Modified Recently in Linux how find can be used to search for recently modified files.

In this instance, we're going to explore two options, -exec and -execdir, both with the same purpose of executing the specified commands on each matched file.

Although they achieve the same result, the -execdir option is deemed safer as it will run the command from inside the directory where the matched file (or in our case sub-directory) resides, thus avoiding some race conditions.

Even so, there's danger: -execdir runs the command from inside the matched file's directory, so if our PATH contains a relative entry such as ".", an executable in that directory with the same name as our command could be run instead of the one we intended. GNU find refuses to run -execdir when it detects such a PATH.

But for our simple case scenario, it doesn't make any difference.

To prove this last point, let's run find with the -exec option:

find ./* -type d -exec touch {}/test \;

# Result
tree
.
├── folder_1
│   ├── sub_folder
│   │   └── test
│   └── test
├── folder_2
│   └── test
├── folder_3
│   └── test
├── my_file
└── my_symbolik_link_to_3 -> folder_3

The command successfully generated “test” files in each sub-directory.

Before proceeding to the next step, let's remove the test files we just created.

We can use the same script already introduced in section 2.4 of the article Linux Commands – Delete Files Older Than X:

find . -type f -name test -exec rm -i {} \;

# Result
tree
.
├── folder_1
│   └── sub_folder
├── folder_2
├── folder_3
├── my_file
└── my_symbolik_link_to_3 -> folder_3

And now, let's try find with the -execdir option:

find ./* -type d -execdir touch {}/test \;

# Result
tree
.
├── folder_1
│   ├── sub_folder
│   │   └── test
│   └── test
├── folder_2
│   └── test
├── folder_3
│   └── test
├── my_file
└── my_symbolik_link_to_3 -> folder_3

So, we proved that both options have the same outcome for our case scenario.
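
The outcome is the same, but the working directory of the command is not. As a small sketch in a throwaway directory (the variable names are made up for illustration), we can ask each option to run pwd for the same match:

```shell
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/folder_1/sub_folder"
cd "$demo_dir" || exit 1

# -exec: the command runs from the directory where find was started
exec_dir=$(find . -type d -name sub_folder -exec pwd \;)

# -execdir: the command runs from the directory containing the match
execdir_dir=$(find . -type d -name sub_folder -execdir pwd \;)

echo "-exec ran in:    $exec_dir"
echo "-execdir ran in: $execdir_dir"
```

The first path is the temporary directory itself, while the second ends in folder_1, the parent of the matched sub_folder.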

Comparing find with loops, we can observe there's no need to use filtering conditions: the option -type d filters out anything that is not a directory for us and, by default, the command doesn't follow symbolic links.

If we want to run multiple commands we just need to repeat the same option multiple times:

find ./* -type d -execdir echo Doing something in folder {} \; -execdir echo Done something in {} \;

# Result
Doing something in folder ./folder_1
Done something in ./folder_1
Doing something in folder ./sub_folder
Done something in ./sub_folder
Doing something in folder ./folder_2
Done something in ./folder_2
Doing something in folder ./folder_3
Done something in ./folder_3
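
When the commands need to run inside each directory, or need shell features such as || and &&, another option is to hand them to a small inline shell. This is a sketch rather than the method used above, and the echoed messages are placeholders:

```shell
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/folder_1/sub_folder" "$demo_dir/folder_2"
cd "$demo_dir" || exit 1

# {} is passed as $1 instead of being embedded in the script text,
# which keeps unusual directory names from being interpreted as code
result=$(find ./* -type d -exec sh -c '
    cd "$1" || exit
    echo "Doing something in folder $1"
    echo "Done something in $1"
' sh {} \;)
echo "$result"
```

Each directory gets its own sh process, so the cd inside the script never affects the shell that launched find.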

6. The xargs Command

The xargs command builds and executes command lines using the standard input.

We can then pipe the output of find into xargs and execute whatever command we want on each directory found:

find ./* -type d | xargs -I {} echo Doing something in folder {}

# Result
Doing something in folder ./folder_1
Doing something in folder ./folder_1/sub_folder
Doing something in folder ./folder_2
Doing something in folder ./folder_3
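
If directory names may contain spaces, quotes, or other unusual characters, a more defensive variant separates the names with NUL bytes. The sketch below assumes a find and xargs that support -print0 and -0 (GNU and BSD versions do), and uses made-up folder names:

```shell
demo_dir=$(mktemp -d)
mkdir "$demo_dir/folder 1" "$demo_dir/folder_2"
cd "$demo_dir" || exit 1

# -print0 and -0 delimit names with a NUL byte instead of whitespace
result=$(find ./* -type d -print0 | xargs -0 -I {} echo "Doing something in folder {}")
echo "$result"
```

The folder named "folder 1" reaches echo as a single argument, space included.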

7. Control the Search Depth

All the cases above assume that we need to traverse the entire directory tree. But what if we want to limit the depth at which the search starts or stops?

For this purpose, find also features two useful options: -mindepth and -maxdepth.

To apply them, we just need to set the depth level we're interested in. Depth 0 corresponds to the starting points passed to find, which in our case are the entries matched by ./* rather than the current directory itself.

Let's try to replicate the same behavior as before:

find ./* -mindepth 0 -maxdepth 1 -type d -exec echo Doing something in folder {} \;

# Result
Doing something in folder ./folder_1
Doing something in folder ./folder_1/sub_folder
Doing something in folder ./folder_2
Doing something in folder ./folder_3

Now let's change the maxdepth to search only the first level sub-folders:

find ./* -mindepth 0 -maxdepth 0 -type d -exec echo Doing something in folder {} \;

# Result
Doing something in folder ./folder_1
Doing something in folder ./folder_2
Doing something in folder ./folder_3

We observe that the command has not been executed on sub_folder.

And now, let's try to execute our command only in the 2nd level of the tree, by changing mindepth instead:

find ./* -mindepth 1 -maxdepth 1 -type d -exec echo Doing something in folder {} \;

# Result
Doing something in folder ./folder_1/sub_folder

As we've seen, these options give us more control over the find search, but, as a drawback, we need to know exactly which part of the tree structure we're interested in processing.
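
Depth is always counted from the starting points given on the command line. As a small sketch with made-up folder names, starting from . instead of ./* shifts every level by one, so the first-level sub-folders sit at depth 1:

```shell
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/folder_1/sub_folder" "$demo_dir/folder_2"
cd "$demo_dir" || exit 1

# Starting from ".", depth 0 is "." itself and depth 1 is its children
first_level=$(find . -mindepth 1 -maxdepth 1 -type d | sort)
echo "$first_level"

# Starting from "./*", the children themselves are the depth-0 entries
same_level=$(find ./* -maxdepth 0 -type d | sort)
echo "$same_level"
```

Both commands list the same two folders here. One difference to be aware of is that starting from . also picks up hidden directories, which the ./* glob skips by default.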

8. Conclusion

In this tutorial, we explored different approaches to execute a Bash command in all sub-directories found in our local path's tree, testing Bash built-ins and basic Linux tools.
