1. Overview

When we want to rename a file in Linux, we would normally use the mv command. However, the mv command cannot help us to rename files in batches.

In this tutorial, we’re going to look at some batch renaming use cases, and how to solve them with a few different methods.

2. Renaming Tools

We’ve chosen three methods that can help us with renaming:

  • the rename command
  • the perl-rename tool
  • the awk | sh approach

Let’s first look at how to install the tools, or whether they’re likely to be available on our distro already.

2.1. The rename Command

The rename command is from the util-linux package. It replaces only the first occurrence of some text in a filename.

Since util-linux is a standard package distributed by kernel.org, the rename command is available on most Linux distributions, though the command name may differ (for example, rename.ul on some Debian-based systems).

In this article, we’ll refer to it as rename.

2.2. The perl-rename Command

The perl-rename command is not available by default in modern Linux distributions.

In Ubuntu-family distros, we can install perl-rename with apt-get:

root# apt-get install rename

To install perl-rename on RedHat-family distributions, we need to install the prename package:

root# yum install prename

After the installation, the perl-rename command is installed at /usr/bin/prename.

In Archlinux-derived distributions we can install perl-rename with pacman:

root# pacman -Syu perl-rename

As it supports Perl regular expressions, and even arbitrary Perl code, the perl-rename command is more powerful than the default rename command and is widely used.

In this article, we’ll refer to it as prename.

2.3. awk | sh Approach

awk itself is not a file renaming tool. However, with its powerful text processing functionalities, awk can be used to process filenames and turn them into “mv oldName newName” commands.

We can pipe those “mv” commands to a shell to run them.

Since awk is pre-installed on most modern Linux distributions, we can use it to rename files in batches in cases when we don’t have root privileges to install software, such as the prename utility. And, in some scenarios, awk can do renaming jobs that prename can’t do.

We’ll use GNU awk in our renaming scenario examples.
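As a minimal sketch of the idea, assuming two made-up filenames (a.txt and b.txt) fed in with printf rather than real files, awk rewrites each name and prints an mv command for it. Nothing is renamed until the generated text is piped to sh:

```shell
set -eu

# Feed two sample names to awk; awk keeps the old name, rewrites $0,
# and prints one mv command per input line.
cmds=$(printf '%s\n' 'a.txt' 'b.txt' |
    awk '{ old = $0; sub(/[.]txt$/, ".log"); printf "mv \"%s\" \"%s\"\n", old, $0 }')

# Inspect the generated commands; nothing has been executed yet.
printf '%s\n' "$cmds"

# To actually run them, the same text would be piped to a shell:
# printf '%s\n' "$cmds" | sh
```

Keeping generation and execution separate is what makes the approach easy to check before committing to it.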

3. Change File Extensions

The first scenario is a common one. For example, we have many *.txt files:

$ ls
file1.txt file2.txt file3.txt file4.txt file5.txt

Let’s say we want to rename all *.txt files to *.log.

3.1. rename

This job is easy for the rename command. However, before we see how to use it, let’s first learn two important options of the rename command:

  • -n: dry-run, do not make any change
  • -v: verbose, show which files were renamed

If we combine the two options, -nv, the rename command will only show what changes would be made without really applying the changes.

This is very helpful for us to check before our changes are made:

$ rename -nv txt .log *.txt
`file1.txt' -> `file1..log'
`file2.txt' -> `file2..log'
`file3.txt' -> `file3..log'
`file4.txt' -> `file4..log'
`file5.txt' -> `file5..log'

In this example, when we typed the command line, we had an extra dot in front of the replacement “log” by mistake.

The rename command with “-nv” options shows the mistake clearly and gives us a chance to correct the command.

It is recommended to always do a dry-run to make sure that the changes are correct. This is because there is no “undo” or “restore” option for a bulk renaming operation.

Now, let’s use the rename command to rename our .txt files to .log:

$ rename .txt .log *.txt 
$ ls 
file1.log file2.log file3.log file4.log file5.log

The rename command is pretty straightforward. It looks for the first occurrence of .txt in each filename, and replaces it with .log.

Alternatively, we can use rename with the find command to target specific files:

$ ls
log1-backup.txt  log1.txt  log2-backup.txt  log2.txt  log3.txt  log4.txt
$ find . -iname "log*-backup.txt" -exec rename .txt .xml '{}' \;
$ ls
log1-backup.xml  log1.txt  log2-backup.xml  log2.txt  log3.txt  log4.txt

The -exec argument tells find to execute rename for every matching file found. In our case, only files whose names match log*-backup.txt are targeted, and -iname makes that match case-insensitive.

Please bear in mind that the “.” after the find command denotes the current directory.
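Where the rename command isn't installed, the same -exec mechanism can drive a small inline shell script instead. The following is only a sketch under that assumption: it works in a temporary directory with sample files, and uses the shell's ${1%.txt} expansion in place of rename:

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"
touch log1-backup.txt log1.txt log2-backup.txt log2.txt

# For each match, find runs a tiny inline script: $1 is the found path,
# and ${1%.txt} strips the trailing .txt before .xml is appended.
find . -name 'log*-backup.txt' -exec sh -c 'mv -- "$1" "${1%.txt}.xml"' _ {} \;

ls -1
```

The underscore fills the $0 slot of the inline script, so that the path substituted for {} arrives as $1.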

3.2. prename

The prename command renames files based on Perl’s search and replace expression. It also supports the -nv (dry-run and verbose) options.

Let’s see how the prename command can be used to rename the txt files:

$ prename 's/[.]txt$/.log/' *.txt
$ ls 
file1.log file2.log file3.log file4.log file5.log

In this example, we were able to use the regular expression marker $ to signify that the match had to be at the end of the filename.
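To see why the anchor matters, here is a small sketch that uses awk's sub function (which understands the same $ anchor) on a made-up name, txt-notes.txt, where the text "txt" also appears at the start:

```shell
set -eu
name='txt-notes.txt'

# Without an anchor, the first "txt" in the name is replaced.
unanchored=$(printf '%s\n' "$name" | awk '{ sub(/txt/, "log"); print }')

# With [.]txt$ only the extension at the very end can match.
anchored=$(printf '%s\n' "$name" | awk '{ sub(/[.]txt$/, ".log"); print }')

printf 'unanchored: %s\nanchored:   %s\n' "$unanchored" "$anchored"
```

The unanchored form turns the name into log-notes.txt, while the anchored form yields txt-notes.log.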

3.3. awk | sh

awk is a powerful text-processing utility. We can pipe awk-generated mv commands to the shell to do bulk renaming:

awk '...' | sh

There is no “dry-run” option for awk. However, if we remove the “| sh” part, awk will print all generated mv commands to stdout without executing them. This can be used in place of the -nv options of rename and prename.

We can use the find command to pipe the filenames to awk as input:

$ find . -name "*.txt" | awk -v mvCmd='mv "%s" "%s"\n' \
    '{ old = $0;
       sub(/[.]txt$/,".log");
       printf mvCmd,old,$0;
     }'
mv "./file5.txt" "./file5.log"
mv "./file4.txt" "./file4.log"
mv "./file3.txt" "./file3.log"
mv "./file2.txt" "./file2.log"
mv "./file1.txt" "./file1.log"

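We should note that, when we write shell scripts, we shouldn’t parse the output of ls. This is because a filename can contain spaces, tabs, or even line breaks, and the output of ls doesn’t delimit the names unambiguously. The line-based find | awk pipeline copes with spaces and tabs, but it still assumes that no filename contains a line break.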

We need to use the find command to provide awk with the filenames. Though awk can take a filename expression as an input, it uses that expression to read the contents of the files, where we want to do text processing on the names themselves.

If we append | sh to the command, all *.txt files will be renamed to *.log:

$ find . -name "*.txt" | awk -v mvCmd='mv "%s" "%s"\n' \
    '{ old=$0;
       sub(/[.]txt$/,".log");
       printf mvCmd,old,$0;
     }' | sh
 
$ ls 
file1.log file2.log file3.log file4.log file5.log
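The mv "%s" "%s" template used above copes with spaces, but a name containing a double quote, a dollar sign, or a backtick would still be reinterpreted by sh. One possible hardening, sketched here with a hypothetical helper called shquote and made-up filenames in a temporary directory, is to wrap each name in single quotes instead:

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"
touch 'my report.txt' 'cost $5.txt' "it's.txt"

find . -name '*.txt' | awk -v q="'" '
    # shquote (hypothetical helper): wrap s in single quotes, rewriting each
    # embedded single quote as the sequence quote, backslash, quote, quote.
    function shquote(s) { gsub(q, q "\\\\" q q, s); return q s q }
    { old = $0
      sub(/[.]txt$/, ".log")
      printf "mv -- %s %s\n", shquote(old), shquote($0)
    }' | sh

ls -1
```

Names containing line breaks remain out of reach for this sketch, because awk still reads one name per line.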

4. Replace a String by Another in Filenames

We face this kind of renaming problem often. For example, here are some *.txt files:

$ ls -1
image1.txt
image2.txt
image3.txt
image4KeepMe.txt
image5KeepMe.txt

Let’s say we want to replace the text “image” with “picture” only in the first three files, leaving the last two files unchanged.

4.1. rename

If we use the rename command, we’ll have a problem:

$ rename -nv image picture *.txt     
`image1.txt' -> `picture1.txt'
`image2.txt' -> `picture2.txt'
`image3.txt' -> `picture3.txt'
`image4KeepMe.txt' -> `picture4KeepMe.txt'
`image5KeepMe.txt' -> `picture5KeepMe.txt'

The rename command doesn’t support regex pattern matching. Therefore, it cannot by itself exclude the last two files, which have “KeepMe” in their names.

To solve this problem, we can use a globbing trick:

$ rename -nv image picture image?.txt
`image1.txt' -> `picture1.txt'
`image2.txt' -> `picture2.txt'
`image3.txt' -> `picture3.txt'

In this case, the glob expression helped us to narrow down which files got renamed.

Similarly, we can use find and rename together to achieve the same objective.

For instance, let’s replace “backup” with “ignore“:

$ ls *backup*
log1-backup.xml  log2-backup.xml
$ find . -iname "*backup*" -exec rename backup ignore '{}' \;
$ ls
log1-ignore.xml  log1.txt  log2-ignore.xml  log2.txt  log3.txt  log4.txt

4.2. prename

The prename command provides Perl’s more powerful regular expressions. Therefore, excluding the last two files is not a challenge for prename at all.

A simple negative lookahead will solve the problem:

$ prename 's/image(?!.*KeepMe)/picture/' *.txt 
$ ls -1
image4KeepMe.txt
image5KeepMe.txt
picture1.txt
picture2.txt
picture3.txt

4.3. awk | sh

Regex matching is a fundamental part of awk programming. So, awk can easily exclude the last two “KeepMe” files as well:

$ find . -name "*.txt" | awk -v mvCmd='mv "%s" "%s"\n' \
    '!/KeepMe/ {
       old=$0;
       sub(/image/,"picture");
       printf mvCmd,old,$0;
     }'| sh
 
$ ls -1
image4KeepMe.txt
image5KeepMe.txt
picture1.txt
picture2.txt
picture3.txt

5. Replace All Occurrences of a String with Another in Filenames

Previously, we were trying to replace a single occurrence of a substring in a filename. Now, let’s look at what happens when we want to replace all occurrences.

Let’s start with these files:

$ ls -1
igm1.igm
igm2_igm3.igm.zip
igm4_igm5_igm6.igm.zip

Let’s change all occurrences of “igm” to “img” in all filenames. We should note that the number of occurrences of the string “igm” varies across the filenames.

5.1. rename

We’ve already learned that the rename command replaces only the first occurrence of a string. So, the rename command cannot handle this scenario.

5.2. prename

Perl’s substitution expression supports a “g” (global) modifier. This allows the matching operator to match all occurrences of a pattern:

$ prename 's/igm/img/g' *
$ ls -1
img1.img
img2_img3.img.zip
img4_img5_img6.img.zip

5.3. awk | sh

We have used awk’s sub function to replace the first occurrence of a string.

awk has another function, gsub, which does the same as Perl’s “g” modifier:

$ find . -type f | awk -v mvCmd='mv "%s" "%s"\n' \
    '{ old=$0;
       gsub(/igm/,"img");
       printf mvCmd,old,$0;
     }' | sh
 
$ ls -1
img1.img
img2_img3.img.zip
img4_img5_img6.img.zip
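One caveat with this command is that find descends into subdirectories, and gsub acts on the whole path, so a directory whose name contains “igm” would be rewritten in the target path too. The sketch below, which assumes a made-up igm_dir directory inside a temporary working directory, limits the substitution to the last path component:

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"
mkdir igm_dir
touch igm1.igm igm_dir/igm2_igm3.igm

find . -type f | awk -v mvCmd='mv "%s" "%s"\n' '
    { old = $0
      dir = $0
      sub("[^/]*$", "", dir)              # keep everything up to the last slash
      base = substr($0, length(dir) + 1)  # the file name itself
      gsub(/igm/, "img", base)            # substitute in the name only
      if (old != dir base) printf mvCmd, old, dir base
    }' | sh

find . -type f
```

The directory keeps its original name, and only the files inside it are renamed.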

6. Format Numbers in Filenames

In this scenario, let’s format the numbers in filenames. For example, we have three *.txt files under a directory:

$ ls -1
afile-1.txt
bfile-10.txt
cfile-123.txt

To make the list easier to read, we want to format the numbers to add leading zeros:

  • afile-1.txt becomes afile-001.txt
  • bfile-10.txt becomes bfile-010.txt
  • cfile-123.txt will keep its name

6.1. rename

Because the numbers are dynamic, the rename command cannot do the renaming job in one command.

6.2. prename

With the “e” (eval) modifier, Perl evaluates the replacement part of a substitution as code, so we can call the sprintf function there:

$ prename 's/\d+/sprintf("%03d","$&")/e' *.txt
$ ls -1
afile-001.txt
bfile-010.txt
cfile-123.txt

6.3. awk | sh

awk has the sprintf function as well.

However, we cannot use this in the same way as we did with prename. Instead, we have to format the number with sprintf before calling the sub function:

$ find . -name "*.txt" | awk -F'[.-]' -v mvCmd='mv -n "%s" "%s"\n' \
    '{ num=sprintf("%03d", $(NF-1));
       old=$0;
       sub(/[0-9]+/,num);
       printf mvCmd,old,$0;
     }' | sh
 
$ ls -1
afile-001.txt
bfile-010.txt
cfile-123.txt

Here we use -F'[.-]' to split the filename by “.” or “-”. In this case, the second-to-last field $(NF-1) is the number we want to format.
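To make the field splitting visible, this short sketch prints every field awk sees for one sample path, in the form find would emit it:

```shell
set -eu

# Split one path on "." or "-" and print each field with its index.
fields=$(printf '%s\n' './afile-1.txt' |
    awk -F'[.-]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
```

The leading “.” of the path produces an empty first field, so the path splits into four fields and the number is field 3, which is $(NF-1).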

7. Change the Character Case in Filenames

In this scenario, let’s convert all uppercase characters in a filename into lowercase:

$ ls -1
INSTRUCTION.TxT
Query.SQL
ReadMe.MD

7.1. rename

This is another dynamic renaming scenario, so it’s not a job for the rename command.

7.2. prename

Perl has an lc (lowercase) function, to convert the input string to lowercase. The lc function together with the “e” modifier can solve this problem:

$ prename 's/.*/lc("$&")/e' *
$ ls -1 
instruction.txt
query.sql
readme.md

Besides the lc function, we can also use Perl’s transliteration operator y to convert all uppercase characters to lowercase:

$ prename 'y/A-Z/a-z/' *
$ ls -1 
instruction.txt
query.sql
readme.md

7.3. awk | sh

awk has two functions particularly for case conversion: tolower and toupper. Let’s use tolower here:

$ find . -type f | awk -v mvCmd='mv "%s" "%s"\n' \
    '{ new=tolower($0);
       printf mvCmd,$0,new;
     }' | sh
 
$ ls -1
instruction.txt
query.sql
readme.md

8. Swap Strings in Filenames

Let’s say we have many system log files under a directory. Each filename contains a date string in the format of DD-MM-YYYY:

$ ls -1
08-08-1992_system.log
18-11-1976_system.log
29-11-2019_system.log

We’re going to rename the files by converting the DD-MM-YYYY format into the ISO date format: YYYY-MM-DD.

8.1. rename

This is another dynamic renaming scenario, so it’s not a job for the rename command.

8.2. prename

Here, we can make use of the backreferences and the capturing groups to achieve our goal:

$ prename 's/(\d\d)-(\d\d)-(\d{4})/\3-\2-\1/' *.log
$ ls -1
1976-11-18_system.log
1992-08-08_system.log
2019-11-29_system.log

The three capturing groups, defined by () in the regular expression, are referenced by their numbers \1, \2, and \3 in the substitution expression.

8.3. awk | sh

GNU awk’s nice gensub function allows us to handle backreferences as well:

$ find . -name "*.log" | awk -v mvCmd='mv "%s" "%s"\n' \
    '{ new=gensub(/([0-9]{2})-([0-9]{2})-([0-9]{4})/, "\\3-\\2-\\1", "g");
       printf mvCmd,$0,new;
     }' | sh
 
$ ls -1
1976-11-18_system.log
1992-08-08_system.log
2019-11-29_system.log
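Since gensub is a GNU extension, it may be missing where another awk is installed. As a hedged alternative, the sketch below relies only on the standard match and substr functions, and spells out each digit class so that it doesn't depend on interval expressions. It runs against sample files in a temporary directory:

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"
touch 08-08-1992_system.log 18-11-1976_system.log 29-11-2019_system.log

find . -name '*.log' | awk -v mvCmd='mv "%s" "%s"\n' '
    match($0, /[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]/) {
        d = substr($0, RSTART, 2)          # DD
        m = substr($0, RSTART + 3, 2)      # MM
        y = substr($0, RSTART + 6, 4)      # YYYY
        new = substr($0, 1, RSTART - 1) y "-" m "-" d substr($0, RSTART + RLENGTH)
        printf mvCmd, $0, new
    }' | sh

ls -1
```

Because the date has a fixed width, fixed offsets from RSTART are enough to pick out the three parts.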

9. Convert a Unix Timestamp into ISO Date Format

In this scenario, we’ll look at a deeper date format conversion.

Let’s have a look at the files:

$ ls -1
app_1575212161.log
app_217189800.log
app_713302200.log

Each filename contains a Unix timestamp. We’d like to convert the Unix timestamp into the ISO date format.

9.1. rename

Still, the rename command cannot do this job for us.

9.2. prename

We’ve learned that we can evaluate Perl expressions in Perl’s search and replace expressions with the “e” modifier. We’ll use this trick again to solve this problem:

$ prename 'use POSIX qw(strftime);s/\d+/strftime "%FT%H:%M:%S", localtime($&)/e' *.log
$ ls -1
app_1976-11-18T19:30:00.log
app_1992-08-08T21:30:00.log
app_2019-12-01T15:56:01.log

The localtime function converts the Unix timestamp into a list of broken-down time fields in the local time zone. Then, Perl’s strftime function from the POSIX module formats those fields as an ISO date string.

9.3. awk | sh

GNU awk has the strftime function, too. It can help us to get different date formats from a Unix timestamp:

$ find . -name "*.log" | awk -F'[_.]' -v mvCmd='mv "%s" "%s"\n' \
    '{ old=$0;
       sub(/[0-9]+/,strftime("%FT%T", $(NF-1)));
       printf mvCmd, old, $0;
     }' | sh
 
$ ls -1
app_1976-11-18T19:30:00.log
app_1992-08-08T21:30:00.log
app_2019-12-01T15:56:01.log

In addition to using GNU awk’s strftime function, an alternative solution with awk is also worth trying.

awk can invoke an external command and get the output for further processing. We can use the getline expression to get the output of a command and assign it to a variable:

External_Command | getline variable

Let’s use the date command to convert the Unix timestamp to ISO format, then replace the Unix timestamp with its output:

$ find . -name "*.log"|awk -F'[_.]' -v mvCmd='mv "%s" "%s"\n' \
    '{ old=$0;
       cmd="date +%FT%T -d @"$(NF-1);
       cmd|getline isoFmt;
       close(cmd);
       gsub(/[0-9]+/,isoFmt);
       printf mvCmd, old, $0;
     }' | sh
 
$ ls -1
app_1976-11-18T19:30:00.log
app_1992-08-08T21:30:00.log
app_2019-12-01T15:56:01.log
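The timestamps shown in these listings depend on the time zone of the machine running the command. For reproducible names, one option is to pin the zone when calling date. This sketch assumes GNU date (for the -d @ syntax) and works on sample files in a temporary directory, with TZ=UTC set only for the date call:

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"
touch app_1575212161.log app_217189800.log

find . -name '*.log' | awk -F'[_.]' -v mvCmd='mv "%s" "%s"\n' '
    { old = $0
      cmd = "TZ=UTC date +%FT%T -d @" $(NF-1)   # force UTC for this call only
      cmd | getline isoFmt
      close(cmd)                                # release the pipe before the next file
      sub(/[0-9]+/, isoFmt)
      printf mvCmd, old, $0
    }' | sh

ls -1
```

With UTC pinned, 1575212161 becomes 2019-12-01T14:56:01 on every machine.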

With the ability to cooperate with external commands, awk becomes more powerful. It can handle even more complicated renaming jobs, for example:

  • add the md5 hash of the file in the filename (the md5sum command)
  • add some markers in the filename if the file content contains some pattern (the grep command)
  • translate a domain name in the filename to the IP address (the nslookup or host command)

10. Format Number Dynamically in Filenames

We saw a number-formatting scenario earlier. We learned that we can add leading zeros by passing a predefined fixed format to the sprintf function, for example, sprintf("%03d","$&").

However, we may not want to manually examine the files to determine the number of leading zeroes, especially if we have a large number of files.

Instead, the number format can be calculated automatically during renaming, based on the files being processed.

Let’s add a couple of new files to the example we used before:

$ ls -1
afile-1.txt
bfile-10.txt
cfile-123.txt
dfile-4711.txt
efile-20191201.txt

Neither rename nor a single prename substitution can easily detect the correct padding required to make all these numbers the same length, because each filename is processed on its own.

10.1. awk | sh

awk gets a full list of filenames from the find command, so awk can solve our problem:

$ find . -name "*.txt" | awk -F'[-.]' -v mvCmd='mv -n "%s" "%s"\n' \
    '{ num = $(NF-1);
       files[$0] = num;
       max = num > max? num : max;
     }
     END { width = length(max);
           for(name in files) {
               old = new = name;
               formattedNum = sprintf("%0*d", width, files[name]);
               sub(files[name],formattedNum,new);
               printf mvCmd,old,new;
           }
     }' | sh
 
$ ls -1
afile-00000001.txt
bfile-00000010.txt
cfile-00000123.txt
dfile-00004711.txt
efile-20191201.txt

Let’s go through the awk script line by line to see what is happening there:

  1. pipe the find result to awk, then define the field separator and the mv command template
  2. for each filename, extract the number, which is the second-to-last field, and save in the num variable
  3. build a hashtable named files — the key is the full filename ($0), the value is the num variable
  4. find the largest num from all filenames and save it in the max variable
  5. awk has gone through all the filenames and saved all required data in files and max variables
  6. calculate the width of the largest number (max)
  7. for each element in the hashtable files, awk generates the mv command
  8. make a copy of the name for renaming
  9. format the number in the current name
  10. replace the original number in the filename (new) with the formattedNum
  11. generate the mv command
  12. end the loop
  13. finally, pipe the generated mv commands to sh 
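The width calculation in steps 6 and 9 can be tried in isolation. The sketch below feeds a few sample numbers to awk and pads them to the width of the largest one. It builds the format string by concatenation instead of using "%0*d", on the assumption that the * width specifier may not be supported by every awk implementation:

```shell
set -eu

padded=$(printf '%s\n' 1 10 123 4711 | awk '
    { nums[NR] = $1; if ($1 + 0 > max) max = $1 + 0 }   # remember values, track the largest
    END { fmt = "%0" length(max) "d\n"                  # "%04d\n" for this sample
          for (i = 1; i <= NR; i++) printf fmt, nums[i] }')

printf '%s\n' "$padded"
```

The largest sample value has four digits, so every number is padded to four characters.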

10.2. awk | sh as a General Purpose Renaming Solution

awk solutions to easy renaming problems may not look as compact as other renaming tools, such as prename.

However, awk is very useful and worth learning. This is because its powerful scripting language can solve almost any kind of bulk renaming problem.

And awk can do much more than renaming files.

11. Conclusion

In this article, we talked about some common renaming tools and how to get them.

We then reviewed some common renaming scenarios and discussed how to solve them with each tool, where possible. We saw that rename was the least flexible and that prename is able to use Perl expressions to solve all but the most complex problems.

Finally, we saw that, as a scripted method, awk with sh is the most powerful solution, and is available in most situations.
