
Last updated: March 18, 2024
The Bash history feature has many uses.
Naturally, such a comprehensive feature may also require maintenance.
In this tutorial, we look at ways to avoid and remove duplicate entries in the Bash history files. First, we explore the Bash history file format. After that, we see how to manually remove duplicates. Next, we check the history control variables, which help us prevent duplicates in the future. Finally, we explain that combining both approaches leads to the best results.
We tested the code in this tutorial on Debian 11 (Bullseye) with GNU Bash 5.1.4. It should work in most POSIX-compliant environments.
Importantly, the command history is kept in memory while Bash is running and only written out to the respective file periodically, on exit, or manually, with the -a or -w flags of history.
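As a minimal sketch of a more frequent write-out, assuming an interactive shell that reads $HOME/.bashrc, we could append new entries to the history file before every prompt. Using PROMPT_COMMAND for this is our assumption for illustration, not something Bash does by default:

```shell
# Hypothetical ~/.bashrc fragment: append the commands entered since the last
# write to $HISTFILE each time Bash is about to display the prompt.
# Any existing PROMPT_COMMAND string is preserved and runs afterwards.
PROMPT_COMMAND="history -a${PROMPT_COMMAND:+; $PROMPT_COMMAND}"
```

With this in place, the file should lag behind the in-memory list by at most one command, at the cost of a small write on every prompt.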
The default format of the $HOME/.bash_history (or $HISTFILE) file is fairly basic:
$ cat $HOME/.bash_history
echo First command.
echo Baeldung
mkdir /dir
cd /dir
cat <
In general, most commands in the history file appear as they were written at the prompt, one per row, in chronological order. However, there can be exceptions. Depending on the syntax, here-strings may break the mechanics as well.
By setting $HISTTIMEFORMAT to the appropriate strftime() format, we can add the exact timestamp of each command:
$ HISTTIMEFORMAT=''
$ history
1 HISTTIMEFORMAT=''
2 history
$ HISTTIMEFORMAT='%s'
$ history
1 1679906660HISTTIMEFORMAT=''
2 1679906661history
3 1679906667HISTTIMEFORMAT='%s'
4 1679906668history
This is reflected as comments in the $HOME/.bash_history file:
$ cat $HOME/.bash_history
#1679906660
HISTTIMEFORMAT=''
#1679906661
history
#1679906667
HISTTIMEFORMAT='%s'
#1679906668
history
Regardless of the chosen time display format, stored timestamps are always in epoch time. Notably, setting $HISTTIMEFORMAT doesn’t retroactively add the timestamp comments in the history file. Just like resetting it doesn’t remove them.
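To illustrate how the stored epoch values pair up with commands, here is a rough sketch that reads a history file and prints each timestamped command next to a readable UTC date. The function name history_with_dates is hypothetical, and the sketch assumes each command occupies a single line:

```shell
# Print each command from a history file next to a readable UTC date.
# Usage: history_with_dates FILE
history_with_dates() (
  export TZ=UTC                       # subshell body keeps this setting local
  re='^#([0-9]+)$'                    # a timestamp comment: "#" followed by digits
  ts=''
  while IFS= read -r line; do
    if [[ $line =~ $re ]]; then
      ts=${BASH_REMATCH[1]}           # remember the epoch value for the next line
    elif [[ -n $ts ]]; then
      # The %(...)T format of the printf builtin requires Bash 4.2 or later
      printf '%(%Y-%m-%d %H:%M:%S)T  %s\n' "$ts" "$line"
      ts=''
    else
      printf '%s\n' "$line"           # entry without a stored timestamp
    fi
  done < "$1"
)
```

For the sample file above, the first output line would read 2023-03-27 08:44:20 followed by the HISTTIMEFORMAT='' command.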
After getting to know the Bash history format, let’s see how to manually remove duplicates from it post-factum.
Here, we treat $HOME/.bash_history as a regular file. Because of this, it would be best to dump everything to the file before running any deduplication attempt, as the latter wouldn’t apply to commands in memory:
$ history -a
$ history -w
Of course, the standard sort command has the --unique or -u flag for removing duplicates, and uniq collapses repeated lines. However, uniq only detects adjacent duplicates, so both approaches require a sort, which could break the chronological order of our history.
Due to that, we take a refined approach involving awk to remove duplicates without sorting:
$ cat $HOME/.bash_history
test
VAR='value'
test
echo 'Last.'
$ awk '!a[$0]++' $HOME/.bash_history
test
VAR='value'
echo 'Last.'
Here, awk omits the second instance of test from its output, while the file itself stays unchanged.
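Since the one-liner is quite terse, a long-form equivalent may help show what it does. The function name dedup_keep_first and the array name seen are our own choices for illustration:

```shell
# Long-form equivalent of: awk '!a[$0]++'
# Reads lines on standard input and prints only the first copy of each.
dedup_keep_first() {
  awk '
    {
      if (!($0 in seen)) {   # first time this exact line appears
        print                # keep it, preserving the original order
      }
      seen[$0] = 1           # remember the line for later comparisons
    }
  '
}
```

The array is keyed by the whole line, so only exact matches count as duplicates. For example, a trailing space makes two otherwise identical commands different.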
If the results are acceptable, we can commit them to the original file:
$ awk '!a[$0]++' $HOME/.bash_history > $HOME/.bash_history.tmp &&
mv $HOME/.bash_history.tmp $HOME/.bash_history
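There are three main drawbacks to this solution: it has to be run manually, it only affects the file and not the history in memory, and it doesn't account for timestamp comments.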
While we can address all of them by setting up a trigger that first writes out the history and then runs an awk command modified to check for and remove timestamp comments, there is a possibly better option.
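Before moving on to that option, here is one way the timestamp-aware awk variant might look. It's only a sketch with a hypothetical function name, and it assumes that every command occupies a single line and that no command is itself a line of the form # followed by digits:

```shell
# Sketch: deduplicate a history file that contains "#<epoch>" comment lines.
# Usage: dedup_timestamped FILE
dedup_timestamped() {
  awk '
    /^#[0-9]+$/ { ts = $0; next }   # hold the timestamp until its command arrives
    !seen[$0]++ {                   # first occurrence of this command
      if (ts != "") print ts        # emit the held timestamp, if any
      print
    }
    { ts = "" }                     # reset after every command line, so the
                                    # timestamp of a duplicate is dropped with it
  ' "$1"
}
```

As with the simpler command, we can redirect the output to a temporary file and move it over the original once the result looks right.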
Part of the Bash history facility is the $HISTCONTROL settings variable.
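By setting $HISTCONTROL to a colon-separated list of values, we can control the stream of commands committed to history:
ignorespace: don't save lines that begin with a space
ignoredups: don't save a line that matches the previous history entry
ignoreboth: shorthand for ignorespace and ignoredups
erasedups: remove all previous lines matching the current line from the history list before saving it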
In essence, the last option, erasedups, is closest to our needs. It doesn't deduplicate the whole history retroactively, since it removes earlier copies only when a matching command is entered again. Still, it avoids future duplicates both in memory and in the file.
To be thorough, we can set the $HISTCONTROL variable to all of these options at shell startup, for example in $HOME/.bashrc:
export HISTCONTROL=ignoreboth:erasedups
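Because erasedups removes the earlier matching entries, the most recent copy of a command is the one that survives. A one-off file cleanup that mirrors this behavior could keep the last occurrence instead of the first. The following is a sketch under two assumptions: the history file has no timestamp comments, and tac from GNU coreutils is available. The function name is hypothetical:

```shell
# Keep only the most recent copy of each line, preserving order otherwise.
# tac reverses the line order; the usual awk filter then keeps the first copy
# in reversed order, which is the last copy in the original order.
dedup_keep_last() {
  tac -- "$1" | awk '!seen[$0]++' | tac
}
```

Applied to the earlier sample file, this version keeps the second test entry rather than the first, so VAR='value' ends up before it.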
In addition, there's the $HISTIGNORE variable, which holds a colon-separated list of patterns for commands to exclude from the history. With its globbing support, where & (ampersand) matches the previous history line and * wildcards work as usual, we can simulate ignoreboth, but not erasedups.
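As a rough illustration, a startup file fragment that simulates ignoreboth through $HISTIGNORE might look like this. Where the fragment goes, $HOME/.bashrc for instance, is our assumption:

```shell
# Hypothetical ~/.bashrc fragment.
# "&" matches the previous history line (like ignoredups) and
# "[ ]*" matches any line starting with a space (like ignorespace).
HISTIGNORE='&:[ ]*'
# erasedups has no HISTIGNORE equivalent, so it still goes through HISTCONTROL.
HISTCONTROL=erasedups
```

Further patterns can be appended to the same colon-separated list to exclude specific commands from the history.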
In this article, we looked at duplicates in the Bash history list, as well as how to deal with them.
In conclusion, there are two main ways to deal with duplicates, but one works only retroactively, while the other mainly works proactively. This way, applying the first and setting up the second ensures there are no duplicates in the Bash history.