Last updated: March 18, 2024
When we work under the Linux command line, we often need to manipulate text files.
Removing lines from text files is a common operation: for example, removing the first line of a file, removing the lines that appear in file A from another file B, or removing the last N lines from a file.
In this tutorial, we’ll have a look at how to delete lines from a given line number until the end of the file.
Although the problem isn’t difficult to understand, let’s see an example to get it straight in our heads.
Let’s say we have an input file called input.txt:
$ nl input.txt
1 I am the 1st line.
2 I am the 2nd line.
3 I am the 3rd line.
4 I am the 4th line.
5 I am the 5th line.
6 I am the 6th line.
7 I am the 7th line.
8 I am the 8th line.
As the output above shows, we’ve used the nl command to print the file’s content with line numbers.
Now, let’s say our goal is to remove all lines from line five till the end of the file.
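The problem has two variants, depending on whether the given line itself is removed: an inclusive deletion removes line five as well, while an exclusive deletion keeps line five and removes only the lines after it.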
We’ll discuss both scenarios in this tutorial.
There are various ways to do that in the Linux command line. In this tutorial, we’ll explore four approaches: pure Bash, the head command, sed, and awk.
Next, let’s see them in action.
Bash is the default shell for most modern Linux distributions. So, if we solve a problem with pure Bash, our solution doesn’t rely on any extra dependencies.
Next, let’s see how to solve the problem using simple Bash scripts.
First, let’s see a shell script rmLines_v1.sh to remove lines excluding the given line. That is, the given line will remain in our result:
$ cat rmLines_v1.sh
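#!/bin/bash
FILE="$1"
LINE_NO=$2
i=1
# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r line; do
    echo "$line"
    # stop right after printing the given line
    if [ "$i" -eq "$LINE_NO" ]; then
        break
    fi
    i=$(( i + 1 ))
done <"$FILE"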
The shell script looks pretty simple. It accepts two arguments: the filename and a line number.
The main part of the script is a while loop that goes through and outputs the lines until the given line number. We declare a counter variable $i and increment the counter within the loop so that we know when we should stop printing.
Let’s execute the script with our example input file:
$ ./rmLines_v1.sh input.txt 5
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
I am the 5th line.
In this test, we’ve passed a “5” as the second argument, and we see line five is in the output. So, it works as we expected.
Note that although the script prints the desired output, it won’t change the original input.txt file. If we want to write the change back to the input file, shell’s redirection can help us:
$ ./rmLines_v1.sh input.txt 5 > tmp.result && mv tmp.result input.txt
$ cat input.txt
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
I am the 5th line.
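A fixed temporary name such as tmp.result can clash with an existing file. As a minimal sketch, assuming the mktemp utility is available (it is on typical Linux systems), we could wrap the redirection in a small helper. The function name rewrite_in_place is hypothetical and not part of the original scripts:

```shell
#!/bin/sh
# Hypothetical helper: run a command, capture its output in a unique
# temporary file, and copy it over TARGET only if the command succeeded.
rewrite_in_place() {
    target=$1
    shift
    tmp=$(mktemp) || return 1
    if "$@" > "$tmp"; then
        # cat (rather than mv) keeps the permissions of the target file
        cat "$tmp" > "$target"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Usage, assuming rmLines_v1.sh is in the current directory:
#   rewrite_in_place input.txt ./rmLines_v1.sh input.txt 5
```

If the command fails, the helper leaves the target file untouched.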
Next, let’s modify the script rmLines_v1.sh to remove lines, including the given line.
The requirement is easy to understand. For instance, if we get line number 5 as an argument, line five should be removed as well.
This is not a challenge to us. We can modify the rmLines_v1.sh script to solve the problem. There are two ways to make it work:
We can create two scripts to handle the including and the excluding scenarios separately. However, if new requirements come in, we’d have to maintain two scripts.
It would be nice if we could create one script working for both cases.
Next, let’s see how to achieve the goal.
First, let’s take a look at the second version of the script:
$ cat rmLines_v2.sh
#!/bin/bash
err_usage(){
    echo "The Arguments are not accepted!"
    echo "Usage: $0 <-i or -e> <FILENAME> <FROM_LINE_NUMBER>"
    echo "-i : Remove lines including the given line."
    echo "-e : Remove lines excluding the given line."
    exit 1
}
if [ $# -ne 3 ]; then
    err_usage
fi
FILE="$2"
LINE_NO=$3
case "$1" in
    -i)
        # inclusive deletion: the last line to keep is the one before LINE_NO
        LINE_NO=$(( LINE_NO - 1 ))
        ;;
    -e)
        ;;
    *)
        err_usage
        ;;
esac
i=1
# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r line; do
    echo "$line"
    if [ "$i" -eq "$LINE_NO" ]; then
        break
    fi
    i=$(( i + 1 ))
done <"$FILE"
As the script above shows, we’ve introduced a new argument to the script: -i or -e to specify the line removal semantics, either including or excluding the given line, respectively.
Apart from that, we’ve added a new if check and a new case block to the script. The newly added if block verifies whether the user has passed three arguments to the script.
The case statement checks if the first option is “-i” or “-e“. Further, it subtracts one from the $LINE_NO variable if the user passes the -i option.
The argument parsing we’ve shown in the script is just an example. We discuss more advanced argument parsing in greater detail in “How to use command-line arguments in a Bash script“.
Finally, let’s test version 2 of the script, with the -i and -e option, to see if it solves our problem:
$ ./rmLines_v2.sh -i input.txt 5
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
$ ./rmLines_v2.sh -e input.txt 5
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
I am the 5th line.
Great! As the output shows, we’ve achieved coverage of both scenarios using a single script. Therefore, we’ve solved the problem using pure Bash.
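One edge case is worth checking. If we pass -i together with line number 1, $LINE_NO becomes 0. Because the loop prints a line before comparing the counter, the counter never equals 0, and every line is printed. A possible variant, sketched here as a hypothetical function print_up_to, compares first and prints afterward:

```shell
#!/bin/sh
# Hypothetical variant of the loop: check the counter *before* printing,
# so a limit of 0 prints nothing at all.
print_up_to() {
    file=$1
    limit=$2    # number of leading lines to keep
    i=1
    while IFS= read -r line; do
        if [ "$i" -gt "$limit" ]; then
            break
        fi
        printf '%s\n' "$line"
        i=$(( i + 1 ))
    done < "$file"
}

# Usage: print_up_to input.txt 0    (prints nothing)
#        print_up_to input.txt 4    (prints lines 1 to 4)
```

The trade-off is that this version reads one line beyond the limit before it stops.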
Even though the pure Bash solution doesn’t require any external packages, we have to implement each step, such as the loop, by ourselves.
There are a lot of widely used text processing tools in the Linux command-line arsenal. Usually, they can solve our problems in a pretty compact and straightforward way.
The head command is probably the most obvious way to solve the problem. This is because the head command will output the first part of the file. It’s exactly what we’re looking for.
We can use head‘s -n X option to get the first X lines:
$ head -n 5 input.txt
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
I am the 5th line.
We may want to replace the hard-coded “5” in the head command above with a shell variable so that the command is easy to use in a script:
$ head -n "$LINE_NO" input.txt
The head command has no dedicated option for choosing between keeping and removing the given line. However, we can adjust the X in the -n X option to achieve that:
$ LINE_NO=5
$ head -n "$(( LINE_NO - 1 ))" input.txt
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
It isn’t difficult to wrap the head command in a shell script that is similar to the one we wrote in the pure Bash solution.
We can let the shell script handle the arguments, such as the “inclusive” or “exclusive” options, and change the $LINE_NO variable used in the head command.
So, we can also achieve one single script covering two scenarios with the head command.
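A minimal sketch of such a wrapper could look like the following. The function name rm_lines_head is hypothetical; it takes the same -i and -e options as rmLines_v2.sh. The guard against a negative count is there because GNU head interprets a negative -n value as "all but the last N lines":

```shell
#!/bin/sh
# Hypothetical wrapper around head.
# Usage: rm_lines_head <-i or -e> <FILENAME> <FROM_LINE_NUMBER>
rm_lines_head() {
    if [ "$#" -ne 3 ]; then
        echo "Usage: rm_lines_head <-i or -e> <FILENAME> <FROM_LINE_NUMBER>" >&2
        return 1
    fi
    case "$1" in
        -i) keep=$(( $3 - 1 )) ;;   # inclusive: the given line is removed too
        -e) keep=$3 ;;              # exclusive: the given line is kept
        *)
            echo "Unknown option: $1" >&2
            return 1
            ;;
    esac
    # Never hand a negative count to head
    if [ "$keep" -lt 0 ]; then
        keep=0
    fi
    head -n "$keep" "$2"
}

# Usage: rm_lines_head -i input.txt 5    (prints lines 1 to 4)
#        rm_lines_head -e input.txt 5    (prints lines 1 to 5)
```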
So far, we’ve solved the problem using pure Bash and the handy head command.
Now, let’s take a look at how sed solves the problem.
sed can solve the problem in different ways. For example, all these commands will do the job (assuming that we’ve stored the value 5 in the $LINE_NO shell variable):
sed -n "1,$LINE_NO p;$(( LINE_NO + 1 )) q" input.txt
sed "$(( LINE_NO + 1 )),$ d" input.txt
sed "1, $LINE_NO ! d" input.txt
Note that we use double quotes to wrap the commands in all sed commands above to expand the shell variables.
Moreover, we added a space between “!” and the “d” command in the last command. This is because, in an interactive Bash session, “!d” between double quotes triggers history expansion by default, whereas a ‘!‘ followed by a space doesn’t.
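If we’d rather not depend on that space, one alternative is to keep the sed script in single quotes, where Bash performs no history expansion, and to splice in only the shell variable between double quotes. The snippet below is a small sketch that feeds sed eight numbered lines through printf instead of using input.txt:

```shell
#!/bin/sh
LINE_NO=5
# The sed script is assembled from three adjacent pieces:
#   '1,'  +  "$LINE_NO"  +  '!d'   which yields   1,5!d
printf '%s\n' 1 2 3 4 5 6 7 8 | sed '1,'"$LINE_NO"'!d'
# prints the numbers 1 through 5, one per line
```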
The three commands produce the same output. However, the first command performs better than the other two, although it’s the longest.
This is because the first command reads only the first LINE_NO + 1 lines and then quits (q) instead of processing the rest of the input file. This is particularly useful when we handle large input files.
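The effect of quitting early is easiest to see on a long input stream. As a hedged illustration, the sketch below uses a shorter form of the same idea: without -n, sed prints every line it reads, so a single q command on the given line is enough. The function name first_lines is hypothetical, and the sketch assumes the line number is at least 1, since a line address of 0 isn’t valid for q:

```shell
#!/bin/sh
# Hypothetical helper: print the first N lines of standard input, then quit.
# Assumes N >= 1.
first_lines() {
    sed "${1}q"
}

# seq would produce a million lines, but sed stops reading after line 5
seq 1 1000000 | first_lines 5
# prints the numbers 1 through 5, one per line
```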
Now, let’s take the first command as an example to do a test:
$ echo $LINE_NO
5
$ sed -n "1,$LINE_NO p; $(( LINE_NO + 1 )) q" input.txt
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
I am the 5th line.
It’s worth mentioning that sed has a -i option allowing us to write the change back to the file.
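As a sketch of the in-place edit, assuming GNU sed (BSD and macOS sed expect a backup suffix argument after -i), we could combine -i with the range-delete form of the command. The function name truncate_with_sed is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper, assuming GNU sed: keep lines 1..N of FILE and
# delete everything after them, modifying FILE directly.
# Usage: truncate_with_sed <FILENAME> <LINE_NO>
truncate_with_sed() {
    # \$ keeps the shell from touching sed's "last line" address
    sed -i "$(( $2 + 1 )),\$d" "$1"
}

# Usage: truncate_with_sed input.txt 5    (input.txt keeps lines 1 to 5)
```

Since this overwrites the file, it’s sensible to try the command without -i first and inspect the output.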
It isn’t difficult to change our sed commands to make them work with the “inclusive deletion” scenario. We can play some math tricks on the $LINE_NO variable to achieve that:
$ sed -n "1,$(( LINE_NO - 1 )) p; $LINE_NO q" input.txt
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
We’ve solved the problem using the sed command.
If we want to make one single sed command work for both scenarios, we can wrap the sed command in a shell script that is similar to the pure Bash solution.
awk is another powerful text-processing utility. Now, let’s see how awk solves the problem.
Similar to sed, awk can also solve the problem in a pretty compact way:
awk 'NR <= 5' input.txt
However, we want to make the command more flexible. Therefore, we’ll extract the hardcoded ‘5‘ to an awk variable “lineNo“. Further, to gain better performance, we exit the file processing when the current line number is greater than the value of lineNo:
$ awk -v lineNo='5' 'NR > lineNo{exit};1' input.txt
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
I am the 5th line.
There are several ways to change the awk command to do “inclusive deletion”. A straightforward approach is replacing the variable lineNo with (lineNo-1):
$ awk -v lineNo='5' 'NR > (lineNo-1){exit};1' input.txt
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
We can also build one command to cover both “inclusive-deletion” and “exclusive-deletion”. This is easier with awk because it supports variables and conditional expressions in its scripting language:
$ awk -v opt="i" -v lineNo="5" 'NR > lineNo-( opt == "i"? 1 : 0 ){exit};1' input.txt
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
$ awk -v opt="e" -v lineNo="5" 'NR > lineNo-( opt == "i"? 1 : 0 ){exit};1' input.txt
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
I am the 5th line.
Of course, if we like, we can also wrap the awk command in a small shell script and pass “opt” and “lineNo” from shell variables, just like we could do with the sed command.
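A minimal sketch of such a wrapper might look like this. The function name rm_lines_awk and its argument order are assumptions made for illustration; the awk program is the same one used in the commands above:

```shell
#!/bin/sh
# Hypothetical wrapper around the awk one-liner.
# Usage: rm_lines_awk <i or e> <FILENAME> <FROM_LINE_NUMBER>
rm_lines_awk() {
    opt=$1        # "i" = inclusive deletion, "e" = exclusive deletion
    file=$2
    line_no=$3
    # -v hands the shell values to awk as awk variables
    awk -v opt="$opt" -v lineNo="$line_no" \
        'NR > lineNo - (opt == "i" ? 1 : 0) { exit }; 1' "$file"
}

# Usage: rm_lines_awk i input.txt 5    (prints lines 1 to 4)
#        rm_lines_awk e input.txt 5    (prints lines 1 to 5)
```

Passing the values with -v avoids splicing shell variables into the awk program text.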
Removing lines from a text file is a common operation when we work under the Linux command line.
This article addressed how to delete all lines in a file starting from a specific line. We’ve introduced four different approaches to solve the problem through examples.
The pure Bash solution doesn’t require any external package as dependencies. However, using handy and powerful text processing tools, such as head, sed, or awk, can solve the problem in a more compact and straightforward way.
Also, we’ve discussed the technique to make the same command cover both “inclusive-deletion” and “exclusive-deletion” scenarios.