1. Overview

When we work under the Linux command line, we often need to manipulate text files.

Removing lines from text files is a common operation: for example, removing the first line of a file, removing from file B the lines that appear in file A, or removing the last N lines of a file.

In this tutorial, we’ll have a look at how to delete lines from a given line number until the end of the file.

2. Introduction to the Problem

Although the problem isn’t difficult to understand, let’s see an example to get it straight in our heads.

Let’s say we have an input file called input.txt:

$ nl input.txt
     1	I am the 1st line.
     2	I am the 2nd line.
     3	I am the 3rd line.
     4	I am the 4th line.
     5	I am the 5th line.
     6	I am the 6th line.
     7	I am the 7th line.
     8	I am the 8th line.

As the output above shows, we’ve used the nl command to print the file’s content with line numbers.
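To follow along, we can recreate the sample file with a one-liner. This is only a convenience sketch; it relies on printf reusing its format string once for each argument:

```shell
# Recreate the sample file used in the examples.
# printf repeats the format string once per argument.
printf 'I am the %s line.\n' 1st 2nd 3rd 4th 5th 6th 7th 8th > input.txt

# Confirm that the file has eight lines.
wc -l < input.txt
```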

Now, let’s say our goal is to remove all lines from line five till the end of the file.

The problem can have two variants:

  • Removing all lines, starting at the given line number, until the end of the file (the given line won’t be in the result).
  • Removing all lines after the given line number until the end of the file (the given line will be in the result).

We’ll discuss both scenarios in this tutorial.

There are various ways to do that in the Linux command line. In this tutorial, we’ll explore four approaches:

  • Using pure Bash
  • Using the head command
  • Using the sed command
  • Using the awk command

Next, let’s see them in action.

3. Using Pure Bash

Bash is the default shell for most modern Linux distributions. So, if we solve a problem with pure Bash, our solution doesn’t rely on any extra dependencies.

Next, let’s see how to solve the problem using simple Bash scripts.

3.1. Remove All Lines After the Given Line

First, let’s see a shell script, rmLines_v1.sh, that removes all lines after the given line. That is, the given line will remain in our result:

$ cat rmLines_v1.sh
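#!/bin/bash
# Usage: rmLines_v1.sh <FILENAME> <LINE_NUMBER>
FILE="$1"
LINE_NO=$2
i=1
# IFS= and -r keep leading whitespace and backslashes intact
while IFS= read -r line; do
    echo "$line"
    # Stop right after printing the given line
    if [ "$i" -eq "$LINE_NO" ]; then
        break
    fi
    i=$(( i + 1 ))
done <"$FILE"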

The shell script looks pretty simple. It accepts two arguments: the filename and a line number.

The main part of the script is a while loop that goes through and outputs the lines until the given line number. We declare a counter variable $i and increment the counter within the loop so that we know when we should stop printing.

Let’s execute the script with our example input file:

$ ./rmLines_v1.sh input.txt 5
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
I am the 5th line.

In this test, we’ve passed a “5” as the second argument, and we see line five is in the output. So, it works as we expected.

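Note that although the script prints the desired output, it won’t change the original input.txt file. If we want to write the change back to the input file, the shell’s redirection can help us. We redirect to a temporary file first, because redirecting straight to input.txt would truncate the file before the script reads it: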

$ ./rmLines_v1.sh input.txt 5 > tmp.result && mv tmp.result input.txt 

$ cat input.txt
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
I am the 5th line.
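The write-back step can be made a little more robust by letting mktemp pick the temporary file name, so it can’t clash with an existing file. The following is a minimal sketch under that assumption; truncate_after is a hypothetical helper name, and it embeds the same counting loop as rmLines_v1.sh:

```shell
#!/bin/bash
# Sketch: keep only the first N lines of a file, in place.
# truncate_after is a hypothetical helper, not one of the article's scripts.
truncate_after() {
    local file=$1 line_no=$2 tmp i=1 line
    tmp=$(mktemp) || return 1
    # Same counting loop as in rmLines_v1.sh, writing into the temporary file.
    while IFS= read -r line; do
        printf '%s\n' "$line"
        if [ "$i" -eq "$line_no" ]; then
            break
        fi
        i=$(( i + 1 ))
    done < "$file" > "$tmp"
    # Replace the original only after the temporary file is complete.
    mv "$tmp" "$file"
}

# Demo on a throwaway copy of the sample data.
demo=$(mktemp)
printf 'I am the %s line.\n' 1st 2nd 3rd 4th 5th 6th 7th 8th > "$demo"
truncate_after "$demo" 5
cat "$demo"
```

One thing to keep in mind is that mktemp creates files with restrictive permissions, so the replaced file may end up with different permissions than the original.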

Next, let’s modify the script rmLines_v1.sh to remove lines, including the given line.

3.2. Remove the Given Line and All Lines After It

The requirement is easy to understand. For instance, if we get line number 5 as an argument, line five should be removed as well.

This is not a challenge to us. We can modify the rmLines_v1.sh script to solve the problem. There are two ways to make it work:

  • Move the echo "$line" line after the if block so that we can break the loop before the given line gets printed
  • Change the if condition [ "$i" -eq "$LINE_NO" ] into [ "$i" -eq "$(( LINE_NO - 1 ))" ]
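As an illustration, the first option could look like the sketch below. The function name print_before is hypothetical, and the loop is otherwise the one from rmLines_v1.sh:

```shell
#!/bin/bash
# Sketch of the first option: test the counter before printing.
# print_before is a hypothetical function name.
print_before() {
    local file=$1 line_no=$2 i=1 line
    while IFS= read -r line; do
        # Stop as soon as we reach the given line, before it is printed.
        if [ "$i" -eq "$line_no" ]; then
            break
        fi
        printf '%s\n' "$line"
        i=$(( i + 1 ))
    done < "$file"
}

# Demo on a temporary copy of the sample data.
sample=$(mktemp)
printf 'I am the %s line.\n' 1st 2nd 3rd 4th 5th 6th 7th 8th > "$sample"
print_before "$sample" 5
```

With this ordering, passing line number 1 produces empty output. The second option behaves differently in that case: the condition would compare the counter against 0, which never matches, so the loop would print the whole file.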

We can create two scripts to handle the including and the excluding scenarios separately. However, if new requirements come in, we’ll have to maintain two scripts.

It would be nice if we could create one script working for both cases.

Next, let’s see how to achieve the goal.

3.3. One Single Script Covering Two Scenarios

First, let’s take a look at the second version of the script:

$ cat rmLines_v2.sh
#!/bin/bash
err_usage(){
    echo "The Arguments are not accepted!"
    echo "Usage: $0 <-i or -e> <FILENAME> <FROM_LINE_NUMBER>"
    echo "-i : Remove lines including the given line."
    echo "-e : Remove lines excluding the given line."
    exit 1
}

if [ $# -ne 3 ]; then
    err_usage
fi

FILE="$2"
LINE_NO=$3
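case "$1" in
    -i)
        # Including the given line: stop one line earlier
        LINE_NO=$(( LINE_NO - 1 ))
        # Removing from line 1 onwards leaves nothing to print
        [ "$LINE_NO" -lt 1 ] && exit 0
        ;;
    -e)
        ;;
    *)
        err_usage
        ;;
esac

i=1
# IFS= and -r keep leading whitespace and backslashes intact
while IFS= read -r line; do
    echo "$line"
    if [ "$i" -eq "$LINE_NO" ]; then
        break
    fi
    i=$(( i + 1 ))
done <"$FILE"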

As the script above shows, we’ve introduced a new argument to the script: -i or -e to specify the line removal semantics, either including or excluding the given line, respectively.

Apart from that, we’ve added a new if check and a new case block to the script. The newly added if block verifies if a user has passed three arguments to the script.

The case statement checks if the first option is “-i” or “-e”. Further, it subtracts one from the $LINE_NO variable if the user passes the -i option.

The argument parsing we’ve shown in the script is just an example. We discuss more advanced argument parsing in greater detail in “How to use command-line arguments in a Bash script”.
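One check that the example parsing leaves out is whether the line number is really a positive integer. A minimal sketch of such a check, using Bash’s regular expression matching, could look like this. The helper name is_line_number is hypothetical:

```shell
#!/bin/bash
# Sketch: accept only positive integers as a line number.
# is_line_number is a hypothetical helper name.
is_line_number() {
    [[ $1 =~ ^[1-9][0-9]*$ ]]
}

# Try a few candidate values.
for candidate in 5 0 -3 abc 12; do
    if is_line_number "$candidate"; then
        echo "$candidate: ok"
    else
        echo "$candidate: rejected"
    fi
done
```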

Finally, let’s test version 2 of the script, with the -i and -e options, to see if it solves our problem:

$ ./rmLines_v2.sh -i input.txt 5
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.

$ ./rmLines_v2.sh -e input.txt 5
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
I am the 5th line.

Great! As the output shows, we’ve achieved coverage of both scenarios using a single script. Therefore, we’ve solved the problem using pure Bash.
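If we can assume Bash 4 or later, the mapfile builtin offers a loop-free alternative that is still pure Bash. The sketch below is not one of the scripts above; first_lines is a hypothetical function name:

```shell
#!/bin/bash
# Sketch: read at most N lines into an array with the mapfile builtin (Bash 4+).
# first_lines is a hypothetical function name.
first_lines() {
    local file=$1 count=$2 lines=()
    # mapfile -n 0 would read every line, so handle "nothing to keep" ourselves.
    [ "$count" -ge 1 ] || return 0
    mapfile -t -n "$count" lines < "$file"
    # Print only when something was read, to avoid a stray empty line.
    if [ "${#lines[@]}" -gt 0 ]; then
        printf '%s\n' "${lines[@]}"
    fi
}

sample=$(mktemp)
printf 'I am the %s line.\n' 1st 2nd 3rd 4th 5th 6th 7th 8th > "$sample"
LINE_NO=5
first_lines "$sample" "$LINE_NO"             # keeps the given line
first_lines "$sample" "$(( LINE_NO - 1 ))"   # removes the given line as well
```

Unlike the while loop, this approach holds the kept lines in memory, so it suits small line counts best.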

4. Using the head Command

Even though the pure Bash solution doesn’t require any external packages, we have to implement each step, such as the loop, by ourselves.

There are a lot of widely used text processing tools in the Linux command-line arsenal. Usually, they can solve our problems in a pretty compact and straightforward way.

The head command is probably the most obvious way to solve the problem. This is because the head command will output the first part of the file. It’s exactly what we’re looking for.

4.1. Remove All Lines After the Given Line

We can use head’s -n X option to get the first X lines:

$ head -n 5 input.txt 
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
I am the 5th line.

We may want to replace the hard-coded “5” in the head command above with a shell variable so that the command is easy to use in a script:

$ head -n "$LINE_NO" input.txt 

4.2. Remove the Given Line and All Lines After It

The head command itself has no switch for including or excluding the given line. However, we can adjust the X in the “-n X” option to achieve that:

$ LINE_NO=5
$ head -n "$(( LINE_NO - 1 ))" input.txt
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.

It isn’t difficult to wrap the head command in a shell script that is similar to the one we wrote in the pure Bash solution.

We can let the shell script handle the arguments, such as the “inclusive” or “exclusive” options, and change the $LINE_NO variable used in the head command.

So, we can also achieve one single script covering two scenarios with the head command.
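A sketch of such a wrapper is shown below, written as a function so that it’s easy to try out. The name rm_lines_head is hypothetical, and the -i and -e options simply mirror those of rmLines_v2.sh:

```shell
#!/bin/bash
# Sketch: one function for both scenarios, built on head.
# rm_lines_head is a hypothetical name; -i and -e mirror rmLines_v2.sh.
rm_lines_head() {
    local opt=$1 file=$2 line_no=$3
    case "$opt" in
        -i) line_no=$(( line_no - 1 )) ;;  # drop the given line as well
        -e) ;;                             # keep the given line
        *)  echo "Usage: rm_lines_head <-i or -e> <FILENAME> <FROM_LINE_NUMBER>" >&2
            return 1 ;;
    esac
    head -n "$line_no" "$file"
}

sample=$(mktemp)
printf 'I am the %s line.\n' 1st 2nd 3rd 4th 5th 6th 7th 8th > "$sample"
rm_lines_head -i "$sample" 5
rm_lines_head -e "$sample" 5
```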

5. Using the sed Command

So far, we’ve solved the problem using pure Bash and the handy head command.

Now, let’s take a look at how sed solves the problem.

5.1. Remove All Lines After the Given Line

sed can solve the problem in different ways. For example, all these commands will do the job (assuming that we’ve stored the value 5 in the $LINE_NO shell variable):

sed -n "1,$LINE_NO p;$(( LINE_NO + 1 )) q" input.txt
sed "$(( LINE_NO + 1 )),$ d" input.txt
sed "1, $LINE_NO ! d" input.txt

Note that we wrap all the sed commands above in double quotes so that the shell expands the variables and arithmetic expressions.

Moreover, we added a space between “!” and the “d” command in the last command. This is because if we put “!d” between double quotes, the “!” character will trigger, by default, Bash’s history expansion in an interactive shell.

The three commands produce the same output. However, the first command performs better than the other two, although it’s the longest.

This is because the first command reads only the first LINE_NO + 1 lines. After that, the q command makes sed quit without processing the rest of the input file. This is particularly useful when we handle large input files.

Now, let’s take the first command as an example to do a test:

$ echo $LINE_NO
5
$ sed -n "1,$LINE_NO p; $(( LINE_NO + 1 )) q" input.txt
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
I am the 5th line.
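The same early-exit idea can be written more compactly. As a sketch, if we drop the -n option, sed prints every line by default, and a single q at the given line is enough:

```shell
LINE_NO=5
sample=$(mktemp)
printf 'I am the %s line.\n' 1st 2nd 3rd 4th 5th 6th 7th 8th > "$sample"

# Print each line by default and quit right after line LINE_NO has been printed.
sed "${LINE_NO}q" "$sample"
```

This variant stops after reading LINE_NO lines instead of LINE_NO + 1.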

It’s worth mentioning that sed has a -i option allowing us to write the change back to the file.
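For example, assuming GNU sed, an in-place edit could look like the sketch below. We run it on a temporary copy so that the sample file stays untouched:

```shell
LINE_NO=5
workfile=$(mktemp)
printf 'I am the %s line.\n' 1st 2nd 3rd 4th 5th 6th 7th 8th > "$workfile"

# GNU sed: -i edits the file in place.
# The backslash keeps the shell from treating sed's "last line" address ($) specially.
sed -i "$(( LINE_NO + 1 )),\$ d" "$workfile"

cat "$workfile"
```

On BSD and macOS, sed expects a backup suffix argument after -i, so the command needs adjusting there.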

5.2. Remove the Given Line and All Lines After It

It isn’t difficult to change our sed commands to make them work with the “inclusive deletion” scenario. We can play some math tricks on the $LINE_NO variable to achieve that:

$ sed -n "1,$(( LINE_NO - 1 )) p; $LINE_NO q" input.txt
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.

We’ve solved the problem using the sed command.

If we want to make one single sed command work for both scenarios, we can wrap the sed command in a shell script that is similar to the pure Bash solution.
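Here is a minimal sketch of such a wrapper, written as a function. Note that it uses slightly different sed programs than the ones above: both branches rely on the q command alone. The name rm_lines_sed is hypothetical:

```shell
#!/bin/bash
# Sketch: one function for both scenarios, built on sed's q command.
# rm_lines_sed is a hypothetical name; -i and -e mirror rmLines_v2.sh.
rm_lines_sed() {
    local opt=$1 file=$2 line_no=$3
    case "$opt" in
        # With -n, nothing is printed unless we ask for it:
        # quit at the given line, otherwise print the current line.
        -i) sed -n "${line_no}q;p" "$file" ;;
        # Without -n, the given line is printed before sed quits.
        -e) sed "${line_no}q" "$file" ;;
        *)  echo "Usage: rm_lines_sed <-i or -e> <FILENAME> <FROM_LINE_NUMBER>" >&2
            return 1 ;;
    esac
}

sample=$(mktemp)
printf 'I am the %s line.\n' 1st 2nd 3rd 4th 5th 6th 7th 8th > "$sample"
rm_lines_sed -i "$sample" 5
rm_lines_sed -e "$sample" 5
```

Because no arithmetic on the line number is involved, the inclusive branch also handles line number 1, where it prints nothing.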

6. Using the awk Command

awk is another powerful text-processing utility. Now, let’s see how awk solves the problem.

6.1. Remove All Lines After the Given Line

Similar to sed, awk can also solve the problem in a pretty compact way:

awk 'NR <= 5' input.txt

However, we want to make the command more flexible. Therefore, we’ll extract the hard-coded “5” into an awk variable, lineNo. Further, to gain better performance, we exit the file processing as soon as the current line number is greater than the value of lineNo:

$ awk -v lineNo='5' 'NR > lineNo{exit};1' input.txt 
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
I am the 5th line.
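In a script, the value usually comes from a shell variable rather than a literal. As a small sketch, we can hand a shell variable to awk through the same -v option:

```shell
LINE_NO=5
sample=$(mktemp)
printf 'I am the %s line.\n' 1st 2nd 3rd 4th 5th 6th 7th 8th > "$sample"

# Pass the shell variable LINE_NO into the awk variable lineNo.
# The awk program stays in single quotes, so the shell doesn't touch it.
awk -v lineNo="$LINE_NO" 'NR > lineNo { exit } 1' "$sample"
```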

6.2. Remove the Given Line and All Lines After It

There are several ways to change the awk command to do “inclusive deletion”. A straightforward approach is replacing the variable lineNo with (lineNo - 1):

$ awk -v lineNo='5' 'NR > (lineNo-1){exit};1' input.txt
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.

We can also build one command to cover both “inclusive deletion” and “exclusive deletion”. This is easier with awk because its scripting language supports variables and conditional expressions:

$ awk -v opt="i" -v lineNo="5" 'NR > lineNo-( opt == "i"? 1 : 0 ){exit};1' input.txt 
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.

$ awk -v opt="e" -v lineNo="5" 'NR > lineNo-( opt == "i"? 1 : 0 ){exit};1' input.txt 
I am the 1st line.
I am the 2nd line.
I am the 3rd line.
I am the 4th line.
I am the 5th line.

Of course, if we like, we can also wrap the awk command in a small shell script and pass “opt” and “lineNo” from shell variables, just as we suggested for the sed command.
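Such a wrapper might look like the following sketch. The function name rm_lines_awk is hypothetical, and the awk program computes the last line to keep once, in a BEGIN block, instead of on every line:

```shell
#!/bin/bash
# Sketch: wrap the awk command and pass both values from shell variables.
# rm_lines_awk is a hypothetical name; opt is "i" (inclusive) or "e" (exclusive).
rm_lines_awk() {
    local opt=$1 file=$2 line_no=$3
    awk -v opt="$opt" -v lineNo="$line_no" '
        # Work out the last line to keep before reading any input.
        BEGIN { limit = lineNo - (opt == "i" ? 1 : 0) }
        NR > limit { exit }
        1
    ' "$file"
}

sample=$(mktemp)
printf 'I am the %s line.\n' 1st 2nd 3rd 4th 5th 6th 7th 8th > "$sample"
rm_lines_awk i "$sample" 5
rm_lines_awk e "$sample" 5
```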

7. Conclusion

Removing lines from a text file is a common operation when we work under the Linux command line.

This article addressed how to delete all lines in a file starting from a specific line. We’ve introduced four different approaches to solve the problem through examples.

The pure Bash solution doesn’t require any external package as dependencies. However, using handy and powerful text processing tools, such as head, sed, or awk, can solve the problem in a more compact and straightforward way.

Also, we’ve discussed the technique to make the same command cover both “inclusive-deletion” and “exclusive-deletion” scenarios.
