Last updated: March 18, 2024
In this tutorial, we’ll learn how to remove the first n characters of a line using the tools provided by GNU/Linux.
cut allows us to select certain sections of a line, either by character position or by a delimiter.
Let’s use the first of these to remove the first three characters of our string. We’ll tell it to keep everything from the 4th character onward:
$ echo '123456789' | cut -c 4-
456789
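Since cut works line by line, the same idea extends to multi-line input and to a count that isn't hard-coded. The following is a minimal sketch; drop_first_cut is a hypothetical helper name, and its argument is the number of characters to drop:

```shell
# drop_first_cut N: hypothetical helper that drops the first N characters of every line.
# cut keeps from position N+1 onward, so we add one to the count.
drop_first_cut() {
    cut -c "$(( $1 + 1 ))-"
}

printf '%s\n' 'abcdef' '123456789' | drop_first_cut 3
# def
# 456789
```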
Since we know exactly how many characters we want to delete, we can describe them with a pattern. sed allows us to filter and transform text, in many cases with the help of patterns.
Using a regular expression, we can search for the first three characters and have sed remove them from the line:
$ echo '123456789' | sed -r 's/^.{3}//'
#                              |___||____ sed removes them
#                                |
#                                |__ search for the first three characters
The -r option enables extended regular expressions. GNU sed also accepts -E as an equivalent spelling.
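One detail of the fixed-count pattern is that a line shorter than three characters doesn't match, so sed prints it unchanged. If we'd rather strip up to n characters, a bounded interval can be used instead. This is a sketch under that assumption; drop_first_sed is a hypothetical helper name, and its argument is interpolated into the pattern by the shell:

```shell
# drop_first_sed N: hypothetical helper that removes up to N leading characters.
# .{0,N} also matches lines shorter than N, so short lines become empty
# instead of passing through untouched.
drop_first_sed() {
    sed -E "s/^.{0,$1}//"
}

printf '%s\n' '123456789' 'ab' | drop_first_sed 3
# 456789
# (empty line)
```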
Just like sed, grep also operates using text patterns. With the same regular expression we’ll look for the first three characters:
$ echo '123456789' | grep -Po '^.{3}\K.*'
The -P flag instructs grep to interpret the pattern as a Perl-compatible regular expression, and -o makes it print only the matched part of each line.
The \K escape sequence causes what was previously matched (the first three characters) to be left out of the reported match, then .* matches everything that follows.
Further use cases and examples of grep can be found in Common Linux Text Searches.
awk enables us to apply actions to certain patterns.
Recalling our regular expression, we can use it in our awk script as an argument to the sub function to remove the desired characters:
$ echo '123456789' | awk 'sub(/^.{3}/,"")'
awk offers a few other ways to achieve this as well.
In the remaining examples, we’ll use a variable named range. We could inline the value in the expression instead, but a variable makes the command more readable, just like in regular code.
Additionally, with a variable we can control the size of the range through a parameter while keeping the awk script itself intact. So, by parameterizing, we don’t lose generality in our script.
Going back to our first approach, let’s make use of the variable:
$ echo '123456789' | awk -v range="3" 'sub(sprintf("^.{%s}",range),"")'
#                                          |_____________________|
#                                                     |
# Here we compose our regular expression _____________|
Also, we can instruct awk to use the empty string as the field separator, so that every character becomes its own field. GNU awk supports this, but POSIX leaves the behavior unspecified, so it may not work in every awk. Then, we can iterate over the characters, printing only from the desired position to the end of the line:
$ echo '123456789' | awk -F '' -v range=3 '{for (i=1; i<=NF; i++) if (i > range) printf "%s", $i; print ""}'
#                        |___| |________|
#                          |       |_____ We assign the value "3" to the variable "range"
#                          |
#                          |_________ We set the input field separator to the empty string,
#                                     leaving a space between -F and the quotes.
A more convenient way to do this is with the substr function:
$ echo '123456789' | awk -v range=3 '{print substr($0,range+1)}'
Here, we can exploit awk’s default action, which prints the entire record (stored in the variable $0) whenever the pattern evaluates to true, so we only need to modify $0:
$ echo '123456789' | awk -v range=3 '$0 = substr($0,range+1)'
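The two substr variants differ in one edge case: when nothing is left after cutting, the assignment evaluates to the empty string, which awk treats as false, so the line is skipped rather than printed as an empty line. The sketch below illustrates this; both helper names are hypothetical, and grep -c '' is used only to count output lines:

```shell
# Hypothetical helpers wrapping the two substr-based commands for comparison.
drop_with_pattern() { awk -v range="$1" '$0 = substr($0, range + 1)'; }
drop_with_print()   { awk -v range="$1" '{ print substr($0, range + 1) }'; }

# 'abc' has exactly three characters, so nothing remains after the cut.
printf '%s\n' 'abc' 'abcdef' | drop_with_pattern 3 | grep -c ''   # 1 (the empty result is dropped)
printf '%s\n' 'abc' 'abcdef' | drop_with_print 3   | grep -c ''   # 2 (an empty line is printed)
```

If the number of output lines must match the number of input lines, the explicit print form is the safer choice.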
perl is the interpreter for the Perl language, which offers a rich set of text-processing features.
As we did for sed, grep, and the sub function of awk, we can apply the regular expression in our perl call:
$ echo '123456789' | perl -pe 's/^.{3}//'
Parameter expansion, available in Bash and Zsh, lets us extract a range of characters from a variable:
$ var="123456789"
$ echo "${var:3}"
Or, only with Zsh:
$ var="123456789"
$ echo $var[4,-1]
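For shells without substring expansion, such as dash, POSIX prefix removal can serve the same purpose. This is a minimal sketch: the pattern after # is a glob in which each ? matches exactly one character, so the number of question marks sets how many characters are removed:

```shell
var='123456789'
# ${var#pattern} removes the shortest prefix matching the pattern;
# ??? matches exactly three characters.
echo "${var#???}"
# 456789
```

If the value has fewer than three characters, the pattern doesn't match and the value is left unchanged.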
A disadvantage of this approach is that lines coming from a stream have to be assigned to a variable before they can be cut. To process a file, we’d need a loop. We read with IFS= and -r so that leading whitespace and backslashes are preserved, and the || [[ -n $var ]] test handles a last line with no trailing newline:
$ while IFS= read -r var || [[ -n $var ]]; do echo "${var:3}"; done < example_file.txt
Or:
$ <command> | while IFS= read -r var || [[ -n $var ]]; do echo "${var:3}"; done
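When reading lines in a loop, it's worth checking how read treats leading whitespace, since that changes which characters count as the first n. The sketch below compares plain read with IFS= read -r. Both function names are hypothetical, and the POSIX form ${var#???} (remove a three-character prefix) is used so the example also runs in plain sh:

```shell
# Plain read trims leading blanks before the prefix is removed.
strip_plain() {
    while read var; do printf '%s\n' "${var#???}"; done
}

# IFS= read -r keeps the line verbatim; the extra test handles a
# final line that has no trailing newline.
strip_raw() {
    while IFS= read -r var || [ -n "$var" ]; do printf '%s\n' "${var#???}"; done
}

printf '  indented\n' | strip_plain
# ented
printf '  indented\n' | strip_raw
# ndented
```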
In this tutorial, we used several tools provided by GNU/Linux to remove the first n characters from a string.