How To Perform Operations at the End-of-Line With sed

1. Introduction

sed, short for “stream editor”, is a command-line utility that performs text transformations on an input stream (a file or input from a pipeline). Its capabilities include searching for text, finding and replacing text, and inserting or deleting text.

In this tutorial, we’ll learn to perform various operations at the end-of-line (EOL) using sed, such as deleting, replacing, or adding text. We’ll start by understanding end-of-line (EOL) characters, followed by a refresher on the sed command. Finally, we’ll see two examples of using sed to first replace, and then add, text at the end of a line.

2. End-of-Line (EOL) Characters

First, let’s clarify what exactly end-of-line means. In the context of text files, the end-of-line (EOL) character is a fundamental element that marks the termination of a line. When a program such as a text editor encounters the EOL, it knows that the current line is complete and the next one begins. Different operating systems employ distinct EOL conventions. Let’s see how the most common operating systems define EOL:

Unix-like systems: \n (newline)
Windows: \r\n (carriage return + newline)
Mac OS (prior to Mac OS X): \r (carriage return)
macOS (Mac OS X and after): \n (newline)

Knowing this variation becomes crucial when handling files from diverse sources. Since our focus will be on Linux systems, when we refer to “EOL”, we will assume it to be newline (\n).
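
To check which EOL convention a file uses, we can dump its bytes. The following is a minimal sketch that assumes the standard od utility is available; the files unix.txt and windows.txt are hypothetical samples created only for this demonstration:

```shell
# Create two hypothetical sample files in a temporary directory.
workdir=$(mktemp -d)
printf 'hello\n'   > "$workdir/unix.txt"      # Unix-style EOL (LF)
printf 'hello\r\n' > "$workdir/windows.txt"   # Windows-style EOL (CRLF)

# od -c prints every byte, showing EOL characters as \n and \r.
unix_dump=$(od -An -c "$workdir/unix.txt")
windows_dump=$(od -An -c "$workdir/windows.txt")

printf 'unix:    %s\n' "$unix_dump"
printf 'windows: %s\n' "$windows_dump"
```

In the dump, the Unix-style file should end with \n only, whereas the Windows-style file should show \r right before \n.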

3. sed Basics

Before diving into end-of-line operations, let’s revisit some fundamental sed concepts. sed is a stream editor that applies commands to each line of input and writes the result to the output stream. Conceptualized by Lee E. McMahon in 1973, sed has evolved into a powerful tool widely embraced by system administrators, developers, and data enthusiasts.

Let’s review the structure of a typical sed command:

sed [OPTIONS] COMMAND [INPUTFILE]

In this article, we’ll focus on the COMMAND argument, which we can use to perform actions such as substitution or deletion. Commands can be provided as one-liners or read from a file, and regular expressions are supported as well. The INPUTFILE can be any text file or piped input stream.

Particularly in the case of substitution, the structure will be:

sed 's/search_pattern/replacement_pattern/' input_file

Here, the s specifies the substitution operation. The / (forward slashes) are delimiters. On each line, the first match of search_pattern, if any, is replaced with replacement_pattern on the fly, and the result is printed to standard output. The input file itself isn’t modified.
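
As a quick illustration of the substitution structure, here is a minimal sketch that feeds sed a piped input instead of a file. The sample sentence is made up for the demonstration:

```shell
# Replace the first occurrence of "world" on the input line with "sed".
# Without extra flags, later occurrences on the same line are left alone.
result=$(printf 'hello world, big world\n' | sed 's/world/sed/')
printf '%s\n' "$result"   # prints: hello sed, big world
```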

Interestingly, sed operates on newline-delimited data, with the delimiters excluded from what a sed script processes. Since sed reads its input line by line and strips the trailing newline from each line, sed’s pattern space normally doesn’t contain the EOL character at all. To work at the end of a line anyway, we rely on regular expressions and anchoring.
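
We can observe this behavior with a small experiment. The sketch below uses a made-up two-line input; the first command tries to match the newline itself, while the second anchors to the end of each line:

```shell
# Two input lines; sed processes them one at a time, without the trailing newline.
input=$(printf 'first\nsecond')

# Trying to match the newline changes nothing: it is not in the pattern space.
no_match=$(printf '%s\n' "$input" | sed 's/\n/ /')

# Anchoring with $ does work: it matches the end of each line's content.
anchored=$(printf '%s\n' "$input" | sed 's/$/!/')

printf '%s\n' "$no_match"
printf '%s\n' "$anchored"
```

The first result should be identical to the input, while the second should have an exclamation mark appended to each line.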

4. Regular Expressions and Anchoring

Regular expressions (regex) are powerful patterns used for text matching and manipulation. These patterns, used with commands like sed, enable us to search for and match text based on specific criteria.

To anchor any regex pattern to the end of a line, we use the $ symbol at the end of the pattern. Conversely, for the start of a line, the ^ symbol is used at the beginning of the pattern. These symbols act as anchors, helping us precisely locate strings from the start or the end of a line. The $ symbol, especially, is useful for us as it matches patterns right up to the boundary of the end-of-line (EOL).

For example, if we want to match any uppercase or lowercase English letters (‘A’ to ‘Z’ as well as ‘a’ to ‘z’), we could write the regex pattern as:

[A-Za-z]

Here, A-Z and a-z are ranges covering every uppercase and every lowercase letter, respectively. The square brackets [] enclose them to form a bracket expression, which matches exactly one character from those ranges, that is, any single letter of the English alphabet.

Now, if we want to match it to an entire word containing any of these letters, we could write:

[A-Za-z]*

The * symbol at the end matches zero or more letters in the English alphabet. This signifies a word with any number of letters.

Finally, to anchor the pattern for matching the last word only, we could use the $ symbol:

[A-Za-z]*$

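The expression above matches the run of letters immediately before the EOL, which is the last word when the line ends with a letter. If the line ends with any other character, the pattern matches the empty string at the end of the line.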
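
To see what the anchored pattern actually matches, we can wrap the match in angle brackets. This is a minimal sketch with made-up sample lines; it relies on the & character, which in a sed replacement stands for the matched text:

```shell
# Wrap whatever the pattern matches in angle brackets.
last_word=$(printf 'Learn about sed\n' | sed 's/[A-Za-z]*$/<&>/')
printf '%s\n' "$last_word"       # prints: Learn about <sed>

# If the line ends with a non-letter, the pattern matches the empty string at EOL.
empty_match=$(printf 'Make pancakes!\n' | sed 's/[A-Za-z]*$/<&>/')
printf '%s\n' "$empty_match"     # prints: Make pancakes!<>
```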

5. Practical Examples

Now that we know a few sed basics and how anchoring works, let’s dive into a couple of practical examples.

5.1. Replacing Text at the End-of-Line

Suppose we have a file named baeldungScripts.txt that contains a list of bash script file names:

$ cat baeldungScripts.txt
ninjaCats.sh
baconSoda.txt
encryptxt.txt
hackWifi.sh
changepwd.txt

As we can see, some of these files still contain the extension .txt, but we want them to have the .sh extension. We can easily replace the extensions at the end using sed. The trick here is to match the last part of the file names using regex patterns. We can do this using the command:

$ sed 's/txt$/sh/' baeldungScripts.txt > fixedBaeldungScripts.txt

$ cat fixedBaeldungScripts.txt
ninjaCats.sh
baconSoda.sh
encryptxt.sh
hackWifi.sh
changepwd.sh

Here, we redirected the output to a separate text file called fixedBaeldungScripts.txt, leaving the original file untouched. The s specifies the substitution operation as discussed before. txt$ is the search pattern, and sh is the replacement string.

To better explain the search pattern, we’ve placed a $ sign after txt, so that only a txt string occurring at the end of the line is replaced. Thanks to the $ anchor, the txt substring in the middle of the third file’s original name, encryptxt.txt, wasn’t touched: the result is encryptxt.sh rather than encrypsh.txt, which is what the unanchored pattern would have produced. Note that txt$ doesn’t check for the dot before the extension, so a stricter pattern such as \.txt$ (with .sh as the replacement) is also an option; this exact regex isn’t the only one we can use for the task.
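
The effect of the anchor can be compared side by side. The following sketch runs three variants of the substitution on the file name encryptxt.txt; the stricter \.txt$ variant is our own illustration of an alternative pattern:

```shell
name='encryptxt.txt'

# Without the anchor, sed replaces the first txt it finds, in the middle of the name.
unanchored=$(printf '%s\n' "$name" | sed 's/txt/sh/')

# With the anchor, only the txt at the end of the line is replaced.
anchored=$(printf '%s\n' "$name" | sed 's/txt$/sh/')

# A stricter variant that also requires the literal dot before txt.
strict=$(printf '%s\n' "$name" | sed 's/\.txt$/.sh/')

printf '%s\n' "$unanchored" "$anchored" "$strict"
# prints:
# encrypsh.txt
# encryptxt.sh
# encryptxt.sh
```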

5.2. Adding Text at the End-of-Line

The substitution operation is quite versatile: we can also use it to add text (by replacing an empty match) and to remove text (by replacing some text with an empty string). To demonstrate, let’s see another example, where we use it to append text to the end of every line. Once again, we’ll use sed and the s command to add a specified string at the end of each line.

Suppose we have a sample to-do list in a text file named todoList.txt. We can again inspect the contents of that file:

$ cat todoList.txt
Buy potatoes
Clean the house
Learn about sed
Make pancakes

After completing all of the tasks in the to-do list at the end of the day, we’d like to add the string “ (done)” at the end of each line. We’ll use this command to perform the task:

$ sed 's/$/ (done)/' todoList.txt > doneList.txt

$ cat doneList.txt
Buy potatoes (done)
Clean the house (done)
Learn about sed (done)
Make pancakes (done)

We’ve used the s (substitution) operation to add the string “ (done)” at the end of each line corresponding to one task. Here, the search pattern is only the $ anchor, which matches the empty position at the end of each line, so the replacement is effectively appended to every line. The final list is written to the new text file named doneList.txt.
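
The same anchoring also lets us delete text at the end of a line by substituting it with an empty string. As a minimal sketch, the following reverses the previous step on a made-up two-line input that resembles doneList.txt:

```shell
# Hypothetical input resembling the first two lines of doneList.txt.
done_list=$(printf 'Buy potatoes (done)\nClean the house (done)')

# Replace the trailing " (done)" with nothing, which deletes it.
# In sed's default (basic) regex syntax, the parentheses are literal characters.
restored=$(printf '%s\n' "$done_list" | sed 's/ (done)$//')
printf '%s\n' "$restored"
# prints:
# Buy potatoes
# Clean the house
```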

6. Conclusion

In this article, we learned about the basics of the sed command. We also learned about the end-of-line (EOL) character and how it differs among operating systems. Finally, we saw two real examples of using sed to detect and manipulate text at the end of a line.

sed is a powerful Linux utility used to find, replace, add, or delete text from input streams. Although sed cannot directly manipulate EOL characters, we can use regular expressions to aid in our task.
