Last updated: January 23, 2024
Line breaks are special characters that indicate the end of a line. They’re also known as newline characters or end-of-line (EOL) characters. Different operating systems and applications use different line break types to mark the end of a line. The three most common line break types are CR LF, LF, and CR.
In this tutorial, we’ll learn what each line break type means and how they originated. We’ll also discuss the challenges and strategies for cross-platform compatibility when dealing with different line breaks.
CR LF, which stands for Carriage Return and Line Feed, is a two-character sequence that consists of a carriage return character (CR) followed by a line feed character (LF).
A carriage return character moves the cursor to the beginning of the line, while a line feed character moves the cursor down to the next line. Together, they create a new line in a text data stream.
This way, the sequence changes both the horizontal (x-axis) and the vertical (y-axis) position of the cursor.
The CR LF line break is written as \r\n in escape notation, or as 0x0D0A (the bytes 0x0D and 0x0A) in hexadecimal notation.
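To make the \r\n and 0x0D0A notations concrete, here is a minimal Python sketch that inspects the two characters. The variable names are only illustrative:

```python
# Inspect the characters and bytes behind the "\r\n" escape sequence.
crlf = "\r\n"

# One entry per character: CR is code point 13 (0x0D), LF is 10 (0x0A).
code_points = [hex(ord(ch)) for ch in crlf]

# The raw bytes that would be written to a file for this sequence.
encoded = crlf.encode("ascii")

print(code_points)     # ['0xd', '0xa']
print(encoded.hex())   # 0d0a
print(len(crlf))       # 2
```

Although it marks a single line break, the sequence occupies two characters, which is why CR LF files are slightly larger than their LF equivalents.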
The CR LF line break originated in the era of typewriters and teleprinters, when returning the carriage and feeding the paper by one line were both required to start a new line on the paper. Later, when computers and printers were developed, they adopted the same convention. CR LF became the standard for many early operating systems, such as CP/M and MS-DOS.
Today, the CR LF line break is still widely used in the Windows environment.
LF, which stands for line feed, is a single line feed character. Taken literally, it moves the cursor to the next line without returning to the beginning of the line, thereby feeding a line on the y-axis without touching the x-axis. Systems that use LF as their line break treat it as a complete new line.
We represent the LF line break as \n, or 0x0A in hexadecimal notation.
The LF line break was popularized by the Unix operating system, which uses the ASCII character set. ASCII defines the line feed character as a control character for moving to a new line, and Unix adopted it as the default line break for its text files. Other operating systems, such as Linux and macOS, later followed suit.
Presently, the LF line break is the most common line break type in the modern computing world, especially in non-Windows environments.
CR stands for carriage return. The CR line break moves the cursor to the beginning of the line to signify a line break.
We can also represent the CR line break as \r or 0x0D in hexadecimal notation.
The CR line break is best known from the classic Macintosh operating system, which Apple developed in the 1980s and which used CR as the default line break for its text files. Since Mac OS X, macOS has used LF instead.
As a result, the CR line break is the least common of the three types today and is mostly obsolete.
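Different line break types can cause compatibility issues when transferring text files across different platforms. For example, when a tool that expects LF opens a file with CR LF line breaks, it often shows a stray CR character (displayed as ^M) at the end of each line.
Some general problems may arise due to this behavior, such as scripts that fail to run, text that appears on a single line, and version control diffs that mark every line as changed.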
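Before choosing a fix, it can help to check which line break types a file actually contains. The following is a minimal Python sketch under that assumption. The helper name count_line_breaks is hypothetical, and it works on raw bytes so that nothing is translated before counting:

```python
def count_line_breaks(data: bytes) -> dict:
    """Count CR LF pairs, lone LF, and lone CR characters in raw bytes."""
    crlf = data.count(b"\r\n")
    # Every CR LF pair contains one CR and one LF, so subtract the pairs
    # to get the characters that appear on their own.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    return {"CRLF": crlf, "LF": lf, "CR": cr}


sample = b"first\r\nsecond\nthird\rfourth\r\n"
print(count_line_breaks(sample))   # {'CRLF': 2, 'LF': 1, 'CR': 1}
```

For a real file, we could pass in the result of reading it in binary mode, for example open(path, 'rb').read(). A file that reports more than one non-zero count has mixed line breaks.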
However, there are some strategies we can use to handle line break differences when dealing with text files across different operating systems or applications.
One strategy is to convert the line breaks using a tool or a script. For instance, we can use the dos2unix and unix2dos commands to convert between CR LF and LF line breaks on Linux systems:
$ dos2unix hello.txt
This command replaces the CR LF line breaks with LF line breaks in the hello.txt file, modifying the file in place.
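Where these commands aren't installed, a script can do a similar job. Below is a rough Python sketch of the same idea. The function names to_lf and to_crlf are hypothetical, and this isn't a reimplementation of dos2unix or unix2dos, which handle more cases:

```python
def to_lf(data: bytes) -> bytes:
    """Replace every CR LF pair with a single LF."""
    return data.replace(b"\r\n", b"\n")


def to_crlf(data: bytes) -> bytes:
    """Replace every LF with CR LF."""
    # Normalize first so that existing CR LF pairs don't become CR CR LF.
    return to_lf(data).replace(b"\n", b"\r\n")


windows_text = b"hello\r\nworld\r\n"
unix_text = to_lf(windows_text)

print(unix_text)            # b'hello\nworld\n'
print(to_crlf(unix_text))   # b'hello\r\nworld\r\n'
```

Working on bytes rather than text avoids any implicit newline translation by the runtime while we convert.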
We can also use the tr command to translate between different line break types:
$ tr '\r' '\n' < input.txt > output.txt
In this case, we replace all CR line breaks in the input.txt file with LF line breaks and write the result to output.txt.
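A one-to-one translation like this suits files that use CR alone. As a hedged illustration, the Python sketch below mirrors the same character mapping with bytes.translate and shows what happens if the input uses CR LF instead. The normalize_to_lf helper is a hypothetical name for an order-sensitive alternative:

```python
# A one-to-one table that maps CR to LF, mirroring tr '\r' '\n'.
cr_to_lf = bytes.maketrans(b"\r", b"\n")

old_mac = b"one\rtwo\r"
print(old_mac.translate(cr_to_lf))   # b'one\ntwo\n'

# With CR LF input, each pair turns into two LFs, which adds blank lines.
windows = b"one\r\ntwo\r\n"
print(windows.translate(cr_to_lf))   # b'one\n\ntwo\n\n'


def normalize_to_lf(data: bytes) -> bytes:
    """Convert CR LF first, then any remaining lone CR, to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


print(normalize_to_lf(windows))      # b'one\ntwo\n'
```

Replacing the two-character sequence before the single character is what keeps CR LF pairs from being counted twice.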
Some platforms and applications also have settings or options that enable the user to specify the line break type to use or to accept.
For instance, we can use the fileformat option in Vim to set or detect the line break type of a text file:
:set fileformat=unix
This command sets the line break type to Unix-style (LF). Notably, the fileformat option also accepts dos for CR LF and mac for CR.
This way, we can change it according to the target platform.
As another example, we can use the newline parameter in the open() function in Python to control the line break type when reading or writing a text file:
# Writing to a file with Windows-style line endings
with open('output_windows.txt', 'w', newline='\r\n') as file:
    file.write("This is a line.\nAnd this is another line.")
Here, we set the newline parameter to \r\n for Windows-style line endings, so each \n we write is translated to \r\n in the file. When reading a file, we can also use this parameter to control how universal newline mode is handled.
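To illustrate the reading side, the sketch below writes a small temporary file with mixed line breaks and reads it back twice. The file contents and variable names are made up for the example:

```python
import os
import tempfile

# Create a small file with mixed line breaks, written as raw bytes.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"one\r\ntwo\rthree\n")
    path = tmp.name

# newline=None (the default): universal newlines, every ending is read as "\n".
with open(path, "r", newline=None) as file:
    translated = file.read()

# newline="": line endings are passed through to the caller untouched.
with open(path, "r", newline="") as file:
    untranslated = file.read()

os.remove(path)

print(repr(translated))     # 'one\ntwo\nthree\n'
print(repr(untranslated))   # 'one\r\ntwo\rthree\n'
```

The default is convenient when we only care about the text, while newline='' is useful when we need to preserve or inspect the original line breaks.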
In this article, we explored the difference between the CR LF, LF, and CR line break types by looking at their definitions and historical background.
In conclusion, differing line breaks can cause compatibility issues when dealing with text files across different applications. Therefore, it’s important to be aware of the line break type of a text file and to use appropriate strategies for handling line break differences.