Authors Top

If you have a few years of experience in the Linux ecosystem, and you’re interested in sharing that experience with the community, have a look at our Contribution Guidelines.

1. Introduction

When configuring software for compilation or simply running any script, we might sometimes encounter errors with a relatively cryptic text:

  • /bin/sh^M: bad interpreter
  • /bin/bash^M: bad interpreter
  • syntax error: unexpected end of file
  • -bash: $'sleep\r': command not found

In this tutorial, we’ll tackle line endings and errors that we might see when they are incorrect. First, we discuss the general concept of how lines end. Next, we explore different problems that line endings can cause. After that, we’ll talk about a specific type of script that can exhibit line-end issues. Finally, we delve into ways to make sure a file uses given characters as newlines.

For brevity and clarity, when the characters would otherwise be hidden, we often use <CR> to represent a carriage return and <LF> for the line feed character as per the standard. This is done in every context, including code snippets. Also, we use some terms interchangeably:

  • line ending
  • line break
  • newline

We tested the code in this tutorial on Debian 11 (Bullseye) with GNU Bash 5.1.4. It should work in most POSIX-compliant environments.

2. Line Endings

How a line in a file ends isn’t universal. Moreover, we can ignore the concept of lines altogether and instead treat files as binary, i.e., as a sequence of bytes.

When it comes to text files with separate lines, Microsoft Windows, Apple macOS, and Linux differ in the default line endings that each operating system (OS) expects:

  • Microsoft Windows – <CR><LF>
  • Apple macOS – currently <LF>, while the classic Mac OS (up to version 9) used <CR>
  • Linux – <LF>

Further, since some software products perform automatic conversions between these options, we can end up with different issues.
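To check which convention a given file follows, we can look at its raw bytes. The following is a minimal sketch that assumes two sample files, lf.txt and crlf.txt, which we create ourselves in a temporary directory purely for illustration:

```shell
# Create two small sample files (hypothetical names) with known line endings.
workdir=$(mktemp -d)
printf 'first\nsecond\n' > "$workdir/lf.txt"
printf 'first\r\nsecond\r\n' > "$workdir/crlf.txt"

# od -c prints every byte, so \r becomes visible before each \n.
od -c "$workdir/crlf.txt"

# Count the lines that end in a carriage return.
# grep -c exits with 1 when the count is 0, hence the "|| true".
cr=$(printf '\r')
lf_hits=$(grep -c "${cr}\$" "$workdir/lf.txt" || true)
crlf_hits=$(grep -c "${cr}\$" "$workdir/crlf.txt" || true)
echo "lf.txt: $lf_hits, crlf.txt: $crlf_hits"   # lf.txt: 0, crlf.txt: 2
```

A non-zero count for a file that should be used on Linux is usually a sign that it was saved or transferred with <CR><LF> line endings.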

3. Problems With Line Endings

While many applications support all ways to end a line, some sensitive texts might not work without the correct ending.

3.1. Source Code File Newlines

Character sequences in source code, also called tokens, must adhere to the syntax of their language.

So, depending on the source file and how it’s used, we might encounter different problems stemming from its line endings.

On the one hand, some file types like the Portable Document Format (PDF) can work with any version of a newline, but we have to take the number of bytes into account for, e.g., the xref table:

$ cat lf.pdf
[...]
xref
0 5
0000000000 65535 f
0000000018 00000 n
0000000077 00000 n
0000000178 00000 n
0000000457 00000 n
[...]
$ cat crlf.pdf
[...]
xref
0 5
0000000000 65535 f
0000000021 00000 n
0000000086 00000 n
0000000195 00000 n
0000000490 00000 n
[...]

On the other hand, specific line breaks are sometimes a must as most interpreters and compilers need to know what to expect for a newline sequence in files.
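Returning to the byte offsets in the PDF example, the differences come from the fact that every <CR><LF> line break takes one byte more than a plain <LF>. As a rough sketch, we can count the bytes of the same two lines of text with each kind of ending:

```shell
# Two lines of text, first with <LF> and then with <CR><LF> endings.
# The arithmetic expansion strips any padding that wc might print.
lf_bytes=$(( $(printf 'xref\n0 5\n' | wc -c) ))
crlf_bytes=$(( $(printf 'xref\r\n0 5\r\n' | wc -c) ))

echo "LF: $lf_bytes bytes, CRLF: $crlf_bytes bytes"   # LF: 9 bytes, CRLF: 11 bytes
```

The same effect accumulates over a whole file. In the listings above, the second entry moves from offset 18 to offset 21, which would be consistent with three line breaks preceding that object, although we can't confirm this without the elided parts of the file.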

3.2. Shell Newlines

Critically, attempting to mix and match line endings in a command, script, or a sensitive file can result in issues.

For instance, shells have strict rules when it comes to the type and order of character sequences:

$ test<LF>
$

Notably, the <LF> character is essential in this simple Bash test command example, as that’s what the shell expects.

To demonstrate, let’s insert a <CR> before the <LF>, converting the line ending to another format:

$ test<CR><LF>
-bash: $'test\r': command not found

Here, we type test and then use Ctrl-v Ctrl-m to insert a literal <CR> after it. After that, we can either press Return or leverage Ctrl-m to terminate the line.

Of course, the command line is only accepted after <LF>, and that’s the only character omitted from the sequence, which the shell interprets as a command. Thus, Bash essentially tries and fails to execute the command test<CR>, which is similar to attempting to run testRANDOMCHARACTERS.
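We can reproduce the same behavior without any special key combinations. The sketch below is only an illustration: it uses true instead of test so that the clean command has a predictable exit status, and it runs the command through sh -c, assuming a POSIX shell is available as sh:

```shell
# Build a command name that ends in a carriage return.
# Command substitution only strips trailing <LF> characters, so the <CR> survives.
cmd=$(printf 'true\r')

# The clean command succeeds.
clean_status=0
sh -c 'true' || clean_status=$?

# The shell looks for a command literally named "true<CR>" and doesn't find one.
cr_status=0
sh -c "$cmd" 2>/dev/null || cr_status=$?

echo "clean: $clean_status, with <CR>: $cr_status"   # clean: 0, with <CR>: 127
```

An exit status of 127 is the conventional value that shells return when a command isn't found.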

3.3. Shell Script Newlines

Of course, what applies to the shell is also valid for shell scripts. In addition, scripts have some unique constructs that can pose further issues.

One of these is the shebang, which hints at the interpreter of the source file:

$ cat script.sh
#!/bin/bash<CR><LF>
test<CR><LF>
$ chmod +x script.sh
$ ./script.sh
-bash: ./script.sh: /bin/bash^M: bad interpreter: No such file or directory

Here, we see an error that shows <CR> in caret notation as ^M, i.e., Ctrl-m.

The error itself states that /bin/bash<CR> is a bad interpreter, which it is, similar to test<CR>. In this case, the problem is in the #!/bin/bash shebang, which the kernel only processes when we run an executable script directly. Otherwise, the interpreter treats the line as a comment and ignores it:

$ cat script.sh
#!/bin/bash<CR><LF>
test<CR><LF>
$ bash script.sh
script.sh: line 2: $'test\r': command not found

Notably, we don’t see a shebang problem here.
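To compare both ways of running a script side by side, we can generate a small file with <CR><LF> endings. This is a hedged sketch: the script name and its echo hello body are made up for the example, and the exact status of the direct run can differ between shells, so we only check that it's non-zero:

```shell
# Create an executable script (hypothetical) with <CR><LF> line endings.
workdir=$(mktemp -d)
printf '#!/bin/sh\r\necho hello\r\n' > "$workdir/script.sh"
chmod +x "$workdir/script.sh"

# Direct execution: the kernel searches for an interpreter named "/bin/sh<CR>".
direct_status=0
"$workdir/script.sh" 2>/dev/null || direct_status=$?
echo "direct run exit status: $direct_status"   # non-zero

# Interpreter as argument: line 1 is only a comment, so no shebang is parsed.
out=$(sh "$workdir/script.sh")
echo "output length: ${#out}"   # 6: "hello" plus the <CR> that echo received
```

Even though the second run succeeds, the <CR> doesn't disappear. Instead, it becomes part of the argument that echo prints.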

Let’s see a common scenario where errors like the above arise.

4. Autotools Configure Scripts

The Autotools configure script mechanism is mainly used to prepare code for compilation on a given platform:

  • detect available resources
  • set executable paths
  • identify library locations
  • configure linking

Often, we get configure scripts as part of a source package, which we’ve downloaded. In some cases, like when using Git or FTP, the downloads themselves can convert newlines.
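For Git specifically, one way to prevent such conversions is a .gitattributes file in the repository. The snippet below is a sketch rather than a recommendation for every project: the repository path is a temporary directory created for the example, and the file patterns are assumptions about what a project might contain:

```shell
# Hypothetical repository directory, used for illustration only.
repo=$(mktemp -d)

# Ask Git to check out the listed files with <LF> endings on every platform.
cat > "$repo/.gitattributes" <<'EOF'
configure text eol=lf
*.sh      text eol=lf
EOF

cat "$repo/.gitattributes"
```

With these attributes in place, Git should keep <LF> in the working tree for the matching files, regardless of the core.autocrlf setting on the machine that performs the checkout.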

Since it’s a shell script produced by Autoconf, configure is susceptible to all issues with newlines we already discussed:

$ cat configure
#! /bin/sh<CR><LF>
[...]
# Define the name and version of the package.
PACKAGE_NAME='package'<CR><LF>
[...]
$ ./configure
-bash: ./configure: /bin/sh^M: bad interpreter: No such file or directory

Notably, the shebang of configure causes an error due to unexpected line endings. However, other parts of the file, such as variable assignments, might not raise any errors, since the shell silently treats the trailing <CR> as part of the assigned value.

Critically, this is a double-edged sword, as not erroring out doesn’t mean we won’t get issues down the line.
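To see how such a silent problem can look, we can source a single assignment that ends in <CR><LF>. This is a minimal sketch with a made-up vars.sh file, which mirrors the PACKAGE_NAME line from the listing above:

```shell
# Write one assignment with a <CR><LF> ending to a hypothetical file.
workdir=$(mktemp -d)
printf "PACKAGE_NAME='package'\r\n" > "$workdir/vars.sh"

# Sourcing the file succeeds without any error message.
. "$workdir/vars.sh"

# The value is one character longer than the visible text.
length=${#PACKAGE_NAME}
echo "length: $length"   # length: 8, i.e., "package" plus the hidden <CR>
```

Any later comparison, path, or file name built from such a variable carries the extra character along, which is why these errors tend to surface far from their cause.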

5. Fixing Line Endings

Generally, if a language or script expects given line endings, we should use them. Still, there are many options to detect and convert line breaks in files:

  • sed with a simple substitution regular expression (regex) like 's/\r//'
  • tr with -d and the unwanted character, e.g., '\r'
  • awk with the gsub() function like 'gsub(/\r/,"")'
  • perl with an -e one-liner like 's/\r//'
  • dos2unix for simple <CR><LF> to <LF> conversion
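As an illustration of two of the options above, the following sketch converts a sample file with tr and with awk, and then verifies the result. The file names are hypothetical, and the awk variant uses sub() with an end-of-line anchor instead of gsub(), so it only strips a <CR> that directly precedes the line break:

```shell
# Prepare a <CR><LF> input file and the <LF>-only result we expect.
workdir=$(mktemp -d)
printf '#!/bin/sh\r\necho hello\r\n' > "$workdir/crlf.sh"
printf '#!/bin/sh\necho hello\n' > "$workdir/expected.sh"

# tr deletes every <CR> byte in the stream.
tr -d '\r' < "$workdir/crlf.sh" > "$workdir/fixed_tr.sh"

# awk removes only a <CR> at the very end of each line.
awk '{ sub(/\r$/, ""); print }' "$workdir/crlf.sh" > "$workdir/fixed_awk.sh"

# cmp exits with 0 only when both files are byte-for-byte identical.
cmp "$workdir/fixed_tr.sh" "$workdir/expected.sh" && echo "tr: converted"
cmp "$workdir/fixed_awk.sh" "$workdir/expected.sh" && echo "awk: converted"
```

Both commands write to a new file instead of changing the input in place, so we can inspect the result before replacing the original.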

In addition, we can avoid the shebang issues if we run a script as an argument of its interpreter instead of directly. This fact is especially important for configure scripts:

$ ./configure
-bash: ./configure: /bin/sh^M: bad interpreter: No such file or directory
$ bash configure
$

Still, to avoid any trouble, using an editor like vi to manually edit or automatically convert line endings can be beneficial.

6. Summary

In this article, we talked about newlines, how they can affect file processing, and what we can do about it.

In conclusion, having the correct line endings is often critical for a file to properly display or run, especially when it comes to shell scripts.
