1. Introduction
When configuring software for compilation or simply running a script, we might sometimes encounter relatively cryptic error messages:
- /bin/sh^M: bad interpreter
- /bin/bash^M: bad interpreter
- syntax error: unexpected end of file
- -bash: $'sleep\r': command not found
In this tutorial, we’ll tackle line endings and errors that we might see when they are incorrect. First, we discuss the general concept of how lines end. Next, we explore different problems that line endings can cause. After that, we’ll talk about a specific type of script that can exhibit line-end issues. Finally, we delve into ways to make sure a file uses given characters as newlines.
For brevity and clarity, when the characters would otherwise be hidden, we often use <CR> to represent a carriage return and <LF> for the line feed character as per the standard. This is done in every context, including code snippets. Also, we use some terms interchangeably:
- line ending
- line break
2. Line Endings
How a file line ends isn’t universal. Moreover, we can ignore the concept of lines altogether, instead treating files as binary, i.e., a sequence of bytes.
When it comes to text files with separate lines, Microsoft Windows, Apple macOS, and Linux differ in the default line endings that each operating system (OS) expects:
- Microsoft Windows – <CR><LF>
- Apple macOS – initially <CR>, currently <LF>
- Linux – <LF>
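One way to see these differences is to dump the raw bytes of a file. The following is a minimal sketch, assuming a POSIX shell with printf, od, and wc available; the file names lf.txt and crlf.txt are hypothetical:

```shell
# Work in a scratch directory (hypothetical file names)
workdir=$(mktemp -d)

# Write the same line with a Linux-style and a Windows-style ending
printf 'line\n' > "$workdir/lf.txt"
printf 'line\r\n' > "$workdir/crlf.txt"

# od -c shows every byte, so \r and \n become visible
od -c "$workdir/lf.txt"
od -c "$workdir/crlf.txt"

# The <CR><LF> file is one byte longer for each line it contains
lf_size=$(( $(wc -c < "$workdir/lf.txt") ))
crlf_size=$(( $(wc -c < "$workdir/crlf.txt") ))
echo "LF: $lf_size bytes, CRLF: $crlf_size bytes"
```

Because od prints the characters as escape sequences, this kind of check works even when a terminal or editor hides the <CR>.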
Further, since some software products perform automatic conversions between these options, we can end up with different issues.
3. Problems With Line Endings
While many applications support all ways to end a line, some formats and interpreters are sensitive to the exact characters and might not work without the correct ending.
3.1. Source Code File Newlines
Also called tokens, character sequences in any source code should adhere to the syntax of their language.
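So, depending on the source file and how it’s used, we might encounter different problems stemming from its line endings. For instance, the cross-reference (xref) table of a PDF file records byte offsets, so the same content saved with different line endings ends up with different offsets: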
$ cat lf.pdf
[...]
xref
0 5
0000000000 65535 f
0000000018 00000 n
0000000077 00000 n
0000000178 00000 n
0000000457 00000 n
[...]
$ cat crlf.pdf
[...]
xref
0 5
0000000000 65535 f
0000000021 00000 n
0000000086 00000 n
0000000195 00000 n
0000000490 00000 n
[...]
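The shift in offsets can be reproduced with any multi-line file. As a rough sketch, assuming a POSIX shell and hypothetical file names and contents, we can count the bytes that precede the third line of a file:

```shell
workdir=$(mktemp -d)

# Same three lines, once with <LF> and once with <CR><LF> endings
printf 'one\ntwo\nthree\n' > "$workdir/lf.txt"
printf 'one\r\ntwo\r\nthree\r\n' > "$workdir/crlf.txt"

# Byte offset of line 3 = total size of lines 1 and 2, endings included
lf_offset=$(( $(head -n 2 "$workdir/lf.txt" | wc -c) ))
crlf_offset=$(( $(head -n 2 "$workdir/crlf.txt" | wc -c) ))

echo "LF offset: $lf_offset, CRLF offset: $crlf_offset"
```

Each preceding line contributes one extra byte in the <CR><LF> variant, which is consistent with every non-zero offset in crlf.pdf being larger than its counterpart in lf.pdf.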
3.2. Shell Newlines
Critically, attempting to mix and match line endings in a command, script, or a sensitive file can result in issues.
For instance, shells have strict rules when it comes to the type and order of character sequences:
$ test<LF>
$
To demonstrate, let’s insert a <CR> before the <LF>, converting the line ending to another format:
$ test<CR><LF>
-bash: $'test\r': command not found
By using Ctrl-v Ctrl-m, we append <CR> to test. After that, we can either press Return or leverage Ctrl-m to add the actual terminator.
Of course, the shell only accepts the command line once it reads <LF>, and <LF> is the only character it strips from the sequence before interpreting the rest as a command. Thus, Bash essentially tries and fails to execute the command test<CR>, which is similar to attempting to run testRANDOMCHARACTERS.
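The same failure can be reproduced without typing control characters. The following small sketch, which assumes bash is on the PATH, pipes a command line with each ending into a new shell and records the exit status:

```shell
# Feed one command line to a fresh Bash; true is a builtin that always succeeds
lf_status=0
printf 'true\n' | bash || lf_status=$?

# Same command, but with <CR> before <LF>: Bash looks for a command named true<CR>
crlf_status=0
printf 'true\r\n' | bash 2>/dev/null || crlf_status=$?

# 127 is the conventional shell status for "command not found"
echo "LF: $lf_status, CRLF: $crlf_status"
```

Here, the error message is sent to /dev/null, so only the exit statuses show the difference between the two line endings.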
3.3. Shell Script Newlines
Of course, what applies to the shell is also valid for shell scripts. In addition, scripts have some unique constructs that can pose further issues.
One of these is the shebang, which hints at the interpreter of the source file:
$ cat script.sh
#!/bin/bash<CR><LF>
test<CR><LF>
$ chmod +x script.sh
$ ./script.sh
-bash: ./script.sh: /bin/bash^M: bad interpreter: No such file or directory
Here, we see an error, which includes the special character notation ^M instead of <CR>.
The error itself states that /bin/bash<CR> is a bad interpreter, which it is, similar to test<CR>. In this case, the problem is in the #!/bin/bash shebang, which is only activated when running an executable script directly but is otherwise ignored:
$ cat script.sh
#!/bin/bash<CR><LF>
test<CR><LF>
$ bash script.sh
script.sh: line 2: $'test\r': command not found
Notably, we don’t see a shebang problem here.
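To compare both ways of running a script side by side, we can generate a file with <CR><LF> endings and capture the exit status of each invocation. This is only a sketch: it assumes bash is on the PATH and the temporary directory allows executing files, and the script name is hypothetical:

```shell
workdir=$(mktemp -d)
script="$workdir/script.sh"   # hypothetical script name

# Shebang and body both end in <CR><LF>
printf '#!/bin/bash\r\ntrue\r\n' > "$script"
chmod +x "$script"

# Direct execution: the kernel looks for an interpreter named /bin/bash<CR>
direct_status=0
"$script" 2>/dev/null || direct_status=$?

# Interpreter as the command: the shebang is only a comment here,
# but line 2 still fails because of its trailing <CR>
via_status=0
bash "$script" 2>/dev/null || via_status=$?

echo "direct: $direct_status, via bash: $via_status"
```

Both runs fail, but for different reasons: the first never starts an interpreter, while the second gets past the shebang and stops at the first real command.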
Let’s see a common scenario where errors like the above arise.
4. Autotools Configure Scripts
Many source packages ship with a configure script, which prepares the build for the current system. Typically, such a script performs several tasks:
- detect available resources
- set executable paths
- identify library locations
- configure linking
Since it’s a shell script produced by Autoconf, configure is susceptible to all issues with newlines we already discussed:
$ cat configure
#! /bin/sh<CR><LF>
[...]
# Define the name and version of the package.
PACKAGE_NAME='package'<CR><LF>
[...]
$ ./configure
-bash: ./configure: /bin/sh^M: bad interpreter: No such file or directory
Notably, the shebang of configure causes an error due to unexpected line endings. However, the rest of the file might not raise any errors, as it mainly, if not exclusively, consists of variable assignments, and a trailing <CR> simply becomes part of the assigned value.
Critically, this is a double-edged sword, as not erroring out doesn’t mean we won’t get issues down the line when those values are used.
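To see how an assignment can succeed while still carrying a problem, we can source a one-line fragment with a <CR><LF> ending. This is a minimal sketch that assumes a POSIX shell; the file name vars.sh is hypothetical, and the assignment mirrors the PACKAGE_NAME line from the configure excerpt:

```shell
workdir=$(mktemp -d)
printf "PACKAGE_NAME='package'\r\n" > "$workdir/vars.sh"

# Sourcing succeeds: an assignment with a trailing <CR> is still valid syntax
. "$workdir/vars.sh"
source_status=$?

# ...but the <CR> is now the last character of the value
echo "status: $source_status, length: ${#PACKAGE_NAME}"

# od -c makes the stray \r visible
printf '%s' "$PACKAGE_NAME" | od -c
```

The reported length is one more than the seven letters of package, so any later comparison or path built from this variable includes the invisible character.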
5. Fixing Line Endings
Generally, if a language or script expects given line endings, we should use them. Still, there are many options to detect and convert line breaks in files:
- sed with a simple substitution regular expression (regex) like 's/\r//'
- tr with -d and the unwanted character, e.g., '\r'
- awk with the gsub() function like 'gsub(/\r/,"")'
- perl with an -e one-liner like 's/\r//'
- dos2unix for simple <CR><LF> to <LF> conversion
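As an illustration of the detect-then-convert flow, the sketch below uses only grep and tr from the options above. The helper names has_cr and strip_cr, as well as the demo file, are hypothetical and not part of any standard tool:

```shell
# A literal carriage return, kept in a variable for portability
cr=$(printf '\r')

# Succeeds if the file contains at least one <CR> (hypothetical helper)
has_cr() {
    grep -q "$cr" "$1"
}

# Rewrites the file without any <CR> characters (hypothetical helper).
# Note that tr -d removes every <CR>, not only those right before <LF>.
strip_cr() {
    tr -d '\r' < "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

workdir=$(mktemp -d)
demo="$workdir/demo.sh"
printf '#!/bin/sh\r\necho ok\r\n' > "$demo"

# Convert only when a <CR> is actually present
if has_cr "$demo"; then
    strip_cr "$demo"
fi

# After the conversion, the script runs cleanly
sh "$demo"
```

Since tr can’t edit a file in place, the helper writes to a temporary file and then replaces the original, which also resets the file permissions to the defaults of the new file.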
In addition, we can avoid the shebang issues if we run a script as an argument of its interpreter instead of directly. This fact is especially important for configure scripts:
$ ./configure
-bash: ./configure: /bin/sh^M: bad interpreter: No such file or directory
$ bash configure
$
6. Summary
In this article, we talked about newlines, how they can affect file processing, and what we can do about it.
In conclusion, having the correct line endings is often critical for a file to properly display or run, especially when it comes to shell scripts.