1. Introduction

Whitespace can be critical when formatting both regular text and source code. Knowing how to efficiently insert and remove it when editing files often saves time and prevents errors.

In this tutorial, we explore ways of removing leading and trailing whitespace from all lines in a file with the Vi editor. First, we discuss what whitespace is and why we need it. Next, we look at a way for Vi to show us spacing characters, particularly at the beginning and end of a line. After that, we check common commands for stripping leading and trailing whitespace. Finally, we turn to an advanced and more versatile method to do the same.

For brevity, we use vi (Vi) when referencing both the Vi and Vim editors.

We tested the code in this tutorial on Debian 11 (Bullseye) with GNU Bash 5.1.4. It should work in most POSIX-compliant environments.

2. Whitespace

Whitespace comprises several ASCII characters:

  • <TAB>, horizontal tabulation
  • <LF>, line feed (newline)
  • <VT>, vertical tab
  • <FF>, form feed (new page)
  • <CR>, carriage return
  • Space

Of these, the <CR> and <LF> characters usually appear at the end of a line, where they form the end-of-line (EOL) marker, either alone or in combination. EOL, <TAB>, and Space are perhaps the most commonly used separators.

Naturally, whitespace is vital even for this article. Without whitespace, syntactic elements of the language are hard or impossible to distinguish. In natural languages, words are separated by spaces, while paragraphs often start with tabs after a new line. Programming languages have even more elements, i.e., tokens.

Because of this, many editors, integrated development environments, and programming languages provide built-in functions that handle whitespace around lines:

  • strip leading whitespace
  • strip trailing whitespace
  • strip surrounding whitespace
  • insert leading whitespace

Often, we strip spaces and tabs to ensure proper comparisons. For instance, the lines “ text content ” and “text content” are not identical unless we omit the leading and trailing spaces. While stripping can be useful on both ends of a line, insertion is mostly done at the beginning in the form of indentation.
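
The same stripping operations exist outside the editor as well. As a small illustration, here is a Python sketch of the comparison problem. Python isn't part of the tested setup, and the variable names are only for demonstration:

```python
padded = " text content "
plain = "text content"

# The raw strings differ because of the surrounding spaces
print(padded == plain)           # False

# strip() removes leading and trailing whitespace
print(padded.strip() == plain)   # True

# lstrip() and rstrip() handle one side each
print(repr(padded.lstrip()))     # 'text content '
print(repr(padded.rstrip()))     # ' text content'
```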

The Vi editor is famous for its versatility while still being part of both the SUS and POSIX standards.

Thus, among other functions, the editor provides flexible ways of showing and manipulating whitespace.

3. Identify Whitespace in Vi

The Vi editor has a list mode, which can replace multiple special and meta characters with visual alternatives:

  • EOL
  • <TAB>, horizontal tabulation
  • leading spaces
  • trailing spaces
  • Space

The replacement characters can potentially be colored depending on the editor’s settings.

To configure list mode, we use the listchars option. Let’s check its initial value:

:set listchars?
[...]
listchars=eol:$

By default, we would only see $ in place of EOL when using list mode. For example, we can have the following text:

   This line has leading spaces.
The second line has trailing spaces and a Tab character.    
     A third line with both leading and trailing spaces.     
The final line has two Tab characters here:		.

Let’s enable list mode:

:set list

Now, we see the following:

   This line has leading spaces.$
The second line has trailing spaces and a Tab character.    	$
     A third line with both leading and trailing spaces.     $
The final line has two Tab characters here:		.$

Each line end is now marked with a colored $. To see all whitespace characters, we can add some more transformations to listchars:

:set listchars=lead:<,eol:$,trail:>,tab:]-,space:&

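In the line above, each type of whitespace is followed by a : colon and the character that replaces it on screen. The tab item takes two characters: the first marks the start of the tab, and the second fills the remaining width. Here are the listchars items we set: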

  • space: non-leading and non-trailing spaces
  • tab: tab characters
  • trail: trailing spaces (overwrites the space character)
  • eol: end of line, works for both <CR> and <LF>, depending on the context
  • lead: leading spaces (overwrites the space character)

Note that lead may not be available in all versions of vi. Its availability depends on the editor version, since it’s a comparatively recent addition to Vim. After the transformation, our text looks like this:

<<<This&line&has&leading&spaces.$
The&second&line&has&trailing&spaces&and&a&Tab&character.>>>>]---$
<<<<<A&third&line&with&both&leading&and&trailing&spaces.>>>>>$
The&final&line&has&two&Tab&characters&here:]----]-------.$

Now that we’ve identified all spots with leading and trailing whitespace, we can move on to manipulation.
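
To make the rendering rules concrete, here is a rough Python sketch that imitates the listchars setting above. The render_list helper is hypothetical and simplified: it assumes a tab stop of 8 and treats only spaces at the very start of a line as leading:

```python
def render_list(line, tabstop=8):
    """Imitate list mode with lead:<,eol:$,trail:>,tab:]-,space:& (simplified)."""
    lead_end = len(line) - len(line.lstrip(" "))   # index after the leading spaces
    trail_start = len(line.rstrip(" \t"))          # index where trailing blanks begin
    out = []
    col = 0                                        # current screen column
    for i, ch in enumerate(line):
        if ch == "\t":
            width = tabstop - col % tabstop        # columns up to the next tab stop
            out.append("]" + "-" * (width - 1))
            col += width
            continue
        if ch == " ":
            if i < lead_end:
                ch = "<"                           # lead
            elif i >= trail_start:
                ch = ">"                           # trail
            else:
                ch = "&"                           # space
        out.append(ch)
        col += 1
    return "".join(out) + "$"                      # eol


print(render_list("   This line has leading spaces."))
print(render_list("The final line has two Tab characters here:\t\t."))
```

The two calls print the first and last lines of the listing above, which shows why the first tab only occupies five columns while the second one fills all eight.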

4. Removing Leading and Trailing Whitespace in Vi

An easy way to remove leading whitespace is the :le[ft] command. In combination with the % percent prefix to work on the whole buffer, we force left-alignment of the current file:

:%le

However, there’s no direct equivalent for trailing whitespace.

Still, the :g[lobal] command can help:

:g/\s$/norm $diw

Let’s see how it works. First, we match all lines with the simple \s$ pattern (surrounded with /), where $ marks an EOL and \s matches either of the following:

  • Space
  • <TAB>

Unlike in some other regex flavors, \s in Vi doesn’t match <CR>, <LF>, or <FF>.

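For each line that matches, we execute the normal mode instruction $diw, i.e., go to the last character of the line ($) and delete the inner word (diw). Since the cursor is on whitespace at that point, the inner word is the whole trailing run of whitespace.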

Similarly, for leading whitespace, we can use :g/^\s/norm 0diw, where ^ marks the beginning of the line, and 0 goes there before deleting a word.
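
The select-then-act logic of :g can be sketched outside the editor as well. The following Python snippet is only an analogy, with a hypothetical g_strip_trailing helper and made-up sample lines:

```python
import re


def g_strip_trailing(lines):
    """Select lines that end in a blank, then delete that final run of blanks."""
    result = []
    for line in lines:
        if re.search(r"[ \t]$", line):      # selection step, like :g/\s$/
            line = line.rstrip(" \t")       # action step, like norm $diw
        result.append(line)
    return result


print(g_strip_trailing(["keep", "trail  ", "\tboth \t", ""]))
# ['keep', 'trail', '\tboth', '']
```

Lines that don't match the pattern are never touched, which is the main difference between :g and a command that runs on every line of the range.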

5. Regular Expressions for Substitution in Vi

One of the most common methods of substitution is the use of regular expressions (regex) via the basic :s[ubstitute] command:

:%s/\s\+$//e

Here, we replace each string of one or more whitespace characters (\s\+) followed by an EOL ($) in the current buffer with an empty string. The e flag suppresses the error message that otherwise appears when the pattern doesn’t match anything. To be precise, the substitution happens at most once per line, but since a line only has one EOL, that’s all we need.

Of course, using a Space or Tab (\t) alone is also possible but less universal. The $ EOL marker prevents the trimming of the newline character since it does not get included in the match to replace.
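
For comparison, here is how the same trailing-whitespace substitution might look in Python. This is a sketch with a made-up sample string. It uses the explicit [ \t] class on purpose, because Python's \s also matches newlines and could merge lines:

```python
import re

text = "first  \nsecond\t \n"

# With re.MULTILINE, $ matches just before each newline, so the newline
# itself is never part of the match and stays in place
stripped = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)

print(repr(stripped))   # 'first\nsecond\n'
```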

Naturally, we can adapt the regex above for leading whitespace: :%s/^\s\+//e. To do both, we have to add some more complexity:

:%s/\(^\s\+\)\|\(\s\+$\)//ge

In this version, we use the alternation operator \| to allow both leading and trailing whitespace, each within \( parentheses \). Importantly, we added the g global flag, which ensures each line is processed for all matches.

Let’s take our example from earlier:

<<<This&line&has&leading&spaces.$
The&second&line&has&trailing&spaces&and&a&Tab&character.>>>>]---$
<<<<<A&third&line&with&both&leading&and&trailing&spaces.>>>>>$
The&final&line&has&two&Tab&characters&here:]----]-------.$

Now, let’s substitute with :%s/\(^\s\+\)\|\(\s\+$\)//ge:

This&line&has&leading&spaces.$
The&second&line&has&trailing&spaces&and&a&Tab&character.$
A&third&line&with&both&leading&and&trailing&spaces.$
The&final&line&has&two&Tab&characters&here:]----]-------.$
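
As a cross-check outside the editor, the alternation can be reproduced with Python's re module. This is only a sketch that uses Python's regex syntax rather than Vi's, with the sample text from earlier:

```python
import re

text = (
    "   This line has leading spaces.\n"
    "The second line has trailing spaces and a Tab character.    \t\n"
    "     A third line with both leading and trailing spaces.     \n"
    "The final line has two Tab characters here:\t\t.\n"
)

# ^[ \t]+ handles the leading run, [ \t]+$ the trailing run;
# re.sub() replaces every match, similar to the g flag
cleaned = re.sub(r"^[ \t]+|[ \t]+$", "", text, flags=re.MULTILINE)

print(cleaned, end="")
```

The two tab characters inside the last line survive, since they are neither at the start nor at the end of the line.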

Note how all non-leading and non-trailing whitespace is preserved. As a side note, we can also leave lines that consist only of whitespace untouched by requiring a non-whitespace character (\S) next to the match via a positive lookahead (\@=) and a positive lookbehind (\@<=): :%s/\(^\s\+\S\@=\)\|\(\S\@<=\s\+$\)//ge.
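
The effect of the lookaround conditions is easier to see on a small input. In Python syntax, the lookahead becomes (?=\S) and the lookbehind (?<=\S). The sample string below is made up for this sketch:

```python
import re

text = "  padded  \n   \nplain\n"

# Strip leading blanks only when a non-blank follows, and trailing blanks
# only when a non-blank precedes them
pattern = r"^[ \t]+(?=\S)|(?<=\S)[ \t]+$"
cleaned = re.sub(pattern, "", text, flags=re.MULTILINE)

print(repr(cleaned))   # 'padded\n   \nplain\n'
```

The second line, which contains nothing but three spaces, passes through unchanged.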

Since the Vi regular expression engine is custom, we can see how it compares with, e.g., Perl Compatible Regular Expressions (PCRE) via :help perl-patterns.

As usual, we can wrap commands in Vi functions or simply automate them during saving:

autocmd BufWritePre *.c,*.pl :%s/\S\@<=\s\+$//ge

In this case, we used an autocmd to remove trailing spaces before writing a buffer (BufWritePre) to a C or Perl file.
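
Conceptually, the autocmd is a hook that filters the buffer right before it reaches the disk, but only for matching file names. A rough Python analogy, with a hypothetical write_buffer function and suffix set, might look like this:

```python
import re
from pathlib import Path

# Assumption for this sketch: the same file types as in the autocmd above
STRIP_SUFFIXES = {".c", ".pl"}


def write_buffer(path, text):
    """Write text to path, stripping trailing blanks first for selected file types."""
    path = Path(path)
    if path.suffix in STRIP_SUFFIXES:
        # Same idea as \S\@<=\s\+$ : trailing blanks preceded by a non-blank
        text = re.sub(r"(?<=\S)[ \t]+$", "", text, flags=re.MULTILINE)
    path.write_text(text)
```

Files with other extensions are written exactly as they are, just as the autocmd pattern leaves other buffers alone.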

6. Summary

In this article, we discussed whitespace and how to identify it in the Vi editor. Then, we explored methods to remove it from the beginning and end of each line in a file.

In conclusion, as with most actions in the flexible Vi editor, there are many options to find and remove leading and trailing whitespace.
