Last updated: June 20, 2024
Text processing is a routine yet crucial task. Whether it’s inspecting log files to troubleshoot an issue, checking configuration files for specific settings, or evaluating large datasets for important details, efficiently searching and extracting relevant lines can be critical. In such situations, we might need to locate lines containing only one of the multiple specified words.
In this tutorial, we’ll learn how to find lines containing one of multiple words exclusively in Linux.
Before moving forward, let’s ensure we have a sample dataset to demonstrate the different approaches:
$ cat datafile.txt
Hey there, baeldung users
New baeldung lessons
Newly joined authors
Articles from baeldung authors
Old users of the new lessons
This sample file (datafile.txt) contains several lines of text for testing the commands used in the next sections.
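To follow along, the sample file can be recreated with a here-document. This is a minimal sketch that assumes we're working in a writable directory and that overwriting any existing datafile.txt there is acceptable:

```shell
# Recreate the sample file used throughout this tutorial.
# The quoted 'EOF' delimiter keeps the shell from expanding anything in the text.
cat > datafile.txt <<'EOF'
Hey there, baeldung users
New baeldung lessons
Newly joined authors
Articles from baeldung authors
Old users of the new lessons
EOF

# Confirm that the file has the expected five lines.
wc -l < datafile.txt
```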
Let’s suppose we have two words, baeldung and users, and we need to find all lines containing only one of these two words. Specifically, we are looking for lines that contain baeldung but not users (A and not B) or lines that contain users but not baeldung (B and not A), similar to the XOR operator.
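To make the XOR rationale concrete, here's a small sketch in plain shell. The function name only_one and the flag variables are hypothetical, introduced purely for illustration; the tools in the following sections express the same test more compactly:

```shell
# Hypothetical helper: succeed when exactly one of the two words occurs in $1.
only_one() {
    has_a=0
    has_b=0
    case $1 in *baeldung*) has_a=1 ;; esac
    case $1 in *users*) has_b=1 ;; esac
    # For two 0/1 flags, XOR is 1 exactly when the flags differ.
    [ $(( has_a ^ has_b )) -eq 1 ]
}

only_one "New baeldung lessons" && echo "exactly one word"
only_one "Hey there, baeldung users" || echo "both or neither"
```

The first call succeeds because only baeldung is present, while the second fails because the line contains both words.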
We'll apply this rationale using grep, sed, awk, and Perl to find lines containing exactly one of the words.
We can use grep to find lines with only one of two words:
$ grep -E 'baeldung|users' datafile.txt | grep -vE 'baeldung.*users|users.*baeldung'
New baeldung lessons
Articles from baeldung authors
Old users of the new lessons
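The first part of this command selects lines from the datafile.txt file that contain either or both words, while the second part discards lines that contain both words. As a result, only lines containing exactly one of the two words remain.
Let’s break down the regular expressions used in this command: with -E, the pattern baeldung|users matches a line that contains baeldung or users, while baeldung.*users|users.*baeldung matches a line that contains both words in either order. The -v option in the second grep inverts the match, so lines containing both words are dropped.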
Similarly, we can apply the same technique to find one of three or more words exclusively:
$ grep -E 'users|authors|baeldung' datafile.txt | grep -vE 'users.*authors|users.*baeldung|authors.*users|authors.*baeldung|baeldung.*users|baeldung.*authors'
New baeldung lessons
Newly joined authors
Old users of the new lessons
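This command finds all the lines containing exclusively one of the three words (users, authors, and baeldung). As we can see, this method becomes tedious as the list grows, since the second pattern needs every ordered pair of words: six alternatives for three words and twelve for four.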
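One way to avoid writing every pair by hand is to loop over the words and let grep -v exclude all the other words for each one. The following is only a sketch: the helper name one_of_grep is hypothetical, and it assumes at least two words, none of which contains spaces or regex metacharacters:

```shell
# Hypothetical helper: print lines of FILE containing exactly one of the WORDS.
# Usage: one_of_grep FILE WORD1 WORD2 [WORD3...]
one_of_grep() {
    file=$1
    shift
    for w in "$@"; do
        # Collect "-e other" options for every word except the current one.
        excl=""
        for o in "$@"; do
            [ "$o" = "$w" ] || excl="$excl -e $o"
        done
        # Keep lines with the current word, then drop lines with any other word.
        # $excl is intentionally unquoted so it splits into separate options.
        # grep exits non-zero when nothing matches, which isn't an error here.
        grep -e "$w" "$file" | grep -v $excl || true
    done
}

# Demo on a throwaway file holding two of the sample lines.
tmp=$(mktemp)
printf '%s\n' 'Hey there, baeldung users' 'Newly joined authors' > "$tmp"
one_of_grep "$tmp" users authors baeldung
rm -f "$tmp"
```

The demo prints only Newly joined authors. Unlike the single pipeline, this loop groups its output by word rather than preserving the order of lines in the file.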
sed is a command-line stream editor for filtering and transforming text. By default, sed prints each line of the input after processing it. However, we can suppress this behavior and instruct sed to print only lines matching a specific pattern.
For example, we can print only lines that contain exactly one of the words users and baeldung:
$ sed -ne '/users/{/baeldung/! p; d;}' -e '/baeldung/p' datafile.txt
New baeldung lessons
Articles from baeldung authors
Old users of the new lessons
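Let’s take a closer look at the options used in this command: -n suppresses the automatic printing of each line, and each -e adds a script. In the first script, /users/ selects lines containing users; inside the braces, /baeldung/! p prints the line only if it doesn’t contain baeldung, and d then discards the line and starts the next cycle. The second script, /baeldung/p, prints lines containing baeldung.
Because of the d command, the second script comes into play only if the line doesn’t contain the word users.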
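The same two-word logic may be easier to read when the sed script is spread over several lines with comments. This is a sketch of an equivalent script; the wrapper function one_of_two is a hypothetical name used only so the script can be reused on any input:

```shell
# Hypothetical wrapper around the two-word sed script, spread over several lines.
one_of_two() {
    sed -n '
        # Lines containing "users":
        /users/ {
            # print only when "baeldung" is absent
            /baeldung/!p
            # then end this cycle so the rule below never sees the line
            d
        }
        # Reached only by lines without "users": print if "baeldung" is present
        /baeldung/p
    ' "$@"
}

printf '%s\n' 'Hey there, baeldung users' 'New baeldung lessons' | one_of_two
```

Here, the first input line contains both words and is dropped, so only New baeldung lessons is printed. Calling one_of_two datafile.txt applies the same script to the sample file.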
Similarly, we can apply the same method for finding lines with only one of multiple words:
$ sed -ne '/users/{/baeldung/! {/authors/! p; d;}}' -e '/baeldung/{/users/! {/authors/! p; d;}}' -e '/authors/{/users/! {/baeldung/! p; d;}}' datafile.txt
This command finds the lines from datafile.txt that contain exclusively one of three words: users, baeldung, and authors.
Furthermore, we can use extended regular expressions to retrieve lines with exclusively one of multiple words:
$ sed -nE '/user/{/author|baeldung/! p;}; /author/{/user|baeldung/! p;}; /baeldung/{/user|author/! p;}' datafile.txt
New baeldung lessons
Newly joined authors
Old users of the new lessons
This command uses an extended regular expression to merge all the conditions in a single sed script block.
The awk command-line utility executes programs written in the AWK programming language, designed for text processing and data extraction.
Let’s use awk to find lines that exclusively contain one of three specific words, i.e., users, baeldung, and authors:
$ awk '(/baeldung/+/users/+/authors/)==1' < datafile.txt
New baeldung lessons
Newly joined authors
Old users of the new lessons
The AWK script attempts to match each pattern enclosed in forward slashes (/) against the current line. If the pattern is found in the line, the match evaluates to 1 (true); otherwise, it evaluates to 0 (false). The plus sign (+) adds the results of the individual patterns. Finally, ==1 checks whether the sum is one, ensuring that exactly one of the specified patterns is present in the line.
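The counting idea extends to any number of words without editing the awk program itself. Below is a sketch that passes the word list in through -v; the helper name exactly_one and the variable names are hypothetical, and it uses index() for literal substring matching rather than regular expressions:

```shell
# Hypothetical helper: exactly_one "word1 word2 ..." [file...]
exactly_one() {
    words=$1
    shift
    awk -v words="$words" '
        BEGIN { n = split(words, w, " ") }
        {
            hits = 0
            # index() does a literal substring search, not a regex match.
            for (i = 1; i <= n; i++)
                if (index($0, w[i]) > 0)
                    hits++
            # Print the line only when exactly one word was found.
            if (hits == 1)
                print
        }
    ' "$@"
}

printf '%s\n' 'Articles from baeldung authors' 'Newly joined authors' |
    exactly_one "users authors baeldung"
```

The first line contains two of the words and is skipped, so the demo prints Newly joined authors. Running exactly_one "users authors baeldung" datafile.txt applies the same test to the sample file.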
Furthermore, we can also use the bitwise xor() function, which is a GNU awk (gawk) extension:
$ awk 'xor(/baeldung/,/users/,/authors/)' < datafile.txt
New baeldung lessons
Newly joined authors
Old users of the new lessons
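By utilizing xor() in awk, we can provide a list of patterns separated by commas. However, XOR over 0/1 values is true whenever an odd number of patterns match, so with three patterns, a line containing all three words would also be printed; our sample file simply has no such line. For exactly-one semantics with more than two patterns, the counting approach shown earlier is the safer choice.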
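It's worth checking how XOR behaves when more than two patterns match. The sketch below uses a made-up line that contains all three words and compares the match count with its parity, which is what chaining XOR over 0/1 values computes. It uses n % 2 instead of gawk's xor() so that it runs on any awk:

```shell
# A made-up line that contains all three words.
line="baeldung authors and users"

result=$(printf '%s\n' "$line" | awk '{
    # Count how many of the three patterns match the line.
    n = ($0 ~ /baeldung/) + ($0 ~ /users/) + ($0 ~ /authors/)
    # n % 2 is the parity of the count, i.e. the XOR of the three 0/1 results.
    print "matches=" n, "xor=" (n % 2)
}')

echo "$result"
```

This prints matches=3 xor=1: the XOR test would accept the line even though three words are present, whereas the ==1 test on the sum would reject it.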
Perl is a highly capable and feature-rich programming language with built-in support for regular expressions and text manipulation.
We can use the if conditional statement for extracting lines with one of multiple words:
$ perl -ne 'print if /user/ && !/baeldung/ || /baeldung/ && !/user/' datafile.txt
New baeldung lessons
Articles from baeldung authors
Old users of the new lessons
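This command uses the same idea, i.e., (A and not B) or (B and not A), to find lines with exclusively one of two words.
Let’s understand each option used in this command: -n makes Perl loop over the input line by line without printing lines automatically, and -e supplies the code to run for each line. The print statement runs only when the condition after if is true.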
Similarly, let’s use this approach for three words:
$ perl -ne 'print if /user/ && !/author/ && !/baeldung/ || /author/ && !/user/ && !/baeldung/ || /baeldung/ && !/user/ && !/author/' datafile.txt
New baeldung lessons
Newly joined authors
Old users of the new lessons
As before, we can also employ the bitwise XOR operator (^):
$ perl -ne 'print if /user/ ^ /baeldung/ ^ /author/' datafile.txt
New baeldung lessons
Newly joined authors
Old users of the new lessons
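Here, we provide a list of words as patterns separated by the ^ operator. As with XOR in awk, the result is true when an odd number of patterns match, so with three patterns, a line containing all three words would also be printed; the explicit conditions shown earlier avoid this.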
In this article, we learned several ways to find lines containing one of multiple words exclusively.
First, we created a sample dataset and discussed the rationale for finding lines with exclusively one of multiple words. Then, we used the grep and sed commands to achieve our goal. Next, we explored the awk command, both by counting matches and with the xor() function, to accomplish the same task. Finally, we used if conditions and the ^ operator in Perl to demonstrate the exclusive matching of lines.
Although we can select any method depending on our preferences and needs, counting matches in awk is often the simplest way to find lines containing exclusively one of multiple words in Linux.