1. Overview
Portable Document Format (PDF) files preserve the layout of a document because they’re meant to be viewed rather than modified. As a result, we can inspect a PDF file from the command line without altering it. This is especially helpful on Linux servers that have no graphical user interface.
In this tutorial, we discuss commands we can use in Linux to open PDF files from the command line. First, we look at the less tool in detail. Then, we try other options that accomplish the same task.
2. Using the less Command
The less command is helpful when we want to view the contents of a file stored on a server that doesn’t have graphical user interface capabilities. That’s because it shows the file contents within the terminal and loads large files one screen at a time instead of reading the whole file at once.
As a result, less keeps memory usage low when opening large files. Moreover, it provides many options to customize its output.
With the less command, we use a common syntax:
$ less [option] file
Before we proceed, let’s note that less doesn’t parse PDF files itself. Instead, on many distributions it’s configured with an input preprocessor (typically lesspipe, set through the LESSOPEN environment variable) that calls the pdftotext command in the background to extract only the text from PDF files.
The pdftotext command comes preinstalled on some Linux distributions. However, if it’s missing when we try to read our file with less, we get a warning that pdftotext is unavailable for preprocessing, and the result isn’t very legible:
$ less project.pdf
pdftotext is not available for preprocessing
...
%PDF-1.7
%
1 0 obj
<</Type/Catalog/Pages 2 0 R/Lang(en-US) /StructTreeRoot 71 0 R/MarkInfo<</Marked true>>/Metadata 401 0 R/ViewerPreferences 402 0 R>>
endobj
2 0 obj
<</Type/Pages/Count 8/Kids[ 3 0 R 9 0 R 17 0 R 24 0 R 31 0 R 33 0 R 35 0 R 44 0 R] >>
endobj
3 0 obj
<</Type/Page/Parent 2 0 R/Resources<</Font<</F1 5 0 R>>/ExtGState<</GS7 7 0 R/GS8 8 0 R>>/ProcSet[/PDF/Text/ImageB/ImageC/ImageI] >>/MediaBox[ 0 0 595.2 841.92] /Contents 4 0 R/Group<</Type/Group/S/Transparency/CS/DeviceRGB>>/Tabs/S/StructParents 0>>
endobj
4 0 obj
<</Filter/FlateDecode/Length 480>>
...
Notably, the output shows the raw PDF source, which is hard to read. To get rid of the warning at the beginning of the output and make the text legible, we can install the Poppler utilities, since pdftotext is part of them.
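As a rough sketch of the installation step, the helper below prints a suitable install command for a few common package managers. The function names poppler_install_cmd and suggest_poppler_install are our own, and the package names (poppler-utils on Debian, Ubuntu, and Fedora, poppler on Arch Linux) may differ on other distributions:

```shell
# Map a package manager name to the command that installs pdftotext.
# Only three package managers are covered here; others return an error.
poppler_install_cmd() {
    case "$1" in
        apt-get) echo "sudo apt-get install poppler-utils" ;;
        dnf)     echo "sudo dnf install poppler-utils" ;;
        pacman)  echo "sudo pacman -S poppler" ;;
        *)       return 1 ;;
    esac
}

# Print an install hint only when pdftotext is actually missing.
suggest_poppler_install() {
    if command -v pdftotext >/dev/null 2>&1; then
        echo "pdftotext is already installed"
        return 0
    fi
    for pm in apt-get dnf pacman; do
        if command -v "$pm" >/dev/null 2>&1; then
            poppler_install_cmd "$pm"
            return 0
        fi
    done
    echo "no known package manager found; install Poppler manually" >&2
    return 1
}

# Usage:
# suggest_poppler_install
```

The helper only prints the command instead of running it, so we can review it before installing anything with elevated privileges.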
Now, let’s retry reading our PDF with less:
$ less project.pdf
...
RENT MANAGEMENT SYSTEM WEB AND APPLICATION DRAFT PAPER
^LIntroduction
System / project Overview
The rent management system is an idea to provide reconciliation, document management and marketing solutions to realtor investors. This solution is mainly meant to target real estate agents, landlords and other corporations aiming at gaining market share in the rental space.
...
As long as less is set up to preprocess PDF files and the Poppler tools are installed, we see the text contents of the PDF file one screen at a time. The ^L in the output is the form feed character that pdftotext inserts at each page break. To exit less, we press q.
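If less still shows the raw PDF source after Poppler is installed, the input preprocessor may simply not be configured. A minimal check, assuming the preprocessor is wired up through the LESSOPEN environment variable (the function name preprocessor_status is our own), could look like this:

```shell
# Report whether less has an input preprocessor configured.
# less runs the command stored in LESSOPEN on a file before displaying it.
preprocessor_status() {
    if [ -n "${LESSOPEN:-}" ]; then
        echo "LESSOPEN is set: $LESSOPEN"
    else
        echo "LESSOPEN is not set"
    fi
}

# Usage:
# preprocessor_status
#
# On Debian and Ubuntu, the following line (often already present in
# ~/.bashrc) sets LESSOPEN and LESSCLOSE for the current shell:
# eval "$(lesspipe)"
```

Other distributions may ship a differently named preprocessor script, so the exact value of LESSOPEN varies.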
Let’s now dive into pdftotext itself.
3. PDF Text Extraction Tool
The pdftotext command-line utility extracts the textual data from PDF files. It’s part of the Poppler tools package, which comes preinstalled on some Linux distributions.
Now, let’s convert our file:
$ pdftotext project.pdf
In this case, pdftotext creates a project.txt file with the textual contents of project.pdf in the same directory.
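We can also name the output file ourselves and limit the conversion to a page range. The sketch below wraps this in a small function. The name extract_pages and the output file name in the usage line are our own choices, while -f, -l, and -layout are standard pdftotext options:

```shell
# Convert pages FIRST..LAST of a PDF into a text file.
# -f and -l select the first and last page to convert, and -layout asks
# pdftotext to keep the original physical layout as far as it can.
extract_pages() {
    # $1 = first page, $2 = last page, $3 = input PDF, $4 = output text file
    pdftotext -f "$1" -l "$2" -layout "$3" "$4"
}

# Usage: write pages 2 to 3 of our document to pages.txt
# extract_pages 2 3 project.pdf pages.txt
```

Restricting the page range is handy for long documents where we only need one section.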
Finally, we can pass - as the output file so that pdftotext writes to standard output, and then pipe that output into the less command to display the contents in the terminal:
$ pdftotext project.pdf - | less
Here, pdftotext extracts the text and less pages through it. This is the same thing that happens behind the scenes when less is set up to preprocess PDF files and pdftotext is installed.
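Since the extracted text is an ordinary stream, we can wrap the pipeline in small helpers. The following is only a sketch: the function names pdfview and pdf_search are our own, and PAGER is the conventional environment variable for the preferred pager:

```shell
# View the text of a PDF in the preferred pager, falling back to less.
pdfview() {
    # $PAGER is left unquoted on purpose so a value like "less -S" still works
    pdftotext "$1" - | ${PAGER:-less}
}

# Search the text of a PDF and print matching lines with line numbers.
pdf_search() {
    # $1 = pattern, $2 = PDF file
    pdftotext "$2" - | grep -n -- "$1"
}

# Usage:
# pdfview project.pdf
# pdf_search "rent" project.pdf
```

The line numbers printed by pdf_search refer to lines of the extracted text, not to pages of the PDF.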
4. Using the xdg-open Command
Before getting into xdg-open, let’s first understand what XDG means. XDG stands for X Desktop Group (sometimes expanded as Cross-Desktop Group), the former name of freedesktop.org. It’s a community that hosts a set of specifications to promote interoperability between desktop environments and graphical user interface (GUI) applications on Unix-like systems.
One of its projects is xdg-utils, a set of tools for integrating applications with the user’s preferred desktop environment. The xdg-open command is one of these tools.
Since Linux offers multiple desktop environments, each one has its own tool for opening files. However, the xdg-open command works across most of them, including KDE, LXDE, and GNOME.
The general syntax just requires a file name or a URL:
$ xdg-open {file|URL}
In our case, we just pass project.pdf:
$ xdg-open project.pdf
When we provide a file name as the argument, xdg-open opens the file with the default PDF document viewer.
On the other hand, if we provide a URL as the argument, xdg-open opens that URL in our default web browser, which then displays the PDF file. If we prefer a specific browser, we can also call it directly instead.
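Because xdg-open needs a graphical session, a script that may also run on a headless server can check for one first. The sketch below assumes that a session is available when DISPLAY (X11) or WAYLAND_DISPLAY (Wayland) is set. The function names has_gui and open_pdf are our own:

```shell
# Succeed if a graphical session appears to be available.
# X11 sessions set DISPLAY and Wayland sessions set WAYLAND_DISPLAY.
has_gui() {
    [ -n "${DISPLAY:-}" ] || [ -n "${WAYLAND_DISPLAY:-}" ]
}

# Open a PDF graphically when possible, otherwise page through its text.
open_pdf() {
    if has_gui && command -v xdg-open >/dev/null 2>&1; then
        xdg-open "$1"
    else
        pdftotext "$1" - | ${PAGER:-less}
    fi
}

# Usage:
# open_pdf project.pdf
```

This is a heuristic rather than a guarantee, since the variables can be set even when no display server is reachable.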
5. Browser Options
Apart from xdg-open, another universal option is working with browsers from the command line.
For example, we can open our PDF file using google-chrome from the terminal:
$ google-chrome project.pdf
Of course, we can also open the file with the firefox browser:
$ firefox project.pdf
Since most desktop Linux distributions ship with a browser preinstalled, this is a fairly universal option for opening PDF files as long as we have a GUI.
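When we don’t know in advance which browser is installed, we can probe for a few candidates. The following sketch uses function names of our own, and the chromium command name is an assumption that varies between distributions:

```shell
# Print the first command from the argument list that exists on PATH.
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done
    return 1
}

# Open a PDF in whichever candidate browser is found first.
open_in_browser() {
    browser=$(first_available google-chrome firefox chromium) || {
        echo "no supported browser found" >&2
        return 1
    }
    "$browser" "$1"
}

# Usage:
# open_in_browser project.pdf
```

The order of the candidate list decides which browser wins when several are installed.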
6. Using the evince Command
Evince is a document viewer for the GNOME desktop environment. It supports several file formats, such as PDF and PostScript.
The way we open our PDF file is similar to the previous tools we’ve looked at, but this time using the evince command:
$ evince project.pdf
With this command, we pass the path to the file as an argument. The path can be absolute or relative, so a bare file name like the one above works only when we run the command from the directory that contains the PDF file.
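Evince also accepts options in front of the path. As a small sketch, the function below (the name open_at_page is our own) uses the --page-index option to jump straight to a given page:

```shell
# Open a PDF in Evince at a given page.
# --page-index takes the physical page number rather than a page label.
open_at_page() {
    # $1 = page number, $2 = path to the PDF file
    evince --page-index="$1" "$2"
}

# Usage: open page 3 of a document given by an absolute path
# (the path below is only an example)
# open_at_page 3 /home/user/docs/project.pdf
```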
7. Conclusion
In this article, we learned about several tools for opening PDF files from the command line. Usually, less is a good choice since it often opens large files faster than the alternatives. With a GUI, we can choose xdg-open since it’s supported in different desktop environments, or we can call a browser or evince directly. Finally, without a GUI, we can simply use pdftotext directly.