1. Overview

CSV and XLS are two common file formats for storing and sharing spreadsheet data. CSV stands for Comma-Separated Values and is a plain text format that uses commas to separate values in each row. XLS stands for Excel Spreadsheet and is a binary format that can store complex data types, formulas, charts, macros, and formatting options. XLSX is a newer Office Open XML (OOXML) format used by Excel 2007 and later.

CSV files are compatible with a variety of spreadsheet applications and can be easily parsed using Bash or other languages. The compatibility of XLS and XLSX files is not as great as that of CSV files.

In this tutorial, we’ll see how to use some Linux command-line and GUI programs to convert CSV data to XLS and XLSX formats. In all examples, we’ll use the same sample people-100.csv file from the Sample CSV Files archive.

Our test environment is a clean installation of Linux Mint 21.1.

2. CLI Tools

Before we begin, let’s take a quick look at the first few lines of our example file:

$ wget https://media.githubusercontent.com/media/datablist/sample-csv-files/main/files/people/people-100.csv
$ cat people-100.csv 
Index,User Id,First Name,Last Name,Sex,Email,Phone,Date of birth,Job Title
1,88F7B33d2bcf9f5,Shelby,Terrell,Male,[email protected],001-084-906-7849x73518,1945-10-26,Games developer
2,f90cD3E76f1A9b9,Phillip,Summers,Female,[email protected],214.112.6044x4913,1910-03-24,Phytotherapist
3,DbeAb8CcdfeFC2c,Kristine,Travis,Male,[email protected],277.609.7938,1992-07-02,Homeopath

Even without knowing the details of the CSV specification, the similarity to a database table organized into records and fields is obvious. For some simple purposes, such as listing people as in this case, spreadsheets can serve as databases.

2.1. unoserver

unoserver can convert CSV files to XLS or XLSX files by starting LibreOffice in listener mode and then using the UNO API to perform the conversion. The official README.rst explains the advantages of using LibreOffice directly without unoserver. The presence of LibreOffice is a prerequisite. In our case, it’s already preinstalled on Linux Mint.

To install unoserver, we first need to find the Python version that LibreOffice uses:

$ wget -O find_uno.py https://gist.githubusercontent.com/regebro/036da022dc7d5241a0ee97efdf1458eb/raw/find_uno.py
$ python3 find_uno.py
Found 1 Pythons with Libreoffice libraries:

On this Python version we can install unoserver:

$ wget https://bootstrap.pypa.io/get-pip.py
$ sudo /usr/bin/python3 get-pip.py
Successfully installed pip-23.1.2 setuptools-67.8.0 wheel-0.40.0
$ sudo /usr/bin/python3 -m pip install unoserver
Successfully installed unoserver-1.4

To use unoserver, we need to start it as a background process:

$ unoserver --daemon
INFO:unoserver:Starting unoserver.
INFO:unoserver:Server PID: 3525

Let’s convert our sample CSV:

$ unoconvert people-100.csv people-100.xls
INFO:unoserver:Starting unoconverter.
INFO:unoserver:Opening people-100.csv
INFO:unoserver:Exporting to people-100.xls
INFO:unoserver:Using MS Excel 97 export filter

$ unoconvert people-100.csv people-100.xlsx
INFO:unoserver:Starting unoconverter.
INFO:unoserver:Opening people-100.csv
INFO:unoserver:Exporting to people-100.xlsx
INFO:unoserver:Using Calc Office Open XML export filter

The XLS and XLSX formats are standardized, but still owned by Microsoft. That’s why we tested the resulting XLSX file in Excel 2016:

Excel 2016 Sample CSV converted to XLSXIn this case, the result is correct and consistent with the CSV file. LibreOffice also supports this XLSX file properly.

Unfortunately, we don’t get the same result in Excel 365 because it wrongly converts some fields in people-100.xlsx to formulas:

Office 365 Sample CSV converted to XLSXThis is a common problem with CSV converted files. The solution is to open our people-100.xlsx with LibreOffice Calc 7.3 and save it again as people2-100.xlsx. Excel 365 will display people2-100.xlsx correctly. This is an undocumented trick.

In the following examples, we won’t further investigate the compatibility of the resulting XLS or XLSX files.

2.2. LibreOffice Headless

The CSV to XLS or XLSX conversion performed by LibreOffice in headless mode, i.e., without a GUI, is the same as the one performed by unoserver:

$ libreoffice --headless --convert-to xlsx people-100.csv 
convert /home/francesco/people-100.csv -> /home/francesco/people-100.xlsx using filter : Calc Office Open XML

The only real difference between unoserver and LibreOffice in headless mode is the CPU load and speed when there are many files to convert. If we need to convert many files, unoserver is the best choice.

2.3. ssconvert

ssconvert is a command line utility for Gnumeric, a Linux spreadsheet program. We need to install Gnumeric to get ssconvert:

$ sudo apt install gnumeric

Conversion from CSV to XLS or XLSX is straightforward:

$ ssconvert people-100.csv people-100.xls
$ ssconvert people-100.csv people-100.xlsx

Alternatively, we can do the conversion to XLSX this way:

$ ssconvert --export-type=Gnumeric_Excel:xlsx people-100.csv

However, we notice something different from unoserver, because ssconvert removes the dots in strings representing phone numbers, such as 277.609.7938, and converts these text strings to numbers. From this point of view, the conversion is less accurate.

2.4. csv2xls

csv2xls is a Python2 tool that converts CSV files to XLS only. There is no support for XLSX. It can take multiple CSV files and put them into different sheets in a single Excel file.

Let’s install it:

$ sudo apt install python-pip
$ pip2 install --user xlwt
Successfully installed xlwt-1.3.0
$ pip2 install --user csv2xls
Successfully installed csv2xls-0.4 xlwt-0.7.5
$ ./.local/bin/csv2xls -h
usage: csv2xls [-h] [-o OUTPUT] [-c] [-d DELIMITER] [-q QUOTECHAR] [-n]
               [-D FORMAT] [-k]
               files [files ...]

To use csv2xls, we need to pass one or more CSV files as arguments. We can also specify the output file name, delimiters, quotes, date format, and other options. In the following example, we use the –no-type-inference option to specify not to infer number types and dates when writing, because this mode is faster and preserves the input data:

$ wget -O sheet1.csv 'https://github.com/datablist/sample-csv-files/raw/main/files/customers/customers-100.csv'
$ wget -O sheet2.csv 'https://github.com/datablist/sample-csv-files/raw/main/files/organizations/organizations-100.csv'
$ ./.local/bin/csv2xls sheet1.csv sheet2.csv -o output.xls
Processing                     sheet1.csv -> output.xls/1
Processing                     sheet2.csv -> output.xls/2

The result is as expected, two sheets in one XLS:

csv2xls - Multiple CSVs to single XLSFinally, csv2xls supports Python2 only. Trying to install it with Python3 will result in unrecoverable errors.

3. GUI Tools

Using LibreOffice and Gnumeric to convert CSV to XLS or XLSX is easy and intuitive.

3.1. LibreOffice Calc

Opening our example people-100.csv file with LibreOffice Calc results in the following window:

LibreOffice CSVIn this case, the suggested import options are fine. We just need to click OK and then save the file in XLS or XLSX format.

3.2. Gnumeric

Unlike LibreOffice Calc, Gnumeric immediately opens the CSV file, already converted into a spreadsheet, without displaying a window to select import options. After that, we can save in our preferred format:

Gnumeric XLSX export

In this case, we chose to save in XLSX format.

4. Conclusion

In this article, we’ve seen several CLI and GUI tools for converting CSV files to XLS or XLSX:

  • unoserver
  • LibreOffice Calc headless
  • ssconvert
  • csv2xls
  • LibreOffice Calc GUI
  • Gnumeric

The choice of the most appropriate tool may depend on the complexity of the CSV or the need to merge multiple CSVs into a single Excel file. We need to be careful about inferring cell types, as this can compromise the integrity of the data.

Comments are closed on this article!