CSV and XLS are two common file formats for storing and sharing spreadsheet data. CSV stands for Comma-Separated Values and is a plain text format that uses commas to separate values in each row. XLS stands for Excel Spreadsheet and is a binary format that can store complex data types, formulas, charts, macros, and formatting options. XLSX is a newer Office Open XML (OOXML) format used by Excel 2007 and later.
CSV files are compatible with a variety of spreadsheet applications and can be easily parsed using Bash or other languages. The compatibility of XLS and XLSX files is not as great as that of CSV files.
In this tutorial, we’ll see how to use some Linux command-line and GUI programs to convert CSV data to XLS and XLSX formats. In all examples, we’ll use the same sample people-100.csv file from the Sample CSV Files archive.
Our test environment is a clean installation of Linux Mint 21.1.
2. CLI Tools
Before we begin, let’s take a quick look at the first few lines of our example file:
$ wget https://media.githubusercontent.com/media/datablist/sample-csv-files/main/files/people/people-100.csv [...] $ cat people-100.csv Index,User Id,First Name,Last Name,Sex,Email,Phone,Date of birth,Job Title 1,88F7B33d2bcf9f5,Shelby,Terrell,Male,[email protected],001-084-906-7849x73518,1945-10-26,Games developer 2,f90cD3E76f1A9b9,Phillip,Summers,Female,[email protected],214.112.6044x4913,1910-03-24,Phytotherapist 3,DbeAb8CcdfeFC2c,Kristine,Travis,Male,[email protected],277.609.7938,1992-07-02,Homeopath [...]
Even without knowing the details of the CSV specification, the similarity to a database table organized into records and fields is obvious. For some simple purposes, such as listing people as in this case, spreadsheets can serve as databases.
unoserver can convert CSV files to XLS or XLSX files by starting LibreOffice in listener mode and then using the UNO API to perform the conversion. The official README.rst explains the advantages of using LibreOffice directly without unoserver. The presence of LibreOffice is a prerequisite. In our case, it’s already preinstalled on Linux Mint.
To install unoserver, we first need to find the Python version that LibreOffice uses:
$ wget -O find_uno.py https://gist.githubusercontent.com/regebro/036da022dc7d5241a0ee97efdf1458eb/raw/find_uno.py [...] $ python3 find_uno.py [...] Found 1 Pythons with Libreoffice libraries: /usr/bin/python3
On this Python version we can install unoserver:
$ wget https://bootstrap.pypa.io/get-pip.py [...] $ sudo /usr/bin/python3 get-pip.py [...] Successfully installed pip-23.1.2 setuptools-67.8.0 wheel-0.40.0 [...] $ sudo /usr/bin/python3 -m pip install unoserver [...] Successfully installed unoserver-1.4 [...]
To use unoserver, we need to start it as a background process:
$ unoserver --daemon INFO:unoserver:Starting unoserver. [...] INFO:unoserver:Server PID: 3525 [...]
Let’s convert our sample CSV:
$ unoconvert people-100.csv people-100.xls INFO:unoserver:Starting unoconverter. INFO:unoserver:Opening people-100.csv INFO:unoserver:Exporting to people-100.xls INFO:unoserver:Using MS Excel 97 export filter $ unoconvert people-100.csv people-100.xlsx INFO:unoserver:Starting unoconverter. INFO:unoserver:Opening people-100.csv INFO:unoserver:Exporting to people-100.xlsx INFO:unoserver:Using Calc Office Open XML export filter
The XLS and XLSX formats are standardized, but still owned by Microsoft. That’s why we tested the resulting XLSX file in Excel 2016:
Unfortunately, we don’t get the same result in Excel 365 because it wrongly converts some fields in people-100.xlsx to formulas:
This is a common problem with CSV converted files. The solution is to open our people-100.xlsx with LibreOffice Calc 7.3 and save it again as people2-100.xlsx. Excel 365 will display people2-100.xlsx correctly. This is an undocumented trick.
In the following examples, we won’t further investigate the compatibility of the resulting XLS or XLSX files.
2.2. LibreOffice Headless
The CSV to XLS or XLSX conversion performed by LibreOffice in headless mode, i.e., without a GUI, is the same as the one performed by unoserver:
$ libreoffice --headless --convert-to xlsx people-100.csv convert /home/francesco/people-100.csv -> /home/francesco/people-100.xlsx using filter : Calc Office Open XML
The only real difference between unoserver and LibreOffice in headless mode is the CPU load and speed when there are many files to convert. If we need to convert many files, unoserver is the best choice.
$ sudo apt install gnumeric
Conversion from CSV to XLS or XLSX is straightforward:
$ ssconvert people-100.csv people-100.xls $ ssconvert people-100.csv people-100.xlsx
Alternatively, we can do the conversion to XLSX this way:
$ ssconvert --export-type=Gnumeric_Excel:xlsx people-100.csv
However, we notice something different from unoserver, because ssconvert removes the dots in strings representing phone numbers, such as 277.609.7938, and converts these text strings to numbers. From this point of view, the conversion is less accurate.
csv2xls is a Python2 tool that converts CSV files to XLS only. There is no support for XLSX. It can take multiple CSV files and put them into different sheets in a single Excel file.
Let’s install it:
$ sudo apt install python-pip [...] $ pip2 install --user xlwt [...] Successfully installed xlwt-1.3.0 $ pip2 install --user csv2xls [...] Successfully installed csv2xls-0.4 xlwt-0.7.5 $ ./.local/bin/csv2xls -h usage: csv2xls [-h] [-o OUTPUT] [-c] [-d DELIMITER] [-q QUOTECHAR] [-n] [-D FORMAT] [-k] files [files ...] [...]
To use csv2xls, we need to pass one or more CSV files as arguments. We can also specify the output file name, delimiters, quotes, date format, and other options. In the following example, we use the –no-type-inference option to specify not to infer number types and dates when writing, because this mode is faster and preserves the input data:
$ wget -O sheet1.csv 'https://github.com/datablist/sample-csv-files/raw/main/files/customers/customers-100.csv' [...] $ wget -O sheet2.csv 'https://github.com/datablist/sample-csv-files/raw/main/files/organizations/organizations-100.csv' [...] $ ./.local/bin/csv2xls sheet1.csv sheet2.csv -o output.xls Processing sheet1.csv -> output.xls/1 Processing sheet2.csv -> output.xls/2
The result is as expected, two sheets in one XLS:
3. GUI Tools
Using LibreOffice and Gnumeric to convert CSV to XLS or XLSX is easy and intuitive.
3.1. LibreOffice Calc
Opening our example people-100.csv file with LibreOffice Calc results in the following window:
Unlike LibreOffice Calc, Gnumeric immediately opens the CSV file, already converted into a spreadsheet, without displaying a window to select import options. After that, we can save in our preferred format:
In this case, we chose to save in XLSX format.
In this article, we’ve seen several CLI and GUI tools for converting CSV files to XLS or XLSX:
- LibreOffice Calc headless
- LibreOffice Calc GUI
The choice of the most appropriate tool may depend on the complexity of the CSV or the need to merge multiple CSVs into a single Excel file. We need to be careful about inferring cell types, as this can compromise the integrity of the data.