Course – LS – All

Get started with Spring and Spring Boot, through the Learn Spring course:

>> CHECK OUT THE COURSE

1. Introduction

Understanding data mapping is essential in software development. Excel is a widely used data management software, making it crucial for Java developers to know how to map data between Excel and Java objects.

In this tutorial, we’ll explore converting Excel data into a list of Java objects.

Several Java libraries are available on the Maven repository to work with Excel files in Java, with Apache POI being the most common. However, in this tutorial, we’ll use four Java Excel libraries, including Apache POI, Poiji, FastExcel, and JExcelApi (Jxl), to convert Excel data into a Java object list.

2. Model Setup

To get started, we need to create our object’s blueprint, the FoodInfo class:

public class FoodInfo {

    private String category; 
    private String name; 
    private String measure;
    private double calories; 
   
    // standard constructors, toString, getters and setters
}

3. Apache POI

Apache POI (Poor Obfuscation Implementation) is a Java API for Microsoft documents. It’s a collection of pure Java libraries used to read and write data from/to Microsoft Office files such as Word, Outlook, Excel, and others.

3.1. Maven Dependency

Let’s add our Maven dependencies to the pom.xml file:

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>5.2.5</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>5.2.5</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml-schemas</artifactId>
    <version>4.1.2</version>
</dependency>

3.2. Converting Excel Data into a List of Objects

By using its Workbook interface, we can access various features to read the sheets and cells of an Excel file. This interface has two implementations, one for each Excel format – HSSFWorkbook for .xls and XSSFWorkbook for .xlsx.

This snippet reads and converts Excel data using the Apache POI library from a .xlsx file into a list of FoodInfo objects:

public static List<FoodInfo> excelDataToListOfObjets_withApachePOI(String fileLocation)
  throws IOException {
    FileInputStream file = new FileInputStream(new File(fileLocation));
    Workbook workbook = new XSSFWorkbook(file);
    Sheet sheet = workbook.getSheetAt(0);
    List<FoodInfo> foodData = new ArrayList<FoodInfo>();
    DataFormatter dataFormatter = new DataFormatter();
    for (int n = 1; n < sheet.getPhysicalNumberOfRows(); n++) {
        Row row = sheet.getRow(n);
        FoodInfo foodInfo = new FoodInfo();
        int i = row.getFirstCellNum();

        foodInfo.setCategory(dataFormatter.formatCellValue(row.getCell(i)));
        foodInfo.setName(dataFormatter.formatCellValue(row.getCell(++i)));
        foodInfo.setMeasure(dataFormatter.formatCellValue(row.getCell(++i)));
        foodInfo.setCalories(row.getCell(++i).getNumericCellValue());
       
        foodData.add(foodInfo);
    }
    return foodData;
}

To determine the number of non-empty rows in the sheet object, we utilize the getPhysicalNumberOfRows() method. We then loop through the rows, excluding the header row (i = 1).

Depending on the field of the food object that we need to populate, we either use the dataFormatter object or the getNumericValue() method to convert and assign the cell value to the appropriate data type.

Let’s verify our code by writing a unit test to make sure it works as expected using an Excel file named food_info.xlsx:

@Test
public void whenParsingExcelFileWithApachePOI_thenConvertsToList() throws IOException {
    List<FoodInfo> foodInfoList = ExcelDataToListApachePOI
      .excelDataToListOfObjets_withApachePOI("src\\main\\resources/food_info.xlsx");

    assertEquals("Beverages", foodInfoList.get(0).getCategory());
    assertEquals("Dairy", foodInfoList.get(3).getCategory());
}

The Apache POI library offers assistance for both older and newer versions of Excel files, namely .xls and .xlsx.

4. Poiji

Poiji is a thread-safe Java library that provides an API for one-way data mapping from Excel sheets to Java classes. It’s built on top of the Apache POI library. But unlike Apache POI, it’s much simpler to use and directly converts each Excel row into a Java object.

4.1. Setting Up the Maven Dependency

Here’s the Poiji Maven dependency we need to add to the pom.xml file:

<dependency>
    <groupId>com.github.ozlerhakan</groupId>
    <artifactId>poiji</artifactId>
    <version>4.1.1</version>
</dependency>

4.2. Setting Up the Class with Annotations

The Poiji library simplifies Excel data retrieval by requiring class fields to be annotated with either @ExcelCellName(String cellName) or @ExcelCell(int cellIndex).

Below, we’re setting up our FoodInfo class for the Poiji library by adding annotations:

public class FoodInfo { 
    
    @ExcelCellName("Category") 
    private String category; 
    
    @ExcelCellName("Name") 
    private String name;
 
    @ExcelCellName("Measure") 
    private String measure;
 
    @ExcelCellName("Calories") 
    private double calories;
 
    // standard constructors, getters and setters 
}

The API offers support for mapping an Excel workbook that has multiple sheets. When our file has several sheets, we can employ the @ExcelSheet(String sheetName) annotation on our class to indicate which sheet we want to work with. Any other sheets will be disregarded.

However, if we don’t use this annotation, only the first Excel sheet in the workbook will be taken into account.

In certain situations, we may not need to extract data from every row in the Excel sheet we are targeting. To address this, we can include a private int rowIndex property that is annotated with @ExcelRow in our class. This will allow us to specify the index of the row items we want to access.

4.3. Converting Excel Data into a List of Objects

Unlike the libraries mentioned in this article, the Poiji library, by default, ignores the header row of the Excel sheet.

The following snippet extracts data from an Excel file and converts its data into a list of FoodInfo:

public class ExcelDataToListOfObjectsPOIJI {
    public static List<FoodInfo> excelDataToListOfObjets_withPOIJI(String fileLocation){
        return Poiji.fromExcel(new File(fileLocation), FoodInfo.class);
    }
}

The program translates the first Excel sheet of the fileLocation file into a FoodInfo class. Each row becomes an instance of the FoodInfo class, with the cell values representing the object’s properties. The output is a list of FoodInfo objects with a size equivalent to the number of rows (excluding the header row) in the original Excel sheet.

There are some cases where a password may protect our Excel sheet. We can define the password via PoijiOptionsBuilder:

PoijiOptions options = PoijiOptionsBuilder.settings()
  .password("<excel_sheet_password>").build();
List<FoodInfo> foodData = Poiji.fromExcel(new File(fileLocation), FoodInfo.class, options);

To make sure our code works as expected, we write a unit test:

@Test
public void whenParsingExcelFileWithPOIJI_thenConvertsToList() throws IOException {
    List<FoodInfo> foodInfoList = 
      ExcelDataToListOfObjectsPOIJI
        .excelDataToListOfObjets_withPOIJI("src\\main\\resources/food_info.xlsx");

    assertEquals("Beverages", foodInfoList.get(0).getCategory());
    assertEquals("Dairy", foodInfoList.get(3).getCategory());
}

5. FastExcel

FastExcel is an efficient library that utilizes minimal memory and provides high performance for creating and reading basic Excel workbooks in Java. It exclusively supports the newer version of Excel files (.xlsx) and has limited features compared to Apache POI.

It only reads cell content and does not include graphs, styles, or other cell formatting.

5.1. Setting Up the Maven Dependency

Below are the FastExcel and the FastExcel reader Maven dependencies added to pom.xml:

<dependency>
      <groupId>org.dhatim</groupId>
      <artifactId>fastexcel</artifactId>
      <version>0.15.7</version>
</dependency>
<dependency>
      <groupId>org.dhatim</groupId>
      <artifactId>fastexcel-reader</artifactId>
      <version>0.15.7</version>
</dependency>

5.2. Converting Excel Data into a List of Objects

When dealing with large files, the FastExcel reader is a great option despite its limited features. It’s easy to use, and we can access the entire Excel workbook using the ReadableWorkbook class.

This allows us to retrieve sheets individually, either by name or index.

In the method below, we read data from an Excel sheet and convert it into a list of FoodInfo objects:

public static List<FoodInfo> excelDataToListOfObjets_withFastExcel(String fileLocation)
   throws IOException, NumberFormatException {
    List<FoodInfo> foodData = new ArrayList<FoodInfo>();

    try (FileInputStream file = new FileInputStream(fileLocation);
      ReadableWorkbook wb = new ReadableWorkbook(file)) {
        Sheet sheet = wb.getFirstSheet();
        for (Row row: sheet.read()) {
            if (row.getRowNum() == 1) {
                continue;
            }
            FoodInfo food = new FoodInfo();
            food.setCategory(row.getCellText(0));
            food.setName(row.getCellText(1));
            food.setMeasure(row.getCellText(2));
            food.setCalories(Double.parseDouble(row.getCellText(3)));
            
            foodData.add(food);
        }
    }
    return foodData;
}

Because the API reads all the rows(including the header row) in the sheet, we need to skip the first row(non-zero-based index) when looping through rows.

Accessing a cell is done either by instantiating a Cell class: Cell cell = row.getCell(), which has two implementations, one that takes an int cellIndex and the other a String cellAddress(e.g., “C4”) arguments. Or by directly getting the data in the cell: e.g., row.getCellText().

Either way, after extracting each cell’s content, we need to make sure to cast it, where necessary, to the appropriate food object field type.

Let’s write a unit test to make sure the conversion works:

@Test
public void whenParsingExcelFileWithFastExcel_thenConvertsToList() throws IOException {
    List<FoodInfo> foodInfoList = ExcelDataToListOfObjectsFastExcel
      .excelDataToListOfObjets_withFastExcel("src\\main\\resources/food_info.xlsx");

    assertEquals("Beverages", foodInfoList.get(0).getCategory());
    assertEquals("Dairy", foodInfoList.get(3).getCategory());
}

6. JExcelApi (Jxl)

JExcelApi (or Jxl) is a lightweight Java library for reading, writing, and modifying Excel spreadsheets.

6.1. Setting Up the Maven Dependency

Let’s add the Maven dependency for JExcelApi to the pom.xml file:

<dependency>
    <groupId>net.sourceforge.jexcelapi</groupId>
    <artifactId>jxl</artifactId>
    <version>2.6.12</version>
</dependency>

6.2. Converting Excel Data into a List of Objects

Though it only supports the older Excel format(.xls) files, the JExcel library offers a range of classes for manipulating Excel files. The Workbook class is used to access the list of Excel sheets within the file.

The code below uses the library to convert data from a .xls file to a list of FoodInfo objects, foodData:

public static List<FoodInfo> excelDataToListOfObjets_withJxl(String fileLocation) 
  throws IOException, BiffException {

    List<FoodInfo> foodData = new ArrayList<FoodInfo>();

    Workbook workbook = Workbook.getWorkbook(new File(fileLocation));
    Sheet sheet = workbook.getSheet(0);

    int rows = sheet.getRows();

    for (int i = 1; i < rows; i++) {
        FoodInfo foodInfo = new FoodInfo();

        foodInfo.setCategory(sheet.getCell(0, i).getContents());
        foodInfo.setName(sheet.getCell(1, i).getContents());
        foodInfo.setMeasure(sheet.getCell(2, i).getContents());
        foodInfo.setCalories(Double.parseDouble(sheet.getCell(3, i).getContents()));
        
        foodData.add(foodInfo);

    }
    return foodData;
}

Since the header row is not ignored by the library, we must loop from i = 1. The sheet object is a zero-based index list of rows.

Retrieving cell data using the JExcel library is quite similar to the FastExcel library. Both libraries use the getCell() method with two implementations.

However, in JExcel, this method is directly accessed from a Sheet object rather than a Row object. Additionally, one of the implementations for the getCell() method in JExcel takes two arguments,  colNum and rowNum, both of which are integers: sheet.getCell(colNum, rowNum).

To make sure the conversion works well, let’s write a unit test for our method:

@Test
public void whenParsingExcelFileWithJxl_thenConvertsToList()
  throws IOException, BiffException {
    List<FoodInfo> foodInfoList = ExcelDataToListOfObjectsJxl
      .excelDataToListOfObjets_withJxl("src\\main\\resources/food_info.xls");

    assertEquals("Beverages", foodInfoList.get(0).getCategory());
    assertEquals("Dairy", foodInfoList.get(3).getCategory());
}

7. Conclusion

In this article, we explored the usage of several libraries, such as Apache POI, Poiji, FastExcel, and JExcelApi, to read and convert data from Excel files into Java objects. However, the choice of which library to use depends on the specific needs, considering the advantages and limitations of each of them.

For instance, we might choose to use the Poiji library if we prioritize the simplest way to just read data from an Excel file and directly convert it into a list of Java objects.

When it comes to performance and simplicity for two-way Excel data mapping in Java, FastExcel, and JExcelApi libraries are excellent options. However, they offer fewer features compared to Apache POI, which is a feature-rich library that supports styles and graphs.

As always, the complete source code for this article is available over on GitHub.

Course – LS – All

Get started with Spring and Spring Boot, through the Learn Spring course:

>> CHECK OUT THE COURSE
res – REST with Spring (eBook) (everywhere)
Comments are open for 30 days after publishing a post. For any issues past this date, use the Contact form on the site.