1. Overview

Jsoup is an open-source library used to scrape HTML pages. It provides an API for data parsing, extraction, and manipulation using DOM API methods.

In this article, we'll see how to parse an HTML table using jsoup. We'll retrieve and update data in the table, and also add and delete rows.

2. Dependencies

To use the Jsoup library, add the following dependency to the project:

<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.17.2</version>
</dependency>

We can find the latest version of the Jsoup library in the Maven central repository.

3. Table Structure

To illustrate parsing HTML tables via jsoup, we will be using a sample HTML structure. The complete HTML structure is available in the code base provided in the GitHub repository mentioned at the end of the article. Here, we show a shortened version with only the header row and one data row for representational purposes:

<table>
    <thead>
        <tr>
            <th>Name</th>
            <th>Maths</th>
            <th>English</th>
            <th>Science</th>
         </tr>
    </thead>
    <tbody>
        <tr>
            <td>Student 1</td>
            <td>90</td>
            <td>85</td>
            <td>92</td>
        </tr>
     </tbody>
</table>

As we can see, the table has a header row inside the thead tag, followed by data rows inside the tbody tag. We assume that the table in the HTML document follows this format.

4. Parsing Table

Firstly, to select an HTML table from the parsed document, we can use the code snippet below:

Element table = doc.select("table").first();
Elements rows = table.select("tr"); 
Elements first = rows.get(0).select("th,td");

Here, we select the table element from the document and then select all of its tr rows. Because select() returns every matching row, we take the first one with get(0) and select its th or td cells. Building on these calls, we can write a function that parses the table data.

Here, we are assuming no colspan or rowspan elements are used in the table, and the first row is present with header th tags.
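The selection steps above can be tried against an inline copy of the sample table. This is only a minimal sketch: the class name TableSelectionSketch, the SAMPLE_HTML constant, and the firstRowTexts method are hypothetical names introduced for illustration, and the document is parsed from a string rather than loaded from a file.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the article's code base.
class TableSelectionSketch {

    // Inline copy of the sample table: one header row and one data row.
    static final String SAMPLE_HTML =
        "<table>"
      + "<thead><tr><th>Name</th><th>Maths</th><th>English</th><th>Science</th></tr></thead>"
      + "<tbody><tr><td>Student 1</td><td>90</td><td>85</td><td>92</td></tr></tbody>"
      + "</table>";

    // Returns the text of each cell in the first row of the first table,
    // or an empty list when the document has no table or no rows.
    static List<String> firstRowTexts(String html) {
        Document doc = Jsoup.parse(html);
        Element table = doc.selectFirst("table"); // null when nothing matches
        if (table == null) {
            return new ArrayList<>();
        }
        Elements rows = table.select("tr");
        if (rows.isEmpty()) {
            return new ArrayList<>();
        }
        List<String> texts = new ArrayList<>();
        for (Element cell : rows.get(0).select("th,td")) {
            texts.add(cell.text());
        }
        return texts;
    }
}
```

For the sample table, firstRowTexts(SAMPLE_HTML) yields the four column names in document order.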

Following is the code for parsing the table:

public List<Map<String, String>> parseTable(Document doc, int tableOrder) {
    // Pick the table by its zero-based position in the document
    Element table = doc.select("table").get(tableOrder);
    Element tbody = table.select("tbody").get(0);
    Elements dataRows = tbody.select("tr");
    // The first row of the table supplies the column names
    Elements headerRow = table.select("tr")
      .get(0)
      .select("th,td");

    List<String> headers = new ArrayList<>();
    for (Element header : headerRow) {
        headers.add(header.text());
    }

    // One map per data row: column name -> cell text
    List<Map<String, String>> parsedDataRows = new ArrayList<>();
    for (int row = 0; row < dataRows.size(); row++) {
        Elements colVals = dataRows.get(row).select("th,td");

        int colCount = 0;
        Map<String, String> dataRow = new HashMap<>();
        for (Element colVal : colVals) {
            dataRow.put(headers.get(colCount++), colVal.text());
        }
        parsedDataRows.add(dataRow);
    }
    return parsedDataRows;
}

In this function, the doc parameter is the HTML document loaded from the file, and tableOrder is the zero-based index of the table element in the document. We are using List<Map<String, String>> to store the data rows found under the tbody element. Each element of the list is a Map representing one data row. This Map stores the column name as the key and the cell text for that column as the value. Using a list of Maps makes it easy to access the retrieved data.

The list index represents row numbers, and we can get specific cell data by its map key.
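A HashMap does not keep its keys in column order. If the order of the columns matters when we iterate over a row, one possible variation is to use a LinkedHashMap instead. The sketch below is an assumption-laden alternative, not the article's implementation: the class OrderedTableParserSketch and the method parseTableOrdered are hypothetical names, and the loop also stops at the shorter of the header and cell lists so that a row with extra cells does not run past the headers.

```java
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical variation of parseTable that preserves column order.
class OrderedTableParserSketch {

    static List<Map<String, String>> parseTableOrdered(Document doc, int tableOrder) {
        Element table = doc.select("table").get(tableOrder);

        // Column names come from the first row of the table.
        List<String> headers = new ArrayList<>();
        for (Element header : table.select("tr").get(0).select("th,td")) {
            headers.add(header.text());
        }

        List<Map<String, String>> rows = new ArrayList<>();
        for (Element tr : table.select("tbody").get(0).select("tr")) {
            Elements cells = tr.select("th,td");
            // LinkedHashMap iterates in insertion order, i.e. column order.
            Map<String, String> row = new LinkedHashMap<>();
            int limit = Math.min(headers.size(), cells.size());
            for (int col = 0; col < limit; col++) {
                row.put(headers.get(col), cells.get(col).text());
            }
            rows.add(row);
        }
        return rows;
    }
}
```

Lookups such as rows.get(0).get("Maths") work exactly as before. Only the iteration order of each row changes.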

We can verify if table data is retrieved correctly using the test case below:

@Test
public void whenDocumentTableParsed_thenTableDataReturned() {
    JsoupTableParser jsoParser = new JsoupTableParser();
    Document doc = jsoParser.loadFromFile("Students.html");
    List<Map<String, String>> tableData = jsoParser.parseTable(doc, 0);
    assertEquals("90", tableData.get(0).get("Maths")); 
}

The JUnit test case confirms that parseTable returns the text of the table cells. Each element of the returned list represents one data row, stored as a HashMap with the column header as the key and the cell text as the value. Using this structure, we can look up any cell by its row index and column name.

5. Update Elements of the Parsed Table

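To update a cell while parsing, we can call text() on the td element retrieved from the row, which sets its content as plain text:

colVals.get(colCount++).text(updateValue);

Alternatively, html() parses the value as markup and replaces the cell's inner HTML:

colVals.get(colCount++).html(updateValue);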
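The difference between the two calls shows up when the new value contains markup. The following is a small sketch under stated assumptions: the class TextVersusHtmlSketch and its methods are hypothetical, and the cell comes from a one-cell table parsed from a string.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper class contrasting Element.text(String) and Element.html(String).
class TextVersusHtmlSketch {

    // Parses a one-cell table and returns that cell.
    static Element sampleCell() {
        Document doc = Jsoup.parse("<table><tbody><tr><td>90</td></tr></tbody></table>");
        return doc.selectFirst("td");
    }

    // text(...) stores the value as plain text, so markup characters are escaped on output.
    static Element setAsText(String value) {
        Element cell = sampleCell();
        cell.text(value);
        return cell;
    }

    // html(...) parses the value as markup and replaces the cell's existing children.
    static Element setAsHtml(String value) {
        Element cell = sampleCell();
        cell.html(value);
        return cell;
    }
}
```

With the value "<b>50</b>", setAsText produces a cell with no child elements whose text is the literal string, while setAsHtml produces a cell containing a b element whose text is 50. When the value comes from an untrusted source, text() is usually the safer choice.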

The function to update values in the parsed table would look like below:

public void updateTableData(Document doc, int tableOrder, String updateValue) {
    Element table = doc.select("table").get(tableOrder);
    Element tbody = table.select("tbody").get(0);
    Elements dataRows = tbody.select("tr");

    for (int row = 0; row < dataRows.size(); row++) {
        Elements colVals = dataRows.get(row).select("th,td");

        for (int colCount = 0; colCount < colVals.size(); colCount++) {
            colVals.get(colCount).text(updateValue);
        }
    }
}

In the above function, we get the data rows from the tbody element of the table. The function traverses each data cell and sets its text to the updateValue parameter. It sets every cell to the same value simply to demonstrate that cell values can be updated using jsoup. To update an individual cell, we can address it by its row and column index instead.

The test below verifies the update function:

@Test
public void whenTableUpdated_thenUpdatedDataReturned() {
    JsoupTableParser jsoParser = new JsoupTableParser();
    Document doc = jsoParser.loadFromFile("Students.html");
    jsoParser.updateTableData(doc, 0, "50");
    List<Map<String, String>> tableData = jsoParser.parseTable(doc, 0);
    assertEquals("50", tableData.get(2).get("Maths"));
}

The JUnit test case checks the Maths cell of the third data row and confirms that it now holds 50, as every data cell does after the update. The header row is outside tbody and is left untouched, so Maths is still a valid key.

Similarly, we can set desired values for specific cells of the table.
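Updating a single cell by row and column index could look roughly like the sketch below. The class CellUpdateSketch and the method updateCell are hypothetical names, and returning false for an out-of-range index is a design choice made here for illustration rather than something the article prescribes.

```java
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Hypothetical helper class for updating one data cell.
class CellUpdateSketch {

    // Sets the text of one data cell, addressed by zero-based row and column index.
    // Returns false when the table, row, or column does not exist.
    static boolean updateCell(Document doc, int tableOrder, int rowIndex, int colIndex, String value) {
        Elements tables = doc.select("table");
        if (tableOrder < 0 || tableOrder >= tables.size()) {
            return false;
        }
        Element tbody = tables.get(tableOrder).selectFirst("tbody");
        if (tbody == null) {
            return false;
        }
        Elements dataRows = tbody.select("tr");
        if (rowIndex < 0 || rowIndex >= dataRows.size()) {
            return false;
        }
        Elements cells = dataRows.get(rowIndex).select("th,td");
        if (colIndex < 0 || colIndex >= cells.size()) {
            return false;
        }
        cells.get(colIndex).text(value);
        return true;
    }
}
```

For example, updateCell(doc, 0, 0, 1, "95") changes only the Maths value of the first data row in the sample table.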

6. Adding Row to the Table

We can add a row to the table using the following function:

public void addRowToTable(Document doc, int tableOrder) {
    Element table = doc.select("table").get(tableOrder);
    Element tbody = table.select("tbody").get(0);

    Elements rows = table.select("tr");
    Elements headerCols = rows.get(0).select("th,td");
    int numCols = headerCols.size();

    Elements colVals = new Elements(numCols);
    for (int colCount = 0; colCount < numCols; colCount++) {
        Element colVal = new Element("td");
        colVal.text("11");
        colVals.add(colVal);
    }
    Elements dataRows = tbody.select("tr");
    Element newDataRow = new Element("tr");
    newDataRow.appendChildren(colVals);
    dataRows.add(newDataRow);
    tbody.html(dataRows.toString());
}

In the above function, we get the number of columns from the header row and the data rows from the tbody element of the table. We build a new tr element with one td per column, each filled with the placeholder value 11. After adding the new row to the dataRows list, we replace the HTML content of tbody with the serialized dataRows.

We can verify row addition using the following test case:

@Test
public void whenTableRowAdded_thenRowCountIncreased() {
    JsoupTableParser jsoParser = new JsoupTableParser();
    Document doc = jsoParser.loadFromFile("Students.html");
    List<Map<String, String>> tableData = jsoParser.parseTable(doc, 0);
    int countBeforeAdd = tableData.size();
    jsoParser.addRowToTable(doc, 0);
    tableData = jsoParser.parseTable(doc, 0);
    assertEquals(countBeforeAdd + 1, tableData.size());
}

We can confirm from the JUnit test case that the addRowToTable operation on the table increases the number of rows in the table by 1. This operation adds a new row at the end of the list.

Similarly, we can add a row at any position by specifying the index while adding it to the row elements collection.
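Inserting a row at a given position can also be done directly on the DOM, without re-serializing the tbody. The sketch below is one possible approach, assuming hypothetical names RowInsertSketch and insertRowAt. It places the new row before the existing row at that position, and appends it at the end when the position is out of range.

```java
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.List;

// Hypothetical helper class for inserting a data row at a position.
class RowInsertSketch {

    // Inserts a new row holding cellValues before the data row at 'position'.
    // Any position outside the existing rows appends the row at the end of tbody.
    static void insertRowAt(Document doc, int tableOrder, int position, List<String> cellValues) {
        Element table = doc.select("table").get(tableOrder);
        Element tbody = table.select("tbody").get(0);
        Elements dataRows = tbody.select("tr");

        // Build the new row: one td per supplied value.
        Element newRow = new Element("tr");
        for (String value : cellValues) {
            newRow.appendElement("td").text(value);
        }

        if (position >= 0 && position < dataRows.size()) {
            // before(...) inserts the new row as the preceding sibling.
            dataRows.get(position).before(newRow);
        } else {
            tbody.appendChild(newRow);
        }
    }
}
```

Working relative to an existing tr element avoids counting whitespace text nodes, which would matter if we inserted by child-node index instead.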

7. Delete the Row From the Table

We can delete a row from the table using the following function:

public void deleteRowFromTable(Document doc, int tableOrder, int rowNumber) {
    Element table = doc.select("table").get(tableOrder);
    Element tbody = table.select("tbody").get(0);
    Elements dataRows = tbody.select("tr");
    if (rowNumber >= 0 && rowNumber < dataRows.size()) {
        dataRows.remove(rowNumber);
    }
}

In the above function, we are getting the tbody element of the table. From tbody, we are getting a list of dataRows. From the list of dataRows, we are deleting the row at the rowNumber position in the table. We can verify row deletion using the following test case:

@Test
public void whenTableRowDeleted_thenRowCountDecreased() {
    JsoupTableParser jsoParser = new JsoupTableParser();
    Document doc = jsoParser.loadFromFile("Students.html");
    List<Map<String, String>> tableData = jsoParser.parseTable(doc, 0);
    int countBeforeDel = tableData.size();
    jsoParser.deleteRowFromTable(doc, 0, 2);
    tableData = jsoParser.parseTable(doc, 0);
    assertEquals(countBeforeDel - 1, tableData.size());
}

The JUnit test case confirms that the deleteRowFromTable operation on the table reduces the number of rows in the table by 1.

Similarly, we can delete a row at any position by specifying the index while removing it from the row elements collection.
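Whether removing an entry from an Elements list also removes it from the document may depend on the jsoup version in use, so it is worth checking the release notes for that version. As an alternative sketch that does not rely on that behavior, we can call remove() on the row element itself, which detaches it from its parent. The class RowDeleteSketch and the method deleteRow are hypothetical names used for illustration.

```java
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Hypothetical helper class for deleting a data row via the DOM node.
class RowDeleteSketch {

    // Deletes the data row at the zero-based rowNumber.
    // Returns false when rowNumber is out of range.
    static boolean deleteRow(Document doc, int tableOrder, int rowNumber) {
        Element table = doc.select("table").get(tableOrder);
        Element tbody = table.select("tbody").get(0);
        Elements dataRows = tbody.select("tr");
        if (rowNumber < 0 || rowNumber >= dataRows.size()) {
            return false;
        }
        // remove() detaches this tr from the document tree.
        dataRows.get(rowNumber).remove();
        return true;
    }
}
```

After a successful call, selecting the rows again from the document returns one row fewer.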

8. Conclusion

In this article, we've seen how to use jsoup to parse HTML tables from HTML documents. We've also seen how to update table cell data and how to change the table structure by adding and deleting rows.

The code backing this article is available on GitHub.