DOM parsing with Xerces

1. Overview

In this tutorial, we’ll discuss how to parse an XML document into a DOM tree with Apache Xerces, a mature and established library for parsing and manipulating XML.

There are multiple options to parse an XML document; we’ll focus on DOM parsing in this article. The DOM parser loads a document and creates an entire hierarchical tree in memory.

For an overview of XML library support in Java, check out our previous article.

2. Our Document

Let’s start with the XML document we’re going to use in our example:

<?xml version="1.0"?>
<tutorials>
    <tutorial tutId="01" type="java">
        <title>Guava</title>
        <description>Introduction to Guava</description>
        <date>04/04/2016</date>
        <author>GuavaAuthor</author>
    </tutorial>
...
</tutorials>

Note that our document has a root node called “tutorials” with 4 “tutorial” child nodes. Each of these has 2 attributes: “tutId” and “type”. Also, each “tutorial” has 4 child nodes: “title”, “description”, “date” and “author”.

Now we can continue with parsing this document.

3. Loading XML File

First, we should note that a version of Apache Xerces ships with the JDK as the default parser behind the standard javax.xml.parsers API, so we don’t need any additional setup.

Let’s jump right into loading our XML file:

DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document doc = builder.parse(new File("src/test/resources/example_jdom.xml"));
doc.getDocumentElement().normalize();

In the example above, we first obtain an instance of the DocumentBuilder class, then use the parse() method on the XML document to get a Document object representing it.

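We also call normalize() on the root element. It merges adjacent text nodes and removes empty ones, so each run of text is represented by a single Text node. It doesn’t remove the whitespace-only text nodes that indentation and line breaks produce between elements.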
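To make the effect of parsing and normalize() visible, here is a minimal, self-contained sketch. It parses an inline XML string rather than the file used above, and the class and helper names (NormalizeDemo, parse, countChildren) are only illustrative. It counts the element and text children of the root to show that the whitespace between elements is still present as text nodes after normalize():

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class NormalizeDemo {

    // Parses an XML string into a normalized DOM Document.
    // Checked exceptions are wrapped to keep the sketch short.
    public static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
              new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            doc.getDocumentElement().normalize();
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Counts the direct children of parent that have the given node type.
    public static int countChildren(Node parent, short nodeType) {
        NodeList children = parent.getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == nodeType) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String xml = "<tutorials>\n"
          + "  <tutorial tutId=\"01\" type=\"java\">\n"
          + "    <title>Guava</title>\n"
          + "  </tutorial>\n"
          + "</tutorials>";
        Node root = parse(xml).getDocumentElement();

        // One <tutorial> element, surrounded by two whitespace-only text nodes
        System.out.println("elements: " + countChildren(root, Node.ELEMENT_NODE));   // prints "elements: 1"
        System.out.println("text nodes: " + countChildren(root, Node.TEXT_NODE));    // prints "text nodes: 2"
    }
}
```

This is why code that walks child nodes usually checks the node type before treating a child as an element.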

4. Parsing the DOM

Now, let’s explore our XML file.

Let’s start by retrieving all elements with tag “tutorial”. We can do this using the getElementsByTagName() method, which will return a NodeList:

@Test
public void whenGetElementByTag_thenSuccess() {
    NodeList nodeList = doc.getElementsByTagName("tutorial");
    Node first = nodeList.item(0);

    assertEquals(4, nodeList.getLength());
    assertEquals(Node.ELEMENT_NODE, first.getNodeType());
    assertEquals("tutorial", first.getNodeName());        
}

It’s important to note that Node is the primary datatype for DOM components: elements, attributes, and text are all considered nodes.

Next, let’s see how we can get the first element’s attributes using getAttributes():

@Test
public void whenGetFirstElementAttributes_thenSuccess() {
    Node first = doc.getElementsByTagName("tutorial").item(0);
    NamedNodeMap attrList = first.getAttributes();

    assertEquals(2, attrList.getLength());
    
    assertEquals("tutId", attrList.item(0).getNodeName());
    assertEquals("01", attrList.item(0).getNodeValue());
    
    assertEquals("type", attrList.item(1).getNodeName());
    assertEquals("java", attrList.item(1).getNodeValue());
}

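Here, we get the NamedNodeMap object, then use the item(index) method to retrieve each attribute node.

For each attribute node, getNodeName() and getNodeValue() return its name and value. Note that the DOM specification doesn’t guarantee any particular order for the items of a NamedNodeMap, so index-based access like this relies on the parser’s behavior.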
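When we already know the attribute name, looking it up with getNamedItem() avoids depending on the position of the attribute in the NamedNodeMap. The following is a small sketch under that assumption; the class and method names (AttributeLookupDemo, attributeValue) are hypothetical, and it parses an inline string so that it runs on its own:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class AttributeLookupDemo {

    // Returns the value of attrName on the first <tagName> element,
    // or null if the element or the attribute doesn't exist.
    public static String attributeValue(String xml, String tagName, String attrName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
              .newDocumentBuilder()
              .parse(new InputSource(new StringReader(xml)));

            Node element = doc.getElementsByTagName(tagName).item(0);
            if (element == null) {
                return null;
            }

            // getNamedItem() finds the attribute by name, regardless of its index
            Node attribute = element.getAttributes().getNamedItem(attrName);
            return attribute == null ? null : attribute.getNodeValue();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<tutorials><tutorial tutId=\"01\" type=\"java\"/></tutorials>";
        System.out.println(attributeValue(xml, "tutorial", "type"));   // prints "java"
        System.out.println(attributeValue(xml, "tutorial", "tutId"));  // prints "01"
    }
}
```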

5. Traversing Nodes

Next, let’s see how to traverse DOM nodes.

In the following test, we’ll traverse the first element’s child nodes and print their content:

@Test
public void whenTraverseChildNodes_thenSuccess() {
    Node first = doc.getElementsByTagName("tutorial").item(0);
    NodeList nodeList = first.getChildNodes();
    int n = nodeList.getLength();
    Node current;
    for (int i = 0; i < n; i++) {
        current = nodeList.item(i);
        if (current.getNodeType() == Node.ELEMENT_NODE) {
            System.out.println(
              current.getNodeName() + ": " + current.getTextContent());
        }
    }
}

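First, we get the NodeList using the getChildNodes() method, then iterate through it and print the name and text content of each element node. The ELEMENT_NODE check skips the whitespace-only text nodes that sit between the child elements.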

The output will show the contents of the first “tutorial” element in our document:

title: Guava
description: Introduction to Guava
date: 04/04/2016
author: GuavaAuthor
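If we only need one specific child rather than a full traversal, Element also offers getElementsByTagName(), which searches all descendants of that element (not just its direct children). Here is a minimal sketch of that idea; ChildLookupDemo and childText are illustrative names, and the XML is an inline copy of the first “tutorial” element:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class ChildLookupDemo {

    // Parses an XML string; checked exceptions are wrapped to keep the sketch short.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
              .newDocumentBuilder()
              .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns the text of the first descendant of parent with the given tag,
    // or null if there is no such descendant.
    public static String childText(Element parent, String tagName) {
        Node match = parent.getElementsByTagName(tagName).item(0);
        return match == null ? null : match.getTextContent();
    }

    public static void main(String[] args) {
        String xml = "<tutorials>"
          + "<tutorial tutId=\"01\" type=\"java\">"
          + "<title>Guava</title>"
          + "<description>Introduction to Guava</description>"
          + "</tutorial>"
          + "</tutorials>";
        Document doc = parse(xml);
        Element first = (Element) doc.getElementsByTagName("tutorial").item(0);

        System.out.println(childText(first, "title"));        // prints "Guava"
        System.out.println(childText(first, "description"));  // prints "Introduction to Guava"
    }
}
```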

6. Modifying the DOM

We can also make changes to the DOM.

As an example, let’s change the value of the type attribute from “java” to “other”:

@Test
public void whenModifyDocument_thenModified() {
    NodeList nodeList = doc.getElementsByTagName("tutorial");
    Element first = (Element) nodeList.item(0);

    assertEquals("java", first.getAttribute("type")); 
    
    first.setAttribute("type", "other");
    assertEquals("other", first.getAttribute("type"));     
}

Here, changing the attribute value is a simple matter of calling an Element’s setAttribute() method.
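Other kinds of changes follow the same pattern. As a rough sketch, the snippet below changes an element’s text with setTextContent(), removes an attribute with removeAttribute(), and appends a new child element. The class name ModifyDemo, the method modifyFirstTutorial, the “tag” element, and the new values are all made up for illustration and aren’t part of the example document:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class ModifyDemo {

    // Parses an XML string; checked exceptions are wrapped to keep the sketch short.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
              .newDocumentBuilder()
              .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Applies three in-memory changes to the first <tutorial> element and returns it.
    public static Element modifyFirstTutorial(Document doc) {
        Element first = (Element) doc.getElementsByTagName("tutorial").item(0);

        // Replace the text inside <title>
        first.getElementsByTagName("title").item(0).setTextContent("Guava Basics");

        // Drop the "type" attribute entirely
        first.removeAttribute("type");

        // Append a new, hypothetical <tag> child element
        Element tag = doc.createElement("tag");
        tag.setTextContent("collections");
        first.appendChild(tag);

        return first;
    }

    public static void main(String[] args) {
        Document doc = parse(
          "<tutorials><tutorial tutId=\"01\" type=\"java\"><title>Guava</title></tutorial></tutorials>");
        Element first = modifyFirstTutorial(doc);

        System.out.println(first.hasAttribute("type"));                               // prints "false"
        System.out.println(first.getElementsByTagName("title").item(0).getTextContent()); // prints "Guava Basics"
        System.out.println(first.getElementsByTagName("tag").getLength());            // prints "1"
    }
}
```

All of these changes affect only the in-memory tree; the source file stays untouched until the document is written out again.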

7. Creating a New Document

Besides modifying the DOM, we can also create new XML documents from scratch.

Let’s first have a look at the file we want to create:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<users>
    <user id="1">
        <email>[email protected]</email>
    </user>
</users>

Our XML contains a users root node with one user element that also has a child node email.

To achieve this, we first have to call the DocumentBuilder’s newDocument() method, which returns a Document object.

Then, we’ll call the createElement() method of the new object:

@Test
public void whenCreateNewDocument_thenCreated() throws Exception {
    Document newDoc = builder.newDocument();
    Element root = newDoc.createElement("users");
    newDoc.appendChild(root);

    Element first = newDoc.createElement("user");
    root.appendChild(first);
    first.setAttribute("id", "1");

    Element email = newDoc.createElement("email");
    email.appendChild(newDoc.createTextNode("[email protected]"));
    first.appendChild(email);

    assertEquals(1, newDoc.getChildNodes().getLength());
    assertEquals("users", newDoc.getChildNodes().item(0).getNodeName());
}

Note that createElement() only creates a node. To attach each element to the DOM, we also call appendChild() on its intended parent.

8. Saving a Document

After modifying our document or creating one from scratch, we’ll need to save it to a file.

We’ll start with creating a DOMSource object, then use a simple Transformer to save the document in a file:

private void saveDomToFile(Document document, String fileName) 
  throws Exception {

    DOMSource dom = new DOMSource(document);
    Transformer transformer = TransformerFactory.newInstance()
      .newTransformer();

    StreamResult result = new StreamResult(new File(fileName));
    transformer.transform(dom, result);
}

Similarly, we can print our document to the console:

private void printDom(Document document) throws Exception {
    DOMSource dom = new DOMSource(document);
    Transformer transformer = TransformerFactory.newInstance()
        .newTransformer();

    transformer.transform(dom, new StreamResult(System.out));
}
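The same Transformer approach also works for producing a String, which can be handy in tests or logs. The sketch below writes into a StringWriter and uses the standard OutputKeys properties to omit the XML declaration and optionally indent the output. DomToStringDemo, buildUsers, and toXmlString are illustrative names, and the email value is a placeholder rather than the one from the example file:

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomToStringDemo {

    // Builds a small <users> document with one <user id="1"> and an <email> child.
    public static Document buildUsers(String emailValue) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("users");
            doc.appendChild(root);

            Element user = doc.createElement("user");
            user.setAttribute("id", "1");
            root.appendChild(user);

            Element email = doc.createElement("email");
            email.appendChild(doc.createTextNode(emailValue));
            user.appendChild(email);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create DocumentBuilder", e);
        }
    }

    // Serializes the document to a String, without the XML declaration.
    public static String toXmlString(Document doc, boolean indent) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.setOutputProperty(OutputKeys.INDENT, indent ? "yes" : "no");

            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not serialize document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = buildUsers("user@example.com"); // placeholder value
        System.out.println(toXmlString(doc, false));
        System.out.println(toXmlString(doc, true));
    }
}
```

The exact whitespace of the indented output depends on the JDK’s transformer implementation, so it’s safer not to assert on it character by character.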

9. Conclusion

In this quick article, we learned how to use the Xerces DOM parser to load, traverse, modify, create and save an XML document.

As always, the full source code for the examples is available over on GitHub.
