1. Introduction

Reading and changing attributes is a common task when we work with XML. In this tutorial, we’ll explore how to modify an XML attribute using Java.

2. Dependencies

In order to run our tests, we’ll need to add the JUnit and xmlunit-assertj dependencies to our Maven project:

<dependency>
    <groupId>org.junit.jupiter</groupId>
    <artifactId>junit-jupiter</artifactId>
    <version>5.8.1</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.xmlunit</groupId>
    <artifactId>xmlunit-assertj</artifactId>
    <version>2.6.3</version>
    <scope>test</scope>
</dependency>

3. Using JAXP

Let’s start with an XML document:

<?xml version="1.0" encoding="UTF-8"?>
<notification id="5">
    <to customer="true">[email protected]</to>
    <from>[email protected]</from>
</notification>

In order to process it, we’ll use the Java API for XML Processing (JAXP), which has been bundled with Java since version 1.4.

Let’s modify the customer attribute and change its value to false.

First, we need to build a Document object from the XML file, and to do that, we’ll use a DocumentBuilderFactory:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
Document input = factory
  .newDocumentBuilder()
  .parse(resourcePath);

Note that to protect the DocumentBuilderFactory against XML External Entity (XXE) attacks, we enable the XMLConstants.FEATURE_SECURE_PROCESSING and http://apache.org/xml/features/disallow-doctype-decl features. It’s good practice to set them whenever we parse untrusted XML files.
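To see what this hardening does in practice, here is a minimal sketch that feeds a document with a DOCTYPE declaration to a parser configured as above. The class name SecureParsingSketch and the helper isRejected are hypothetical names chosen for illustration, and the XML is parsed from an in-memory string rather than from resourcePath:

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SecureParsingSketch {

    // Hypothetical helper: returns true if the hardened parser refuses the given XML.
    static boolean isRejected(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

        DocumentBuilder builder = factory.newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing them to stderr
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            // The DOCTYPE declaration is refused before any entity is resolved
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        String withDoctype = "<!DOCTYPE notification ["
          + "<!ENTITY xxe SYSTEM \"file:///etc/hostname\">]>"
          + "<notification>&xxe;</notification>";
        String plain = "<notification id=\"5\"/>";

        System.out.println(isRejected(withDoctype)); // true
        System.out.println(isRejected(plain));       // false
    }
}
```

Since the parser fails as soon as it meets the DOCTYPE declaration, the external entity is never read.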

After initializing our input object, we’ll need to locate the node with the attribute we’d like to change. Let’s use an XPath expression to select it:

XPath xpath = XPathFactory
  .newInstance()
  .newXPath();
String expr = String.format("//*[contains(@%s, '%s')]", attribute, oldValue);
NodeList nodes = (NodeList) xpath.evaluate(expr, input, XPathConstants.NODESET);

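In this case, the XPath evaluate method returns a NodeList containing the matched nodes. Because contains() performs a substring match, the expression also selects attributes whose value merely includes oldValue.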
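The difference between a substring match and an exact match can be illustrated with a small sketch. The cc element and its customer="true-pending" value are hypothetical additions to the sample document, and XPathMatchSketch and countMatches are names made up for this example:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathMatchSketch {

    // Hypothetical variation of the sample document with a second customer attribute
    static final String XML = "<notification id=\"5\">"
      + "<to customer=\"true\">recipient</to>"
      + "<cc customer=\"true-pending\">copy</cc>"
      + "<from>sender</from>"
      + "</notification>";

    // Counts the nodes selected by the given XPath expression
    static int countMatches(String expr) throws Exception {
        Document doc = DocumentBuilderFactory
          .newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Substring match: selects both <to> and <cc>
        System.out.println(countMatches("//*[contains(@customer, 'true')]")); // 2
        // Exact match: selects only <to>
        System.out.println(countMatches("//*[@customer='true']"));            // 1
    }
}
```

If only exact values should be changed, an expression of the form //*[@customer='true'] may be the safer choice.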

Let’s iterate over the list to change the value:

for (int i = 0; i < nodes.getLength(); i++) {
    Element value = (Element) nodes.item(i);
    value.setAttribute(attribute, newValue);
}

Or, instead of a for loop, we can use an IntStream:

IntStream
    .range(0, nodes.getLength())
    .mapToObj(i -> (Element) nodes.item(i))
    .forEach(value -> value.setAttribute(attribute, newValue));

Now, let’s use a Transformer object to apply the changes:

TransformerFactory transformerFactory = TransformerFactory.newInstance();
transformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
Transformer xformer = transformerFactory.newTransformer();
xformer.setOutputProperty(OutputKeys.INDENT, "yes");
Writer output = new StringWriter();
xformer.transform(new DOMSource(input), new StreamResult(output));

If we print the output object content, we’ll get the resulting XML with the customer attribute modified:

<?xml version="1.0" encoding="UTF-8"?>
<notification id="5">
    <to customer="false">[email protected]</to>
    <from>[email protected]</from>
</notification>
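Putting the JAXP steps together, a self-contained sketch might look like the following. It parses the XML from a string instead of a file, and the class name ModifyAttributeSketch, the method signature, and the placeholder element text are assumptions made for illustration rather than the article's actual implementation:

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ModifyAttributeSketch {

    static String modifyAttribute(String xml, String attribute, String oldValue, String newValue)
      throws Exception {
        // 1. Build the Document with the XXE-related features enabled
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document input = factory
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));

        // 2. Locate the elements whose attribute contains the old value
        XPath xpath = XPathFactory.newInstance().newXPath();
        String expr = String.format("//*[contains(@%s, '%s')]", attribute, oldValue);
        NodeList nodes = (NodeList) xpath.evaluate(expr, input, XPathConstants.NODESET);

        // 3. Overwrite the attribute on every match
        for (int i = 0; i < nodes.getLength(); i++) {
            Element element = (Element) nodes.item(i);
            element.setAttribute(attribute, newValue);
        }

        // 4. Serialize the modified DOM tree back to a String
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        transformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        Transformer xformer = transformerFactory.newTransformer();
        xformer.setOutputProperty(OutputKeys.INDENT, "yes");
        Writer output = new StringWriter();
        xformer.transform(new DOMSource(input), new StreamResult(output));
        return output.toString();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder element text stands in for the addresses in the sample file
        String xml = "<notification id=\"5\">"
          + "<to customer=\"true\">recipient</to>"
          + "<from>sender</from>"
          + "</notification>";
        System.out.println(modifyAttribute(xml, "customer", "true", "false"));
    }
}
```

The exact indentation of the printed document depends on the Transformer implementation, so a test is better off asserting on the attribute value than on the full output string.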

Also, we can use the assertThat method of XMLUnit if we need to verify it in a unit test:

assertThat(output.toString()).hasXPath("//*[contains(@customer, 'false')]");

4. Using dom4j

dom4j is an open-source framework for processing XML that is integrated with XPath and fully supports DOM, SAX, JAXP, and Java Collections.

4.1. Maven Dependency

We need to add the dom4j and jaxen dependencies to our pom.xml to use dom4j in our project:

<dependency>
    <groupId>org.dom4j</groupId>
    <artifactId>dom4j</artifactId>
    <version>2.1.1</version>
</dependency>
<dependency>
    <groupId>jaxen</groupId>
    <artifactId>jaxen</artifactId>
    <version>1.2.0</version>
</dependency>

We can learn more about dom4j in our XML Libraries Support article.

4.2. Using org.dom4j.Element.addAttribute

dom4j offers the Element interface as an abstraction for an XML element. We’ll be using the addAttribute method to update our customer attribute.

Let’s see how this works.

First, we need to build a Document object from the XML file — this time, we’ll use a SAXReader:

SAXReader xmlReader = new SAXReader();
xmlReader.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
xmlReader.setFeature("http://xml.org/sax/features/external-general-entities", false);
xmlReader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
Document input = xmlReader.read(resourcePath);

We set these features before reading the file in order to prevent XXE attacks.

Like JAXP, we can use an XPath expression to select the nodes:

String expr = String.format("//*[contains(@%s, '%s')]", attribute, oldValue);
XPath xpath = DocumentHelper.createXPath(expr);
List<Node> nodes = xpath.selectNodes(input);

Now, we can iterate and update the attribute:

for (int i = 0; i < nodes.size(); i++) {
    Element element = (Element) nodes.get(i);
    element.addAttribute(attribute, newValue);
}

Note that with this method, if an attribute already exists for the given name, it will be replaced. Otherwise, it’ll be added.

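To print the result, we can reuse the Transformer code from the previous JAXP section. Since a dom4j Document isn’t an org.w3c.dom node, we wrap it in an org.dom4j.io.DocumentSource instead of a DOMSource.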

5. Using jOOX

jOOX (jOOX Object-Oriented XML) is a wrapper for the org.w3c.dom package that allows for fluent XML document creation and manipulation where DOM is required but too verbose. jOOX only wraps the underlying document and can be used to enhance DOM, not as an alternative.

5.1. Maven Dependency

We need to add the dependency to our pom.xml to use jOOX in our project.

For use with Java 9+, we can use:

<dependency>
    <groupId>org.jooq</groupId>
    <artifactId>joox</artifactId>
    <version>1.6.2</version>
</dependency>

Or with Java 6+, we have:

<dependency>
    <groupId>org.jooq</groupId>
    <artifactId>joox-java-6</artifactId>
    <version>1.6.2</version>
</dependency>

We can find the latest versions of joox and joox-java-6 in the Maven Central repository.

5.2. Using org.w3c.dom.Element.setAttribute

The jOOX API itself is inspired by jQuery, as we can see in the examples below. Let’s see how to use it.

First, we need to load the Document:

DocumentBuilder builder = JOOX.builder();
Document input = builder.parse(resourcePath);

Now, we wrap the document in a jOOX Match object:

Match $ = $(input);

In order to select the to element that carries the customer attribute, we can use the find method or an XPath expression. In both cases, we’ll get a list of the matching elements.

Let’s see the find method in action:

$.find("to")
    .get()
    .stream()
    .forEach(e -> e.setAttribute(attribute, newValue));

To get the result as a String, we simply need to call the toString() method:

$.toString();

6. Benchmark

In order to compare the performance of these libraries, we used a JMH benchmark.

Let’s see the results:

| Benchmark                         | Mode | Cnt | Score | Error   | Units |
|-----------------------------------|------|-----|-------|---------|-------|
| AttributeBenchMark.dom4jBenchmark | avgt | 5   | 0.150 | ± 0.003 | ms/op |
| AttributeBenchMark.jaxpBenchmark  | avgt | 5   | 0.166 | ± 0.003 | ms/op |
| AttributeBenchMark.jooxBenchmark  | avgt | 5   | 0.230 | ± 0.033 | ms/op |

As we can see, for this use case and our implementation, dom4j and JAXP take less time per operation than jOOX, with dom4j slightly ahead of JAXP.

7. Conclusion

In this quick tutorial, we’ve introduced how to modify XML attributes using JAXP, dom4j, and jOOX. Also, we measured the performance of these libraries with a JMH benchmark.

As usual, all the code samples shown here are available over on GitHub.
