Course – LS – All

Get started with Spring and Spring Boot, through the Learn Spring course:

>> CHECK OUT THE COURSE

1. Overview

XMLUnit 2.x is a powerful library that helps us test and verify XML content, and comes in particularly handy when we know exactly what that XML should contain.

And so we’ll mainly be using XMLUnit inside unit tests to verify that what we have is valid XML, that it contains certain information or conforms to a certain style document.

Additionally, with XMLUnit, we have control over what kind of difference is important to us and which part of the style reference to compare with which part of your comparison XML.

Since we are focusing on XMLUnit 2.x and not XMLUnit 1.x, whenever we use the word XMLUnit, we are strictly referring to 2.x.

Finally, we’ll also be using Hamcrest matchers for assertions, so it’s a good idea to brush up on Hamcrest in case you are not familiar with it.

2. XMLUnit Maven Setup

To use the library in our maven projects, we need to have the following dependencies in pom.xml:

<dependency>
    <groupId>org.xmlunit</groupId>
    <artifactId>xmlunit-core</artifactId>
    <version>2.2.1</version>
</dependency>

The latest version of xmlunit-core can be found by following this link. And:

<dependency>
    <groupId>org.xmlunit</groupId>
    <artifactId>xmlunit-matchers</artifactId>
    <version>2.2.1</version>
</dependency>

The latest version of xmlunit-matchers is available at this link.

3. Comparing XML

3.1. Simple Difference Examples

Let’s assume we have two pieces of XML. They are deemed to be identical when the content and sequence of the nodes in the documents are exactly the same, so the following test will pass:

@Test
public void given2XMLS_whenIdentical_thenCorrect() {
    String controlXml = "<struct><int>3</int><boolean>false</boolean></struct>";
    String testXml = "<struct><int>3</int><boolean>false</boolean></struct>";
    assertThat(testXml, CompareMatcher.isIdenticalTo(controlXml));
}

This next test fails as the two pieces of XML are similar but not identical as their nodes occur in a different sequence:

@Test
public void given2XMLSWithSimilarNodesButDifferentSequence_whenNotIdentical_thenCorrect() {
    String controlXml = "<struct><int>3</int><boolean>false</boolean></struct>";
    String testXml = "<struct><boolean>false</boolean><int>3</int></struct>";
    assertThat(testXml, assertThat(testXml, not(isIdenticalTo(controlXml)));
}

3.2. Detailed Difference Example

Differences between two XML documents above are detected by the Difference Engine.

By default and for efficiency reasons, it stops the comparison process as soon as the first difference is found.

To get all the differences between two pieces of XML we use an instance of the Diff class like so:

@Test
public void given2XMLS_whenGeneratesDifferences_thenCorrect(){
    String controlXml = "<struct><int>3</int><boolean>false</boolean></struct>";
    String testXml = "<struct><boolean>false</boolean><int>3</int></struct>";
    Diff myDiff = DiffBuilder.compare(controlXml).withTest(testXml).build();
    
    Iterator<Difference> iter = myDiff.getDifferences().iterator();
    int size = 0;
    while (iter.hasNext()) {
        iter.next().toString();
        size++;
    }
    assertThat(size, greaterThan(1));
}

If we print the values returned in the while loop, the result is as below:

Expected element tag name 'int' but was 'boolean' - 
  comparing <int...> at /struct[1]/int[1] to <boolean...> 
    at /struct[1]/boolean[1] (DIFFERENT)
Expected text value '3' but was 'false' - 
  comparing <int ...>3</int> at /struct[1]/int[1]/text()[1] to 
    <boolean ...>false</boolean> at /struct[1]/boolean[1]/text()[1] (DIFFERENT)
Expected element tag name 'boolean' but was 'int' - 
  comparing <boolean...> at /struct[1]/boolean[1] 
    to <int...> at /struct[1]/int[1] (DIFFERENT)
Expected text value 'false' but was '3' - 
  comparing <boolean ...>false</boolean> at /struct[1]/boolean[1]/text()[1] 
    to <int ...>3</int> at /struct[1]/int[1]/text()[1] (DIFFERENT)

Each instance describes both the type of difference found between a control node and test node and the detail of those nodes (including the XPath location of each node).

If we want to force the Difference Engine to stop after the first difference is found and not proceed to enumerate further differences – we need to supply a ComparisonController:

@Test
public void given2XMLS_whenGeneratesOneDifference_thenCorrect(){
    String myControlXML = "<struct><int>3</int><boolean>false</boolean></struct>";
    String myTestXML = "<struct><boolean>false</boolean><int>3</int></struct>";
    
    Diff myDiff = DiffBuilder
      .compare(myControlXML)
      .withTest(myTestXML)
      .withComparisonController(ComparisonControllers.StopWhenDifferent)
       .build();
    
    Iterator<Difference> iter = myDiff.getDifferences().iterator();
    int size = 0;
    while (iter.hasNext()) {
        iter.next().toString();
        size++;
    }
    assertThat(size, equalTo(1));
}

The difference message is simpler:

Expected element tag name 'int' but was 'boolean' - 
  comparing <int...> at /struct[1]/int[1] 
    to <boolean...> at /struct[1]/boolean[1] (DIFFERENT)

4. Input Sources

With XMLUnit, we can pick XML data from a variety of sources that may be convenient for our application’s needs. In this case, we use the Input class with its array of static methods.

To pick input from an XML file located in the project root, we do the following:

@Test
public void givenFileSource_whenAbleToInput_thenCorrect() {
    ClassLoader classLoader = getClass().getClassLoader();
    String testPath = classLoader.getResource("test.xml").getPath();
    String controlPath = classLoader.getResource("control.xml").getPath();
    
    assertThat(
      Input.fromFile(testPath), isSimilarTo(Input.fromFile(controlPath)));
}

To pick an input source from an XML string, like so:

@Test
public void givenStringSource_whenAbleToInput_thenCorrect() {
    String controlXml = "<struct><int>3</int><boolean>false</boolean></struct>";
    String testXml = "<struct><int>3</int><boolean>false</boolean></struct>";
    
    assertThat(
      Input.fromString(testXml),isSimilarTo(Input.fromString(controlXml)));
}

Let’s now use a stream as the input:

@Test
public void givenStreamAsSource_whenAbleToInput_thenCorrect() {
    assertThat(Input.fromStream(XMLUnitTests.class
      .getResourceAsStream("/test.xml")),
        isSimilarTo(
          Input.fromStream(XMLUnitTests.class
            .getResourceAsStream("/control.xml"))));
}

We could also use Input.from(Object) where we pass in any valid source to be resolved by XMLUnit.

For example, we can pass a file in:

@Test
public void givenFileSourceAsObject_whenAbleToInput_thenCorrect() {
    ClassLoader classLoader = getClass().getClassLoader();
    
    assertThat(
      Input.from(new File(classLoader.getResource("test.xml").getFile())), 
      isSimilarTo(Input.from(new File(classLoader.getResource("control.xml").getFile()))));
}

Or a String:

@Test
public void givenStringSourceAsObject_whenAbleToInput_thenCorrect() {
    assertThat(
      Input.from("<struct><int>3</int><boolean>false</boolean></struct>"),
      isSimilarTo(Input.from("<struct><int>3</int><boolean>false</boolean></struct>")));
}

Or a Stream:

@Test
public void givenStreamAsObject_whenAbleToInput_thenCorrect() {
    assertThat(
      Input.from(XMLUnitTest.class.getResourceAsStream("/test.xml")), 
      isSimilarTo(Input.from(XMLUnitTest.class.getResourceAsStream("/control.xml"))));
}

and they will all be resolved.

5. Comparing Specific Nodes

In section 2 above, we only looked at identical XML because similar XML needs a little bit of customization using features from xmlunit-core library:

@Test
public void given2XMLS_whenSimilar_thenCorrect() {
    String controlXml = "<struct><int>3</int><boolean>false</boolean></struct>";
    String testXml = "<struct><boolean>false</boolean><int>3</int></struct>";
    
    assertThat(testXml, isSimilarTo(controlXml));
}

The above test should pass since the XMLs have similar nodes, however, it fails. This is because XMLUnit compares control and test nodes at the same depth relative to the root node.

So an isSimilarTo condition is a little bit more interesting to test than an isIdenticalTo condition. The node <int>3</int> in controlXml will be compared with <boolean>false</boolean> in testXml, automatically giving failure message:

java.lang.AssertionError: 
Expected: Expected element tag name 'int' but was 'boolean' - 
  comparing <int...> at /struct[1]/int[1] to <boolean...> at /struct[1]/boolean[1]:
<int>3</int>
   but: result was: 
<boolean>false</boolean>

This is where the DefaultNodeMatcher and ElementSelector classes of XMLUnit come in handy

The DefaultNodeMatcher class is consulted by XMLUnit at comparison stage as it loops over nodes of controlXml, to determine which XML node from testXml to compare with the current XML node it encounters in controlXml.

Before that, DefaultNodeMatcher will have already consulted ElementSelector to decide how to match nodes.

Our test has failed because in the default state, XMLUnit will use a depth-first approach to traversing the XMLs and based on document order to match nodes, hence <int> is matched with <boolean>.

Let’s tweak our test so that it passes:

@Test
public void given2XMLS_whenSimilar_thenCorrect() {
    String controlXml = "<struct><int>3</int><boolean>false</boolean></struct>";
    String testXml = "<struct><boolean>false</boolean><int>3</int></struct>";
    
    assertThat(testXml, 
      isSimilarTo(controlXml).withNodeMatcher(
      new DefaultNodeMatcher(ElementSelectors.byName)));
}

In this case, we are telling DefaultNodeMatcher that when XMLUnit asks for a node to compare, you should have sorted and matched the nodes by their element names already.

The initial failed example was similar to passing ElementSelectors.Default to DefaultNodeMatcher.

Alternatively, we could have used a Diff from xmlunit-core rather than using xmlunit-matchers:

@Test
public void given2XMLs_whenSimilarWithDiff_thenCorrect() throws Exception {
    String myControlXML = "<struct><int>3</int><boolean>false</boolean></struct>";
    String myTestXML = "<struct><boolean>false</boolean><int>3</int></struct>";
    Diff myDiffSimilar = DiffBuilder.compare(myControlXML).withTest(myTestXML)
      .withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byName))
      .checkForSimilar().build();
    
    assertFalse("XML similar " + myDiffSimilar.toString(),
      myDiffSimilar.hasDifferences());
}

6. Custom DifferenceEvaluator

A DifferenceEvaluator makes determinations of the outcome of a comparison. Its role is restricted to determining the severity of a comparison’s outcome.

It’s the class that decides whether two XML pieces are identical, similar or different.

Consider the following XML pieces:

<a>
    <b attr="abc">
    </b>
</a>

and:

<a>
    <b attr="xyz">
    </b>
</a>

In the default state, they are technically evaluated as different because their attr attributes have different values. Let’s take a look at a test:

@Test
public void given2XMLsWithDifferences_whenTestsDifferentWithoutDifferenceEvaluator_thenCorrect(){
    final String control = "<a><b attr=\"abc\"></b></a>";
    final String test = "<a><b attr=\"xyz\"></b></a>";
    Diff myDiff = DiffBuilder.compare(control).withTest(test)
      .checkForSimilar().build();
    assertFalse(myDiff.toString(), myDiff.hasDifferences());
}

Failure message:

java.lang.AssertionError: Expected attribute value 'abc' but was 'xyz' - 
  comparing <b attr="abc"...> at /a[1]/b[1]/@attr 
  to <b attr="xyz"...> at /a[1]/b[1]/@attr

If we don’t really care about the attribute, we can change the behavior of DifferenceEvaluator to ignore it. We do this by creating our own:

public class IgnoreAttributeDifferenceEvaluator implements DifferenceEvaluator {
    private String attributeName;
    public IgnoreAttributeDifferenceEvaluator(String attributeName) {
        this.attributeName = attributeName;
    }
    
    @Override
    public ComparisonResult evaluate(Comparison comparison, ComparisonResult outcome) {
        if (outcome == ComparisonResult.EQUAL)
            return outcome;
        final Node controlNode = comparison.getControlDetails().getTarget();
        if (controlNode instanceof Attr) {
            Attr attr = (Attr) controlNode;
            if (attr.getName().equals(attributeName)) {
                return ComparisonResult.SIMILAR;
            }
        }
        return outcome;
    }
}

We then rewrite our initial failed test and supply our own DifferenceEvaluator instance, like so:

@Test
public void given2XMLsWithDifferences_whenTestsSimilarWithDifferenceEvaluator_thenCorrect() {
    final String control = "<a><b attr=\"abc\"></b></a>";
    final String test = "<a><b attr=\"xyz\"></b></a>";
    Diff myDiff = DiffBuilder.compare(control).withTest(test)
      .withDifferenceEvaluator(new IgnoreAttributeDifferenceEvaluator("attr"))
      .checkForSimilar().build();
    
    assertFalse(myDiff.toString(), myDiff.hasDifferences());
}

This time it passes.

7. Validation

XMLUnit performs XML validation using the Validator class. You create an instance of it using the forLanguage factory method while passing in the schema to use in validation.

The schema is passed in as a URI leading to its location, XMLUnit abstracts the schema locations it supports in the Languages class as constants.

We typically create an instance of Validator class like so:

Validator v = Validator.forLanguage(Languages.W3C_XML_SCHEMA_NS_URI);

After this step, if we have our own XSD file to validate against our XML, we simply specify its source and then call Validator‘s validateInstance method with our XML file source.

Take for example our students.xsd:

<?xml version = "1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name='class'>
        <xs:complexType>
            <xs:sequence>
                <xs:element name='student' type='StudentObject'
                   minOccurs='0' maxOccurs='unbounded' />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <xs:complexType name="StudentObject">
        <xs:sequence>
            <xs:element name="name" type="xs:string" />
            <xs:element name="age" type="xs:positiveInteger" />
        </xs:sequence>
        <xs:attribute name='id' type='xs:positiveInteger' />
    </xs:complexType>
</xs:schema>

And students.xml:

<?xml version = "1.0"?>
<class>
    <student id="393">
        <name>Rajiv</name>
        <age>18</age>
    </student>
    <student id="493">
        <name>Candie</name>
        <age>19</age>
    </student>
</class>

Let’s then run a test:

@Test
public void givenXml_whenValidatesAgainstXsd_thenCorrect() {
    Validator v = Validator.forLanguage(Languages.W3C_XML_SCHEMA_NS_URI);
    v.setSchemaSource(Input.fromStream(
      XMLUnitTests.class.getResourceAsStream("/students.xsd")).build());
    ValidationResult r = v.validateInstance(Input.fromStream(
      XMLUnitTests.class.getResourceAsStream("/students.xml")).build());
    Iterator<ValidationProblem> probs = r.getProblems().iterator();
    while (probs.hasNext()) {
        probs.next().toString();
    }
    assertTrue(r.isValid());
}

The result of the validation is an instance of ValidationResult which contains a boolean flag indicating whether the document has been validated successfully.

The ValidationResult also contains an Iterable with ValidationProblems in case there is a failure. Let’s create a new XML with errors called students_with_error.xml. Instead of <student>, our starting tags are all </studet>:

<?xml version = "1.0"?>
<class>
    <studet id="393">
        <name>Rajiv</name>
        <age>18</age>
    </student>
    <studet id="493">
        <name>Candie</name>
        <age>19</age>
    </student>
</class>

Then run this test against it:

@Test
public void givenXmlWithErrors_whenReturnsValidationProblems_thenCorrect() {
    Validator v = Validator.forLanguage(Languages.W3C_XML_SCHEMA_NS_URI);
    v.setSchemaSource(Input.fromStream(
       XMLUnitTests.class.getResourceAsStream("/students.xsd")).build());
    ValidationResult r = v.validateInstance(Input.fromStream(
      XMLUnitTests.class.getResourceAsStream("/students_with_error.xml")).build());
    Iterator<ValidationProblem> probs = r.getProblems().iterator();
    int count = 0;
    while (probs.hasNext()) {
        count++;
        probs.next().toString();
    }
    assertTrue(count > 0);
}

If we were to print the errors in the while loop, they would look like:

ValidationProblem { line=3, column=19, type=ERROR,message='cvc-complex-type.2.4.a: 
  Invalid content was found starting with element 'studet'. 
    One of '{student}' is expected.' }
ValidationProblem { line=6, column=4, type=ERROR, message='The element type "studet" 
  must be terminated by the matching end-tag "</studet>".' }
ValidationProblem { line=6, column=4, type=ERROR, message='The element type "studet" 
  must be terminated by the matching end-tag "</studet>".' }

8. XPath

When an XPath expression is evaluated against a piece of XML a NodeList is created that contains the matching Nodes.

Consider this piece of XML saved in a file called teachers.xml:

<teachers>
    <teacher department="science" id='309'>
        <subject>math</subject>
        <subject>physics</subject>
    </teacher>
    <teacher department="arts" id='310'>
        <subject>political education</subject>
        <subject>english</subject>
    </teacher>
</teachers>

XMLUnit offers a number of XPath related assertion methods, as demonstrated below.

We can retrieve all the nodes called teacher and perform assertions on them individually:

@Test
public void givenXPath_whenAbleToRetrieveNodes_thenCorrect() {
    Iterable<Node> i = new JAXPXPathEngine()
      .selectNodes("//teacher", Input.fromFile(new File("teachers.xml")).build());
    assertNotNull(i);
    int count = 0;
    for (Iterator<Node> it = i.iterator(); it.hasNext();) {
        count++;
        Node node = it.next();
        assertEquals("teacher", node.getNodeName());
        
        NamedNodeMap map = node.getAttributes();
        assertEquals("department", map.item(0).getNodeName());
        assertEquals("id", map.item(1).getNodeName());
        assertEquals("teacher", node.getNodeName());
    }
    assertEquals(2, count);
}

Notice how we validate the number of child nodes, the name of each node and the attributes in each node. Many more options are available after retrieving the Node.

To verify that a path exists, we can do the following:

@Test
public void givenXmlSource_whenAbleToValidateExistingXPath_thenCorrect() {
    assertThat(Input.fromFile(new File("teachers.xml")), hasXPath("//teachers"));
    assertThat(Input.fromFile(new File("teachers.xml")), hasXPath("//teacher"));
    assertThat(Input.fromFile(new File("teachers.xml")), hasXPath("//subject"));
    assertThat(Input.fromFile(new File("teachers.xml")), hasXPath("//@department"));
}

To verify that a path does not exist, this is what we can do:

@Test
public void givenXmlSource_whenFailsToValidateInExistentXPath_thenCorrect() {
    assertThat(Input.fromFile(new File("teachers.xml")), not(hasXPath("//sujet")));
}

XPaths are especially useful where a document is made up largely of known, unchanging content with only a small amount of changing content created by the system.

9. Conclusion

In this tutorial, we have introduced most of the basic features of XMLUnit 2.x and how to use them to validate XML documents in our applications.

The full implementation of all these examples and code snippets can be found in the XMLUnit GitHub project.

Course – LS – All

Get started with Spring and Spring Boot, through the Learn Spring course:

>> CHECK OUT THE COURSE
res – REST with Spring (eBook) (everywhere)
Comments are closed on this article!