1. Overview

In this tutorial, we’ll demonstrate how to validate an XML file against an XSD file.

2. Definition of an XML and Two XSD Files

Let’s consider the following XML file, baeldung.xml, which contains a name and an address. The address is itself made up of a zip code and a city:

<?xml version="1.0" encoding="UTF-8" ?>
<individual>
    <name>Baeldung</name>
    <address>
        <zip>00001</zip>
        <city>New York</city>
    </address>
</individual>

The content of baeldung.xml exactly matches the structure described by the person.xsd file:

<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="individual">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="name" type="xs:string" />
                <xs:element name="address">
                    <xs:complexType>
                        <xs:sequence>
                            <xs:element name="zip" type="xs:positiveInteger" />
                            <xs:element name="city" type="xs:string" />
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

However, our XML is not valid against the following XSD file, full-person.xsd:

<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="individual">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="name">
                    <xs:simpleType>
                        <xs:restriction base="xs:string">
                            <xs:maxLength value="5" />
                        </xs:restriction>
                    </xs:simpleType>
                </xs:element>
                <xs:element name="address">
                    <xs:complexType>
                        <xs:sequence>
                            <xs:element name="zip" type="xs:positiveInteger" />
                            <xs:element name="city" type="xs:string" />
                            <xs:element name="street" type="xs:string" />
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

There are two issues:

  • The name element is limited to a maximum of 5 characters
  • The address element expects a street child element

Let’s see how we can use Java to obtain this information.

3. Validating an XML File Against an XSD File

The javax.xml.validation package defines an API for the validation of XML documents.

First, we’ll prepare a SchemaFactory capable of reading files that follow the XML Schema 1.0 specification. Then, we’ll use this SchemaFactory to create the Schema corresponding to our XSD file. A Schema represents a set of constraints.

Lastly, we’ll retrieve the Validator from the Schema. A Validator is a processor that checks an XML document against a Schema:

private Validator initValidator(String xsdPath) throws SAXException {
    SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    Source schemaFile = new StreamSource(getFile(xsdPath));
    Schema schema = factory.newSchema(schemaFile);
    return schema.newValidator();
}

In this code, the getFile method reads the XSD into a File. In our example, we put the file under the resources directory, so the method looks like this:

private File getFile(String location) {
    return new File(getClass().getClassLoader().getResource(location).getFile());
}
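
One thing worth keeping in mind with this approach: URL.getFile() returns the path exactly as it appears in the URL, so a resource located in a directory whose name contains a space would come back with %20 in it. The sketch below is not part of the original example. It uses a made-up URL and hypothetical helper names (rawPath, decodedPath) to contrast the two conversions, under the assumption that the resource lives on the file system rather than inside a JAR:

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class ResourcePathSketch {

    // Mirrors the approach in the text: uses the raw path component of the URL.
    static String rawPath(URL url) {
        return new File(url.getFile()).getPath();
    }

    // Alternative: go through a URI so that percent-escapes such as %20 are decoded.
    static String decodedPath(URL url) {
        try {
            return new File(url.toURI()).getPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid URI: " + url, e);
        }
    }

    // Hypothetical URL, similar to what a class loader might return
    // for a resource stored in a directory named "my dir".
    static URL sampleUrl() {
        try {
            return URI.create("file:/tmp/my%20dir/person.xsd").toURL();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        URL url = sampleUrl();
        System.out.println(rawPath(url));     // still contains %20
        System.out.println(decodedPath(url)); // contains a real space
    }
}
```

If the resources directory never contains special characters, the simpler getFile() call from the text is enough.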

Let’s note that when we create the Schema, a SAXException can be thrown if the XSD file is not valid.
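
To see this in isolation, here is a minimal sketch that compiles schemas from in-memory strings instead of files. The class name, the compiles helper and both sample schemas are assumptions made for illustration; only the SchemaFactory calls come from the code above:

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SchemaCompilationSketch {

    // Hypothetical well-formed schema declaring a single string element.
    static final String GOOD_XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"name\" type=\"xs:string\"/>"
      + "</xs:schema>";

    // Hypothetical broken schema: the text is cut off, so it is not well-formed XML.
    static final String BROKEN_XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"name\"";

    // Returns true when the given XSD text can be compiled into a Schema.
    static boolean compiles(String xsd) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            factory.newSchema(new StreamSource(new StringReader(xsd)));
            return true;
        } catch (SAXException e) {
            // Raised while the schema is being read, before any XML document is validated.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(compiles(GOOD_XSD));   // true
        System.out.println(compiles(BROKEN_XSD)); // false
    }
}
```

The point of the sketch is that a failure here says something about the XSD, not about the XML document we intend to validate later.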

We can now use the Validator to validate that the XML file matches the XSD description. The validate method requires us to transform the File into a StreamSource:

public boolean isValid() throws IOException, SAXException {
    Validator validator = initValidator(xsdPath);
    try {
        validator.validate(new StreamSource(getFile(xmlPath)));
        return true;
    } catch (SAXException e) {
        return false;
    }
}

With the default error handling, the validate method throws a SAXException as soon as it encounters an error while parsing the document. This indicates that the XML file is not valid with respect to the XSD.

The validate method can also throw an IOException if there is an underlying problem while reading the File.
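
The three outcomes described above (valid document, invalid document, I/O problem) can be tried out without any files. The following is a minimal sketch under a few assumptions: the class name, the isValid(xsd, xml) signature and the one-element schema are invented for this illustration, and both inputs are read from strings through a StringReader:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class InMemoryXsdCheck {

    // Hypothetical schema: a single <name> element of at most 5 characters.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"name\">"
      + "<xs:simpleType><xs:restriction base=\"xs:string\">"
      + "<xs:maxLength value=\"5\"/>"
      + "</xs:restriction></xs:simpleType>"
      + "</xs:element>"
      + "</xs:schema>";

    static boolean isValid(String xsd, String xml) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Validator validator;
        try {
            validator = factory.newSchema(new StreamSource(new StringReader(xsd))).newValidator();
        } catch (SAXException e) {
            // The schema itself could not be compiled.
            throw new IllegalArgumentException("The XSD is not valid", e);
        }
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The document does not satisfy the schema (or is not well-formed).
            return false;
        } catch (IOException e) {
            // Reading the input failed; unlikely with an in-memory string.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(XSD, "<name>Bob</name>"));      // true
        System.out.println(isValid(XSD, "<name>Baeldung</name>")); // false
    }
}
```

Unlike the isValid method in the text, this variant converts the checked exceptions into unchecked ones so that the sketch stays short. Whether that is appropriate depends on how the caller wants to handle a broken schema.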

We can now wrap up the code in an XmlValidator class and check that baeldung.xml matches the person.xsd description but not full-person.xsd:

@Test
public void givenValidXML_WhenIsValid_ThenTrue() throws IOException, SAXException {
    assertTrue(new XmlValidator("person.xsd", "baeldung.xml").isValid());
}

@Test
public void givenInvalidXML_WhenIsValid_ThenFalse() throws IOException, SAXException {
    assertFalse(new XmlValidator("full-person.xsd", "baeldung.xml").isValid());
}

4. Listing All Validation Errors

By default, the validate method stops as soon as the parsing throws a SAXException.

Since we now want to gather all validation errors, we need to change this behaviour. To do so, we have to define our own ErrorHandler:

public class XmlErrorHandler implements ErrorHandler {

    private List<SAXParseException> exceptions;

    public XmlErrorHandler() {
        this.exceptions = new ArrayList<>();
    }

    public List<SAXParseException> getExceptions() {
        return exceptions;
    }

    @Override
    public void warning(SAXParseException exception) {
        exceptions.add(exception);
    }

    @Override
    public void error(SAXParseException exception) {
        exceptions.add(exception);
    }

    @Override
    public void fatalError(SAXParseException exception) {
        exceptions.add(exception);
    }
}

We can now tell the Validator to use this specific ErrorHandler:

public List<SAXParseException> listParsingExceptions() throws IOException, SAXException {
    XmlErrorHandler xsdErrorHandler = new XmlErrorHandler();
    Validator validator = initValidator(xsdPath);
    validator.setErrorHandler(xsdErrorHandler);
    try {
        validator.validate(new StreamSource(getFile(xmlPath)));
    } catch (SAXParseException e) {
        // ...
    }
    xsdErrorHandler.getExceptions().forEach(e -> LOGGER.info(String.format(
        "Line number: %s, Column number: %s. %s",
        e.getLineNumber(), e.getColumnNumber(), e.getMessage())));
    return xsdErrorHandler.getExceptions();
}

Since baeldung.xml meets the requirements of person.xsd, no error is listed in that case. However, when we validate it against full-person.xsd, the following error messages are printed:

XmlValidator - Line number: 3, Column number: 26. cvc-maxLength-valid: Value 'Baeldung' with length = '8' is not facet-valid with respect to maxLength '5' for type '#AnonType_nameindividual'.
XmlValidator - Line number: 3, Column number: 26. cvc-type.3.1.3: The value 'Baeldung' of element 'name' is not valid.
XmlValidator - Line number: 7, Column number: 15. cvc-complex-type.2.4.b: The content of element 'address' is not complete. One of '{street}' is expected. 

All the errors we mentioned in section 2 were found by the program.
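
For readers who want to experiment without setting up resource files, the same idea can be sketched in a single class. Everything named here (CollectingValidation, errorsFor, the two sample documents) is an assumption made for this illustration. The schema is a compact copy of full-person.xsd, and the handler is an anonymous class that only keeps the messages:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class CollectingValidation {

    // Compact in-memory copy of the full-person.xsd constraints.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"individual\"><xs:complexType><xs:sequence>"
      + "<xs:element name=\"name\"><xs:simpleType><xs:restriction base=\"xs:string\">"
      + "<xs:maxLength value=\"5\"/></xs:restriction></xs:simpleType></xs:element>"
      + "<xs:element name=\"address\"><xs:complexType><xs:sequence>"
      + "<xs:element name=\"zip\" type=\"xs:positiveInteger\"/>"
      + "<xs:element name=\"city\" type=\"xs:string\"/>"
      + "<xs:element name=\"street\" type=\"xs:string\"/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Same content as baeldung.xml: name too long, street missing.
    static final String INVALID_XML =
        "<individual><name>Baeldung</name><address>"
      + "<zip>00001</zip><city>New York</city></address></individual>";

    // Hypothetical document that satisfies both constraints.
    static final String VALID_XML =
        "<individual><name>Bael</name><address><zip>00001</zip>"
      + "<city>New York</city><street>Main</street></address></individual>";

    static List<String> errorsFor(String xml) {
        List<String> messages = new ArrayList<>();
        try {
            Validator validator = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)))
                .newValidator();
            // Record every problem instead of letting the first one abort validation.
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { messages.add(e.getMessage()); }
                @Override public void error(SAXParseException e) { messages.add(e.getMessage()); }
                @Override public void fatalError(SAXParseException e) { messages.add(e.getMessage()); }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            // The parser may still give up after a fatal error; keep at least one message.
            if (messages.isEmpty()) {
                messages.add(e.getMessage());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return messages;
    }

    public static void main(String[] args) {
        errorsFor(INVALID_XML).forEach(System.out::println);
        System.out.println(errorsFor(VALID_XML).isEmpty());
    }
}
```

Run against INVALID_XML, the sketch should report the same kinds of problems as the log above, although line and column numbers are not printed here because only the messages are kept.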

5. Conclusion

In this article, we’ve seen how to validate an XML file against an XSD file and how to list all validation errors.

As always, the code is available on GitHub.
