Nicolas Frankel is an IT consultant with 10 years experience in Java / JEE environments. He likes his job so much he writes technical articles on his blog and reviews technical books in his spare time. He also tries to find other geeks like him in universities, as a part-time lecturer. Nicolas is a DZone MVB and is not an employee of DZone and has posted 228 posts at DZone. You can read more from them at their website. View Full User Profile

Use Local Resources When Validating XML

09.30.2012
| 3954 views |
  • submit to reddit

Depending of you enterprise security policy, some – if not most of your middleware servers have no access to Internet. It’s even worse when your development infrastructure is isolated from the Internet (such as banks or security companies). In this case, validating your XML against schemas becomes a real nightmare.

Of course, you could set the XML schema location to a location on your hard drive. But about about your co-workers then? They would have to have the schema in exactly the same filesystem hierarchy, and that wouldn’t solve your problem about the production environment…

XML catalogs

XML catalogs are the nominal solution for your quandary. In effect, you plug in a resolver that knows how to map an online location to a location on your filesystem (or more precisely a location to another one). This resolver is set through the xml.catalog.files system property that may be different for each developer or in the production environment:

<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
 
    <system systemId="http://blog.frankel.ch/xmlcatalog/person.xsd" uri="person.xsd" />
    <system systemId="http://blog.frankel.ch/xmlcatalog/order.xsd" uri="order.xsd" />
</catalog>

Now, when validation happens, instead of looking up the online schema URL location, it looks up the local schema.

How-to

XML catalogs can be used with JAXP, either SAX, DOM or STaX. The following details the code how to use them with SAX.

SAXParserFactory factory = SAXParserFactory.newInstance();
 
factory.setNamespaceAware(true);
factory.setValidating(true);
 
XMLReader reader = factory.newSAXParser().getXMLReader();
 
// Set XSD as the schema language
reader.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema");
 
// Use XML catalogs
reader.setEntityResolver(new CatalogResolver());
 
InputStream stream = getClass().getClassLoader().getResourceAsStream("person.xml");
 
reader.parse(new InputSource(stream)); 

Two lines are important:

  • In line 13, we tell the reader to use XML catalogs
  • But line 7 is almost as important: we absolutely have to use the reader, not the SAX parser to parse the XML or validation will be done online!

Use cases

There are basically two use cases for XML catalogs:

  1. As seen above, the first use-case is to validate XML files against cached local schemas files
  2. In addition to that, XML catalogs can also be used as an alternative to LSResourceResolver (as seen previously in XML validation with imported/included schemas)

Important note

Beware: the CatalogResolver class is available in the JDK in an internal com.sun package, but the Apache XML resolver library does the job. In fact, the internal class is just a repackaging of the Apache one.

If you use Maven, the dependency is the following:

<dependency>
    <groupId>xml-resolver</groupId>
    <artifactId>xml-resolver</artifactId>
    <version>1.2</version>
</dependency>

Beyond catalogs

Catalog are a file-system based standard way to map an online schema location.

However, this isn’t always the best strategy to choose from. For example, Spring provides its different schemas inside JARs, along classes and validation is done against those schemas.

In order to do the same, replace line 13 in the code above, and instead of a CatalogResolver, use your own implementation of EntityResolver.

You can find the source for this article here (note that associated tests use a custom security policy file to prevent network access, and if you want to test directly inside your IDE, you should reuse the system properties used inside the POM).

XML Toolkit

Published at DZone with permission of Nicolas Frankel, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)