Mitch Pronschinske is the Lead Research Analyst at DZone. Researching and compiling content for DZone's research guides is his primary job. He likes to make his own ringtones, watches cartoons/anime, enjoys card and board games, and plays the accordion. Mitch is a DZone Zone Leader and has posted 2576 posts at DZone.

Xerces-J Adds XML Schema 1.1

Xerces-J, a Java library for parsing and editing XML documents, released version 2.10.0 this weekend.  With the 2.x line, the Apache project introduced the Xerces Native Interface (XNI), a modular framework for building parser components and configurations.  The major addition in this weekend's release is an experimental implementation of XML Schema 1.1 Structures and Datatypes.

The Xerces Java Parser is a fully conforming XML Schema 1.0 processor.  It is also a complete implementation of the Document Object Model Level 3 Core and Load/Save W3C recommendations.  It provides an implementation of the XML Inclusions (XInclude) W3C recommendation as well.

The new version of Xerces-J can parse documents using the XML 1.1 recommendation, but it doesn't yet have an option to enable normalization checking.  Currently, it handles namespaces according to the Namespaces in XML 1.1 Recommendation, and it correctly serializes XML 1.1 documents when the DOM Level 3 Load/Save APIs are in use.
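As a rough illustration of the XML 1.1 handling described above, the sketch below parses an XML 1.1 document through the standard JAXP entry point and serializes it again with the DOM Level 3 Load/Save API. The class name Xml11Sketch and the sample document are made up for this example, and it assumes the JAXP implementation on the classpath (Xerces-J or a parser derived from it) exposes the "LS" 3.0 feature.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of Xerces-J.
final class Xml11Sketch {

    // Parse an XML string with whatever parser JAXP is configured to use.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)));
    }

    // Serialize through DOM Level 3 Load/Save (assumes the "LS" feature is available).
    static String serialize(Document doc) {
        Object feature = doc.getImplementation().getFeature("LS", "3.0");
        if (!(feature instanceof DOMImplementationLS)) {
            throw new IllegalStateException("DOM Level 3 Load/Save is not supported here");
        }
        LSSerializer serializer = ((DOMImplementationLS) feature).createLSSerializer();
        return serializer.writeToString(doc);
    }

    public static void main(String[] args) throws Exception {
        // Sample document invented for this sketch.
        Document doc = parse("<?xml version=\"1.1\"?><note>hello</note>");
        System.out.println("Declared XML version: " + doc.getXmlVersion());
        System.out.println(serialize(doc));
    }
}
```

Document.getXmlVersion() is a DOM Level 3 Core method, so the same code should work against any conforming implementation, not only Xerces-J.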

There's no XNI learning curve if you stick to standard interfaces like JAXP, DOM, and SAX.  However, developers who understand XNI can get more power and flexibility than the standard interfaces provide.  Check out the XNI Manual for more information.
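To show what staying on the standard interfaces looks like, here is a minimal sketch that counts elements with SAX through the JAXP factory. Nothing in it refers to Xerces or XNI directly; the parser is whichever one JAXP resolves at runtime. The class name SaxCountSketch and the sample XML are invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of Xerces-J.
final class SaxCountSketch {

    // Count start tags whose local name matches, using only JAXP and SAX types.
    static int countElements(String xml, String localName) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);

        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if (localName.equals(local)) {
                    count[0]++;
                }
            }
        };

        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // Sample document invented for this sketch.
        String xml = "<catalog><book/><book/><magazine/></catalog>";
        System.out.println(countElements(xml, "book"));
    }
}
```

Because the code depends only on javax.xml.parsers and org.xml.sax, swapping in a different Xerces-J release should not require source changes.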

Xerces-J 2.10 includes the parser-related portions of JAXP 1.4 and partial support for StAX 1.0.  The javax.xml.validation enhancements introduced in JAXP 1.4 include support for StAXSource/StAXResult as an input/output of the JAXP Validator, StreamResult as an output of the JAXP Validator, and StAXSource as an input to the SchemaFactory.  The implementation of the DOM Element Traversal API is complete in 2.10.
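The StAXSource support in javax.xml.validation can be sketched roughly as follows: a schema is compiled from a StreamSource, and the instance document is handed to the Validator as a StAXSource wrapping an XMLStreamReader. The class name StaxValidationSketch, the tiny schema, and the sample documents are assumptions made for this example, and it relies on the Validator's default error handling, which throws on the first validation error.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of Xerces-J.
final class StaxValidationSketch {

    // Minimal XML Schema 1.0 document invented for this sketch: a single integer element.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"count\" type=\"xs:int\"/>"
      + "</xs:schema>";

    // Returns true if the XML validates against XSD when fed in as a StAXSource.
    static boolean isValid(String xml) throws Exception {
        SchemaFactory schemaFactory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = schemaFactory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();

        XMLStreamReader reader =
            XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            // A freshly created reader is positioned at START_DOCUMENT, as StAXSource requires.
            validator.validate(new StAXSource(reader));
            return true;
        } catch (SAXException e) {
            // With no ErrorHandler set, validation errors surface as SAXException.
            return false;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<count>42</count>"));
        System.out.println(isValid("<count>forty-two</count>"));
    }
}
```

When the input is a StAXSource, the JAXP documentation allows the result argument to be either a StAXResult or null, which is why the single-argument validate call is used here.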

Here are the other features added to Xerces-J 2.10.0:

  • Implemented a property for starting schema assessment from a specific element declaration and enhanced the existing property for starting schema assessment from a type definition to accept a javax.xml.namespace.QName as a value.
  • Added a property for specifying the locale to use when reporting error and warning messages.
  • Added support for matching multi-digit back references in regular expressions.
  • Added a method to the ItemPSVI interface in the XML Schema API to expose error messages corresponding to the error codes that were already available in the PSVI.
  • Implemented native support for UTF-16.
  • Improved usability of the XML Schema API by updating XSNamedMap and all of the list type interfaces to extend java.util.Map and java.util.List respectively.
  • Improved performance by eliminating excessive calls to XMLSchemaValidator.findSchemaGrammar() when processing local elements with no namespace.
  • Improved recovery from schema loading errors.
  • Improved performance of Element.getBaseURI() when the depth of the node to the document root is longer.
  • Implemented several improvements in the DOM implementation to help the garbage collector reclaim objects which are no longer reachable by the application but were held on to strongly by the Document node.
  • Improved the messages reported for minOccurs/maxOccurs related schema validation errors.

Xerces-J 2.10 supports the following standards and APIs:

  • XML 1.0 (4th Edition)
  • Namespaces in XML 1.0 (2nd Edition)
  • XML 1.1 (2nd Edition)
  • Namespaces in XML 1.1 (2nd Edition)
  • W3C XML Schema 1.0 (2nd Edition)
  • W3C XML Schema 1.1 (Working Drafts, December 2009)
  • XInclude 1.0 (2nd Edition)
  • OASIS XML Catalogs 1.1
  • SAX 2.0.2
  • DOM Level 3 Core, Load and Save
  • DOM Level 2 Core, Events, Traversal and Range
  • Element Traversal (org.w3c.dom.ElementTraversal)
  • JAXP 1.4
  • StAX 1.0 Event API (


Jose Maria Arranz replied on Wed, 2010/06/23 - 7:50am

I have two serious complaints against Xerces:

* The DOM implementation is almost COMPLETELY USELESS in a multi-threaded environment unless you use a GLOBAL LOCK whenever you access the DOM API with Xerces; accessing different documents does NOT solve the threading problem.

* Absurdly, the HTML parser is deprecated (HTMLSerializer class), and there is NO replacement because Xalan only knows XHTML. Keep on using this parser for HTML; it is good enough.

Alternative: use the DOM implementation provided by Batik SVG, which is far better designed. There is one problem: the X/HTML DOM API is not implemented, although implementing it is not too difficult.

