
I am the founder and CEO of Data Geekery GmbH, located in Zurich, Switzerland. With our company, we have been selling database products and services around Java and SQL since 2013. Ever since my Master's studies at EPFL in 2006, I have been fascinated by the interaction of Java and SQL. Most of this experience I have obtained in the Swiss E-Banking field through various variants (JDBC, Hibernate, mostly with Oracle). I am happy to share this knowledge at various conferences, JUGs, in-house presentations and on our blog. Lukas is a DZone MVB and is not an employee of DZone and has posted 249 posts at DZone.

How to Speed Up Apache Xalan’s XPath Processor by a Factor of 10

08.27.2013

There has been a bit of an awkward bug in Apache Xalan for a while now, and that bug is XALANJ-2540. The effect of this bug is that an internal SPI configuration file is loaded by Xalan thousands of times per XPath expression evaluation. This is easy to measure. Take this plain DOM access:

Element e = (Element)
  document.getElementsByTagName("SomeElementName").item(0);
String result = e.getTextContent();

The above seems to be an incredible 100 times faster than this XPath-based equivalent:

// Accounts for 30%, can be cached
XPathFactory factory = XPathFactory.newInstance();
 
// Negligible
XPath xpath = factory.newXPath();
 
// Negligible
XPathExpression expression =
  xpath.compile("//SomeElementName");
 
// Accounts for 70%
String result = (String) expression
  .evaluate(document, XPathConstants.STRING);
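To reproduce the comparison, a minimal timing harness along the following lines could be used. This is only a sketch: the class name XPathVsDomTiming, the sample XML and the helper method names are assumptions made for illustration, and a naive loop like this gives rough numbers rather than a rigorous benchmark.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical harness, not part of the original measurement.
public class XPathVsDomTiming {

    // Parse an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Plain DOM access, as in the first snippet.
    static String viaDom(Document document) {
        Element e = (Element)
            document.getElementsByTagName("SomeElementName").item(0);
        return e.getTextContent();
    }

    // XPath access, as in the second snippet. The factory is
    // deliberately created on every call, as in the article.
    static String viaXPath(Document document) throws Exception {
        XPathFactory factory = XPathFactory.newInstance();
        XPathExpression expression =
            factory.newXPath().compile("//SomeElementName");
        return (String) expression
            .evaluate(document, XPathConstants.STRING);
    }

    public static void main(String[] args) throws Exception {
        Document document = parse(
            "<root><SomeElementName>value</SomeElementName></root>");
        int iterations = 10_000;

        long t0 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            viaDom(document);
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            viaXPath(document);
        }
        long t2 = System.nanoTime();

        System.out.printf("DOM:   %d ms%n", (t1 - t0) / 1_000_000);
        System.out.printf("XPath: %d ms%n", (t2 - t1) / 1_000_000);
    }
}
```

The absolute numbers will depend on the JDK, the Xalan version on the classpath and the JIT warm-up, so only the ratio between the two loops is of interest.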

Profiling shows that every one of the 10,000 test XPath evaluations led to the classloader trying to look up the DTMManager implementation in some sort of default configuration. This configuration is not cached in memory but accessed every time. Furthermore, this access seems to be guarded by a lock on ObjectFactory.class itself. When the lookup fails (which it does by default), the configuration is loaded from this file inside xalan.jar:

META-INF/services/org.apache.xml.dtm.DTMManager

Every time!

[Figure: A profiling session on Xalan]

Fortunately, this behavior can be overridden by specifying a JVM parameter like this:

-Dorg.apache.xml.dtm.DTMManager=org.apache.xml.dtm.ref.DTMManagerDefault

or, for the copy of Xalan that is bundled with the JDK, this:

-Dcom.sun.org.apache.xml.internal.dtm.DTMManager=com.sun.org.apache.xml.internal.dtm.ref.DTMManagerDefault

This works because lookUpFactoryClassName() returns early, skipping the expensive work, whenever the factory class name is supplied as a system property, even if that name is the default anyway:

// Code from c.s.o.a.xml.internal.dtm.ObjectFactory
static String lookUpFactoryClassName(
       String factoryId,
       String propertiesFilename,
       String fallbackClassName) {
  SecuritySupport ss = SecuritySupport
    .getInstance();
 
  try {
    String systemProp = ss
      .getSystemProperty(factoryId);
    if (systemProp != null) {
 
      // Return early from the method
      return systemProp;
    }
  } catch (SecurityException se) {
  }
 
  // [...] "Heavy" operations later
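Since the early return is driven by a plain system property, the same workaround can presumably be applied programmatically when the JVM arguments cannot be changed. The following is a minimal sketch under that assumption: the class DtmManagerWorkaround and its methods are hypothetical names, and the code only sets the two properties shown above if nothing else has set them already. It would have to run before the first XPath evaluation.

```java
// Hypothetical helper, not part of Xalan or the JDK.
public final class DtmManagerWorkaround {

    // Property names and default class names as given in the article.
    static final String XALAN_KEY =
        "org.apache.xml.dtm.DTMManager";
    static final String XALAN_DEFAULT =
        "org.apache.xml.dtm.ref.DTMManagerDefault";
    static final String JDK_KEY =
        "com.sun.org.apache.xml.internal.dtm.DTMManager";
    static final String JDK_DEFAULT =
        "com.sun.org.apache.xml.internal.dtm.ref.DTMManagerDefault";

    private DtmManagerWorkaround() {
    }

    // Set both properties, leaving explicitly configured values alone.
    public static void apply() {
        setIfAbsent(XALAN_KEY, XALAN_DEFAULT);
        setIfAbsent(JDK_KEY, JDK_DEFAULT);
    }

    // Returns true if the property was unset and has now been set.
    static boolean setIfAbsent(String key, String value) {
        if (System.getProperty(key) == null) {
            System.setProperty(key, value);
            return true;
        }
        return false;
    }

    // Pick the property that matters for a given XPathFactory
    // implementation class name (JDK-internal copy vs. plain Xalan).
    static String propertyKeyFor(String factoryClassName) {
        return factoryClassName.startsWith("com.sun.org.apache.")
            ? JDK_KEY
            : XALAN_KEY;
    }

    public static void main(String[] args) {
        apply();
        System.out.println(System.getProperty(XALAN_KEY));
        System.out.println(System.getProperty(JDK_KEY));
    }
}
```

Calling DtmManagerWorkaround.apply() once at application startup would be the intended usage. Whether a security manager permits setting system properties is environment-specific, so the JVM parameter remains the more predictable option.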

References

The above text is an extract from a Stack Overflow question and answer that I contributed to the public a while ago. I’m posting it again, here on my blog, to raise the community’s awareness of this rather heavy bug. Feel free to upvote the ticket below, as every Sun/Oracle JDK on this planet is affected:

https://issues.apache.org/jira/browse/XALANJ-2540

Contributing a fix to Apache would be even better, of course …



Published at DZone with permission of Lukas Eder, author and DZone MVB. (source)
