How Do You Handle XML in Your Java Applications?
There's a lot of choice out there for handling XML data in Java applications. Although the technologies have been around for ages, the main options for most developers seem to be DOM or SAX. But even these well-established approaches have their limits. DOM can carry a lot of overhead if you want to read a large document, as it keeps the whole document in memory; SAX is more efficient, but can involve a lot of messy code if you want to keep any of the data that you read in memory. A good halfway point for me is JAXB: it seems fast and keeps a meaningful object structure in memory.
But what do you use most commonly? I know there are pros and cons for each approach, as well as some reasons that rule out an entire approach. Please leave comments to let us know which approach you typically take in your development. I think this could be an interesting roundup of technologies.
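To make the DOM versus SAX trade-off a bit more concrete, here is a minimal sketch that reads the same tiny document both ways using only the parsers bundled with the JDK. The class name `DomVsSax`, the sample document and the `book` element are made up for illustration; real documents would also need namespace and error handling.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class DomVsSax {
    // Hypothetical sample document, invented for this sketch.
    static final String XML = "<books><book>Dune</book><book>Emma</book></books>";

    static InputStream stream(String xml) {
        return new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
    }

    // DOM: the whole document is parsed into an in-memory tree before we query it.
    static List<String> titlesWithDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(stream(xml));
        NodeList nodes = doc.getElementsByTagName("book");
        List<String> titles = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            titles.add(nodes.item(i).getTextContent());
        }
        return titles;
    }

    // SAX: events are pushed to a handler, so we have to track state ourselves.
    static List<String> titlesWithSax(String xml) throws Exception {
        List<String> titles = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a <book> element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("book".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("book".equals(qName)) {
                    titles.add(text.toString());
                    text = null;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(stream(xml), handler);
        return titles;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(titlesWithDom(XML));
        System.out.println(titlesWithSax(XML));
    }
}
```

Both methods return the same list, but the SAX version needs three callbacks and a piece of mutable state to get there, which is the kind of bookkeeping that grows quickly with more complex documents.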

Comments
Kode Ninja replied on Thu, 2010/04/29 - 7:37am
-Kodeninja
Blaise Doughan replied on Thu, 2010/04/29 - 8:18am
in response to:
Jilles Van Gurp
EclipseLink JAXB (aka MOXy) was designed to solve just that problem. Its XPath-based mappings allow you to map your own domain model to any XML schema (or multiple XML schemas).
It is also the only binding solution I'm aware of with specific support for mapping JPA entities. This includes bi-directional relationships, compound keys, and embedded key classes.
Plus, if you are already using EclipseLink/TopLink for your JPA layer, you will already be familiar with some of the extension points.
Henk De Boer replied on Thu, 2010/04/29 - 5:42pm
Am I the only one who is slightly appalled by the overwhelming amount of XML parsing options available in Java?
I'm all for choice and such, but it can really be overdone. A while back I did maintenance on a rather large six-year-old Java application, and it had accumulated no fewer than 14(!) different libraries for XML parsing. Many of those were not used by the application directly, but were transitive dependencies of libs the app used.
Nevertheless, two different developers of that app apparently felt at some point that the existing options weren't enough, and each independently developed yet another XML-to-Java-object library, one that was supposedly better, faster and easier to use than everything else out there. Yet for some strange reason their own code was its only client, and all the other developers seemed to avoid it like the plague.
Seriously, people, where's this fascination in Java coming from? How many home-grown loggers, web frameworks, ORM solutions and XML parsers do we really need?
Blaise Doughan replied on Thu, 2010/04/29 - 10:41pm
in response to:
Henk De Boer
I agree, and this is where standards/specifications can really help out. Gone are the days of XML parsers with proprietary APIs. If I'm doing StAX parsing, I can use the parser that comes with my JDK, or with minimal config (and without recompiling) I can swap in the Woodstox StAX parser.
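A minimal sketch of what that looks like in code, assuming nothing beyond the standard `javax.xml.stream` API: the class name `StaxFactoryDemo` and the sample document are invented for illustration. The code never names a concrete parser; `XMLInputFactory.newInstance()` looks the implementation up at runtime, which is what should allow a different StAX implementation on the classpath to be picked up without recompiling.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class StaxFactoryDemo {
    // Counts START_ELEMENT events using only the standard StAX interfaces.
    static int countElements(String xml) throws Exception {
        // The concrete factory class is resolved at runtime, not at compile time.
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        int count = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Shows which implementation was actually selected on this classpath.
        System.out.println(XMLInputFactory.newInstance().getClass().getName());
        System.out.println(countElements("<a><b/><b/></a>"));
    }
}
```

The first line printed by `main` depends on which StAX implementation is found at runtime, so it is a quick way to check which parser is really in use.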
The same thing is happening with ORM. In the past TopLink, Hibernate, and others had proprietary APIs. But now we have the JPA spec (which is implemented by TopLink, EclipseLink, Hibernate, and others), and there are standard APIs for doing the most common things. Implementations compete on their extension points, and the best extensions are brought back into the specification.
JAXB is to OXM what JPA is to ORM. The JAXB (JSR 222) APIs were created by a committee in which members of Metro, TopLink (EclipseLink), XMLBeans, EMF, and others participated. Many think of Metro as JAXB, but there are other implementations such as EclipseLink MOXy. Switching between JAXB implementations also requires minimal config (no recompiling).
In general the specs represent the commoditized behaviour (do we really need another way to report a characters event, query by primary key, or map a property to an XML attribute?). Innovation is in the area beyond the specifications. Luckily today the majority of parser, JPA, and JAXB implementations are open source and available to be contributed to.
Arek Stryjski replied on Fri, 2010/04/30 - 6:40am
... but I think people who use DOM should leave a comment explaining why they use this slow and developer-unfriendly API in 2010.
To me, all binding frameworks are much superior to their ancient ancestors. They are a truly different, better way of working, not something "halfway" between SAX and DOM as the author suggests.
Ivan Lazarte replied on Fri, 2010/04/30 - 1:32pm
Gregory Fernandez replied on Fri, 2010/04/30 - 2:07pm
Hello!
I mostly use Digester for reading big files, and XStream + XPP3 for writing, or for reading small files.
Very often I need to handle files that mimic a collection, I mean:
<foos><foo>...</foo>...<foo>...</foo>...</foos>
The APIs of Digester and XStream are simple and efficient for these kinds of files.
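For readers without those libraries at hand, a collection-shaped file like the one above can also be read one item at a time with the StAX parser bundled in the JDK. This is a swapped-in technique for illustration, not the Digester or XStream API. The class name `FooReader` is hypothetical, and the sketch assumes each `foo` element contains only text (no nested elements).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class FooReader {
    // Collects the text of every <foo> element, streaming through the document once.
    static List<String> readFoos(String xml) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<String> foos = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "foo".equals(reader.getLocalName())) {
                    // Assumption: <foo> is text-only; getElementText() throws otherwise.
                    foos.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return foos;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readFoos("<foos><foo>a</foo><foo>b</foo></foos>"));
    }
}
```

For a genuinely big file, each item would be processed as it is read rather than collected into a list, so that memory use stays flat.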
Vadim Babushkin replied on Tue, 2010/05/04 - 1:56am
VTD-XML (http://vtd-xml.sourceforge.net/)
I use it to update big XML files via XPath. It works well.
Daniel Kec replied on Tue, 2010/10/05 - 6:48am
http://www.syntea.cz/syntea_web/?page=xdefinice&hl=en_US
http://www.syntea.cz/xdefinice/en/