I've been a zone leader with DZone since 2008, and I'm crazy about community. Every day I get to work with the best that JavaScript, HTML5, Android and iOS have to offer, creating apps that truly make a difference, as principal front-end architect at Avego. James is a DZone Zone Leader and has posted 639 posts at DZone. You can read more from them at their website.

How Do You Handle XML in Your Java Applications?

04.28.2010

There's a lot of choice out there for handling XML data in Java applications. Although the technologies have been around for ages, the main options for most developers seem to be DOM or SAX. But even these well-established approaches have their limits. DOM can carry a lot of overhead if you want to read a large document, as it keeps the whole document in memory; SAX is more efficient, but can involve a lot of messy code if you want to keep any of the data that you read in memory. A good halfway point for me is JAXB - seemingly fast, and it keeps a meaningful object structure in memory.
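To make the DOM/SAX trade-off concrete, here is a minimal sketch that counts elements both ways using only the parsers bundled with the JDK. The class name, method names, and sample document are hypothetical and exist only for illustration; real SAX code that keeps data around usually needs far more hand-written state than this single counter.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class DomVersusSax {

    // Hypothetical sample document, used only for illustration.
    public static final String SAMPLE =
        "<books><book title=\"A\"/><book title=\"B\"/><book title=\"C\"/></books>";

    // DOM: the whole tree is built in memory first, then queried.
    public static int countWithDom(String xml, String elementName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagName(elementName).getLength();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("DOM parsing failed", e);
        }
    }

    // SAX: the parser pushes events to a handler; nothing is retained
    // unless the handler stores it itself (here, a single counter).
    public static int countWithSax(String xml, final String elementName) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if (qName.equals(elementName)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println("DOM count: " + countWithDom(SAMPLE, "book"));
        System.out.println("SAX count: " + countWithSax(SAMPLE, "book"));
    }
}
```

Both methods return the same count; the difference is that the DOM version holds the full tree in memory while the SAX version only ever holds the counter.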

But what do you use most commonly? I know there are pros and cons for each approach, as well as some reasons that rule out an entire approach. Please leave comments to let us know which approach you typically take in your development. I think this could be an interesting roundup of technologies.

Comments

Kode Ninja replied on Thu, 2010/04/29 - 7:37am

Specifically for writing XML, has anyone tried WAX? It has a fluent API, which seems quite nice.
-Kodeninja

Blaise Doughan replied on Thu, 2010/04/29 - 8:18am in response to: Jilles Van Gurp

"Your db model classes tend to be different than your jaxb classes. So you end up with a lot of stupid conversion: request xml -> jaxb dtos -> db model -> sql -> result set -> db model -> jaxb dtos -> response xml. I have a lot of code and tests that are about babysitting data through this chain instead of actual business logic."

EclipseLink JAXB (aka MOXy) was designed to solve just that problem. Its XPath-based mappings allow you to map your own domain model to any XML schema (or multiple XML schemas).

It is also the only binding solution I'm aware of with specific support for mapping JPA entities. This includes bi-directional relationships, compound keys, and embedded key classes.

Plus, if you are already using EclipseLink/TopLink for your JPA layer, you will already be familiar with some of the extension points.

Henk De Boer replied on Thu, 2010/04/29 - 5:42pm

Am I the only one who is slightly appalled by the overwhelming amount of XML parsing options available in Java?

I'm all for choice and such, but it can be overdone really. A while back I did maintenance on a rather large 6-year-old Java application and it had accumulated no fewer than 14(!) different libraries for XML parsing. Many of those were not used by the application directly, but were transitive dependencies of libs the app used.

Nevertheless, apparently 2 different developers of that app at some time felt the existing options weren't enough and independently developed yet another XML to Java object library, one that was supposedly better, faster and easier to use than everything else out there, yet for some strange reason their own code was the only client of it and all other developers seemed to avoid it like the plague.

Seriously, people, where's this fascination in Java coming from? How many home-grown loggers, web frameworks, ORM solutions and XML parsers do we really need?

Blaise Doughan replied on Thu, 2010/04/29 - 10:41pm in response to: Henk De Boer

"Seriously, people, where's this fascination in Java coming from? How many home-grown loggers, web frameworks, ORM solutions and XML parsers do we really need?"

I agree, and this is where standards/specifications can really help out. Gone are the days of XML parsers with proprietary APIs. If I'm doing StAX parsing, I can use the parser that comes with my JDK, or with minimal config (without recompiling) I can swap in the Woodstox StAX parser.
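As a rough sketch of what that looks like in code: the example below only touches the standard javax.xml.stream API, so whichever StAX implementation XMLInputFactory.newInstance() locates (the JDK's built-in one, or another provider found on the classpath or named by the javax.xml.stream.XMLInputFactory system property) is used without recompiling. The class name, method name, and sample document are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxTitles {

    // Hypothetical sample document, used only for illustration.
    public static final String SAMPLE =
        "<books><book><title>One</title></book><book><title>Two</title></book></books>";

    // Pulls events from the stream and collects the text of each <title>.
    public static List<String> readTitles(String xml) {
        List<String> titles = new ArrayList<>();
        try {
            // Only the standard factory is referenced; the concrete
            // implementation is resolved at runtime.
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "title".equals(reader.getLocalName())) {
                    // getElementText() consumes up to the matching end tag.
                    titles.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("StAX parsing failed", e);
        }
        return titles;
    }

    public static void main(String[] args) {
        // Shows which implementation was actually picked up at runtime.
        System.out.println(XMLInputFactory.newInstance().getClass().getName());
        System.out.println(readTitles(SAMPLE));
    }
}
```

Running main with a different StAX provider on the classpath should change the first printed line but not the parsing code.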

The same thing is happening with ORM. In the past TopLink, Hibernate, and others had proprietary APIs. But now we have the JPA spec (which is implemented by TopLink, EclipseLink, Hibernate, and others), and there are standard APIs for doing the most common things. Implementations compete on their extension points, and the best extensions are brought back into the specification.

JAXB is to OXM what JPA is to ORM.  The JAXB (JSR 222) APIs were created by a committee in which members of Metro, TopLink (EclipseLink), XMLBeans, EMF, and others participated.  Many think of Metro as JAXB, but there are other implementations such as EclipseLink MOXy.  Switching between JAXB implementations also requires minimal config (no recompiling).

In general the specs represent the commoditized behaviour (do we really need another way to report a characters event, query by primary key, or map a property to an XML attribute?).  Innovation is in the area beyond the specifications.  Luckily today the majority of parser, JPA, and JAXB implementations are open source and available to be contributed to.

Arek Stryjski replied on Fri, 2010/04/30 - 6:40am

Apache XmlBeans

... but I think people who use DOM should add a comment to describe why they use this slow and developer-unfriendly API in 2010.
To me all binding frameworks are much superior to their ancient ancestors. This is truly a different, better way, not something "halfway" between SAX and DOM as the author suggests.

Ivan Lazarte replied on Fri, 2010/04/30 - 1:32pm

dom4j for convenience, but for speedy parsing, all StAX. Why isn't it on the list? It's a great standard.

Gregory Fernandez replied on Fri, 2010/04/30 - 2:07pm

Hello!

I mostly use Digester for reading big files, and XStream + XPP3 for writing, or reading small files.

Very often, I need to handle files that mimic a collection, I mean:

<foos><foo>...</foo>...<foo>...</foo>...</foos>

The APIs of Digester and XStream are simple and efficient for these kinds of files.

Vadim Babushkin replied on Tue, 2010/05/04 - 1:56am

VTD-XML (http://vtd-xml.sourceforge.net/)

I use it for updating big XML files via XPath. Good.

Daniel Kec replied on Tue, 2010/10/05 - 6:48am
