I've been a zone leader with DZone since 2008, and I'm crazy about community. Every day I get to work with the best that JavaScript, HTML5, Android and iOS have to offer, creating apps that truly make a difference, as principal front-end architect at Avego. James is a DZone Zone Leader and has posted 639 posts at DZone. You can read more from them at their website.

An Enhancement to NanoXML, the Extremely Compact Java XML Parser

07.08.2008

The proliferation of XML as a data interchange and configuration file format has resulted in numerous open-source Java XML parser libraries. Indeed, Java includes its own full-fledged XML library, obviating the need to download an additional one. However, the built-in parser and the majority of open-source Java XML parsers tend to suffer from a few considerable issues, such as complexity and bloated size, and this tends to be the norm rather than the exception.

This is because the majority of Java XML parsers are designed for enterprise usage, so support for the latest XML technologies such as XPath and XML Schema is built into the libraries, resulting in increasing sophistication and bloat. Moreover, most of these libraries depend on additional external libraries, which greatly increases their size and complexity and hinders their widespread use in desktop and mobile applications (the majority of open-source projects suffer from poor documentation and support, which, combined with their complexity, gives Java an undeserved bad name). In addition, the large library size and overwhelming complexity are overkill if one just wants to use XML in a rudimentary way (for example, for a configuration file or for data downloaded from a Web 2.0 service such as Amazon's).

Therefore a simple, easy-to-program solution is needed if one wants to bundle it with a mobile, applet or desktop application that will be downloaded and deployed.

NanoXML

NanoXML is just the perfect Java XML parser for those who value ease of use, simplicity and compactness. As its name implies, it is exceptionally lightweight, taking less than 50 KB of space (after modification) while still retaining the important functionality (a far cry from megabyte-sized Java XML parsers). Even though its development has been inactive since 2003, the current version is still very useful for processing simple XML. It may not support advanced technologies such as XPath and XML Schema, but it is definitely capable of holding its own through its rich and easy API for searching, adding, updating and removing XML tags and attributes.

I prefer NanoXML over competing solutions because it is very simple to use, extremely compact, very fast and, most importantly, much easier to extend because it has fewer advanced features. Its compactness is particularly enticing for applications that need to be downloaded over the web. In fact, NanoXML has become an important component of the projects I am working on now. This includes replacing the XML handling in gwtClassRun, which currently uses string manipulation, with NanoXML, as a proper Java XML parser will make XML processing more robust and easier to maintain.

Enhancement

Although NanoXML in its current version, 2.2.3, is a very useful library, it could definitely be made more flexible. A few limitations remain in this version that might deter its use. Currently it ignores all comments in the XML. Another problem is that a child element can only be added at the last position. These limitations may deter others from considering it as a viable solution.

After failing to receive a response from the author to my request for these features, I ended up ‘hacking’ the source code and building them myself. So after hours of dabbling with the code, the following ‘critical’ features have finally been added.

- Parsing and generation of comments

- Adding a tag element at a specific position

See example section.
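
To make the two additions concrete, here is a rough sketch of how they could work on a miniature element tree. This is not NanoXML's actual source: the class MiniElement and its method toXml are hypothetical names invented for illustration, and attributes and escaping are left out. The method names setComment and addChild(child, index) are borrowed from the example later in this post.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical miniature element type for illustration; not NanoXML code. */
final class MiniElement {
    private final String name;
    private String comment; // optional comment written just before the element
    private final List<MiniElement> children = new ArrayList<>();

    MiniElement(String name) {
        this.name = name;
    }

    void setComment(String comment) {
        this.comment = comment;
    }

    /** Appends at the end, the only placement the post describes for 2.2.3. */
    void addChild(MiniElement child) {
        children.add(child);
    }

    /** Inserts at the given index and shifts the later children to the right. */
    void addChild(MiniElement child, int index) {
        children.add(index, child);
    }

    /** Serializes the subtree; comments are emitted only when requested. */
    String toXml(boolean includeComments) {
        StringBuilder sb = new StringBuilder();
        if (includeComments && comment != null) {
            sb.append("<!--").append(comment).append("-->");
        }
        if (children.isEmpty()) {
            return sb.append('<').append(name).append("/>").toString();
        }
        sb.append('<').append(name).append('>');
        for (MiniElement child : children) {
            sb.append(child.toXml(includeComments));
        }
        return sb.append("</").append(name).append('>').toString();
    }

    public static void main(String[] args) {
        MiniElement root = new MiniElement("root");
        root.addChild(new MiniElement("a"));
        root.addChild(new MiniElement("b"));

        MiniElement first = new MiniElement("first");
        first.setComment("inserted at index 0");
        root.addChild(first, 0);

        System.out.println(root.toXml(true));  // <root><!--inserted at index 0--><first/><a/><b/></root>
        System.out.println(root.toXml(false)); // <root><first/><a/><b/></root>
    }
}
```

Keeping the comment on the element it precedes means the writer can simply skip it when comments are not wanted, which mirrors the include-comment switch used in the example.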

Download

The modified source code and binary are available for download:

http://geekycoder.files.wordpress.com/2008/07/nanoxml-224.doc

Rename the file to nanoxml-224.zip, because WordPress.com does not allow zip files to be stored on its service.

For documentation and support, please check NanoXML’s original site.

 

Those who are interested in learning and using NanoXML can download it from the following site (click on the image).

Note that the last official version is 2.2.3. Since I have modified the code, I unofficially distinguish my build by calling it version 2.2.4, without official approval from the author (after failing to receive a reply to my email).

image

Note that the changes were made only to the nanoxml-2.2.3.jar file, not to the lite or SAX versions.

image

 

Example

For those who want to learn about NanoXML and the use of the ‘enhanced’ features, the following is an example.

test.xml

<root name="main">
    <child name="me1"/>
    <child name="me2"/>
    <child name="me3"/>
</root>

XmlTest.java

import net.n3.nanoxml.*; 

import java.io.File; 

public class XmlTest
{ 

// ## marks a newly added feature.

    public static void main(String[] _args) throws Exception
    {
        IXMLParser parser = XMLParserFactory.createDefaultXMLParser(); 

        /*// To parse from a string, use stringReader
        IXMLReader reader = StdXMLReader.stringReader("<root></root>");
         */
        // Parse from a file. Important to convert the path via toURI().getPath(), otherwise an exception will be thrown.
        IXMLReader reader = StdXMLReader.fileReader(
                new File("c:/test.xml").toURI().getPath());
        parser.setReader(reader); 

        // The no-argument parse() method does not include comments
        IXMLElement xml = (IXMLElement) parser.parse(true);   // ## true means parse comments too

        IXMLElement _x = xml.createElement("newChild");
        _x.setComment("This is new child"); // ## Adding a comment
        _x.setAttribute("att1", "me1");
        _x.setAttribute("att2", "me2");
        xml.addChild(_x, 0);  // ## Adding at a specific position.

        IXMLElement _b = xml.getChildAtIndex(1);
        xml.removeChild(_b);  // Remove tag 

        XMLWriter writer = new XMLWriter(System.out);
        // By default, comments are excluded when writing
        writer.setIncludeComment(true); // ## Include comments in the generated output.
        writer.write(xml, true);
    } 

} 

Result

After running the code against the test file, the output should be:

<root name="main">
    <!--This is new child-->
    <newChild att1="me1" att2="me2"/>
    <child name="me2"/>
    <child name="me3"/>
</root>
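
For comparison, the same edit can be sketched with the DOM API bundled with the JDK. The class name DomComparison and the method edit are hypothetical names chosen for this illustration, and the input is passed as a string rather than read from c:/test.xml so that the sketch is self-contained. All calls are standard JAXP and org.w3c.dom APIs; exact whitespace and attribute order in the output may vary between JDK versions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical comparison class using only the JDK's built-in XML APIs. */
final class DomComparison {

    static String edit(String xmlText) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xmlText)));
        Element root = doc.getDocumentElement();

        // Build the new element with its two attributes.
        Element newChild = doc.createElement("newChild");
        newChild.setAttribute("att1", "me1");
        newChild.setAttribute("att2", "me2");

        // DOM child lists also contain whitespace text nodes, so the first
        // child element has to be searched for explicitly.
        Node firstElement = root.getFirstChild();
        while (firstElement != null && firstElement.getNodeType() != Node.ELEMENT_NODE) {
            firstElement = firstElement.getNextSibling();
        }

        // Insert the new element ahead of the old first child, then put the
        // comment directly in front of the new element.
        root.insertBefore(newChild, firstElement);
        root.insertBefore(doc.createComment("This is new child"), newChild);

        // Remove the element that used to be first (index 1 after the insert).
        if (firstElement != null) {
            root.removeChild(firstElement);
        }

        // Serialize the document back to text without the XML declaration.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root name=\"main\"><child name=\"me1\"/>"
                + "<child name=\"me2\"/><child name=\"me3\"/></root>";
        System.out.println(edit(xml));
    }
}
```

The result is equivalent, but the DOM version needs a factory, a builder, a manual search past whitespace nodes and a separate transformer for output.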

References
Reference: http://geekycoder.wordpress.com/2008/07/07/using-the-enhanced-nanoxml-the-extremely-compact-java-xml-parser/

Comments

Emmanuel Bourg replied on Tue, 2008/07/08 - 10:58am

What's the benefit of using this parser instead of the 3 already included with the JRE for 0 additional bytes?

GeekyCoder coder replied on Tue, 2008/07/08 - 3:06pm

Emmanuel,

The reason is simplicity. The free XML parser that comes with the Java runtime is very powerful and flexible; however, with power comes complexity. I used to program with the bundled XML parser and other "heavy-weight" open-source parsers, and ended up dismayed. That was until I discovered NanoXML. What can be done easily in NanoXML seems like rocket science in other parsers. I suggest trying to create an example in NanoXML to see how simple it is to manipulate XML. If someone wants to simply manipulate an XML document without much fuss, do check out NanoXML, especially as its overall library size is less than 50 KB.

Thomas Boshell replied on Tue, 2008/07/15 - 3:35am

I also used Nano for a long time, and even got the source myself and updated it for the newer Java versions. I liked how it stored the elements in a java.util.Collection and added the types.

But eventually we noticed that the parser did not always release the file connection, especially in error situations or if the finalize method had not yet been called by the GC. The result was that our source control system kept having conflicts because the XML parser was still holding an open connection. This would need to be fixed so that the parser releases the connection as soon as it is done, rather than leaving it to the garbage collector at best.
