Daniel has worked with modular RCP platforms such as NetBeans and Eclipse, and has developed support for a proprietary technology that processes huge volumes of XML data. He is now working on the Register of Traffic Accidents (a J2EE application built on WebSphere) for the Ministry of Interior of the Slovak Republic. Daniel has posted 8 posts at DZone.

XML Processing and Validation Merging Together

06.10.2010

Where does the border lie between validation and processing? The question is whether they should be two separate procedures at all. Let's start with something simple, such as default values. Most other schema languages allow you to specify a default value. But when should the default value be used? What if the default value itself depends on an external resource, such as a database or another value in the same document? Does the default value really have to be static? Yes, that sounds familiar: it is already processing.

XML Processing

And what about processing: have you never needed to be sure that the input is valid? Another common case is when you need to take actions during processing depending on whether some part of the document is valid. Sometimes you cannot determine in what way that part is invalid, but you still need to take some action in response.

XDefinition merging processing and validation

XDefinition is a schema language designed from the beginning for natural readability: a definition keeps the form of the XML data it describes. It is meant to be understandable not only to programmers but also to analysts, system architects and all other parties involved in the project. This approach leaves no room to misinterpret the data description as it is passed along, from architects to database specialists. XDefinition merges validation and processing of the XML document as much as you need and, more importantly, only if you need it.

The example below shows the use of an external method: a method is called based on information obtained during validation. An external method can be any static method in a class supplied to the XDefinition processor through its API.

<?xml version="1.0" encoding="UTF-8"?>
<xd:def xmlns:xd="http://www.syntea.cz/xdef/2.0"
        xd:name="invalidness"
        xd:root="Books"
        xd:classes="cz.syntea.bookcase.Autors">
  <Books>
    <!-- addAuthor(String author) is a user-defined Java/.NET method in the Autors class,
         supplied to XDefinition through the processor's API -->
    <Book title="required string"
          author="required string(5,40); onTrue addAuthor(getText());"
          pagecount="required int"/>
  </Books>
</xd:def>
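
The user-defined class behind this definition might look like the minimal Java sketch below. Only the class name Autors and the signature addAuthor(String author) come from the example above. The backing list and the getAuthors() and clear() helpers are hypothetical additions for illustration. The sketch assumes that the processor calls addAuthor with the attribute text once the string(5,40) check has succeeded. How the class is handed to the XDefinition processor is not shown here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the class named in xd:classes. In the article it
// lives in package cz.syntea.bookcase; the package declaration and the
// `public` modifier are omitted so the sketch compiles as a single file.
final class Autors {

    // Authors collected during validation (illustrative storage only).
    private static final List<String> AUTHORS =
            Collections.synchronizedList(new ArrayList<>());

    private Autors() {
        // Static holder, not meant to be instantiated.
    }

    // Referenced by the onTrue action: assumed to be invoked only after the
    // author attribute has passed the "required string(5,40)" check.
    public static void addAuthor(String author) {
        AUTHORS.add(author);
    }

    // Hypothetical helper: returns a snapshot of the authors seen so far.
    public static List<String> getAuthors() {
        synchronized (AUTHORS) {
            return new ArrayList<>(AUTHORS);
        }
    }

    // Hypothetical helper: resets the state between documents.
    public static void clear() {
        AUTHORS.clear();
    }
}
```

Because the method is static, any state it keeps is shared across all documents processed in the same JVM, which is why the sketch includes a way to reset it.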

If you find this approach interesting, read the article about the readability of schema languages here on Javalobby, called "XSD Schema is not the only way", or try the tutorial with examples.

Published at DZone with permission of its author, Daniel Kec.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Alessandro Santini replied on Fri, 2010/06/11 - 10:14am

Validation and processing are indeed two distinct processes, and they have to be.

If we set such academic and pointless examples aside for a moment and we take some industry standard schemas like SWIFT, MISMO, IFX or EDIFACT (this one is not XML-based, but supports the same argument), we may notice how semantic validation and syntax validation are two different steps.

There are validations that require access to a backend system (e.g. IBAN/PostCode validation) or elements that are mandatory in some cases, and optional in others.

Sometimes these conditions vary from party to party - in other words, some banks may want to enforce a validation rule where others don't (this is a typical example of EDIFACT). This is why banks normally write "implementation guides" for the external parties using such protocols.

So, to conclude, embedding such complex validations in a schema is - in my opinion - definitely a bad idea. It may work for trivial or strictly defined domains, but it will hardly work in contexts where each participant has its own rules (the majority of cases); that would turn into a nightmare, with some of the checks in the schema definition and others in the back-end business logic.

Do you still want to convince me? Do me a favour: take an industry standard schema such as MISMO, IFX, SWIFT MX, or anything you like - and show me where the benefits of adopting XDefinition are. Until then, it will stay safely in a drawer.

Daniel Kec replied on Mon, 2010/06/14 - 5:37am in response to: Alessandro Santini

I'm sorry for my "pointless example"; it is actually a consequence of my struggle to keep the examples as simple as possible. You are absolutely right that each party needs to evaluate the validated data differently. But every one of those parties needs to keep a description of that evaluation, a description of the processing itself. There are two kinds of XDefinition: one called the "validation interface", and the implementation. The point is that you can keep one interface for all the parties and let them extend it for processing purposes as they please. It goes a little deeper, but I do promise that I'm going to write an article about that eventually.

Alessandro Santini replied on Mon, 2010/06/14 - 6:33am in response to: Daniel Kec

Thanks Daniel... a bit more context will surely help people to fully understand the value of your proposition. Not everybody has time to get through the specs and play with them.

If you allow me a suggestion - safely assume that the reader is proficient enough with XML and the like; if he's not, he won't understand your posting anyway.
