
iText Summit – iText 5.2.0 – XML Worker 1.1.2

02.29.2012

On March 29, Ghent (Belgium) will be the place to be. iText is organizing a summit about PDF with speakers from all over the world. Leonard Rosenthol (USA) will talk about the upcoming ISO standards. Mark Stephens (UK) will compare HTML5 to PDF and discuss if and when it makes sense to see HTML5 as an alternative to PDF. You'll also be able to meet the original iText developers, Paulo Soares (Portugal) and yours truly (Bruno from Belgium).

We'll focus on digital signatures, with Paulo explaining PAdES and Paul van Brauwershaven demonstrating best practices in certifying and signing PDFs. We'll also reveal how iText is successfully used on Android and Google App Engine. One of our customers will present "Damn, the new generation kids are getting iPads in Highschool!", a talk about a project where the Google App Engine platform is connected to a native iPad solution distributing books and content.

Please visit the schedule for more info.

In the meantime, we've released iText 5.2.0 and XML Worker 1.1.2. Take a look at the changelog for the details, or read on for a human-readable overview of the changes.

The major enhancement in iText involves parsing PDFs. We received plenty of feedback regarding PDF parsing, and we've taken into account almost all the issues that were reported. As a result, PDF to text conversion with iText has improved dramatically. Soon the Belgian IRS will start using iText to parse thousands of documents looking for a national number on the first page. We're using two strategies to do this: if we know where the number is, we parse the text at that specific position; if the number can be anywhere on the page, we parse the whole page looking for a pattern. We've also improved the parsing of PDF documents in languages such as Chinese, Korean, and Japanese.
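To make the second strategy a bit more concrete, here is a minimal sketch of the pattern search only. It assumes the text of the page has already been extracted into a String (for instance with iText's `PdfTextExtractor`), so it uses nothing but the Java standard library. The class name `NationalNumberFinder`, the method `findFirst`, and the written format `YY.MM.DD-XXX.CC` are assumptions made for illustration, not part of iText and not necessarily what the project described above uses.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical helper (not part of iText): scans already-extracted page text
 * for something that looks like a national number written as YY.MM.DD-XXX.CC.
 * The written format is an assumption; adapt the pattern to your documents.
 */
final class NationalNumberFinder {

    // Assumed layout: three two-digit groups separated by dots, a dash,
    // three digits, a dot, and two digits. The lookarounds prevent a match
    // from starting or ending in the middle of a longer run of digits.
    private static final Pattern NATIONAL_NUMBER =
            Pattern.compile("(?<!\\d)\\d{2}\\.\\d{2}\\.\\d{2}-\\d{3}\\.\\d{2}(?!\\d)");

    private NationalNumberFinder() {
    }

    /** Returns the first candidate found in the text, or empty if there is none. */
    static Optional<String> findFirst(String pageText) {
        if (pageText == null) {
            return Optional.empty();
        }
        Matcher matcher = NATIONAL_NUMBER.matcher(pageText);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up sample text standing in for the extracted first page.
        String pageText = "Name: J. Janssens\nNational number: 12.34.56-789.01\nPage 1";
        System.out.println(findFirst(pageText).orElse("not found"));
    }
}
```

The sketch only checks the shape of the number. A real application would probably also validate the candidate, and would fall back to the position-based strategy when the layout of the document is known.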

Of course, we also fixed some other bugs, and we introduced some new functionality. For instance, with the unification of cmap handling, we can now support all encodings for CJKFonts. We also extended the maximum size limit for PDF files: iText can now be used to create PDFs of up to 1 terabyte! As we're investing in development for Android and GAE, we reduced the dependency of iText on Java AWT.

As for XML Worker: after writing support for the XML Forms Architecture (XFA by Adobe), we decided to use the lessons learned to improve XML Worker. The internal structure has been improved. For instance, it's now much easier to define a custom FontProvider (and therefore easier to create PDF from HTML using fonts with special characters).

We also took into account the many issues that were reported on the mailing list, for instance regarding tables that weren't rendered correctly or CSS styles that were ignored.

Finally, we started implementing an SVG parser. We released a first version, but it's still very experimental.

In short, plenty of news from iText, and plenty of new stuff to try out!

Published at DZone with permission of its author, Bruno Lowagie.
