I am a lifelong student of computer science, music, and literature. In pursuit of those interests, I work by day as a programmer at Chemical Abstracts Services, moonlight as the creator and curator of Mashed Code Magazine, review books for The Pragmatic Bookshelf and Manning, and listen to a fascinating collection of music while performing all of my duties. I have specialized in working with web services and mastering various testing techniques and tools. I am finally done with formal education, having a B.A. in English from The Ohio State University and an M.S. in Computer Science from Franklin University. Nick is a DZone MVB, is not an employee of DZone, and has posted 15 posts at DZone.

Ant Task for Encoding Text Files – Reencode 0.1 Launched

10.29.2012
I am announcing the initial launch of a tiny open source project called reencode.

Reencode is a small set of tools—aimed at developers in the Java ecosystem—that help with changing the character encoding of text files. Initially, the tools consist of an API (if you can call one class an API :) ) and, far more useful, an Ant task. A Gradle task will be following shortly.

The code is licensed under the Apache License v2.0 and is hosted on GitHub here. You can obtain the .jar file for version 0.1 here.

Using the Ant task is simple. Detailed instructions are on the project's wiki, but here is a quick example:

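<?xml version="1.0" encoding="UTF-8"?>
<project name="sample" default="default">
    <!-- Register the task class shipped in the reencode jar -->
    <taskdef name="ReEncode"
             classname="org.reencode.tools.ant.CharEncodingConverter"
             classpath="reencode-0.1.jar"
    />
    <target name="default">
        <!-- Read each file as UTF-8 and write it as ISO-8859-1 into "out" -->
        <ReEncode inputEncoding="UTF-8" outputEncoding="ISO-8859-1" todir="out">
          <!-- Match every .txt file directly under "samples" -->
          <fileset dir="samples" includes="*.txt"/>
        </ReEncode>
    </target>
</project>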

The sample above re-encodes every .txt file in the samples directory from its known current encoding, UTF-8, to ISO-8859-1. The re-encoded files are written to the directory named out.
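For readers curious about what a re-encoding step involves, here is a minimal sketch in plain Java. It is not reencode's own API or implementation; the class name ReencodeSketch and its methods are hypothetical and use only the standard java.nio classes. The idea is simply to decode the bytes with the source charset and encode the resulting text with the target charset.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; this is NOT the class shipped in reencode.
final class ReencodeSketch {

    // Decode the bytes using the source charset, then encode the text using the target charset.
    // Characters the target charset cannot represent are replaced (with '?' for ISO-8859-1).
    static byte[] reencode(byte[] input, Charset from, Charset to) {
        String text = new String(input, from);
        return text.getBytes(to);
    }

    // Re-encode one file into a target path, creating the target directory if needed.
    static void reencodeFile(Path source, Path target, Charset from, Charset to) throws IOException {
        byte[] converted = reencode(Files.readAllBytes(source), from, to);
        Path parent = target.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        Files.write(target, converted);
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "café"; the accented letter takes two bytes in UTF-8 and one in ISO-8859-1.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = reencode(utf8, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println("UTF-8: " + utf8.length + " bytes, ISO-8859-1: " + latin1.length + " bytes");
    }
}
```

Note that this approach silently substitutes characters that have no equivalent in the target encoding, so converting from UTF-8 to ISO-8859-1 can lose information. Calling reencodeFile on a file under samples with a target under out would mirror, for a single file, what the Ant sample does for the whole fileset.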

If you put these tools to use, please don’t hesitate to provide feedback or ask questions as comments below or on GitHub. Pull requests are welcome too! Enjoy.

Published at DZone with permission of Nick Watts, author and DZone MVB. (source)
