Hamlet D'Arcy has been writing software for over a decade, and has spent considerable time coding in C++, Java, and Groovy. He's passionate about learning new languages and different ways to think about problems, and recently he's been discovering the joys of both F# and Scheme. He's an active member of the Groovy Users of Minnesota and the Object Technology User Group, is a committer on the Groovy project, and is a contributor on a few open source projects (including JConch and the IDEA Groovy Plugin). He blogs regularly at http://hamletdarcy.blogspot.com and can be found on Twitter as HamletDRC (http://twitter.com/hamletdrc). Hamlet is a DZone MVB and is not an employee of DZone and has posted 28 posts at DZone.

Groovy ANTLR Plugins for Better DSLs

02.20.2010

DSLs, ANTLR, and Groovy in one blog post? Oh yes, this should be good: a trifecta of interesting keywords. There is a horrible scourge upon Groovy-based Domain Specific Languages, and no, I'm not talking about curly braces. That awful syntax blocking our users from natural-language-based productivity is that insidious amalgamation of about 5 pixels called "the comma". If only we could rid ourselves and our users of this terrible burden!

Some are trying: GEP-3 is the Groovy Enhancement Proposal called "Command Expression based DSL" that will allow some commas to be optional. While interesting, there has not been much public activity on it recently. One of the requirements of dropping commas from method invocations is that "the evaluation must be easily explainable". I think most people agree that we have more work to do before the proposal is easily explained.

So where does that leave us? Does Groovy force these spurious commas on unsuspecting programmers? Hardly. The compiler architecture of Groovy is fairly open, and with a little creativity you can make the commas optional in your DSL. As long as you have access to the CompilerConfiguration then you have options, whether it be an AST Transformation or an ANTLR Plugin. And remember, if you are using GroovyShell then you have access to it.

As an example, consider the easyb syntax for behaviors, where a comma normally separates each description string from its closure. What I would like to see is that comma dropped. Mmmm... it looks so much better without the commas:

given "some data" {
    println '... setting expectations'
}
when "a method is called" {
    println '... calling some method'
}
then "some condition should exist" {
    println '... making an assertion'
}

At some point in the compilation process, this source code will be represented as a text stream. What we're going to do is intercept that text stream, and provide some simple rewrite rules to add the comma in where it should be. An ANTLR plugin can intercept this text, add commas into it, and then pass it on to the Groovy compiler: Groovy is none the wiser. There is some boilerplate wiring to do, but the bulk of the work is defining the rewrite rule (also known as a production). For the simple case, we can use a regular expression to add the comma in:

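String addCommas(text) {
    // Groups: 1 = leading text, 2 = keyword, 3 = string body,
    // 4 = last escaped-character run inside the string, 5 = rest of the line
    def pattern = ~/(.*)(given|when|then) "([^"\\]*(\\.[^"\\]*)*)" \{(.*)/
    def replacement = /$1$2 "$3", {$5/
    (text =~ pattern).replaceAll(replacement)
}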

Does a regular expression scale well to larger problems? Not really. At some point you will need an alternative (maybe even at this point!). For instance, this regex handles escaped quotes inside the description string, but the string must be double-quoted; Groovy's single-quoted and multiline strings are not supported. Oh well, it is just an example.
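To make those limits concrete, here is a minimal sketch that exercises the rule on a few one-line inputs. The name addCommasSketch is made up for this illustration, and it is a variant of the rule rather than a copy: the replacement is a single-quoted string, and the rest of the line is referenced as group 5, since the nested escaped-character group counts as group 4.

```groovy
// Hypothetical name; a variant of the post's rule, restated so the snippet runs alone.
String addCommasSketch(String text) {
    // Groups: 1 = leading text, 2 = keyword, 3 = string body,
    // 4 = last escaped-character run, 5 = rest of the line after the brace
    def pattern = ~/(.*)(given|when|then) "([^"\\]*(\\.[^"\\]*)*)" \{(.*)/
    // Single-quoted so that $1, $2... reach the regex engine untouched
    (text =~ pattern).replaceAll('$1$2 "$3", {$5')
}

// Plain double-quoted description: the comma is inserted
assert addCommasSketch('given "some data" {') == 'given "some data", {'

// Escaped quotes inside the description are handled
assert addCommasSketch('when "a \\"quoted\\" word" {') == 'when "a \\"quoted\\" word", {'

// Text after the opening brace on the same line is preserved
assert addCommasSketch('given "x" { println 1 }') == 'given "x", { println 1 }'

// Single-quoted descriptions do not match, so the line comes back unchanged
assert addCommasSketch("then 'some condition' {") == "then 'some condition' {"
```

The last assertion is the failure mode to watch for: an unsupported string style is silently left alone, and the missing comma only surfaces later as a compiler error.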

The "wiring together boilerplate" consists of subclassing AntlrParserPlugin so that you can rewrite the text, and subclassing ParserPluginFactory so you can wire in your AntlrParserPlugin subclass. The ParserPluginFactory can then be passed directly to the CompilerConfiguration, which is passed to GroovyShell. That makes no sense to me even as I write it, so it is probably best to go look at the full source code listing in Groovy Web Console.

For those of you using browsers not supporting anchor tags, here is the code inline:

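import org.codehaus.groovy.antlr.AntlrParserPlugin
import org.codehaus.groovy.control.CompilationFailedException
import org.codehaus.groovy.control.CompilerConfiguration
import org.codehaus.groovy.control.ParserPlugin
import org.codehaus.groovy.control.ParserPluginFactory
import org.codehaus.groovy.control.SourceUnit
import org.codehaus.groovy.syntax.Reduction

// Rewrites the source text before the stock ANTLR parser sees it
class SourceModifierParserPlugin extends AntlrParserPlugin {
    Reduction parseCST(SourceUnit sourceUnit, Reader reader) throws CompilationFailedException {
        // addCommas (shown earlier) must be reachable from this class,
        // for example as a method defined on it
        def text = addCommas(reader.text)
        StringReader stringReader = new StringReader(text)
        super.parseCST(sourceUnit, stringReader)
    }
}

// Factory that hands the custom plugin to the compiler
def parserPluginFactory = new ParserPluginFactory() {
    ParserPlugin createParserPlugin() {
        new SourceModifierParserPlugin()
    }
}

def conf = new CompilerConfiguration(pluginFactory: parserPluginFactory)
def binding = ...
def shell = new GroovyShell(binding, conf)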


And once you have a GroovyShell you can evaluate the world! Including the pseudo-easyb script from the beginning of the post. It runs with no problems... missing commas and all.
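If you just want to watch the comma-free script run, here is a rough sketch that takes a shortcut: it skips the parser plugin entirely and rewrites the script text by hand before passing it to a plain GroovyShell. That is a different, simpler technique than the ParserPluginFactory wiring, shown only because it does not depend on the parser classes. The names insertCommas, step, and log are invented for this illustration, and the given/when/then closures in the binding are stand-ins, not easyb's real implementation.

```groovy
// Hypothetical helper: a variant of the post's rewrite rule (group 5 = rest of line).
String insertCommas(String text) {
    def pattern = ~/(.*)(given|when|then) "([^"\\]*(\\.[^"\\]*)*)" \{(.*)/
    (text =~ pattern).replaceAll('$1$2 "$3", {$5')
}

// Stand-in DSL: each keyword is a closure in the binding that records
// its description and then runs the body.
def log = []
def step = { String keyword ->
    return { String description, Closure body ->
        log << "$keyword $description".toString()
        body()
    }
}
def binding = new Binding([given: step('given'), when: step('when'), then: step('then')])

def script = '''
given "some data" {
    println '... setting expectations'
}
when "a method is called" {
    println '... calling some method'
}
then "some condition should exist" {
    println '... making an assertion'
}
'''

// Rewrite first, then evaluate: the shell only ever sees the comma version.
new GroovyShell(binding).evaluate(insertCommas(script))

assert log == ['given some data', 'when a method is called', 'then some condition should exist']
```

The parser plugin buys you the same rewrite applied automatically to every source the shell compiles, at the cost of depending on Groovy's parser classes.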

ANTLR plugins have been around Groovy for a long time, and this example is based off of Guillaume Laforge's famous Groovy Web Console Script #3. There is nothing particularly hard about writing an ANTLR plugin, but there might be something difficult with maintaining it. DSLs come with a host of issues including versioning. If you create an external DSL then you've published a language. It's fun at first but not so much later.

And there you have it. No more of those despicable commas! Now we just need to do something about those despicable DSLs.

From http://hamletdarcy.blogspot.com

Published at DZone with permission of Hamlet D'Arcy, author and DZone MVB.
