
Enhancing Java - Multi-lingual blocks

05.02.2008

The reality for Java is that there are many other programming languages, and many of those have features that Java developers sometimes wish they could access. But it's simply impossible to add all those features. Is there a possible alternative if we think 'outside the box'?

Multi-lingual

What I'm thinking about in this blog is the possibility of embedding Groovy, Ruby, Jython or Scala code directly within Java code.

Why might that be useful? Well, each language has its own benefits, whether Scala's functional style or Groovy's GStrings. Including a small part of another language within the main code body could be useful, although obviously this would be a technique to be used with care.

And it doesn't have to stop at known languages. What about a dedicated 'SQL language'? Or a dedicated 'XML language'? These would be more than just DSLs, but actual languages with whatever syntax rules are most applicable.

So, what might the syntax look like?

public String fetchRow(int id) 
{
:groovy:
{ println "Row id: $id!" }


:sql:
{ SELECT %text% FROM my_table WHERE row_id = %id%; }
return text;
}

The idea is that a block of code, surrounded by curly brackets, can be identified as belonging to a different language. In this case I've used the syntax of the name of the language (which would have to be imported) surrounded by colons. Note that there is nothing specific about the syntax within the block. Bear in mind that the syntax isn't that important - it's the concept that matters.

The Groovy example - just normal Groovy code - outputs the row id using an embedded string. The SQL example is an invented 'language' where a column is read by id, and then returned to the Java code as the variable text.

So, what about the detail? Well, the approach requires two parts.

Firstly, there needs to be a parser for each language that understands the relevant syntax. This will typically be a variation of the normal parser for a 'real' language like Scala or Ruby. For a new language like SQL or XML, it would be written from scratch. The parser also needs to be able to recognise when the block of code in that language is complete.
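To make the last point concrete, here is a minimal sketch of how a tool might find where an embedded block ends, assuming the `:name: { ... }` syntax from the example above. The `BlockScanner` and `Block` names are invented for illustration, and the brace counting is deliberately naive: it ignores braces inside strings or comments, which a real per-language parser would have to handle.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: locates a ":lang: { ... }" block by naive brace counting. */
final class BlockScanner {
    // Matches a language marker such as ":groovy:" followed by an opening brace.
    private static final Pattern MARKER = Pattern.compile(":(\\w+):\\s*\\{");

    /** Holds the language name and the raw text found between the braces. */
    static final class Block {
        final String language;
        final String body;

        Block(String language, String body) {
            this.language = language;
            this.body = body;
        }
    }

    /** Returns the first embedded block, or null if none exists or it is never closed. */
    static Block firstBlock(String source) {
        Matcher m = MARKER.matcher(source);
        if (!m.find()) {
            return null;
        }
        int start = m.end(); // index just past the opening brace
        int depth = 1;       // one brace is already open
        for (int i = start; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                // The matching close brace marks the end of the foreign code.
                return new Block(m.group(1), source.substring(start, i).trim());
            }
        }
        return null; // unterminated block
    }
}
```

Counting braces only works for languages whose own syntax keeps braces balanced, which is one reason each language would probably need its own end-of-block rule.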

Secondly, the parser needs to be able to share variables with the surrounding code. As a basic principle, this can be thought of as a map, where the other language code can both read and write to the map. Of course this requires there to be a mapping between the various type systems - for Groovy this should be easy, other languages might find that more tricky.
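The map idea can be sketched in plain Java. Everything here is an assumption made for illustration: the `EmbeddedBlock` interface, the `FAKE_SQL` stand-in (which touches no database), and the guess at what a compiler might generate for `fetchRow`. The point is only to show locals being copied into a map, the foreign block reading and writing it, and the result being cast back to a Java type.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical contract: a foreign block sees the enclosing method's variables as a map. */
interface EmbeddedBlock {
    void run(Map<String, Object> variables);
}

final class SharedVariablesDemo {
    /** Stand-in for the :sql: block: reads "id" and writes "text". No real query is run. */
    static final EmbeddedBlock FAKE_SQL = variables -> {
        int id = (Integer) variables.get("id"); // read a Java local exposed to the block
        variables.put("text", "row-" + id);     // write a result back for the Java code
    };

    /** Roughly what generated code for fetchRow(int id) might look like. */
    static String fetchRow(int id) {
        Map<String, Object> variables = new HashMap<>();
        variables.put("id", id);                // copy locals in
        FAKE_SQL.run(variables);                // run the foreign block
        return (String) variables.get("text");  // copy the result out, cast to the Java type
    }

    public static void main(String[] args) {
        System.out.println(fetchRow(42)); // prints "row-42"
    }
}
```

The casts are where the type-system mapping shows up: with a loosely typed map, a mismatch between what the block writes and what the Java code expects only surfaces at run time.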

So how hard is this to implement? Probably pretty hard. But it does open up lots of possibilities - whether for embedded DSLs or larger blocks of code in another language.

Summary

This is an outline of an idea to allow other languages, whether existing or new, to be easily embedded directly in existing code. Any thoughts?

Published at DZone with permission of its author, Stephen Colebourne.


Comments

Christopher Brown replied on Fri, 2008/05/02 - 7:07am

I suspect something similar to LINQ for .NET might be a better overall approach than templated SQL (or JPA QL) for many cases. There might be some corner cases where it's useful, but I suspect it'll just cause readability headaches (different models of program flow, different data representations). Furthermore, if this idea were adopted, one might want to use language-mixing where Java is NOT the primary language (Groovy with embedded Java), which could wreak havoc with symbols and DSLs.

From a practical viewpoint however, apart from identifying the language, it'd also be necessary to identify the language version (to take into account semantic differences, different keywords and other capabilities), and as such it'd be a big burden on IDEs (which already use declared "language version" parameters on Java-only projects).

This could be solvable by splitting a class into multiple source files (like .NET partial classes), but I don't see any advantage of that when compared to external sources being invoked as at present through script engine interfaces.

It also reminds me of the "spaghetti code" you get when mixing HTML, JSP tags, and Java in the same file.

- Chris

Austin Kotlus replied on Fri, 2008/05/02 - 7:52am

I really like the concept.  It may be more clear/structured if this were limited to method implementations.  Simply allow a user to specify the implementation language/version for a method as follows:

@Language("Groovy", "1.5")
public String fetchRow(int id)
{
     println "Row id: "+id;
}  
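One caveat on the form above: Java annotations only allow the unnamed shorthand for a single `value` element, so two positional arguments would not compile today. A minimal sketch of how such a marker could be declared and read back in current Java, assuming a hypothetical `Language` annotation with a named `version` element (the method body below is ordinary Java, since nothing here hands it to a Groovy compiler):

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

/** Hypothetical marker naming the language a method body is written in. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Language {
    String value();
    String version() default "";
}

final class AnnotatedRows {
    // Only the annotation is illustrative; the body is plain Java.
    @Language(value = "Groovy", version = "1.5")
    public String fetchRow(int id) {
        return "Row id: " + id;
    }

    /** Reads the marker back via reflection, as a build tool or agent might. */
    static String describe(String methodName) {
        try {
            Method m = AnnotatedRows.class.getDeclaredMethod(methodName, int.class);
            Language lang = m.getAnnotation(Language.class);
            return lang == null ? "Java" : lang.value() + " " + lang.version();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
    }
}
```

Actually compiling the body as Groovy would still need compiler support, since the Java parser has to accept the method body before any annotation processing sees it.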

Rory Marquis replied on Fri, 2008/05/02 - 8:43am

I guess I would prefer it if something like this worked in the same vein as native methods work. Failing that, the annotation approach is nicer.

However, I would have 3 primary objections:

1. It could become horribly messy in there, it's a future maintenance nightmare.

2. A developer may have to know more than just Java to get a pure Java development role, thus diluting required skill sets even more.

3. A developer may decide to do something in Ruby that other developers of the same software know nothing about. So great for the lone wolf, rubbish for the team.

 

Vlad Koval replied on Fri, 2008/05/02 - 9:11am

Integration nightmares are expected.

Artur Biesiadowski replied on Fri, 2008/05/02 - 2:16pm

Certainly not inside java, but I don't see a reason why java could not be one of the possible dialects allowed inside.

Basically, a new language, with syntax focused on defining interfaces and allowing other languages to be embedded inside for the implementation. A compiler with pluggable modules for those languages (with Java being one of them). An Eclipse plugin supporting all this. A special source file extension to distinguish it from normal Java code (I would propose .mess).

Java bytecode can survive a lot of abuse. As far as maintenance is concerned, refactoring would be a major pain for sure, and decoding cryptic one-liners in Scheme not so fun, but the biggest problem IMHO would be to keep the build system working. With each new version of Java, some of the plugins would fail here and there - forcing you to spend time upgrading them (possibly rewriting parts) just to support a few lines of cryptic languages, or rewriting offending parts in less cryptic languages.

Stefan Schulz replied on Fri, 2008/05/02 - 4:03pm

It's funny how people tend to raise warning signs on proposals that are a bit broader in capabilities. So what's the current situation? It ranges from DSLs being implemented by stretching API design to an extreme, to proposals to add XML as native Java syntax.

While I do not see Groovy or JRuby as main targets for such a "nested DSL" feature, integrating structured and readable DSL elements certainly is advantageous. More than that, it gives compile-time control on embedded code like XML, SQL, or even allows for multi-line strings (heredoc). A DSL-parser can specify the API to be used, including some type checking to match parameters and maybe a return type.

Restricting the feature to method level would actually defeat its purpose. I'd rather have compile-time supported nested DSLs than Java-ified DSL-APIs that probably become a runtime nightmare for debugging.

nitin pai replied on Sat, 2008/05/03 - 12:47pm

I completely agree with Roridge. 

It would be a nightmare. But the idea that you have suggested is already being made possible. What else are JRuby, Jython or Groovy? They are simply concepts or programming methodologies from different languages applied to Java. There has to be at least a lowest common denominator to run the overall application. Here it is Java. How can you think of integrating Java code into Ruby when the environments are different? How would you go about memory management or object referencing across two disparate environments?

There is a solution which has made applications multi-lingual. It's called SOA. But it is not programming for multiple programming environments. .NET may be the concept you are trying to suggest, but hey, C# and VB are both Microsoft products. Are you getting the point? There has to be a common runtime environment, without which it isn't possible. The JVM is the lowest level at which you can bring different programming concepts together. How would you go about integrating a JVM with a C runtime or a Perl interpreter?

Java has made it possible to experiment with Groovy or Scala on the JVM. But the compile-time environments are still different. Component modularity might be achievable, as Groovy has made possible at the bytecode level. As in the example you stated, compilers for different environments would have to be integrated, which would make it a mess for the creators as well as for the developers, who would then have to adhere to complex specs.

Overall, I don't like the idea at first glance. Mixing and matching programming environments in an atomic component, as you have shown in your example, is simply indigestible.

Mrmagoo Magoo replied on Sun, 2008/05/04 - 4:16pm

God no!?!?

Hey Zeus, this would be a complete nightmare! You mean on top of developers being able to write arbitrarily terrible code in many arbitrarily bad ways, they can now do it in any number of languages?!?!?

 

Are you trying to kill people like me?? 

 

Mr M.

aka: The Fixer Guy who has to go through and clean up other people's messes once they are done... 

Collin Fagan replied on Mon, 2008/05/05 - 10:39am

I like the idea of blocks of code that have custom syntax rules. The SQL block seems like a good fit. I would like to see something like a :RegEx: block too. This way all the escaping necessary to embed a regular expression into a string could be avoided, and the compiler/ide could check the RegEx for syntax at compile time. As for the maintenance nightmare subject. Language mixing happens today except right now everything is embedded in string variables. That just makes things even messier. Could this be abused? Yes, but embedding scripting languages can be abused anyways.

Mrmagoo Magoo replied on Mon, 2008/05/05 - 3:54pm in response to: Collin Fagan

Sorry, totally disagree. :) I might be half blind, but I can still smell bad code when I hear it!

Right now, the multi-language environments of today happen in "areas". The project we are currently working on has Groovy in the Ant scripts. But that is the only place.

JSP is on the web front end.

JavaScript in the JavaScript files. (because putting it in the JSP directly is generally a bad idea)

Java in the Java files.

Should we feel the need to use Groovy/Jython/JavaFX etc. they will also go in their own files and there will be clear seams between these areas! (so I don't go batsh*t crazy when the next script monkey they hire goes nuts with embedded Groovy in my well designed and type safe architecture!)

What is being proposed here is an "anything goes" approach across the entire code base. Imagine your architecture being littered with scriptlets??

Why are people saying that scriptlets are fine in a strongly typed language (yes, that is a GOOD thing for enterprise code!) when it is common knowledge/sense/understanding that they are bad in a JSP page?? (I will naively assume that the people here agree on this point...??)

The day I found that Hibernate and SQL in my Java pages were banned (actually that required moving to a new company...but whatever) was a great day indeed!

Scriptlets in JSP == B.A.D code smell. SQL in Java also stinks.

How on earth is putting "scriptlets" in JAVA code any better?!! Sounds like web developers trying to infect us enterprise guys with their dreams of a Ruby-esque world?? :)

In fact: QED. What more proof than this analogy do I need??

Smells like code candy to me? Tastes nice, gives you diabetes down the line.

But what would I know? I am just the janitor?  

 

 

Collin Fagan replied on Mon, 2008/05/05 - 8:07pm in response to: Mrmagoo Magoo

I understand what you are saying, and the more I think about it the more I wonder about the ramifications of multiple intertwined groovy/jruby/* runtimes on garbage collection, thread management and an entire host of other details. I looked at this and saw a better way to do regex. One that does not include embedding the expression in a string and having to escape \ with another \ every time. Have you ever tried to match against \? It's like 8 \'s to get it into a pattern. There are plenty of things that get shoved into strings because there is no concise syntactic way of expressing them. A customizable syntactic block sounds like an attractive, but probably dangerous, way of achieving it.
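For anyone who hasn't hit this, a small sketch of the escaping in today's Java, using only `java.util.regex.Pattern` (the `BackslashEscaping` class name is just for illustration). Matching one literal backslash takes four in source; the count reaches eight when the replacement text must also contain backslashes, because the replacement string has its own escaping rules.

```java
import java.util.regex.Pattern;

final class BackslashEscaping {
    // A regex for ONE literal backslash. The regex itself needs two of them,
    // and each must be escaped again in the Java string literal, giving four.
    static final Pattern ONE_BACKSLASH = Pattern.compile("\\\\");

    /** Doubles every backslash in the input. */
    static String doubleBackslashes(String input) {
        // The replacement string also treats the backslash as an escape character,
        // so emitting two literal backslashes takes four in the replacement,
        // which is eight once written as a Java string literal.
        return ONE_BACKSLASH.matcher(input).replaceAll("\\\\\\\\");
    }

    public static void main(String[] args) {
        // The argument is the 7-character text C, colon, one backslash, t, e, m, p.
        System.out.println(doubleBackslashes("C:\\temp"));
    }
}
```

A dedicated regex block would presumably let the pattern be written once, unescaped, and checked at compile time.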

Mrmagoo Magoo replied on Mon, 2008/05/05 - 8:26pm in response to: Collin Fagan

<clicking more than once is bad for your health>

Mrmagoo Magoo replied on Mon, 2008/05/05 - 8:25pm in response to: Collin Fagan

I have indeed. In general, I tend to place the regexes outside of a Java class (like with SQL/JS etc) because it is generally config. This is not always practical of course and is a real pain.

I did a paper on multiparadigm programming at Uni. It was fun. Basically used a Java precompiler to allow functions as first class variables/prologue assertions/AOP etc etc. All in Java. Was kind of neat, especially since the actual "under the hood" was just plain Java and not that much of it either.

Allowed a lot of "power" in the code, with little overhead. 

But now I have been in industry - no way hose ay! There are day to day realities of being a developer that cannot be ignored with this stuff. It's all fine for us hardened vets, but not everyone is like that.
