
Experiences With Gradle? Or, What I Really Want in a Build Tool...

09.07.2010

I'm looking at build tools for a fairly large project (mostly Java with some Jython), and I'm pretty dissatisfied with all of the standard options out there. Gradle looks like it might be able to do the things I want. I am wondering: is anybody out there using it who could talk about their experiences with it?

Another interesting project, though it looks quite young, is Raven.

Here are the cases where, IMO, the usual suspects (Maven, Ant w/ customizations, Ant+Ivy) fail:
  1. I strongly believe in decomposing software into small libraries that do one thing well. Build systems that have a concept of inter-project dependencies at all tend to treat other projects exactly the same way as third-party libraries. This adds enormous overhead to the development process when you are working across multiple libraries. 

    Building/running/testing/profiling a project which you have the source to locally is fundamentally a different activity than doing the same against a 3rd party library, and you should be able to run against .class files rather than a packaged JAR in that case (you should also, of course, be able to run against packaged resources too).

    While IDEs can do some compile-on-save magic, this really needs to be a first-class feature of the build environment, not a situation where IDEs hack around limitations of the build tool.

  2. XML is great for describing data and horrible for describing behavior. Builds - the parts that need customization, anyway - are mostly behavior.
  3. Convention over configuration is great when it works, but when the convention is wrong for the project, there should be a way to override it in a straightforward, readable way (i.e. I do not want to write Maven plugins, and if I do, nobody will be able to see the nature of the customization from the pom.xml file, they'll just see that the build uses some opaque thing they've never heard of before).

     Along with this, there needs to be a way for plugged-in code to figure things out in the case that customization is happening - for example, I have an Ant task that cross-links and merges Javadoc. Convention should not be the only solution to that problem.

  4. 3rd party libraries do need to be handled well, and builds need to be portable without putting JARs in version control. The one (only) redeeming trait of Maven is that it gets this part right enough to be valuable.

     We tried using Apache Ivy as an alternative - I was hoping that Ivy would give us the library handling of Maven without dictating every other aspect of how our software is built. However, a strange thing happened on the way to liberating library-downloading from totalitarianism - Ivy seems to be totally focused on defining the One True Way All Repositories Must Be Organized. Of course, to be actually useful, you need Ivy to talk to a messy, disorganized Maven repository. To do that, you (cue drumroll)...write regular expressions to map that Maven repository into The Way God Intended Libraries To Be Organized. I will not waste my or anybody else's time doing that.

  5. Version labels should be based on DVCS ids, and that should be normal and easy. If you are using a version control system such as Mercurial, then instead of arbitrary numbers you have an easy way to encode, as the "version" of the software, complete information about how to reconstruct the exact bits - just run hg id. That's much more reliable than knowing that 1.0 maps to an SVN tag.

     Incrementing version numbers may be needed for other reasons, but it should be possible to embed this information in the software without extreme measures or resorting to something like magic CVS version tags, or rewriting the way JARs get built.

  6. Automatically downloading the closure of a project's dependencies is nice, but some control is needed. Maven projects tend to grow giant tails of library dependencies. If you know you can safely give your dependency tree a haircut, there should be a non-hacky way to do that.

     Ironically, this happens because many projects are not factored out into enough small libraries that do one thing well. Why aren't most projects factored this way? Because all the available build tools add too much overhead to the development process if you do that (mainly in the form of rebuilding things you shouldn't need to rebuild, if you are working in multiple projects at the same time).

  7. Circular test dependencies should be possible. If Project A defines a data access layer, Project B implements it on a production database, and Project C implements it on an embedded database for ease of development, I should be able to put implementation tests in Project A. The alternatives are to duplicate the tests and have them get out of sync, or to have a Project D that contains the tests for Project A. The former is obviously a bad idea, and the latter is pointless overhead (not to mention that with Maven, you'll be doing extra cleaning and building to pick up changes you made in B or C).

     Absolutely, circular compile-time dependencies should be forbidden. But running tests is not the same thing as compiling, and a build system should not treat it as if it were.

  8. Test output should be optimized for development-time, not for report generation. For me, being a System.out.println() kind of guy, Maven's test output handling is mind-bogglingly awful. All I really ever want out of development-time logging (which is not the same thing as deployment-time logging) is: show me the console output, all of it, in the order it was logged. Do nothing clever with separating System.out and System.err, nothing clever with separating different tests into different files. Just give me the fewest gestures possible to get to the failure point when a test fails.
  9. Parent-child relationships between projects should be a function of their dependency graph. Shared configuration (such as library versions or file encoding) is not a kind of project - and shoehorning it into something like a Maven "parent" project just places pointless limitations on what you can do (try having two parent projects some time - why should that be impossible?). Configuration is tree-like, and may indeed map to a dependency-tree, but not always.
The point of all of this (sorry if it devolved into a litany of complaints) is that all of the build tools I have worked with either don't solve the problems I want solved at all (roll-your-own dependency management with Ant), or make trade-offs which waste time in the development process (extra clean-and-build cycles, running tests you know will pass), or push the development process away from good practices in the interest of making automation and reporting easier (for a continuous build, running against a project you have the source to is like running against a library JAR). I don't think these things are necessarily in conflict - just that IDE developers are interested in a different set of issues than build-tool developers, even though both are fundamentally working on the same problem.
Published at DZone with permission of its author, Tim Boudreau.


Comments

Loren Kratzke replied on Tue, 2010/09/07 - 12:24pm

You have way too many issues to reply to individually, but generally speaking, if you can't do it in Maven then you probably shouldn't be doing it at all. The same applies when you try to bend Maven into a shape that is unnatural. Maven is not Ant, and generally you will want to retire all of your bizarre Ant tasks when moving to Maven because they should no longer be needed.

We use Maven, Subversion, Nexus, and Hudson with 70 Maven modules and probably about the same number of 3rd party libs without issues. We have one custom Maven plugin that unpacks some stuff and a few others that validate some stuff, but it's cool. It's not the end of the world. Testing and reporting is awesome, practically right out of the box.

Unless you manually declare dependencies beyond those which you actually need, Maven does not attract trivial dependencies. So I am not sure what you mean about giving your dependency tree a haircut. Migrating from Ant to Maven/Nexus/Hudson is probably the smartest thing that anybody can do when considering a large project. I strongly recommend this.

Jilles Van Gurp replied on Tue, 2010/09/07 - 2:23pm

You are on to something here. Personally, I see the need for build tools as an architectural problem in the Java world. Things have gotten so hideously complex that the only way to stay sane is to automate the moving around of files, the gazillion transformations of artifacts and zipping and unzipping of them, the downloading and manipulation of dependent artifacts, and the other repetitive rituals that dominate the dull work of a Java developer (as opposed to the actual fun and creative bits). Maven is not a solution; it just tries to hide the problem and, like all poor abstractions, ends up making the problem worse rather than better.

Part of the problem is that artifacts are file based. This is a convention that dates back to the seventies (before then, things were tape and punchcard based), but it is by no means the only way to deal with artifacts. For example, Smalltalk had an image-based system where the image contained all the artifacts, including the IDE and tools. Source code was not stored in separate files. Visual Age (IBM's predecessor to Eclipse) had a similar concept where all the Java code lived in a database, and source code files only came into existence when code was exported from the IDE, which was an optional step since normally you'd only need the binary class files and JAR files.

Another part of the problem is that version control systems (VCS) are also file based. You can store a Java file in a VCS, but you can't store a method or inner class. VCS tooling is especially poor at keeping track of relations between the stuff it stores - e.g., the notion that this artifact was produced from those artifacts is not something that VCS tooling keeps track of, generally. Hence the need for conventions and tooling to do it instead.

Then there is the whole ritual of transforming and moving around artifacts. You can't just deploy a single change you just made to some method. You have to first compile its source artifact together with loads of depending artifacts; then put the resulting artifact into a zip file with a weird manifest file that describes its content (aka jar or war file) which then needs to be copied over to some special location where something else (e.g. an application server) can pick it up and do something useful with it (which generally requires killing and restarting it).

Allow me to spell out the single command with which I used to update my staging server for a python Django project:

svn commit -m 'cool new feature' foo.py

That's it. A cron job on the staging server did a svn up and since it is all interpreted (or rather JITed) anyway, the change is effective immediately. No build tools required. Edit, commit, F5, Debug. Anything more is just poor architecture getting in your way. Maven is a work around and not a solution.

BTW, http://www.playframework.org/ does this more or less, showing that it is possible to work this way in Java as well.

Loren Kratzke replied on Tue, 2010/09/07 - 2:56pm in response to: Jilles Van Gurp

I'm just a simple caveman, but allow me to spell out a single command which I use quite often:

svn commit -m 'cool new feature' foo.java

That's it. Hudson scans SVN for changes, detects my commit, builds the modified module and all downstream modules that depend upon it, runs all unit tests, and then upon success uploads each artifact to the Nexus repository, from where they can be fetched by other Maven builds (there are ~100 devs). I need no cron job, and I deploy my webapp to the testing server on demand with a second command, because I want it that way.

How exactly is your system better than my system? How is your system (with its hand-rolled cron script) a solution and my system just a workaround? And regarding VAJ, that was the worst system ever designed - keeping code in DB2 as opposed to on the file system meant you were totally locked into VAJ/DB2 with zero options. YACK! HORRIBLE! For the record, I also hate Eclipse.

Martijn Verburg replied on Wed, 2010/09/08 - 3:59am

I think Maven 3 with its ability to have Groovy as a POM language will be a compelling build tool for complex builds. Convention over configuration for the easy stuff, Groovy goodness for the bits that aren't covered by a convenient Maven plugin. Has anyone experimented with this yet?

Tim Boudreau replied on Thu, 2010/09/09 - 3:35am

@Loren:
if you can't do it in Maven then you probably shouldn't be doing it at all.
The answer to this depends on what you mean by "can't". If you mean "it's really, truly impossible" then, sure, you can write custom plugins for anything and find a way to do things - the question is how much time I want to spend hacking my build tool rather than working on the software it's supposed to build. Say that I just want to append "Version: " + the output of running 'hg id' on the command line to a JAR manifest. That ought to be about 4 lines of code in a build script, which anybody can read and see "Oh. This runs 'hg id' and puts its output in the manifest." Writing a custom Maven plugin is a lot of overhead - and it means you have to read the docs or source code of the plugin to figure out what it does. Especially if it's to do something which would take three lines of code in a scripting language, that seems like really pointless complexity to me.
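
(For reference, a minimal sketch of what that kind of customization might look like in a Gradle build script, assuming the hg binary is on the PATH; the exact DSL details may vary between Gradle versions:)

    // Sketch only: stamp the output of 'hg id' into the JAR manifest.
    // Assumes 'hg' is available on the PATH of the machine running the build.
    apply plugin: 'java'

    def hgId = 'hg id -i'.execute().text.trim()

    jar {
        manifest {
            attributes 'Version': hgId
        }
    }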

I am not sure what you mean about giving your dependency tree a haircut
Off the top of my head, one simple case: I have a library that provides some convenient features on top of Guice (flexible handling of @Named and a few other things). Its author decided to include a few utility classes that reference the Servlet API. If I *know* the code that calls it will never touch those classes, I don't want that dependency.

I'm also coming from years of working mostly in an environment (the NetBeans module system) where dependencies are something which are managed very strictly - so the idea of managing the classpath doesn't disturb me nearly as much as the idea of not actually knowing what I'm deploying, or adding one JAR to my project logically adding 30 JARs without me realizing it until I've already written code that depends on that one JAR.

@Jilles

Part of the problem is that artifacts are file based
I agree with both you and Loren on this :-) There are a lot of really cool things you can do (effortless, 100% accurate refactoring and code analysis, or making formatting simply an attribute of the view, and semantic diffs, all for free, for example) if code is stored in some kind of database-like back-end which actually understands the code semantically.

Alas, the problem with that is that, to make it into a wonderful new world instead of a nightmarish prison, you need the entire rest of the world to decide they're going to store everything that way too. Files are the lowest common denominator, so they end up being the only thing somebody making a tool can reliably depend on talking to; to reach the widest audience, it's a race to the bottom. It's kind of like SQL - a great language for running reports, and totally the wrong tool for the kinds of things it is often used for (a rant for another day). What you get in exchange for holding your nose is lots and lots of tools that can speak SQL, and databases with hundreds of person-years invested in optimizing them (as with files and filesystems). That doesn't change the fact that the paradigm is fundamentally mismatched to the domain, but it can make it worth living with.

@Martijn: Groovy-in-Maven could be interesting, if it really is free to customize the build sufficiently. My suspicion is that a bunch of the things I want (optionally test against .class files, perform minor custom code generation, tee test output handling to be developer-friendly and report-friendly at the same time, allow circular test dependencies, generally behave differently when dealing with project dependencies vs. JAR dependencies) are so low level that they won't be overridable without having to duplicate functionality that you otherwise get for free (could Groovy *really* just plug into classpath assembly or output handling, without having to write a replacement for a much bigger build step?). But you're right, there is some hope.

Jesse Glick replied on Mon, 2011/04/04 - 10:37am

Regarding point #1, so long as the modules in question can be built in the same reactor - i.e. Maven can "see" where all the source projects are - you can indeed run something like 'mvn test' in which case the 'package' phase will not be run, and tests will be run with a classpath consisting of $eachmodule/target/classes plus the actual 3rd-party dep JARs. --also-make --projects $onemodule is also helpful, though currently this does not support running dependencies to an earlier phase than the enumerated modules; I mentioned this in http://www.mail-archive.com/dev@maven.apache.org/msg85957.html though I did not give the best example (posted an update just now).

#3 - so long as the Maven plugin offers customizations sufficient for your purposes, you can just configure it in the POM with no need for a custom plugin. For widely used plugins the customizations most people need tend to get written.

#5 - maven-release-plugin will create tags in Mercurial like anything else. Using a DVCS ID instead of a human-oriented version number is definitely an interesting idea but it will not play well with most of the universe of developer tooling which really expects a linear progression of versions, not a DAG of patches. http://pettermahlen.com/2010/02/20/git-and-maven/ is an interesting read on this topic. Whether branchy development and release-type dependencies can naturally coexist is a rather subtle, and probably open, question.

#6 - you can specify exclusions on dependencies and it is pretty easy, assuming you know what is safe to exclude. I agree that the root problem is insufficiently factored libraries, though I suspect the primary reason for this is ignorance of how much mayhem is caused by this kind of laziness, especially among people unaccustomed to working with module systems (at development time and/or runtime).
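
For example, excluding a transitive artifact you know you will never touch looks roughly like this (the library coordinates below are made up for illustration; the Servlet API coordinates are the real ones):

    <dependency>
      <groupId>com.example</groupId>
      <artifactId>guice-extras</artifactId>
      <version>1.0</version>
      <!-- Drop the transitive Servlet API pieces this library drags in -->
      <exclusions>
        <exclusion>
          <groupId>javax.servlet</groupId>
          <artifactId>servlet-api</artifactId>
        </exclusion>
      </exclusions>
    </dependency>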

#7 - put the tests in C. After all, you are implicitly testing your mock implementation, not just your API. Better still is to create a mock which is so lightweight that it does not even need an actual embedded DB, and can thus live in A to begin with. I also do not agree that test dependencies are fundamentally different from other dependencies; you can, and sometimes do, publish test libraries as precisely versioned artifacts in a shared repository. That means they need to be buildable in isolation.

#8 - true, it would be nice if Surefire had an alternate output mode geared toward interactive use (rather than use on a CI server, say). This would be an easy enough patch to make to the plugin - I do not think it is some kind of fundamental problem with the tool.

#9 - Maven does indeed not support multiple parent inheritance for now; probably it could be made to do so in the future without a huge upheaval. (There would have to be some policy for resolving conflicting settings.) In the meantime, the most common reason for wanting this, and one of the two use cases you mention, is factoring out library versioning information into one or more non-parent POMs - which is already possible using import scope.
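
For example, a module can pull shared version information out of a separate "versions" POM roughly like this (the coordinates here are made up for illustration):

    <dependencyManagement>
      <dependencies>
        <!-- Import the dependencyManagement section of a shared versions POM -->
        <dependency>
          <groupId>com.example</groupId>
          <artifactId>shared-versions</artifactId>
          <version>1.0</version>
          <type>pom</type>
          <scope>import</scope>
        </dependency>
      </dependencies>
    </dependencyManagement>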

Tim Boudreau replied on Mon, 2011/04/04 - 5:59pm in response to: Jesse Glick

Re #7: If the tests are in C, I will need another copy of them in B and those will get out of sync (B should be tested; those tests just should be run more rarely - in my current project I use a custom test runner and -DargLine to set a system property when building from an IDE). This *could* be done with a test superclass containing all of the tests in A and using Maven's test-jar, but that gets pretty ugly.

Re circular dependencies - IMO using test-jar is a rare enough case that it seems to me that circular test dependencies should be allowed, but the onus should be on the plugin that implements test-jar to detect if a circular test dependency exists and fail the build in that case - i.e. exclusively allow either circular test dependencies or creation of test artifacts.

Re #9: IMO, the bug is the use of the word "parent" in terms of a Maven parent project. I have never seen a Maven parent project which did not consist entirely of shared configuration. In short, there is nothing parent-like about a Maven parent project, at least in any project I have worked with (it is also odd, though probably necessary, that the parent lists its children and the child lists its parent). It seems like it is just a weird hack to shoehorn some shared configuration into something Maven will let you run targets on.

Jesse Glick replied on Mon, 2011/04/04 - 6:23pm

"If the tests are in C, I will need another copy of them in B" - you should not, because B is just the "live" impl that is not really testable. Again, either you make a real unit test which can live in A, or you decide to test C's impl and put the test in C.

test-jar artifacts are not all that rare, and this is not a plugin feature but a basic ability of Maven: to produce secondary artifacts. It would be strange indeed if dependencies on secondary, but not primary, artifacts could sometimes - but not always - be cyclic. And I can hardly imagine how a feature like --also-make would know what to build when if cyclic dependencies were permitted.

<modules> (in a parent POM) and <parent> (in a child POM) may be symmetric, but need not be - these are orthogonal features. It is quite possible, and not unusual, to have a parent listing only child modules, and a separate project for shared configuration.
