Leaving university, I thought I'd be a developer happily knocking out code but was always drawn to tech support. How do we help people use our tools better? Now, I mostly specialize in consulting and introducing new customers to our tools. I am a Tech Evangelist and Lead Consultant at Urbancode.

Maven's Strengths and Weaknesses as a Dependency Management System

03.01.2010
For many Java teams, switching to Maven will introduce them to formal dependency management. Maven actually does a pretty decent job and is a fantastic system out of the box for informal projects. However, for teams looking to implement rigorous component reuse policies, Maven falls short.

Last month, in our article Chaperoning Promiscuous Software Reuse, we reviewed four elements of a successful strategy for managing components. Briefly, these elements are:
  1. Testing and Validation of 3rd Party Components: Only libraries approved for reuse should be used by developers. If developers want to use additional libraries, they can submit them for screening.
  2. A Definitive Software Library: A team should have an official storage place for reusable, versioned components (both internal and 3rd party).
  3. Using Protection: A team should have an "immune system" that rejects libraries other than those from the DSL.
  4. Rapid Impact Analysis and Audit: When something goes wrong, it should be easy to determine what versions of which components were used by the project.

Basic Maven

Basic Maven points to a common Maven repository hosted centrally. Various open source projects publish their releases to this repository and a developer can configure her build file to depend on these releases. At build time, the artifacts are automatically pulled down into the workspace and the build can produce a report of what was used.

This is actually a pretty good start. If the build script depends on exact versions, the team has some traceability. The Maven repository is an official repository of artifacts (a DSL) used by the team although the team does not control it. Unfortunately, there's little protection against a developer using a rogue component, but it's easier to do the right thing than the wrong thing, and that goes a long way.
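
As a rough illustration of depending on an exact version, a dependency entry in a project's pom.xml might look like the fragment below. The coordinates are only an example chosen for illustration; any library would do. Note that Maven treats a plain version like this as a preference during conflict resolution, so it is worth confirming in the build's dependency report which version was actually resolved.

```xml
<!-- Fragment of a pom.xml: a dependency declared against one fixed release -->
<dependencies>
  <dependency>
    <groupId>commons-lang</groupId>
    <artifactId>commons-lang</artifactId>
    <!-- A fixed release version, not a SNAPSHOT and not a version range -->
    <version>2.4</version>
  </dependency>
</dependencies>
```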

The major failing here is that the team does not control the repository. Rather, it is an arbitrary resource in the cloud. This eliminates the team's ability to test artifacts prior to including them in the DSL, and there's no guarantee that an artifact used by a production release today will still be available from the repository tomorrow unaltered and intact.

Bring the Repository in House

What's needed is to bring the repository in house. While this can be done without special tooling, there are a handful of decent Maven repositories that provide the team more control. Nexus and Artifactory are two of the major choices here. They provide security over who can publish artifacts to the repository as well as more sophisticated storage and indexing than a simple file system.

With one of these repositories, the team has its own DSL that it controls and can ensure that only approved versions of libraries are made available to the team. Again, as long as the team does not use snapshot dependencies, a basic audit trail of what is in a build is available. Impact analysis reports indicating which projects have used a given component version are a little more difficult. Protecting the code base from a developer checking an arbitrary library into source control is still a manual process, but Maven does discourage the practice by making reuse of a library from the repository more straightforward.

With an in-house repository, a team is able to extend Maven nicely. Dependency management is controlled by developers (for better or worse) but dependencies must come from an approved collection of versioned components controlled by the team. These libraries can be tested prior to being made available. Protection and audit capabilities are less than ideal but not wholly absent. For many teams, this kind of system may be adequate. The most disciplined are likely to find it wanting.
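
One common way to route builds through an in-house repository is a mirror entry in each developer's (or the build server's) settings.xml. The fragment below is a minimal sketch: the id, name, and URL are hypothetical placeholders, and the exact URL layout depends on the repository manager in use.

```xml
<!-- Fragment of ~/.m2/settings.xml -->
<settings>
  <mirrors>
    <mirror>
      <!-- Hypothetical id, name, and URL for the team's repository manager -->
      <id>team-repo</id>
      <name>Team-controlled repository</name>
      <url>http://repo.example.com/maven/approved</url>
      <!-- "*" asks Maven to send requests for every repository to this mirror -->
      <mirrorOf>*</mirrorOf>
    </mirror>
  </mirrors>
</settings>
```

With a mirror of `*` in place, an artifact that is not in the team's repository cannot be resolved from a public repository, which gives a modest form of the "immune system" described earlier. It does not stop a developer from checking a jar into source control.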

Other Concerns with Maven Dependency Management

No ability to depend on the newest version with traceability

Maven draws a strict line between depending on a released version of a component and depending on the latest built version. Depending on the latest build version is a common practice in multi-component continuous integration. Maven's strict line allows for either approach to be used, but only provides traceability when dependencies are based on releases. In multi-component CI, teams often do not want to release particular components for reuse upstream, but rather consider every successful CI build (with passing unit tests) a mini-release.

Functional tests performed at a system level determine the quality of the entire dependency graph (the application). In these cases, it's only after testers validate a system build that a release will be pushed out (ideally the binary that was tested), and full traceability to every component is valuable. That Maven discourages multi-component CI through the arbitrary separation of Releases and Snapshots is unfortunate.
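
The release/snapshot split shows up directly in the version string of a dependency. The two entries below are alternatives, not meant to appear together in one pom.xml, and the component `com.example:billing-core` and its version numbers are hypothetical.

```xml
<!-- Option 1: track the latest build; the binary that gets resolved can change from one build to the next -->
<dependency>
  <groupId>com.example</groupId>
  <artifactId>billing-core</artifactId>
  <version>1.3-SNAPSHOT</version>
</dependency>

<!-- Option 2: depend on a release; repeatable, but someone has to cut the release first -->
<dependency>
  <groupId>com.example</groupId>
  <artifactId>billing-core</artifactId>
  <version>1.2</version>
</dependency>
```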

Inability to depend based on status

Sometimes, teams want to depend on neither the latest build of a component nor a specific version. Rather, they want to depend on the latest version that has passed some sort of quality gate, perhaps validation from a testing team or approval from a release manager. Maven has no real concept of statuses on components (version numbers and component names are pretty much it), so status-based dependencies are out of the question.

From http://www.anthillpro.com/blogs

Published at DZone with permission of Eric Minick, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Jilles Van Gurp replied on Mon, 2010/03/01 - 2:32pm

Some issues I have observed in the field with maven dependency management:

1) Maven stimulates copy-paste reuse. I clean some pom files up once in a while (do you?). Inevitably they are full of copy-pasted bits that are not needed or, worse, broken. Two sections are affected by this most: the dependencies and the plugins sections. People seem to apply a paste-until-it-stops-whining approach to pom files. The net result of this is an unmanaged set of dependencies fragmented over different pom files. Mistakes tend to ripple across modules.

2) There tend to be a shitload of transitive dependencies. Version selection for those tends to be less than ideal. Transitive is just another word for unmanaged in this context. IMHO class not found would be preferable, at least then I'm forced to take a conscious decision on using some ancient version of commons-logging or whatever (an actual issue I fixed two weeks ago). Without transitive dependencies, dependencies are just a list of crap to put in WEB-INF/lib. How is this different than putting the jar files there yourself like we all used to? Sure it is nice to have a tool that does that for you. But only if it gets out of your way after that. In a nutshell, this is the main difference between ivy and maven. In ivy, downloading/checking dependencies is something you do only when needed.

3) If you have a multi-module web application (like you probably have), the only thing that counts is what ends up in WEB-INF/lib. Essentially, all the rest is just a level of indirection. If you have e.g. four modules, maven is going to quadruple the time it spends on figuring out what dependencies it needs. This can add seconds to a build that should otherwise be able to complete in 2-5 seconds.

4) Dealing with excludes sucks. World + dog seems to depend on commons-logging. The only way to have log4j work reliably is to not have it on your classpath: dependencies making implementation decisions on their dependencies works against you here.

5) If you use eclipse, maven and osgi in a web application you are juggling four different classpaths (whatever maven declares, whatever your osgi bundles require, whatever is in your .classpath file and whatever ends up in your WEB-INF/lib dir). Tools exist to deal with this but you will find yourself dealing with problems here at some point. Maven dependency management is primarily about maven dependencies. Making the rest of your environment comply with what maven thinks is right generally requires a lot of fiddling (e.g. the eclipse plugin deciding not to include an explicit dependency on the classpath, aspectj in my case). WTF indeed.

6) When creating a new module, you will likely do so by copying the pom file of some pre-existing module. Nothing wrong with copy-paste reuse, it's what we all do. But as noted before, this reduces maven to a cumbersome way to dump unmanaged crap into your lib directory.

7) Maven dependencies live in ~/.m2/repository. Each time you build a web application, stuff gets copied from there to target/<yourapp>/WEB-INF/lib. I notice a lot of files tend to be copied when working with maven. This adds time to my builds. Compare this to jarring and including your lib dir without copying it first. Making redundant copies of stuff is stupid, especially when done automatically.

8) I don't recall dependency management ever being such a big deal in the put-lib/**/*.jar-on-your-classpath days. ClassNotFoundException? Download the relevant library and it's fixed. I don't see any significant improvement in the crap that actually does end up in WEB-INF/lib with maven (i.e. redundant, outdated dependencies all over the place). I do notice spending a shitload more time on managing what used to be download & dump into lib. Copying, downloading, checking for the need to download, doing this for each module (4x in our case). Cleaning up pom files once in a while, doing the odd mvn dependency:tree to figure out what the hell I'm actually depending on. Figuring out if dependency foo is really compile or just test. Etc. It adds up.

9) Every time I bother to check, I find that multiple dependencies have updates available with (probably) critical bug fixes, performance enhancements, features I definitely want, etc. The point here is that despite dependency management, I still need to manually check this.

So in short, I think maven dependency management is a joke. It's tedious and ultimately does exactly the same thing as I used to do manually, which is dump stuff into some folder based on whatever list of dependencies you happened to define (which you likely copy-pasted). Worse, it doesn't really help me do what I really need, which is notifying me about production-quality updates to my dependencies; figuring out which dependencies are no longer needed; and reducing the amount of time I spend on dependency management.

michael cheung replied on Tue, 2010/03/02 - 3:23pm

1) People who copy and paste like this are probably doing the same thing in their code. You can't blame the tool when people are using it wrong. Why would someone put a dependency or plugin into the pom with no idea what its impact is?

2) There are many tools to help you manage transitive dependencies. I use the m2eclipse plugin. When I open my pom, there are tabs called dependency hierarchy and dependency graph. I know "exactly" what transitive dependencies I am using in my module and why. It allows me to correct a transitive dependency that is an older version. Why is this better than dropping jars in WEB-INF/lib? People tend to check in jars without a version. After a while, you have no idea why certain jars or versions are checked in. You have no idea what depends on them or what kind of chain effect upgrading them will cause.

3) If you have a multi-module web application, most likely you can reuse those modules in many other web applications. That speeds up the development of new applications.

4 and 5) I think you are doing something wrong. I am not sure what you mean by not having log4j in your classpath. I am using slf4j with the log4j implementation. One problem I ran into is that I have to use the 99.0-does-not-exist version of jcl. I have used maven with osgi, and I have used Spring DM. I haven't run into the issues you describe.

6) Copy and paste is just wrong.

7) How much time are you adding? If this is a big concern, you should try a language that does not require compilation, like Ruby or PHP. It comes back to the problem with checking jars into your SCM: you have no idea what version you checked in or why you need it.

8) It sounds like you are using it wrong. Your problems seem to be just the speed of the build, copy and paste in the pom causing unknown problems, and not using best practices to manage your transitive dependencies.

9) Maven does not decide what version you should use. It is up to you, the engineer, to decide which version, features, and bug fixes you want to include in your application, and to do enough testing to make sure everything still works. If you don't care about that, there is a maven versions plugin that allows you to update everything to the latest versions: "mvn versions:use-latest-versions"
