I'm an author and a developer focused on build tools. I'm currently focusing on Gradle, but I have an interest in all build tools and most development infrastructure. I focus on Enterprise Java, Ruby, and the interface between Systems Administration and Software Development. The focus of my work is to make it easier for individuals to adopt open source software. Tim is a DZone MVB and is not an employee of DZone and has posted 41 posts at DZone.

How not to download the Internet

04.19.2011

A criticism I often hear about Maven is, “every time you run Maven, it downloads the internet.” I understand the criticism: the first time you run Maven, it has to populate your local repository with the plugins and artifacts that your project depends on. Maven does in fact download artifacts from remote repositories, but it downloads each artifact once and keeps it in a local cache.

Maven downloads most of these dependencies only because you’ve added them to your project. If you are unhappy that Maven is “downloading the internet,” then stop developing software that depends on external libraries. Easy, right? Stop using Spring and Hibernate, stop referencing the commons libraries, and do everything yourself. This would be one way to avoid downloading any artifacts from a remote repository. While you are at it, stop using Maven to build your software and write your own build tool that has all of the capabilities of Maven and every imaginable Maven plugin baked into it.

Not a workable solution, right? The fact of the matter is that your software has dependencies on external libraries. If you find yourself constantly “downloading the internet,” there’s a reason. Either you are depending on projects that depend on “the internet,” or your projects have a very wide set of dependencies that may need to be trimmed.

How can we avoid creating projects and POMs that “download the internet”? The simple answer is that everyone needs to start focusing on dependencies. Library developers need to be smarter about creating leaner, meaner dependency lists, and you need to start evaluating your own dependencies with an eye on efficiency.

Library Developers Need to Modularize

Take a project like Spring as an example. Spring’s libraries provide interoperability across a number of core enterprise APIs: JMS, JDBC, JTA. Spring also allows people to plug in different implementations for various features: Hibernate, ehcache, MyBatis, log4j, slf4j, etc. To pick on Spring in particular, its artifacts tend to rely on the world. If you use some of the core Spring libraries, you’ll soon realize that one simple dependency XML snippet actually translates to 30 or 40 dependencies.

If you are creating a library (Spring, Guice, or Hibernate), you need to start thinking about the dependencies that you are selecting. Instead of just blindly adding a dependency on ten artifacts, split up your projects so you don’t create that one gigantic library which depends on the world. I’ll bring the conversation back to Spring. Spring is moving in the right direction: the most recent version of spring-core version 3 has five dependencies, while spring-core 2.5.6 has 13 dependencies. If you’ve watched the progression of the Spring libraries over time, you’ll notice a trend toward modularization. Take Spring AWS as an example: there isn’t just one spring-aws library; there are separate spring-aws-ant, spring-aws-ivy, and spring-aws-maven libraries. As more people within SpringSource use Maven as a build tool, more people are starting to realize the value of having more projects with lighter POMs.

This matters because low-level, almost universal libraries like Spring, Hibernate, log4j, Guice, and the Commons libraries end up putting dependencies into everyone’s classpath. If developers of popular libraries get the message and move toward more modular project scopes, then you shouldn’t see so much bloat in your own project’s classpaths.

Don’t just let any dependency into your project

What can you do? You need to have some standards. Don’t just let anybody put some new dependency into your POM. Have some process to evaluate and assess exactly what a new dependency is going to do to your classpath. Is that new-fangled database library going to drop a dependency bomb and pull in 20 other libraries, some of which have incompatible licenses?
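As a rough illustration of what such a review can lead to, here is a minimal POM fragment that adds a new library but excludes one transitive dependency the team has decided it does not want. The coordinates (`com.example:shiny-db-client` and `com.example:unwanted-logging`) are hypothetical placeholders, not real artifacts; only the `<exclusions>` element itself is standard POM syntax. Treat it as a sketch, not a recommendation for any particular library.

```xml
<dependencies>
  <!-- Hypothetical new dependency under review -->
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>shiny-db-client</artifactId>
    <version>1.0.0</version>
    <exclusions>
      <!-- Hypothetical transitive dependency we chose not to inherit -->
      <exclusion>
        <groupId>com.example</groupId>
        <artifactId>unwanted-logging</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
</dependencies>
```

Running `mvn dependency:tree` before and after a change like this is one way to check what a dependency really brings onto your classpath.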

One tool you can use to make this process easier is Nexus Professional. Nexus Professional has a new Maven Dependency report. It is very easy to use: find an artifact in Nexus Professional, and then select the Maven Dependencies tab. This report will allow you to see just how many dependencies a particular artifact is going to introduce into your project. It will also list artifacts that may be missing from the public repositories, giving you a chance to assess the quality of an artifact’s dependencies.

 

From http://www.sonatype.com/people/2011/04/how-not-to-download-the-internet/

Published at DZone with permission of Tim O'brien, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Wal Rus replied on Wed, 2011/04/20 - 6:22pm

How not to download the internet? Simple! Don't use Maven! I think Maven received well-deserved bile from the bileblogger years ago; no need to repeat it.

Ronald Miura replied on Wed, 2011/04/20 - 7:54pm

Don't use Maven, and download the Internet yourself, instead of letting the tool do it for you.

Seriously, before Maven, I had to download distribution zips from five different projects, select what jars from each zip are really used in my application, and make sure their versions were compatible myself. Then I switched to Maven and never looked back.

Nowadays, the only project that insists on bundling all of its dependencies into one giant zip is Seam, and it is a 110 MB download. And you have to select which jars you'll put into your lib folder, because depending on your environment, you can't just copy everything, due to conflicts with server-global libraries. I mean, if you develop Java applications, you'll have to download the Internet, and you'll have to deal with library versions anyway. Just let the tool do it for you instead.

That said, I do think Maven should allow me to check all dependency jars into Subversion alongside my project sources.

Geoffrey De Smet replied on Thu, 2011/04/21 - 2:14am

I agree that Maven only downloads artifacts that you otherwise had to download manually anyway. Especially now that Maven 3 downloads in parallel, it's less of a pain.

Although, there's one recent bug that makes it download snapshots when I don't want it to, but I am sure that will be fixed soon (especially now it's in the spotlight here :).

The biggest problem is indeed the quality of the POMs on the central repository. Some projects don't yet consider it their job to deliver high-quality dependency information about their project, even though it is. Other projects don't know how to check whether it's high-quality. There are a couple of automated dependency quality checks out there, but most of them are not really in-your-face enough (except for Nexus validation, if activated).

Andrew Rubinger replied on Fri, 2011/04/22 - 9:56am

While the responsibility lies with the developer to not have Maven "Download the Internet", I feel comfortable in faulting Maven for its defaults.

Default dependency scope is "compile", which is available on all classpaths and is transitive. Assuming you need a dependency on Project X, if Project X was not kind enough to limit what transitives it exports, then you bring in a lot more than you necessarily bargained for.

Developers should be taking care to use "provided" and "test" for dependencies in most cases, allowing "compile" only when necessary.

So sure, I blame Maven for this particular design decision. Which affects things like Maven Plugins as well: You simply wanna run compilation, but first have to bring in/resolve a bunch of plugin versions and their requisite wagons, etc.

Related: There's no scope for "compile-only", meaning the dependency should be on the compilation classpath, but *not* the test/runtime classpath.

S,
ALR

Lund Wolfe replied on Sat, 2011/04/23 - 4:07pm

It is very tempting to blame Maven for the long list of jars in the final product. It is just too easy to declare the obvious dependencies for the project and let Maven gather all the rest. If the dependencies are bogus dependencies due to badly designed poms, internally or externally obtained, the Maven project gets real ugly real fast. For this reason, I would rather inherit an Ant project done badly than a Maven project done badly. The Ant project could have many fewer jars if they were added only on an as needed basis. If done badly, though, it could be insufficient (fails at runtime) or bloated in terms of unused jars.

It all comes down to knowing the library choices well enough to pick the right tools and minimize the library bloat. It's not Ant's or Maven's fault. If you are just picking tools based on popularity, all-purpose ability (I just need to learn one tool really well), or buzzwords, then you will end up with something ugly. We don't like bloated code that reinvents the wheel without using any external libraries. In the same way, we should keep external library use clean and simple. Both can be very hard to simplify/refactor later.

Shoaib Almas replied on Sat, 2012/08/25 - 5:50am

Tim,

Great post. I've often heard "It downloads the Internet" as a typical (and primary) argument against using Maven. Personally, I agree with you: it's unfounded. The problem isn't with Maven per se; it's with developers being lazy when they work out what dependencies they need.

As a Java developer myself I fully understand the frustration many developers feel when trying to work out dependencies. Typically they'll want to just get as quickly as they can past "ClassNotFoundException" to get to something that's of real value to their business.
