Jakub is a Java EE developer since 2005 and occasionally a project manager, working currently with Iterate AS. He's highly interested in developer productivity (and tools like Maven and AOP/AspectJ), web frameworks, Java portals, testing and performance and works a lot with IBM technologies. A native to Czech Republic, he lives now in Oslo, Norway. Jakub is a DZone MVB and is not an employee of DZone and has posted 156 posts at DZone. You can read more from them at their website. View Full User Profile

What I’ve Learned from (Nearly) Failing to Refactor Hudson

04.29.2011
| 5721 views |
  • submit to reddit

We’ve tried to refactor Hudson.java but without success; only later have I been able to refactor it successfully, thanks to the experience from the first attempt and more time. In any case it was a great learning opportunity.

Lessons Learned

The two most important things we’ve learned are:

  • Never underestimate legacy code. It’s for more complex and intertwined than you expect and it has more nasty surprises up in its sleeves than you can imagine.
  • Never underestimate legacy code.

And another important one: when you’re tired and depressed, have some fun reading the “best comments ever” at StackOverflow :-) . Seeing somebody else’ suffering makes one’s own seem to be smaller.

I’ve also started to think that the refactoring process must be more rigorous to protect you from wandering too far your original goal and from getting lost in the eternal cycle of fixing something <-> discovering new problems. People tend to do depth-first refactoring changes that can easily lead them astray, far from where they actually need to go; it is important to stop periodically and look at where we are, where we are trying to get and whether we aren’t getting lost and shouldn’t just prune the current “branch” of refactorings and return to some earlier point and try perhaps a completely different solution. I guess that one of the key benefits of the Mikado method is that it provides you with this global overview – which gets easily lost when it is only in your head – and with points to roll-back to.

Evils of Legacy Code

Use a dependency injection framework, for God’s sake! Singletons and their manual retrieval really complicate testing and affect the flexibility of the code.

Don’t use public fields. They make it really hard to replace a class with an interface.

Reflection and multithreading make it pretty difficult if not impossible to find out the dependencies of a particular piece of code and thus the impacts of its change. I’d hard time finding out all the places where Hudson.getInstance is invoked while its constructor is still running.

Our Way to Failure and Success

There is a lot of refactoring that could be done with Hudson.java, for it is a typical God Class which additionally spreads its tentacles through the whole code base via its evil singleton instance being used by just about anyone for many different purposes. Gojko describes some of the problems worth removing.

The Failure

We’ve tried to start small and “normalize” the singleton initialization, which isn’t done in a factory method, but in the constructor itself. I haven’t chosen the goal very well as it doesn’t bring much value. The idea was to make it possible to have potentially also other implementations of Hudson – e.g. a MockHudson – but with respect to the state of the code it wasn’t really feasible and even if it was, a simple Hudson.setInstance would perhaps suffice. Anyway we’ve tried to create a factory method and move the initialization of the singleton instance there but at the end we got lost in concurrency issues: there were either multiple instances of Hudson or the application deadlocked itself. We tried to move pieces of code around, but the dependencies wouldn’t have let us do that.

The Success

While reflecting on our failure I’ve come to the realization that the problem was that Hudson.getInstance() is called (many times) already during the execution of the Hudson’s constructor by the objects used there and threads started from there. It is of course a hideous practice to access a half-baked instance before it is fully initialized. The solution is then simple: to be able to initialize the singleton field outside of the constructor, we must remove all calls to getInstance from its context.

Mikado Graph: Hudson Refactoring (click for full size)

The steps can be seen very well from the corresponding GitHub commits. Summary:

  1. I used the “introduce factory” refactoring on the constructor
  2. I modified ProxyConfiguration not to use getInstance but to expect that the root directory will be set before its first use
  3. I moved the code that didn’t need to be run from the constructor out, to the new factory method – this resulted in some, hopefully insignificant, reordering of the code
  4. Finally, I also moved the instance initialization to the factory method

I can’t be 100% sure that the resulting code has the same semantic as far as it matters, for I had to do few changes outside of the safe automated refactorings and there are no useful tests except for trying to run the application (and, as is common with legacy applications, it wasn’t feasible to create them beforehand).

The refactored code doesn’t provide much added value yet but it is a good start for further refactorings (which I won’t have the time to try :-( ), it got rid of the offending use of an instance while it is being created and the constructor code is simpler and better. The exercise took me about four pomodoros, i.e. little less than two hours.

If I had the time, I’d continue with extracting an interface from Hudson, moving its unrelated responsibilities to classes of their own (perhaps keeping the methods in Hudson for backwards compatibility and delegating to those objects) and I might even  use some AOP magic to get a cleaner code while preserving binary compatibility  (as Hudson/Jenkins actually already does).

Try it for Yourself!

Setup

Get the code

Get the code as .zip or via git:

view sourceprint?
1git@github.com:iterate/coding-dojo.git # 50MB => takes a while
2cd coding-dojo
3git checkout -b mybranch INITIAL
Compile the Code

as described in the dojo’s README.

Run Jenkins/Hudson view sourceprint?
1cd coding-dojo/2011-04-26-refactoring_hudson/
2cd maven-plugin; mvn install; cd ..       # a necessary dependency
3cd hudson/war; mvn hudson-dev:run

and browse to http://localhost:8080/ (Jetty should pick changes to class files automatically).

Further Refactorings

If you’re the adventurous type, you can try to improve the code more by splitting out the individual responsibilities of the god class. I’d proceed like this:

  1. Extract an interface from Hudson and use it wherever possible
  2. Move related methods and fields into (nested) classes of their own, the original Hudson’s methods just delegate to them (the move method refactoring should be useful); for example:
    • Management of extensions and descriptors
    • Authentication & authorization
    • Cluster management
    • Application-level functionality (control methods such as restart, updates of configurations, management of socket listeners)
    • UI controller (factoring this out would require re-configuration of Stapler)
  3. Convert the nested classes into top-level ones
  4. Provide a way to get instances of the classes without Hudson, e.g. as singletons
  5. Use the individual classes instead of Hudson wherever possible so that other classes depend only on the functionality they actually need instead of on the whole of Hudson

Learning about Jenkins/Hudson

If you want to understand mode about what Hudson does and how it works, you may check:

Sidenote: Hudson vs. Jenkins

Once upon time there was a continuous integration server called Hudson but after its patron Sun died, it ended up in the hands of a man called Oracle. He wasn’t very good at communication and nobody really knew what he is up to so when he started to behave little weird – or at least so the friends of Hudson perceived it – those worried about Hudson’s future (including most people originally working in the project) made its clone and named it Jenkins, which is another popular name for butlers. So now we have Hudson backed by Oracle and the maven guys from Sonatype and Jenkins, supported by a vivid community. This exercise is based on the source code of the Jenkins, but to keep the confusion level low I refer to it often as Hudson for that is how the package and main class are called.

Conclusion

Refactoring legacy code always turns out to be more complicated and time-consuming than you expect. It’s important to follow some method – e.g. the Mikado method – that helps you to keep a global overview of where you want to go and where you are and to regularly consider what and why you’re doing so that you don’t get lost in a series of fix a problem – new problems discovered steps. It’s important to realize when to give up and try a different approach. It’s also very hard or impossible to write tests for the changes so you must be very careful (using safe, automated refactorings as much as possible and proceeding in small steps) but fear shouldn’t stop you from trying to save the code from decay.

 

From http://theholyjava.wordpress.com/2011/04/28/what-ive-learned-from-nearly-failing-to-refactor-hudson/

Published at DZone with permission of Jakub Holý, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Jonathan Fisher replied on Sat, 2011/04/30 - 12:02pm

I'm retracting my previous, unneeded sarcastic comment... I made it out of frustration of the whole jenkins fork. I earnestly think the mature thing for the developer community to do is extend an olive branch and prove to be the bigger person. Forking jenkins made a point to oracle, and it's time to stop wasting valuable engineering minds on a stupid petty argument.

Martijn Verburg replied on Mon, 2011/05/02 - 5:24am

I can also recommend reading Michael Feather's working with legacy code book. Provides great practical advice on what steps you can take to tackle the beast.

Christoph Kutzinski replied on Sun, 2011/05/22 - 3:01pm

Sound like some useful stuff - have thought about donating your changes back to Jenkins?

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.