I have been writing code since 1980. In 1999 I learned Java, and after discovering IntelliJ the love was complete. XML and XSLT were a revelation in 2000; too bad the standards committees have done so much damage since. I contributed to the Flying Saucer open source XML and CSS renderer and published an open source XSLT generator, weffo. Currently employed by Google.

GC is for Goodstuff Collector

11.25.2008

I have noticed over the past few months that there is a fairly common fear of creating objects in Java. When I ask the authors about it, it always seems to boil down to an attempt to write more performant code by avoiding garbage collection.

So why would one want to create more objects, anyway? Well, one good reason would be to get more readable code, e.g. through encapsulating an algorithm or using appropriate helper objects to express an algorithm more clearly.

Even when the code does get put into a helper object, there seems to be a tendency to keep that instance around and reuse it, to avoid garbage collection so that the code performs better. My first comment is always "You should measure that". If I wanted to put it more sharply I would say "If you can't prove it's a performance benefit, then it probably isn't". I have worked with enough optimization to know that what's true for one language will not be true for another. Even in the same language, something that gives a performance benefit on one machine may be a detriment on another kind of machine (or even the same kind of machine with different "tweaks" like page sizes and such).

If you create a temporary object that lives only as long as you need it, you gain the following benefits (a small sketch follows the list):

  1. Your object is always in a pristine state when you want to use it.
  2. Your code is a big step closer to being thread safe.
  3. Your code is more readable and easier to analyze.
  4. Your garbage collector gets to do less work.
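To illustrate the first three points, here is a small hedged sketch; ReportFormatter and its callers are invented for this example, not taken from any real code:

    // Hypothetical helper that accumulates state across calls.
    class ReportFormatter {
        private final StringBuilder buffer = new StringBuilder();

        String format(String data) {
            buffer.append(data);           // state from earlier calls lingers
            return buffer.toString();
        }
    }

    // Reused instance: stale state, and not thread safe.
    class ReusingCaller {
        private final ReportFormatter formatter = new ReportFormatter();

        String run(String data) {
            return formatter.format(data); // bug: buffer still holds old data
        }
    }

    // Fresh instance: pristine state, trivially thread safe, easy to
    // analyze, and garbage as soon as run() returns.
    class TemporaryCaller {
        String run(String data) {
            return new ReportFormatter().format(data);
        }
    }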


"What?", you say, "The garbage collector does less work by collecting more garbage?".

Indeed it does. When we hear "garbage collector" we tend to think of the work involved in clearing out an overfilled store room, all that work to haul the garbage out. But the garbage collector doesn't even know the garbage exists; the very definition of garbage is that it can no longer be reached from anywhere. What the garbage collector really does is create a whole new store room, move the stuff you want to keep over to it, and then let the old store room and all the garbage disappear in a puff of smoke. So all the work done by the garbage collector is really done to keep objects alive, i.e. the GC is really the "goodstuff" collector.

This is obviously a somewhat simplified view and I don't think it holds completely for objects with finalizers (which is probably why finalizers are really bad for performance). I've done a few measurements and microbenchmarks myself, first because I didn't trust it, then because others were so convinced of the opposite that I had to research it. Every single one confirms that creating and destroying objects is never worse, and often much better, than trying to keep objects around. That isn't an absolute proof that it will be true for all scenarios, but I think it's a pretty good indication of where you should be pointing when shooting from the hip.
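For what it's worth, here is a minimal sketch of the kind of microbenchmark I mean. It is a naive harness, not one of the actual measurements: a serious test needs longer warm-up, should be run on your own hardware, and would be compared against a variant that reuses a pooled instance.

    public class AllocationBench {
        static final class Point {
            final int x, y;
            Point(int x, int y) { this.x = x; this.y = y; }
        }

        public static void main(String[] args) {
            for (int i = 0; i < 5; i++) run(); // warm up the JIT before timing
            long start = System.nanoTime();
            long sum = run();
            long elapsed = System.nanoTime() - start;
            System.out.println(sum + " in " + (elapsed / 1000000) + " ms");
        }

        static long run() {
            long sum = 0;
            for (int i = 0; i < 10000000; i++) {
                Point p = new Point(i, i + 1); // fresh short-lived object
                sum += p.x + p.y;              // use it so it isn't optimized away
            }
            return sum;
        }
    }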

An IBM developerWorks article gives us the numbers: allocating an object in Java takes about 10 machine instructions (an order of magnitude better than the best C/C++ allocators), and throwing an object away costs, you guessed it, absolutely zero. So just remember that GC stands for "Goodstuff Collector" and you'll be on the right track.

From http://tobega.blogspot.com/

Published at DZone with permission of Torbjörn Gannholm, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Artur Biesiadowski replied on Tue, 2008/11/25 - 5:23am

I think the 'objects are cheap' lobby trusts technical documents and microbenchmarks too much...

A few points:

1) Object deallocation is free? Ok, let's write a program which allocates some small objects in a loop so that they become eligible for collection very soon afterwards. Object allocation is cheap and done while the loop is running. As none of the objects survives, verbose gc should show exactly zero time spent in gc. But there is a surprise - gc TAKES time. It seems that 0*many != 0. I cannot check it at the moment, but as far as I remember it was on the order of 2-3 ms for 100MB of pure garbage. With an average object size of, let's say, 64 bytes, that means around 2M objects take 2ms to collect, which gives 1ns per object. 1ns is around 3 cycles on current cpus. BUT, this is one of the moments which is not really so parallelizable - so with multiple cpus (16 cores for example), due to non-perfect parallelization, it might be 3 cycles times 2 or 3 - and we easily come into the range of 10 cycles. And don't tell me 10 cycles is a small cost - it is infinitely more than the 'zero cost' you are claiming.
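A minimal sketch of the experiment described above; run it with -verbose:gc (and optionally -XX:+PrintGCDetails) and watch the minor collection times, which will not be zero even though every object dies young:

    public class GarbageLoop {
        public static void main(String[] args) {
            long sum = 0;
            for (int i = 0; i < 100000000; i++) {
                byte[] garbage = new byte[64]; // ~64-byte object, dead at once
                sum += garbage.length;         // use it so the allocation isn't elided
            }
            System.out.println(sum);
        }
    }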

2) Now, even assuming object deallocation is free. Let's focus on the CMS gc. I had a case in one of our applications where all the 'permanent' objects fit in the survivor space. This means they were copied over and over on each gc, causing each new gc to take 30-40ms. All the other objects were pure garbage. Now, depending on the allocation speed of the garbage ones, I was getting a new gc every 5 seconds or every 15 seconds. While not causing problems directly, faster generation of garbage was causing more frequent gc pauses. Obviously, this particular issue is solvable by forcing objects into the old generation earlier, but it is something you cannot always foresee.
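For reference, the 'force objects into the old generation earlier' workaround can be expressed with standard HotSpot flags; the values below are illustrative, not recommendations, and MyApp stands in for your main class:

    java -XX:+UseConcMarkSweepGC \
         -XX:MaxTenuringThreshold=1 \
         -XX:+PrintTenuringDistribution \
         -verbose:gc \
         MyApp

MaxTenuringThreshold caps how many minor collections an object may survive before being promoted, and PrintTenuringDistribution shows where your objects actually end up.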

3) The biggest problem of all. In some applications, you have a large number of medium-lived objects (for example market quotes). They live for 5-30 seconds, which is a very bad lifetime for gc. If you swamp the gc with temporary objects, the medium-lived objects mature faster and get promoted to the old generation, instead of staying in the survivor space and expiring there. And when in the old space, they are deleted only on the next CMS run (not the current one in most cases), which can easily mean 20-30 minutes for bigger heaps. I have seen cases where programs were running fine, with 'stable' memory, but if you increased the amount of temporary garbage by 10-20%, suddenly all the medium-lived objects were getting promoted to the old generation and CMS was not able to collect them fast enough, thus causing an explosion in memory usage and finally a full gc (which meant a system failure). We were able to fix this by removing one or two iterators here or there - suddenly, using an index to access an ArrayList instead of foreach was the difference between a system managing to handle 50000 events per second forever and one which goes into full gc after 2-3 hours.
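The iterator change looks roughly like this; each foreach over an ArrayList allocates a short-lived Iterator, while the indexed loop allocates nothing extra:

    import java.util.ArrayList;

    class IterationStyles {
        // foreach: allocates an Iterator on every call (garbage on a hot path)
        static long sumWithIterator(ArrayList<Integer> list) {
            long sum = 0;
            for (int value : list) sum += value;
            return sum;
        }

        // indexed access: no Iterator allocation at all
        static long sumWithIndex(ArrayList<Integer> list) {
            long sum = 0;
            for (int i = 0, n = list.size(); i < n; i++) sum += list.get(i);
            return sum;
        }
    }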

The problem is that if you want to manage 50k events a second, you cannot afford long pauses in new gc, which means a reasonably sized young space (we took 100MB). So you get around 2kb to process each event, which contains 50-400 bytes of real data. If you allocate 1kb of garbage for each of them, you will get a new gc every 1-2 seconds... and 1kb of garbage is not much if you start using iterators, temporary arrays etc. everywhere.

I agree that for the 'let's-run-a-cluster-of-E10Ks-to-serve-a-few-web-pages' businesses, temporary objects are cheap. But if you try to run something which processes a lot of data and has to cope with it in reasonable time (not real time, of course), you start to think about gc in a different way.

William Louth replied on Tue, 2008/11/25 - 6:44am

Not forgetting that in a distributed environment with locking mechanisms - whether at the database, some shared clustered monitor (object), or a data grid transaction - the cost of a local GC cycle can very easily be multiplied by the resulting wait times across one or more processes.

William 

Jesper Nordenberg replied on Tue, 2008/11/25 - 7:00am

No, allocating and deallocating objects is not free, but it could be with new optimizations in HotSpot. Escape analysis allows object fields and arrays to be placed on the stack or in registers, eliminating allocation/deallocation completely. These optimizations will be added to OpenJDK, if they haven't been already.
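A hedged example of the kind of code escape analysis targets: the Point below never escapes distance(), so a JVM with escape analysis and scalar replacement may turn it into two local doubles and skip the heap allocation entirely (whether that actually happens depends on the JVM version and flags):

    class EscapeDemo {
        static final class Point {
            final double x, y;
            Point(double x, double y) { this.x = x; this.y = y; }
        }

        // The Point never escapes this method, so scalar replacement
        // can eliminate the allocation altogether.
        static double distance(double x, double y) {
            Point p = new Point(x, y);
            return Math.sqrt(p.x * p.x + p.y * p.y);
        }
    }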

Also, new GC algorithms will reduce collection pauses, so that shouldn't be a big issue except for hardcore realtime applications.

In conclusion I would say: don't worry about creating objects in your code; premature optimization is the root of all evil.

Artur Biesiadowski replied on Tue, 2008/11/25 - 8:45am in response to: Jesper Nordenberg

I thought the original article was about the current state of the art, not about 'what could be'. On-stack allocation of objects has been just around the corner for the last 5 years. Indeed, it is a lot closer than it used to be, but more time will pass before it gets into production environments in banks.

As for new GC algorithms - of course. Every new JDK has a lot of improvements. Unfortunately, in my business, we have seen requirements double every 9-12 months for 4 years in a row, so I kind of take JDK improvements for granted, together with Moore's law (in whatever form it is happening at the moment: GHz, multicore, GPU clusters, etc).

Yes, premature optimization is evil. I would never have come to replace iterators with tricky workarounds if they had not shown up in the profiler, after optimizing the rest of the code, as generating around half of the garbage. But they did, and the application ran a LOT better (or at all) after removing some temporary garbage allocations. This is why I disagree with blanket statements like the original article's.

Torbjörn Gannholm replied on Tue, 2008/11/25 - 5:42pm in response to: Artur Biesiadowski

Thanks for the input, Artur! You obviously proved your optimizations worked in your specific case through measuring, as recommended. I'm sure you agree that does not prove that you should have a general "let's not create objects" attitude.

One of my measurements was for sustained throughput of small objects that were needed for 10 seconds (the really bad GC case). Creating and destroying objects gave a sustained throughput that was 10 times larger than reusing pooled objects. 


William Louth replied on Tue, 2008/11/25 - 6:23pm in response to: Torbjörn Gannholm

Please post the code.

Artur Biesiadowski replied on Wed, 2008/11/26 - 2:47am in response to: Torbjörn Gannholm

I think we can agree here - generally, temporary objects are a lot better than pooling (the only exception I have found was for SWIG-generated wrappers, where the cost of memory management in the finalizer was too high). I was not advocating pools over temporary objects - the solution we had to find was to change the algorithms so that they don't require ANY temporary objects inside the performance-critical parts.


Torbjörn Gannholm replied on Wed, 2008/11/26 - 5:52am in response to: William Louth

Sorry, William, I can't post the code as it involved production code.

If you want to try it yourself, set up a producer thread that sticks objects wrapped in a DelayedElement with a delay of 10 seconds into a DelayQueue, and have a consumer thread eat elements off the queue. You can then choose to pool (or not) the DelayedElement instances and the object instances.
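A minimal sketch of that setup - not the original production code, and DelayedElement here is my guess at an implementation of java.util.concurrent.Delayed:

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    class DelayedElement implements Delayed {
        final Object payload;
        final long expiresAtNanos;

        DelayedElement(Object payload, long delayMillis) {
            this.payload = payload;
            this.expiresAtNanos = System.nanoTime()
                    + TimeUnit.MILLISECONDS.toNanos(delayMillis);
        }

        public long getDelay(TimeUnit unit) {
            return unit.convert(expiresAtNanos - System.nanoTime(),
                    TimeUnit.NANOSECONDS);
        }

        public int compareTo(Delayed other) {
            long diff = getDelay(TimeUnit.NANOSECONDS)
                    - other.getDelay(TimeUnit.NANOSECONDS);
            return diff < 0 ? -1 : diff > 0 ? 1 : 0;
        }
    }

    class DelayQueueDemo {
        public static void main(String[] args) {
            final DelayQueue<DelayedElement> queue =
                    new DelayQueue<DelayedElement>();

            Thread producer = new Thread(new Runnable() {
                public void run() {
                    try {
                        while (true) {
                            // fresh wrapper and payload every time; both are
                            // garbage once the consumer discards them
                            queue.put(new DelayedElement(new Object(), 10000));
                            Thread.sleep(1); // pace production to bound the queue
                        }
                    } catch (InterruptedException ignored) {
                    }
                }
            });

            Thread consumer = new Thread(new Runnable() {
                public void run() {
                    try {
                        while (true) {
                            queue.take(); // blocks until an element expires
                        }
                    } catch (InterruptedException ignored) {
                    }
                }
            });

            producer.start();
            consumer.start();
        }
    }

To test pooling, you would replace the new DelayedElement(...) call with an acquire from a pool and return elements to it after take().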

I will say that the problem with the pool was not so much synchronization, as we originally thought, but just the overhead of putting stuff into and taking it out of the pool. So if you specially design the pooled objects to also contain the pooling mechanism (i.e. an AtomicReference to the next object in the stack when pooled) you can beat gc by a small margin, about 10%, provided you don't need to do any work to reset the state of the objects.
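A sketch of what such an intrusive pool might look like - my reading of the description, using a Treiber-style stack where the pool head is the only AtomicReference and each pooled object carries its own next link:

    import java.util.concurrent.atomic.AtomicReference;

    // Pooled object that carries its own stack link, so the pool
    // allocates no separate nodes.
    class Poolable {
        Poolable next; // used only while the object sits in the pool
    }

    class Pool {
        private final AtomicReference<Poolable> head =
                new AtomicReference<Poolable>();

        Poolable acquire() {
            while (true) {
                Poolable top = head.get();
                if (top == null) {
                    return new Poolable(); // pool empty: fall back to new
                }
                if (head.compareAndSet(top, top.next)) {
                    top.next = null;
                    return top;
                }
            }
        }

        void release(Poolable obj) {
            while (true) {
                Poolable top = head.get();
                obj.next = top;
                if (head.compareAndSet(top, obj)) {
                    return;
                }
            }
        }
    }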

Note: Java 1.6

Torbjörn Gannholm replied on Wed, 2008/11/26 - 7:01am in response to: William Louth

I should stress again that you need to measure in your own situation in your own code, because that particular case may be different for some reason. The whole system is more complex than just its parts.
