Mitch Pronschinske is the Lead Research Analyst at DZone. Researching and compiling content for DZone's research guides is his primary job. He likes to make his own ringtones, watches cartoons/anime, enjoys card and board games, and plays the accordion. Mitch is a DZone Zone Leader and has posted 2578 posts at DZone. You can read more at his website.

Terracotta's Perspective on the Java GC Problem

08.26.2010
Caching is at the core of data management in applications, says Amit Pandey, CEO of Terracotta.  Too much time is spent tuning heap sizes to keep application performance fast and predictable.  Garbage collection is to blame for making an application unpredictable against SLAs (Service Level Agreements).  When latency reaches unacceptable levels, a lot of remediation work is needed, and Terracotta sometimes gets pulled into it.  They've also seen customers who, after months of development, integration testing, and functional testing, have had to spend additional months tuning and performance testing.  "In our experience, the majority of that work is driven by tuning around garbage collection.  When the heap size starts to exceed even 2GB, people start to see issues."  Along with Amit Pandey, Mike Allen, the head of product management, and Steven Harris, the Terracotta VP of Engineering, talked to DZone about the growing issues with Java garbage collection.

Industry changes like ever-larger applications are compounding the garbage collection problem, and because bigger and better applications will always demand more memory as time marches on, the need for a solution in this space is critical.  Web-based apps are getting very large and memory is getting cheaper, but garbage collection still bogs us down.  Some organizations won't buy servers with less than 32GB of memory, so applications tuned to run on 2GB heaps struggle to take advantage of that hardware.  Another problem is that cloud and virtualized infrastructure is creating more pressure to keep things in memory.  "In these infrastructures, going to disk is like committing suicide for the app.  Going to local memory is like going to your kitchen to get a cup of coffee, but going to disk is like going to the Moon to get a cup of coffee."

"In a language like Java, the developer has no control over memory, so the VM has to manage it.  Because the relationships between data in Java can be arbitrary, it becomes infinitely complex to decide when data is no longer needed.  Garbage collection is still a bit of a black art."  --Steven Harris, Terracotta VP of Engineering

"A lot of Java customers are stuck on 32-bit operating systems because they can't use more than 4GB of memory in the heap.  A customer told me that GC is like a housekeeper that comes in and dusts off surfaces, but every now and again somebody arrives who knocks down your door, says 'everybody out', and locks you out of your own house while they clean.  That happens in Java.  So the customer can't live in a big house (with a large heap), they have to live in a small apartment (2GB) where someone won't come and break down their front door.  They have to own many small apartments (JVMs) rather than having all of their stuff in one place." --Mike Allen, Terracotta head of product management


Azul Systems, another hotbed of Java innovation, has long offered its Vega compute appliances, a combined hardware and software answer to the pauses of Java garbage collection.  More recently, it has initiated open source contributions to the Linux kernel (with a reference implementation based on OpenJDK) to mitigate garbage collection problems in Java and boost performance, scalability, and reliability.  It also unveiled the Zing Platform, which takes the algorithm Azul used to circumvent GC issues, along with other software techniques for getting around locks, and runs them on generic hardware.

For years, people have been able to work with these limitations in Java garbage collection, but the time is coming when it just won't be a viable option anymore.  Terracotta believes the time to move past GC issues is now.  Mike Allen says that once a more elegant solution for the problem emerges, we will enter "a new era of easy Java."

Comments

Clemens Eisserer replied on Fri, 2010/08/27 - 3:57am

Why do I get the impression that all this is just marketing blabla?

Nick Maiorano replied on Fri, 2010/08/27 - 9:28am

A customer told me that GC is like a housekeeper that comes in and dusts off surfaces, but every now and again somebody arrives who knocks down your door, says 'everybody out', and locks you out of your own house while they clean. That happens in Java.

This is simply not true. Today's GCs have strategies that allow most of the collection process to run concurrently with the overlaying application. In this case, only a small percentage of the collection process needs to pause the entire application. I see hypervisors (e.g. VMWare) as a bigger threat to applications that are sensitive to processing pauses.

Steven Harris replied on Fri, 2010/08/27 - 11:02am in response to: Nick Maiorano

I respectfully disagree with the comment above about GC pauses not being a problem. Having run tests with Java heaps from 1GB all the way up past 64GB, I've seen significant GC pauses across most of that range. Having talked to hundreds of Java users, I can say that very few run JVMs over 2GB in production. The ones who do spend significant time tuning both GC and their applications to tolerate GC's impact.

Andrew McVeigh replied on Sat, 2010/08/28 - 4:23am in response to: Steven Harris

We find the same -- as 64-bit JVMs start swimming in 10+GB of memory, GC pauses become a big issue, requiring all sorts of memory allocation workarounds.

Kirk Pepperdine replied on Mon, 2010/08/30 - 4:25am

Interesting take on GC. I don't believe that tuning GC is a black art. There are clear processes that help mitigate the cost of memory management, and clear triggers for when those processes should be applied. For example, performing an object creation audit often shows that 40% of object creation can be attributed to fewer than 10 execution paths, and 20% to just 2 or 3. I've found that simple fixes can wipe out that 20% in a few minutes. The problem is, most teams don't know how to find them, or don't even realize this is a problem.

Hardware is important, and you still need to know what you're deploying to and how to adapt your code to it. The JVM has gone a long way toward helping code adapt, but it can't work magic. Every poorly performing application that I've run into abuses the hardware in some hideous way, because developers felt they shouldn't need to worry about where they're deploying to. We are still not at the point where there is enough hardware to support this laissez-faire attitude towards it. Azul's answer to the GC problem has been to simply use more CPUs. The real solution is to use what is available more wisely. Developers need to take that cue.
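[Editor's note] A minimal sketch of the kind of hot-path fix an object creation audit tends to surface. The class and method names here are invented for illustration; the pattern -- eliminating per-iteration temporary objects that feed the young-generation collector -- is the point.

```java
import java.util.Arrays;
import java.util.List;

public class AllocationAudit {
    // Naive hot path: each += allocates a fresh String (plus a hidden
    // temporary StringBuilder), so n iterations create O(n) short-lived
    // objects for the garbage collector to clean up.
    static String joinNaive(List<String> parts) {
        String out = "";
        for (String p : parts) {
            out += p + ",";
        }
        return out;
    }

    // Audited fix: one StringBuilder reused across the loop; allocation
    // no longer scales with the number of iterations.
    static String joinTuned(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("a", "b", "c");
        // Same result, radically different allocation profile.
        System.out.println(joinNaive(parts).equals(joinTuned(parts))); // prints "true"
    }
}
```

Profilers make such paths easy to spot by allocation count; the fix is usually this mechanical.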

Nils Kilden-pedersen replied on Tue, 2010/09/14 - 1:10pm in response to: Nick Maiorano

This is simply not true

You then go on to state exactly the same thing.

Nick Maiorano replied on Thu, 2010/09/16 - 8:55pm in response to: Steven Harris

My comment was specifically related to the incorrect analogy used in describing how GC works. To a developer with limited knowledge of garbage collection, the marketing material leaves the impression that a Java application is paused during the entire garbage collection process (i.e. while the house is cleaned). If that were true, Java would be unsuitable for any high-throughput/low-latency application because pause times would be too high. In reality, JVMs can do most of their "cleaning the house" without pausing or disturbing the application.

Having said this, the portion of garbage collection that does require pausing can add up significantly, in terms of latency, for some high-volume applications. This is where fine-tuning of the garbage collector comes in, and it's not a black art. Speaking from personal experience, there are specific things you can do (nursery sizes, compression tuning) that will go a long way past the 2GB frontier.

The Terracotta and Azul products have their place in the industry. At the same time, Javalobby readers deserve accuracy, and I hope I have provided that.
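[Editor's note] A hedged sketch of the kind of HotSpot command line the tuning above refers to, circa JDK 6. The values and `app.jar` are placeholders, not a recipe; the right numbers are workload-dependent and found by measurement.

```shell
# -Xms/-Xmx pinned equal: avoids full GCs triggered by heap resizing.
# -Xmn sets the "nursery" (young generation) size Nick mentions.
# -XX:+UseConcMarkSweepGC enables the mostly-concurrent old-gen collector.
# -XX:+UseCompressedOops uses 32-bit references on 64-bit heaps (helps under ~32GB).
java -Xms6g -Xmx6g -Xmn1g -XX:+UseConcMarkSweepGC -XX:+UseCompressedOops -jar app.jar
```

GC logging flags such as `-verbose:gc` and `-XX:+PrintGCDetails` are the usual way to check whether a change actually shortened the pauses.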

County Line Nissan replied on Mon, 2011/08/01 - 10:56am

It has been tested with caches over 350GB in size, with almost no degradation in performance as the off-heap, in-memory portion of the cache grows. -County Line Nissan

Jessie Mear replied on Wed, 2011/09/07 - 7:54am

By simply adding a few extra lines of XML to the configuration files, the snap-in Ehcache module provides off-heap memory storage, completely bypassing the standard JVM memory management mechanism.
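[Editor's note] Roughly what those "few extra lines of XML" look like. The cache name is invented, and the attribute names follow the Ehcache 2.x BigMemory schema of this era -- verify them against your version's configuration reference.

```xml
<ehcache>
  <!-- "bigCache" is a hypothetical cache; overflowToOffHeap moves entries
       beyond maxElementsInMemory into off-heap storage the GC never scans. -->
  <cache name="bigCache"
         maxElementsInMemory="10000"
         eternal="false"
         overflowToOffHeap="true"
         maxMemoryOffHeap="4G"/>
</ehcache>
```

The off-heap store lives in direct memory, so the JVM also needs to be started with `-XX:MaxDirectMemorySize` at least as large as the configured off-heap total.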

Carla Brian replied on Sat, 2012/07/21 - 11:08pm

This has made life easier, perhaps immeasurably, for developers. However, automatic memory management does not come without costs. - Instant Tax Solutions Reviews
