Mitch Pronschinske is the Lead Research Analyst at DZone. Researching and compiling content for DZone's research guides is his primary job. He likes to make his own ringtones, watches cartoons/anime, enjoys card and board games, and plays the accordion. Mitch is a DZone Zone Leader and has posted 2576 posts at DZone.

Terracotta Says 'Goodbye Java GC, Hello BigMemory'

09.14.2010
Last month, Terracotta told DZone that it was working on a module for Ehcache and Hibernate users that would bypass the Java garbage collection bottleneck. The practical limit that the Java garbage collector places on heap size (go beyond it and you run into significant performance issues) is a problem that has plagued the industry for years. Terracotta believes that today's memory requirements are finally forcing many teams to address this issue, and the company now has its own answer: BigMemory.

DZone interviewed Terracotta CEO Amit Pandey about his company's solution to the GC problem in Java. BigMemory is a snap-in module that provides off-heap memory storage. It is a pure Java solution that works with all JVMs and bypasses the Java garbage collector. It works with both standalone and distributed caches, so you don't need to be in distributed caching mode to use it. Like most Ehcache features, it requires only a few lines of configuration. BigMemory can also handle hundreds of millions of objects.

Here are the two steps needed to implement BigMemory (when it is released) in Ehcache:

1. Update the Java classpath and startup arguments to include the new ehcache-2.3 jar and allocate sufficient memory for the BigMemory off-heap store.

2. Update the cache configuration in ehcache.xml to set the size of the BigMemory off-heap store:

    <ehcache>
      <cache name="mycache" maxElementsInMemory="10000"
             overflowToOffHeap="true" maxMemoryOffHeap="4G"/>
    </ehcache>
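
As a rough illustration of step 1, the sketch below converts a size such as the "4G" used above into bytes and prints a matching JVM startup argument. It assumes the off-heap store is backed by direct byte buffers (as described in the comments below), whose total size HotSpot caps with -XX:MaxDirectMemorySize. The OffHeapSizing class, its method names, and the suffix-parsing rules are hypothetical and not part of Ehcache; check the BigMemory documentation for the exact startup arguments it expects.

```java
import java.util.Locale;

/** Hypothetical helper, not part of Ehcache: turns a size such as "4G" into bytes. */
final class OffHeapSizing {

    /** Parses sizes like "512M" or "4G". The accepted suffixes are an assumption. */
    static long parseSize(String size) {
        String s = size.trim().toUpperCase(Locale.ROOT);
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty size");
        }
        char unit = s.charAt(s.length() - 1);
        long multiplier;
        switch (unit) {
            case 'K': multiplier = 1L << 10; break;
            case 'M': multiplier = 1L << 20; break;
            case 'G': multiplier = 1L << 30; break;
            default:  multiplier = 1L; break;   // no suffix: plain bytes
        }
        String digits = multiplier == 1L ? s : s.substring(0, s.length() - 1);
        return Long.parseLong(digits) * multiplier;
    }

    /** Builds the HotSpot flag that caps the memory available to direct buffers. */
    static String directMemoryFlag(long bytes) {
        return "-XX:MaxDirectMemorySize=" + bytes;
    }

    public static void main(String[] args) {
        long bytes = parseSize("4G");
        System.out.println(bytes);                    // prints 4294967296
        System.out.println(directMemoryFlag(bytes));  // prints -XX:MaxDirectMemorySize=4294967296
    }
}
```

In practice the cap would likely need some headroom above the configured store size, since other parts of the JVM and application may also allocate direct buffers.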


"People have been trying to make better garbage collectors for years and years. Sun/Oracle have been putting a lot of money into solving this problem, but they're only getting incremental improvements." Terracotta, Pandey says, is able to solve the problem more elegantly because they "own the cache." So they decided to write their own memory manager and bypass Java GC completely, freeing developers from difficult and time-consuming GC tuning.

BigMemory is currently in the late beta stages, and Pandey told DZone that the responses from early beta users have been very positive. Testing of the memory manager showed that response times stayed fast and flat as memory increased, meaning that it could be counted on to meet the performance requirements in a Service Level Agreement. The greater memory utilization with BigMemory also allows server consolidation.

[Figure: App Latency Over Time]


Terracotta has tested cache sizes up to 350GB without running into a wall like the 2GB threshold in Java GC. Pandey says Terracotta is avoiding many of the problems (fragmentation, poor memory management techniques, etc.) that others have encountered when they tried to build their own memory managers. Pandey says the tagline for BigMemory will be: "Use all the memory you need to get blazingly fast performance."

BigMemory will be available for Ehcache and Hibernate users in October of this year.

Comments

sibasisha padhy replied on Tue, 2010/09/14 - 10:18am

Wow, is there a copy that I can download? This would be a big boost to the heap size constraint.

Jeff Rade replied on Tue, 2010/09/14 - 10:47am

Or we can go back to writing efficient code.

Fabrizio Giudici replied on Tue, 2010/09/14 - 12:26pm

Hmm... given Oracle plans to merge in some way JRockit with the standard JVM, I wonder whether BigMemory could be the future target of a patent litigation...

Peter Monks replied on Tue, 2010/09/14 - 1:03pm

Presumably "BigMemory" needs some kind of GC-equivalent to prevent eventual exhaustion of the "non-heap" memory area. I'd be very keen to hear more about what Terracotta have done on this topic, and if/how they've avoided running into the same issues as the "vanilla heap" GC. Unless they've reverted to manual memory reclamation or weak reference tricks or similar, it smells a bit fishy to me...

Alessandro Santini replied on Tue, 2010/09/14 - 2:35pm in response to: Peter Monks

I admit I still did not read the documentation, but I tend to second your opinion - it sounds like pushing the garbage collection phase as late as possible by largely increasing the available amount of memory... but again, I did not read the documentation.

Steven Harris replied on Tue, 2010/09/14 - 3:06pm in response to: Alessandro Santini

Hey Gang,

A little about what we are doing, and then a little bit of an answer to the questions. Without spilling too many of the beans, I can say that:

1) It is a 100% pure Java tiered cache (on-heap, off-heap, on-disk) and runs on any 1.5 and up JVM (64-bit preferred; JRockit, IBM, Sun tested) on any operating system that supports those JVMs (also 64-bit preferred).
2) It has been tested with caches over 350G in size with almost no degradation in performance due to the size of the off-heap, in-memory portion of the cache.
3) The off-heap portion is pause-less, fully concurrent and scales with CPU.

I'm going to blog a bunch more details in the next week or so. Not going to reveal too much here, but just a little about the nature of the problem. GC in an object-oriented language is very complicated for two reasons:

1) Figuring out what is garbage. Java has arbitrarily complex graphs of data with constantly changing references. There are lots of papers and books on how Java uses multiple garbage collection strategies to determine what is garbage and what isn't, so I won't cover it here.
2) Managing space. When you create new objects, you need to be able to find the space to do it and not waste space due to fragmentation.

A cache is a different beast. Entries are explicitly put, expired and removed so number 1 is already taken care of for you. That leaves problem 2, which we have come up with a number of strategies to deal with efficiently. Stay tuned...

Sergio Bossa replied on Tue, 2010/09/14 - 4:16pm

Hi Steve, great job here and thanks for your answers. What I think everyone is wondering about is, how did you manage to get stuff out of the garbage collected Java heap? I mean, Java has no other memory space for its objects, other than the heap, which is subjected to gc, nor supports manual deallocation, so how do you put things off-heap without native code? Thanks, Cheers! Sergio B.

Greg Luck replied on Tue, 2010/09/14 - 10:08pm in response to: Sergio Bossa

Sergio

With some very clever usage of DirectByteBuffer, which has been there since Java 1.4. It is in-process but off-heap. We have just released the technical documentation where you can read all about it: ehcache.org/documentation/offheap_store.html
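
For readers unfamiliar with direct byte buffers, here is a minimal sketch of the general idea Greg describes. It is not Terracotta's implementation. `ByteBuffer.allocateDirect` reserves memory outside the garbage-collected heap, and values are copied in and out as bytes. The `DirectBufferStore` class, its methods, and its append-only layout are hypothetical and chosen only for brevity.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: keeps values in one direct buffer. Not the Ehcache API. */
final class DirectBufferStore {
    // Memory reserved outside the Java heap; the collector does not scan its contents.
    private final ByteBuffer offHeap;
    // Small on-heap index: key -> {offset, length} of the value inside the buffer.
    private final Map<String, int[]> index = new HashMap<>();

    DirectBufferStore(int capacityBytes) {
        this.offHeap = ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Copies the value off-heap; returns false when the buffer has no room left. */
    boolean put(String key, byte[] value) {
        if (value.length > offHeap.remaining()) {
            return false;
        }
        int offset = offHeap.position();
        offHeap.put(value);                       // appends at the current position
        index.put(key, new int[] {offset, value.length});
        return true;
    }

    /** Copies the value back onto the heap, or returns null if the key is absent. */
    byte[] get(String key) {
        int[] slot = index.get(key);
        if (slot == null) {
            return null;
        }
        byte[] out = new byte[slot[1]];
        ByteBuffer view = offHeap.duplicate();    // shares the memory, has its own position
        view.position(slot[0]);
        view.get(out);
        return out;
    }

    boolean isDirect() {
        return offHeap.isDirect();
    }

    public static void main(String[] args) {
        DirectBufferStore store = new DirectBufferStore(1024);
        store.put("greeting", "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(store.get("greeting"), StandardCharsets.UTF_8)); // prints hello
    }
}
```

This sketch never reclaims the space of overwritten entries. Doing that efficiently, without fragmentation, is the "managing space" problem Steven Harris describes above, and it is the part a real off-heap store has to solve with its own allocator.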

Greg

Sergio Bossa replied on Wed, 2010/09/15 - 2:39am

Hi Greg, thanks for your answer. The byte buffer trick is very neat (even if I wouldn't have called it "off-heap" because it actually is in heap, but that's a naming question): congrats!

Alois Cochard replied on Wed, 2010/09/15 - 3:20am

Really promising,

I'm excited about this new feature and am going to try it ASAP.

I wonder how performance would compare with the experimental Azul GC we saw some months ago, a hardware-optimized GC (using a module in the Linux kernel, if I remember correctly).

Perhaps both technologies can be complementary!

Alois Cochard
http://aloiscochard.blogspot.com
http://www.twitter.com/aloiscochard

Sergio Bossa replied on Wed, 2010/09/15 - 5:39am

Sorry Greg, had a look at your docs and yes, it's actually off heap because it uses direct allocation (I somewhat missed you cited that in your comment too): congrats again ;)

RIchard replied on Wed, 2010/09/15 - 3:11pm in response to: Fabrizio Giudici

you see how ridiculous it sounds, when we say it out loud? :)

Henk De Boer replied on Wed, 2010/09/15 - 4:17pm in response to: Steven Harris

"A cache is a different beast. Entries are explicitly put, expired and removed so number 1 is already taken care of for you."

So, basically this is 'just' a cache right? As a user you need to allocate your objects with the new operator, then put them in this cache, and eventually when you think you're done with the object instance you call some method to remove it from the cache.

Nice as this might be, I'm not sure you can really compare this to GC or say you solved the GC problem. What happens when you forget to remove an object from the cache? This will open up an entire new class of memory leaks in Java, one that isn't really new in general of course since C++ programmers among others have been dealing with this for a long time.

Of course, as a pure cache it's a really nice solution ;)

Saravanan Subbiah replied on Wed, 2010/09/15 - 7:59pm in response to: Henk De Boer

yes, this is a neat solution to store huge amount of data closer to the application and have very fast access to them. It really isn't a generic solution to replace existing garbage collectors out there. Having said that, I believe that one could solve most GC problems with this solution and get the benefit of low GC pauses, low latency and consistent throughput.

Regarding the problem of forgetting to remove, the problem is the same for any Java application: all data structures will leak if you forget to remove entries. At least with a cache you could configure entries to be evicted on full capacity or on expiry after a timeout, etc., to circumvent those issues.

Henk De Boer replied on Thu, 2010/09/16 - 1:39pm in response to: Saravanan Subbiah

"yes, this is a neat solution to store huge amount of data closer to the application and have very fast access to them. It really isn't a generic solution to replace existing garbage collectors out there"

Indeed, it's a cache, nothing more, nothing less. An efficient one maybe, and one that isn't bothered by the GC (the GC has nothing to do in memory areas that are caches), but the various titles of articles about this are a little over the top really ;)

"At least with a cache you could configure it to be evicted on full capacity or be evicted on expiry after a timeout etc. to circumvent those issues."

This of course only applies to data that is truly just cached data. In general, you can't just randomly evict live objects in a JVM. Only those objects that you explicitly store in a cache, and that can be recreated if they suddenly happen not to be there anymore, qualify for that.

This is perfectly fine for a cache, but again, this is not a general replacement for the heap and Java GC as the articles initially seem to suggest.

Kunal Bhasin replied on Fri, 2010/09/17 - 1:54pm

"What happens when you forget to remove an object from the cache?"

Typically, you set a TimeToLive or TimeToIdle on objects so the evictor can evict them from the cache. I would like to avoid getting into implementation details, but the Ehcache evictor is far less contentious than a full GC as it does not need to block or lock the entire cache or object graph.

"(the GC has nothing to do in memory areas that are caches)"

The data in the cache would still be on the Java heap and, given the nature of most cached data, it is usually long lived and prone to cause long pauses and fragmentation over time. BigMemory solves exactly this problem by moving cached data off the Java Heap and managing it in RAM for the application in a very efficient manner.
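
To make the TimeToLive / TimeToIdle idea concrete, here is a minimal sketch of per-entry expiry. It is not Ehcache code: the `ExpiringCache` class and its methods are hypothetical, and the clock is passed in explicitly so the behaviour is easy to follow. An entry expires once it is older than its time-to-live, or once it has gone unread for longer than its time-to-idle.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of time-to-live and time-to-idle expiry; not Ehcache code. */
final class ExpiringCache<K, V> {

    private static final class Entry<V> {
        final V value;
        final long createdAt;
        long lastAccessAt;

        Entry(V value, long now) {
            this.value = value;
            this.createdAt = now;
            this.lastAccessAt = now;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long timeToLiveMillis;
    private final long timeToIdleMillis;

    ExpiringCache(long timeToLiveMillis, long timeToIdleMillis) {
        this.timeToLiveMillis = timeToLiveMillis;
        this.timeToIdleMillis = timeToIdleMillis;
    }

    void put(K key, V value, long nowMillis) {
        entries.put(key, new Entry<>(value, nowMillis));
    }

    /** Returns the value, or null if it is missing or has expired. */
    V get(K key, long nowMillis) {
        Entry<V> e = entries.get(key);
        if (e == null) {
            return null;
        }
        boolean tooOld = nowMillis - e.createdAt >= timeToLiveMillis;
        boolean tooIdle = nowMillis - e.lastAccessAt >= timeToIdleMillis;
        if (tooOld || tooIdle) {
            entries.remove(key);   // only this one entry is dropped; nothing else is scanned
            return null;
        }
        e.lastAccessAt = nowMillis;
        return e.value;
    }

    int size() {
        return entries.size();
    }
}
```

A forgotten entry in this sketch is only dropped when something next asks for its key, so a real cache would also need a capacity limit or a background sweep. It does show why a cache can decide what is "garbage" from timestamps alone, without walking an object graph.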

Hal Mo replied on Mon, 2010/09/20 - 5:31am in response to: sibasisha padhy

It's great. Any way to download it?

Steven Harris replied on Mon, 2010/09/27 - 8:19pm

Wrote a short blog post on the what, why and how of BigMemory in case people are interested: http://dsoguy.blogspot.com/2010/09/little-bit-about-bigmemory-for-ehcache.html

James Kear replied on Tue, 2011/09/06 - 2:35pm

This breakthrough solution improves memory utilization and application performance with both standalone and distributed caching.

Carla Brian replied on Tue, 2012/05/15 - 5:57pm

I would definitely download this one. I think this application is reliable and effective.
