Alec is a Content Curator at DZone. He lives in Raleigh and spends his free time writing and programming. Alec is a DZone Zone Leader and has posted 465 posts at DZone.

Tuning the JVM to Improve Performance in Cassandra

02.10.2014

It's a common problem: as data grows, performance suffers. That was the case described in this article from Blake Eggleston at the Shift Developer Blog, where an expanding dataset led to nodes that became unresponsive for seconds at a time, along with even bigger problems. The team eventually solved it by tuning the JVM to cooperate better with Cassandra.

Eggleston's post details the diagnostic process, including the tools used (OpsCenter and jstat, for example), and concludes that garbage collection badly out of tune with Cassandra's behavior was a major culprit in the drop in performance. According to Eggleston:

We were also seeing major collections happening every few minutes that would take 5-15 seconds. Basically, the garbage collector was so far out of tune with Cassandra’s behavior that Cassandra was spending a ton of time collecting garbage.

But with careful garbage collection changes, the team was able to drastically improve performance. Eggleston says that:

I haven’t seen the read latencies go above 10ms, and I’ve seen the cluster handle 40 thousand plus reads per second with latencies around 7ms. New gen collection times are now around 15ms, and they happen slightly less than once per second. This means that Cassandra went from spending around 20% or more of its time collecting garbage, to a little over 1%.
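
To make the arithmetic behind those percentages concrete, here is a minimal Java sketch. It is not taken from Eggleston's post: the class name GcOverhead and its helper methods are hypothetical, and the only library calls used are the standard java.lang.management MXBeans. The first calculation plugs in the quoted figures (15 ms pauses at most once per second, which gives an upper bound of about 1.5%); the second reports the same ratio for whichever JVM runs the code.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, for illustration only.
class GcOverhead {

    // Fraction of elapsed wall-clock time spent collecting garbage.
    public static double gcFraction(long gcMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0; // avoid dividing by zero on a freshly started JVM
        }
        return (double) gcMillis / elapsedMillis;
    }

    // Accumulated collection time, in milliseconds, summed over every
    // collector registered in the current JVM.
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Quoted figures: ~15 ms new gen pauses, slightly less than once per second.
        System.out.printf("quoted figures: %.1f%%%n", 100 * gcFraction(15, 1000));

        // Same ratio for this JVM since startup.
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("this JVM: %.2f%%%n", 100 * gcFraction(totalGcMillis(), uptimeMillis));
    }
}
```

Note that this measures the JVM running the snippet, not a Cassandra node. Reading the same beans from a running node would require a remote JMX connection, and sampling the ratio over an interval rather than since startup would better reflect current behavior.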

That's a serious improvement. Check out Eggleston's full article for details on how garbage collection can bog Cassandra down, and how to fix it.