Multithreading and the Java Memory Model
At the New England Software Symposium, I attended Brian Goetz's session called "The Java Memory Model". When I saw the phrase "memory model" in the title, I thought it would be about garbage collection, memory allocation and memory types. Instead, it was really about multithreading. What sets it apart is that the presentation focused on visibility, not locking or atomicity. This is my attempt to summarize his talk.
The importance of visibility
Visibility here refers to whether, and when, a value written by one thread can be seen by another. The big gotcha is that thread A writing something before thread B reads it does not mean thread B will read the correct value. Even if A's write comes first in time, unless both threads synchronize properly you can still be in deep doo doo, because memory is not necessarily written and read in order, and may be read in a partially written state.
A big part of this peril comes from the layered memory architecture of
modern hardware: multi-CPU, multi-core CPUs, multi-level caches on and
off chip etc. Instructions could be executed in parallel or out of
order. The memory being written may not even be in RAM at all: it could
be on a remote core's register. But the danger could also come from
old-fashioned compiler optimizations. One of Brian's examples is the
following loop which depends on another thread to set the boolean field
asleep:
while (!asleep) ++sheep;
The compiler may notice that asleep is loop-invariant and hoist its evaluation out of the loop:
if (!asleep) while (true) ++sheep;
The result is an infinite loop whenever asleep is false on entry. The fix in this case is to declare asleep as a volatile variable.
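As a minimal sketch of that fix, the class below marks the flag volatile so the loop must re-read it on every iteration. The class name SheepCounter, the fallAsleep() method and the demo() helper are hypothetical names invented for illustration; only the loop itself comes from the talk.

```java
import java.util.concurrent.TimeUnit;

class SheepCounter {
    // volatile forbids hoisting the read out of the loop and makes
    // another thread's write to the flag visible to the counting thread.
    private volatile boolean asleep = false;
    private long sheep = 0; // written only by the counting thread

    void fallAsleep() {
        asleep = true;
    }

    long countSheep() {
        while (!asleep) ++sheep;
        return sheep;
    }

    // Hypothetical demo helper: true if the loop ended after the flag was set.
    static boolean demo(long sleepAfterMillis) {
        SheepCounter counter = new SheepCounter();
        Thread counting = new Thread(() -> counter.countSheep());
        counting.setDaemon(true); // never keep the JVM alive if something goes wrong
        counting.start();
        try {
            TimeUnit.MILLISECONDS.sleep(sleepAfterMillis);
            counter.fallAsleep();
            counting.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !counting.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("loop terminated: " + demo(100));
    }
}
```

Without the volatile modifier the same program may or may not terminate, depending on the JVM and how aggressively the loop gets optimized.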
The Java Memory Model
A memory model describes when one thread's actions are guaranteed to be visible to another. The Java memory model (JMM) is quite an achievement: previously, memory models were specific to each processor architecture. A cross-platform memory model takes portability well beyond being able to compile the same source code: you really can run it anywhere. It took until Java 5 (JSR 133) to get the JMM right.
The JMM defines a partial ordering on program actions (read/write, lock/unlock, start/join threads) called happens-before. If action X happens-before Y, then X's results are visible to Y. Within a thread, the order is simply the program order. But between threads, if you don't use synchronized or volatile, there are no visibility guarantees: thread A may not see thread B's writes in the order that B executed them. Brian even invoked special relativity to describe the disorienting effects of relative views of reality. You need synchronization to get inter-thread visibility guarantees.
The basic tools of thread synchronization are:
- The synchronized keyword: an unlock happens-before every subsequent lock on the same monitor.
- The volatile keyword: a write to a volatile variable happens-before subsequent reads of that variable.
- Static initialization: done by the class loader, so the JVM guarantees thread safety.
In addition to the above, the JMM offers a guarantee of initialization safety for immutable objects.
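To make the volatile rule concrete, here is a small sketch of a one-shot handoff between two threads. The class Handoff and its methods are hypothetical names chosen for this example. The plain write to payload comes before the volatile write to ready in program order, so a reader that observes ready == true is also guaranteed to see the payload.

```java
class Handoff {
    private int payload = 0;                // plain, non-volatile field
    private volatile boolean ready = false; // the synchronization point

    void publish(int value) {
        payload = value; // (1) ordinary write
        ready = true;    // (2) volatile write, ordered after (1) by program order
    }

    int awaitPayload() {
        while (!ready) Thread.yield(); // (3) volatile read that eventually sees (2)
        return payload;                // (4) guaranteed to see the write from (1)
    }

    // Hypothetical demo helper: hands 'value' to a reader thread and returns what it saw.
    static int demo(int value) {
        Handoff handoff = new Handoff();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = handoff.awaitPayload());
        reader.start();
        handoff.publish(value);
        try {
            reader.join(); // join() also establishes happens-before, so seen[0] is visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(demo(42));
    }
}
```

If ready were a plain boolean, neither loop termination nor the value returned at step (4) would be guaranteed.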
The Rules
Here are points that Brian emphasized:
- If you read or write a field that is read/written by another thread, you must synchronize. This must be done by both the reading and writing threads, and on the same lock.
- Don't try to reason about ordering in undersynchronized programs.
- Avoiding synchronization can cause subtle bugs that only blow up in production. Do it right first, then make it fast.
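A minimal sketch of the first rule, assuming a simple shared counter (SharedCounter and demo() are hypothetical names, not from the talk). Note that the reader takes the same lock as the writers; a synchronized increment paired with an unsynchronized get would break the rule.

```java
class SharedCounter {
    private final Object lock = new Object();
    private long value = 0; // guarded by 'lock'

    void increment() {
        synchronized (lock) { // writers take the lock...
            value++;
        }
    }

    long get() {
        synchronized (lock) { // ...and so do readers, on the same monitor
            return value;
        }
    }

    // Hypothetical demo helper: several threads increment, then the total is read.
    static long demo(int threads, int incrementsPerThread) {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) counter.increment();
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 10_000));
    }
}
```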
Case study: double-checked locking
One example of synchronization avoidance gone bad is the popular double-checked locking idiom for lazy initialization, which we now know is broken:
private Thing instance = null;
public Thing getInstance() {
    if (instance == null) {
        synchronized (this) {
            if (instance == null) instance = new Thing();
        }
    }
    return instance;
}
This idiom can hand a caller a partially constructed Thing object: the
unsynchronized first check guarantees nothing about visibility, so a thread
can see a non-null instance before the constructor's writes. There are
ways to fix this, of course, such as using a volatile field or
switching to using static initializers. But it's easy to get it wrong,
so Brian questions why we would want to do something like this in the
first place.
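For reference, the volatile variant of the fix might look like the sketch below. It relies on the Java 5+ memory model, and the names VolatileDcl and the nested Thing stand-in are hypothetical. The local variable is a common refinement that keeps the fast path to a single volatile read; it is not something the talk prescribed.

```java
class VolatileDcl {
    // Stand-in for the Thing class from the example above.
    static final class Thing {
        final String name = "thing";
    }

    // volatile is what repairs the idiom under the Java 5+ memory model.
    private volatile Thing instance;

    Thing getInstance() {
        Thing local = instance;       // one volatile read on the fast path
        if (local == null) {
            synchronized (this) {
                local = instance;     // re-check under the lock
                if (local == null) {
                    local = new Thing();
                    instance = local; // volatile write publishes the fully built object
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        VolatileDcl holder = new VolatileDcl();
        System.out.println(holder.getInstance() == holder.getInstance());
    }
}
```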
The main motivation was to avoid synchronization in the common case.
While it used to be expensive, uncontended synchronization
is much cheaper now. There is still a lot of advice to avoid supposedly
expensive Java operations out there, but the JVM has improved
tremendously and a lot of old performance tips (like object pooling)
just don't make sense anymore. Beware of reading years-old advice when
you Google for Java tips. Remember Brian's advice above against
premature optimization. That said, he also showed a couple of better
alternatives for lazy initialization.
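The specific alternatives from the slides are not reproduced here. One well-known option that fits the static-initialization guarantee described earlier is the initialization-on-demand holder idiom, sketched below with hypothetical names (LazyThing, Holder, and a stand-in Thing class).

```java
class LazyThing {
    // Stand-in for the Thing class from the double-checked locking example.
    static final class Thing {
    }

    private LazyThing() {
    }

    // The JVM initializes Holder lazily, on first access to INSTANCE,
    // and class initialization is guaranteed to be thread safe.
    private static final class Holder {
        static final Thing INSTANCE = new Thing();
    }

    static Thing getInstance() {
        return Holder.INSTANCE; // no explicit locking or volatile needed
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance());
    }
}
```

The trade-off is that this only works for a static singleton; a per-object lazy field still needs synchronization or a volatile field.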
Some thoughts
This talk was a reminder to me that low-level multithreading is hard.
It's hard enough that it took years to get the JMM right. It's hard
enough that a university professor would say "don't do it".
And if you faithfully follow Brian's rules and use synchronization
primitives everywhere, you might find yourself vulnerable to thread
deadlocks (hmmm ... why does JConsole have a deadlock detection
function?).
The primary danger in multithreading is in shared, mutable state.
Without shared mutable data, threads might as well be separate
processes, and the danger evaporates. So while it's wonderful what the JMM
has done for cross-platform visibility guarantees, I think we would do
ourselves a favor if we tried to minimize shared mutable data. There
are often higher level alternatives. For example, Scala's Actor
construct relies on passing immutable messages instead of sharing
memory.
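Plain Java can approximate the same style without an actor library. The sketch below is not Scala's Actor API; it swaps in a java.util.concurrent.BlockingQueue as a mailbox and passes immutable messages between two threads. MessagePassing, Message and sumViaMailbox are hypothetical names for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class MessagePassing {
    // Immutable message: all fields final, no setters.
    static final class Message {
        final int amount;
        final boolean last; // marks the end of the stream

        Message(int amount, boolean last) {
            this.amount = amount;
            this.last = last;
        }
    }

    // A producer thread sends 1..n through the mailbox; the caller sums them.
    static int sumViaMailbox(int n) {
        BlockingQueue<Message> mailbox = new ArrayBlockingQueue<>(16);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) mailbox.put(new Message(i, false));
                mailbox.put(new Message(0, true));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.setDaemon(true);
        producer.start();

        int total = 0; // confined to the consuming thread, never shared
        try {
            while (true) {
                Message message = mailbox.take();
                if (message.last) break;
                total += message.amount;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumViaMailbox(100));
    }
}
```

The only shared object is the queue, whose implementation handles the synchronization, so the application code has no locks or volatile fields of its own.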
Comments
Anuj Mehta replied on Mon, 2009/10/05 - 3:45am
Good one. There is an excellent book on threads by Brian Goetz, "Java Concurrency in Practice". Personally I find it heavy stuff on first reading, but it is THE book on multithreading in Java.
http://anuj-mehta.blogspot.com
James Imber replied on Mon, 2009/10/05 - 6:24am
srikanth ks replied on Mon, 2009/10/05 - 8:28am
I agree with you. Java Concurrency in Practice is the best book I have seen on Java concurrency.
After reading the book I felt none of the applications I have worked on are completely thread safe, but most of them never had concurrency issues when handling tens of concurrent requests. As per the book, even getter and setter methods need some level of synchronization, which I rarely find in applications.
I worked on some applications deployed in a clustered environment with some level of synchronization that is apparently not thread safe in a clustered environment. Surprisingly, those applications also never had problems with concurrency. It's hard to convince developers to go for a completely thread-safe design or to use technologies like Terracotta. Also, most developers are not much concerned with visibility issues as long as the application is working fine.
Col B replied on Mon, 2009/10/05 - 4:31pm
Excellent article!
On the topic of books - I've been tempted to get Doug Lea's book, Concurrent Programming in Java, but the most recent edition is still 10 years old and obviously predates the Java 5 changes to the memory model. Does anyone know if this book is still relevant or should I wait for a third edition?
Vedhas Pitkar replied on Tue, 2009/10/06 - 12:46am
Jesper Nordenberg replied on Tue, 2009/10/06 - 3:03am
Threads and locks are notoriously hard to get right. See this talk by Rich Hickey on alternative solutions which are much easier to use:
http://www.infoq.com/presentations/Value-Identity-State-Rich-Hickey
The code examples are written in Clojure, but the concepts can be applied in any language. For example, Scala has actors and persistent collection types.