Multithreading and the Java Memory Model

10.05.2009

At the New England Software Symposium, I attended Brian Goetz's session called "The Java Memory Model". When I saw the phrase "memory model" in the title I thought it would be about garbage collection, memory allocation and memory types. Instead, it is really about multithreading. The difference is that this presentation focuses on visibility, not locking or atomicity. This is my attempt to summarize his talk.

The importance of visibility

Visibility here refers to whether a value written by one thread can be seen by another thread that reads it afterwards. The big gotcha is that thread A writing something before thread B reads it does not mean thread B will read the correct value. Even if A's write really does come first in time, without proper synchronization on both sides you can still be in deep doo doo, because the memory is not necessarily written and read in order, or is read in a partially written state.

A big part of this peril comes from the layered memory architecture of modern hardware: multiple CPUs, multi-core CPUs, multi-level caches on and off chip, and so on. Instructions could be executed in parallel or out of order. The memory being written may not even be in RAM at all: it could be in a remote core's register. But the danger could also come from old-fashioned compiler optimizations. One of Brian's examples is the following loop, which depends on another thread to set the boolean field asleep:

while (!asleep) ++sheep;

The compiler may notice that asleep is loop-invariant and hoist its evaluation out of the loop:

if (!asleep) while (true) ++sheep;

The result is an infinite loop. The fix in this case is to use a volatile variable.
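Here is a minimal sketch of that fix. Only the names asleep and sheep come from the loop above; the SheepCounter class, its methods, and the 50 ms pause are my own illustrative assumptions, not code from the talk.

```java
// Hypothetical class: only `asleep` and `sheep` come from the article's loop.
class SheepCounter {
    // volatile: each loop iteration must re-read the field, so the
    // compiler is not allowed to hoist the check out of the loop.
    private volatile boolean asleep = false;

    // Written only by the counting thread.
    private long sheep = 0;

    void countUntilAsleep() {
        while (!asleep) {
            ++sheep;
        }
    }

    void fallAsleep() {
        asleep = true; // volatile write, visible to the counting loop
    }

    long sheepCounted() {
        return sheep;
    }

    public static void main(String[] args) throws InterruptedException {
        SheepCounter counter = new SheepCounter();
        Thread counting = new Thread(counter::countUntilAsleep);
        counting.start();
        Thread.sleep(50);      // let the loop spin briefly
        counter.fallAsleep();  // the loop observes this and exits
        counting.join();       // join() makes the final `sheep` value visible here
        System.out.println("Counted " + counter.sheepCounted() + " sheep");
    }
}
```

Without the volatile modifier, this program is allowed to spin forever, although whether it actually does depends on the JVM and the hardware.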

The Java Memory Model

A memory model describes when one thread's actions are guaranteed to be visible to another. The Java memory model (JMM) is quite an achievement: previously, memory models were specific to each processor architecture. A cross-platform memory model takes portability well beyond being able to compile the same source code: you really can run it anywhere. It took until Java 5 (JSR 133) to get the JMM right.

The JMM defines a partial ordering on program actions (read/write, lock/unlock, start/join threads) called happens-before. If action X happens-before Y, then X's results are visible to Y. Within a thread, the order is simply the program order. It's straightforward. But between threads, if you don't use synchronized or volatile, there are no visibility guarantees. There is no guarantee that thread A will see thread B's writes in the order that B executed them. Brian even invoked special relativity to describe the disorienting effects of relative views of reality. You need synchronization to get inter-thread visibility guarantees.
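The start/join part of happens-before can be sketched as follows. The class and method names are hypothetical, chosen only for illustration. A plain (non-volatile) field is written by a worker thread and read by the caller, and the read is safe only because of the join() call.

```java
// Hypothetical example of the start/join happens-before edges.
class JoinVisibility {
    private int result; // plain field: no volatile, no lock

    int computeOnWorker() {
        Thread worker = new Thread(() -> result = 6 * 7);
        worker.start(); // everything before start() happens-before the worker's actions
        try {
            worker.join(); // the worker's actions happen-before join() returning
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return result; // guaranteed to see the worker's write
    }
}
```

If the caller read result without joining first, nothing in the JMM would guarantee that it sees the worker's write.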

The basic tools of thread synchronization are:

  • The synchronized keyword: an unlock happens-before every subsequent lock on the same monitor.
  • The volatile keyword: a write to a volatile variable happens-before subsequent reads of that variable.
  • Static initialization: done by the class loader, so the JVM guarantees thread safety.
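As a rough illustration of the volatile rule, the sketch below (with hypothetical names of my own) hands a plain field from one thread to another by guarding it with a volatile flag. It relies on program order within each thread plus the volatile write/read edge between them.

```java
// Hypothetical hand-off class: a plain payload guarded by a volatile flag.
class Handoff {
    private int payload;            // plain field
    private volatile boolean ready; // volatile guard

    // Called by the writing thread.
    void publish(int value) {
        payload = value; // (1) plain write, before (2) in program order
        ready = true;    // (2) volatile write
    }

    // Called by the reading thread; returns null if nothing is published yet.
    Integer tryRead() {
        if (ready) {        // (3) volatile read; if it sees (2) ...
            return payload; // (4) ... this read is guaranteed to see (1)
        }
        return null;
    }
}
```

Swapping the two statements in publish() would break the guarantee: a reader could then see ready as true before the payload has been written.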

In addition to the above, the JMM offers a guarantee of initialization safety for immutable objects.
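A small sketch of what an immutable object looks like in this sense. The ImmutablePoint class is my own example, not one from the talk. The guarantee rests on the fields being final and on the this reference not escaping from the constructor.

```java
// Hypothetical immutable value class. Initialization safety applies
// because every field is final and `this` does not escape the constructor.
final class ImmutablePoint {
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() {
        return x;
    }

    int y() {
        return y;
    }

    // "Mutation" returns a new object instead of changing this one.
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }
}
```

Any thread that obtains a reference to an ImmutablePoint sees the values assigned in the constructor, even if the reference itself was shared without synchronization.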

The Rules

Here are points that Brian emphasized:

  • If you read or write a field that is read/written by another thread, you must synchronize. This must be done by both the reading and writing threads, and on the same lock.
  • Don't try to reason about ordering in undersynchronized programs. 
  • Avoiding synchronization can cause subtle bugs that only blow up in production. Do it right first, then make it fast.
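The first rule can be sketched like this. SharedCounter and its helper method are hypothetical names used only for illustration. The point is that both the write path and the read path take the same lock.

```java
// Hypothetical counter: reader and writer synchronize on the same lock.
class SharedCounter {
    private final Object lock = new Object();
    private int count; // guarded by lock

    void increment() {
        synchronized (lock) {
            count++;
        }
    }

    int get() {
        synchronized (lock) { // same lock as increment(), so writes are visible
            return count;
        }
    }

    // Runs two threads that each increment perThread times, then reads the total.
    static int countWithTwoThreads(int perThread) {
        SharedCounter counter = new SharedCounter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.get();
    }
}
```

Locking only in increment() and reading count directly in get() would look harmless, but the reader would then have no happens-before relationship with the writers.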

Case study: double-checked locking

One example of synchronization avoidance gone bad is the popular double-checked locking idiom for lazy initialization, which we now know is broken:

private Thing instance = null;
public Thing getInstance() {
    if (instance == null) {
        synchronized (this) {
            if (instance == null) instance = new Thing();
        }
    }
    return instance;
}


This idiom can hand a caller a partially constructed Thing object: the first null check runs outside the synchronized block, so a thread may see a non-null instance without seeing the writes made by the Thing constructor. The idiom only worries about atomicity, at the expense of visibility. There are ways to fix this, of course, such as using a volatile field or switching to static initializers. But it's easy to get it wrong, so Brian questions why we would want to do something like this in the first place.
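For reference, here is a sketch of the volatile variant, which is valid under the Java 5 memory model. The ThingFactory wrapper and the empty Thing stub are placeholders of mine so that the snippet compiles on its own.

```java
// Stand-in for the article's Thing class (hypothetical, empty on purpose).
class Thing {
}

// Hypothetical wrapper showing double-checked locking with a volatile field.
class ThingFactory {
    // volatile: the write of the reference happens-before any later read of it,
    // so a reader that sees a non-null value also sees the constructed object.
    private volatile Thing instance;

    Thing getInstance() {
        Thing local = instance; // single volatile read on the fast path
        if (local == null) {
            synchronized (this) {
                local = instance; // re-check under the lock
                if (local == null) {
                    local = new Thing();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

The local variable is optional. It only avoids reading the volatile field twice once the instance exists.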

The main motivation was to avoid synchronization in the common case. While synchronization used to be expensive, uncontended synchronization is much cheaper now. There is still a lot of advice out there to avoid supposedly expensive Java operations, but the JVM has improved tremendously and a lot of old performance tips (like object pooling) just don't make sense anymore. Beware of reading years-old advice when you Google for Java tips. Remember Brian's advice above against premature optimization. That said, he also showed a couple of better alternatives for lazy initialization.
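I did not record exactly which alternatives were shown, so take the following as an assumption on my part. One well-known option that leans on static initialization is the lazy initialization holder class idiom, sketched here with hypothetical names.

```java
// Hypothetical class using the lazy initialization holder class idiom.
class LazyThing {
    private LazyThing() {
    }

    // The JVM initializes Holder the first time getInstance() touches it,
    // and class initialization is thread-safe, so no explicit locking is needed.
    private static class Holder {
        static final LazyThing INSTANCE = new LazyThing();
    }

    static LazyThing getInstance() {
        return Holder.INSTANCE;
    }
}
```

The other simple option is a plain synchronized getInstance() method, which is often fast enough now that uncontended locking is cheap.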

Some thoughts

This talk was a reminder to me that low-level multithreading is hard. It's hard enough that it took years to get the JMM right. It's hard enough that a university professor would say "don't do it". And if you faithfully follow Brian's rules and use synchronization primitives everywhere, you might find yourself vulnerable to thread deadlocks (hmmm ... why does JConsole have a deadlock detection function?).

The primary danger in multithreading is in shared, mutable state. Without shared mutable data, threads might as well be separate processes, and the danger evaporates. So while it's wonderful what the JMM has done for cross-platform visibility guarantees, I think we would do ourselves a favor if we tried to minimize shared mutable data. There are often higher-level alternatives. For example, Scala's Actor construct relies on passing immutable messages instead of sharing memory.

From http://chriswongdevblog.blogspot.com/

Published at DZone with permission of its author, Christopher Wong.


Comments

Anuj Mehta replied on Mon, 2009/10/05 - 3:45am

Good one. There is an excellent book on threads by Brian Goetz, "Java Concurrency in Practice". Personally I found it heavy stuff on first reading, but it is THE book on multithreading in Java.

http://anuj-mehta.blogspot.com

James Imber replied on Mon, 2009/10/05 - 6:24am

I read the book by B. Goetz a few years ago. It was an eye opener for me. Since then I've tried to explain this visibility issue to almost every developer I've met, and almost none of them accept the issue as real. The main problem is that their program currently works and they don't understand why they should do things differently.

srikanth ks replied on Mon, 2009/10/05 - 8:28am

I agree with you. Java Concurrency in Practice is the best book I have seen on Java concurrency.

After reading the book I felt that none of the applications I have worked on are completely thread safe, but most of them never had concurrency issues when handling tens of concurrent requests. As per the book, even getter and setter methods need some level of synchronization, which I rarely find in applications.

I worked on some applications which are deployed in a clustered environment with some level of synchronization, which is apparently not thread safe in a clustered environment. Surprisingly, those applications also never had problems with concurrency. It's hard to convince developers to go for a completely thread-safe design or to use technologies like Terracotta. Also, most developers are not much concerned with visibility issues as long as the application is working fine.

Col B replied on Mon, 2009/10/05 - 4:31pm

Excellent article!

On the topic of books - I've been tempted to get Doug Lea's book, Concurrent Programming in Java, but the most recent edition is still 10 years old and obviously predates the Java 5 changes to the memory model. Does anyone know if this book is still relevant or should I wait for a third edition?

Vedhas Pitkar replied on Tue, 2009/10/06 - 12:46am

Very good article!! For more information on the updated memory model, visit http://jeremymanson.blogspot.com

Jesper Nordenberg replied on Tue, 2009/10/06 - 3:03am

Threads and locks are notoriously hard to get right. See this talk by Rich Hickey on alternative solutions which are much easier to use:

http://www.infoq.com/presentations/Value-Identity-State-Rich-Hickey

The code examples are written in Clojure, but the concepts can be applied in any language. For example, Scala has actors and persistent collection types.
