But It Works On My Machine...

06.25.2008

What happens when we run main() in the code snippet below? Of course, the erudite, tech-savvy Java people can answer that in a heartbeat.

public class TestVolatile implements Runnable {
	private boolean stopRequested = false;

	public void run() {
		while(!stopRequested) {
			// do something here...
		}
	}

	public void stop() {
		stopRequested = true;
	}

	public static void main(String[] args) throws InterruptedException {
		TestVolatile tv = new TestVolatile();
		new Thread(tv, "Neverending").start();
		Thread.sleep(1000);
		tv.stop();
	}
}

The answer is that we don’t know for sure when the program will terminate, because the stopRequested variable is not marked volatile. So when we call stop() in the main thread, the Neverending thread may never see that stopRequested has changed to true, and it just keeps going… and going… and going… just like the Energizer Bunny. We never know when it’s gonna stop.
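
For comparison, here is a minimal sketch of the version that is guaranteed to stop: the same loop with the flag marked volatile. The class name StoppableTask and the helper stopsWithin() are my own names for illustration, and the worker is a daemon thread only so that a failed experiment cannot keep the JVM alive.

```java
// Sketch of the fix: the same loop, with the flag declared volatile.
// StoppableTask and stopsWithin are hypothetical names used for illustration.
class StoppableTask implements Runnable {
    // volatile: a write by one thread is visible to subsequent reads by other threads
    private volatile boolean stopRequested = false;

    public void run() {
        while (!stopRequested) {
            // do something here...
        }
    }

    public void stop() {
        stopRequested = true;
    }

    // Starts the task, requests a stop, and reports whether the worker exited in time.
    static boolean stopsWithin(long timeoutMillis) {
        StoppableTask task = new StoppableTask();
        Thread worker = new Thread(task, "Neverending");
        worker.setDaemon(true); // do not keep the JVM alive if the worker never stops
        worker.start();
        try {
            Thread.sleep(100);          // let the worker spin for a moment
            task.stop();                // volatile write
            worker.join(timeoutMillis); // wait for the worker to notice
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("stopped: " + stopsWithin(5000));
    }
}
```

With the volatile modifier in place, the Java Memory Model requires the worker to see the write eventually, so this version should stop on any conforming VM, not just on a forgiving one.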

But… it works just fine on my machine

A friend of mine wasn’t convinced though, so just to prove it to him, I typed that in Eclipse and ran it, fully expecting that it would run forever.

It stopped in about one second. Hmmm. Weird.

It seems that the change made from the main thread to stopRequested is immediately visible from the Neverending thread, although stopRequested is not volatile.

Out of curiosity, I modified the program a bit to look like this:

import java.util.concurrent.TimeUnit;

public class TestVolatile implements Runnable {
	private boolean stopRequested = false;

	private volatile long justAfterStopRequested;
	private volatile long afterNeverendingHasStopped;
	public void run() {
		while(!stopRequested) {
			// do something
		}
		afterNeverendingHasStopped = System.currentTimeMillis();
	}

	public void stop() {
		stopRequested = true;
		justAfterStopRequested = System.currentTimeMillis();
	}

	public static void main(String[] args) throws InterruptedException {
		TestVolatile tv = new TestVolatile();
		Thread t = new Thread(tv, "Neverending");
		t.start();
		// let main thread sleep for 1 second before requesting
		// Neverending to stop
		TimeUnit.SECONDS.sleep(1);
		tv.stop();
		// wait until Neverending has stopped
		t.join();
		System.out.println(tv.afterNeverendingHasStopped - tv.justAfterStopRequested);
	}
}

I’m trying to see how much time the Neverending thread needs to realize that stopRequested has changed to true, and get outside the loop. The result is 0.

But it can’t be instantaneous; there must be some difference in time. So I changed the System.currentTimeMillis() calls to System.nanoTime(). The result is virtually the same, ranging from around -300 to +300 nanoseconds (we can’t say which one gets modified first, justAfterStopRequested or afterNeverendingHasStopped; it differs on each run). But for all practical purposes, we can say that the Neverending thread sees the change to stopRequested almost immediately.

Strange.

In Effective Java, 2nd ed., Joshua Bloch says that on his machine such a program never stops running. Not that I have a self-esteem problem or whatever, but when it comes to Java, if I have to choose whether to trust Joshua Bloch or me, I choose the former. Sorry, me.

On another machine, however…

But you know what? I found that TestVolatile did run forever on Solaris! (Well, it ran until I came back 15 minutes later and killed it anyway.) Is it because of the differences between Windows and Solaris? Not really. It’s more about the differences between the Server VM and the Client VM. My Solaris test platform is a server-class machine, so by default I was using the Server VM, whereas on my Windows machine I was running the Client VM by default.

Indeed, when I ran the program again on Solaris with the Client VM (with the -client option), it stopped after roughly a second, without having to make stopRequested volatile. Conversely, when I ran the program on Windows with the -server option, it never stopped, and would only stop if I made stopRequested volatile.

This shows that the Client VM may deceive us into thinking that we don’t need volatile (until we run our program on a server-class machine and things start breaking in strange ways). Superficially, the Client VM and the Server VM may sound like they’re not that different. But some differences do matter: what we see here is a real example where the differences between the Client VM and the Server VM can make or break your application.
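
Since the behavior depends on which VM is actually running, it helps to check rather than guess. The sketch below prints the standard java.vm.name system property; the class name WhichVm is made up for illustration, and the exact string printed (and which of the -client and -server flags a given JVM honors) varies by vendor and version, so treat that part as something to verify on your own installation.

```java
// Prints the name of the running VM, e.g. to confirm whether a launch
// actually picked up the Server VM. WhichVm is a hypothetical class name.
class WhichVm {
    // java.vm.name is one of the standard system properties
    static String vmName() {
        return System.getProperty("java.vm.name");
    }

    public static void main(String[] args) {
        System.out.println(vmName());
    }
}
```

Running it once as plain java WhichVm and once with the flag you intend to test under shows whether the flag made any difference on that machine.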

And that’s not all. There’s another non-obvious thing that may mask the need for volatile in our programs. Like System.out.println(), for example.

Hidden synchronized blocks sometimes hide the need for volatile

If we want to print a counter variable within the loop like this:

public class TestVolatile implements Runnable {
	private boolean stopRequested = false;
	private int i = 0;

	public void run() {
		while(!stopRequested) {
			System.out.println(i++);
		}
	}

	public void stop() {
		stopRequested = true;
	}

	// the rest
}

Then, in my tests, the program always terminates even if we don’t mark stopRequested as volatile. Why? Because there is a synchronized block inside System.out.println(). When the thread that runs the run() method calls System.out.println(), its copy of stopRequested gets updated with the latest value, and the while loop terminates.

Note though, that this is NOT guaranteed by the Java Memory Model. The JMM guarantees visibility between two threads that enter and exit synchronized blocks protected by the same lock. If the synchronized blocks are protected by different locks, then the only safe assumption is to assume that there’s no synchronization at all.
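
To make the "same lock" case concrete, here is a minimal sketch in which both the read and the write of the flag go through methods synchronized on the same object (this). That is the case the JMM does cover. The names SynchronizedStop, isStopRequested() and stopsWithin() are mine, chosen for illustration.

```java
// Both threads synchronize on the same lock ("this"), so the JMM guarantees
// that the reader sees the writer's update. Names here are hypothetical.
class SynchronizedStop implements Runnable {
    private boolean stopRequested = false; // guarded by "this", not volatile

    private synchronized boolean isStopRequested() {
        return stopRequested;
    }

    public synchronized void stop() {
        stopRequested = true;
    }

    public void run() {
        while (!isStopRequested()) {
            // do something
        }
    }

    // Starts the task, requests a stop, and reports whether the worker exited in time.
    static boolean stopsWithin(long timeoutMillis) {
        SynchronizedStop task = new SynchronizedStop();
        Thread worker = new Thread(task, "Neverending");
        worker.setDaemon(true); // do not keep the JVM alive if the worker never stops
        worker.start();
        try {
            Thread.sleep(100);
            task.stop();
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("stopped: " + stopsWithin(5000));
    }
}
```

Note that synchronizing only the write, or only the read, would not be enough: the guarantee applies to an unlock followed by a lock of the same monitor.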

This program below does stop,

public class TestVolatile implements Runnable {
	private boolean stopRequested = false;
	private int i = 0;

	private final Object lock1 = new Object();
	private final Object lock2 = new Object();

	public void run() {
		while(!stopRequested) {
			synchronized(lock1) {}
			i++;
		}
	}

	public void stop() {
		stopRequested = true;
		synchronized(lock2) {}
	}

	// the rest
}

even though the two threads are entering synchronized blocks guarded by different locks. We can even remove the synchronized block guarded by lock2 in stop(), and the program still stops. But not the other way around (that is, if we remove lock1’s synchronized block and leave lock2’s there, again the program will run forever).

So it seems that the thread that reads a variable gets the up-to-date value when it executes a synchronized block, regardless of the lock used to guard the block. Even if the variable is not volatile. Which means it’s possible to have code that used to work when you had System.out.println() calls sprinkled throughout suddenly stop working properly when you remove those calls!

Does volatile cascade to member variables? Array elements? Items in collections?

Now let’s say instead of a primitive boolean, stopRequested is a member variable of another class, like this:

import java.util.concurrent.TimeUnit;

public class TestVolatile implements Runnable {
	private A wrapper = new A();

	public void run() {
		while(!wrapper.stopRequested) {
			// do something
		}
	}

	public void stop() {
		wrapper.stopRequested = true;
	}

	public static void main(String[] args) throws InterruptedException {
		TestVolatile tv = new TestVolatile();
		new Thread(tv, "Neverending").start();
		// let main thread sleep for 1 second before requesting
		// Neverending to stop
		TimeUnit.SECONDS.sleep(1);
		tv.stop();
	}
}

class A {
	boolean stopRequested = false;
}

Run with the -server flag, this one never stops either. But what if we mark wrapper as volatile:

	private volatile A wrapper = new A();

Does the “volatility” of the reference cascade to the member variable stopRequested as well? As it turned out, the answer looks like yes, at least on my machine. We can mark either wrapper or stopRequested as volatile, and the program will terminate in about a second.

I’m not surprised that the program terminates when I mark stopRequested as volatile. But why does marking wrapper as volatile work as well? The same thing happens when we use an ArrayList, like this:

import java.util.ArrayList;
import java.util.List;

public class TestVolatile implements Runnable {
    List<Boolean> wrapper = new ArrayList<Boolean>();
    public TestVolatile() {
        wrapper.add(Boolean.FALSE);
    }

    public void run() {
        while(wrapper.get(0) == Boolean.FALSE) {
            // do something
        }
    }

    public void stop() {
        wrapper.set(0, Boolean.TRUE);
    }

    // the rest of the code...
}

Run as it is, it never stops. If we mark the List<Boolean> wrapper as volatile, it stops pretty fast.

I wonder why? The object reference to that List instance itself never changes. We’re only changing an element within the List, and not the List itself. There’s no hidden synchronized block. Why does the Neverending thread see the up-to-date value of the element?

Above and beyond the call of duty

Before JSR 133, if thread A writes to volatile field f, thread B is guaranteed to see the new value of f only. Nothing else is guaranteed. With JSR 133 though, volatile is closer to synchronization than it was.  Reading/writing a volatile field now is like acquiring/releasing a monitor in terms of visibility. As the excellent FAQ says: “… anything that was visible to thread A when it writes to volatile field f becomes visible to thread B when it reads f.”
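
The guarantee quoted from the FAQ can be sketched as follows: a plain field written before a volatile write becomes visible to a thread that subsequently reads the volatile field and sees the new value. The class VolatilePublish and its members are hypothetical names for illustration, and 42 is just an arbitrary test value.

```java
// Post-JSR 133 volatile semantics: the ordinary write to "payload" happens
// before the volatile write to "ready", so a reader that sees ready == true
// must also see the payload. All names here are hypothetical.
class VolatilePublish {
    private int payload = 0;                // deliberately NOT volatile
    private volatile boolean ready = false;

    void publish(int value) {
        payload = value; // ordinary write...
        ready = true;    // ...published by the volatile write that follows it
    }

    int awaitPayload() {
        while (!ready) {  // volatile read on every iteration
            Thread.yield();
        }
        return payload;   // guaranteed to be the published value
    }

    // Runs a reader thread, publishes 42 from the calling thread, returns what the reader saw.
    static int demo() {
        VolatilePublish box = new VolatilePublish();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = box.awaitPayload(), "Reader");
        reader.setDaemon(true);
        reader.start();
        box.publish(42);
        try {
            reader.join(5000); // join also makes the reader's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The key point is the direction: the writer must perform the volatile write after the ordinary write, and the reader must perform the volatile read before the ordinary read.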

But still, all these new guarantees for volatile don’t answer the question:

public class TestVolatile implements Runnable {
	private volatile A wrapper = new A();

	public void run() {
		while(!wrapper.stopRequested) {
			// do something
		}
	}

	public void stop() {
		wrapper.stopRequested = true;
	}

	// the rest
}

Why does this work? In stop() we’re not exactly writing to wrapper. We’re just using it to change the value of stopRequested. So why does the other thread see the change?

Unfortunately, in the examples I’ve seen so far in books and countless articles, a volatile field is always a primitive, so it’s kinda hard to find the answer to my question. So I did the only thing left that I knew how to do: I asked The Concurrency Expert himself. And I was pleasantly surprised to find that he replied very quickly! Here’s what Brian Goetz said:

The Java Memory Model sets the minimum requirements for visibility, but all VMs and CPUs generally provide more visibility than the minimum.  There’s a difference between “what do I observe on machine XYZ in casual use” and “what is guaranteed.”

So there. The VM in this case just goes above and beyond what it is supposed to do. But there’s no guarantee that on another VM and another CPU, the same thing will happen.
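
If we would rather stay within what is guaranteed than lean on the VM’s generosity, one option is to put the visibility guarantee on the flag itself instead of on the reference that leads to it. The sketch below uses java.util.concurrent.atomic.AtomicBoolean, whose get() and set() have volatile read and write semantics; making the stopRequested field inside A volatile would be another way to do it. AtomicStop and stopsWithin() are names I made up for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// The flag itself carries the visibility guarantee; the reference to it is final.
// AtomicStop and stopsWithin are hypothetical names used for illustration.
class AtomicStop implements Runnable {
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);

    public void run() {
        while (!stopRequested.get()) { // get() behaves like a volatile read
            // do something
        }
    }

    public void stop() {
        stopRequested.set(true);       // set() behaves like a volatile write
    }

    // Starts the task, requests a stop, and reports whether the worker exited in time.
    static boolean stopsWithin(long timeoutMillis) {
        AtomicStop task = new AtomicStop();
        Thread worker = new Thread(task, "Neverending");
        worker.setDaemon(true); // do not keep the JVM alive if the worker never stops
        worker.start();
        try {
            Thread.sleep(100);
            task.stop();
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("stopped: " + stopsWithin(5000));
    }
}
```

The same idea applies to the List example: the element being changed, not the reference to the container, is what needs the guarantee.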

Conclusion

So here’s what I’ve learned from this little experiment:

  1. If your application is a server application, or will run on a server-class machine, remember to use the -server flag during development and testing as well to uncover potential problems as early as possible. The Server VM performs some optimizations that can expose bugs that do not manifest on the Client VM.
  2. Just because it works on your machine doesn’t mean that it’ll work on other machines running other VMs too. It’s important to know exactly what is guaranteed, and to code with those minimal guarantees in mind, instead of assuming that other VMs and OSes will be as forgiving as the ones you’re using for development.
  3. (This is closely related to #2 above.) Because VMs and CPUs generally provide more visibility than the minimum guaranteed by JSR 133, it’s good to know the extra things that they do that may mask a potential bug. For example, at least in some VMs, calling System.out.println() forces the change to a non-volatile variable to be visible to other threads because it has a synchronized block inside. This can explain a bug that appears after you’ve made a seemingly unrelated change (that removes a synchronized block from the execution path, for instance).

From http://rayfd.wordpress.com/

Published at DZone with permission of its author, Ray Djajadinata.

Comments

Riyad Kalla replied on Wed, 2008/06/25 - 5:47pm

This is getting bookmarked forever in the "Things I gotta freaking remember at 2am" folder. Thanks for writing that all up with the examples.

Alex(JAlexoid) ... replied on Wed, 2008/06/25 - 6:43pm

You could have used something other than the name TestVolatile, since it gives away the problem... and also read Java Concurrency in Practice. This is one of the BIG issues there. There is also a Google talk on the Java memory model.

Mike P(Okidoky) replied on Wed, 2008/06/25 - 9:34pm

Hilarious!  This should have been hashed out eons ago.  Kudos!

I wish there was a way to run the JVM with minimum visibility, for testing.  Last thing we want is for Java to experience the kind of instability that so inherently plagues C programs.  For WORA's sake, we need something better than just running with the -server flag.

David Voo replied on Wed, 2008/06/25 - 10:06pm

Nice write up.

Paul Bourdeaux replied on Thu, 2008/06/26 - 8:58am

Great blog post!  Definitely one to add to my favorites!
