Performance Zone is brought to you in partnership with:

Robert Atkey is a Senior Software Engineer at Contemplate Ltd., where he architects, develops and maintains the core analysis framework. He graduated from Edinburgh University in 2006 with a PhD in Computer Science, and has since worked on practical and theoretical aspects of software, in both academia and industry. Bob has posted 1 posts at DZone. You can read more from them at their website. View Full User Profile

Detecting Deadlocks without Drudgery

07.08.2014
| 1767 views |
  • submit to reddit

Deadlocks are not nice. Your program might be happily computing away, its concurrent threads cheerily communicating with each other and the outside world, when suddenly a bit of the program just seizes up. Deadlock!

Testing for deadlock is difficult. The circumstances that lead to a deadlock may occur very infrequently. To protect against deadlocks there is really only one practical option: to design and examine code very carefully to make sure that there is no way of creating the circumstances that will lead to a deadlock.

Examining code by hand is error prone and tedious. It is much better to get a computer to do it for you. We’ll see below how Contemplate’s ThreadSafe static analysis tool can be used to tirelessly examine your code for potential deadlocks, with an example of a potential deadlock found in the K9Mail Android app.

Ingredients of a Deadlock

What are deadlocks? From the point of view of someone observing the external behaviour of a program, we might be tempted to just say that a deadlock occurs whenever the program stops responding. But in the world of concurrent programming, we mean something quite specific when we say “deadlock”. Here is a fairly general definition that applies in many cases:

Deadlocks can occur when there is a circular dependency between exclusive resource acquisitions performed by two or more concurrent threads.

This definition is abstract, but it is fairly easy to break it down into its individual ingredients. We can then build a small example that brings the definition to life. To break it down, let’s work backwards through the definition:

  1. Two or more concurrent threads: There needs to be more than one concurrent thread of execution for a deadlock to occur.
  2. Exclusive resource acquisitions: Deadlocks occur when concurrent threads attempt to gain exclusive access to some resource. Exclusive access is often required for correctness in concurrent programs to prevent multiple threads from interfering with each other, but as we shall see, it is this very exclusivity that can lead to deadlock.
  3. Circular dependency: For a deadlock to occur, the final ingredient is a circular dependency: a ring of threads, each of which has acquired access to some exclusive resource, but desires access to another’s. Due to the circularity, each thread is ultimately waiting upon itself to release its resource in order to proceed. This will never happen, so the ring deadlocks.

A Simple Deadlock

Let’s build a small prototypical example of a deadlock showing how the three ingredients fit together.

For minimality’s sake, we’ll have two threads (“A” and “B”), each of which is attempting to acquire two locks (“lock1” and “lock2”). Locks are the prototypical example of an exclusive resource: only one thread may hold a lock at a time.

To get a deadlock, it remains to get circular dependency between the lock acquisitions. The “deadly” circularity is encoded in the behaviour of the two threads, which I have written as snippets of Java code:

01.Thread A:
02.  synchronized(lock1) {
03.      synchronized(lock2) {
04.          // ... perform some action ...
05.      }
06.  }
07. 
08.Thread B:
09.  synchronized(lock2) {
10.      synchronized(lock1) {
11.          // ... perform some action ...
12.      }
13.  }

Imagine that the threads “A” and “B” are scheduled as follows:

A1. acquire lock1
                                  B1. acquire lock2
A2. acquire lock2
    *blocks*
                                  B2. acquire lock1
                                      *blocks*

Threads A and B have both blocked. They are both waiting for the resource exclusively held by the other, and cannot proceed without it. Deadlock!

In some sense, we were unlucky to get into a deadlocked state. The two threads could have been scheduled as follows:

A1. acquire lock1
A2. acquire lock2
A3. perform action
A4. release lock2
A5. release lock1
                                  B1. acquire lock2
                                  B2. acquire lock1
                                  B3. perform action
                                  B4. release lock1
                                  B5. release lock2

In this scheduling, no deadlock occurs.

It is this dependence on the order and interleaving of thread executions that makes the absence of deadlock so difficult to test for. Threads A and B always have the latent capacity to deadlock, due to the circular dependency between the threads’ acquisitions of “lock1” and “lock2”. During testing, we may only ever trigger the second scenario. When the system is under light load tasks may effectively run sequentially, with little overlap, as in the second scenario. However, as load increases, perhaps at critical moments in production, the likelihood of the two threads’ executions being interleaved as in the first scenario rises, and the latent risk of deadlock becomes a real threat.

A More Complicated Deadlock

In “Thread A” and “Thread B” above, it was pretty obvious from reading the code that there is a circular dependency between the locks lock1 and lock2.

Unfortunately, it is not always so easy to see circular dependencies in code. Have a look at the following pair of Java class definitions, which together contain the potential for a deadlock:

01.publicclassFolder {
02. 
03.    privateintunreadCount;
04. 
05.    privateList<Message> messages;
06. 
07.    publicsynchronizedvoiddecrementUnreadCount() {
08.        unreadCount--;
09.    }
10. 
11.    publicsynchronizedvoidsetAllRead() {
12.        for(Message message : messages) {
13.            message.setRead();
14.        }
15.    }
16.}

and

01.publicclassMessage {
02. 
03.    privateFolder folder;
04. 
05.    privatebooleanread;
06. 
07.    publicsynchronizedvoidsetRead() {
08.        if(!read) {
09.            read = true;
10.            folder.decrementUnreadCount();
11.        }
12.    }
13.}
Suppose, again, that we have two threads, A and B. Thread A has a reference to a Folder object, and Thread B has a reference to a Message object. Also, the Message object is contained in the Folder’s messages list, and the Message object’s folder field references the Folder object. This scenario is depicted in the following diagram:


Object Graph for a Potential Deadlock

Object Graph for a Potential Deadlock


By looking at the diagram, we can quickly see that we nearly have the three ingredients for a deadlock:

  1. There are two threads, Thread A and Thread B, that we are assuming will execute concurrently.
  2. The intrinsic locks built in to the Message and Folder objects are the exclusive resources.
  3. There is obviously a circularity in the object graph between the Folder object and the Message object. This is not necessarily a circular dependency between the two exclusive resources though.

For an actual deadlock, we need to completely fulfil requirement 3 for a deadlock. Let’s set up a scenario that, with the right scheduling, will fulfil this requirement.

Let’s assume that, concurrently, Thread A invokes the setAllRead() method on the Folder instance, and Thread B invokes the setRead() method on the Message instance. Now it is possible for the following sequence of events to occur:

  • Thread A acquires the lock on the Folder instance, because the setAllRead() method is declared as synchronized. Thread A then proceeds to work its way along the list of messages, calling setRead() on each one.
  • Thread B acquires the lock on the Message instance, because the setRead() method is declared as synchronized.
  • Thread A reaches the Message instance that Thread B has locked, and invokes setRead() on it. This method is declared as synchronized, so Thread A attempts to acquire a lock on the Message instance. It cannot acquire this lock, because Thread B already has it. Thread A now blocks, waiting for the lock to be released by Thread B.
  • Thread B invokes the decrementUnreadCount() method on the Folder instance. This method is declared as synchronized, so the Thread B attempts to acquire a lock on the Folder instance. It cannot acquire this lock because Thread A acquired it in step 1. Thread B now blocks, waiting for the lock to be released by Thread A.

Both of the threads are now blocked, waiting for the other to release the lock they are attempting to acquire. This is our undesired circular dependency. Since neither thread can make any progress, the pair of threads has entered a deadlock situation.

Finding Deadlocks: by hand, and by machine

We’ve seen two examples of potential deadlocks. Both of these required careful step-by-step reasoning to discover that there was actually a potential for deadlock lurking in the code. We wouldn’t want to have to do this by hand for every possible pair of threads, possible object graph, and possible scheduling!

Let’s see a nearly practical way to discover deadlocks by hand, and a definitely practical way to get a computer to discover deadlocks, using the static analysis tool ThreadSafe.

Finding deadlocks by hand

The trick to finding potential deadlocks is to think about an abstraction of the program, in terms of the locks the program uses and the dependencies between them. If we can find a circularity in the dependencies, then we may have found a potential deadlock.

To look for circularities, we create a lock dependency graph of all the locks and lock acquisitions in a program. We draw a node for each lock in a program, and an edge between two locks X and Y if there is a situation when lock X is held while acquiring lock Y. Cycles in this graph represent circular dependencies between locks: the situations that can lead to deadlock!

For the Folder and Message example, the graph looks like this:


Lock Dependency Graph

Lock Dependency Graph


All the intrinsic locks associated with the objects of the Folder class have been gathered together into the “Folder” node at the top left; similarly, all the intrinsic locks associated with the objects Message class have been gathered together into the “Message” node in the bottom right. Since we don’t necessarily know how many Folder and Messages instances will actually be generated at runtime, the two nodes in the graph stand in for all the possible instances.

The edge from “Folder” to “Message” arises from the call by the method Folder.setAllRead() to Message.setRead(). Since both methods are synchronized, this call represents an acquisition of a lock on a Message instance, while holding a lock on a Folder instance.

Likewise, the edge from “Message” to “Folder” arises from the call by the method Message.setRead() to Folder.decrementUnreadCount(). Again, since both methods are synchronized, this call represents an acquisition of a lock on a Folder instance, while holding a lock on a Message instance.

Just by looking at the diagram, we can see the potential for a deadlock. We have identified most of ingredients 2 and 3 for a deadlock: the locks associated with “Folder” and “Message” are exclusive resources, and there is potentially a circular dependency between them.

We can only say “potentially” because we have abstracted away the actual object graph. It may very well be the case that there will never be a pair of a Folder object and a Message object that point to each other.

Nevertheless, the potential for a deadlock is there, and the code warrants further investigation to discover whether or not there really can be a deadlock.

Finding deadlocks by machine

Making a graph of locks and their dependencies cuts down the amount of work we need to do to find deadlocks, but it is still tedious to go through each lock acquisition in turn, think about what other locks are held, and to draw the graph. Especially when a realistic program might contain many thousands of lock acquisitions.

Surely we can get the computer to do this for us?

ThreadSafe is a static analysis tool specifically designed to find concurrency defects in Java programs. Running ThreadSafe’s Eclipse plugin on the Folder and Message classes produces the following finding:


ThreadSafe reporting a potential deadlock

ThreadSafe reporting a potential deadlock


ThreadSafe has discovered the potential deadlock by analysing the graph of lock acquisitions and discovering a circularity. The locks involved are clearly marked, with the line numbers showing where each lock was acquired while another lock was held.

Getting ThreadSafe to do this clearly beats finding a piece of paper big enough to draw out the graph of all the locks in a program and the edges between them!

Once ThreadSafe has found a potential deadlock situation, we can investigate further to determine whether there is a chance that it can actually arise in practice. Maybe it won’t, but it is better to be deadlock-free than sorry!

Using ThreadSafe to find a potential deadlock in K9Mail

K9Mail is an Android email client, written in Java, that uses concurrent threads to prevent the user interface from being made unresponsive during communication with the network.

K9Mail consists of over 1000 classes. It would be impractical to go through all of them to look for potential deadlock cycles. But we can get ThreadSafe to do the work for us. Running the ThreadSafe Eclipse plugin on K9Mail produces the following deadlock warning:


ThreadSafe reporting a potential deadlock in K9Mail

ThreadSafe reporting a potential deadlock in K9Mail


From the finding report, we can see that the intrinsic locks associated with objects of the Preferences and Account classes form a circular relationship. Investigating this finding using Eclipse’s call hierarchy feature, we learn that the circularity can arise from the following pieces of code (links are to the K9Mail source code on GitHub for the version that was analysed by ThreadSafe):

  1. In the synchronized method Account.save(Preferences), there is a call on line 679 to the synchronized method Preferences.getAccounts().
  2. In the synchronized method Preferences.getAvailableAccounts(), there is a call on line 95 to the synchronized method Account.isEnabled().

This situation is very similar to the Folder and Message example we looked at above. If two threads invoke the Account.save() method and the Preferences.getAvailableAccounts() method concurrently, then there is a real chance that a deadlock will result.

It will take further investigation to determine whether or not this scenario is actually possible, but ThreadSafe has successfully narrowed down the number of lines of code that we have to look at.

Conclusions

Deadlocks are tricky. To find them, we can either wait until the scheduler picks the unlucky schedule that results in a deadlock, or we can exhaustively search our code for potential deadlocks. By using a tool like Contemplate’s ThreadSafe, we can find potential deadlocks without too much work.

ThreadSafe is capable of discovering a range of other concurrency defects, including atomicity errors arising from incorrect use of the concurrent collections framework and potential data races. The ThreadSafe technical briefing and demonstration video provide further examples of the subtle but potentially catastrophic defects that ThreadSafe can detect.

Resources

Published at DZone with permission of its author, Bob Atkey. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)