Several years of experience in designing and building products. My roles have spanned from writing code to leading development for core technical features of Java EE App Server (Container Development), Business Intelligence Server, Web Service Stack, and various collaboration tools. Presently work on development a BI Search Engine for Cognos Software at IBM. Committer on the open source Apache Web Services Stack – Apache CXF. Member on the JSR 303 expert group - Bean Validation spec. Bharath has posted 3 posts at DZone. View Full User Profile

Do Your Iterators Always Fail-Fast?

11.21.2009
| 9215 views |
  • submit to reddit

The iterators of the Collection implementations of the Java runtime throw a ConcurrentModificationException[1] when they detect that another thread has modified the Collection while a thread is iterating over it. Such iterators are generally called fail-fast iterators.

Looking at the implementation one can see how this has been made possible. The Collection subclass maintains an integer modCount that is incremented on every operation that structurally modifies the collection (like add, remove, clear). The fail-fast iterators also contain an integer field expectedModCount which is initialized to the modCount while the iterator is created. Later on, during every iteration, the iterator verifies if the expectedModCount is same as the modCount of the collection it is iterating over. A mismatch means that the collection has been modified during the life cycle of the iterator and a ConcurrentModificationException is thrown. The same happens during the collection serialization process. The modification count prior to writing to the stream should be same as the count after writing to the stream.

But is this behavior guaranteed? Are these iterators always fail-fast? I used to think so, until I saw the declaration of the modCount field in the Collection implementations.

The modCount field is not marked as volatile. This would mean - while a thread makes structural changes to the collection and increments the modCount field, the other thread that performs the iteration or serialization might not see the incremented value of the modCount field. It might end up doing a comparison with the stale modCount that is visible to it. This might cause the fail-fast iterators to behave in a different way and failing at a later point. The field is non-volatile in Apache Harmony too.

A bug[2] in Sun Java explains this. It says the modCount was intentionally made non-volatile. Making it volatile would have added contention at every collection modification operation because of the memory fencing that’s required.

Also the Javadoc clearly says “fail-fast behavior cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification”.

This behavior would be the same even on the Synchronized wrapper's of these collections.

[1]http://java.sun.com/j2se/1.5.0/docs/api/java/util/ConcurrentModificationException.html
[2] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6625725

 

Source: http://thoughts.bharathganesh.com

Published at DZone with permission of its author, Bharath Ganesh.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Artur Biesiadowski replied on Mon, 2009/11/23 - 3:43am

Even if modCount would be volatile, there would be a possibilty, just slightly after checking it, for getting corrupted view of internal structure. Without full synchronization, it is not possible to avoid it in simple collections. By making it non-volatile, they just make this window of opportunity longer (possibly LOT longer, maybe even seconds (minutes? forever?) instead of microseconds) - but in multithreaded environment, it doesn't matter. There is no NOW between threads, there is just order of synchronization points - everything in between can possibly run in completely different timelines. Volatile on modcount would create such sync point, but at wrong time to help in anything.

So, fail-fast on interators can fail to report error, volatile or not. Said that, on average, it is good enough even without any kind of synchronization (on the other hand, 90% of cases I have seen it happen was in single thread, where somebody holds the iterator to collection he modified directly in same thread).

Greg Brown replied on Mon, 2009/11/23 - 8:04pm

Mod count is actually used to ensure the integrity of the iterator state, not just protect against concurrent programming errors. The collection classes will also throw if a concurrent modification occurs within the same thread (e.g. removing an item from a list while iterating over it).

G

Nicolas Bousquet replied on Tue, 2009/11/24 - 11:18am

Actually even a conccurent Map or List would not fit in many concurrent cases. Simply because all "composites" operation are not "atomic"... And many time code is something like :

if (condition on the collection) {

read or write the collection

}

Depending of the case :

 - simply make a copy of the collection in a synchronized bloc and iterate over that copy . (CopyOnWriteArrayList and other can help here). Il allows you to do the actual iteration outside of a synchronized block, and then prevent dead locks. Be sure you can do so through.

- stay fully synchronised for shared state

 

In all case, always think about what you need, what you want. Synchronize just as necessary (to avoid dead lock) but NEVER less. Many time "concurrents objects" will simply not fit your case :

 - you need composite operations to be atomic

- you need to update several objects in the same atomic operation

- too many synchronized object will make dead locks. So your application is not working anymore.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.