Wrapped implementation of a map - java

What's the point of wrapping the map with Collections.synchronizedCollection(map), if then you have to synchronize the code while iterating?
Collection<Type> c = Collections.synchronizedCollection(myCollection);
synchronized (c) {
    for (Type e : c)
        foo(e);
}
After wrapping it, shouldn't it already be thread-safe?

What's the point of wrapping the map with Collections.synchronizedCollection(map), if then you have to synchronize the code while iterating?
To make individual operations thread-safe. (Personally I think it's a bad idea in general, but that's a different matter. It's not pointless, just limited in usefulness.)
After wrapping it, shouldn't it already be thread-safe?
For any individual operation, yes. But iteration involves many steps - and while each of those individual steps will be synchronized, the collection can be modified between steps, invalidating the iterator. Don't forget that your loop is expanded to something like:
for (Iterator<Type> iterator = c.iterator(); iterator.hasNext(); ) {
Type e = iterator.next();
...
}
If you need iteration to be thread-safe, you should use one of the collections in java.util.concurrent... while noting the caveats about what is and isn't guaranteed if the collection is modified during iteration.

After wrapping it, each individual method is thread safe, but iteration involves calling methods repeatedly (iterator, then next and hasNext on the returned Iterator) and there's no synchronization between those methods. This is why you need to synchronize your iteration.
You also need to use a synchronized collection (rather than just synchronizing around your iteration code) because otherwise the methods that add or remove items would not synchronize and therefore could make modifications while you were iterating even if you used a synchronized block.
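As a rough illustration of both points, here is a minimal sketch in which a writer thread goes through the wrapper while the reader iterates inside a synchronized block on that same wrapper. The class and method names are made up for this example; the only claim is that the element count seen by the loop matches the size read at the start of the block.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of any library.
class SyncIterationDemo {
    // Returns true if the loop saw exactly as many elements as size() reported
    // at the start of the synchronized block.
    static boolean iterateConsistently() {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < 1_000; i++) list.add(i);

        // The writer uses the wrapper, so every add() takes the wrapper's lock.
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) list.add(i);
        });
        writer.start();

        int sizeAtStart;
        int seen = 0;
        synchronized (list) {               // same monitor the wrapper's methods use
            sizeAtStart = list.size();
            for (Integer ignored : list) seen++;
        }

        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen == sizeAtStart;
    }

    public static void main(String[] args) {
        System.out.println(iterateConsistently());
    }
}
```

If the writer added to the backing ArrayList directly instead of the wrapper, the synchronized block would no longer protect the loop.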

Adding to @jonskeet's and @jule's answers, you should consider using ConcurrentHashMap (http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ConcurrentHashMap.html) which does not require locking around iteration.
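A small single-threaded sketch of the difference, assuming a made-up helper that adds one new key while iterating over the key set: a plain HashMap fails fast, while a ConcurrentHashMap finishes the loop without an exception.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class, not part of any library.
class MapIterationDemo {
    // Adds a new key during iteration; returns true if the iteration failed fast.
    static boolean failsWhenModified(Map<String, Integer> map) {
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                // The first call inserts a new mapping (a structural change);
                // later calls only overwrite the value of the same key.
                map.put("extra", 0);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap fails: " + failsWhenModified(new HashMap<>()));
        System.out.println("ConcurrentHashMap fails: " + failsWhenModified(new ConcurrentHashMap<>()));
    }
}
```

Whether the ConcurrentHashMap loop also visits the newly added key is not guaranteed either way.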

Related

ConcurrentModificationException using Iterator

I'm using an iterator to loop over a collection as follows:
Iterator<Entity> entityItr = entityList.iterator();
while (entityItr.hasNext())
{
Entity curr = entityItr.next();
for (Component c : curr.getComponents())
{
if (c instanceof PlayerControlled)
{
((PlayerControlled) c).pollKeyboard();
}
}
}
However on the following line I get a ConcurrentModificationException
Entity curr = entityItr.next();
Why is this happening when I'm not altering anything?
Many thanks
Edit - stack trace:
java.util.ConcurrentModificationException
at java.util.ArrayList$Itr.checkForComodification(Unknown Source)
at java.util.ArrayList$Itr.next(Unknown Source)
at cw.systems.Input.checkInputs(Input.java:31)
at cw.systems.Input.begin(Input.java:21)
at cw.misc.Game.render(Game.java:73)
at com.badlogic.gdx.backends.lwjgl.LwjglApplication.mainLoop(LwjglApplication.java:207)
at com.badlogic.gdx.backends.lwjgl.LwjglApplication$1.run(LwjglApplication.java:114)
You must be modifying the list either:
inside your iterator in the pollKeyboard method, without using the add or remove methods on the iterator; or
in another thread
Therefore your exception is the expected behaviour. From the docs, if you have a single thread iterating the list:
if the list is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove or add methods, the iterator will throw a ConcurrentModificationException
and if multiple threads use the list at one time:
Note that this implementation is not synchronized. If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally
Solution:
If only one thread accesses the list, make sure you modify the list only through the iterator itself: entityItr.remove(), or add() if you iterate with a ListIterator.
For the multi-threaded case you can use Collections.synchronizedList if you do not have a locking object available.
First store a single central reference to your list as:
entityList = Collections.synchronizedList(theOriginalArrayList);
And then access it (with all readers and writers) as:
synchronized (entityList) {
    // Readers might do:
    Iterator<Entity> itr = entityList.iterator();
    while (itr.hasNext()) {
        Entity e = itr.next();
        // ... do stuff ...
    }
}
There are other ways to sync multi-threaded access, including copying the list to an array (inside a sync block) and iterating it for reading, or using a ReadWrite lock. They all depend on your exact requirement.
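The copy-then-iterate option could look roughly like the sketch below. The class and method names are invented for the example, and the shared list is assumed to be a Collections.synchronizedList wrapper. The lock is held only while copying, so the loop itself cannot be disturbed by later changes to the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of any library.
class SnapshotIterationDemo {
    // Sums the elements by iterating a snapshot taken under the list's lock.
    // Assumes 'shared' was created with Collections.synchronizedList.
    static int sumViaSnapshot(List<Integer> shared) {
        Integer[] snapshot;
        synchronized (shared) {                 // hold the lock only for the copy
            snapshot = shared.toArray(new Integer[0]);
        }
        int sum = 0;
        for (Integer value : snapshot) {
            shared.add(value);                  // changing the original does not affect the snapshot
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> shared = Collections.synchronizedList(new ArrayList<>(Arrays.asList(1, 2, 3)));
        System.out.println(sumViaSnapshot(shared) + " " + shared.size());
    }
}
```

The trade-off is that the reader works on data that may already be out of date by the time the loop runs.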
It looks like another thread is using the same collection and modifying it while this code is iterating over it.
ConcurrentModificationException
You can use the native Java concurrent collections instead. They are thread-safe. However, it's a good habit to create immutable collections: they are thread-safe and push you toward designing reliable code.

Understanding collections concurrency and Collections.synchronized*

I learned yesterday that I've been incorrectly using collections with concurrency for many, many years.
Whenever I create a collection that needs to be accessed by more than one thread I wrap it in one of the Collections.synchronized* methods. Then, whenever mutating the collection I also wrap it in a synchronized block (I don't know why I was doing this, I must have thought I read it somewhere).
However, after reading the API more closely, it seems you need the synchronized block when iterating the collection. From the API docs (for Map):
It is imperative that the user manually synchronize on the returned map when iterating over any of its collection views:
And here's a small example:
List<O> list = Collections.synchronizedList(new ArrayList<O>());
...
synchronized(list) {
for(O o: list) { ... }
}
So, given this, I have two questions:
Why is this even necessary? The only explanation I can think of is they're using a default iterator instead of a managed thread-safe iterator, but they could have created a thread-safe iterator and fixed this mess, right?
More importantly, what is this accomplishing? By putting the iteration in a synchronized block you are preventing multiple threads from iterating at the same time. But another thread could mutate the list while iterating so how does the synchronized block help there? Wouldn't mutating the list somewhere else screw with the iteration whether it's synchronized or not? What am I missing?
Thanks for the help!
Why is this even necessary? The only explanation I can think of is
they're using a default iterator instead of a managed thread-safe
iterator, but they could have created a thread-safe iterator and fixed
this mess, right?
Iterating works with one element at a time. For the Iterator to be thread-safe, they'd need to make a copy of the collection. Failing that, any changes to the underlying Collection would affect how you iterate with unpredictable or undefined results.
More importantly, what is this accomplishing? By putting the iteration
in a synchronized block you are preventing multiple threads from
iterating at the same time. But another thread could mutate the list
while iterating so how does the synchronized block help there?
Wouldn't mutating the list somewhere else screw with the iteration
whether it's synchronized or not? What am I missing?
The methods of the object returned by synchronizedList(List) work by synchronizing on the instance. So no other thread could be adding/removing from the same List while you are inside a synchronized block on the List.
The basic case
All of the methods of the object returned by Collections.synchronizedList() are synchronized on the list object itself. Whenever a method is called from one thread, every other thread calling any method of it is blocked until the first call finishes.
So far so good.
Iterare necesse est
But that doesn't stop another thread from modifying the collection when you're between calls to next() on its Iterator. And if that happens, your code can fail with a ConcurrentModificationException. But if you do the iteration in a synchronized block too, and you synchronize on the same object (i.e. the list), other threads cannot call any mutator methods on the list; they have to wait until your iterating thread releases the monitor for the list object. The key is that the mutator methods synchronize on the same object as your iteration block, and that is what stops them.
We're not out of the woods yet...
Note though that while the above guarantees basic integrity, it doesn't guarantee correct behaviour at all times. You might have other parts of your code that make assumptions which don't hold up in a multi-threaded environment:
List<Object> list = Collections.synchronizedList( ... );
...
if (!list.contains( "foo" )) {
// there's nothing stopping another thread from adding "foo" here itself, resulting in two copies existing in the list
list.add( "foo" );
}
...
synchronized( list ) { //this block guarantees that "foo" will only be added once
if (!list.contains( "foo" )) {
list.add( "foo" );
}
}
Thread-safe Iterator?
As for the question about a thread-safe iterator, there is indeed a list implementation with one: CopyOnWriteArrayList. It is incredibly useful but, as indicated in the API doc, it is limited to a handful of use cases, specifically when your list is modified very rarely but iterated over so frequently (and by so many threads) that synchronizing iterations would cause a serious bottleneck. If you use it inappropriately, it can vastly degrade the performance of your application, as each and every modification of the list creates an entire new copy.
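A short sketch of that snapshot behaviour, using an invented helper that appends to the list while iterating over it. The iterator keeps working on the array that existed when it was created, so it sees only the original elements and never throws.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class, not part of any library.
class CopyOnWriteDemo {
    // Appends one element per iteration step and reports how many elements the loop visited.
    static int elementsSeenWhileAppending(List<Integer> list) {
        int seen = 0;
        for (Integer value : list) {
            list.add(value * 10);   // each add copies the whole backing array
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> list = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println(elementsSeenWhileAppending(list) + " " + list);
    }
}
```

The same loop on a plain ArrayList would throw ConcurrentModificationException on the second step.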
Synchronizing on the returned list is necessary, because internal operations synchronize on a mutex, and that mutex is this, i.e. the synchronized collection itself.
Here's some relevant code from Collections, constructors for SynchronizedCollection, the root of the synchronized collection hierarchy.
SynchronizedCollection(Collection<E> c) {
if (c==null)
throw new NullPointerException();
this.c = c;
mutex = this;
}
(There is another constructor that takes a mutex, used to initialize synchronized "view" collections from methods such as subList.)
If you synchronize on the synchronized list itself, then that does prevent another thread from mutating the list while you're iterating over it.
The imperative that you synchronize on the synchronized collection itself exists because if you synchronize on anything else, then what you have imagined could happen: another thread mutating the collection while you're iterating over it, because the objects locked are different.
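One way to observe that the wrapper locks on itself is the sketch below (class and method names are made up, and the 200 ms wait is an arbitrary choice). The main thread holds the wrapper's monitor, and a writer thread calling add() cannot finish until that monitor is released.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of any library.
class WrapperMonitorDemo {
    // Returns true if the writer stayed blocked while this thread held the wrapper's monitor.
    static boolean writerBlockedWhileLocked() {
        List<String> list = Collections.synchronizedList(new ArrayList<>());
        Thread writer = new Thread(() -> list.add("x"));
        boolean blocked;
        try {
            synchronized (list) {
                writer.start();
                writer.join(200);            // wait briefly; add() needs the monitor we hold
                blocked = writer.isAlive();
            }
            writer.join();                   // monitor released, so the writer can finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return blocked && list.size() == 1;
    }

    public static void main(String[] args) {
        System.out.println(writerBlockedWhileLocked());
    }
}
```

Synchronizing on some other object in the main thread would let the writer's add() run immediately.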
Sotirios Delimanolis answered your second question "What is this accomplishing?" effectively. I wanted to amplify his answer to your first question:
Why is this even necessary? The only explanation I can think of is they're using a default iterator instead of a managed thread-safe iterator, but they could have created a thread-safe iterator and fixed this mess, right?
There are several ways to approach making a "thread-safe" iterator. As is typical with software systems, there are multiple possibilities, and they offer different tradeoffs in terms of performance (liveness) and consistency. Off the top of my head I see three possibilities.
1. Lockout + Fail-fast
This is what's suggested by the API docs. If you lock the synchronized wrapper object while iterating it (and the rest of the code in the system is written correctly, so that mutation method calls also all go through the synchronized wrapper object), the iteration is guaranteed to see a consistent view of the contents of the collection. Each element will be traversed exactly once. The downside, of course, is that other threads are prevented from modifying or even reading the collection while it's being iterated.
A variation of this would use a reader-writer lock to allow reads but not writes during iteration. However, the iteration itself can mutate the collection, so this would spoil consistency for readers. You'd have to write your own wrapper to do this.
The fail-fast comes into play if the lock isn't taken around the iteration and somebody else modifies the collection, or if the lock is taken and somebody violates the locking policy. In this case if the iteration detects that the collection has been mutated out from under it, it throws ConcurrentModificationException.
2. Copy-on-write
This is the strategy employed by CopyOnWriteArrayList among others. An iterator on such a collection does not require locking, it will always show consistent results during iteration, and it will never throw ConcurrentModificationException. However, writes will always copy the entire array, which can be expensive. Perhaps more importantly, the notion of consistency is altered. The contents of the collection might have changed while you were iterating it -- more precisely, while you were iterating a snapshot of its state some time in the past -- so any decisions you might make now are potentially out of date.
3. Weakly Consistent
This strategy is employed by ConcurrentLinkedDeque and similar collections. The specification contains the definition of weakly consistent. This approach also doesn't require any locking, and iteration will never throw ConcurrentModificationException. But the consistency properties are extremely weak. For example, you might attempt to copy the contents of a ConcurrentLinkedDeque by iterating over it and adding each element encountered to a newly created List. But other threads might be modifying the deque while you're iterating it. In particular, if a thread removes an element "behind" where you've already iterated, and then adds an element "ahead" of where you're iterating, the iteration will probably observe both the removed element and the added element. The copy will thus have a "snapshot" that never actually existed at any point in time. Ya gotta admit that's a pretty weak notion of consistency.
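The copy scenario can be sketched in a single thread, with the removal and the addition happening in the middle of the iteration. The class and method names are invented. Which concurrent changes the iterator reflects is not guaranteed by the specification, so treat the exact contents of the copy as implementation-dependent; the point is only that no exception is thrown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;

// Hypothetical demo class, not part of any library.
class WeaklyConsistentDemo {
    // Copies a deque while it is modified "behind" and "ahead" of the iterator.
    static List<Integer> copyWhileMutating() {
        ConcurrentLinkedDeque<Integer> deque = new ConcurrentLinkedDeque<>(Arrays.asList(1, 2, 3));
        List<Integer> copy = new ArrayList<>();

        Iterator<Integer> it = deque.iterator();
        copy.add(it.next());        // reads the first element, 1

        deque.pollFirst();          // removes 1, which the iterator has already passed
        deque.addLast(4);           // adds 4 ahead of the iterator's position

        while (it.hasNext()) {
            copy.add(it.next());    // may or may not include 4
        }
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(copyWhileMutating());
    }
}
```

If the copy comes out containing both 1 and 4, it describes a state the deque was never in: it held 1, 2, 3 and then 2, 3, 4.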
The bottom line is that there's no simple notion of making an iterator thread safe that would "fix this mess". There are several different ways -- possibly more than I've explained here -- and they all involve differing tradeoffs. It's unlikely that any one policy will "do the right thing" in all circumstances for all programs.

How to synchronize Map between one r/w Thread and one read-only Thread?

I have a synchronized Map (via Collections.synchronizedMap()) that is read and updated by Thread A. Thread B accesses the Map only via Map.keySet() (read-only).
How should I synchronize this? The docs say that keySet() (for a Collections.synchronizedMap) "Needn't be in synchronized block". I can put Thread A's read/write access within a synchronized block, but is that even necessary?
I guess it seems odd to me to even use a synchronized Map, or a synchronized block, if Map.keySet doesn't need to be synchronized (according to the docs link above)...
Update: I missed that iteration of the keySet must be synchronized, even though retrieving the keySet does not require sync. Not particularly exciting to have the keySet without being able to look through it, so end result = synchronization required. Using a ConcurrentHashMap instead.
To make a truly read/write versus read/only locking Map wrapper, you can take a look at the wrapper the Collections uses for synchronizedMap() and replace all of the synchronized statements with a ReentrantReadWriteLock. This is a good bit of work. Instead, you should consider switching to using a ConcurrentHashMap which does all of the right things there.
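For a sense of what the ReentrantReadWriteLock approach involves, here is a deliberately small sketch with only three operations. The class name and its methods are invented for illustration; a real wrapper would have to cover the full Map interface, including its collection views.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical wrapper, not part of any library.
class ReadWriteLockedMap<K, V> {
    private final Map<K, V> delegate = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Writers take the exclusive write lock.
    V put(K key, V value) {
        lock.writeLock().lock();
        try {
            return delegate.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers share the read lock, so several can proceed at once.
    V get(K key) {
        lock.readLock().lock();
        try {
            return delegate.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Copies the keys under the read lock so callers can iterate without holding it.
    List<K> keysSnapshot() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(delegate.keySet());
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Even this reduced version shows why ConcurrentHashMap is usually the easier choice.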
In terms of keySet(), it doesn't need to be in a synchronized block because it is already being synchronized by the Collections.synchronizedMap(). The Javadoc is just pointing out that if you are iterating through the map, you need to synchronize on it because you are doing multiple operations, but you don't need to synchronize when you are getting the keySet(), which is wrapped in a SynchronizedSet class that does its own synchronization.
Lastly, your question seemed to be implying that you don't need to synchronize on something if you are just reading from it. You have to remember that synchronization not only protects against race conditions but also ensures that the data is properly shared by each of the processors. Even if you are accessing a Map as read-only, you still need to synchronize on it if any other thread is updating it.
The docs are telling you how to properly synchronize multi-step operations that need to be atomic, in this case iterating over the map:
Map m = Collections.synchronizedMap(new HashMap());
...
Set s = m.keySet(); // Needn't be in synchronized block
...
synchronized(m) { // Synchronizing on m, not s!
Iterator i = s.iterator(); // Must be in synchronized block
while (i.hasNext())
foo(i.next());
}
Note how the actual iteration must be in a synchronized block. The docs are just saying that it doesn't matter if obtaining the keySet() is in the synchronized block, because it's a live view of the Map. If the keys in the map change between the reference to the key set being obtained and the beginning of the synchronized block, the key set will reflect those changes.
And by the way, the docs you cite are only for a Map returned by Collections.synchronizedMap. The statement does not necessarily apply to all Maps.
The docs are correct. The map returned from Collections.synchronizedMap() wraps every call to the original Map in a synchronized block. The set returned by keySet() synchronizes its individual methods on the same lock, but iterating over it is a sequence of separate calls, so you must perform the whole iteration under that lock (the map itself).
Without this synchronization, there is no guarantee that Thread B will ever see any update made by Thread A.
You might want to investigate ConcurrentHashMap. It provides useful semantics for exactly this use case. Iterating over a collection view in CHM (like keySet()) gives useful concurrent behavior ("weakly consistent" iterators). You will traverse all keys that were in the collection when the iterator was created, and you may or may not see changes made after that.

Java synchronized list for loop

Documentation on synchronizedList states that,
It is imperative that the user manually synchronize on the returned list when iterating over it:
List list = Collections.synchronizedList(new ArrayList());
...
synchronized(list) {
Iterator i = list.iterator(); // Must be in synchronized block
while (i.hasNext())
foo(i.next());
}
Failure to follow this advice may result in non-deterministic behavior.
This seems pretty clear, but I just wanted to confirm that a for each loop is prohibited. For example, I cannot do something like as follows right?
List<MyType> list = Collections.synchronizedList(new ArrayList<MyType>());
...
synchronized(list){
for(MyType m : list){
foo(m);
m.doSomething();
}
}
Yes, you can - your enhanced for loop is basically the same as your code which explicitly uses the iterator. It boils down to the same code - it's just calling iterator() and then alternating between next() and hasNext() calls.
You can do that. The foreach loop compiles to (nearly) the same bytecode as the while loop. The keys are:
You synchronize the block around the loop because the list may change while you are iterating over it.
You use the list as the object that you are synchronizing on, since the implementation of this class locks on itself (through synchronized methods).
If possible, you might want to consider using immutability rather than synchronization.
http://docs.guava-libraries.googlecode.com/git-history/release09/javadoc/com/google/common/collect/ImmutableList.html
Of course you can. The only problem I see here is performance: if doSomething() or foo(m) are costly to execute, every other thread that needs the list waits for the whole loop. The size of the collection also matters, because a thread that acquires the lock and loops over a huge collection inside the synchronized block forces other threads to wait.

Iterator Concurrent Modification Exception

This code will throw Concurrent Modification Exception if the list is modified in doSomething(). Is it possible to avoid it by enclosing the code in some synchronized block?
List list = Collections.synchronizedList(new ArrayList());
// normal iteration -- can throw ConcurrentModificationException
// may require external synchronization
for (Iterator i = list.iterator(); i.hasNext(); ) {
    doSomething(i.next());
}
if you are removing an item from the list, you can do it by calling iterator.remove() instead of list.remove(iterator.next())
if you are adding an item - well, create a copy of the iterated list and add it there
if the code snippet above is part of the same method, then you don't need a synchronized list or synchronized blocks - no other thread can access the local list.
You can modify a Collection while iterating over it if you do so through the Iterator interface. You can use Iterator.remove() to remove elements.
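A minimal sketch of the difference, with invented class and method names: removing through the iterator works, while removing through the list inside a for-each loop over the same list fails fast.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, not part of any library.
class IteratorRemoveDemo {
    // Removes even numbers through the iterator, which keeps the iterator in sync with the list.
    static List<Integer> removeEvens(List<Integer> source) {
        List<Integer> list = new ArrayList<>(source);
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list;
    }

    // Removes even numbers through the list itself; returns true if that fails fast.
    static boolean removeEvensDirectlyFails(List<Integer> source) {
        List<Integer> list = new ArrayList<>(source);
        try {
            for (Integer value : list) {
                if (value % 2 == 0) {
                    list.remove(value);     // remove(Object), bypassing the iterator
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4);
        System.out.println(removeEvens(input));
        System.out.println(removeEvensDirectlyFails(input));
    }
}
```

The fail-fast check is best effort: removing the second-to-last element of an ArrayList this way can slip through without an exception, which is one more reason to go through the iterator.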
You cannot modify it while you are iterating over it. Synchronizing won't help here.
EDIT : I forgot iterator does have the remove method. So it is possible to remove.
I agree with others about Iterator and remove().
About synchronization, I wanted to add that synchronization is designed to control interactions between different threads.
It is typical for an object to have several synchronized methods, one of which calls another. So the language designers decided that a thread is never blocked by a lock it already holds.
Also, thinking about it, if a thread could block waiting for itself, you would have a magnificent prospect of starvation! ;-)
So this answers one of your questions: it is not possible to avoid the problem by synchronizing your code.
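A short sketch of why the lock does not help here (class and method names are made up): the iterating thread already owns the monitor, so its own add() call goes straight through and the iterator still detects the change.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class, not part of any library.
class ReentrantMonitorDemo {
    // Modifies the list from the iterating thread while holding the list's lock.
    // Returns true if the iteration still fails fast.
    static boolean sameThreadModificationFails() {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>(Arrays.asList(1, 2, 3)));
        try {
            synchronized (list) {
                for (Integer value : list) {
                    // Same thread, same monitor: the call re-enters the lock and is not blocked.
                    list.add(value);
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameThreadModificationFails());
    }
}
```

The synchronized block only keeps other threads out; it does nothing about changes made by the thread that holds it.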
Use CopyOnWriteArrayList instead of a synchronized ArrayList
List list = Collections.synchronizedList(new ArrayList());
synchronized (list) {
    // normal iteration -- can throw ConcurrentModificationException
    // may require external synchronization
    for (Iterator i = list.iterator(); i.hasNext(); ) {
        doSomething(i.next());
    }
}
