Why is this code throwing a ConcurrentModificationException?

Below is the code. I am getting a ConcurrentModificationException in the subiter.next() call even though I am not modifying the underlying collection and it's running in a single thread.
Tree tree = partition.getTreeofThisPartition();
Set<DzExpressionHostTupel> oldSubtupels = tree.getSubscribers();
Iterator<DzExpressionHostTupel> subiter = oldSubtupels.iterator();
while (subiter.hasNext()) {
    DzExpressionHostTupel subtupel = subiter.next();
    tree.removeSubscriber(subtupel);
}

If you read https://docs.oracle.com/javase/7/docs/api/java/util/ConcurrentModificationException.html, it says:
For example, it is not generally permissible for one thread to modify a Collection while another thread is iterating over it. In general, the results of the iteration are undefined under these circumstances. Some Iterator implementations (including those of all the general purpose collection implementations provided by the JRE) may choose to throw this exception if this behavior is detected. Iterators that do this are known as fail-fast iterators, as they fail quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
Note that this exception does not always indicate that an object has been concurrently modified by a different thread. If a single thread issues a sequence of method invocations that violates the contract of an object, the object may throw this exception. For example, if a thread modifies a collection directly while it is iterating over the collection with a fail-fast iterator, the iterator will throw this exception.
(emphasis added).
I'm guessing tree.removeSubscriber(subtupel) modifies that same subscribers set: getSubscribers() presumably returns the tree's live set rather than a copy, so removing through the tree while you iterate over that set is exactly the single-threaded case described above.
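A minimal sketch of two ways around it, assuming getSubscribers() really does return the tree's live internal set and that removeSubscriber only removes the tupel from that set (both are assumptions about your Tree class):

// Option 1: iterate over a defensive copy, so removals never touch the set being iterated
for (DzExpressionHostTupel subtupel : new HashSet<>(tree.getSubscribers())) {
    tree.removeSubscriber(subtupel);
}

// Option 2: remove through the iterator itself; only safe if removeSubscriber
// does nothing beyond removing the element from this same set
Iterator<DzExpressionHostTupel> it = tree.getSubscribers().iterator();
while (it.hasNext()) {
    it.next();
    it.remove();
}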

Related

Effects of concurrent access to an unsynchronised Java ArrayList

Imagine the following scenario:
I have a standard Java ArrayList<String>.
This ArrayList<String> is accessed by multiple threads with no explicit synchronisation, subject to the following constraints:
Multiple reads may occur at the same time (possibly concurrent with the write described below). All reads call the iterator() method on the ArrayList<String> and exclusively use the returned Iterator<E> (iterators are not shared between threads). The only methods called on the Iterator<String> are hasNext() and next() (the remove() method is not called).
One thread may write to the list (possibly concurrent with reads but not concurrent with other writes). Each write only calls the add(String) and remove(Object) methods on the ArrayList<E>.
I know the following to be true:
The read threads may see outdated data.
The read threads may experience a ConcurrentModificationException.
Apart from the above two problems, can anything else go wrong?
I am looking for specific examples (of the form, if x and y are true, then z will occur, where z is bad). It has to be something that can actually happen in practice. Please provide citations where possible.
(I personally think that other failures are possible but I have been challenged to come up with specific examples of why the above scenario is not suitable for production code.)
I would expect that your readers or your writer could also get an ArrayIndexOutOfBoundsException or a NullPointerException, since they can see an inconsistent state of the underlying array.
You really are at the mercy of the detailed implementation of the ArrayList class, which can vary from environment to environment. It is possible that the JVM could implement the class with native code that could cause worse undefined behavior (a JVM crash) when reading without synchronization.
I'll add this from the text of the ConcurrentModificationException reference page:
Note that fail-fast behavior cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast operations throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: ConcurrentModificationException should be used only to detect bugs.
In short, you can't depend on nothing bad happening besides stale data and ConcurrentModificationException.
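For concreteness, here is a sketch of that scenario (the class name and loop bounds are made up): one writer thread calls add/remove while an unsynchronized reader iterates. It typically dies with ConcurrentModificationException, and because fail-fast detection is only best-effort, you may occasionally see other exceptions or silently inconsistent totals instead.

import java.util.ArrayList;
import java.util.List;

public class UnsafeSharedList {
    private static final List<String> shared = new ArrayList<>();

    public static void main(String[] args) {
        // single writer thread: calls only add(String) and remove(Object), unsynchronized
        Thread writer = new Thread(() -> {
            for (int i = 0; ; i++) {
                shared.add("item" + i);
                shared.remove("item" + (i / 2));
            }
        });
        writer.setDaemon(true);
        writer.start();

        // unsynchronized reader: uses only iterator(), hasNext() and next()
        long totalLength = 0;
        for (int attempt = 0; attempt < 1_000_000; attempt++) {
            for (String s : shared) {      // usually dies here with
                totalLength += s.length(); // ConcurrentModificationException
            }
        }
        System.out.println(totalLength);   // only reached if no exception was thrown
    }
}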

Understanding collections concurrency and Collections.synchronized*

I learned yesterday that I've been incorrectly using collections with concurrency for many, many years.
Whenever I create a collection that needs to be accessed by more than one thread, I wrap it in one of the Collections.synchronized* methods. Then, whenever mutating the collection, I also wrap it in a synchronized block (I don't know why I was doing this; I must have thought I read it somewhere).
However, after reading the API more closely, it seems you need the synchronized block when iterating the collection. From the API docs (for Map):
It is imperative that the user manually synchronize on the returned map when iterating over any of its collection views:
And here's a small example:
List<O> list = Collections.synchronizedList(new ArrayList<O>());
...
synchronized (list) {
    for (O o : list) { ... }
}
So, given this, I have two questions:
Why is this even necessary? The only explanation I can think of is they're using a default iterator instead of a managed thread-safe iterator, but they could have created a thread-safe iterator and fixed this mess, right?
More importantly, what is this accomplishing? By putting the iteration in a synchronized block you are preventing multiple threads from iterating at the same time. But another thread could mutate the list while iterating so how does the synchronized block help there? Wouldn't mutating the list somewhere else screw with the iteration whether it's synchronized or not? What am I missing?
Thanks for the help!
Why is this even necessary? The only explanation I can think of is
they're using a default iterator instead of a managed thread-safe
iterator, but they could have created a thread-safe iterator and fixed
this mess, right?
Iterating works with one element at a time. For the Iterator to be thread-safe, they'd need to make a copy of the collection. Failing that, any changes to the underlying Collection would affect how you iterate with unpredictable or undefined results.
More importantly, what is this accomplishing? By putting the iteration
in a synchronized block you are preventing multiple threads from
iterating at the same time. But another thread could mutate the list
while iterating so how does the synchronized block help there?
Wouldn't mutating the list somewhere else screw with the iteration
whether it's synchronized or not? What am I missing?
The methods of the object returned by synchronizedList(List) work by synchronizing on the instance. So no other thread could be adding/removing from the same List while you are inside a synchronized block on the List.
The basic case
All of the methods of the object returned by Collections.synchronizedList() are synchronized on the list object itself. Whenever a method is called from one thread, every other thread calling any method of it is blocked until the first call finishes.
So far so good.
Iterare necesse est
But that doesn't stop another thread from modifying the collection while you're between calls to next() on its Iterator, and if that happens, your code will fail with a ConcurrentModificationException. If you do the iteration in a synchronized block too, and you synchronize on the same object (i.e. the list), other threads are prevented from calling any mutator methods on the list; they have to wait until your iterating thread releases the monitor for the list object. The key is that the mutator methods are synchronized on the same object as your iteration block; that is what stops them.
We're not out of the woods yet...
Note though that while the above guarantees basic integrity, it doesn't guarantee correct behaviour at all times. You might have other parts of your code that make assumptions which don't hold up in a multi-threaded environment:
List<Object> list = Collections.synchronizedList( ... );
...
if (!list.contains("foo")) {
    // there's nothing stopping another thread from adding "foo" here itself,
    // resulting in two copies existing in the list
    list.add("foo");
}
...
synchronized (list) { // this block guarantees that "foo" will only be added once
    if (!list.contains("foo")) {
        list.add("foo");
    }
}
Thread-safe Iterator?
As for the question about a thread-safe iterator, there is indeed a list implementation with one: it's called CopyOnWriteArrayList. It is incredibly useful, but as indicated in the API doc, it is limited to a handful of use cases, specifically when your list is only modified very rarely but iterated over so frequently (and by so many threads) that synchronizing iterations would cause a serious bottleneck. If you use it inappropriately, it can vastly degrade the performance of your application, as each and every modification of the list creates an entirely new copy.
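A tiny sketch of that trade-off: the iterator works on a snapshot, so mutating mid-iteration never throws, but every add() copies the whole backing array.

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CopyOnWriteDemo {
    public static void main(String[] args) {
        List<String> list = new CopyOnWriteArrayList<>();
        list.add("a");
        list.add("b");

        for (String s : list) {     // the iterator sees the snapshot [a, b]
            list.add(s + "!");      // mutating mid-iteration is fine: no CME,
            System.out.println(s);  // but each add() copies the whole backing array
        }
        System.out.println(list);   // [a, b, a!, b!] -- additions not seen by the iterator
    }
}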
Synchronizing on the returned list is necessary, because internal operations synchronize on a mutex, and that mutex is this, i.e. the synchronized collection itself.
Here's some relevant code from Collections, constructors for SynchronizedCollection, the root of the synchronized collection hierarchy.
SynchronizedCollection(Collection<E> c) {
    if (c == null)
        throw new NullPointerException();
    this.c = c;
    mutex = this;
}
(There is another constructor that takes a mutex, used to initialize synchronized "view" collections from methods such as subList.)
If you synchronize on the synchronized list itself, then that does prevent another thread from mutating the list while you're iterating over it.
The imperative that you synchronize on the synchronized collection itself exists because if you synchronize on anything else, then what you have imagined could happen: another thread could mutate the collection while you're iterating over it, because the objects being locked are different.
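A short sketch of that point (the someOtherLock name is made up for illustration): the wrapper's own methods lock the list itself, so only a block that locks the same object actually excludes them.

List<String> list = Collections.synchronizedList(new ArrayList<>());
Object someOtherLock = new Object();

// WRONG: the wrapper's methods synchronize on the list, not on someOtherLock,
// so another thread can still call list.add(...) while this loop runs
synchronized (someOtherLock) {
    for (String s : list) { /* may still throw ConcurrentModificationException */ }
}

// RIGHT: same monitor as the one used internally by the synchronized wrapper,
// so other threads' add/remove calls block until the loop finishes
synchronized (list) {
    for (String s : list) { /* ... */ }
}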
Sotirios Delimanolis answered your second question "What is this accomplishing?" effectively. I wanted to amplify his answer to your first question:
Why is this even necessary? The only explanation I can think of is they're using a default iterator instead of a managed thread-safe iterator, but they could have created a thread-safe iterator and fixed this mess, right?
There are several ways to approach making a "thread-safe" iterator. As is typical with software systems, there are multiple possibilities, and they offer different tradeoffs in terms of performance (liveness) and consistency. Off the top of my head I see three possibilities.
1. Lockout + Fail-fast
This is what's suggested by the API docs. If you lock the synchronized wrapper object while iterating it (and the rest of the code in the system is written correctly, so that mutation method calls also all go through the synchronized wrapper object), the iteration is guaranteed to see a consistent view of the contents of the collection. Each element will be traversed exactly once. The downside, of course, is that other threads are prevented from modifying or even reading the collection while it's being iterated.
A variation of this would use a reader-writer lock to allow reads but not writes during iteration. However, the iteration itself can mutate the collection (for example via Iterator.remove()), so this would spoil consistency for readers. You'd have to write your own wrapper to do this.
The fail-fast comes into play if the lock isn't taken around the iteration and somebody else modifies the collection, or if the lock is taken and somebody violates the locking policy. In this case if the iteration detects that the collection has been mutated out from under it, it throws ConcurrentModificationException.
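A rough sketch of the reader-writer variation mentioned above; this is a hand-rolled wrapper, not anything provided by the JDK. Reads and iteration share the read lock, mutations take the write lock, and an iteration that wants to remove elements would have to take the write lock instead.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

public class ReadWriteGuardedList<E> {
    private final List<E> list = new ArrayList<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    public void add(E element) {
        lock.writeLock().lock();
        try {
            list.add(element);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // many threads may iterate concurrently; writers wait until all readers finish
    public void forEach(Consumer<? super E> action) {
        lock.readLock().lock();
        try {
            for (E element : list) {
                action.accept(element);
            }
        } finally {
            lock.readLock().unlock();
        }
    }
}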
2. Copy-on-write
This is the strategy employed by CopyOnWriteArrayList among others. An iterator on such a collection does not require locking; it will always show consistent results during iteration, and it will never throw ConcurrentModificationException. However, writes will always copy the entire array, which can be expensive. Perhaps more importantly, the notion of consistency is altered. The contents of the collection might have changed while you were iterating it -- more precisely, while you were iterating a snapshot of its state some time in the past -- so any decisions you might make now are potentially out of date.
3. Weakly Consistent
This strategy is employed by ConcurrentLinkedDeque and similar collections. The specification contains the definition of weakly consistent. This approach also doesn't require any locking, and iteration will never throw ConcurrentModificationException. But the consistency properties are extremely weak. For example, you might attempt to copy the contents of a ConcurrentLinkedDeque by iterating over it and adding each element encountered to a newly created List. But other threads might be modifying the deque while you're iterating it. In particular, if a thread removes an element "behind" where you've already iterated, and then adds an element "ahead" of where you're iterating, the iteration will probably observe both the removed element and the added element. The copy will thus have a "snapshot" that never actually existed at any point in time. Ya gotta admit that's a pretty weak notion of consistency.
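A small sketch of that copy scenario with ConcurrentLinkedDeque (the mutator thread's body is made up for illustration): the copy never throws, but it may capture a mixture of old and new elements that the deque never held at any single moment.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;

public class WeaklyConsistentCopy {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentLinkedDeque<Integer> deque = new ConcurrentLinkedDeque<>();
        for (int i = 0; i < 1_000; i++) {
            deque.add(i);
        }

        // mutate the deque while the main thread is copying it
        Thread mutator = new Thread(() -> {
            deque.pollFirst();      // remove "behind" the iterator
            deque.addLast(-1);      // add "ahead" of the iterator
        });
        mutator.start();

        // weakly consistent iteration: never throws ConcurrentModificationException,
        // but the copy may contain both the removed head and the newly added -1
        List<Integer> copy = new ArrayList<>();
        for (Integer n : deque) {
            copy.add(n);
        }
        mutator.join();
        System.out.println(copy.size()); // may be 999, 1000, or 1001 depending on timing
    }
}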
The bottom line is that there's no simple notion of making an iterator thread safe that would "fix this mess". There are several different ways -- possibly more than I've explained here -- and they all involve differing tradeoffs. It's unlikely that any one policy will "do the right thing" in all circumstances for all programs.

Reason for ConcurrentModificationException on ArrayLists iterator.next()

I have no idea why a ConcurrentModificationException occurs when I iterate over an ArrayList. The ArrayList is method-scoped, so it should not be visible to other threads which execute the same code. At least, that is the case if I understood multithreading and variable scopes correctly.
Caused by: java.util.ConcurrentModificationException
at java.util.AbstractList$SimpleListIterator.next(AbstractList.java:64)
at com....StrategyHandler.applyStrategy(StrategyHandler.java:184)
private List<Order> applyStrategy(StorageObjectTree storageObjectTree) {
    ...
    List<Order> finalList = new ArrayList<Order>();
    for (StorageObject storageObject : storageObjectTree.getStorageObjects()) {
        List<Order> currentOrders = strategy.process(storageObject);
        ...
        if (currentOrders != null) {
            Iterator<Order> iterator = currentOrders.iterator();
            while (iterator.hasNext()) {
                Order order = (Order) iterator.next(); // line 64
                // read some values from order
            }
            finalList.addAll(currentOrders);
        }
    }
    return finalList;
}
Can anybody give me a hint what could be the source of the problem?
If you have read the Javadoc for ConcurrentModificationException, it clearly states the conditions under which it occurs:
This exception may be thrown by methods that have detected concurrent
modification of an object when such modification is not permissible.
For example, it is not generally permissible for one thread to modify
a Collection while another thread is iterating over it. In general,
the results of the iteration are undefined under these circumstances.
Some Iterator implementations (including those of all the general
purpose collection implementations provided by the JRE) may choose to
throw this exception if this behavior is detected. Iterators that do
this are known as fail-fast iterators, as they fail quickly and
cleanly, rather than risking arbitrary, non-deterministic behavior at
an undetermined time in the future.
Note that this exception does not always indicate that an object has
been concurrently modified by a different thread. If a single thread
issues a sequence of method invocations that violates the contract of
an object, the object may throw this exception. For example, if a
thread modifies a collection directly while it is iterating over the
collection with a fail-fast iterator, the iterator will throw this
exception.
Note that fail-fast behavior cannot be guaranteed as it is, generally
speaking, impossible to make any hard guarantees in the presence of
unsynchronized concurrent modification. Fail-fast operations throw
ConcurrentModificationException on a best-effort basis. Therefore, it
would be wrong to write a program that depended on this exception for
its correctness: ConcurrentModificationException should be used only
to detect bugs.
In your case, as you said, you do not have multiple threads accessing this list. But as the second paragraph above explains, it is still possible that the single thread reading from the iterator is also writing to the list at the same time.
Hope this helps.
This exception occurs when you change/add/remove values in your list while you are iterating over it at the same time, or when several threads use it at the same time.
Try to surround your if by synchronized(currentOrders) { /* YOUR LAST CODE */ }.
I'm not sure of this but try it.
Depending on the implementation of strategy.process(..), it could be that this implementation still has a reference to the List it passed back as a result. If multiple threads are involved in that implementation, it is possible that the List is modified by one of them even after it has been passed back as a result.
(If you know the "Future" pattern, you could perhaps imagine an implementation where a method immediately returns an empty List and adds the actual results later using another Thread)
You could try to create a new ArrayList "around" the result list and iterate over this copied list.
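For example (keeping the rest of applyStrategy as posted), the defensive copy could look like this:

List<Order> processed = strategy.process(storageObject);
// copy the result so that any thread still holding a reference to the
// original list cannot affect our iteration
List<Order> currentOrders = (processed == null) ? null : new ArrayList<Order>(processed);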
You might want to read this SO post. Basically switch and use CopyOnWriteArrayList if you can't see where the problem is coming from.

What can happen if you concurrently modify unprotected Java collections?

http://docs.oracle.com/javase/7/docs/api/java/util/LinkedList.html
Note that this implementation is not synchronized. If multiple threads access a linked list concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally.
What can happen if you don't? Can you crash the JVM, cause an exception or just produce inconsistent state?
What if there is one writer but concurrent reads happen unprotected? Can you still crash and mess up the state, or just produce a inconsistent read?
Is this implementation-specific, or does the spec guarantee a certain level of security and/or atomicity?
Using an unsynchronized collection in an multithreaded environment will cause problems like dirty reads (inconsistent data state) and ConcurrentModificationException (mostly when one thread has modified the contents of the collection while another was iterating through it).
Depending on your use case, this may cause your application to crash or deadlock (for instance, when one thread is terminated by the above-mentioned uncaught exception). Even worse, it may cause dodgy problems and erroneous results which are difficult to trace. It will not crash the JVM itself, though.
I'd suggest taking a look at the java.util.concurrent package. You'll find a wide variety of thread-safe, efficient collections. Most of them have weakly consistent iterators, returning elements reflecting the state of the collection at some point at or since the creation of the iterator. This means they do not throw the ConcurrentModificationException, and may proceed concurrently with other operations.
For information regarding the Java Memory Model and its guarantees, please refer to this (well worth reading!).
The nasty thing about thread safety is that errors can occur rarely and be difficult to reproduce. Java largely delegates the handling of threads to the operating system, so the programmer is not in control of exactly how and when threads get paused and switched by the OS when a number of tasks are running simultaneously. The frequency and kinds of errors observed can differ depending on whether the CPU is single-core, dual-core, or quad-core.
It is rare that concurrency errors will "crash the system," the more likely problem is inconsistent state. However, if you are iterating through a collection using an Iterator while another Thread modifies the collection then you will get a ConcurrentModificationException. For example:
Set<String> words; // a field that can be accessed by other threads

// may throw ConcurrentModificationException
public ArrayList<String> unsafeIteration() {
    ArrayList<String> longWords = new ArrayList<>();
    for (String word : words) {
        if (word.length() > 4)
            longWords.add(word);
    }
    return longWords;
}
The implementation of Iterator attempts to detect concurrent modifications of the collection it is iterating over, but this is just a best-effort attempt to "fail-fast". Having your program fail by throwing an exception is much better than having unpredictable behavior.
The javadocs make this disclaimer:
Note that fail-fast behavior cannot be guaranteed as it is, generally
speaking, impossible to make any hard guarantees in the presence of
unsynchronized concurrent modification. Fail-fast operations throw
ConcurrentModificationException on a best-effort basis. Therefore, it
would be wrong to write a program that depended on this exception for
its correctness: ConcurrentModificationException should be used only
to detect bugs.
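Since fail-fast is only best-effort, one hedged way to make the method above safe is client-side locking; this only works on the assumption that every thread that mutates words synchronizes on that same object (and, if words were a Collections.synchronizedSet wrapper, the monitor would have to be the wrapper itself).

// safe only if every thread that mutates 'words' also synchronizes on it
public ArrayList<String> safeIteration() {
    ArrayList<String> longWords = new ArrayList<>();
    synchronized (words) {
        for (String word : words) {
            if (word.length() > 4)
                longWords.add(word);
        }
    }
    return longWords;
}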
If we are simply reading data from a collection using get then we aren't going to see this exception, but we do run the risk of inconsistent state. There may be times when this isn't a problem that requires fixing. If only one thread writes to a field and it is not vital that all threads always see the most recent up-to-date value in that field, I think you should be fine as long as you stay clear of iterators.
You may not crash the application, but your different threads will suffer from dirty-read problems.
The Javadoc of the root class of the collections framework, java.util.Collection, writes:
It is up to each collection to determine its own synchronization policy. In the absence of a stronger guarantee by the implementation, undefined behavior may result from the invocation of any method on a collection that is being mutated by another thread; this includes direct invocations, passing the collection to a method that might perform invocations, and using an existing iterator to examine the collection.
"Undefined behavior" implies that the collection may do whatever it pleases, the entire javadoc of the collections framework is null and void. For instance, an element might still be present in the collection after being removed, or not present after being added. For instance, if Thread 1 adds to a HashMap and triggers a resize while Thread 2 inserts something, the insert from Thread 2 might be lost.
However, I would be greatly surprised if lack of synchronization could crash the JVM itself.
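To make the lost-insert race described above concrete, here is a sketch (the class name is made up). The result is timing-dependent; with proper synchronization it would always print 200000.

import java.util.HashMap;
import java.util.Map;

public class LostUpdateDemo {
    public static void main(String[] args) throws InterruptedException {
        Map<Integer, Integer> map = new HashMap<>();

        // two writers insert disjoint key ranges with no synchronization
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) map.put(i, i);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 100_000; i < 200_000; i++) map.put(i, i);
        });
        t1.start();
        t2.start();
        t1.join();
        t2.join();

        // with external synchronization this would always print 200000;
        // unsynchronized, entries racing with a resize are often lost
        System.out.println(map.size());
    }
}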

Insertion on a Deque in Java

I would like to know what this tutorial means when it refers to the following bit of explanation. In particular the part which I highlighted in bold.
Insert
The addFirst and offerFirst methods insert elements at the beginning
of the Deque instance. The methods addLast and offerLast insert
elements at the end of the Deque instance. When the capacity of the
Deque instance is restricted, the preferred methods are offerFirst and
offerLast because addFirst might fail to throw an exception if it is
full.
Why would offerFirst be preferred?
Why would addFirst fail to throw an exception if it is full? Should not it be better if it guaranteed to throw an exception in those circumstances?
I think both methods are legitimate (though the offerXXX methods are more likely to be used with bounded deques).
If your code assumes that there's available space in the queue, and this assumption is critical to the correctness of the code, use addFirst/addLast. The runtime exception being thrown (IllegalStateException) is perfectly suitable for this bug scenario.
If, on the other hand, a full queue is a normal scenario, don't deal with it using exceptions. Use offerFirst/offerLast, and check the returned value.
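A minimal sketch of both styles, using LinkedBlockingDeque as one capacity-restricted Deque implementation:

import java.util.Deque;
import java.util.concurrent.LinkedBlockingDeque;

public class DequeCapacityDemo {
    public static void main(String[] args) {
        Deque<String> deque = new LinkedBlockingDeque<>(2); // capacity of 2
        deque.addFirst("a");
        deque.addFirst("b");

        // a full deque is expected here, so check the boolean instead of catching
        boolean added = deque.offerFirst("c");
        System.out.println(added); // false -- the deque is full

        try {
            deque.addFirst("c");   // assumes space is available; it isn't
        } catch (IllegalStateException e) {
            System.out.println("addFirst failed: " + e.getMessage());
        }
    }
}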
offerFirst is the preferable method if there is a risk the deque will reach capacity. If it has reached capacity, addFirst will throw an exception, whereas offerFirst returns a boolean (true/false) value to indicate whether the add was successful. offerFirst inserts the specified element at the front of this deque unless it would violate capacity restrictions. When using a capacity-restricted deque, this method is generally preferable to the addFirst(E) method, which can fail to insert an element only by throwing an exception.
Why would that be? When using the capacity-restricted version you would not want an exception thrown when adding an element fails, because you expect some failures; that is why you are offering to add instead of insisting on adding.
It means that the offerXXX methods return boolean, and the addXXX methods don't.
So it is recommending that you use offerXXX and check the boolean for success, rather than expecting an exception to be thrown from either method.
It's very badly worded. And so is the Javadoc.
According to Docs:
offerFirst:
Inserts the specified element at the front of this deque unless it would violate capacity restrictions. When using a capacity-restricted deque, this method is generally preferable to the addFirst(E) method, which can fail to insert an element only by throwing an exception.
Which means if you use addFirst with a capacity restricted deque, it may throw an exception, but using offerFirst won't throw any exception.
offerLast
Inserts the specified element at the end of this deque unless it would violate capacity restrictions. When using a capacity-restricted deque, this method is generally preferable to the addLast(E) method, which can fail to insert an element only by throwing an exception.
Similarly if you use addLast with a capacity restricted deque, it may throw an exception, but using offerLast won't throw any exception.
