I have a ConcurrentSkipListSet, and I'm iterating over values in this set with a for-each loop.
Another thread at some point is going to remove an element from this set.
I think I'm running into a situation where one thread removes an element that I'm yet to iterate over (or maybe I've just started to iterate over it) and so a call being made from within the loop fails.
Some code for clarity:
for(Foo foo : fooSet) {
//do stuff
//At this point in time, another thread removes this element from the set
//do some more stuff
callService(foo.getId()); // Fails
}
Reading the docs I can't work out if this is possible or not:
Iterators are weakly consistent, returning elements reflecting the state of the set at some point at or since the creation of the iterator. They do not throw ConcurrentModificationException, and may proceed concurrently with other operations.
So is this possible, and if so, what's a good way of handling this?
Thanks
Will
I think I'm running into a situation where one thread removes an element that I'm yet to iterate over (or maybe I've just started to iterate over it) and so a call being made from within the loop fails.
I don't think that's what the javadocs are saying:
Iterators are weakly consistent, returning elements reflecting the state of the set at some point at or since the creation of the iterator. They do not throw ConcurrentModificationException, and may proceed concurrently with other operations.
This is saying that you don't have to worry about someone removing from the ConcurrentSkipListSet at the same time that you are iterating across the list. There certainly is going to be a race condition as you are moving across the iterator however. Either foo gets removed right after your iterator gets it or it was removed right before and the iterator doesn't see it.
callService(foo.getId()); // this shouldn't "fail"
If foo gets returned by the iterator, your service call won't "fail" unless it is assuming that the foo is still in the list and somehow checking it. The worst case is that you might do some operations on foo and call the service with it even though it was just removed from the list by the other thread.
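To make that concrete, here is a minimal sketch (the class and method names are invented for illustration, and the removal is done from the loop body to stand in for the other thread). It shows that iterating a ConcurrentSkipListSet while elements are removed throws nothing, and that a reference already handed out by the iterator stays usable after the removal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListSet;

class WeaklyConsistentDemo {
    // Removes each element right after the iterator hands it out, simulating
    // "another thread removed it", and records what the loop body still saw.
    static List<Integer> iterateWhileRemoving(ConcurrentSkipListSet<Integer> set) {
        List<Integer> seen = new ArrayList<>();
        for (Integer value : set) {
            set.remove(value);   // no ConcurrentModificationException here
            seen.add(value);     // the reference we already hold is still valid
        }
        return seen;
    }

    public static void main(String[] args) {
        ConcurrentSkipListSet<Integer> set = new ConcurrentSkipListSet<>(List.of(1, 2, 3));
        List<Integer> seen = iterateWhileRemoving(set);
        System.out.println("seen=" + seen + ", left in set=" + set);
    }
}
```

So if callService fails, the likely cause is that the service itself expects the element to still be in the set, not the iteration.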
I've hit this problem as well with queues that are written to and read by different threads. One approach is to mark instead of remove elements that are no longer needed. You can run a cleanup iterator after you go through the whole list. You need a global lock just for removing elements from the list, and the rest of the time your code can run in parallel. Schematically it works like this:
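writer:
while (true) {
    set.add(something);
    // ... later, once something is no longer needed:
    something.markForDelete();
}
reader:
while (true) {
    // process async, no lock held
    Iterator<Item> iter = set.iterator(); // Item stands for whatever the set holds
    while (iter.hasNext()) {
        Item something = iter.next();
        ... work, check something.isMarkedForDelete() ...
    }
    // delete, sync
    globalLock.lock();
    try {
        iter = set.iterator();
        while (iter.hasNext()) {
            Item something = iter.next();
            if (something.isMarkedForDelete()) {
                set.remove(something);
            }
        }
    } finally {
        globalLock.unlock(); // release once, after the whole sweep
    }
}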
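A runnable version of the mark-then-sweep scheme might look like the sketch below. Everything in it is an assumption made for illustration: the Item type, the choice of ConcurrentLinkedQueue as the container, and the method names. The lock mirrors the "global lock just for removing" from the description; a ConcurrentLinkedQueue would tolerate concurrent removal on its own.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;

class MarkAndSweepDemo {
    // Hypothetical element type: "deleted" is only a flag until the sweep runs.
    static final class Item {
        final int id;
        private volatile boolean markedForDelete;
        Item(int id) { this.id = id; }
        void markForDelete() { markedForDelete = true; }
        boolean isMarkedForDelete() { return markedForDelete; }
    }

    private final ConcurrentLinkedQueue<Item> items = new ConcurrentLinkedQueue<>();
    private final ReentrantLock sweepLock = new ReentrantLock();

    void add(Item item) { items.add(item); }

    int size() { return items.size(); }

    // Readers only skip marked items; nothing is removed, so no lock is taken.
    int countLive() {
        int live = 0;
        for (Item item : items) {
            if (!item.isMarkedForDelete()) live++;
        }
        return live;
    }

    // Physical removal happens only here, one sweep at a time.
    void sweep() {
        sweepLock.lock();
        try {
            items.removeIf(Item::isMarkedForDelete);
        } finally {
            sweepLock.unlock();
        }
    }

    public static void main(String[] args) {
        MarkAndSweepDemo demo = new MarkAndSweepDemo();
        Item a = new Item(1);
        demo.add(a);
        demo.add(new Item(2));
        a.markForDelete();
        System.out.println(demo.countLive() + " live of " + demo.size());
        demo.sweep();
        System.out.println(demo.size() + " left after sweep");
    }
}
```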
If I spawn 2 threads on a single-core PC, can they ever access, for example, an ArrayList at the same time, so that it throws ConcurrentModificationException?
My gut tells me that although there are 2 threads, they cannot achieve true parallelism because there is a single core; the most it can do is jump from one thread to another, without executing an instruction such as arrayList.add(element) at the same time.
TL;DR: Yes
List<String> myList = new ArrayList<String>(Arrays.asList("My string"));
Iterator<String> myIterator = myList.iterator();
myList.add("Another string");
myIterator.next();
Result:
Exception in thread "main" java.util.ConcurrentModificationException
at java.base/java.util.ArrayList$Itr.checkForComodification(ArrayList.java:1042)
at java.base/java.util.ArrayList$Itr.next(ArrayList.java:996)
at com.ajax.YourClass.yourMethod(YourClass.java:134)
You shouldn’t modify the collection while iterating over it. In practice the ConcurrentModificationException usually comes (but is not guaranteed) when you call next() on an iterator after having added an element or removed one. And in practice it often happens when you add or remove an element from inside a loop iterating over the collection, as Carciganicate said in the comment.
Or as ernest_k put it so well in the comment:
"Concurrent" in ConcurrentModificationException is not really about
parallelism
Concurrency is not the same thing as parallel computing. Two activities (e.g., two threads) happen concurrently if both have been started before either one of them is finished. You don't need multiple CPUs for that to happen.
But also note what @ernest_k said in a comment: You don't even need to have more than one thread in order for your program to throw a ConcurrentModificationException. All you need to do is create an iterator for some collection, then modify the collection, and then try to continue using the iterator after you've done the modification. That is to say, you'll get the exception if you modify the collection concurrently with an iteration.
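As a small single-threaded illustration of that last point (class and method names are made up for this sketch): removing through the list while a for-each loop is running trips the fail-fast check, while removing through the iterator itself keeps the two in step.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class SingleThreadCmeDemo {
    // Structural change through the list while an iterator is live.
    static boolean removeViaListThrows() {
        List<String> names = new ArrayList<>(List.of("a", "b", "c"));
        try {
            for (String name : names) {
                if (name.equals("a")) names.remove(name);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;   // thrown by the next call to next()
        }
    }

    // The same change made through the iterator's own remove().
    static List<String> removeViaIterator() {
        List<String> names = new ArrayList<>(List.of("a", "b", "c"));
        for (Iterator<String> it = names.iterator(); it.hasNext();) {
            if (it.next().equals("a")) it.remove();
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("list.remove threw: " + removeViaListThrows());
        System.out.println("after iterator.remove: " + removeViaIterator());
    }
}
```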
I have a method similar to the one below:
public void addSubjectsToCategory() {
final List<Subject> subjectsList = new ArrayList<>(getSubjectList());
for (final Iterator<Subject> subjectIterator =
subjectsList.iterator(); subjectIterator.hasNext();) {
addToCategory(subjectIterator.next().getId());
}
}
When this runs concurrently for the same user (another instance), it sometimes throws NoSuchElementException. As I understand it, subjectIterator.next() sometimes gets executed when there are no elements left in the list. It happens while the list is only being accessed (read). Will method synchronization solve this issue?
The stack trace is:
java.util.NoSuchElementException: null
at java.util.ArrayList$Itr.next(Unknown Source)
at org.cmos.student.subject.category.CategoryManager.addSubjectsToCategory(CategoryManager.java:221)
This stack trace fails at the addToCategory(subjectIterator.next().getId()); line.
The basic rule of iterators is that underlying collection must not be modified while the iterator is being used.
If you have a single thread, there seems to be nothing wrong with this code, as long as getSubjectList() does not return null and neither addToCategory() nor getId() has some strange side effect that modifies subjectsList. Note, however, that you could rewrite the for-loop somewhat nicer (for(Subject subject: subjectsList) ...).
Judging by your code, my best guess is that you have another thread which is modifying subjectsList somewhere else. If this is the case, using a SynchronizedList will probably not solve your problem. As far as I know, synchronization only applies to List methods such as add(), remove() etc., and does not lock a collection during iteration.
In this case, adding synchronized to the method will not help either, because the other thread is doing its nasty stuff elsewhere. If these assumptions are true, your easiest and safest way is to make a separate synchronization object (i.e. Object lock = new Object()) and then put synchronized (lock) { ... } around this for loop as well as any other place in your program that modifies the collection. This will prevent the other thread from doing any modifications while this thread is iterating, and vice versa.
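A minimal sketch of that idea follows. The class, its String elements, and the method names are assumptions chosen to keep the example self-contained; the point is only that every reader and every writer takes the same lock object.

```java
import java.util.ArrayList;
import java.util.List;

class GuardedSubjects {
    private final Object lock = new Object();              // the one shared lock
    private final List<String> subjects = new ArrayList<>();

    void add(String subject) {
        synchronized (lock) { subjects.add(subject); }
    }

    void remove(String subject) {
        synchronized (lock) { subjects.remove(subject); }
    }

    // Iterates under the lock, so add/remove on other threads wait for the loop.
    int totalLength() {
        synchronized (lock) {
            int total = 0;
            for (String subject : subjects) total += subject.length();
            return total;
        }
    }

    // One writer thread adds while the caller iterates repeatedly.
    static int runDemo() {
        GuardedSubjects guarded = new GuardedSubjects();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 1000; i++) guarded.add("x");
        });
        writer.start();
        for (int i = 0; i < 100; i++) guarded.totalLength();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return guarded.totalLength();
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

Without the synchronized blocks, the repeated totalLength() calls in runDemo() could hit a ConcurrentModificationException or read a half-updated list.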
subjectIterator.hasNext();) {
--- Imagine a thread switch occurs here, at this point, between the call to hasNext() and next() methods.
addToCategory(subjectIterator.next().getId());
What could happen is the following, assuming you are at the last element in the list:
thread A calls hasNext(), the result is true;
thread switch occurs to thread B;
thread B calls hasNext(), the result is also true;
thread B calls next() and gets the next element from the list; now the list is empty because it was the last one;
thread switch occurs back to thread A;
thread A is already inside the body of the for loop, because this is where it was interrupted; it already called hasNext earlier, which was true;
so thread A calls next(), which fails now with an exception, because there are no more elements in the list.
So what you have to do in such situations, is to make the operations hasNext and next behave in an atomic way, without thread switches occurring in between.
A simple synchronization on the list solves, indeed, the problem:
public void addSubjectsToCategory() {
final List<Subject> subjectsList = getSubjectList(); // must be the shared list; locking a private copy would protect nothing
synchronized (subjectsList) {
for (final Iterator<Subject> subjectIterator =
subjectsList.iterator(); subjectIterator.hasNext();) {
addToCategory(subjectIterator.next().getId());
}
}
}
Note, however, that there may be performance implications with this approach. No other thread will be able to read or write from/to the same list until the iteration is over (but this is what you want). To solve this, you may want to move the synchronization inside the loop, just around hasNext and next. Or you may want to use more sophisticated synchronization mechanisms, such as read-write locks.
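The read-write lock option mentioned above could look roughly like this. It is a sketch under assumptions: the wrapper class and its Integer elements are invented here, and the only JDK pieces relied on are ReentrantReadWriteLock and its readLock()/writeLock() methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReadWriteGuardedList {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private final List<Integer> ids = new ArrayList<>();

    // Writers are exclusive: no reader or other writer runs at the same time.
    void add(int id) {
        rw.writeLock().lock();
        try {
            ids.add(id);
        } finally {
            rw.writeLock().unlock();
        }
    }

    // Any number of threads may iterate at once; they only exclude writers.
    int sum() {
        rw.readLock().lock();
        try {
            int total = 0;
            for (int id : ids) total += id;
            return total;
        } finally {
            rw.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReadWriteGuardedList list = new ReadWriteGuardedList();
        list.add(1);
        list.add(2);
        list.add(3);
        System.out.println(list.sum());
    }
}
```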
It sounds like one thread is calling the method and grabbing the last element while another thread is about to get the next one, so when the paused thread resumes there is nothing left. I suggest using an ArrayBlockingQueue instead of a list. Its iterator is weakly consistent, so it tolerates other threads adding or removing elements while you iterate; note that it does not block those threads during the iteration.
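public void addSubjectsToCategory() {
    final List<Subject> source = getSubjectList();
    // ArrayBlockingQueue has no collection-only constructor: it needs a
    // capacity (at least 1 and at least source.size()) and a fairness flag.
    final ArrayBlockingQueue<Subject> subjectsList =
            new ArrayBlockingQueue<>(Math.max(1, source.size()), false, source);
    for (final Iterator<Subject> subjectIterator =
            subjectsList.iterator(); subjectIterator.hasNext();) {
        addToCategory(subjectIterator.next().getId());
    }
}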
There is a bit of a wrinkle that you may have to sort out. On an ArrayBlockingQueue, take() blocks while the queue is empty and put() blocks while it is full, each waiting for another thread to insert or remove an element, respectively.
You can use Collections.synchronizedList(list) if all you need is simple per-invocation synchronization. But do note that any iteration over it must happen inside a synchronized (list) block.
As I understand it, you are adding elements to a list that might be read at the same time.
Imagine the list is empty while your other thread is reading it. Situations like this can lead to your problem: with this approach you can never be sure that the element you are trying to read has already been written to the list.
I was surprised not to see an answer involving the use of a CopyOnWriteArrayList or Guava's ImmutableList so I thought that I would add such an answer here.
Firstly, if your use case is such that you only have a few additions relative to many reads, consider using the CopyOnWriteArrayList to solve the concurrent list traversal problem. Method synchronization could solve your issue, but CopyOnWriteArrayList will likely have better performance if the number of concurrent accesses "vastly" exceeds the number of writes, as per that class's Javadoc.
Secondly, if your use case is such that you can add everything to your list upfront in a single-threaded manner and only then need to iterate across it concurrently, then consider Guava's ImmutableList class. You accomplish this by first using a standard ArrayList or a LinkedList or a builder for your ImmutableList. Once your single-threaded data entry is complete, you instantiate your ImmutableList using either ImmutableList.copyOf() or the builder's build() method. If your use case will allow for this write/read pattern, this will probably be your most performant option.
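For the first option, here is a small JDK-only sketch (class and method names are invented for illustration) of how a CopyOnWriteArrayList iterator behaves: it reads the array as it was when the iterator was created, so a write during the loop neither disturbs it nor throws.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CopyOnWriteDemo {
    // Adds to the list from inside the loop; the iterator keeps its snapshot.
    static List<String> iterateWhileAdding(CopyOnWriteArrayList<String> list) {
        List<String> seen = new ArrayList<>();
        for (String s : list) {
            list.add(s + "-copy");   // copies the backing array, no exception
            seen.add(s);
        }
        return seen;
    }

    public static void main(String[] args) {
        CopyOnWriteArrayList<String> list = new CopyOnWriteArrayList<>(List.of("a", "b"));
        System.out.println("seen=" + iterateWhileAdding(list) + ", list=" + list);
    }
}
```

Each add() here copies the whole array, which is why the class only pays off when writes are rare.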
Hope that helps.
I would like to make a suggestion that would probably solve your problem, considering that this is a concurrency issue.
If making the method addSubjectsToCategory() synchronized solves your problem, then you have located where your concurrency issue is. It is important to locate where the problem occurs, otherwise the information you provided is useless to us, we can't help you.
IF using synchronized in your method solves your problem, then consider this answer as educational or as a more elegant solution. Otherwise, share the code where you implement your threading environment, so we can have a look.
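public synchronized void addSubjectsToCategory(List<Subject> subjectsList){
    // typed iterator: with a raw Iterator, next() returns Object and getId() would not compile
    Iterator<Subject> iterator = subjectsList.iterator();
    while(iterator.hasNext())
        addToCategory(iterator.next().getId());
}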
or
//This semaphore should be used by all threads. Be careful not to create a
//different semaphore each time.
public static final Semaphore mutex = new Semaphore(1);
public void addSubjectsToCategory(List<Subject> subjectsList) throws InterruptedException {
    mutex.acquire(); // may throw InterruptedException while waiting
    try {
        Iterator<Subject> iterator = subjectsList.iterator();
        while(iterator.hasNext())
            addToCategory(iterator.next().getId());
    } finally {
        mutex.release(); // always release, even if addToCategory throws
    }
}
Synchronized is clean, tidy and elegant. You have a really small method and creating locks, imho is unnecessary.
Synchronized means that only 1 thread will be able to enter the method at a time. Which means, you should use it only if you want 1 thread active each time.
If you actually need parallel execution, then your problem is not thread-related, but has something to do with the rest of your code, which we can not see.
I learned yesterday that I've been incorrectly using collections with concurrency for many, many years.
Whenever I create a collection that needs to be accessed by more than one thread I wrap it in one of the Collections.synchronized* methods. Then, whenever mutating the collection I also wrap it in a synchronized block (I don't know why I was doing this, I must have thought I read it somewhere).
However, after reading the API more closely, it seems you need the synchronized block when iterating the collection. From the API docs (for Map):
It is imperative that the user manually synchronize on the returned map when iterating over any of its collection views:
And here's a small example:
List<O> list = Collections.synchronizedList(new ArrayList<O>());
...
synchronized(list) {
for(O o: list) { ... }
}
So, given this, I have two questions:
Why is this even necessary? The only explanation I can think of is they're using a default iterator instead of a managed thread-safe iterator, but they could have created a thread-safe iterator and fixed this mess, right?
More importantly, what is this accomplishing? By putting the iteration in a synchronized block you are preventing multiple threads from iterating at the same time. But another thread could mutate the list while iterating so how does the synchronized block help there? Wouldn't mutating the list somewhere else screw with the iteration whether it's synchronized or not? What am I missing?
Thanks for the help!
Why is this even necessary? The only explanation I can think of is
they're using a default iterator instead of a managed thread-safe
iterator, but they could have created a thread-safe iterator and fixed
this mess, right?
Iterating works with one element at a time. For the Iterator to be thread-safe, they'd need to make a copy of the collection. Failing that, any changes to the underlying Collection would affect how you iterate with unpredictable or undefined results.
More importantly, what is this accomplishing? By putting the iteration
in a synchronized block you are preventing multiple threads from
iterating at the same time. But another thread could mutate the list
while iterating so how does the synchronized block help there?
Wouldn't mutating the list somewhere else screw with the iteration
whether it's synchronized or not? What am I missing?
The methods of the object returned by synchronizedList(List) work by synchronizing on the instance. So no other thread could be adding/removing from the same List while you are inside a synchronized block on the List.
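A small sketch of that effect (the class name and the returned array are just for this illustration): a writer thread is started while the main thread already holds the wrapper's monitor, so its add() cannot run until the synchronized block is left.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedIterationDemo {
    // Returns {elements counted while holding the monitor, size after the writer ran}.
    static int[] run() {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>(List.of(1, 2, 3)));
        Thread writer = new Thread(() -> list.add(4));   // add() locks on the same wrapper
        int countedInside;
        synchronized (list) {
            writer.start();
            int count = 0;
            for (Integer ignored : list) count++;        // writer is kept out during the loop
            countedInside = count;
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { countedInside, list.size() };
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println("inside block: " + result[0] + ", afterwards: " + result[1]);
    }
}
```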
The basic case
All of the methods of the object returned by Collections.synchronizedList() are synchronized to the list object itself. Whenever a method is called from one thread, every other thread calling any method of it is blocked until the first call finishes.
So far so good.
Iterare necesse est
But that doesn't stop another thread from modifying the collection between your calls to next() on its Iterator. If that happens, your code will most likely fail with a ConcurrentModificationException. If you do the iteration in a synchronized block too, and you synchronize on the same object (i.e. the list), other threads can't call any mutator methods on the list: they have to wait until your iterating thread releases the monitor for the list object. The key is that the mutator methods are synchronized on the same object as your iteration block; this is what stops them.
We're not out of the woods yet...
Note though that while the above guarantees basic integrity, it doesn't guarantee correct behaviour at all times. You might have other parts of your code that make assumptions which don't hold up in a multi-threaded environment:
List<Object> list = Collections.synchronizedList( ... );
...
if (!list.contains( "foo" )) {
// there's nothing stopping another thread from adding "foo" here itself, resulting in two copies existing in the list
list.add( "foo" );
}
...
synchronized( list ) { //this block guarantees that "foo" will only be added once
if (!list.contains( "foo" )) {
list.add( "foo" );
}
}
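To see the synchronized version hold up under contention, here is a runnable sketch (helper names and the thread count are arbitrary choices for this example). Several threads race to add "foo", and because check and add happen under the list's monitor, exactly one of them wins.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CheckThenActDemo {
    // Check and add as one atomic step, on the monitor the wrapper itself uses.
    static void addIfMissing(List<String> list, String value) {
        synchronized (list) {
            if (!list.contains(value)) list.add(value);
        }
    }

    // Starts the given number of threads, all trying to add "foo".
    static int raceForFoo(int threads) {
        List<String> list = Collections.synchronizedList(new ArrayList<>());
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread worker = new Thread(() -> addIfMissing(list, "foo"));
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(raceForFoo(8));
    }
}
```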
Thread-safe Iterator?
As for the question about a thread-safe iterator, there is indeed a list implementation with it, it's called CopyOnWriteArrayList. It is incredibly useful but as indicated in the API doc, it is limited to a handful of use cases only, specifically when your list is only modified very rarely but iterated over so frequently (and by so many threads) that synchronizing iterations would cause a serious bottle-neck. If you use it inappropriately, it can vastly degrade the performance of your application, as each and every modification of the list creates an entire new copy.
Synchronizing on the returned list is necessary, because internal operations synchronize on a mutex, and that mutex is this, i.e. the synchronized collection itself.
Here's some relevant code from Collections, constructors for SynchronizedCollection, the root of the synchronized collection hierarchy.
SynchronizedCollection(Collection<E> c) {
if (c==null)
throw new NullPointerException();
this.c = c;
mutex = this;
}
(There is another constructor that takes a mutex, used to initialize synchronized "view" collections from methods such as subList.)
If you synchronize on the synchronized list itself, then that does prevent another thread from mutating the list while you're iterating over it.
The imperative that you synchronize on the synchronized collection itself exists because if you synchronize on anything else, then what you have imagined could happen: another thread mutating the collection while you're iterating over it, because the objects locked are different.
Sotirios Delimanolis answered your second question "What is this accomplishing?" effectively. I wanted to amplify his answer to your first question:
Why is this even necessary? The only explanation I can think of is they're using a default iterator instead of a managed thread-safe iterator, but they could have created a thread-safe iterator and fixed this mess, right?
There are several ways to approach making a "thread-safe" iterator. As is typical with software systems, there are multiple possibilities, and they offer different tradeoffs in terms of performance (liveness) and consistency. Off the top of my head I see three possibilities.
1. Lockout + Fail-fast
This is what's suggested by the API docs. If you lock the synchronized wrapper object while iterating it (and the rest of the code in the system is written correctly, so that mutation method calls also all go through the synchronized wrapper object), the iteration is guaranteed to see a consistent view of the contents of the collection. Each element will be traversed exactly once. The downside, of course, is that other threads are prevented from modifying or even reading the collection while it's being iterated.
A variation of this would use a reader-writer lock to allow reads but not writes during iteration. However, the iteration itself can mutate the collection, so this would spoil consistency for readers. You'd have to write your own wrapper to do this.
The fail-fast comes into play if the lock isn't taken around the iteration and somebody else modifies the collection, or if the lock is taken and somebody violates the locking policy. In this case if the iteration detects that the collection has been mutated out from under it, it throws ConcurrentModificationException.
2. Copy-on-write
This is the strategy employed by CopyOnWriteArrayList among others. An iterator on such a collection does not require locking, it will always show consistent results during iteration, and it will never throw ConcurrentModificationException. However, writes will always copy the entire array, which can be expensive. Perhaps more importantly, the notion of consistency is altered. The contents of the collection might have changed while you were iterating it -- more precisely, while you were iterating a snapshot of its state some time in the past -- so any decisions you might make now are potentially out of date.
3. Weakly Consistent
This strategy is employed by ConcurrentLinkedDeque and similar collections. The specification contains the definition of weakly consistent. This approach also doesn't require any locking, and iteration will never throw ConcurrentModificationException. But the consistency properties are extremely weak. For example, you might attempt to copy the contents of a ConcurrentLinkedDeque by iterating over it and adding each element encountered to a newly created List. But other threads might be modifying the deque while you're iterating it. In particular, if a thread removes an element "behind" where you've already iterated, and then adds an element "ahead" of where you're iterating, the iteration will probably observe both the removed element and the added element. The copy will thus have a "snapshot" that never actually existed at any point in time. Ya gotta admit that's a pretty weak notion of consistency.
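The copy scenario can be imitated in a single thread, as in the sketch below (names invented here; the mutation is done from the loop body to stand in for another thread). Which of the later changes the iterator picks up is not specified, so the code prints the copy rather than promising a particular result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;

class PhantomSnapshotDemo {
    // Copies the deque while it is mutated once, right after the first element.
    static List<String> copyWhileMutating(ConcurrentLinkedDeque<String> deque) {
        List<String> copy = new ArrayList<>();
        boolean mutated = false;
        for (String s : deque) {
            copy.add(s);
            if (!mutated) {
                deque.pollFirst();      // removal "behind" the iterator
                deque.addLast("new");   // addition "ahead" of it
                mutated = true;
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        ConcurrentLinkedDeque<String> deque = new ConcurrentLinkedDeque<>(List.of("a", "b", "c"));
        System.out.println("copy=" + copyWhileMutating(deque) + ", deque=" + deque);
    }
}
```

If the copy holds both "a" and "new", it describes a deque state that never existed at any single moment.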
The bottom line is that there's no simple notion of making an iterator thread safe that would "fix this mess". There are several different ways -- possibly more than I've explained here -- and they all involve differing tradeoffs. It's unlikely that any one policy will "do the right thing" in all circumstances for all programs.
I have no idea why a ConcurrentModificationException occurs when I iterate over an ArrayList. The ArrayList is method-scoped, so it should not be visible to other threads that execute the same code, at least if I have understood multithreading and variable scopes correctly.
Caused by: java.util.ConcurrentModificationException
at java.util.AbstractList$SimpleListIterator.next(AbstractList.java:64)
at com....StrategyHandler.applyStrategy(StrategyHandler.java:184)
private List<Order> applyStrategy(StorageObjectTree storageObjectTree) {
...
List<Order> finalList = new ArrayList<Order>();
for (StorageObject storageObject : storageObjectTree.getStorageObjects()) {
List<Order> currentOrders = strategy.process(storageObject);
...
if (currentOrders != null) {
Iterator<Order> iterator = currentOrders.iterator();
while (iterator.hasNext()) {
Order order = (Order) iterator.next(); // line 64
// read some values from order
}
finalList.addAll(currentOrders);
}
}
return finalList;
}
Can anybody give me an hint what could be the source of the problem?
The Javadoc for ConcurrentModificationException clearly states the conditions under which it occurs:
This exception may be thrown by methods that have detected concurrent
modification of an object when such modification is not permissible.
For example, it is not generally permissible for one thread to modify
a Collection while another thread is iterating over it. In general,
the results of the iteration are undefined under these circumstances.
Some Iterator implementations (including those of all the general
purpose collection implementations provided by the JRE) may choose to
throw this exception if this behavior is detected. Iterators that do
this are known as fail-fast iterators, as they fail quickly and
cleanly, rather that risking arbitrary, non-deterministic behavior at
an undetermined time in the future.
Note that this exception does not always indicate that an object has
been concurrently modified by a different thread. If a single thread
issues a sequence of method invocations that violates the contract of
an object, the object may throw this exception. For example, if a
thread modifies a collection directly while it is iterating over the
collection with a fail-fast iterator, the iterator will throw this
exception.
Note that fail-fast behavior cannot be guaranteed as it is, generally
speaking, impossible to make any hard guarantees in the presence of
unsynchronized concurrent modification. Fail-fast operations throw
ConcurrentModificationException on a best-effort basis. Therefore, it
would be wrong to write a program that depended on this exception for
its correctness: ConcurrentModificationException should be used only
to detect bugs.
In your case, as you said, you do not have multiple threads accessing this list. Per the second paragraph above, the exception is still possible if the single thread that is reading through the iterator is also modifying the list.
Hope this helps.
This exception occurs when you change, add, or remove values in your list while you are iterating over it. And if you use many threads at the same time...
Try to surround your if by synchronized(currentOrders) { /* YOUR LAST CODE */ }.
I'm not sure of this but try it.
Depending on the implementation of strategy.process(..), it could be that this implementation still has a reference to the List it passed back as a result. If there are multiple threads involved in this implementation, the List might be modified by one of these threads even after it is passed back as a result.
(If you know the "Future" pattern, you could perhaps imagine an implementation where a method immediately returns an empty List and adds the actual results later using another Thread)
You could try to create a new ArrayList "around" the result list and iterate over this copied list.
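A minimal sketch of that defensive copy follows (the class and method names are made up; the add() inside the loop stands in for the other thread). Bear in mind that the copy itself is not atomic: if another thread writes during the copy, the snapshot can still be inconsistent, so this narrows the window rather than closing it.

```java
import java.util.ArrayList;
import java.util.List;

class DefensiveCopyDemo {
    // Iterates a private copy, so later changes to the original cannot
    // invalidate the iterator.
    static int sumOverCopy(List<Integer> shared) {
        List<Integer> snapshot = new ArrayList<>(shared);   // copy once, up front
        int total = 0;
        for (Integer value : snapshot) {
            shared.add(value);   // simulated outside modification of the original
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> shared = new ArrayList<>(List.of(1, 2, 3));
        System.out.println("sum=" + sumOverCopy(shared) + ", original now=" + shared);
    }
}
```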
You might want to read this SO post. Basically switch and use CopyOnWriteArrayList if you can't see where the problem is coming from.
I am using a shared library in Java that returns ArrayList; as I iterate over it, a ConcurrentModificationException could be thrown and I am looking for 100% (?) guarantee to be safe. I was thinking on something like below and I'd appreciate any input.
The data_list is the ArrayList<> returned from the MT library.
boolean pass = true;
ArrayList<Something> local = new ArrayList<Something>(256);
for (int spin=0; spin<10; ++spin)
{
try {
local.addAll(data_list);
}
catch (java.util.ConcurrentModificationException ce) {
pass = false;
}
finally {
if (pass) break;
pass = true;
}
}
Assuming variable pass is true, how should I operate on local?
There is no safe way to do this. You should not catch ConcurrentModificationException.
The iterators returned by this class's iterator and listIterator methods are fail-fast: if the list is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove or add methods, the iterator will throw a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.
Some collections, like HashMap, even can enter an infinite loop when used this way. Here's an explanation of how it happens.
You should not do this. There is no correct way to do this.
Either you misunderstand how the library works, or you need to switch out your library with one written by a competent developer.
What library are you using?
You don't define exactly what you mean by safe, and don't specify what kind of modifications are being performed to the list, but in many cases it may be acceptable to iterate over it manually by index, i.e.
for (int index = 0; index < data_list.size(); index ++)
local.add(data_list.get(index));
The way I see it, there are four possible kinds of modification, with varying degrees of acceptability:
New items could be appended. This solution should work appropriately for this case, as long as the list does not grow enough to trigger a backing array expansion (and as this should happen with exponentially-reducing frequency, retrying if it occurs should be guaranteed to succeed eventually).
Existing items may be modified. This solution may not present a consistent view of the contents of the list at any given time, but it would be guaranteed to provide a usable list that is representative of items that have been in the list, which may be acceptable depending on your definition of "safe".
Items may be removed. There is a small chance this solution would fail with an IndexOutOfBoundsException, and the same caveat as for items being modified would apply with regards to consistency.
Items may be inserted into the middle of the list. The same caveat as items being modified would apply, and there would also be a danger of getting duplicated values. The problems with backing array expansion from the appending case would also apply.
You've got a bad situation here, but I think your solution is as sound as possible. The new ArrayList should go in the loop so you start fresh after each failure. Actually, the best thing might be to make your "try" line look like:
local = new ArrayList<Something>( data_list );
You don't want your ArrayList to have to expand itself because that will take time when you're trying to grab the data before the list changes. This should set the size, create it, and fill it with the least wasted effort.
You might need to catch things other than ConcurrentModification. You'll probably learn what the hard way. Or just catch Throwable.
If you want to go to extremes, run the code inside the for loop in its own thread so if it does hang you can kill it and restart it. That's going to take some work.
I think this will work, if you let "spin" get large enough.
I don't have any fundamental changes, but I think that code could be simplified a bit:
ArrayList<Something> local = new ArrayList<Something>(256);
for (int spin=0; spin<10; ++spin)
{
try {
local.addAll(data_list);
break;
}
catch (java.util.ConcurrentModificationException ce) {}
}