I am using a shared library in Java that returns an ArrayList; as I iterate over it, a ConcurrentModificationException could be thrown, and I am looking for a 100% (?) guarantee of safety. I was thinking of something like the code below, and I'd appreciate any input.
data_list is the ArrayList<> returned from the multithreaded (MT) library.
boolean pass = true;
ArrayList<Something> local = new ArrayList<Something>(256);
for (int spin=0; spin<10; ++spin)
{
try {
local.addAll(data_list);
}
catch (java.util.ConcurrentModificationException ce) {
pass = false;
}
finally {
if (pass) break;
pass = true;
}
}
Assuming variable pass is true, how should I operate on local?
There is no safe way to do this. You should not catch ConcurrentModificationException.
The iterators returned by this class's iterator and listIterator methods are fail-fast: if the list is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove or add methods, the iterator will throw a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.
Some collections, like HashMap, can even enter an infinite loop when used this way. Here's an explanation of how it happens.
You should not do this. There is no correct way to do this.
Either you misunderstand how the library works, or you need to switch out your library with one written by a competent developer.
What library are you using?
You don't define exactly what you mean by safe, and don't specify what kind of modifications are being performed to the list, but in many cases it may be acceptable to iterate over it manually by index, i.e.
for (int index = 0; index < data_list.size(); index ++)
local.add(data_list.get(index));
The way I see it, there are four possible kinds of modification, with varying degrees of acceptability:
New items could be appended. This solution should work appropriately for this case, as long as the list does not grow enough to trigger a backing array expansion (and as this should happen with exponentially-reducing frequency, retrying if it occurs should be guaranteed to succeed eventually).
Existing items may be modified. This solution may not present a consistent view of the contents of the list at any given time, but it would be guaranteed to provide a usable list that is representative of items that have been in the list, which may be acceptable depending on your definition of "safe".
Items may be removed. There is a small chance this solution would fail with an IndexOutOfBoundsException, and the same caveat as for items being modified would apply with regards to consistency.
Items may be inserted into the middle of the list. The same caveat as items being modified would apply, and there would also be a danger of getting duplicated values. The problems with backing array expansion from the appending case would also apply.
You've got a bad situation here, but I think your solution is as sound as possible. The new ArrayList should go in the loop so you start fresh after each failure. Actually, the best thing might be to make your "try" line look like:
local = new ArrayList<Something>( data_list );
You don't want your ArrayList to have to expand itself because that will take time when you're trying to grab the data before the list changes. This should set the size, create it, and fill it with the least wasted effort.
You might need to catch things other than ConcurrentModification. You'll probably learn what the hard way. Or just catch Throwable.
If you want to go to extremes, run the code inside the for loop in its own thread so that if it does hang you can kill it and restart it. That's going to take some work.
I think this will work, if you let "spin" get large enough.
I don't have any fundamental changes, but I think that code could be simplified a bit:
ArrayList<Something> local = new ArrayList<Something>(256);
for (int spin=0; spin<10; ++spin)
{
try {
local.addAll(data_list);
break;
}
catch (java.util.ConcurrentModificationException ce) {}
}
Related
I wanted to avoid syncing so I tried to use iterators. The only place where I modify array looks as follows:
if (lastSegment.trackpoints.size() > maxPoints)
{
ListIterator<TrackPoint> points = lastSegment.trackpoints.listIterator();
points.next();
points.remove();
}
ListIterator<TrackPoint> points = lastSegment.trackpoints.listIterator(lastSegment.trackpoints.size());
points.add(lastTrackPoint);
And array traversal looks as follows:
for (Iterator<Track.TrackSegment> segments = track.getSegments().iterator(); segments.hasNext();)
{
Track.TrackSegment segment = segments.next();
for (Iterator<Track.TrackPoint> points = segment.getPoints().iterator(); points.hasNext();)
{
Track.TrackPoint tp = points.next();
// ^^^ HERE I GET ConcurrentModificationException
// =============================================
...
}
}
So, what's wrong with iterators? The second-level arrays are huge, so I do not want to copy them, nor do I want to rely on synchronization outside of my Track class.
From https://docs.oracle.com/javase/7/docs/api/java/util/ConcurrentModificationException.html
For example, it is not generally permissible for one thread to modify a Collection while another thread is iterating over it. In general, the results of the iteration are undefined under these circumstances. Some Iterator implementations (including those of all the general purpose collection implementations provided by the JRE) may choose to throw this exception if this behavior is detected.
You will need to implement some sort of synchronization yourself to avoid the exception. You may consider using Collections.synchronizedList and only operating on the list through that view.
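One way to keep the locking inside the class, rather than pushing it onto callers, is to never hand out the list or its iterator and instead accept a callback that runs under the class's own lock. The following is only a minimal sketch, assuming Java 8 or later; SegmentPoints, append and forEachPoint are hypothetical names and not part of the question's Track class:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a track segment: the list never escapes,
// so all synchronization stays inside this class.
class SegmentPoints<T> {
    private final List<T> points = new ArrayList<>();
    private final int maxPoints;

    SegmentPoints(int maxPoints) {
        this.maxPoints = maxPoints;
    }

    // Writer side: drop the oldest point once the cap is reached, then append.
    synchronized void append(T point) {
        if (points.size() >= maxPoints) {
            points.remove(0);
        }
        points.add(point);
    }

    // Reader side: iterate without copying; writers block until the loop ends.
    synchronized void forEachPoint(Consumer<? super T> action) {
        for (T p : points) {
            action.accept(p);
        }
    }

    public static void main(String[] args) {
        SegmentPoints<String> segment = new SegmentPoints<>(2);
        segment.append("a");
        segment.append("b");
        segment.append("c"); // evicts "a"
        segment.forEachPoint(System.out::println);
    }
}
```

The trade-off is that a slow callback holds up the writer, and the callback must not call append on the same segment, since that would modify the list in the middle of the loop.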
I have no idea why a ConcurrentModificationException occurs when I iterate over an ArrayList. The ArrayList is method scoped, so it should not be visible to other threads that execute the same code. At least if I understood multithreading and variable scopes correctly.
Caused by: java.util.ConcurrentModificationException
at java.util.AbstractList$SimpleListIterator.next(AbstractList.java:64)
at com....StrategyHandler.applyStrategy(StrategyHandler.java:184)
private List<Order> applyStrategy(StorageObjectTree storageObjectTree) {
...
List<Order> finalList = new ArrayList<Order>();
for (StorageObject storageObject : storageObjectTree.getStorageObjects()) {
List<Order> currentOrders = strategy.process(storageObject);
...
if (currentOrders != null) {
Iterator<Order> iterator = currentOrders.iterator();
while (iterator.hasNext()) {
Order order = (Order) iterator.next(); // line 64
// read some values from order
}
finalList.addAll(currentOrders);
}
}
return finalList;
}
Can anybody give me an hint what could be the source of the problem?
If you have read the Javadoc for ConcurrentModificationException:
It clearly states the conditions under which it occurs:
This exception may be thrown by methods that have detected concurrent
modification of an object when such modification is not permissible.
For example, it is not generally permissible for one thread to modify
a Collection while another thread is iterating over it. In general,
the results of the iteration are undefined under these circumstances.
Some Iterator implementations (including those of all the general
purpose collection implementations provided by the JRE) may choose to
throw this exception if this behavior is detected. Iterators that do
this are known as fail-fast iterators, as they fail quickly and
cleanly, rather that risking arbitrary, non-deterministic behavior at
an undetermined time in the future.
Note that this exception does not always indicate that an object has
been concurrently modified by a different thread. If a single thread
issues a sequence of method invocations that violates the contract of
an object, the object may throw this exception. For example, if a
thread modifies a collection directly while it is iterating over the
collection with a fail-fast iterator, the iterator will throw this
exception.
Note that fail-fast behavior cannot be guaranteed as it is, generally
speaking, impossible to make any hard guarantees in the presence of
unsynchronized concurrent modification. Fail-fast operations throw
ConcurrentModificationException on a best-effort basis. Therefore, it
would be wrong to write a program that depended on this exception for
its correctness: ConcurrentModificationException should be used only
to detect bugs.
In your case, as you said, you do not have multiple threads accessing this list. Per the second paragraph above, the exception is still possible if the single thread that is reading through the iterator also modifies the list.
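The single-threaded case described in that second paragraph can be reproduced in a few lines. This is just an illustrative sketch; the class and method names are made up, and the exception is caught here only to show that it happens, not as a recommended pattern:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class SingleThreadCme {
    // Removes through the list while a for-each loop is iterating over it.
    // Returns true if the fail-fast iterator noticed the modification.
    static boolean removeThroughList(List<Integer> list) {
        try {
            for (Integer value : list) {
                if (value % 2 == 0) {
                    list.remove(value); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // caught only for demonstration
        }
    }

    // Removes through the iterator, which keeps its own bookkeeping in sync.
    static void removeThroughIterator(List<Integer> list) {
        for (Iterator<Integer> it = list.iterator(); it.hasNext();) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> a = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(removeThroughList(a)); // true: the iterator failed fast

        List<Integer> b = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        removeThroughIterator(b);
        System.out.println(b); // [1, 3, 5]
    }
}
```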
Hope this helps.
This exception occurs when you change, add, or remove values in your list while you are iterating over it. The same applies if many threads use the list at the same time...
Try surrounding your if block with synchronized(currentOrders) { /* YOUR LAST CODE */ }.
I'm not sure about this, but try it.
Depending on the implementation of strategy.process(..), it could be that the implementation still holds a reference to the List it passed back as a result. If multiple threads are involved in this implementation, the List might be modified by one of these threads even after it has been returned.
(If you know the "Future" pattern, you could perhaps imagine an implementation where a method immediately returns an empty List and adds the actual results later using another Thread)
You could try to create a new ArrayList "around" the result list and iterate over this copied list.
You might want to read this SO post. Basically switch and use CopyOnWriteArrayList if you can't see where the problem is coming from.
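To illustrate why CopyOnWriteArrayList sidesteps the exception: its iterator works on the array that existed when the iterator was created, so later writes never disturb it. A small sketch follows, with made-up names and with the mid-iteration append done on the same thread for simplicity:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotIteration {
    // Starts iterating, appends mid-iteration, and returns what the iterator saw.
    static List<String> iterateWhileAppending(List<String> orders) {
        List<String> seen = new ArrayList<>();
        Iterator<String> it = orders.iterator(); // the snapshot is taken here
        orders.add("late"); // with an ArrayList, the next() below would throw
        while (it.hasNext()) {
            seen.add(it.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> orders = new CopyOnWriteArrayList<>(Arrays.asList("o1", "o2"));
        System.out.println(iterateWhileAppending(orders)); // [o1, o2]
        System.out.println(orders);                        // [o1, o2, late]
    }
}
```

The cost is that every write copies the whole backing array, so this fits lists that are read far more often than they are written.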
I have a ConcurrentSkipListSet, and I'm iterating over values in this set with a for-each loop.
Another thread at some point is going to remove an element from this set.
I think I'm running into a situation where one thread removes an element that I'm yet to iterate over (or maybe I've just started to iterate over it) and so a call being made from within the loop fails.
Some code for clarity:
for(Foo foo : fooSet) {
//do stuff
//At this point in time, another thread removes this element from the set
//do some more stuff
callService(foo.getId()); // Fails
}
Reading the docs I can't work out if this is possible or not:
Iterators are weakly consistent, returning elements reflecting the state of the set at some point at or since the creation of the iterator. They do not throw ConcurrentModificationException, and may proceed concurrently with other operations.
So is this possible, and if so, what's a good way of handling this?
Thanks
Will
I think I'm running into a situation where one thread removes an element that I'm yet to iterate over (or maybe I've just started to iterate over it) and so a call being made from within the loop fails.
I don't think that's what the javadocs are saying:
Iterators are weakly consistent, returning elements reflecting the state of the set at some point at or since the creation of the iterator. They do not throw ConcurrentModificationException, and may proceed concurrently with other operations.
This is saying that you don't have to worry about someone removing from the ConcurrentSkipListSet at the same time that you are iterating over the set. There certainly is going to be a race condition as you are moving through the iterator, however. Either foo gets removed right after your iterator gets it, or it was removed right before and the iterator doesn't see it.
callService(foo.getId()); // this shouldn't "fail"
If foo gets returned by the iterator, your service call won't "fail" unless it is assuming that the foo is still in the list and somehow checking it. The worst case is that you might do some operations on foo and call the service with it even though it was just removed from the list by the other thread.
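The behaviour can be tried out without a second thread by removing an element from inside the loop, which is roughly what a concurrent remover would do from the iterator's point of view. This is only a sketch with hypothetical names; whether the removed element is still visited depends on timing, so the code makes no claim about that:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListSet;

class WeaklyConsistentDemo {
    // Iterates over the set and removes `victim` while the first element is
    // being processed, the way another thread might. No exception is thrown.
    static List<Integer> iterateAndRemove(ConcurrentSkipListSet<Integer> set, Integer victim) {
        List<Integer> visited = new ArrayList<>();
        boolean removed = false;
        for (Integer value : set) {
            if (!removed) {
                set.remove(victim);
                removed = true;
            }
            // `value` is still a perfectly usable object here, even if it
            // has just been removed from the set.
            visited.add(value);
        }
        return visited;
    }

    public static void main(String[] args) {
        ConcurrentSkipListSet<Integer> set =
                new ConcurrentSkipListSet<>(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(iterateAndRemove(set, 4)); // may or may not include 4
        System.out.println(set.contains(4));          // false
    }
}
```

Elements that were in the set when the iterator was created and are never removed are visited exactly once, so if callService fails, the cause is more likely something on the service side that assumes foo is still registered.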
I've hit this problem as well with queues that are written to and read by different threads. One approach is to mark instead of remove elements that are no longer needed. You can run a cleanup iterator after you go through the whole list. You need a global lock just for removing elements from the list, and the rest of the time your code can run in parallel. Schematically it works like this:
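// schematic only: Item, running and globalLock are placeholder names
writer:
while (running) {
    set.add(something);
    // ... later, when something is no longer needed, mark it instead of removing it
    something.markForDelete();
}
reader:
while (running) {
    // process asynchronously, without the global lock
    Iterator<Item> iter = set.iterator();
    while (iter.hasNext()) {
        Item something = iter.next();
        ... work, check something.isMarkedForDelete() ...
    }
    // delete under the global lock
    globalLock.lock();
    try {
        iter = set.iterator();
        while (iter.hasNext()) {
            if (iter.next().isMarkedForDelete()) {
                iter.remove(); // remove through the iterator, not through the set
            }
        }
    } finally {
        globalLock.unlock();
    }
}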
We all know when using Collections.synchronizedXXX (e.g. synchronizedSet()) we get a synchronized "view" of the underlying collection.
However, the documentation of these wrapper factory methods states that we have to explicitly synchronize on the collection when iterating over it with an iterator.
Which option do you choose to solve this problem?
I can only see the following approaches:
1. Do it as the documentation states: synchronize on the collection
2. Clone the collection before calling iterator()
3. Use a collection whose iterator is thread-safe (I am only aware of CopyOnWriteArrayList/Set)
And as a bonus question: when using a synchronized view - is the use of foreach/Iterable thread-safe?
You've already answered your bonus question really: no, using an enhanced for loop isn't safe - because it uses an iterator.
As for which is the most appropriate approach - it really depends on your context:
Are writes very infrequent? If so, CopyOnWriteArrayList may be most appropriate.
Is the collection reasonably small, and the iteration quick? (i.e. you're not doing much work in the loop) If so, synchronizing may well be fine - especially if this doesn't happen too often (i.e. you won't have much contention for the collection).
If you're doing a lot of work and don't want to block other threads working at the same time, the hit of cloning the collection may well be acceptable.
Depends on your access model. If you have low concurrency and frequent writes, option 1 will have the best performance. If you have high concurrency and infrequent writes, option 3 will have the best performance. Option 2 is going to perform badly in almost all cases.
foreach calls iterator(), so exactly the same things apply.
You could use one of the newer collections added in Java 5.0 which support concurrent access while iterating. Another approach is to take a copy using toArray which is thread safe (during the copy).
Collection<String> words = ...
// enhanced for loop over an array.
for(String word: words.toArray(new String[0])) {
}
I might be totally off with your requirements, but if you are not aware of them, check out google-collections with "Favor immutability" in mind.
I suggest dropping Collections.synchronizedXXX and handle all locking uniformly in the client code. The basic collections don't support the sort of compound operations useful in threaded code, and even if you use java.util.concurrent.* the code is more difficult. I suggest keeping as much code as possible thread-agnostic. Keep difficult and error-prone thread-safe (if we are very lucky) code to a minimum.
All three of your options will work. Choosing the right one for your situation will depend on what your situation is.
CopyOnWriteArrayList will work if you want a list implementation and you don't mind the underlying storage being copied every time you write. This is pretty good for performance as long as you don't have very big collections.
ConcurrentHashMap or "ConcurrentHashSet" (using Collections.newSetFromMap) will work if you need a Map or Set interface; obviously you don't get random access this way. One great thing about these two is that they will work well with large data sets - when mutated they just copy little bits of the underlying data storage.
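For reference, the "ConcurrentHashSet" mentioned above can be built as follows. This is a minimal sketch; ConcurrentSetDemo, newConcurrentSet and drain are hypothetical names, and the removal inside the loop stands in for what another thread might do:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentSetDemo {
    // A Set view backed by a ConcurrentHashMap.
    static <T> Set<T> newConcurrentSet() {
        return Collections.newSetFromMap(new ConcurrentHashMap<T, Boolean>());
    }

    // Removes each element as it is visited. With a HashSet this would throw
    // ConcurrentModificationException; here the iterator is weakly consistent.
    static int drain(Set<String> set) {
        int visited = 0;
        for (String s : set) {
            set.remove(s);
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        Set<String> words = newConcurrentSet();
        words.addAll(Arrays.asList("a", "b", "c"));
        System.out.println(drain(words));    // number of elements visited
        System.out.println(words.isEmpty()); // true
    }
}
```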
It depends on the result you need. Cloning, copying, toArray(), new ArrayList(..) and the like obtain a snapshot and do not keep the collection locked.
Using synchronized(collection) around the whole iteration ensures that no modification happens until the iteration ends, i.e. it effectively locks the collection.
Side note: toArray() is usually preferred (with some exceptions, when internally it needs to create a temporary ArrayList). Also note that, when using Collections.synchronizedXXX, anything other than toArray() should be wrapped in synchronized(collection) as well.
This question is rather old (sorry, I am a bit late), but I still want to add my answer.
I would choose your second option (i.e. clone the collection before calling iterator()), but with a major twist.
Assuming you want to iterate using an iterator, you do not have to copy the collection before calling .iterator(), which would sort of negate (I am using the term "negate" loosely) the idea of the iterator pattern; instead, you could write a "ThreadSafeIterator".
It would work on the same premise, copying the collection, but without letting the iterating class know that you did just that. Such an iterator might look like this:
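import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.Queue;

class ThreadSafeIterator<T> implements Iterator<T> {
    // Private snapshot of the source; iteration never touches the source itself.
    private final Queue<T> clients;
    private T currentElement;
    private final Collection<T> source;

    ThreadSafeIterator(final Collection<T> collection) {
        // Copy under the collection's own lock, which is the mutex that
        // the Collections.synchronizedXXX wrappers use.
        synchronized (collection) {
            clients = new LinkedList<>(collection);
        }
        this.source = collection;
    }

    @Override
    public boolean hasNext() {
        // Note: a null element in the snapshot would end the iteration early.
        return clients.peek() != null;
    }

    @Override
    public T next() {
        currentElement = clients.poll();
        return currentElement;
    }

    @Override
    public void remove() {
        // Removes from the source, not from the snapshot.
        synchronized (source) {
            source.remove(currentElement);
        }
    }
}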
Taking this a step further, you might use the Semaphore class to ensure thread safety or something. But take the remove method with a grain of salt.
The point is, by using such an iterator, no one, neither the iterating nor the iterated class (is that a real word?), has to worry about thread safety.
In a multi-threaded application I'm working on, we occasionally see ConcurrentModificationExceptions on our Lists (which are mostly ArrayLists, sometimes Vectors). But there are other times when I think concurrent modifications are happening because iterating through the collection appears to be missing items, but no exceptions are thrown. I know that the docs for ConcurrentModificationException say you can't rely on it, but how would I go about making sure I'm not concurrently modifying a List? And is wrapping every access to the collection in a synchronized block the only way to prevent it?
Update: Yes, I know about Collections.synchronizedCollection, but it doesn't guard against somebody modifying the collection while you're iterating through it. I think at least some of my problem is happening when somebody adds something to a collection while I'm iterating through it.
Second Update If somebody wants to combine the mention of the synchronizedCollection and cloning like Jason did with a mention of the java.util.concurrent and the apache collections frameworks like jacekfoo and Javamann did, I can accept an answer.
Depending on your update frequency one of my favorites is the CopyOnWriteArrayList or CopyOnWriteArraySet. They create a new list/set on updates to avoid concurrent modification exception.
Your original question seems to be asking for an iterator that sees live updates to the underlying collection while remaining thread-safe. This is an incredibly expensive problem to solve in the general case, which is why none of the standard collection classes do it.
There are lots of ways of achieving partial solutions to the problem, and in your application, one of those may be sufficient.
Jason gives a specific way to achieve thread safety, and to avoid throwing a ConcurrentModificationException, but only at the expense of liveness.
Javamann mentions two specific classes in the java.util.concurrent package that solve the same problem in a lock-free way, where scalability is critical. These only shipped with Java 5, but there have been various projects that backport the functionality of the package into earlier Java versions, including this one, though they won't have such good performance in earlier JREs.
If you are already using some of the Apache Commons libraries, then as jacekfoo points out, the apache collections framework contains some helpful classes.
You might also consider looking at the Google collections framework.
Check out java.util.concurrent for versions of the standard Collections classes that are engineered to handle concurrency better.
Yes you have to synchronize access to collections objects.
Alternatively, you can use the synchronized wrappers around any existing object. See Collections.synchronizedCollection(). For example:
List<String> safeList = Collections.synchronizedList( originalList );
However all code needs to use the safe version, and even so iterating while another thread modifies will result in problems.
To solve the iteration problem, copy the list first (the List interface has no clone() method, so use the copy constructor). Example:
for ( String el : new ArrayList<String>( safeList ) )
{ ... }
For more optimized, thread-safe collections, also look at java.util.concurrent.
Usually you get a ConcurrentModificationException if you're trying to remove an element from a list whilst it's being iterated through.
The easiest way to test this is:
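List<Blah> list = new ArrayList<Blah>();
list.add(new Blah()); // the list needs elements, or the loop body never runs
list.add(new Blah());
list.add(new Blah());
for (Blah blah : list) {
list.remove(blah); // the following next() call throws the exception
}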
I'm not sure how you'd get around it. You may have to implement your own thread-safe list, or you could create copies of the original list for writing and have a synchronized class that writes to the list.
You could try using defensive copying so that modifications to one List don't affect others.
Wrapping accesses to the collection in a synchronized block is the correct way to do this. Standard programming practice dictates the use of some sort of locking mechanism (semaphore, mutex, etc) when dealing with state that is shared across multiple threads.
Depending on your use case however you can usually make some optimizations to only lock in certain cases. For example, if you have a collection that is frequently read but rarely written, then you can allow concurrent reads but enforce a lock whenever a write is in progress. Concurrent reads only cause conflicts if the collection is in the process of being modified.
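The read-mostly optimization described above could be sketched with a ReentrantReadWriteLock: any number of readers may iterate at once, while a writer waits for them and then excludes everyone. The class name ReadMostlyList and its methods are assumptions for illustration only, and Java 8 or later is assumed for Consumer:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

class ReadMostlyList<T> {
    private final List<T> items = new ArrayList<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Writers take the exclusive lock.
    void add(T item) {
        lock.writeLock().lock();
        try {
            items.add(item);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers share the read lock, so concurrent iterations do not block each other.
    void forEach(Consumer<? super T> action) {
        lock.readLock().lock();
        try {
            for (T item : items) {
                action.accept(item);
            }
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReadMostlyList<Integer> list = new ReadMostlyList<>();
        list.add(1);
        list.add(2);
        list.add(3);
        int[] sum = {0};
        list.forEach(v -> sum[0] += v);
        System.out.println(sum[0]); // 6
    }
}
```

One caveat with this sketch: the callback passed to forEach must not call add on the same list, because ReentrantReadWriteLock does not allow upgrading a read lock to a write lock and the call would block forever.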
ConcurrentModificationException is best-effort because what you're asking is a hard problem. There's no good way to do this reliably without sacrificing performance besides proving that your access patterns do not concurrently modify the list.
Synchronization would likely prevent concurrent modifications, and it may be what you resort to in the end, but it can end up being costly. The best thing to do is probably to sit down and think for a while about your algorithm. If you can't come up with a lock-free solution, then resort to synchronization.
See the implementation. It basically stores an int:
transient volatile int modCount;
and that is incremented whenever there is a 'structural modification' (like remove). If the iterator detects that modCount has changed, it throws a ConcurrentModificationException.
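The mechanism can be imitated in a toy collection. The following is not the JDK source, just a stripped-down sketch (TinyList is a made-up class, Java 8 or later assumed) showing the counter and the check:

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NoSuchElementException;

class TinyList implements Iterable<String> {
    private String[] items = new String[8];
    private int size;
    private int modCount; // bumped on every structural modification

    void add(String s) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2);
        }
        items[size++] = s;
        modCount++;
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private int cursor;
            // Remembered when the iterator is created.
            private final int expectedModCount = modCount;

            @Override
            public boolean hasNext() {
                return cursor < size;
            }

            @Override
            public String next() {
                // Fail fast if the list changed since this iterator was created.
                if (modCount != expectedModCount) {
                    throw new ConcurrentModificationException();
                }
                if (cursor >= size) {
                    throw new NoSuchElementException();
                }
                return items[cursor++];
            }
        };
    }

    public static void main(String[] args) {
        TinyList list = new TinyList();
        list.add("a");
        list.add("b");
        Iterator<String> it = list.iterator();
        System.out.println(it.next()); // a
        list.add("c");                 // modCount no longer matches
        it.next();                     // throws ConcurrentModificationException
    }
}
```

Because the counter is read and written without any synchronization, another thread's change may be seen late or not at all, which is why the check is only best-effort across threads.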
Synchronizing (via Collections.synchronizedXXX) won't help, since it does not guarantee iterator safety; it only synchronizes individual reads and writes via put, get, set ...
See java.util.concurrent and the Apache collections framework (it has some classes that are optimized to work correctly in a concurrent environment when there are more reads (which are unsynchronized) than writes; see FastHashMap).
You can also synchronize on the list while iterating over it.
List<String> safeList = Collections.synchronizedList( originalList );
public void doSomething() {
synchronized(safeList){
for(String s : safeList){
System.out.println(s);
}
}
}
This will lock the list on synchronization and block all threads that try to access the list while you edit it or iterate over it. The downside is that you create a bottleneck.
This saves some memory over the .clone() method and might be faster depending on what you're doing in the iteration...
Collections.synchronizedList() will render a list nominally thread-safe and java.util.concurrent has more powerful features.
This will get rid of your concurrent modification exception. I won't speak to the efficiency however ;)
List<Blah> list = fillMyList();
List<Blah> temp = new ArrayList<Blah>();
for (Blah blah : list) {
//list.remove(blah); would throw the exception
temp.add(blah);
}
list.removeAll(temp);