What can happen if you concurrently modify unprotected Java collections? - java

http://docs.oracle.com/javase/7/docs/api/java/util/LinkedList.html
Note that this implementation is not synchronized. If multiple threads access a linked list concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally.
What can happen if you don't? Can you crash the JVM, cause an exception or just produce inconsistent state?
What if there is one writer but concurrent reads happen unprotected? Can you still crash and mess up the state, or just produce an inconsistent read?
Is this implementation-specific, or does the spec guarantee a certain level of security and/or atomicity?

Using an unsynchronized collection in a multithreaded environment will cause problems such as dirty reads (inconsistent data state) and ConcurrentModificationException (mostly when one thread modifies the contents of the collection while another is iterating through it).
Depending on your use case, this may cause your application to crash or deadlock (when a thread is terminated because of the uncaught exception mentioned above). Even worse, it may cause subtle problems and erroneous results that are difficult to trace. It will not crash the JVM itself, though.
I'd suggest taking a look at the java.util.concurrent package. You'll find a wide variety of thread-safe, efficient collections. Most of them have weakly consistent iterators, returning elements reflecting the state of the collection at some point at or since the creation of the iterator. This means they do not throw the ConcurrentModificationException, and may proceed concurrently with other operations.
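To illustrate what "weakly consistent" means in practice, here is a minimal sketch (the class and method names are made up for this example). It removes entries from a ConcurrentHashMap while iterating over its key set, a pattern that a plain HashMap would normally reject with a ConcurrentModificationException.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WeaklyConsistentDemo {
    // Removes every entry while iterating over the key set.
    // The iterator of ConcurrentHashMap tolerates structural changes
    // made during the traversal instead of failing fast.
    static int drain(Map<String, Integer> map) {
        int visited = 0;
        for (String key : map.keySet()) {
            map.remove(key); // structural modification during iteration
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        System.out.println("visited " + drain(map) + ", remaining " + map.size());
    }
}
```

The trade-off is that such an iterator may or may not reflect changes made after it was created, so it should not be treated as a consistent snapshot.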
For information regarding the Java Memory Model and its guarantees, please refer to this (well worth reading!).

The nasty thing about thread safety is that errors can occur rarely and be difficult to reproduce. Java largely delegates the handling of Threads to the operating system, so the programmer is not in control of exactly how and when Threads get paused and switched by the OS when a number of tasks are running simultaneously. The frequency of kinds of errors observed can be different depending on whether the CPU is a single core or a dual or quad-core.
Concurrency errors rarely "crash the system"; the more likely problem is inconsistent state. However, if you are iterating through a collection using an Iterator while another Thread modifies the collection, then you are likely to get a ConcurrentModificationException. For example:
Set<String> words; // a field that can be accessed by other threads
// may throw ConcurrentModificationException
public ArrayList<String> unsafeIteration() {
    ArrayList<String> longWords = new ArrayList<>();
    for (String word : words) {
        if (word.length() > 4)
            longWords.add(word);
    }
    return longWords;
}
The implementation of Iterator attempts to detect concurrent modifications of the collection it is iterating over, but this is just a best-effort attempt to "fail-fast". Having your program fail by throwing an exception is much better than having unpredictable behavior.
The javadocs make this disclaimer:
Note that fail-fast behavior cannot be guaranteed as it is, generally
speaking, impossible to make any hard guarantees in the presence of
unsynchronized concurrent modification. Fail-fast operations throw
ConcurrentModificationException on a best-effort basis. Therefore, it
would be wrong to write a program that depended on this exception for
its correctness: ConcurrentModificationException should be used only
to detect bugs.
If we are simply reading data from a collection using get then we aren't going to see this exception, but we do run the risk of inconsistent state. There may be times when this isn't a problem that requires fixing. If only one thread writes to a field and it is not vital that all threads always see the most recent up-to-date value in that field, I think you should be fine as long as you stay clear of iterators.

You may not crash the application but your different threads will suffer from DIRTY_READ problems.

The Javadoc of the root class of the collections framework, java.util.Collection, writes:
It is up to each collection to determine its own synchronization policy. In the absence of a stronger guarantee by the implementation, undefined behavior may result from the invocation of any method on a collection that is being mutated by another thread; this includes direct invocations, passing the collection to a method that might perform invocations, and using an existing iterator to examine the collection.
"Undefined behavior" implies that the collection may do whatever it pleases; the entire javadoc of the collections framework is null and void. For instance, an element might still be present in the collection after being removed, or not present after being added. As another example, if Thread 1 adds to a HashMap and triggers a resize while Thread 2 inserts something, the insert from Thread 2 might be lost.
However, I would be greatly surprised if lack of synchronization could crash the JVM itself.

Related

Why is this code throwing a ConcurrentModificationException?

Below is the code. I am getting a ConcurrentModificationException in the subiter.next() call even though I am not modifying the underlying collection and it's running as a single thread.
Tree tree = partition.getTreeofThisPartition();
Set<DzExpressionHostTupel> oldSubtupels = tree.getSubscribers();
Iterator<DzExpressionHostTupel> subiter = oldSubtupels.iterator();
while (subiter.hasNext()) {
    DzExpressionHostTupel subtupel = subiter.next();
    tree.removeSubscriber(subtupel);
}
If you read https://docs.oracle.com/javase/7/docs/api/java/util/ConcurrentModificationException.html, it says:
For example, it is not generally permissible for one thread to modify a Collection while another thread is iterating over it. In general, the results of the iteration are undefined under these circumstances. Some Iterator implementations (including those of all the general purpose collection implementations provided by the JRE) may choose to throw this exception if this behavior is detected. Iterators that do this are known as fail-fast iterators, as they fail quickly and cleanly, rather that risking arbitrary, non-deterministic behavior at an undetermined time in the future.
Note that this exception does not always indicate that an object has been concurrently modified by a different thread. If a single thread issues a sequence of method invocations that violates the contract of an object, the object may throw this exception. For example, if a thread modifies a collection directly while it is iterating over the collection with a fail-fast iterator, the iterator will throw this exception.
(emphasis added).
I'm guessing tree.removeSubscriber(subtupel); is modifying its subscribers set.
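If that guess is right, the failure can be reproduced in a few lines. The sketch below uses a hypothetical Registry class as a stand-in for the question's Tree (its real implementation is not shown, so this is an assumption), and shows two common ways around the problem: iterating over a copy, or removing through the iterator.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class SingleThreadCme {
    // Hypothetical stand-in for the question's Tree class.
    static class Registry {
        private final Set<String> subscribers = new HashSet<>();
        Set<String> getSubscribers() { return subscribers; } // exposes the live set
        void addSubscriber(String s) { subscribers.add(s); }
        void removeSubscriber(String s) { subscribers.remove(s); }
    }

    // Mirrors the failing loop: removes through the owner while iterating the live set.
    static boolean failsWhenRemovingDuringIteration(Registry r) {
        try {
            for (String s : r.getSubscribers()) {
                r.removeSubscriber(s);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Fix 1: iterate over a snapshot, so removals do not touch the iterated collection.
    static void removeAllViaCopy(Registry r) {
        for (String s : new ArrayList<>(r.getSubscribers())) {
            r.removeSubscriber(s);
        }
    }

    // Fix 2: remove through the iterator itself, which keeps its bookkeeping in sync.
    static void removeAllViaIterator(Registry r) {
        Iterator<String> it = r.getSubscribers().iterator();
        while (it.hasNext()) {
            it.next();
            it.remove();
        }
    }
}
```

Fix 2 only works when the loop has direct access to the underlying set; if removal has to go through a method like removeSubscriber, the copy is the simpler option.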

Effects of concurrent access to an unsynchronised Java ArrayList

Imagine the following scenario:
I have a standard Java ArrayList<String>.
This ArrayList<String> is accessed by multiple threads with no explicit synchronisation, subject to the following constraints:
Multiple reads may occur at the same time (possibly concurrent with the write described below). All reads call the iterator() method on the ArrayList<String> and exclusively use the returned Iterator<E> (iterators are not shared between threads). The only methods called on the Iterator<String> are hasNext() and next() (the remove() method is not called).
One thread may write to the list (possibly concurrent with reads but not concurrent with other writes). Each write only calls the add(String) and remove(Object) methods on the ArrayList<E>.
I know the following to be true:
The read threads may see outdated data.
The read threads may experience a ConcurrentModificationException.
Apart from the above two problems, can anything else go wrong?
I am looking for specific examples (of the form, if x and y are true, then z will occur, where z is bad). It has to be something that can actually happen in practice. Please provide citations where possible.
(I personally think that other failures are possible but I have been challenged to come up with specific examples of why the above scenario is not suitable for production code.)
I would expect that your readers or your writer could get an ArrayIndexOutOfBounds exception or a null pointer exception, as well, since they can be seeing inconsistent state of the underlying array.
You really are at the mercy of the detailed implementation of the ArrayList class, which can vary from environment to environment. It is possible that a JVM could implement the class with native code that could cause worse undefined behavior (a JVM crash) when reading without synchronization.
I'll add this from the text of the ConcurrentModificationException reference page:
Note that fail-fast behavior cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast operations throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: ConcurrentModificationException should be used only to detect bugs.
In short, you can't depend on nothing bad happening besides stale data and ConcurrentModificationException.

Reason for ConcurrentModificationException on ArrayLists iterator.next()

I have no idea why a ConcurrentModificationException occurs when I iterate over an ArrayList. The ArrayList is method-scoped, so it should not be visible to other threads that execute the same code. At least if I understood multithreading and variable scopes correctly.
Caused by: java.util.ConcurrentModificationException
at java.util.AbstractList$SimpleListIterator.next(AbstractList.java:64)
at com....StrategyHandler.applyStrategy(StrategyHandler.java:184)
private List<Order> applyStrategy(StorageObjectTree storageObjectTree) {
    ...
    List<Order> finalList = new ArrayList<Order>();
    for (StorageObject storageObject : storageObjectTree.getStorageObjects()) {
        List<Order> currentOrders = strategy.process(storageObject);
        ...
        if (currentOrders != null) {
            Iterator<Order> iterator = currentOrders.iterator();
            while (iterator.hasNext()) {
                Order order = (Order) iterator.next(); // line 64
                // read some values from order
            }
            finalList.addAll(currentOrders);
        }
    }
    return finalList;
}
Can anybody give me a hint what could be the source of the problem?
If you read the Javadoc for ConcurrentModificationException,
it clearly states the conditions under which it occurs:
This exception may be thrown by methods that have detected concurrent
modification of an object when such modification is not permissible.
For example, it is not generally permissible for one thread to modify
a Collection while another thread is iterating over it. In general,
the results of the iteration are undefined under these circumstances.
Some Iterator implementations (including those of all the general
purpose collection implementations provided by the JRE) may choose to
throw this exception if this behavior is detected. Iterators that do
this are known as fail-fast iterators, as they fail quickly and
cleanly, rather that risking arbitrary, non-deterministic behavior at
an undetermined time in the future.
Note that this exception does not always indicate that an object has
been concurrently modified by a different thread. If a single thread
issues a sequence of method invocations that violates the contract of
an object, the object may throw this exception. For example, if a
thread modifies a collection directly while it is iterating over the
collection with a fail-fast iterator, the iterator will throw this
exception.
Note that fail-fast behavior cannot be guaranteed as it is, generally
speaking, impossible to make any hard guarantees in the presence of
unsynchronized concurrent modification. Fail-fast operations throw
ConcurrentModificationException on a best-effort basis. Therefore, it
would be wrong to write a program that depended on this exception for
its correctness: ConcurrentModificationException should be used only
to detect bugs.
In your case, as you said, multiple threads do not access this list. Per the second paragraph above, the exception is still possible if the single thread that is reading through the iterator also modifies the list.
Hope this helps.
This exception occurs when you change, add, or remove values in your list while you are iterating over it. And if you use many threads at the same time...
Try to surround your if block with synchronized(currentOrders) { /* YOUR LAST CODE */ }.
I'm not sure about this, but try it.
Depending on the implementation of strategy.process(..), it could be that this implementation still holds a reference to the List it passed back as a result. If multiple Threads are involved in this implementation, the List might be modified by one of these threads even after it has been passed back as a result.
(If you know the "Future" pattern, you could perhaps imagine an implementation where a method immediately returns an empty List and adds the actual results later using another Thread)
You could try to create a new ArrayList "around" the result list and iterate over this copied list.
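A minimal sketch of that copy-then-iterate idea follows; the class and method names are invented for the example. The add inside the loop merely simulates some other party changing the original list. Note the assumption that the source is not being modified at the very moment the copy is made; if it is, the copy itself can still observe inconsistent state.

```java
import java.util.ArrayList;
import java.util.List;

public class DefensiveCopyDemo {
    // Iterates over a private copy, so later changes to the source
    // cannot invalidate the iterator used by this loop.
    static int sumLengths(List<String> source) {
        List<String> copy = new ArrayList<>(source); // snapshot taken here
        int total = 0;
        for (String s : copy) {
            source.add("late"); // simulates someone mutating the original
            total += s.length();
        }
        return total;
    }
}
```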
You might want to read this SO post. Basically switch and use CopyOnWriteArrayList if you can't see where the problem is coming from.

ConcurrentHashMap operations

Following are some lines from the java docs of ConcurrentHashMap
This class obeys the same functional specification as Hashtable, and
includes versions of methods corresponding to each method of
Hashtable. However, even though all operations are thread-safe,
retrieval operations do not entail locking, and there is not any
support for locking the entire table in a way that prevents all
access.
What is the meaning of the statement
though all operations are thread-safe
from above paragraph?
Can anyone explain with any example of put() or get() methods?
The ConcurrentHashMap allows concurrent modification of the Map from several threads without the need to block them. Collections.synchronizedMap(map) creates a blocking Map, which will degrade performance but ensures consistency (if used properly).
Use the second option if you need to ensure data consistency, and each thread needs to have an up-to-date view of the map. Use the first if performance is critical, and each thread only inserts data to the map, with reads happening less frequently.
Your question is odd. If you understand what "thread safety" means then you would be able to understand how it applies to get() and put() on your own. If you don't understand thread safety then there is no point to explain it specifically in relation to get() and put(). Are you sure this isn't a homework question?
However, answering your question anyway, the fact that ConcurrentHashMap is thread safe means that if you have several threads executing put()s on the same map at the same time, then: a) no damage will occur to the internal data structures of the map and: b) some other thread doing a get() will see all of the values put in by the other threads. With a non-thread safe Map such as HashMap neither of those are guaranteed.
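Here is a small sketch of point (a) and (b), with made-up class and method names: two threads put disjoint key ranges into one ConcurrentHashMap, and after both have finished, every entry is present. Running the same code against a plain HashMap gives no such guarantee.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentPutDemo {
    // Two threads insert disjoint key ranges into the same map.
    static Map<Integer, String> fillConcurrently(int perThread) {
        Map<Integer, String> map = new ConcurrentHashMap<>();
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < perThread; i++) map.put(i, "t1");
        });
        Thread t2 = new Thread(() -> {
            for (int i = perThread; i < 2 * perThread; i++) map.put(i, "t2");
        });
        t1.start();
        t2.start();
        try {
            // Wait for both writers to finish before reading the result.
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Integer, String> map = fillConcurrently(1000);
        System.out.println("entries: " + map.size());
    }
}
```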

Detecting concurrent modifications?

In a multi-threaded application I'm working on, we occasionally see ConcurrentModificationExceptions on our Lists (which are mostly ArrayList, sometimes Vectors). But there are other times when I think concurrent modifications are happening because iterating through the collection appears to be missing items, but no exceptions are thrown. I know that the docs for ConcurrentModificationException say you can't rely on it, but how would I go about making sure I'm not concurrently modifying a List? And is wrapping every access to the collection in a synchronized block the only way to prevent it?
Update: Yes, I know about Collections.synchronizedCollection, but it doesn't guard against somebody modifying the collection while you're iterating through it. I think at least some of my problem is happening when somebody adds something to a collection while I'm iterating through it.
Second Update: If somebody wants to combine the mention of the synchronizedCollection and cloning like Jason did with a mention of the java.util.concurrent and the apache collections frameworks like jacekfoo and Javamann did, I can accept an answer.
Depending on your update frequency one of my favorites is the CopyOnWriteArrayList or CopyOnWriteArraySet. They create a new list/set on updates to avoid concurrent modification exception.
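As a rough illustration (class and method names are invented here), the loop below appends to the list it is iterating over. With a CopyOnWriteArrayList the iterator works on the snapshot that existed when it was created, so the loop neither throws nor sees the new elements.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CopyOnWriteDemo {
    // Appends one element per visited element; returns how many were visited.
    static int countSeenWhileAppending(List<String> list) {
        int seen = 0;
        for (String s : list) {
            list.add(s + "-copy"); // allowed: the iterator reads a snapshot
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> list = new CopyOnWriteArrayList<>();
        list.add("a");
        list.add("b");
        int seen = countSeenWhileAppending(list);
        System.out.println("seen " + seen + ", size now " + list.size());
    }
}
```

Every write copies the whole backing array, so this fits lists that are read far more often than they are written.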
Your original question seems to be asking for an iterator that sees live updates to the underlying collection while remaining thread-safe. This is an incredibly expensive problem to solve in the general case, which is why none of the standard collection classes do it.
There are lots of ways of achieving partial solutions to the problem, and in your application, one of those may be sufficient.
Jason gives a specific way to achieve thread safety, and to avoid throwing a ConcurrentModificationException, but only at the expense of liveness.
Javamann mentions two specific classes in the java.util.concurrent package that solve the same problem in a lock-free way, where scalability is critical. These only shipped with Java 5, but there have been various projects that backport the functionality of the package into earlier Java versions, including this one, though they won't have such good performance in earlier JREs.
If you are already using some of the Apache Commons libraries, then as jacekfoo points out, the apache collections framework contains some helpful classes.
You might also consider looking at the Google collections framework.
Check out java.util.concurrent for versions of the standard Collections classes that are engineered to handle concurrency better.
Yes, you have to synchronize access to collection objects.
Alternatively, you can use the synchronized wrappers around any existing object. See Collections.synchronizedCollection(). For example:
List<String> safeList = Collections.synchronizedList( originalList );
However, all code needs to use the safe version, and even so, iterating while another thread modifies the list will result in problems.
To solve the iteration problem, copy the list first. Example:
for ( String el : new ArrayList<String>( safeList ) )
{ ... }
For more optimized, thread-safe collections, also look at java.util.concurrent.
Usually you get a ConcurrentModificationException if you're trying to remove an element from a list whilst it's being iterated through.
The easiest way to test this is:
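List<Blah> list = new ArrayList<Blah>();
// ... add a few elements first; with an empty list the loop body never runs
for (Blah blah : list) {
    list.remove(blah); // the iterator's next step will usually throw the exception
}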
I'm not sure how you'd get around it. You may have to implement your own thread-safe list, or you could create copies of the original list for writing and have a synchronized class that writes to the list.
You could try using defensive copying so that modifications to one List don't affect others.
Wrapping accesses to the collection in a synchronized block is the correct way to do this. Standard programming practice dictates the use of some sort of locking mechanism (semaphore, mutex, etc) when dealing with state that is shared across multiple threads.
Depending on your use case however you can usually make some optimizations to only lock in certain cases. For example, if you have a collection that is frequently read but rarely written, then you can allow concurrent reads but enforce a lock whenever a write is in progress. Concurrent reads only cause conflicts if the collection is in the process of being modified.
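One way to sketch the read-mostly optimization is with java.util.concurrent.locks.ReentrantReadWriteLock. The wrapper class below is hypothetical and only covers a few operations; the point is that readers share the read lock while a writer takes the write lock exclusively.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadMostlyList<T> {
    private final List<T> items = new ArrayList<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Writers are exclusive: no reader or other writer runs at the same time.
    public void add(T item) {
        lock.writeLock().lock();
        try {
            items.add(item);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers may run concurrently with each other.
    public boolean contains(T item) {
        lock.readLock().lock();
        try {
            return items.contains(item);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Hands out a copy so callers can iterate without holding the lock.
    public List<T> snapshot() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(items);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Returning a copy from snapshot() keeps iteration outside the lock, at the cost of one allocation per call.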
ConcurrentModificationException is best-effort because what you're asking is a hard problem. There's no good way to do this reliably without sacrificing performance besides proving that your access patterns do not concurrently modify the list.
Synchronization would likely prevent concurrent modifications, and it may be what you resort to in the end, but it can end up being costly. The best thing to do is probably to sit down and think for a while about your algorithm. If you can't come up with a lock-free solution, then resort to synchronization.
See the implementation. AbstractList basically stores an int:
protected transient int modCount = 0;
which is incremented whenever there is a 'structural modification' (like remove). If the iterator detects that modCount has changed, it throws a ConcurrentModificationException.
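The mechanism can be sketched with a toy list. This is a simplified illustration, not the JDK source: TinyList and its fields are made up, but the idea of remembering the modification count at iterator creation and comparing it on each step is the same.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class TinyList implements Iterable<String> {
    private String[] data = new String[8];
    private int size = 0;
    private int modCount = 0; // bumped on every structural change

    public void add(String s) {
        if (size == data.length) {
            data = Arrays.copyOf(data, size * 2);
        }
        data[size++] = s;
        modCount++;
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private int cursor = 0;
            // Remember the count as it was when iteration started.
            private final int expectedModCount = modCount;

            @Override
            public boolean hasNext() {
                return cursor < size;
            }

            @Override
            public String next() {
                // Any change since creation is reported as a failure.
                if (modCount != expectedModCount) {
                    throw new ConcurrentModificationException();
                }
                if (cursor >= size) {
                    throw new NoSuchElementException();
                }
                return data[cursor++];
            }
        };
    }
}
```

Because the counter is a plain field with no synchronization, another thread's increment is not guaranteed to be seen, which is one reason the check is only best-effort.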
Synchronizing (via Collections.synchronizedXXX) won't help here, since it does not guarantee iterator safety; it only synchronizes reads and writes made through individual calls such as put, get, set ...
See java.util.concurrent and the Apache collections framework (it has some classes that are optimized to work correctly in a concurrent environment when there are more reads (that are unsynchronized) than writes; see FastHashMap).
You can also synchronize iterations over the list.
List<String> safeList = Collections.synchronizedList( originalList );
public void doSomething() {
    synchronized(safeList){
        for(String s : safeList){
            System.out.println(s);
        }
    }
}
This will lock the list on synchronization and block all threads that try to access the list while you edit it or iterate over it. The downside is that you create a bottleneck.
This saves some memory compared with copying the list before iterating and might be faster, depending on what you're doing in the iteration...
Collections.synchronizedList() will render a list nominally thread-safe and java.util.concurrent has more powerful features.
This will get rid of your concurrent modification exception. I won't speak to the efficiency however ;)
List<Blah> list = fillMyList();
List<Blah> temp = new ArrayList<Blah>();
for (Blah blah : list) {
    //list.remove(blah); would throw the exception
    temp.add(blah);
}
list.removeAll(temp);
