LinkedList Iterator throwing Concurrent Modification Exception - java

Is there a way to stop a ListIterator from throwing a ConcurrentModificationException? This is what I want to do:
Create a LinkedList with a bunch of objects that have a certain method that is to be executed frequently.
Have a set number of threads (say N) all of which are responsible for executing the said method of the objects in the LinkedList. For example, if there are k objects in the list, thread n would execute the method of the n-th object in the list, then move on to n+N-th object, then to n+2N-th, etc., until it loops back to the beginning.
The problem here lies in the retrieval of these objects. I would obviously be using a ListIterator to do this work. However, I predict this will not get very far, thanks to the ConcurrentModificationException that will be thrown according to the documentation. I want the list to be modifiable, and for the iterators to not care. In fact, it is expected that these objects will create and destroy other objects in the list.
I've thought of a few work-arounds:
Create and destroy a new iterator to retrieve the object at the given index. However, this is O(n), undesirable.
Use an ArrayList instead; however, this is also undesirable, since deletions are O(n) and there are problems with the list needing to expand (and perhaps contract?) from time to time.
Write my own LinkedList class. Don't want to.
Thus, my question. Is there a way to stop a ListIterator from throwing a ConcurrentModificationException?

You seem concerned with performance. Have you actually measured the performance hit of using an O(n) vs O(1) algorithm? Depending on what you are doing and how frequently you are doing it, it might be acceptable to simply use a CopyOnWriteArrayList which is thread safe. Its iterators are also thread safe.
The main performance drag is on mutative operations (set, add, remove...): a new list is recreated each time.
However, the performance will be good enough for most applications. I would personally try using that, profile my application to check that the performance is good enough, and move on if it is. If it is not, you will need to find other ways.
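As a minimal sketch of the behaviour described above (the class and method names are hypothetical, and String elements stand in for the asker's objects), a CopyOnWriteArrayList can be modified while it is being iterated, because each iterator works on a snapshot of the list:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CopyOnWriteDemo {
    // Iterates over the list while removing from and adding to it inside the loop.
    // Returns how many elements the iterator visited.
    static int iterateWhileModifying(List<String> list) {
        int visited = 0;
        for (String s : list) {        // the iterator sees a snapshot taken here
            visited++;
            if (s.equals("b")) {
                list.remove(s);        // no ConcurrentModificationException
                list.add("d");         // not visible to the iterator already running
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b", "c"));
        int visited = iterateWhileModifying(list);
        System.out.println(visited + " visited, list is now " + list);
    }
}
```

The same loop over a plain LinkedList or ArrayList would fail on the next call to the iterator after the modification. The price is that every add or remove copies the backing array, which is the cost mentioned above.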

Is there a way to stop a ListIterator from throwing a ConcurrentModificationException?
That you are asking this question this way shows a lack of understanding of how to properly use threads to increase the performance of your application.
The whole purpose of using threads is to divide processing and IO into separate runnable entities that can be executed in parallel -- independent of each other. If you are forking threads to all work on the same LinkedList then you most likely will have a performance loss or minimal gain since the overhead of the synchronization necessary to keep each of the threads' "view" of the LinkedList in sync would counter any gains due to parallel execution.
The question should not be "how do I stop ConcurrentModificationException", it should be "how can I use threads to improve the processing of a list of objects". That's the right question.
To process a collection of objects in parallel with a number of threads, you should be using an ExecutorService thread-pool. You create the pool with something like the following code. Each of the entries in your LinkedList (in this example Job) would then be processed by the threads in the pool in parallel.
// create a thread pool with 10 workers
ExecutorService threadPool = Executors.newFixedThreadPool(10);
// submit each of the objects in the list to the pool
for (Job job : jobLinkedList) {
    threadPool.submit(new MyJobProcessor(job));
}
// once we have submitted all jobs to the thread pool, it should be shut down
threadPool.shutdown();
// wait for the thread-pool jobs to finish
threadPool.awaitTermination(Long.MAX_VALUE, TimeUnit.MILLISECONDS);
synchronized (jobLinkedList) {
    // not sure this is necessary, but we need a memory barrier somewhere
}
...
// you wouldn't need this if Job implemented Runnable
public class MyJobProcessor implements Runnable {
    private final Job job;

    public MyJobProcessor(Job job) {
        this.job = job;
    }

    @Override
    public void run() {
        // process the job
    }
}

You could use one Iterator to scan the list, and use an Executor to do the work on each object by handing it off to a pool of threads. That's easy, although there's overhead in packaging up work units this way. You still have to be careful to modify the list only through the Iterator's own methods, but maybe that simplifies the problem.
Or can you perform your work in one pass, then list modification in the next?
Can you split into N lists?

Please see the answer from @assylias -- his advice is good. I would add that if you decide to write your own linked list class, you need to think very carefully about how to make it thread-safe.
Think about all the ways your list could get mangled if multiple threads tried to modify it simultaneously. Just locking 1 or 2 nodes is not enough -- as an example, take the following list:
A -> B -> C -> D
Imagine that one thread tries to remove B, just as another thread is removing C. To remove B, the link from A needs to "jump" over B to C. But what if C is no longer part of the list by that time? Likewise, to remove C, the link from B needs to be changed to jump to D, but what if B has already been removed from the list by that time? Similar issues arise when nodes are added simultaneously to nearby parts of the list.
If you have 1 lock per node, and you lock 3 nodes when doing a "remove" operation (the node to be removed, and the nodes before and after it), I think it will be thread-safe. You need to also think carefully about which nodes must be locked when adding nodes, and when traversing the list. To avoid deadlocks, you need to make sure to always acquire locks in a constant order, and when traversing the list, you need to use "hand-over-hand" locking (which precludes the use of ordinary Java monitors -- you need explicit lock objects).
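To make the "hand-over-hand" idea concrete, here is a rough sketch and not a production implementation. It makes several assumptions that are not in the answer above: the list is singly linked and kept sorted, it stores int keys, and it uses two sentinel nodes, so a removal only needs the predecessor and the victim locked rather than three nodes. The class name HandOverHandList and all of its methods are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

// Sorted singly linked set of ints with one explicit lock per node.
class HandOverHandList {
    private static final class Node {
        final int key;
        Node next;
        final ReentrantLock lock = new ReentrantLock();
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    // Sentinels: user keys must lie strictly between MIN_VALUE and MAX_VALUE.
    private final Node head =
            new Node(Integer.MIN_VALUE, new Node(Integer.MAX_VALUE, null));

    // Walks the list holding two adjacent locks at a time. Returns pred with
    // pred and pred.next both locked, where pred.key < key <= pred.next.key.
    private Node lockWindow(int key) {
        Node pred = head;
        pred.lock.lock();
        Node curr = pred.next;
        curr.lock.lock();
        while (curr.key < key) {
            pred.lock.unlock();   // the older lock is released only while
            pred = curr;          // the newer one is still held
            curr = curr.next;
            curr.lock.lock();     // locks are always taken in list order
        }
        return pred;
    }

    private static void unlockWindow(Node pred, Node curr) {
        curr.lock.unlock();
        pred.lock.unlock();
    }

    boolean add(int key) {
        Node pred = lockWindow(key);
        Node curr = pred.next;
        try {
            if (curr.key == key) return false;   // already present
            pred.next = new Node(key, curr);
            return true;
        } finally {
            unlockWindow(pred, curr);
        }
    }

    boolean remove(int key) {
        Node pred = lockWindow(key);
        Node curr = pred.next;
        try {
            if (curr.key != key) return false;   // not present
            pred.next = curr.next;               // both neighbours of the link are locked
            return true;
        } finally {
            unlockWindow(pred, curr);
        }
    }

    boolean contains(int key) {
        Node pred = lockWindow(key);
        Node curr = pred.next;
        try {
            return curr.key == key;
        } finally {
            unlockWindow(pred, curr);
        }
    }
}
```

Because every thread acquires node locks in list order, two threads cannot wait on each other in a cycle. Releasing one lock while still holding the next is exactly what a synchronized block cannot express, which is why explicit lock objects are needed.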

Related

java linkedlist returns same element multi thread

I want to add Packets read by my PacketHandler into a LinkedList to
save them with:
Packet toAdd = handler.handlePacket(socket.getInputStream());
synchronized (packetsRead) {
    packetsRead.addLast(toAdd);
    if (debug) {
        System.out.println(packetsRead.getLast().toString());
    }
}
and reading them with
synchronized (packetsRead) {
    if (packetsRead.size() > 0) {
        return packetsRead.pollFirst();
    }
}
with the debug method in the first method I can see that the
last item is never the same. So different Packets are added into my list.
But when I try to read them from a different thread I always got the same packets.
For example if there are 10 different packets in my list it would return the first one 10 times.
How to make it thread safe?
Code must synchronize on the same object.
If this is done, the rest of the program is correct, and this is the only place the LinkedList is accessed, then it will work as expected, with or without threads. If the code already synchronizes on the same object, then either the synchronized blocks do not cover enough code or the posted code does not show the problem.
FWIW: See Matt's answer for some alternative thread-safe-by-design data structures. But note that these will only solve the problem if the issue was not using the same synchronized object.
Don't use a plain LinkedList for concurrent purposes:
Note that this implementation is not synchronized. If multiple threads access a linked list concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally.
At the very least, use Collections.synchronizedList() as described in the LinkedList JavaDocs.
Even better: use a thread safe, concurrent data structure such as ConcurrentLinkedQueue, ArrayBlockingQueue or LinkedBlockingQueue.
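A small sketch of the last suggestion, assuming a single reader thread and a single handler thread. String values stand in for the asker's Packet class, and the class and method names are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class PacketQueueDemo {
    // Starts one producer thread that enqueues `count` items and drains them
    // on the calling thread. Returns the items in the order they were received.
    static List<String> transfer(int count) {
        BlockingQueue<String> packetsRead = new LinkedBlockingQueue<>();
        Thread producer = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                packetsRead.add("packet-" + i);     // thread-safe, no external lock
            }
        });
        producer.start();

        List<String> received = new ArrayList<>();
        try {
            for (int i = 0; i < count; i++) {
                received.add(packetsRead.take());   // blocks until a packet is available
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(transfer(3));
    }
}
```

Each take() removes the head of the queue, so the consumer sees every packet once and in insertion order, without any synchronized blocks in user code.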

Can objects get lost if a LinkedList is add/remove fast by lots of threads?

This may sound like a silly question. I just started learning Java concurrency.
I have a LinkedList that acts as a task queue and is accessed by multiple threads. They removeFirst() and execute it, other threads put more tasks (.add()). Tasks can have the thread put them back to the queue.
I notice that when there are a lot of tasks and they are put back into the queue many times, the number of tasks that come out does not match the number I added initially: one, or sometimes two, are missing.
I checked everything and I synchronized every critical section + notifyAll().
I have already marked the LinkedList as 'volatile'.
Exact number is 384 tasks, each is put back 3072 times.
The problem doesn't occur if there is a small number of tasks and put-backs. Also, if I System.out.println() all the steps then it doesn't happen anymore, so I can't debug it.
Could it be possible that LinkedList.add() is not fast enough so the threads somehow miss it?
Simplified code:
public void callByAllThreads() {
    Task executedTask = null;
    do {
        // accessed by multiple threads
        synchronized (asyncQueue) {
            executedTask = asyncQueue.poll();
            if (executedTask == null) {
                inProcessCount.incrementAndGet(); // mark that there is some processing going on
            }
        }
        if (executedTask != null) {
            executedTask.callMethod(); // subclass of task can override this method
            synchronized (asyncQueue) {
                inProcessCount.decrementAndGet();
                asyncQueue.notifyAll();
            }
        }
    } while (executedTask != null);
}
The Task can override callMethod:
public void callMethodOverride() {
    synchronized (getAsyncQueue()) {
        getAsyncQueue().add(this);
        getAsyncQueue().notifyAll();
    }
}
From the docs for LinkedList:
Note that this implementation is not synchronized. If multiple threads access a linked list concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally.
i.e. you should synchronize access to the list. You say you are, but if you are seeing items get "lost" then you probably aren't synchronizing properly. Instead of trying to do that, you could use a framework class that does it for you ...
... If you are always removing the next available (first) item (effectively a producer/consumer implementation) then you could use a BlockingQueue implementation. This is guaranteed to be thread safe, and has the advantage of blocking the consumer until an item is available. An example is the ArrayBlockingQueue.
For non-blocking thread-safe queues you can look at ConcurrentLinkedQueue
Marking the list instance variable volatile has nothing to do with your list being synchronized for mutation methods like add or removeFirst. volatile is simply to do with ensuring that read/write for that instance variable is communicated correctly between, and ordered correctly within, threads. Note I said that variable, not the contents of that variable (see the Java Tutorials > Atomic Access)
LinkedList is definitely not thread safe; you cannot use it safely with multiple threads. It's not a question of "fast enough," it's a question of changes made by one thread being visible to other threads. Marking it volatile doesn't help; that only affects references to the LinkedList being changed, not changes to the contents of the LinkedList.
Consider ConcurrentLinkedQueue or ConcurrentLinkedDeque.
LinkedList is not thread safe, so yes, multiple threads accessing it simultaneously will lead to problems. Synchronizing critical sections can solve this, but as you are still having problems you probably made a mistake somewhere. Try wrapping it in a Collections.synchronizedList() to synchronize all method calls.
LinkedList is not thread safe; you can use ConcurrentLinkedQueue if it fits your needs, which it seems it might.
As the documentation says:
An unbounded thread-safe queue based on linked nodes. This queue
orders elements FIFO (first-in-first-out). The head of the queue is
that element that has been on the queue the longest time. The tail of
the queue is that element that has been on the queue the shortest
time. New elements are inserted at the tail of the queue, and the
queue retrieval operations obtain elements at the head of the queue. A
ConcurrentLinkedQueue is an appropriate choice when many threads will
share access to a common collection. This queue does not permit null
elements.
You increment your inProcessCount when executedTask == null which is obviously the opposite of what you want to do. So it’s no wonder that it will have inconsistent values.
But there are other issues as well. You call notifyAll() at several places but as long as there is no one calling wait() that has no use.
Note further that if you access an integer variable consistently from inside synchronized blocks only throughout the code, there is no need to make it an AtomicInteger. On the other hand, if you use it, e.g. because it will be accessed at other places without additional synchronization, you can move the code updating the AtomicInteger outside the synchronized block.
Also, a method which calls a method like getAsyncQueue() three times looks suspicious to a reader. Just call it once and remember the result in a local variable, then everyone can be confident that it is the same reference on all three uses. Generally, you have to ensure that all code is using the same list, hence the appropriate modifier for the variable holding it is final, not volatile.
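Putting those points together, one possible shape for the worker loop is sketched below. This is an illustration under assumptions, not the asker's actual code: the class TaskQueueSketch, the Runnable task type and the runAll helper are all hypothetical. The queue reference is final, the in-process counter is a plain int guarded by the queue's monitor, the counter is incremented only when a task was actually taken, and notifyAll() is paired with a wait().

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicInteger;

class TaskQueueSketch {
    // final, not volatile: every thread locks and uses the same instance
    private final Queue<Runnable> queue = new ArrayDeque<>();
    // only touched inside synchronized (queue), so a plain int is enough
    private int inProcess = 0;

    void add(Runnable task) {
        synchronized (queue) {
            queue.add(task);
            queue.notifyAll();            // wakes workers blocked in wait()
        }
    }

    // Worker loop: returns once the queue is empty and no task is still running.
    void work() throws InterruptedException {
        while (true) {
            Runnable task;
            synchronized (queue) {
                while (queue.isEmpty() && inProcess > 0) {
                    queue.wait();         // a running task may still re-add itself
                }
                task = queue.poll();
                if (task == null) {
                    return;               // empty and nothing in flight: done
                }
                inProcess++;              // count only tasks actually taken
            }
            try {
                task.run();
            } finally {
                synchronized (queue) {
                    inProcess--;
                    queue.notifyAll();
                }
            }
        }
    }

    // Runs `tasks` tasks that each put themselves back `requeues` times and
    // returns the total number of executions observed.
    static int runAll(int tasks, int requeues, int threads) {
        TaskQueueSketch q = new TaskQueueSketch();
        AtomicInteger executions = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            q.add(new Runnable() {
                private int remaining = requeues;
                @Override
                public void run() {
                    executions.incrementAndGet();
                    if (remaining-- > 0) {
                        q.add(this);
                    }
                }
            });
        }
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                try {
                    q.work();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return executions.get();
    }
}
```

If no task is lost, runAll(tasks, requeues, threads) should return tasks * (requeues + 1), which gives a quick way to check the loop under load.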

Synchronized collections list

I have 2 threads needing access to a Queue, one for putting and one for getting.
So I have an initiation
public static Queue<WorldData> blockDestructionQueue = Collections.synchronizedList(new LinkedList<WorldData>());
With the above I get a Type mismatch: cannot convert from List to Queue
I tried casting it to a Queue but this did not work.
public static Queue<WorldData> blockDestructionQueue = (Queue<WorldData>)Collections.synchronizedList(new LinkedList<WorldData>());
I was wondering as to why this is not working.
I got this information from another stack overflow answer.
How to use ConcurrentLinkedQueue?
In the correct answer paragraph 6
If you only have one thread putting stuff into the queue, and another
thread taking stuff out of the queue, ConcurrentLinkedQueue is
probably overkill. It's more for when you may have hundreds or even
thousands of threads accessing the queue at the same time. Your needs
will probably be met by using:
Queue<YourObject> queue = Collections.synchronizedList(new LinkedList<YourObject>());
A plus of this is that it locks on the instance (queue), so you can
synchronize on queue to ensure atomicity of composite operations (as
explained by Jared). You CANNOT do this with a ConcurrentLinkedQueue,
as all operations are done WITHOUT locking on the instance (using
java.util.concurrent.atomic variables). You will NOT need to do this
if you want to block while the queue is empty, because poll() will
simply return null while the queue is empty, and poll() is atomic.
Check to see if poll() returns null. If it does, wait(), then try
again. No need to lock.
Additional Information:
edit: Eclipse was trying to be too helpful and decided to add a break point exception where it was not needed and was not asked to put one.
A queue is not a list and a Queue is not an implementation of List, although you can implement a queue with a list.
Have a look at BlockingQueue it is probably a better fit for what you need:
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/BlockingQueue.html
Collections.synchronizedList returns an instance of SynchronizedList, which does not implement Queue. LinkedList is a Queue, but that's not what you're using at that point.
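A short sketch of the difference, using String in place of WorldData and a hypothetical class name. It assumes that a LinkedBlockingQueue is an acceptable replacement for the synchronized wrapper, as suggested above:

```java
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueTypesDemo {
    // Compiles without a cast: LinkedBlockingQueue implements Queue.
    static final Queue<String> blockDestructionQueue = new LinkedBlockingQueue<>();

    // The synchronized wrapper only implements List, even though it wraps a LinkedList.
    static boolean wrapperIsQueue() {
        List<String> wrapped = Collections.synchronizedList(new LinkedList<String>());
        return wrapped instanceof Queue;
    }

    public static void main(String[] args) {
        blockDestructionQueue.offer("block-1");
        System.out.println(blockDestructionQueue.poll());
        System.out.println(wrapperIsQueue());
    }
}
```

The instanceof check is why the cast in the question fails at run time: the object returned by the wrapper is simply not a Queue.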

Java Iterator Concurrency

I'm trying to loop over a Java iterator concurrently, but am having trouble finding the best way to do this.
Here is what I have where I don't try to do anything concurrently.
Long l;
Iterator<Long> i = getUserIDs();
while (i.hasNext()) {
    l = i.next();
    someObject.doSomething(l);
    anotheObject.doSomething(l);
}
There should be no race conditions between the things I'm doing on the non iterator objects, so I'm not too worried about that. I'd just like to speed up how long it takes to loop through the iterator by not doing it sequentially.
Thanks in advance.
One solution is to use an executor to parallelise your work.
Simple example:
ExecutorService executor = Executors.newCachedThreadPool();
Iterator<Long> i = getUserIDs();
while (i.hasNext()) {
    final Long l = i.next();
    Runnable task = new Runnable() {
        public void run() {
            someObject.doSomething(l);
            anotheObject.doSomething(l);
        }
    };
    executor.submit(task);
}
executor.shutdown();
Because a cached thread pool is used, this may create up to one thread per item in the iterator (idle threads are reused), each of which then does the work for its item. You can tune how many threads are used by calling a different factory method on the Executors class, or subdivide the work as you see fit (e.g. a different Runnable for each of the method calls).
I can offer two possible approaches:
Use a thread pool and dispatch the items received from the iterator to a set of processing threads. This will not accelerate the iterator operations themselves, since those would still happen in a single thread, but it will parallelize the actual processing.
Depending on how the iteration is created, you might be able to split the iteration process into multiple segments, each to be processed by a separate thread via a different Iterator object. For an example, have a look at the List.subList(int fromIndex, int toIndex) and List.listIterator(int index) methods.
This would allow the iterator operations to happen in parallel, but it is not always possible to segment the iteration like this, usually due to the simple fact that the items to be iterated over are not immediately available.
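The second approach could look roughly like the sketch below. It assumes the IDs are already available as a List that nobody modifies during processing, and it sums the IDs purely as a placeholder for the real per-item work. SegmentedIteration and sumInParallel are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SegmentedIteration {
    // Splits `ids` into at most `segments` subList views and iterates each
    // view on its own pool thread.
    static long sumInParallel(List<Long> ids, int segments) {
        ExecutorService pool = Executors.newFixedThreadPool(segments);
        try {
            int chunk = (ids.size() + segments - 1) / segments;   // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int from = 0; from < ids.size(); from += chunk) {
                final List<Long> view = ids.subList(from, Math.min(from + chunk, ids.size()));
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (Long id : view) {      // each segment has its own iterator
                        sum += id;              // placeholder for the real work
                    }
                    return sum;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();            // waits for that segment to finish
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long i = 1; i <= 10; i++) {
            ids.add(i);
        }
        System.out.println(sumInParallel(ids, 3));
    }
}
```

The subList views share the backing list, so this only works while the list is not structurally modified.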
As a bonus trick, if the iteration operations are expensive or slow, such as those required to access a database, you might see a throughput improvement if you separate them out to a separate thread that will use the iterator to fill in a BlockingQueue. The dispatcher thread will then only have to access the queue, without waiting on the iterator object to retrieve the next item.
The most important advice in this case is this: "Use your profiler", usually to be followed by "Do not optimise prematurely". By using a profiler, such as VisualVM, you should be able to ascertain the exact cause of any performance issues, without taking shots in the dark.
If you are using Java 7, you can use the new fork/join; see the tutorial.
Not only does it automatically split the tasks among the threads, but if a thread finishes its tasks earlier than the others, it "steals" some tasks from them.
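For illustration only, a fork/join version might look like the following. It assumes the IDs are in a random-access List, uses a sum as a stand-in for the real per-item work, and picks an arbitrary threshold of 1,000 items per leaf task. ForkJoinSketch and SumTask are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkJoinSketch {
    static final class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 1_000;   // arbitrary cut-off for this sketch
        private final List<Long> ids;
        private final int from;
        private final int to;

        SumTask(List<Long> ids, int from, int to) {
            this.ids = ids;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += ids.get(i);                // placeholder for the real work
                }
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(ids, from, mid);
            SumTask right = new SumTask(ids, mid, to);
            left.fork();                              // may be stolen by an idle worker
            return right.compute() + left.join();
        }
    }

    static long sum(List<Long> ids) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new SumTask(ids, 0, ids.size()));
        } finally {
            pool.shutdown();
        }
    }
}
```

Splitting the range in half recursively is what gives idle workers something to steal: the forked left half sits in the current worker's queue until that worker, or another one, picks it up.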

what to use in multithreaded environment; Vector or ArrayList

I have this situation:
A web application with about 200 concurrent requests (threads) needs to log something to the local filesystem. I have one class to which all threads are placing their calls, and that class internally stores messages in one array (Vector or ArrayList), which in turn will be written to the filesystem.
The idea is to return from the thread's call ASAP so the thread can do its job as fast as possible; what the thread wanted to log can be written to the filesystem later, it is not so crucial.
So, that class in turn removes the first element from that list and writes it to the filesystem, while in real time there are 10 or 20 threads appending new logs at the end of that list.
I would like to use ArrayList since it is not synchronized and therefore the threads' calls will take less time. The question is:
am I risking deadlocks / data loss? Is it better to use Vector since it is thread safe? Is it slower to use Vector?
Actually both ArrayList and Vector are very bad choices here, not because of synchronization (which you would definitely need), but because removing the first element is O(n).
The perfect data structure for your purpose is the ConcurrentLinkedQueue: it offers both thread safety (without using synchronization), and O(1) adding and removing.
Are you limited to a particular (old) Java version? If not, please consider using java.util.concurrent.LinkedBlockingQueue for this kind of thing. It's really worth looking at the java.util.concurrent.* package when dealing with concurrency.
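As a sketch of how such a logger could be arranged, assuming a single writer thread. An in-memory list stands in for the file, and AsyncLogSketch and its methods are hypothetical names. Request threads only enqueue and return, while the writer thread drains the queue in the background:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class AsyncLogSketch {
    // Unique marker object telling the writer to stop; compared by identity.
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    // Stands in for the log file; only the writer thread appends to it.
    private final List<String> written = new ArrayList<>();
    private final Thread writer = new Thread(this::drain, "log-writer");

    private void drain() {
        try {
            while (true) {
                String message = pending.take();   // blocks while the queue is empty
                if (message == STOP) {
                    return;
                }
                written.add(message);              // real code would write to disk here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    void start() {
        writer.start();
    }

    // Called by request threads: enqueue and return immediately.
    void log(String message) {
        pending.add(message);
    }

    // Drains everything queued so far, stops the writer, and returns what was "written".
    List<String> stop() {
        pending.add(STOP);
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return written;
    }

    // Logs from several threads at once and returns how many messages arrived.
    static int logFromThreads(int threads, int perThread) {
        AsyncLogSketch log = new AsyncLogSketch();
        log.start();
        List<Thread> producers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread p = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    log.log("thread " + id + " message " + i);
                }
            });
            producers.add(p);
            p.start();
        }
        for (Thread p : producers) {
            try {
                p.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return log.stop().size();
    }
}
```

An unbounded queue keeps log() from ever blocking, at the cost of memory if the writer falls behind. A bounded queue with offer() would be the alternative if dropping or delaying messages is acceptable.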
Vector is worse than useless. Don't use it even when using multithreading. A trivial example of why it's bad is to consider two threads simultaneously iterating and removing elements on the list at the same time. The methods size(), get(), remove() might all be synchronized but the iteration loop is not atomic so - kaboom. One thread is bound to try removing something which is not there, or skip elements because the size() changes.
Instead use synchronized() blocks where you expect two threads to access the same data.
private ArrayList myList;

void removeElement(Object e) {
    synchronized (myList) {
        myList.remove(e);
    }
}
Java 5 provides explicit Lock objects which allow more finegrained control, such as being able to attempt to timeout if a resource is not available in some time period.
private final Lock lock = new ReentrantLock();
private ArrayList myList;

void removeElement(Object e) throws InterruptedException {
    if (!lock.tryLock(1, TimeUnit.SECONDS)) {
        // Timeout
        throw new SomeException();
    }
    try {
        myList.remove(e);
    } finally {
        lock.unlock();
    }
}
There actually is a marginal difference in performance between a synchronized list and a Vector. (http://www.javacodegeeks.com/2010/08/java-best-practices-vector-arraylist.html)
