I have a LinkedList of objects that I want to process. Objects get added to it from another thread, but only one thread removes/reads from it.
private LinkedList<MyObject> queue = new LinkedList<>();
new Thread()
{
@Override
public void run()
{
while (!Thread.interrupted())
{
if (!queue.isEmpty())
{
MyObject first = queue.removeFirst();
// do sth..
}
}
}
}.start();
In another Thread I add Objects to the queue
queue.add(new MyObject());
Sometimes this code leads to an Exception though, which I can't really explain to myself.
Exception in thread "" java.util.NoSuchElementException
at java.util.LinkedList.removeFirst(LinkedList.java:270)
I don't get why I get this Exception, since it should only try to remove an object if one exists.
As Nicolas has already mentioned, you need a thread-safe implementation. I would recommend using LinkedBlockingQueue.
You can add to it using the offer method and remove using take, which will also resolve your "busy waiting" problem.
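A minimal sketch of how that could look for the code in the question (the class and method names around the queue are just illustrative; MyObject is the question's class):
private final BlockingQueue<MyObject> queue = new LinkedBlockingQueue<>();

// consumer thread
new Thread()
{
    @Override
    public void run()
    {
        try {
            while (!Thread.interrupted())
            {
                MyObject first = queue.take(); // blocks until an element is available, no busy waiting
                // do sth..
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag and exit
        }
    }
}.start();

// producer thread
queue.offer(new MyObject()); // never blocks for an unbounded LinkedBlockingQueue
(The imports needed are java.util.concurrent.BlockingQueue and java.util.concurrent.LinkedBlockingQueue.)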
A LinkedList is not thread-safe, so you can't share it between several threads as you currently do; otherwise you will face unpredictable bugs like this one, caused by concurrent modifications that leave the list in an inconsistent state. Use a thread-safe deque such as ConcurrentLinkedDeque instead.
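For illustration, a rough sketch of the deque-based approach; pollFirst() atomically removes and returns the head (or null when the deque is empty), so the isEmpty()/removeFirst() pair that can race disappears:
import java.util.concurrent.ConcurrentLinkedDeque;

class DequeExample {
    private final ConcurrentLinkedDeque<MyObject> queue = new ConcurrentLinkedDeque<>();

    // consumer side: one atomic operation instead of isEmpty() followed by removeFirst()
    void consumeOnce() {
        MyObject first = queue.pollFirst();
        if (first != null) {
            // do sth..
        }
    }

    // producer side
    void add(MyObject o) {
        queue.add(o);
    }
}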
Although I think there have been a few good solutions offered as to how to resolve the problem, none of the answers explained why @BluE sees the NoSuchElementException. So here is what I think could be happening.
Since LinkedList access is not synchronized it is possible that:
The producer thread adds an element to the queue
Two consumer threads concurrently check for if (!queue.isEmpty()) and see that it is not.
Both consumer threads go ahead and try to take an element from the queue invoking MyObject first = queue.removeFirst();
One of the threads succeeds and the other one fails with a NoSuchElementException since there are no more elements in the queue.
UPDATE:
Provided you have only one producer and one consumer, I think the Java Memory Model specification could explain the behaviour you see.
Long story short, since the access to the LinkedList is not synchronized there are no data visibility guarantees offered by the JVM. Let's have a look at the implementations of isEmpty and removeFirst methods:
From LinkedList
transient int size = 0;
transient Node<E> first;
// ...
public int size() {
return size;
}
// ...
public E removeFirst() {
final Node<E> f = first;
if (f == null)
throw new NoSuchElementException();
return unlinkFirst(f);
}
From AbstractCollection
public boolean isEmpty() {
return size() == 0;
}
As you can see, the size and the elements are stored in different variables. So it is possible that the consumer thread sees the updates on the "size" variable and does not see the updates on the "first".
What you could do is to use some kind of technique to coordinate the threads, like Mutex, Semaphore, Monitor, Mailbox etc.
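For example, here is a hand-rolled "monitor" sketch that guards the same LinkedList with intrinsic locking and wait/notify (purely illustrative; a BlockingQueue gives you this behaviour for free, and MyObject is the class from the question):
import java.util.LinkedList;

class HandRolledMailbox {
    private final LinkedList<MyObject> queue = new LinkedList<>();

    // producer thread
    synchronized void put(MyObject o) {
        queue.add(o);
        notifyAll(); // wake up a waiting consumer
    }

    // consumer thread
    synchronized MyObject takeFirst() throws InterruptedException {
        while (queue.isEmpty()) {
            wait(); // releases the lock and blocks until put() notifies
        }
        return queue.removeFirst(); // size and first are now read under the same lock
    }
}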
Related
I have a blocking queue of objects.
I want to write a thread that blocks till there is a object on the queue. Similar to the functionality provided by BlockingQueue.take().
However, since I do not know if I will be able to process the object successfully, I want to just peek() and not remove the object. I want to remove the object only if I am able to process it successfully.
So, I would like a blocking peek() function. Currently, peek() just returns if the queue is empty as per the javadocs.
Am I missing something? Is there another way to achieve this functionality?
EDIT:
Any thoughts on if I just used a thread safe queue and peeked and slept instead?
public void run() {
    try {
        while (!exit) {
            while (queue.size() != 0) {
                Object o = queue.peek();
                if (o != null) {
                    if (consume(o) == true) {
                        queue.remove();
                    } else {
                        Thread.sleep(10000); //need to backoff (60s) and try again
                    }
                }
            }
            Thread.sleep(1000); //wait 1s for object on queue
        }
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // sleep() can be interrupted; stop cleanly
    }
}
Note that I only have one consumer thread and one (separate) producer thread. I guess this isn't as efficient as using a BlockingQueue... Any comments appreciated.
You could use a LinkedBlockingDeque and physically remove the item from the queue (using takeLast()) but replace it again at the end of the queue if processing fails using putLast(E e). Meanwhile your "producers" would add elements to the front of the queue using putFirst(E e).
You could always encapsulate this behaviour within your own Queue implementation and provide a blockingPeek() method that performs takeLast() followed by putLast() behind the scenes on the underlying LinkedBlockingDeque. Hence from the calling client's perspective the element is never removed from your queue.
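A hedged sketch of what such a wrapper might look like (class and method names are made up for illustration; with a single consumer, as in the question, the take-then-put-back trick is safe):
import java.util.concurrent.LinkedBlockingDeque;

class PeekableQueue<E> {
    private final LinkedBlockingDeque<E> deque = new LinkedBlockingDeque<>();

    // producers add to the front of the deque
    void put(E e) throws InterruptedException {
        deque.putFirst(e);
    }

    // blocks until an element is available, then puts it straight back at the tail,
    // so from the calling client's perspective nothing was removed
    E blockingPeek() throws InterruptedException {
        E e = deque.takeLast();
        deque.putLast(e);
        return e;
    }

    // call this once the element has actually been processed successfully
    E take() throws InterruptedException {
        return deque.takeLast();
    }
}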
However, since I do not know if I will be able to process the object successfully, I want to just peek() and not remove the object. I want to remove the object only if I am able to process it successfully.
In general, it is not thread-safe. What if, after you peek() and determine that the object can be processed successfully, but before you take() it to remove and process, another thread takes that object?
Could you also just add an event listener queue to your blocking queue, then when something is added to the (blocking)queue, send an event off to your listeners? You could have your thread block until it's actionPerformed method was called.
The only thing I'm aware of that does this is BlockingBuffer in Apache Commons Collections:
If either get or remove is called on
an empty Buffer, the calling thread
waits for notification that an add or
addAll operation has completed.
get() is equivalent to peek(), and a Buffer can be made to act like a BlockingQueue by decorating an UnboundedFifoBuffer with a BlockingBuffer.
The quick answer is, there's not really a way to have a blocking peek, short of implementing a blocking queue with a blocking peek() yourself.
Am I missing something?
peek() can be troublesome with concurrency -
If you can't process your peek()'d message - it'll be left in the queue, unless you have multiple consumers.
Who is going to get that object out of the queue if you can't process it ?
If you have multiple consumers, you get a race condition between you peek()'ing and another thread also processing items, resulting in duplicate processing or worse.
Sounds like you might be better off actually removing the item and processing it using a
Chain-of-responsibility pattern
Edit: re: your last example: If you have only 1 consumer, you will never get rid of the object on the queue - unless it's updated in the meantime - in which case you'd better be very, very careful about thread safety and probably shouldn't have put the item in the queue anyway.
Not an answer per se, but: JDK-6653412 claims this is not a valid use case.
Looks like BlockingQueue itself doesn't have the functionality you're specifying.
I might try to re-frame the problem a little though: what would you do with objects you can't "process correctly"? If you're just leaving them in the queue, you'll have to pull them out at some point and deal with them. I'd recommend either figuring out how to process them (commonly, if a queue.get() gives any sort of invalid or bad value, you're probably OK to just drop it on the floor) or choosing a different data structure than a FIFO.
The 'simplest' solution
Do not process the next element until the previous element is processed successfully.
public void run() {
    Object pendingElement = null; // element whose processing failed and must be retried
    while (!exit) {
        try {
            Object obj = pendingElement == null ? queue.take() : pendingElement; // take() blocks until an element arrives
            boolean successful = process(obj);
            if (!successful) {
                pendingElement = obj;  // keep it and retry on the next iteration
            } else {
                pendingElement = null; // done, take the next element from the queue
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore interrupt status and stop
            return;
        }
    }
}
Calling peek() and checking if the value is null is not CPU efficient.
I have seen CPU usage going to 10% on my system when the queue is empty for the following program.
while (true) {
Object o = queue.peek();
if(o == null) continue;
// omitted for the sake of brevity
}
Adding sleep() adds slowness.
Adding it back to the queue using putLast will disturb the order. Moreover, it is a blocking operation which requires locks.
I have a method similar to the one below:
public void addSubjectsToCategory() {
final List<Subject> subjectsList = new ArrayList<>(getSubjectList());
for (final Iterator<Subject> subjectIterator =
subjectsList.iterator(); subjectIterator.hasNext();) {
addToCategory(subjectIterator.next().getId());
}
}
When this runs concurrently for the same user (another instance), sometimes it throws NoSuchElementException. As per my understanding, subjectIterator.next() sometimes gets executed when there are no elements left in the list. This occurs only when the method is accessed concurrently. Will method synchronization solve this issue?
The stack trace is:
java.util.NoSuchElementException: null
at java.util.ArrayList$Itr.next(Unknown Source)
at org.cmos.student.subject.category.CategoryManager.addSubjectsToCategory(CategoryManager.java:221)
This stack trace fails at the addToCategory(subjectIterator.next().getId()); line.
The basic rule of iterators is that underlying collection must not be modified while the iterator is being used.
If you have a single thread, there seems to be nothing wrong with this code, as long as getSubjectList() does not return null OR addToCategory() or getId() have some strange side effects that modify subjectsList. Note, however, that you could write the for-loop a bit more nicely (for (Subject subject : subjectsList) ...).
Judging by your code, my best guess is that you have another thread which is modifying subjectsList somewhere else. If this is the case, using a SynchronizedList will probably not solve your problem. As far as I know, synchronization only applies to List methods such as add(), remove() etc., and does not lock a collection during iteration.
In this case, adding synchronized to the method will not help either, because the other thread is doing its nasty stuff elsewhere. If these assumptions are true, your easiest and safest way is to make a separate synchronization object (i.e. Object lock = new Object()) and then put synchronized (lock) { ... } around this for loop as well as any other place in your program that modifies the collection. This will prevent the other thread from doing any modifications while this thread is iterating, and vice versa.
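A sketch of that idea, assuming the list returned by getSubjectList() is the shared one; the addSubject method is hypothetical, standing in for wherever the other thread modifies the list:
// one lock object shared by every piece of code that touches the subject list
private final Object lock = new Object();

public void addSubjectsToCategory() {
    synchronized (lock) {
        for (final Subject subject : getSubjectList()) {
            addToCategory(subject.getId());
        }
    }
}

// hypothetical example of the code elsewhere that modifies the list
public void addSubject(final Subject subject) {
    synchronized (lock) {
        getSubjectList().add(subject);
    }
}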
subjectIterator.hasNext();) {
--- Imagine a thread switch occurs here, between the call to hasNext() and the call to next().
addToCategory(subjectIterator.next().getId());
What could happen is the following, assuming you are at the last element in the list:
thread A calls hasNext(), the result is true;
thread switch occurs to thread B;
thread B calls hasNext(), the result is also true;
thread B calls next() and gets the next element from the list; now the list is empty because it was the last one;
thread switch occurs back to thread A;
thread A is already inside the body of the for loop, because this is where it was interrupted; it already called hasNext() earlier, which returned true;
so thread A calls next(), which fails now with an exception, because there are no more elements in the list.
So what you have to do in such situations, is to make the operations hasNext and next behave in an atomic way, without thread switches occurring in between.
A simple synchronization on the list solves, indeed, the problem:
public void addSubjectsToCategory() {
    final List<Subject> subjectsList = getSubjectList();
    synchronized (subjectsList) {
        for (final Iterator<Subject> subjectIterator =
                subjectsList.iterator(); subjectIterator.hasNext();) {
            addToCategory(subjectIterator.next().getId());
        }
    }
}
Note, however, that there may be performance implications with this approach. No other thread will be able to read or write from/to the same list until the iteration is over (but this is what you want). To solve this, you may want to move the synchronization inside the loop, just around hasNext and next. Or you may want to use more sophisticated synchronization mechanisms, such as read-write locks.
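If you go the read-write lock route, a sketch along these lines (the lock field and the modifySubjects method are illustrative, not from the question) lets several reading threads iterate at once while writers get exclusive access:
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

private final ReadWriteLock rwLock = new ReentrantReadWriteLock();

public void addSubjectsToCategory() {
    rwLock.readLock().lock(); // many readers may hold this at once
    try {
        for (final Subject subject : getSubjectList()) {
            addToCategory(subject.getId());
        }
    } finally {
        rwLock.readLock().unlock();
    }
}

// any code that modifies the list must take the write lock (illustrative)
public void modifySubjects(final Subject subject) {
    rwLock.writeLock().lock();
    try {
        getSubjectList().add(subject);
    } finally {
        rwLock.writeLock().unlock();
    }
}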
It sounds like one thread is calling the method and grabbing the last element while another thread is about to get the next one. So when the first thread finishes and control comes back to the paused thread, there is nothing left. I suggest using an ArrayBlockingQueue instead of a list. This will block threads when one is already iterating.
public void addSubjectsToCategory() {
    final List<Subject> source = getSubjectList();
    // ArrayBlockingQueue needs an explicit capacity, so size it to the source list
    final ArrayBlockingQueue<Subject> subjectsList =
            new ArrayBlockingQueue<>(Math.max(1, source.size()), true, source);
    for (final Iterator<Subject> subjectIterator =
            subjectsList.iterator(); subjectIterator.hasNext();) {
        addToCategory(subjectIterator.next().getId());
    }
}
There is a bit of a wrinkle that you may have to sort out. The ArrayBlockingQueue will block if it is empty or full and wait for a thread to either insert something or take something out, respectively, before it will unblock and allow other threads to access.
You can use Collections.synchronizedList(list) if all you need is simple per-invocation synchronization. But do note that any iteration over the list must happen inside a synchronized block on the list itself.
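A short sketch of that, reusing the names from the question (all code would have to go through this one wrapper and never touch the backing list directly):
// wrap the shared list once and share the wrapper everywhere
final List<Subject> subjectsList = Collections.synchronizedList(getSubjectList());

// individual calls (add, remove, ...) are now thread-safe, but iteration still has
// to hold the list's own lock, as the Collections.synchronizedList Javadoc requires
synchronized (subjectsList) {
    for (final Subject subject : subjectsList) {
        addToCategory(subject.getId());
    }
}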
As I understand it, you are adding elements to a list that might be being read at the same time.
Imagine the list is empty while your other thread is reading it. Situations like this lead to your problem: with this approach you can never be sure that an element has actually been written to the list you are trying to read.
I was surprised not to see an answer involving the use of a CopyOnWriteArrayList or Guava's ImmutableList so I thought that I would add such an answer here.
Firstly, if your use case is such that you only have a few additions relative to many reads, consider using the CopyOnWriteArrayList to solve the concurrent list traversal problem. Method synchronization could solve your issue, but CopyOnWriteArrayList will likely have better performance if the number of concurrent accesses "vastly" exceeds the number of writes, as per that class's Javadoc.
Secondly, if your use case is such that you can add everything to your list upfront in a single-threaded manner, and only then need to iterate over it concurrently, consider Guava's ImmutableList class. You accomplish this by first using a standard ArrayList or LinkedList, or a builder for your ImmutableList. Once your single-threaded data entry is complete, you instantiate your ImmutableList using either ImmutableList.copyOf() or the builder's build() method. If your use case allows for this write-then-read pattern, this will probably be your most performant option.
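A small sketch of that second option, using the names from the question and assuming Guava is on the classpath:
import com.google.common.collect.ImmutableList;

// single-threaded phase: collect everything in an ordinary list first, then freeze it
ImmutableList<Subject> subjects = ImmutableList.copyOf(getSubjectList());

// the frozen list can now be iterated from any number of threads without
// synchronization, because it can never change again
for (Subject subject : subjects) {
    addToCategory(subject.getId());
}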
Hope that helps.
I would like to make a suggestion that would probably solve your problem, considering that this is a concurrency issue.
If making the method addSubjectsToCategory() synchronized solves your problem, then you have located where your concurrency issue is. It is important to locate where the problem occurs; otherwise the information you provided is not enough for us to help you.
IF using synchronized in your method solves your problem, then consider this answer as educational or as a more elegant solution. Otherwise, share the code where you implement your threading environment, so we can have a look.
public synchronized void addSubjectsToCategory(List<Subject> subjectsList) {
    Iterator<Subject> iterator = subjectsList.iterator();
    while (iterator.hasNext())
        addToCategory(iterator.next().getId());
}
or
//This semaphore should be used by all threads. Be careful not to create a
//different semaphore each time.
public static Semaphore mutex = new Semaphore(1);
public void addSubjectsToCategory(List<Subject> subjectsList) throws InterruptedException {
    Iterator<Subject> iterator = subjectsList.iterator();
    mutex.acquire();
    try {
        while (iterator.hasNext())
            addToCategory(iterator.next().getId());
    } finally {
        mutex.release();
    }
}
Synchronized is clean, tidy and elegant. You have a really small method, and creating explicit locks is, imho, unnecessary.
Synchronized means that only 1 thread will be able to enter the method at a time. Which means, you should use it only if you want 1 thread active each time.
If you actually need parallel execution, then your problem is not thread-related, but has something to do with the rest of your code, which we can not see.
Hi, I have extremely simplified a Java problem in the following code snippet:
public class WhichJavaSynchroIsBestHere {
private BlockingQueue<CustomObject> queue = new PriorityBlockingQueue<CustomObject>();
public void add( CustomObject customO ) {
// The custom objects do never have the same id
// so it's no horizontal concurrency but more vertical one a la producer/consumer
if ( !queue.contains( customO ) ) {
// Between the two statement a remove can happen
queue.add( customO );
}
}
public void remove( CustomObject customO ) {
queue.remove( customO );
}
public static class CustomObject {
long id;
@Override
public boolean equals( Object obj ) {
if ( obj == null || getClass() != obj.getClass() )
return false;
CustomObject other = (CustomObject) obj;
return id == other.id;
}
}
}
So this is more of a producer/consumer problem, because presumably the two threads calling add never pass the same CustomObject (id); but it can happen that one thread calls add with the same object that a second thread is passing to remove. The code section between the if condition and the add is what seems not thread-safe to me. I was thinking about an object lock (not synchronized blocks) to secure that section, but is a ReadWriteLock better?
It would make no difference, as both sections would require a Write lock anyway.
The advantage of a ReadWriteLock is to easily allow multiple Readers that can work with shared access, and, yet, cooperate well with someone who requires exclusive access for Writing.
You could surround the contains code with a Read lock, and that would make sense if putting in potential duplicates is the bulk of your work. But if it's more a sanity check for a rare edge case, than a primary driver of your work (i.e. the test will pass the vast majority of the time), then there's no reason for the Read lock in this case. Just lock the whole section and be done with it.
The code section in between the if condition and the adding is what seems to me as not thread-safe
You are correct, it isn't thread-safe. Anytime you have multiple calls to an object, even a synchronized one, you need to worry about the state of the object changing between the calls if it is accessed by multiple threads. It's not just that an object might have been removed; a duplicate object could have been added by a different producer, causing duplicates in the queue. The race would be:
thread #1 tests to see if object A is in the queue, it's not
thread #2 tests to see if object A is in the queue, it's not
thread #1 adds object A to the queue
thread #2 adds object A to the queue
The queue would then have 2 copies of A in it.
i was thinking about the object Lock (no synchronized blocks) to secure that section
If you are shying away from synchronized because of perceived performance problems, then don't. This is a perfect example of where using synchronized is appropriate. You could get rid of the BlockingQueue if you are doing all operations inside of a synchronized block.
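A minimal sketch of that approach, assuming (as the original PriorityBlockingQueue already does) that CustomObject is Comparable; the blocking queue is replaced by a plain PriorityQueue guarded entirely by synchronized:
import java.util.PriorityQueue;
import java.util.Queue;

public class WhichJavaSynchroIsBestHere {
    // plain, non-thread-safe queue: every access goes through the synchronized methods below
    private final Queue<CustomObject> queue = new PriorityQueue<>();

    public synchronized void add(CustomObject customO) {
        // contains + add now execute atomically, so no removal (or duplicate add)
        // can happen between the check and the insert
        if (!queue.contains(customO)) {
            queue.add(customO);
        }
    }

    public synchronized void remove(CustomObject customO) {
        queue.remove(customO);
    }
}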
is a ReadWriteLock better ?
No, because in both cases the threads are "writing" to the queue. A removal modifies the queue as much as an add. The ReadWriteLock allows multiple reading threads if there are no writers, but gives exclusive access to a writing thread. Now, the testing of the queue is considered a read, but that's not going to save you very much unless a large percentage of the time the item being added is already in the queue.
Also, be very careful of queue.contains(customO). Most queues (this includes PriorityBlockingQueue) run through all items in the queue to look for the one you may be adding (O(N)). This can be very expensive depending on how many items are in the collection.
This feels to me to be a good place to use a ConcurrentSkipListSet. You just do a queue.add(), which internally does a put-if-absent. You can do a queue.pollFirst() to remove and get the first item. The collection then takes care of the memory synchronization and locking for you and resolves the race conditions.
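A rough sketch of that, assuming CustomObject is Comparable (ordered by id, say) or that a Comparator is supplied to the constructor:
import java.util.concurrent.ConcurrentSkipListSet;

class SkipListQueue {
    // elements must be Comparable or a Comparator must be passed to the constructor
    private final ConcurrentSkipListSet<CustomObject> queue = new ConcurrentSkipListSet<>();

    public void add(CustomObject customO) {
        queue.add(customO); // add-if-absent: a duplicate is simply not inserted
    }

    public CustomObject takeFirst() {
        return queue.pollFirst(); // removes and returns the smallest element, or null if empty
    }
}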
I have two threads which both need to access an ArrayList<short[]> instance variable.
One thread is going to asynchronously add short[] items to the list via a callback when new data has arrived : void dataChanged(short[] theData)
The other thread is going to periodically check if the list has items and if it does it is going to iterate over all the items, process them, and remove them from the array.
How can I set this up to guard for collisions between the two threads?
This contrived code example currently throws a java.util.ConcurrentModificationException
//instance variables
private ArrayList<short[]> list = new ArrayList<short[]>();
//asynchronous callback happening on the thread that adds the data to the list
void dataChanged(short[] theData) {
list.add(theData);
}
//thread that iterates over the list and processes the current data it contains
Thread thread = new Thread(new Runnable() {
@Override
public void run() {
while (true) {
for(short[] item : list) {
//process the data
}
//clear the list to discard data which has been processed
list.clear();
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
});
You might want to use a producer-consumer queue like an ArrayBlockingQueue, or a similar concurrent collection, instead.
The producer–consumer problem (also known as the bounded-buffer problem) is a classic example of a multi-process synchronization problem. The problem describes two processes, the producer and the consumer, who share a common, fixed-size buffer used as a queue. The producer's job is to generate a piece of data, put it into the buffer and start again. At the same time, the consumer is consuming the data (i.e., removing it from the buffer) one piece at a time. The problem is to make sure that the producer won't try to add data into the buffer if it's full and that the consumer won't try to remove data from an empty buffer.
One thread offers short[]s and the other take()s them.
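A hedged sketch of what that could look like with the dataChanged callback from the question (the class name and overall structure are illustrative):
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class SampleProcessor {
    private final BlockingQueue<short[]> queue = new LinkedBlockingQueue<>();

    // producer thread: the asynchronous callback from the question
    void dataChanged(short[] theData) {
        queue.offer(theData); // never blocks for an unbounded queue
    }

    // consumer thread
    Thread consumer = new Thread(new Runnable() {
        @Override
        public void run() {
            try {
                while (true) {
                    short[] item = queue.take(); // blocks until data arrives, so no sleep/poll loop
                    // process the data
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    });
}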
The easiest way is to change the type of list to a thread safe list implementation:
private List<short[]> list = new CopyOnWriteArrayList<short[]>();
Note that this type of list is not extremely efficient if you mutate it a lot (add/remove) - but if it works for you that's a simple solution.
If you need more efficiency, you can use a synchronized list instead:
private List<short[]> list = Collections.synchronizedList(new ArrayList<short[]>());
But you will need to synchronize for iterating:
synchronized(list) {
for(short[] item : list) {
//process the data
}
}
EDIT: proposals to use a BlockingQueue are probably better but would need more changes in your code.
You might look into a BlockingQueue for this instead of an ArrayList.
Take a look at Java's synchronization support.
This page covers making a group of statements synchronized on a specified object. That is: only one thread may execute any sections synchronized on that object at once, all others have to wait.
You can use synchronized blocks, but I think the best solution is to not share mutable data between threads at all.
Make each thread write in its own space, and collect and aggregate the results when the workers are finished.
http://docs.oracle.com/javase/7/docs/api/java/util/Collections.html#synchronizedList%28java.util.List%29
You can ask the Collections class to wrap up your current ArrayList in a synchronized list.
I have this situation:
a web application with circa 200 concurrent requests (threads) that need to log something to the local filesystem. I have one class to which all threads place their calls, and that class internally stores messages in one list (Vector or ArrayList), which in turn will be written to the filesystem.
The idea is to return from the thread's call ASAP so the thread can do its job as fast as possible; what the thread wanted to log can be written to the filesystem later, it is not so crucial.
So that class in turn removes the first element from the list and writes it to the filesystem, while at the same time 10 or 20 threads are appending new logs to the end of the list.
I would like to use ArrayList since it is not synchronized and therefore the threads' calls will take less time. The question is:
am I risking deadlocks / data loss? Is it better to use Vector since it is thread-safe? Is Vector slower?
Actually both ArrayList and Vector are very bad choices here, not because of synchronization (which you would definitely need), but because removing the first element is O(n).
The perfect data structure for your purpose is the ConcurrentLinkedQueue: it offers both thread safety (without using synchronization) and O(1) adding and removing.
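A rough sketch of what the logging class might look like with it (all names here are illustrative, not from your code):
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class AsyncLogger {
    private final Queue<String> messages = new ConcurrentLinkedQueue<>();

    // called by the ~200 request threads: lock-free O(1) append, returns immediately
    public void log(String message) {
        messages.offer(message);
    }

    // called periodically by the single writer thread
    void flushToDisk() {
        String msg;
        while ((msg = messages.poll()) != null) { // O(1) removal from the head, null when drained
            writeToFile(msg);
        }
    }

    private void writeToFile(String msg) {
        // actual filesystem write omitted
    }
}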
Are you limited to a particular (old) Java version? If not, please consider using java.util.concurrent.LinkedBlockingQueue for this kind of stuff. It's really worth looking at the java.util.concurrent.* package when dealing with concurrency.
Vector is worse than useless. Don't use it even when using multithreading. A trivial example of why it's bad: consider two threads iterating and removing elements from the list at the same time. The methods size(), get() and remove() might all be synchronized, but the iteration loop is not atomic, so - kaboom. One thread is bound to try removing something which is not there, or to skip elements because size() changes.
Instead use synchronized() blocks where you expect two threads to access the same data.
private ArrayList myList;
void removeElement(Object e)
{
synchronized (myList) {
myList.remove(e);
}
}
Java 5 provides explicit Lock objects which allow more fine-grained control, such as being able to time out if a resource is not available within some time period.
private final Lock lock = new ReentrantLock();
private ArrayList myList;

void removeElement(Object e) throws InterruptedException
{
    if (!lock.tryLock(1, TimeUnit.SECONDS)) {
        // Timeout
        throw new SomeException();
    }
    try {
        myList.remove(e);
    }
    finally {
        lock.unlock();
    }
}
There actually is a marginal difference in performance between a synchronizedList and a Vector (http://www.javacodegeeks.com/2010/08/java-best-practices-vector-arraylist.html).