Which concurrent Queue implementation should I use in Java?

From the JavaDocs:
A ConcurrentLinkedQueue is an appropriate choice when many threads will share access to a common collection. This queue does not permit null elements.
ArrayBlockingQueue is a classic "bounded buffer", in which a fixed-sized array holds elements inserted by producers and extracted by consumers. This class supports an optional fairness policy for ordering waiting producer and consumer threads
LinkedBlockingQueue: Linked queues typically have higher throughput than array-based queues but less predictable performance in most concurrent applications.
I have 2 scenarios, one requires the queue to support many producers (threads using it) with one consumer and the other is the other way around.
I do not understand which implementation to use. Can somebody explain what the differences are?
Also, what is the 'optional fairness policy' in the ArrayBlockingQueue?

ConcurrentLinkedQueue means no locks are taken (i.e. no synchronized(this) or Lock.lock calls). It will use a CAS - Compare and Swap operation during modifications to see if the head/tail node is still the same as when it started. If so, the operation succeeds. If the head/tail node is different, it will spin around and try again.
LinkedBlockingQueue takes a lock before any modification, so your offer calls block until they get the lock. You can use the offer overload that takes a timeout and a TimeUnit to say you are only willing to wait X amount of time for space to become available before abandoning the add (usually good for message-type queues where the message is stale after X number of milliseconds).
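To illustrate the timed offer mentioned above, here is a minimal sketch. The class name TimedOfferDemo, the helper offerWithTimeout and the capacity of 1 are made up for the example; only offer(E, long, TimeUnit) comes from the BlockingQueue API.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class TimedOfferDemo {
    /** Tries to enqueue, giving up after the timeout; returns whether the message was accepted. */
    static boolean offerWithTimeout(BlockingQueue<String> queue, String message, long millis) {
        try {
            return queue.offer(message, millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        // Capacity 1 (an assumption for the demo) so that the queue fills up immediately.
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(1);
        System.out.println(offerWithTimeout(queue, "first", 50));  // true: there is room
        System.out.println(offerWithTimeout(queue, "second", 50)); // false: still full after 50 ms
    }
}
```

With an unbounded LinkedBlockingQueue the timeout never comes into play, because there is always room for the new element.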
Fairness means that the Lock implementation keeps the waiting threads ordered: if Thread A enters and then Thread B enters, Thread A gets the lock first. Without fairness, what happens is not really defined; it will most likely be the next thread that gets scheduled.
As for which one to use, it depends. I tend to use ConcurrentLinkedQueue because the time it takes my producers to get work to put onto the queue is diverse. I don't have a lot of producers producing at the exact same moment. But the consumer side is more complicated because poll won't go into a nice sleep state. You have to handle that yourself.

Basically, the differences between them are performance characteristics and blocking behavior.
Taking the easiest first, ArrayBlockingQueue is a queue of a fixed size. So if you set the size at 10 and attempt to insert an 11th element, the insert (a put() call) will block until another thread removes an element. The fairness issue is what happens if multiple threads try to insert and remove at the same time (in other words, during the period when the queue was blocked). A fairness algorithm ensures that the first thread that asks is the first thread that gets. Otherwise, a given thread may wait longer than other threads, causing unpredictable behavior (sometimes one thread will take several seconds because other threads that started later got processed first). The trade-off is that managing the fairness takes overhead, which slows down the throughput.
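A small sketch of the fixed-size behavior, assuming a capacity of 10 as in the text above. BoundedBufferDemo and fillWithoutBlocking are names invented for the example. It uses offer(), which returns false on a full queue, so that the demo does not hang the way a put() on a full queue would.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BoundedBufferDemo {
    /** Fills a fair queue of the given capacity and reports how many non-blocking inserts were accepted. */
    static int fillWithoutBlocking(int capacity, int attempts) {
        // The second argument 'true' requests the fair (FIFO) ordering of waiting threads.
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity, true);
        int accepted = 0;
        for (int i = 0; i < attempts; i++) {
            if (queue.offer(i)) { // offer() returns false instead of blocking when the queue is full
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println(fillWithoutBlocking(10, 11)); // 10: the 11th element is rejected
    }
}
```

Replacing offer() with put() would make the 11th insert wait until some other thread calls take() or poll().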
The most important difference between LinkedBlockingQueue and ConcurrentLinkedQueue is that if you request an element from a LinkedBlockingQueue and the queue is empty, your thread will wait until there is something there. A ConcurrentLinkedQueue will return right away with the behavior of an empty queue.
Which one depends on if you need the blocking. Where you have many producers and one consumer, it sounds like it. On the other hand, where you have many consumers and only one producer, you may not need the blocking behavior, and may be happy to just have the consumers check if the queue is empty and move on if it is.
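For the many-producers, one-consumer case, a rough sketch of the blocking approach could look like this. The class name, the thread counts and the Integer payload are assumptions made for the example, not part of the question.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ManyProducersOneConsumer {
    /** Starts the producers, consumes everything on the calling thread, returns the number consumed. */
    static int run(int producers, int itemsPerProducer) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int p = 0; p < producers; p++) {
            new Thread(() -> {
                for (int i = 0; i < itemsPerProducer; i++) {
                    queue.add(i); // does not block here: this queue is unbounded
                }
            }).start();
        }
        int consumed = 0;
        try {
            while (consumed < producers * itemsPerProducer) {
                queue.take(); // sleeps until some producer has added an element
                consumed++;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop early if interrupted
        }
        return consumed;
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000));
    }
}
```

The consumer never spins on an empty queue; take() parks the thread until there is work.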

Your question title mentions Blocking Queues. However, ConcurrentLinkedQueue is not a blocking queue.
The BlockingQueues are ArrayBlockingQueue, DelayQueue, LinkedBlockingDeque, LinkedBlockingQueue, PriorityBlockingQueue, and SynchronousQueue.
Some of these are clearly not fit for your purpose (DelayQueue, PriorityBlockingQueue, and SynchronousQueue). LinkedBlockingQueue and LinkedBlockingDeque are identical, except that the latter is a double-ended Queue (it implements the Deque interface).
Since ArrayBlockingQueue is only useful if you want to limit the number of elements, I'd stick to LinkedBlockingQueue.

ArrayBlockingQueue has a lower memory footprint: it reuses the slots of its backing array, unlike LinkedBlockingQueue, which has to create a LinkedBlockingQueue$Node object for each new insertion.

1. SynchronousQueue (taken from another question)
SynchronousQueue is more of a handoff, whereas a LinkedBlockingQueue of capacity 1 just allows a single element. The difference is that the put() call to a SynchronousQueue will not return until there is a corresponding take() call, but with a LinkedBlockingQueue of size 1, the put() call (to an empty queue) will return immediately. It's essentially the BlockingQueue implementation for when you don't really want a queue (you don't want to maintain any pending data).
2. LinkedBlockingQueue (a linked-list implementation, but not the JDK's LinkedList; it uses a static inner class Node to maintain the links between elements)
Constructor for LinkedBlockingQueue:
public LinkedBlockingQueue(int capacity) {
    if (capacity <= 0) throw new IllegalArgumentException();
    this.capacity = capacity;
    last = head = new Node<E>(null); // maintains an underlying linked list (use when the size is not known)
}
Node class used to maintain the links:
static class Node<E> {
    E item;
    Node<E> next;
    Node(E x) { item = x; }
}
3. ArrayBlockingQueue (array implementation)
Constructor for ArrayBlockingQueue:
public ArrayBlockingQueue(int capacity, boolean fair) {
    if (capacity <= 0)
        throw new IllegalArgumentException();
    this.items = new Object[capacity]; // maintains an underlying array
    lock = new ReentrantLock(fair);
    notEmpty = lock.newCondition();
    notFull = lock.newCondition();
}
IMHO the biggest difference between ArrayBlockingQueue and LinkedBlockingQueue is clear from the constructors: one has an array as its underlying data structure and the other a linked list.
ArrayBlockingQueue uses a single-lock, double-condition algorithm, whereas LinkedBlockingQueue is a variant of the "two lock queue" algorithm and has 2 locks and 2 conditions (takeLock, putLock).
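The handoff behavior of SynchronousQueue described at the start of this answer can be seen with a non-blocking offer(). This is only a sketch; HandoffDemo and offerWithNoConsumer are hypothetical names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

class HandoffDemo {
    /** Non-blocking insert while no consumer is waiting; returns whether the element was accepted. */
    static boolean offerWithNoConsumer(BlockingQueue<String> queue) {
        return queue.offer("message");
    }

    public static void main(String[] args) {
        // A SynchronousQueue has no storage, so with nobody in take() the offer is refused.
        System.out.println(offerWithNoConsumer(new SynchronousQueue<String>()));     // false
        // A LinkedBlockingQueue of capacity 1 can hold the element until someone asks for it.
        System.out.println(offerWithNoConsumer(new LinkedBlockingQueue<String>(1))); // true
    }
}
```

Using put() instead of offer() on the SynchronousQueue would block the caller until another thread calls take().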

ConcurrentLinkedQueue is lock-free; LinkedBlockingQueue is not. Every time you invoke LinkedBlockingQueue.put() or LinkedBlockingQueue.take(), you need to acquire a lock first. In other words, LinkedBlockingQueue has poor concurrency. If you care about performance, try ConcurrentLinkedQueue + LockSupport.
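One way the ConcurrentLinkedQueue + LockSupport combination might look, as a sketch only and assuming a single consumer thread (the class name and the counts are invented for the example). Producers unpark the consumer after every insert; because unpark() leaves a permit behind, a wake-up issued just before the consumer parks is not lost.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.LockSupport;

class ParkingConsumerDemo {
    /** Producers insert lock-free; the calling thread is the single consumer. Returns the number consumed. */
    static int run(int producers, int itemsPerProducer) {
        Queue<Integer> queue = new ConcurrentLinkedQueue<>();
        Thread consumer = Thread.currentThread();
        for (int p = 0; p < producers; p++) {
            new Thread(() -> {
                for (int i = 0; i < itemsPerProducer; i++) {
                    queue.offer(i);               // lock-free insert
                    LockSupport.unpark(consumer); // wake the consumer, or leave a permit if it is not parked
                }
            }).start();
        }
        int expected = producers * itemsPerProducer;
        int consumed = 0;
        while (consumed < expected) {
            if (queue.poll() != null) {
                consumed++;
            } else {
                LockSupport.park(); // may return spuriously, so the loop re-checks the queue
            }
        }
        return consumed;
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000));
    }
}
```

Whether this actually beats LinkedBlockingQueue depends on the workload, so it is worth measuring before committing to it.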

Related

What's the purpose of the PriorityBlockingQueue?

I've been playing with blocking queues and PriorityQueue, and it got me thinking. I can't see a good usecase for PriorityBlockingQueue. The point of a priority queue is to sort the values put into it before they're retrieved. A blocking queue implies that values are inserted into it and retrieved from it concurrently. But, if that's the case, you'd never be able to guarantee the sort order.
BlockingQueue<Integer> q = new PriorityBlockingQueue<>();
new Thread (()->{ randomSleep(); q.put(2); randomSleep(); q.put(0); }).start();
new Thread (()->{ randomSleep(); q.put(3); randomSleep(); q.put(1); }).start();
ArrayList<Integer> ordered = new ArrayList<>(4);
for (int i = 0; i < 4; i++) {
randomSleep();
ordered.add(q.take());
}
System.out.println(ordered);
In this example, the order in which the main thread gets the offered values is quite random, which seems to defeat the purpose of a priority queue. Even with a single producer and single consumer, the order can not be ensured.
So, what is the use of PriorityBlockingQueue then?
"In this example, the order in which the main thread gets the offered values is quite random"
Well, you have a race condition between the insertion and the retrieval of those elements. Hence the reason why it looks random.
Nonetheless, you could use a PriorityBlockingQueue, for instance, to sequentially fill up a queue with elements (or tasks) that need to be picked up by multiple threads in parallel, highest-priority element/task first. In such a case you can take advantage of the thread-safe properties of a structure that guarantees that the highest-priority element is always ordered first.
One example would be a Queue of tasks in which those tasks have a priority, and you want those same tasks to be processed in parallel.
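A minimal sketch of that idea, with the four values from the question loaded before anything is taken out. PriorityDrainDemo and drainInPriorityOrder are hypothetical names, and a single thread does the draining to keep the output deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;

class PriorityDrainDemo {
    /** Loads all values first, then removes them one by one; the head is always the smallest remaining value. */
    static List<Integer> drainInPriorityOrder(List<Integer> values) {
        PriorityBlockingQueue<Integer> queue = new PriorityBlockingQueue<>(values);
        List<Integer> out = new ArrayList<>();
        Integer next;
        while ((next = queue.poll()) != null) { // poll() returns null once the queue is empty
            out.add(next);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drainInPriorityOrder(Arrays.asList(2, 0, 3, 1))); // [0, 1, 2, 3]
    }
}
```

With several worker threads calling take() instead, each worker would still receive the highest-priority element present at the moment of its call.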

Can objects get lost if a LinkedList is add/remove fast by lots of threads?

This may sound like a silly question; I have just started with Java concurrency.
I have a LinkedList that acts as a task queue and is accessed by multiple threads. Some threads removeFirst() a task and execute it; other threads put more tasks (.add()). Tasks can have the thread put them back onto the queue.
I notice that when there are a lot of tasks and they are put back onto the queue a lot, the number of tasks that come out does not match the number I added initially: 1, or sometimes 2, are missing.
I checked everything and I synchronized every critical section + notifyAll().
I have already marked the LinkedList as 'volatile'.
The exact numbers are 384 tasks, each put back 3072 times.
The problem doesn't occur if there is a small number of tasks and put-backs. Also, if I System.out.println() all the steps then it doesn't happen anymore, so I can't debug it.
Could it be possible that LinkedList.add() is not fast enough so the threads somehow miss it?
Simplified code:
public void callByAllThreads() {
    Task executedTask = null;
    do {
        // accessed by multiple threads
        synchronized (asyncQueue) {
            executedTask = asyncQueue.poll();
            if (executedTask == null) {
                inProcessCount.incrementAndGet(); // mark that there is some processing going on
            }
        }
        if (executedTask != null) {
            executedTask.callMethod(); // subclass of task can override this method
            synchronized (asyncQueue) {
                inProcessCount.decrementAndGet();
                asyncQueue.notifyAll();
            }
        }
    } while (executedTask != null);
}
The Task can override callMethod:
public void callMethodOverride() {
    synchronized (getAsyncQueue()) {
        getAsyncQueue().add(this);
        getAsyncQueue().notifyAll();
    }
}
From the docs for LinkedList:
Note that this implementation is not synchronized. If multiple threads access a linked list concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally.
i.e. you should synchronize access to the list. You say you are, but if you are seeing items get "lost" then you probably aren't synchronizing properly. Instead of trying to do that, you could use a framework class that does it for you ...
... If you are always removing the next available (first) item (effectively a producer/consumer implementation) then you could use a BlockingQueue implementation. This is guaranteed to be thread safe, and has the advantage of blocking the consumer until an item is available. An example is the ArrayBlockingQueue.
For non-blocking thread-safe queues you can look at ConcurrentLinkedQueue.
Marking the list instance variable volatile has nothing to do with your list being synchronized for mutation methods like add or removeFirst. volatile is simply to do with ensuring that read/write for that instance variable is communicated correctly between, and ordered correctly within, threads. Note I said that variable, not the contents of that variable (see the Java Tutorials > Atomic Access)
LinkedList is definitely not thread safe; you cannot use it safely with multiple threads. It's not a question of "fast enough," it's a question of changes made by one thread being visible to other threads. Marking it volatile doesn't help; that only affects references to the LinkedList being changed, not changes to the contents of the LinkedList.
Consider ConcurrentLinkedQueue or ConcurrentLinkedDeque.
LinkedList is not thread safe, so yes, multiple threads accessing it simultaneously will lead to problems. Synchronizing critical sections can solve this, but as you are still having problems you probably made a mistake somewhere. Try wrapping it in a Collections.synchronizedList() to synchronize all method calls.
LinkedList is not thread safe. You can use ConcurrentLinkedQueue if it fits your needs, which seems likely.
As the documentation says:
An unbounded thread-safe queue based on linked nodes. This queue
orders elements FIFO (first-in-first-out). The head of the queue is
that element that has been on the queue the longest time. The tail of
the queue is that element that has been on the queue the shortest
time. New elements are inserted at the tail of the queue, and the
queue retrieval operations obtain elements at the head of the queue. A
ConcurrentLinkedQueue is an appropriate choice when many threads will
share access to a common collection. This queue does not permit null
elements.
You increment your inProcessCount when executedTask == null which is obviously the opposite of what you want to do. So it’s no wonder that it will have inconsistent values.
But there are other issues as well. You call notifyAll() at several places but as long as there is no one calling wait() that has no use.
Note further that if you access an integer variable consistently from inside synchronized blocks only throughout the code, there is no need to make it an AtomicInteger. On the other hand, if you use it, e.g. because it will be accessed at other places without additional synchronization, you can move the code updating the AtomicInteger outside the synchronized block.
Also, a method which calls a method like getAsyncQueue() three times looks suspicious to a reader. Just call it once and remember the result in a local variable, then everyone can be confident that it is the same reference on all three uses. Generally, you have to ensure that all code is using the same list, hence the appropriate modifier for the variable holding it is final, not volatile.
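As a rough sketch of how the task queue from the question could be built on a thread-safe queue instead of a hand-synchronized LinkedList: the Task class, its runOnce() method and the unfinished counter below are all hypothetical stand-ins, not the asker's real code.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class RequeueDemo {
    /** Hypothetical task that asks to be put back onto the queue a fixed number of times. */
    static final class Task {
        private int remainingRequeues;
        Task(int requeues) { this.remainingRequeues = requeues; }
        /** Returns true if the task wants to be put back onto the queue. */
        boolean runOnce() { return remainingRequeues-- > 0; }
    }

    /** Runs all tasks on the given number of workers and returns the total number of executions. */
    static int run(int tasks, int requeues, int workers) {
        final BlockingQueue<Task> queue = new LinkedBlockingQueue<>(); // final, not volatile
        AtomicInteger unfinished = new AtomicInteger(tasks);           // tasks that are queued or running
        AtomicInteger executions = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            queue.add(new Task(requeues));
        }
        Thread[] pool = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            pool[w] = new Thread(() -> {
                try {
                    while (unfinished.get() > 0) {
                        Task task = queue.poll(10, TimeUnit.MILLISECONDS); // brief wait, then re-check
                        if (task == null) continue;
                        executions.incrementAndGet();
                        if (task.runOnce()) {
                            queue.add(task);              // put back; the queue does the synchronization
                        } else {
                            unfinished.decrementAndGet(); // this task is done for good
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool[w].start();
        }
        for (Thread t : pool) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return executions.get();
    }

    public static void main(String[] args) {
        System.out.println(run(8, 100, 4)); // each task runs 101 times
    }
}
```

No synchronized blocks, wait() or notifyAll() are needed here; the queue hands each task to exactly one worker at a time.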

Synchronized collections list

I have 2 threads needing access to a Queue, one for putting and one for getting.
So I have an initiation
public static Queue<WorldData> blockDestructionQueue = Collections.synchronizedList(new LinkedList<WorldData>());
With the above I get a Type mismatch: cannot convert from List to Queue
I tried casting it to a Queue but this did not work.
public static Queue<WorldData> blockDestructionQueue = (Queue<WorldData>)Collections.synchronizedList(new LinkedList<WorldData>());
I was wondering as to why this is not working.
I got this information from another stack overflow answer.
How to use ConcurrentLinkedQueue?
In the correct answer paragraph 6
If you only have one thread putting stuff into the queue, and another
thread taking stuff out of the queue, ConcurrentLinkingQueue is
probably overkill. It's more for when you may have hundreds or even
thousands of threads accessing the queue at the same time. Your needs
will probably be met by using:
Queue<YourObject> queue = Collections.synchronizedList(new LinkedList<YourObject>());
A plus of this is that it locks on the instance (queue), so you can
synchronize on queue to ensure atomicity of composite operations (as
explained by Jared). You CANNOT do this with a ConcurrentLinkingQueue,
as all operations are done WITHOUT locking on the instance (using
java.util.concurrent.atomic variables). You will NOT need to do this
if you want to block while the queue is empty, because poll() will
simply return null while the queue is empty, and poll() is atomic.
Check to see if poll() returns null. If it does, wait(), then try
again. No need to lock.
Additional Information:
edit: Eclipse was trying to be too helpful and decided to add a break point exception where it was not needed and was not asked to put one.
A queue is not a list and a Queue is not an implementation of List, although you can implement a queue with a list.
Have a look at BlockingQueue it is probably a better fit for what you need:
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/BlockingQueue.html
Collections.synchronizedList returns an instance of SynchronizedList which does not extend Queue. LinkedList is a Queue but that's not what you're using at that point.
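A short sketch that shows the type problem at runtime and one possible replacement. QueueTypeDemo and its two methods are names made up for the example, and String stands in for WorldData.

```java
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueTypeDemo {
    /** Checks whether the synchronized wrapper around a LinkedList still is a Queue. */
    static boolean synchronizedListIsQueue() {
        List<String> wrapped = Collections.synchronizedList(new LinkedList<String>());
        return wrapped instanceof Queue; // the wrapper only implements List
    }

    /** One thread-safe alternative that really is a Queue. */
    static Queue<String> newThreadSafeQueue() {
        return new LinkedBlockingQueue<>();
    }

    public static void main(String[] args) {
        System.out.println(synchronizedListIsQueue()); // false, which is why the cast fails
        Queue<String> queue = newThreadSafeQueue();
        queue.offer("block");
        System.out.println(queue.poll()); // block
    }
}
```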

Using ConcurrentLinkedQueue with some non atomic operations

When we use one of the inbuilt queues like ConcurrentLinkedQueue or even some BlockingQueue, single calls are atomic and guaranteed to be thread safe.
But suppose that, of the 5 calls to the API, 4 are single calls and one is of the form:
if (some condition)
{
    queue.call();
}
This call needs to be in a synchronized block, since this operation is not atomic.
But doesn't introducing this call also mean that all access to this queue, whether read or write, should be synchronized from now on?
If yes, can I assume that once a single non-atomic call creeps into the code, which is very likely, then all access to the fancy queue will have to be manually synchronized?
ConcurrentLinkedQueue doesn't make quite the same atomic guarantees that you assume. From the javadoc:
Memory consistency effects: As with other concurrent collections,
actions in a thread prior to placing an object into a
ConcurrentLinkedQueue happen-before actions subsequent to the access
or removal of that element from the ConcurrentLinkedQueue in another
thread.
It's not the same as wrapping a LinkedList or something in a Collections.synchronizedList; different threads might see different answers to size(), for example, because it doesn't lock the collection.
Based on your comment you can probably replace the if statement with a single call to Queue's poll and check if the retrieved element is null.
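A sketch of the poll-and-check-for-null idea, assuming the condition in the question was something like "the queue is not empty". The class name and the counts are invented. Several threads drain the same queue and each element is handed to exactly one of them, without any synchronized block.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

class PollInsteadOfCheckDemo {
    /** Several threads drain one queue using only poll(); returns how many elements were processed. */
    static int drainWithThreads(int elements, int threads) {
        Queue<Integer> queue = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < elements; i++) {
            queue.add(i);
        }
        AtomicInteger processed = new AtomicInteger();
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            pool[t] = new Thread(() -> {
                Integer item;
                // One atomic call replaces "if (!queue.isEmpty()) queue.remove()".
                while ((item = queue.poll()) != null) {
                    processed.incrementAndGet();
                }
            });
            pool[t].start();
        }
        for (Thread thread : pool) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(drainWithThreads(1000, 4));
    }
}
```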

Workings of AtomicReferenceArray

I am wondering if AtomicReferenceArray can be used as a replacement for ConcurrentLinkedQueue (if one could live with a bounded structure).
I currently have something like:
ConcurrentLinkedQueue<Object[]> queue = new ConcurrentLinkedQueue<Object[]>();
public void store(Price price, Instrument instrument, Object[] formats){
Object[] elements = {price, instrument, formats};
queue.offer( elements);
}
The store(..) is called by multiple threads.
I also have a consumer thread, which periodically wakes up and consumes the elements.
private class Consumer implements Runnable {
    @Override
    public void run() {
        List<Object[]> holder = drain( queue );
        for (Object[] elements : holder) {
            for (Object e : elements) {
                // process ...
            }
        }
    }
    private final List<Object[]> drain(){
        //...
    }
}
Can I swap out ConcurrentLinkedQueue in favor of AtomicReferenceArray and still maintain thread safety aspect?
Specifically, atomically storing the elements and establishing a "happens before" relationship so the consumer thread sees all the elements stored by different threads?
I tried reading the source code for AtomicReferenceArray but still not absolutely sure.
Cheers
An AtomicReferenceArray can be used as a lock-free single consumer / multiple producer ring buffer. I was experimenting with an implementation a few months ago and have a working prototype. The advantage is a reduction in garbage creation, better cache locality, and better performance when not full due to being simpler. The disadvantages are a lack of strict fifo semantics and poor performance when the buffer is full as a producer must wait for a drain to occur. This might be mitigated by falling back to a ConcurrentLinkedQueue to avoid stalls.
The happens-before edge must be seen by producers so that they acquire a unique slot. However as only a single consumer is required, this can be delayed until the draining is complete. In my usage the drain is amortized across threads, so the consumer is chosen by the successful acquisition of a try-lock. The release of that lock provides the edge, allowing the array updates to use lazy sets within the critical section.
I would only use this approach in specialized scenarios where performance is highly tuned. In my usage it makes sense as an internal implementation detail for a cache. I wouldn't use it in general, though.
I am wondering if AtomicReferenceArray can be used as a replacement for ConcurrentLinkedQueue (if one could live with a bounded structure).
Basically, no.
While the individual updates to the array elements happen atomically, that is not sufficient for a queue. You also need to keep track of where the head and tail of the queue are, and you can't do that atomically with the queue cell read / write unless there's some additional locking / synchronization going on.
(This is independent of the inner workings of AtomicReferenceArray. The problem is simply that the API won't provide you with the functionality that you require to update two things in a single atomic operation.)
