Lock-free and size-restricted queue in Java

I'm trying to extend an implementation of a lock-free queue in Java according to this posting.
For my implementation I am restricted to using only atomic variables/references.
The addition is that my queue should have a maximum size.
Therefore putObject() should block when the queue is full and getObject() should block when the queue is empty.
At the moment I don't know how to solve this without using locks.
When using an AtomicInteger for the size, for example, the individual modifications would be atomic.
But there is still the problem that I must handle a check-then-modify situation in putObject() and getObject(), right?
So an enqueuing thread can still be interrupted right after checking the current queue size.
My question is whether this problem is solvable at all under these restrictions.
Greets

If you have a viable, correctly working lock-free queue, adding a maximum size can be as easy as just adding an AtomicInteger, and doing checks, inc, dec at the right time.
When adding an element, you basically pre-reserve the place in the queue.
Something like:
while (true) {
    int curr = count.get();
    if (curr < MAX) {
        if (count.compareAndSet(curr, curr + 1)) {
            break;
        }
    } else {
        return FULL;
    }
}
Then you add the new element and link it in.
When getting, you can just access the head as usual and check if there is anything at all to return from the queue. If yes, you return it and then decrease the counter. If not, you just return EMPTY. Notice that I'm not using the counter to check if the queue is really empty, since the counter could be 1, while there is nothing linked into the queue yet, because of the pre-reserve approach. So I just trust your queue has a way to tell you "I have something in me" or not (it must, or get() would never work).
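To make the pre-reserve idea concrete, here is a minimal sketch that wraps an existing lock-free queue (ConcurrentLinkedQueue standing in for the posting's queue); the class name and MAX are assumptions, and FULL/EMPTY are modelled as a boolean/null return instead of blocking:

import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedLockFreeQueue<E> {
    private static final int MAX = 1024;                  // assumed capacity
    private final ConcurrentLinkedQueue<E> queue = new ConcurrentLinkedQueue<>();
    private final AtomicInteger count = new AtomicInteger();

    public boolean putObject(E e) {                       // false means FULL
        while (true) {
            int curr = count.get();
            if (curr >= MAX) {
                return false;                             // full, no slot available
            }
            if (count.compareAndSet(curr, curr + 1)) {
                break;                                    // slot reserved
            }
        }
        queue.add(e);                                     // now link the element in
        return true;
    }

    public E getObject() {                                // null means EMPTY
        E item = queue.poll();                            // trust the queue to say "empty"
        if (item == null) {
            return null;                                  // counter may be 1 while nothing is linked in yet
        }
        count.decrementAndGet();                          // release the reserved slot
        return item;
    }
}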

It is a very common problem which is usually solved by using a ring buffer. This is what network adapters use, and it is also what the Disruptor library is built on. I suggest you have a look at Disruptor as a good example of what you can do with ring buffers.
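For a rough illustration of the ring-buffer idea itself (not the Disruptor API), here is a minimal sketch of a bounded single-producer/single-consumer ring buffer built on AtomicLong indices; the class name, the power-of-two capacity requirement, and the single-producer/single-consumer restriction are all assumptions of this sketch:

import java.util.concurrent.atomic.AtomicLong;

final class SpscRingBuffer<E> {
    private final Object[] slots;
    private final int mask;                               // capacity must be a power of two
    private final AtomicLong head = new AtomicLong();     // next slot to read (consumer only)
    private final AtomicLong tail = new AtomicLong();     // next slot to write (producer only)

    SpscRingBuffer(int capacityPowerOfTwo) {
        slots = new Object[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    boolean offer(E e) {                                  // returns false when full
        long t = tail.get();
        if (t - head.get() == slots.length) {
            return false;                                 // buffer is full
        }
        slots[(int) (t & mask)] = e;
        tail.set(t + 1);                                  // volatile write publishes the element
        return true;
    }

    @SuppressWarnings("unchecked")
    E poll() {                                            // returns null when empty
        long h = head.get();
        if (h == tail.get()) {
            return null;                                  // buffer is empty
        }
        E e = (E) slots[(int) (h & mask)];
        slots[(int) (h & mask)] = null;                   // allow the element to be collected
        head.set(h + 1);                                  // volatile write frees the slot
        return e;
    }
}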

Related

What is the proper way to wait (block) until a LinkedBlockingQueue is nonempty, without mutating it? [duplicate]

I have a blocking queue of objects.
I want to write a thread that blocks until there is an object on the queue, similar to the functionality provided by BlockingQueue.take().
However, since I do not know if I will be able to process the object successfully, I want to just peek() and not remove the object. I want to remove the object only if I am able to process it successfully.
So, I would like a blocking peek() function. Currently, peek() just returns null immediately if the queue is empty, as per the javadocs.
Am I missing something? Is there another way to achieve this functionality?
EDIT:
Any thoughts on whether I could just use a thread-safe queue and peek and sleep instead?
public void run() {
    while (!exit) {
        while (queue.size() != 0) {
            Object o = queue.peek();
            if (o != null) {
                if (consume(o)) {
                    queue.remove();
                } else {
                    Thread.sleep(10000); // need to backoff (60s) and try again
                }
            }
        }
        Thread.sleep(1000); // wait 1s for object on queue
    }
}
Note that I only have one consumer thread and one (separate) producer thread. I guess this isn't as efficient as using a BlockingQueue... Any comments appreciated.
You could use a LinkedBlockingDeque and physically remove the item from the queue (using takeLast()) but replace it again at the end of the queue if processing fails using putLast(E e). Meanwhile your "producers" would add elements to the front of the queue using putFirst(E e).
You could always encapsulate this behaviour within your own Queue implementation and provide a blockingPeek() method that performs takeLast() followed by putLast() behind the scenes on the underlying LinkedBlockingDeque. Hence from the calling client's perspective the element is never removed from your queue.
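A minimal sketch of that wrapper, assuming the class and method names are illustrative only:

import java.util.concurrent.LinkedBlockingDeque;

class PeekableQueue<E> {
    private final LinkedBlockingDeque<E> deque = new LinkedBlockingDeque<>();

    void put(E e) throws InterruptedException {
        deque.putFirst(e);                 // producers add at the front
    }

    E blockingPeek() throws InterruptedException {
        E e = deque.takeLast();            // blocks until an element exists
        deque.putLast(e);                  // immediately restore it at the tail
        return e;                          // from the caller's view, nothing was removed
    }

    E take() throws InterruptedException {
        return deque.takeLast();           // consume for real after successful processing
    }
}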
However, since I do not know if I will be able to process the object successfully, I want to just peek() and not remove the object. I want to remove the object only if I am able to process it successfully.
In general, it is not thread-safe. What if, after you peek() and determine that the object can be processed successfully, but before you take() it to remove and process, another thread takes that object?
Could you also just add an event listener queue to your blocking queue, then when something is added to the (blocking)queue, send an event off to your listeners? You could have your thread block until it's actionPerformed method was called.
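As a heavily hedged sketch of that listener idea (all of the names here are invented, and the wrapped queue is assumed to be a ConcurrentLinkedQueue):

import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;

class NotifyingQueue<E> {
    interface AddListener { void elementAdded(); }

    private final Queue<E> delegate = new ConcurrentLinkedQueue<>();
    private final List<AddListener> listeners = new CopyOnWriteArrayList<>();

    void addListener(AddListener l) { listeners.add(l); }

    void add(E e) {
        delegate.add(e);
        for (AddListener l : listeners) {
            l.elementAdded();              // wake up anyone waiting for elements
        }
    }

    E peek() { return delegate.peek(); }
    E poll() { return delegate.poll(); }
}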
The only thing I'm aware of that does this is BlockingBuffer in Apache Commons Collections:
If either get or remove is called on an empty Buffer, the calling thread waits for notification that an add or addAll operation has completed.
get() is equivalent to peek(), and a Buffer can be made to act like a BlockingQueue by decorating an UnboundedFifoBuffer with a BlockingBuffer.
The quick answer is: no, there's not really a way to have a blocking peek, short of implementing a blocking queue with a blocking peek() yourself.
Am I missing something?
peek() can be troublesome with concurrency -
If you can't process your peek()'d message - it'll be left in the queue, unless you have multiple consumers.
Who is going to get that object out of the queue if you can't process it ?
If you have multiple consumers, you get a race condition between you peek()'ing and another thread also processing items, resulting in duplicate processing or worse.
Sounds like you might be better off actually removing the item and processing it using a Chain-of-responsibility pattern.
Edit: re: your last example: If you have only 1 consumer, you will never get rid of the object on the queue - unless it's updated in the mean time - in which case you'd better be very very careful about thread safety and probably shouldn't have put the item in the queue anyway.
Not an answer per se, but: JDK-6653412 claims this is not a valid use case.
Looks like BlockingQueue itself doesn't have the functionality you're specifying.
I might try to re-frame the problem a little though: what would you do with objects you can't "process correctly"? If you're just leaving them in the queue, you'll have to pull them out at some point and deal with them. I'd recommend either figuring out how to process them (commonly, if a queue.get() gives any sort of invalid or bad value, you're probably OK to just drop it on the floor) or choosing a different data structure than a FIFO.
The 'simplest' solution
Do not process the next element until the previous element is processed successfully.
public void run() {
    Object unprocessedElement = null; // holds an element whose processing failed
    while (!exit) {
        Object obj = unprocessedElement == null ? queue.take() : unprocessedElement; // take() blocks when empty
        boolean successful = process(obj);
        if (!successful) {
            unprocessedElement = obj; // retry the same element on the next iteration
        } else {
            unprocessedElement = null;
        }
    }
}
Calling peek() and checking if the value is null is not CPU efficient.
I have seen CPU usage going to 10% on my system when the queue is empty for the following program.
while (true) {
    Object o = queue.peek();
    if (o == null) continue;
    // omitted for the sake of brevity
}
Adding sleep() adds slowness.
Adding it back to the queue using putLast will disturb the order. Moreover, it is a blocking operation which requires locks.

Can objects get lost if a LinkedList is add/remove fast by lots of threads?

This may sound like a silly question. I just started with Java concurrency.
I have a LinkedList that acts as a task queue and is accessed by multiple threads. They removeFirst() a task and execute it, while other threads put in more tasks (add()). Tasks can have the executing thread put them back onto the queue.
I notice that when there are a lot of tasks and they are put back a lot, the number of tasks that comes out is not the number I initially added; one, or sometimes two, are missing.
I checked everything and I synchronized every critical section + notifyAll().
I already marked the LinkedList as volatile.
The exact numbers are 384 tasks, each put back 3072 times.
The problem doesn't occur with a small number of tasks and put-backs. Also, if I System.out.println() all the steps it doesn't happen anymore, so I can't debug it.
Could it be possible that LinkedList.add() is not fast enough so the threads somehow miss it?
Simplified code:
public void callByAllThreads() {
    Task executedTask = null;
    do {
        // access by multiple threads
        synchronized (asyncQueue) {
            executedTask = asyncQueue.poll();
            if (executedTask == null) {
                inProcessCount.incrementAndGet(); // mark that there is some processing going on
            }
        }
        if (executedTask != null) {
            executedTask.callMethod(); // subclass of Task can override this method
            synchronized (asyncQueue) {
                inProcessCount.decrementAndGet();
                asyncQueue.notifyAll();
            }
        }
    } while (executedTask != null);
}
The Task can override callMethod:
public void callMethodOverride() {
    synchronized (getAsyncQueue()) {
        getAsyncQueue().add(this);
        getAsyncQueue().notifyAll();
    }
}
From the docs for LinkedList:
Note that this implementation is not synchronized. If multiple threads access a linked list concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally.
i.e. you should synchronize access to the list. You say you are, but if you are seeing items get "lost" then you probably aren't synchronizing properly. Instead of trying to do that, you could use a framework class that does it for you ...
... If you are always removing the next available (first) item (effectively a producer/consumer implementation) then you could use a BlockingQueue implementation. This is guaranteed to be thread-safe and has the advantage of blocking the consumer until an item is available. An example is ArrayBlockingQueue.
For non-blocking thread-safe queues you can look at ConcurrentLinkedQueue
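For instance, a minimal producer/consumer sketch around ArrayBlockingQueue (the bound of 1024 and the use of Runnable as the task type are assumptions):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class TaskQueueExample {
    private static final BlockingQueue<Runnable> TASKS = new ArrayBlockingQueue<>(1024);

    public static void main(String[] args) {
        Thread producer = new Thread(() -> {
            try {
                TASKS.put(() -> System.out.println("task ran")); // blocks while the queue is full
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                TASKS.take().run();                              // blocks until a task is available
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        consumer.start();
    }
}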
Marking the list instance variable volatile has nothing to do with your list being synchronized for mutation methods like add or removeFirst. volatile is simply to do with ensuring that read/write for that instance variable is communicated correctly between, and ordered correctly within, threads. Note I said that variable, not the contents of that variable (see the Java Tutorials > Atomic Access)
LinkedList is definitely not thread safe; you cannot use it safely with multiple threads. It's not a question of "fast enough," it's a question of changes made by one thread being visible to other threads. Marking it volatile doesn't help; that only affects references to the LinkedList being changed, not changes to the contents of the LinkedList.
Consider ConcurrentLinkedQueue or ConcurrentLinkedDeque.
LinkedList is not thread safe, so yes, multiple threads accessing it simultaneously will lead to problems. Synchronizing critical sections can solve this, but as you are still having problems you probably made a mistake somewhere. Try wrapping it in a Collections.synchronizedList() to synchronize all method calls.
LinkedList is not thread-safe. You can use ConcurrentLinkedQueue if it fits your needs, which it seems it might.
As documentation says
An unbounded thread-safe queue based on linked nodes. This queue orders elements FIFO (first-in-first-out). The head of the queue is that element that has been on the queue the longest time. The tail of the queue is that element that has been on the queue the shortest time. New elements are inserted at the tail of the queue, and the queue retrieval operations obtain elements at the head of the queue. A ConcurrentLinkedQueue is an appropriate choice when many threads will share access to a common collection. This queue does not permit null elements.
You increment your inProcessCount when executedTask == null which is obviously the opposite of what you want to do. So it’s no wonder that it will have inconsistent values.
But there are other issues as well. You call notifyAll() at several places but as long as there is no one calling wait() that has no use.
Note further that if you access an integer variable consistently from inside synchronized blocks only throughout the code, there is no need to make it an AtomicInteger. On the other hand, if you use it, e.g. because it will be accessed at other places without additional synchronization, you can move the code updating the AtomicInteger outside the synchronized block.
Also, a method which calls a method like getAsyncQueue() three times looks suspicious to a reader. Just call it once and remember the result in a local variable, then everyone can be confident that it is the same reference on all three uses. Generally, you have to ensure that all code is using the same list, hence the appropriate modifier for the variable holding it is final, not volatile.
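For what it's worth, here is a sketch of the loop with the increment moved to the non-null branch, reusing the field names from the question; the nested Task interface is just a stand-in for the asker's class:

import java.util.LinkedList;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicInteger;

public class Worker {
    interface Task { void callMethod(); }

    private final Queue<Task> asyncQueue = new LinkedList<>();
    private final AtomicInteger inProcessCount = new AtomicInteger();

    public void callByAllThreads() {
        Task executedTask;
        do {
            synchronized (asyncQueue) {
                executedTask = asyncQueue.poll();
                if (executedTask != null) {
                    inProcessCount.incrementAndGet(); // a task is now actually in progress
                }
            }
            if (executedTask != null) {
                executedTask.callMethod();
                synchronized (asyncQueue) {
                    inProcessCount.decrementAndGet();
                    asyncQueue.notifyAll();
                }
            }
        } while (executedTask != null);
    }
}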

multiple threads accessing an ArrayList

I have an ArrayList that's used to buffer data so that other threads can read it.
This array constantly has data added to it, since it's reading from a UDP source, and the other threads constantly read from the array. The data is then removed from the array.
This is not the actual code but a simplified example:
import java.util.ArrayList;

public class PacketReader implements Runnable {
    public static ArrayList<Packet> buffer = new ArrayList<>();

    @Override
    public void run() {
        while (bActive) {
            // read from the UDP source and add data to the array
        }
    }
}

public class Player implements Runnable {
    @Override
    public void run() {
        // read a packet from the buffer
        // decode the packet
        // now for the problem:
        PacketReader.buffer.remove(packet); // the packet that's been read
    }
}
The remove() method removes packets from the array and then shifts all the packets on the right to the left to cover the void.
My concern is: since the buffer is constantly being added to and read from by multiple threads, would the remove() method cause issues since it has to shift packets to the left?
I mean, if add() or get() gets called on that ArrayList at the same time that shift is being done, would it be a problem?
I do get an IndexOutOfBoundsException sometimes, something like: index: 100, size: 300, which is strange because the index is within the size. So I want to know whether this may be causing the problem or whether I should look elsewhere.
Thank you.
It sounds like what you really want is a BlockingQueue. ArrayBlockingQueue is probably a good choice. If you need an unbounded queue and don't care about extra memory utilization (relative to ArrayBlockingQueue), LinkedBlockingQueue also works.
It lets you push items in and pop them out, in a thread-safe and efficient way. The behavior of those pushes and pops can differ (what happens when you try to push to a full queue, or pop from an empty one?), and the JavaDocs for the BlockingQueue interface have a table that shows all of these behaviors nicely.
A thread-safe List (regardless of whether it comes from synchronizedList or CopyOnWriteArrayList) isn't actually enough, because your use case uses a classic check-then-act pattern, and that's inherently racy. Consider this snippet:
if (!list.isEmpty()) {
    Packet p = list.remove(0); // remove the first item
    process(p);
}
Even if list is thread-safe, this usage is not! What if list has one element during the "if" check, but then another thread removes it before you get to remove(0)?
You can get around this by synchronizing around both actions:
Packet p;
synchronized (list) {
    if (list.isEmpty()) {
        p = null;
    } else {
        p = list.remove(0);
    }
}
if (p != null) {
    process(p); // we don't want to call process(..) while still synchronized!
}
This is less efficient and takes more code than a BlockingQueue, though, so there's no reason to do it.
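For comparison, here is a hedged sketch of the reader/player pair from the question rewritten around a BlockingQueue; Packet, readFromUdp() and decode() are placeholders, and the bound of 512 is an assumption:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PacketPipeline {
    static final BlockingQueue<Packet> BUFFER = new ArrayBlockingQueue<>(512);

    static class PacketReader implements Runnable {
        @Override public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    BUFFER.put(readFromUdp());   // blocks when the buffer is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    static class Player implements Runnable {
        @Override public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    decode(BUFFER.take());       // blocks when the buffer is empty
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    static Packet readFromUdp() { return new Packet(); } // placeholder
    static void decode(Packet p) { }                     // placeholder
    static class Packet { }
}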
Yes, there would be problems because ArrayList is not thread-safe: the internal state of the ArrayList object would be corrupted and eventually you would get incorrect output or runtime exceptions. You can try using synchronizedList(List list), or if it's a good fit you could try using a CopyOnWriteArrayList.
This issue is the classic producer-consumer problem. You can see how it is commonly solved by using a lock of some kind so that threads take turns extracting an object from a buffer (a List in your case). There are thread-safe buffer implementations you could look at as well if you don't necessarily need a List.

Scalable patterns for thread-safe hashtable puts when keeping track of frequency

This was an interview question I got some time last week and it ended on a cliffhanger. The question was simple: design a service that keeps track of the frequency of "messages" (a one-line string, possibly in different languages) passed to it. There are two broad APIs: submitMsg(String msg) and getFrequency(String msg). My immediate reaction was to use a HashMap with a String as the key (in this case, a message) and an Integer as the value (to keep track of counts/frequency).
The submitMsg API simply checks whether a message exists in the HashMap. If it doesn't, it puts the message and sets the frequency to 1; if it does, it gets the current count and increments it by 1. The interviewer then pointed out this would fail miserably in the event multiple threads access the SAME key at the SAME exact time.
For example: at 12:00:00:000 Thread1 tries to submitMsg, so my method (1) does a get on the HashMap and sees that the value is not null; it is, say, 100, and then (2) does a put, incrementing the frequency by 1 so that the key's value is 101. Meanwhile, Thread2 ALSO calls submitMsg at exactly 12:00:00:000, and the method once again internally does a get on the HashMap (which returns 100 - this is a race condition), after which it increments the frequency to 101. Alas, the true frequency should have been 102, not 101, and this is a major design flaw in a heavily multithreaded environment. I wasn't sure how to stop this from happening: putting a lock on just the write isn't good enough, and having a lock on a read didn't make sense. What would be ideal is to "lock" an element when a get is invoked internally via the submitMsg API, because we expect it to be written to soon after. The lock would be released once the frequency had been updated, but if someone were to use the getFrequency() API, a pure lock wouldn't make sense. I'm not sure whether a mutex would help here because I don't have a strong background in distributed systems.
I'm looking to the SO community for help on the best way to think through a problem like this. Is the magic in the datastructure to be used or some kind of synchronization that I need to do in my api itself? How can we maintain the integrity of "frequency" while maintaining the scalability of the service as well?
Well, your initial idea isn't a million miles off, you just need to make it thread safe. For instance, you could use a ConcurrentHashMap<String, AtomicInteger>.
public void submitMsg(String msg) {
    AtomicInteger previous = map.putIfAbsent(msg, new AtomicInteger(1));
    if (null != previous) {
        previous.incrementAndGet();
    }
}
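Filling in the surrounding class as a sketch, with a matching getFrequency(); the class and field names are illustrative:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class FrequencyService {
    private final ConcurrentHashMap<String, AtomicInteger> map = new ConcurrentHashMap<>();

    public void submitMsg(String msg) {
        AtomicInteger previous = map.putIfAbsent(msg, new AtomicInteger(1));
        if (previous != null) {
            previous.incrementAndGet();
        }
    }

    public int getFrequency(String msg) {
        AtomicInteger count = map.get(msg);
        return count == null ? 0 : count.get();
    }
}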
The simplest solution is using Guava's com.google.common.collect.ConcurrentHashMultiset:
private final ConcurrentHashMultiset<String> multiset = ConcurrentHashMultiset.create();

public void submitMsg(String msg) {
    multiset.add(msg);
}

public int count(String msg) {
    return multiset.count(msg);
}
But this is basically the same as Aurand's solution, except that somebody already implemented the boring details like creating the counter if it doesn't exist yet, etc.
Treat it as a Producer–consumer problem.
The service is the producer; it should add each message to a queue that feeds the consumer. You could run one queue per producer to ensure that the producers do not wait.
The consumer encapsulates the HashTable, and pulls the messages off the queue and updates the table.
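A rough sketch of the consumer side of that design, assuming a single consumer thread owns the map so no further synchronization is needed when it updates the counts (class and field names are invented):

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class CountingConsumer implements Runnable {
    private final BlockingQueue<String> messages = new LinkedBlockingQueue<>();
    private final Map<String, Integer> counts = new HashMap<>();

    public void submitMsg(String msg) {
        messages.add(msg);                    // producers just enqueue; unbounded, so add() never blocks
    }

    @Override public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                String msg = messages.take(); // only this thread ever touches counts
                counts.merge(msg, 1, Integer::sum);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}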

Synchronized collections list

I have 2 threads needing access to a Queue, one for putting and one for getting.
So I have an initiation
public static Queue<WorldData> blockDestructionQueue = Collections.synchronizedList(new LinkedList<WorldData>());
With the above I get a Type mismatch: cannot convert from List to Queue
I tried casting it to a Queue but this did not work.
public static Queue<WorldData> blockDestructionQueue = (Queue<WorldData>)Collections.synchronizedList(new LinkedList<WorldData>());
I was wondering why this is not working.
I got this information from another stack overflow answer.
How to use ConcurrentLinkedQueue?
In the correct answer paragraph 6
If you only have one thread putting stuff into the queue, and another thread taking stuff out of the queue, ConcurrentLinkingQueue is probably overkill. It's more for when you may have hundreds or even thousands of threads accessing the queue at the same time. Your needs will probably be met by using:

Queue<YourObject> queue = Collections.synchronizedList(new LinkedList<YourObject>());

A plus of this is that it locks on the instance (queue), so you can synchronize on queue to ensure atomicity of composite operations (as explained by Jared). You CANNOT do this with a ConcurrentLinkingQueue, as all operations are done WITHOUT locking on the instance (using java.util.concurrent.atomic variables). You will NOT need to do this if you want to block while the queue is empty, because poll() will simply return null while the queue is empty, and poll() is atomic. Check to see if poll() returns null. If it does, wait(), then try again. No need to lock.
Additional Information:
edit: Eclipse was trying to be too helpful and decided to add a break point exception where it was not needed and was not asked to put one.
A queue is not a list and a Queue is not an implementation of List, although you can implement a queue with a list.
Have a look at BlockingQueue; it is probably a better fit for what you need:
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/BlockingQueue.html
Collections.synchronizedList returns an instance of SynchronizedList which does not extend Queue. LinkedList is a Queue but that's not what you're using at that point.
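Two declarations that do compile, as sketches of the alternatives mentioned above (WorldData here is just a placeholder for the asker's own type):

import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class Queues {
    // non-blocking, thread-safe queue
    public static final Queue<WorldData> blockDestructionQueue = new ConcurrentLinkedQueue<>();

    // blocking variant, if the consumer should wait for elements
    public static final BlockingQueue<WorldData> blockingDestructionQueue = new LinkedBlockingQueue<>();

    static final class WorldData { } // placeholder for the asker's type
}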
