We have a specialist multi-producer (User), single-consumer (Engine) queue. The User threads run more frequently and always add individual elements to the queue. The Engine thread runs less frequently and processes the stack elements in a batch. If the stack is empty, it'll park until a User thread has added an entry. This way a notify only needs to happen when the queue goes from empty to 1.
In this implementation, instead of the Engine thread iterating and removing one item at a time, it removes them all - a drainAll, instead of drainTo. No other operations can mutate the stack - just the User thread add, and the engine thread drainAll.
Currently we do this via a synchronised linked list; we are wondering if there is a non-blocking way to do this. The drainTo operation on JDK classes will iterate the stack. We just want to take everything in the stack in one operation, without iterating, as each iteration hits volatile/CAS related logic, so we'd ideally like to hit that only once per drainAll. Then the engine thread can iterate and operate on each individual element without touching sync/volatile/CAS operations.
The current implementation looks something like:
public class SynchronizedPropagationQueue implements PropagationQueue {
protected volatile PropagationEntry head;
protected volatile PropagationEntry tail;
protected synchronized void addEntry( PropagationEntry entry ) {
if ( head == null ) {
head = entry;
notifyWaitOnRest();
} else {
tail.setNext( entry );
}
tail = entry;
}
@Override
public synchronized PropagationEntry drainAll() {
PropagationEntry currentHead = head;
head = null;
tail = null;
return currentHead;
}
public synchronized void waitOnRest() {
try {
log.debug("Engine wait");
wait();
} catch (InterruptedException e) {
// do nothing
}
log.debug("Engine resumed");
}
@Override
public synchronized void notifyWaitOnRest() {
notifyAll();
}
}
Stacks have a very simple non-blocking implementation that supports a concurrent "pop all" operation easily, and can easily detect the empty->non-empty transition. You could have all your producers push items onto a stack and then have the engine empty the whole thing at once. It looks like this:
public class EngineQueue<T>
{
private final AtomicReference<Node<T>> m_lastItem = new AtomicReference<>();
public void add(T item)
{
Node<T> newNode = new Node<T>(item);
do {
newNode.m_next = m_lastItem.get();
} while(!m_lastItem.compareAndSet(newNode.m_next, newNode));
if (newNode.m_next == null)
{
// ... just went non-empty signal any waiting consumer
}
}
public List<T> removeAll()
{
Node<T> stack = m_lastItem.getAndSet(null);
// ... wait for non-empty if necessary
List<T> ret = new ArrayList<>();
for (;stack != null; stack=stack.m_next)
{
ret.add(stack.m_data);
}
Collections.reverse(ret);
return ret;
}
private static class Node<U>
{
Node<U> m_next;
final U m_data;
Node(U data)
{
super();
m_data = data;
}
}
}
For signaling around the empty -> non-empty transition, you can use normal synchronization. This is not going to be expensive if you only do it when you detect an empty state... since you only get to the empty state when you're out of work to do.
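As a rough sketch of how that signaling could be wired in, the class below pairs the lock-free push with a plain monitor that is only touched on the empty -> non-empty transition and when the consumer finds nothing to do. The names SignalingStack, takeAll and monitor are made up for this illustration, and it assumes a single consumer thread as in the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: lock-free push, monitor used only around the empty state.
final class SignalingStack<T> {
    private static final class Node<T> {
        final T item;
        Node<T> next;
        Node(T item) { this.item = item; }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();
    private final Object monitor = new Object();

    void add(T item) {
        Node<T> node = new Node<>(item);
        do {
            node.next = head.get();
        } while (!head.compareAndSet(node.next, node));
        if (node.next == null) {
            // The stack just went from empty to non-empty: wake a waiting consumer.
            synchronized (monitor) {
                monitor.notifyAll();
            }
        }
    }

    // Blocks until at least one item is present, then returns everything in insertion order.
    List<T> takeAll() throws InterruptedException {
        Node<T> chain = head.getAndSet(null);
        if (chain == null) {
            synchronized (monitor) {
                // Re-check under the monitor so a push between the check and wait() is not missed.
                while ((chain = head.getAndSet(null)) == null) {
                    monitor.wait();
                }
            }
        }
        List<T> items = new ArrayList<>();
        for (Node<T> n = chain; n != null; n = n.next) {
            items.add(n.item);
        }
        Collections.reverse(items);
        return items;
    }
}
```

A producer that pushes onto a non-empty stack never touches the monitor, so the synchronized block is only paid for when the consumer may actually be parked.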
Currently we do this via a synchronised linked list, we are wondering if there is a non-blocking way to do this. The drainTo operation on JDK classes will iterate the stack, we just want to take everything in the stack in one operation, without iterating
Maybe I don't understand but it seems like using a BlockingQueue.drainTo(...) method would be better than your implementation. For example the LinkedBlockingQueue.drainTo(...) method just has one lock around that method -- there's no iterating overhead that I see.
If this is not an academic discussion then I'd doubt that your performance problems are with the queue itself and would concentrate your efforts in other areas. If it is academic then @Matt's answer might be better, although certainly there's a lot more code to be written to support the full Collection method list.
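For comparison, a minimal sketch of the drainTo approach might look like the following. BatchConsumer and nextBatch are names invented for this example; the only JDK calls used are LinkedBlockingQueue.add, take and drainTo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: block for the first element, then drain the rest in one call.
final class BatchConsumer<T> {
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();

    // Called by the producer (User) threads.
    void add(T item) {
        queue.add(item);
    }

    // Called by the single consumer (Engine) thread.
    List<T> nextBatch() throws InterruptedException {
        List<T> batch = new ArrayList<>();
        batch.add(queue.take());   // waits while the queue is empty
        queue.drainTo(batch);      // moves whatever else is queued right now
        return batch;
    }
}
```

The consumer then iterates the returned ArrayList without going back to the queue for each element.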
I am currently using a ConcurrentLinkedQueue so that I get natural FIFO order and can use it in a thread-safe application. I have a requirement to log the size of the queue every minute. Given that this collection does not guarantee an accurate size and that calculating the size costs O(N), is there an alternative bounded, non-blocking concurrent queue where obtaining the size is not costly and the add/remove operations are not expensive either?
If there is no collection, do I need to use LinkedList with locks?
If you really (REALLY) need to log a correct, current size of the Queue you are currently dealing with, you need to block. There is simply no other way. You might think that maintaining a separate LongAdder field would help, maybe by making your own interface as a wrapper around ConcurrentLinkedQueue, something like:
interface KnownSizeQueue<T> {
T poll();
long size();
}
And an implementation:
static class ConcurrentKnownSizeQueue<T> implements KnownSizeQueue<T> {
private final ConcurrentLinkedQueue<T> queue = new ConcurrentLinkedQueue<>();
private final LongAdder currentSize = new LongAdder();
@Override
public T poll() {
T result = queue.poll();
if(result != null){
currentSize.decrement();
}
return result;
}
@Override
public long size() {
return currentSize.sum();
}
}
I just encourage you to add one more method, like remove into the interface and try to reason about the code. You will, very shortly realize, that such implementations will still give you a wrong result. So, do not do it.
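To make that reasoning concrete, here is a sketch of what happens once a second mutating method is added. CountingQueue, offer and approximateSize are hypothetical names for this illustration; the comment marks the window in which the counter and the queue disagree.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: the queue update and the counter update are two separate steps.
final class CountingQueue<T> {
    private final ConcurrentLinkedQueue<T> queue = new ConcurrentLinkedQueue<>();
    private final LongAdder count = new LongAdder();

    boolean offer(T item) {
        boolean added = queue.offer(item);
        // Window: another thread can poll() this item and decrement the counter
        // before the increment below runs, so sum() may briefly be too low.
        if (added) {
            count.increment();
        }
        return added;
    }

    T poll() {
        T result = queue.poll();
        if (result != null) {
            count.decrement();
        }
        return result;
    }

    // Only an estimate while other threads are adding or removing.
    long approximateSize() {
        return count.sum();
    }
}
```

With no concurrent activity the counter matches the queue; under contention it is only an estimate, which is the problem described above.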
The only reliable way to get the size, if you really need it, is to block for each operation. This comes at a high price, because ConcurrentLinkedQueue is documented as:
This implementation employs an efficient non-blocking...
You will lose those properties, but if that is a hard requirement that does not care about that, you could write your own:
static class ParallelKnownSizeQueue<T> implements KnownSizeQueue<T> {
private final Queue<T> queue = new ArrayDeque<>();
private final ReentrantLock lock = new ReentrantLock();
@Override
public T poll() {
// acquire the lock before entering try, so finally never unlocks a lock that was not taken
lock.lock();
try {
return queue.poll();
} finally {
lock.unlock();
}
}
@Override
public long size() {
lock.lock();
try {
return queue.size();
} finally {
lock.unlock();
}
}
}
Or, of course, you can use an already existing structure, like LinkedBlockingDeque or ArrayBlockingQueue, etc - depending on what you need.
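As an illustration of the existing-structure route, the sketch below wraps an ArrayBlockingQueue and uses only its non-waiting methods offer and poll. ArrayBlockingQueue keeps an element count internally, so size() does not walk the elements. The wrapper name BoundedWorkQueue and its method names are assumptions made for this example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: bounded FIFO queue with a cheap size() for periodic logging.
final class BoundedWorkQueue<T> {
    private final BlockingQueue<T> queue;

    BoundedWorkQueue(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Returns false instead of waiting when the queue is full.
    boolean tryAdd(T item) {
        return queue.offer(item);
    }

    // Returns null instead of waiting when the queue is empty.
    T tryTake() {
        return queue.poll();
    }

    int size() {
        return queue.size();
    }
}
```

These calls still take the queue's internal lock for a short time, so this trades the lock-free property of ConcurrentLinkedQueue for an exact size.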
I'm doing a tree search for a class assignment. I understand the tree search part, but as I have some extra time, I wanted to speed it up by adding more threads.
The final task is to take in a set of constraints, classes and time-slots, and output a schedule with all those classes, and which satisfies all the constraints. An empty or partial assignment goes in, a complete class assignment comes out.
Our search is designed like a tree, with the input being the root node. The function div(n) is as follows: for a node n, find an unused class C, and for each unused slot S, produce a child node with C in S. To make the search more efficient, we use a search control which ranks the quality of nodes, so that the best candidates are selected first, and we don't waste time on bad candidates.
A node implements Comparable, with compareTo() implemented using the search control. I use a priority queue to store nodes awaiting processing, so the 'best' nodes are always next in line. A worker removes a node, applies div() and adds the children to the priority queue.
My first approach was using a shared priority queue, specifically PriorityBlockingQueue. The performance was abysmal, since the queue was almost always blocking.
I tried to fix it by adding a background worker and a ConcurrentLinkedQueue buffer. Workers would add to the buffer, and the background worker would periodically move elements from the buffer to the priority queue. This didn't work either.
The best performance I have found is to give each worker its own priority queue. I'm guessing that this is as good as it gets, as now threads aren't connected to the actions of others. With this config, on a 4C/8T machine, I get a speedup of ~2.5. I think the bottleneck here is the allocation of memory for all these nodes, but I could be wrong here.
From Searcher:
private PriorityQueue<Schedule> workQueue;
private static volatile boolean shutdownSignal = false;
private Schedule best;
public Searcher(List<Schedule> instances) {
workQueue = new PriorityQueue<>(instances);
}
public static void stop() {
shutdownSignal = true;
}
/**
* Run the search control starting with the first node in the workQueue
*/
@Override
public void run() {
while (!shutdownSignal) {
try {
Schedule next = workQueue.remove();
List<Schedule> children = next.div(checkBest);
workQueue.addAll(children);
} catch (Exception e) {
//TODO: handle exception
}
}
//For testing
System.out.println("Shutting down: " + workQueue.size());
}
//passing a function as a parameter
Consumer<Schedule> checkBest = new Consumer<Schedule>() {
public void accept(Schedule sched) {
if (best == null || sched.betterThan(best)) {
best = sched;
Model.checkBest.accept(sched);
}
}
};
From Schedule:
public List<Schedule> div(Consumer<Schedule> completion) {
List<Schedule> n = new ArrayList<>();
int selected = 0;
List<Slot> available = Model.getSlots();
List<Slot> allocated = getAssigned();
while (allocated.get(selected) != null) {
selected++;
} // find first available slot to fill.
// Iterate through all available slots
for (Slot t : available) {
//Prepare a fresh copy
// the copy constructor is used because Collections.copy throws if the destination list is shorter than the source
List<Slot> newAssignment = new ArrayList<>(allocated);
//assign the course to the timeslot
newAssignment.set(selected, t);
Schedule next = new Schedule(this, newAssignment);
n.add(next);
}
/**
* Filter out nodes which violate the hard constraints and which are solved,
* and check if they are the best in a calling thread
*/
List<Schedule> unsolvedNodes = new ArrayList<>();
for (Schedule schedule: n) {
if (schedule.constr() && !schedule.solved()){
unsolvedNodes.add(schedule);
completion.accept(schedule);
}
}
return unsolvedNodes;
}
I would say that the fork-join framework is an appropriate tool for your task. You need to extend your task from either RecursiveTask or RecursiveAction and submit it to a ForkJoinPool. Here is a pseudo-code sample. Also your shutdown flag must be volatile.
public class Task extends RecursiveAction {
private final Node<Integer> node;
public Task(Node<Integer> node) {
this.node = node;
}
@Override
protected void compute() {
// check result and stop recursion if needed
List<Task> subTasks = new ArrayList<>();
List<Node<Integer>> nodes = div(this.node);
for (Node<Integer> node : nodes) {
Task task = new Task(node);
task.fork();
subTasks.add(task);
}
for(Task task : subTasks) {
task.join();
}
}
public static void main(String[] args) {
Node<Integer> root = getRootNode();
new ForkJoinPool().invoke(new Task(root));
}
}
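For a version that can be run as-is, here is a small sketch of the same fork/join shape on a made-up problem: counting the nodes of a full binary tree of a given depth. CountNodesTask and its depth field are stand-ins for the real Schedule nodes and div(), which are not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical sketch: each task expands one node and forks a task per child.
final class CountNodesTask extends RecursiveTask<Integer> {
    private final int depth; // remaining depth below this (imaginary) node

    CountNodesTask(int depth) {
        this.depth = depth;
    }

    @Override
    protected Integer compute() {
        if (depth == 0) {
            return 1; // leaf: nothing to expand
        }
        // Stand-in for div(): every inner node has exactly two children.
        List<CountNodesTask> children = new ArrayList<>();
        children.add(new CountNodesTask(depth - 1));
        children.add(new CountNodesTask(depth - 1));

        int total = 1; // count this node
        // invokeAll forks the children and returns once all of them have completed.
        for (CountNodesTask child : invokeAll(children)) {
            total += child.join();
        }
        return total;
    }

    static int count(int depth) {
        return ForkJoinPool.commonPool().invoke(new CountNodesTask(depth));
    }
}
```

Work stealing inside the pool plays the role of the per-worker queues: each worker mostly processes the subtasks it forked itself, and idle workers take tasks from busy ones.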
I'm using Spring framework. Need to have a list of objects, which should get all data from database at once. When data is changed, list will be null and next get operation should fill data from database again. Is my code correct for multi-thread environment?
@Component
@Scope("singleton")
public class MyObjectHolder {
private volatile List<MyObject> objectList = null;
public List<MyObject> getObjectList() {
if (objectList == null) {
synchronized (objectList) {
if (objectList == null) {
objectList = getFromDB();
}
}
}
return objectList;
}
public synchronized void clearObjectList() {
objectList = null;
}
}
Short answer: no.
public class MyObjectHolder {
// List is an interface, so a concrete implementation has to be instantiated
private final List<MyObject> objectList = new ArrayList<>();
public List<MyObject> getObjectList() {
return objectList;
}
}
This is the preferred singleton pattern.
Now you need to figure out how to get the data into the list in a thread-safe way. For this Java already has some pre-made thread-safe lists in the concurrent package, which should be preferred to any synchronized implementation, as they are much faster under heavy threading.
Your problem could be solved like this:
public class MyObjectHolder {
private final CopyOnWriteArrayList<MyObject> objectList = new CopyOnWriteArrayList<>();
public List<MyObject> getObjectList() {
return objectList;
}
public boolean isEmpty() {
return objectList.isEmpty();
}
public void readDB() {
final List<MyObject> dbList = getFromDB();
// ?? objectList.clear();
objectList.addAll(dbList);
}
}
Please note the absence of any synchronized, yet the thing is completely thread-safe. Java guarantees that the calls on that list are performed atomically. So I can call isEmpty() while someone else is filling up the list. I will only get a snapshot of a moment in time and can't tell what result I will get, but it will in all cases succeed without error.
The DB call is first written into a temporary list, therefore no threading issues can happen here. Then the addAll() will atomically move the content into the real list, again: all thread-safe.
The worst-case scenario is that Thread A is just about done writing the new data, while at the same time Thread B checks if the list contains any elements. Thread B will receive the information that the list is empty, yet a microsecond later it contains tons of data. You need to deal with this situation by either repeatedly polling or by using an observer pattern to notify the other threads.
No, your code is not thread safe. For example, you could assign objectList in one thread at time X, but set it to null (via clearObjectList()) at time X+1 because you are synchronizing on 2 different objects. The first synchronization is on objectList itself and the second synchronization is on the instance of MyObjectHolder. You should look into locks when using a shared resource instead of using synchronized, specifically something like a ReadWriteLock.
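One possible shape for a fix, sketched under the assumption that loading and clearing should both go through a single dedicated lock object, is shown below. ReloadingHolder and the loader supplier (standing in for getFromDB()) are names invented for this example, and it uses a plain monitor rather than a ReadWriteLock to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch: lazy load and clear guarded by the same lock object.
final class ReloadingHolder<T> {
    private final Supplier<List<T>> loader;  // stand-in for the database call
    private final Object lock = new Object();
    private volatile List<T> cached;

    ReloadingHolder(Supplier<List<T>> loader) {
        this.loader = loader;
    }

    List<T> get() {
        List<T> local = cached;              // single volatile read on the fast path
        if (local == null) {
            synchronized (lock) {            // never synchronize on the (possibly null) list itself
                local = cached;
                if (local == null) {
                    local = Collections.unmodifiableList(new ArrayList<>(loader.get()));
                    cached = local;
                }
            }
        }
        return local;
    }

    void clear() {
        synchronized (lock) {
            cached = null;
        }
    }
}
```

Callers always receive a non-null, read-only list; a thread that fetched the list just before clear() keeps working on the older snapshot.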
In short: I want to process a large graph with circular references in parallel. I don't have access to the full graph; I have to crawl through it. I want to organize an efficient queue for that. Are there any best practices for this?
I'm trying to organize an endless data processing flow with this strategy: each thread takes a node from the queue and processes it; processing may produce new nodes, which the thread has to put into the queue. But no node should be processed more than once. Nodes are immutable entities.
As I understand it, I have to use a thread-safe implementation of a queue and of a set (for already visited instances).
I'm trying to avoid synchronized methods. So, my implementation of this flow is:
When a thread adds nodes to the queue, it checks each node: if the visited-nodes set contains the node, the thread doesn't add it to the queue. But that's not all.
When a thread takes a node from the queue, it checks whether the visited-nodes set contains it. If it does, the thread takes another node from the queue, until it gets one that hasn't been processed yet. After finding an unprocessed node, the thread also adds it to the visited-nodes set.
I've tried to use LinkedBlockingQueue and ConcurrentHashMap (as a set). I've used ConcurrentHashMap because it has the method putIfAbsent(key, value), which, as I understand it, atomically checks whether the map contains the key and adds it if it doesn't.
Here is implementation of described algorithm:
public class ParallelDataQueue {
private LinkedBlockingQueue<String> dataToProcess = new LinkedBlockingQueue<String>();
// using map as a set
private ConcurrentHashMap<String, Object> processedData = new ConcurrentHashMap<String, Object>( 1000000 );
private final Object value = new Object();
public String getNextDataInstance() {
while ( true ) {
try {
String data = this.dataToProcess.take();
Boolean dataIsAlreadyProcessed = ( this.processedData.putIfAbsent( data, this.value ) != null );
if ( dataIsAlreadyProcessed ) {
continue;
} else {
return data;
}
} catch ( InterruptedException e ) {
e.printStackTrace();
}
}
}
public void addData( Collection<String> data ) {
for ( String d : data ) {
if ( !this.processedData.containsKey( d ) ) {
try {
this.dataToProcess.put( d );
} catch ( InterruptedException e ) {
e.printStackTrace();
}
}
}
}
}
So my question: does the current implementation avoid processing nodes repeatedly? And is there maybe a more elegant solution?
Thanks
P.S.
I understand that such an implementation doesn't prevent duplicate nodes from appearing in the queue. But for me that is not critical: all I need is to avoid processing each node more than once.
Your current implementation does not avoid repeated data instances. Assume that "Thread A" checks whether the data exists in the concurrent map and finds that it does not, so it reports that the data does not exist. But just before executing the if after the putIfAbsent line, "Thread A" is suspended. At that point another thread, "Thread B", is scheduled by the CPU, checks for the same data element, finds that it does not exist, reports it as absent, and it is added to the queue. When Thread A is rescheduled, it continues from the if line and adds it to the queue again.
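One way to avoid a separate check followed by a later insert is to claim each node at enqueue time with a single atomic call. The sketch below assumes that marking a node as seen when it is queued (rather than when it is taken) is acceptable; ClaimingQueue is a made-up name, and it uses the non-blocking ConcurrentLinkedQueue.poll instead of the question's take().

```java
import java.util.Collection;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: a node enters the queue only if this call was the first to see it.
final class ClaimingQueue {
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    void addAll(Collection<String> nodes) {
        for (String node : nodes) {
            // Set.add is atomic here: exactly one caller gets true for a given node.
            if (seen.add(node)) {
                pending.add(node);
            }
        }
    }

    // Returns the next unprocessed node, or null if nothing is queued right now.
    String poll() {
        return pending.poll();
    }
}
```

Because the claim and the membership test are one operation, the queue never holds the same node twice and consumers need no second check.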
Yes. Use ConcurrentLinkedQueue ( http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/ConcurrentLinkedQueue.html )
also
When thread adding data to the queue, it checking each instance of data: if set contains instance of this data, thread don't add it to the queue. But that's not all
is not a thread-safe approach, unless the underlying Collection is thread-safe. (which means it's synchronized internally) But then it's pointless to do the check, because it's already thread-safe...
If you need to process data in a multithreaded manner, you may not need collections at all. Have you thought about using the Executors framework?
public static void main(String[] args) throws InterruptedException {
ExecutorService exec = Executors.newFixedThreadPool(100);
while (true) { // provide data infinitely
for (int i = 0; i < 1000; i++)
exec.execute(new DataProcessor(UUID.randomUUID(), exec));
Thread.sleep(10000); // wait a bit, then continue;
}
}
static class DataProcessor implements Runnable {
Object data;
ExecutorService exec;
public DataProcessor(Object data, ExecutorService exec) {
this.data = data;
this.exec = exec;
}
@Override
public void run() {
System.out.println(data); // process data
if (new Random().nextInt(100) < 50) // add new data piece for execution if needed
exec.execute(new DataProcessor(UUID.randomUUID(), exec));
}
}
I have a multithreaded application, where a shared list has write-often, read-occasionally behaviour.
Specifically, many threads will dump data into the list, and then - later - another worker will grab a snapshot to persist to a datastore.
This is similar to the discussion over on this question.
There, the following solution is provided:
class CopyOnReadList<T> {
private final List<T> items = new ArrayList<T>();
public void add(T item) {
synchronized (items) {
// Add item while holding the lock.
items.add(item);
}
}
public List<T> makeSnapshot() {
List<T> copy = new ArrayList<T>();
synchronized (items) {
// Make a copy while holding the lock.
for (T t : items) copy.add(t);
}
return copy;
}
}
However, in this scenario, (and, as I've learned from my question here), only one thread can write to the backing list at any given time.
Is there a way to allow high-concurrency writes to the backing list, which are locked only during the makeSnapshot() call?
synchronized (~20 ns) is pretty fast and even though other operations can allow concurrency, they can be slower.
private final Lock lock = new ReentrantLock();
private List<T> items = new ArrayList<T>();
public void add(T item) {
lock.lock();
// trivial lock time.
try {
// Add item while holding the lock.
items.add(item);
} finally {
lock.unlock();
}
}
public List<T> makeSnapshot() {
List<T> copy = new ArrayList<T>(), ret;
lock.lock();
// trivial lock time.
try {
ret = items;
items = copy;
} finally {
lock.unlock();
}
return ret;
}
public static void main(String... args) {
long start = System.nanoTime();
Main<Integer> ints = new Main<>();
for (int j = 0; j < 100 * 1000; j++) {
for (int i = 0; i < 1000; i++)
ints.add(i);
ints.makeSnapshot();
}
long time = System.nanoTime() - start;
System.out.printf("The average time to add was %,d ns%n", time / 100 / 1000 / 1000);
}
prints
The average time to add was 28 ns
This means if you are creating 30 million entries per second, you will have one thread accessing the list on average. If you are creating 60 million per second, you will have concurrency issues; however, you are likely to be having many more resourcing issues at this point.
Using Lock.lock() and Lock.unlock() can be faster when there is a high contention ratio. However, I suspect your threads will be spending most of the time building the objects to be created rather than waiting to add the objects.
You could use a ConcurrentDoublyLinkedList. There is an excellent implementation here ConcurrentDoublyLinkedList.
So long as you iterate forward through the list when you make your snapshot all should be well. This implementation preserves the forward chain at all times. The backward chain is sometimes inaccurate.
First of all, you should investigate if this really is too slow. Adds to ArrayLists are O(1) in the happy case, so if the list has an appropriate initial size, CopyOnReadList.add is basically just a bounds check and an assignment to an array slot, which is pretty fast. (And please, do remember that CopyOnReadList was written to be understandable, not performant.)
If you need a non-locking operation, you can have something like this:
class ConcurrentStack<T> {
private final AtomicReference<Node<T>> stack = new AtomicReference<>();
public void add(T value){
Node<T> tail, head;
do {
tail = stack.get();
head = new Node<>(value, tail);
} while (!stack.compareAndSet(tail, head));
}
public Node<T> drain(){
// Get all elements from the stack and reset it
return stack.getAndSet(null);
}
}
class Node<T> {
// getters, setters, constructors omitted
private final T value;
private final Node<T> tail;
}
Note that while adds to this structure should deal pretty well with high contention, it comes with several drawbacks. The output from drain is quite slow to iterate over, it uses quite a lot of memory (like all linked lists), and you also get things in the opposite insertion order. (Also, it's not really tested or verified, and may actually suck in your application. But that's always the risk with using code from some random dude on the intertubes.)
Yes, there is a way. It is similar to the way ConcurrentHashMap is made, if you know it.
You should build your own data structure not from one list shared by all writing threads, but from several independent lists. Each of these lists should be guarded by its own lock. The .add() method should choose the list to append the current item to based on Thread.currentThread().getId() (for example, just id % listsCount). This will give you good concurrency properties for .add(): at best, listsCount threads will be able to write without contention.
In makeSnapshot() you should just iterate over all lists, and for each list you grab its lock and copy the content.
This is just an idea -- there are many places to improve it.
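A bare-bones sketch of this striping idea could look like the following. StripedList is a hypothetical name, each stripe is simply locked on its own ArrayList instance, and the snapshot is taken stripe by stripe, so it is not a single point-in-time view across all stripes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: several independently locked lists, chosen by thread id.
final class StripedList<T> {
    private final List<List<T>> stripes = new ArrayList<>();

    StripedList(int stripeCount) {
        for (int i = 0; i < stripeCount; i++) {
            stripes.add(new ArrayList<>());
        }
    }

    void add(T item) {
        // Threads with different ids usually land on different stripes.
        int index = (int) (Thread.currentThread().getId() % stripes.size());
        List<T> stripe = stripes.get(index);
        synchronized (stripe) {
            stripe.add(item);
        }
    }

    List<T> makeSnapshot() {
        List<T> copy = new ArrayList<>();
        for (List<T> stripe : stripes) {
            synchronized (stripe) {   // only one stripe is locked at a time
                copy.addAll(stripe);
            }
        }
        return copy;
    }
}
```

Writers only contend with threads that map to the same stripe, and a snapshot blocks each stripe just for the duration of its own copy.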
You can use a ReadWriteLock to allow multiple threads to perform add operations on the backing list in parallel, but only one thread to make the snapshot. While the snapshot is being prepared, all other add and snapshot requests are put on hold.
A ReadWriteLock maintains a pair of associated locks, one for
read-only operations and one for writing. The read lock may be held
simultaneously by multiple reader threads, so long as there are no
writers. The write lock is exclusive.
class CopyOnReadList<T> {
// free to use any concurrent data structure, ConcurrentLinkedQueue used as an example
private final ConcurrentLinkedQueue<T> items = new ConcurrentLinkedQueue<T>();
private final ReadWriteLock rwLock = new ReentrantReadWriteLock();
private final Lock shared = rwLock.readLock();
private final Lock exclusive = rwLock.writeLock();
public void add(T item) {
shared.lock(); // multiple threads can attain the read lock
// try-finally is overkill if items.add() never throws exceptions
try {
// Add item while holding the lock.
items.add(item);
} finally {
shared.unlock();
}
}
public List<T> makeSnapshot() {
List<T> copy = new ArrayList<T>(); // probably better idea to use a LinkedList or the ArrayList constructor with initial size
exclusive.lock(); // only one thread can attain write lock, all read locks are also blocked
// try-finally is overkill if for loop never throws exceptions
try {
// Make a copy while holding the lock.
for (T t : items) {
copy.add(t);
}
} finally {
exclusive.unlock();
}
return copy;
}
}
Edit:
The read-write lock is so named because it is based on the readers-writers problem, not on how it is used. Using the read-write lock we can have multiple threads acquire read locks but only one thread acquire the write lock exclusively. In this case the problem is reversed: we want multiple threads to write (add) and only one thread to read (make the snapshot). So, we want multiple threads to use the read lock even though they are actually mutating. Only one thread exclusively makes the snapshot using the write lock, even though the snapshot only reads. Exclusive means that while the snapshot is being made, no other add or snapshot requests can be serviced by other threads.
As @PeterLawrey pointed out, the concurrent queue will serialize the writes, although the locks will be held for as short a duration as possible. We are free to use any other concurrent data structure, e.g. ConcurrentDoublyLinkedList. The queue is used only as an example. The main idea is the use of read-write locks.