I'm doing a tree search for a class assignment. I understand the tree search part, but as I have some extra time, I wanted to speed it up by adding more threads.
The final task is to take in a set of constraints, classes and time-slots, and output a schedule with all those classes, and which satisfies all the constraints. An empty or partial assignment goes in, a complete class assignment comes out.
Our search is designed like a tree, with the input being the root node. The function div(n) is as follows: for a node n, find an unused class C, and for each unused slot S, produce a child node with C in S. To make the search more efficient, we use a search control which ranks the quality of nodes, so that the best candidates are selected first and we don't waste time on bad candidates.
A node implements Comparable, with compareTo() implemented using the search control. I use a priority queue to store nodes awaiting processing, so the 'best' nodes are always next in line. A worker removes a node, applies div() and adds the children to the priority queue.
My first approach was using a shared priority queue, specifically PriorityBlockingQueue. The performance was abysmal, since the queue was almost always blocking.
I tried to fix it by adding a background worker and a ConcurrentLinkedQueue buffer. Workers would add to the buffer, and the background worker would periodically move elements from the buffer to the priority queue. This didn't work either.
The best performance I have found is to give each worker its own priority queue. I'm guessing that this is as good as it gets, as threads now don't depend on the actions of others. With this config, on a 4C/8T machine, I get a speedup of ~2.5. I think the bottleneck here is the allocation of memory for all these nodes, but I could be wrong here.
From Searcher:
private PriorityQueue<Schedule> workQueue;
private static volatile boolean shutdownSignal = false;
private Schedule best;
public Searcher(List<Schedule> instances) {
workQueue = new PriorityQueue<>(instances);
}
public static void stop() {
shutdownSignal = true;
}
/**
* Run the search control starting with the first node in the workQueue
*/
@Override
public void run() {
while (!shutdownSignal) {
try {
Schedule next = workQueue.remove();
List<Schedule> children = next.div(checkBest);
workQueue.addAll(children);
} catch (Exception e) {
//TODO: handle exception
}
}
//For testing
System.out.println("Shutting down: " + workQueue.size());
}
//passing a function as a parameter
Consumer<Schedule> checkBest = new Consumer<Schedule>() {
public void accept(Schedule sched) {
if (best == null || sched.betterThan(best)) {
best = sched;
Model.checkBest.accept(sched);
}
}
};
From Schedule:
public List<Schedule> div(Consumer<Schedule> completion) {
List<Schedule> n = new ArrayList<>();
int selected = 0;
List<Slot> available = Model.getSlots();
List<Slot> allocated = getAssigned();
while (allocated.get(selected) != null) {
selected++;
} // find first available slot to fill.
// Iterate through all available slots
for (Slot t : available) {
//Prepare a fresh copy
// Copy constructor: Collections.copy() throws unless the destination already has the source's size
List<Slot> newAssignment = new ArrayList<>(allocated);
//assign the course to the timeslot
newAssignment.set(selected, t);
Schedule next = new Schedule(this, newAssignment);
n.add(next);
}
/**
* Filter out nodes which violate the hard constraints and which are solved,
* and check if they are the best in a calling thread
*/
List<Schedule> unsolvedNodes = new ArrayList<>();
for (Schedule schedule: n) {
if (schedule.constr() && !schedule.solved()){
unsolvedNodes.add(schedule);
completion.accept(schedule);
}
}
return unsolvedNodes;
}
I would say that the fork-join framework is an appropriate tool for your task. You need to extend your task from either RecursiveTask or RecursiveAction and submit it to a ForkJoinPool. Here is a pseudo-code sample. Also, your shutdown flag must be volatile.
public class Task extends RecursiveAction {
private final Node<Integer> node;
public Task(Node<Integer> node) {
this.node = node;
}
@Override
protected void compute() {
// check result and stop recursion if needed
List<Task> subTasks = new ArrayList<>();
List<Node<Integer>> nodes = div(this.node);
for (Node<Integer> node : nodes) {
Task task = new Task(node);
task.fork();
subTasks.add(task);
}
for(Task task : subTasks) {
task.join();
}
}
public static void main(String[] args) {
Node<Integer> root = getRootNode();
new ForkJoinPool().invoke(new Task(root));
}
}
TL;DR: When several CompletableFutures are waiting to get executed, how can I prioritize those whose values I'm interested in?
I have a list of 10,000 CompletableFutures (which calculate the data rows for an internal report over the product database):
List<Product> products = ...;
List<CompletableFuture<DataRow>> dataRows = products
.stream()
.map(p -> CompletableFuture.supplyAsync(() -> calculateDataRowForProduct(p), singleThreadedExecutor))
.collect(Collectors.toList());
Each takes around 50ms to complete, so the entire thing finishes in 500sec. (they all share the same DB connection, so cannot run in parallel).
Let's say I want to access the data row of the 9000th product:
dataRows.get(9000).join()
The problem is, all these CompletableFutures are executed in the order they have been created, not in the order they are accessed. Which means I have to wait 450sec for it to calculate stuff that at the moment I don't care about, to finally get to the data row I want.
Question:
Is there any way to change this behaviour, so that the Futures I try to access get priority over those I don't care about at the moment?
First thoughts:
I noticed that a ThreadPoolExecutor uses a BlockingQueue<Runnable> to queue up entries waiting for an available Thread.
So I thought about using a PriorityBlockingQueue, to change the priority of the Runnable when I access its CompletableFuture but:
PriorityBlockingQueue does not have a method to reprioritize an existing element, and
I need to figure out a way to get from the CompletableFuture to the corresponding Runnable entry in the queue.
Before I go further down this road, do you think this sounds like the correct approach? Have others ever had this kind of requirement? I tried to search for it, but found exactly nothing. Maybe CompletableFuture is not the correct way of doing this?
Background:
We have an internal report which displays 100 products per page. Initially we precalculated all DataRows for the report, which took way too long if someone has that many products.
So first optimization was to wrap the calculation in a memoized supplier:
List<Supplier<DataRow>> dataRows = products
.stream()
.map(p -> Suppliers.memoize(() -> calculateDataRowForProduct(p)))
.collect(Collectors.toList());
This means that initial display of first 100 entries now takes 5sec instead of 500sec (which is great), but when the user switches to the next pages, it takes another 5sec for each single one of them.
So the idea is, while the user is staring at the first screen, why not precalculate the next pages in the background. Which leads me to my question above.
Interesting problem :)
One way is to roll out a custom FutureTask class to facilitate changing the priorities of tasks dynamically.
DataRow and Product are both taken as just String here for simplicity.
import java.util.*;
import java.util.concurrent.*;
public class Testing {
private static String calculateDataRowForProduct(String product) {
try {
// Dummy operation.
Thread.sleep(200);
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println("Computation done for " + product);
return "data row for " + product;
}
public static void main(String[] args) throws ExecutionException, InterruptedException {
PriorityBlockingQueue<Runnable> customQueue = new PriorityBlockingQueue<Runnable>(1, new CustomRunnableComparator());
ThreadPoolExecutor executor = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS, customQueue);
List<String> products = new ArrayList<>();
for (int i = 0; i < 10; i++) {
products.add("product" + i);
}
Map<Integer, PrioritizedFutureTask<String>> taskIndexMap = new HashMap<>();
for (int i = 0; i < products.size(); i++) {
String product = products.get(i);
Callable<String> callable = () -> calculateDataRowForProduct(product);
PrioritizedFutureTask<String> dataRowFutureTask = new PrioritizedFutureTask<>(callable, i);
taskIndexMap.put(i, dataRowFutureTask);
executor.execute(dataRowFutureTask);
}
List<Integer> accessOrder = new ArrayList<>();
accessOrder.add(4);
accessOrder.add(7);
accessOrder.add(2);
accessOrder.add(9);
int priority = -1 * accessOrder.size();
for (Integer nextIndex : accessOrder) {
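PrioritizedFutureTask<String> taskAtIndex = taskIndexMap.get(nextIndex);
// Do the removal outside of an assert: assertions are disabled by default, which would skip it.
if (customQueue.remove(taskAtIndex)) {
customQueue.offer(taskAtIndex.set_priority(priority++));
}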
// Now this task will be at the front of the thread pool queue.
// Hence this task will execute next.
}
for (Integer nextIndex : accessOrder) {
PrioritizedFutureTask<String> dataRowFutureTask = taskIndexMap.get(nextIndex);
String dataRow = dataRowFutureTask.get();
System.out.println("Data row for index " + nextIndex + " = " + dataRow);
}
}
}
class PrioritizedFutureTask<T> extends FutureTask<T> implements Comparable<PrioritizedFutureTask<T>> {
private Integer _priority = 0;
private Callable<T> callable;
public PrioritizedFutureTask(Callable<T> callable, Integer priority) {
super(callable);
this.callable = callable;
_priority = priority;
}
public Integer get_priority() {
return _priority;
}
public PrioritizedFutureTask set_priority(Integer priority) {
_priority = priority;
return this;
}
@Override
public int compareTo(@NotNull PrioritizedFutureTask<T> other) {
if (other == null) {
throw new NullPointerException();
}
return get_priority().compareTo(other.get_priority());
}
}
class CustomRunnableComparator implements Comparator<Runnable> {
@Override
public int compare(Runnable task1, Runnable task2) {
return ((PrioritizedFutureTask)task1).compareTo((PrioritizedFutureTask)task2);
}
}
Output:
Computation done for product0
Computation done for product4
Data row for index 4 = data row for product4
Computation done for product7
Data row for index 7 = data row for product7
Computation done for product2
Data row for index 2 = data row for product2
Computation done for product9
Data row for index 9 = data row for product9
Computation done for product1
Computation done for product3
Computation done for product5
Computation done for product6
Computation done for product8
There is one more scope for optimization here.
The customQueue.remove(taskAtIndex) operation has O(n) time complexity with respect to the size of the queue (or the total number of products).
It might not matter much if the number of products is small (<= 10^5).
But it might result in a performance issue otherwise.
One solution to that is to extend PriorityBlockingQueue and roll out functionality to remove an element from a priority queue in O(log n) rather than O(n).
We can achieve that by keeping a hashmap inside the PriorityQueue structure. This hashmap will map each element to its index (or indices in case of duplicates) in the underlying array.
Fortunately, I had already implemented such a heap in Python some time back.
If you have more questions on this optimization, it's probably better to ask a new question altogether.
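A rough Java sketch of the index-tracking idea, under some loud assumptions: IndexedHeap is a hypothetical class written for illustration (it is not a JDK class and does not extend PriorityBlockingQueue), it is not thread-safe, and it assumes distinct elements whose equals()/hashCode() do not change while they are queued. The map from element to array position is what lets remove() skip the linear search.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Binary min-heap that tracks each element's array index so remove(e) is O(log n). */
class IndexedHeap<E> {
    private final List<E> heap = new ArrayList<>();
    private final Map<E, Integer> indexOf = new HashMap<>(); // element -> position in heap
    private final Comparator<? super E> cmp;

    IndexedHeap(Comparator<? super E> cmp) { this.cmp = cmp; }

    int size() { return heap.size(); }

    /** Adds a new element; duplicates are rejected to keep the index map unambiguous. */
    void add(E e) {
        if (indexOf.containsKey(e)) throw new IllegalArgumentException("duplicate element");
        heap.add(e);
        indexOf.put(e, heap.size() - 1);
        siftUp(heap.size() - 1);
    }

    /** Removes and returns the smallest element, or null if the heap is empty. */
    E poll() {
        if (heap.isEmpty()) return null;
        E top = heap.get(0);
        removeAt(0);
        return top;
    }

    /** Removes an arbitrary element in O(log n); returns false if it is not present. */
    boolean remove(E e) {
        Integer i = indexOf.get(e);
        if (i == null) return false;
        removeAt(i);
        return true;
    }

    private void removeAt(int i) {
        int last = heap.size() - 1;
        swap(i, last);                      // move the victim to the end
        indexOf.remove(heap.remove(last));  // drop it from both structures
        if (i < last) {                     // the element now at i may be out of order either way
            siftUp(i);
            siftDown(i);
        }
    }

    private void siftUp(int i) {
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (cmp.compare(heap.get(i), heap.get(parent)) >= 0) break;
            swap(i, parent);
            i = parent;
        }
    }

    private void siftDown(int i) {
        int n = heap.size();
        while (true) {
            int left = 2 * i + 1, right = left + 1, smallest = i;
            if (left < n && cmp.compare(heap.get(left), heap.get(smallest)) < 0) smallest = left;
            if (right < n && cmp.compare(heap.get(right), heap.get(smallest)) < 0) smallest = right;
            if (smallest == i) return;
            swap(i, smallest);
            i = smallest;
        }
    }

    private void swap(int a, int b) {
        E x = heap.get(a), y = heap.get(b);
        heap.set(a, y);
        heap.set(b, x);
        indexOf.put(x, b);
        indexOf.put(y, a);
    }
}
```

To sit behind a ThreadPoolExecutor, something like this would still have to be wrapped in a BlockingQueue implementation with its own locking, which is left out here.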
You could avoid submitting all of the tasks to the executor at the start. Instead, submit only one background task and, when it finishes, submit the next. If you want to get the 9000th row, submit it immediately (if it has not already been submitted):
static class FutureDataRow {
CompletableFuture<DataRow> future;
int index;
List<FutureDataRow> list;
Product product;
FutureDataRow(List<FutureDataRow> list, Product product){
this.list = list;
index = list.size();
list.add(this);
this.product = product;
}
public DataRow get(){
submit();
return future.join();
}
private synchronized void submit(){
if(future == null) future = CompletableFuture.supplyAsync(() ->
calculateDataRowForProduct(product), singleThreadedExecutor);
}
private void background(){
submit();
if(index >= list.size() - 1) return;
future.whenComplete((dr, t) -> list.get(index + 1).background());
}
}
...
List<FutureDataRow> dataRows = new ArrayList<>();
products.forEach(p -> new FutureDataRow(dataRows, p));
dataRows.get(0).background();
If you want you could also submit the next row inside the get method if you expect that they will navigate to the next page afterwards.
If you were instead using a multithreaded executor and you wanted to run multiple background tasks concurrently you could modify the background method to find the next unsubmitted task in the list and start it when the current background task has finished.
private synchronized boolean background(){
if(future != null) return false;
submit();
future.whenComplete((dr, t) -> {
for(int i = index + 1; i < list.size(); i++){
if(list.get(i).background()) return;
}
});
return true;
}
You would also need to start the first n tasks in the background instead of just the first one.
int n = 8; //number of active background tasks
for(int i = 0; i < dataRows.size() && n > 0; i++){
if(dataRows.get(i).background()) n--;
}
To answer my own question...
There is a surprisingly simple (and surprisingly boring) solution to my problem. I have no idea why it took me three days to find it, I guess it required the right mindset, that you only have when walking along an endless tranquilizing beach looking into the sunset on a quiet Sunday evening.
So, ah, it's a little bit embarrassing to write this, but when I need to fetch a certain value (say for 9000th product), and the future has not yet computed that value, I can, instead of somehow forcing the future to produce that value asap (by doing all this repriorisation and scheduling magic), I can, well, I can, ... simply ... compute that value myself! Yes! Wait, what? Seriously, that's it?
It's something like this: if (!future.isDone()) {future.complete(supplier.get());}
I just need to store the original Supplier alongside the CompletableFuture in some wrapper class. This is the wrapper class, which works like a charm, all it needs is a better name:
public static class FuturizedMemoizedSupplier<T> implements Supplier<T> {
private CompletableFuture<T> future;
private Supplier<T> supplier;
public FuturizedMemoizedSupplier(Supplier<T> supplier) {
this.supplier = supplier;
this.future = CompletableFuture.supplyAsync(supplier, singleThreadExecutor);
}
public T get() {
// if the future is not yet completed, we just calculate the value ourselves, and set it into the future
if (!future.isDone()) {
future.complete(supplier.get());
}
supplier = null;
return future.join();
}
}
Now, I think, there is a small chance for a race condition here, which could lead to the supplier being executed twice. But actually, I don't care, it produces the same value anyway.
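If running the supplier twice ever did matter, one possible variant is to wrap it in a FutureTask, whose run() does nothing once the task has started or finished. The sketch below is an illustration under assumptions, not the wrapper above: StealableSupplier is a made-up name, and the executor is passed in as a parameter instead of being read from a field.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.FutureTask;
import java.util.function.Supplier;

/** Memoizing supplier computed in the background, or by the caller if the caller gets there first. */
class StealableSupplier<T> implements Supplier<T> {
    private final FutureTask<T> task;

    StealableSupplier(Supplier<T> supplier, Executor executor) {
        this.task = new FutureTask<>(supplier::get);
        executor.execute(task); // queued for background computation
    }

    @Override
    public T get() {
        // FutureTask.run() returns immediately if the task already ran or is being run
        // by another thread, so the supplier body executes at most once.
        task.run();
        try {
            return task.get(); // result is ready, or we wait for the thread that is running it
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

As with the original wrapper, a value requested early is computed on the calling thread, so it can overlap with whatever the background thread is working on at that moment.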
Afterthoughts:
I have no idea why I didn't think of this earlier, I was completely fixated on the idea, it has to be the CompletableFuture which calculates the value, and it has to run in one of these background threads, and whatnot, and, well, none of these mattered or were in any way a requirement.
I think this whole question is a classic example of Ask what problem you really want to solve instead of coming up with a half baked broken solution, and ask how to fix that. In the end, I didn't care about CompletableFuture or any of its features at all, it was just the easiest way that came to my mind to run something in the background.
Thanks for your help!
We have a specialist multi-producer (User), single-consumer (Engine) queue. The User threads run more frequently and always add individual elements to the queue. The Engine thread runs less frequently and processes the stack elements in a batch. If the stack is empty, it'll park until a User thread has added an entry. This way a notify only needs to happen when the queue goes from empty to 1.
In this implementation, instead of the Engine thread iterating and removing one item at a time, it removes them all - a drainAll, instead of drainTo. No other operations can mutate the stack - just the User thread add, and the engine thread drainAll.
Currently we do this via a synchronised linked list, we are wondering if there is a non-blocking way to do this. The drainTo operation on JDK classes will iterate the stack, we just want to take everything in the stack in one operation, without iterating - as each iteration hits volatile/cas related logic, so we'd ideally just like to hit that once, per drainAll. Then the engine thread can iterate and operate on each individual element, without touching sync/volatile/cas operations.
The current implementation looks something like:
public class SynchronizedPropagationQueue implements PropagationQueue {
protected volatile PropagationEntry head;
protected volatile PropagationEntry tail;
protected synchronized void addEntry( PropagationEntry entry ) {
if ( head == null ) {
head = entry;
notifyWaitOnRest();
} else {
tail.setNext( entry );
}
tail = entry;
}
@Override
public synchronized PropagationEntry drainAll() {
PropagationEntry currentHead = head;
head = null;
tail = null;
return currentHead;
}
public synchronized void waitOnRest() {
try {
log.debug("Engine wait");
wait();
} catch (InterruptedException e) {
// do nothing
}
log.debug("Engine resumed");
}
@Override
public synchronized void notifyWaitOnRest() {
notifyAll();
}
}
Stacks have a very simple non-blocking implementation that supports a concurrent "pop all" operation easily, and can easily detect the empty->non-empty transition. You could have all your producers push items onto a stack and then have the engine empty the whole thing at once. It looks like this:
public class EngineQueue<T>
{
private final AtomicReference<Node<T>> m_lastItem = new AtomicReference<>();
public void add(T item)
{
Node<T> newNode = new Node<T>(item);
do {
newNode.m_next = m_lastItem.get();
} while(!m_lastItem.compareAndSet(newNode.m_next, newNode));
if (newNode.m_next == null)
{
// ... just went non-empty signal any waiting consumer
}
}
public List<T> removeAll()
{
Node<T> stack = m_lastItem.getAndSet(null);
// ... wait for non-empty if necessary
List<T> ret = new ArrayList<>();
for (;stack != null; stack=stack.m_next)
{
ret.add(stack.m_data);
}
Collections.reverse(ret);
return ret;
}
private static class Node<U>
{
Node<U> m_next;
final U m_data;
Node(U data)
{
super();
m_data = data;
}
}
}
For signaling around the empty -> non-empty transition, you can use normal synchronization. This is not going to be expensive if you only do it when you detect an empty state... since you only get to the empty state when you're out of work to do.
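One way the signaling could be wired in is sketched below. It is a sketch under assumptions rather than a drop-in replacement: BatchQueue, takeAll and signal are illustrative names, and exactly one consumer thread is assumed. The monitor is only touched by a producer that pushed onto an empty stack and by a consumer that found nothing to take.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

/** Lock-free multi-producer stack whose single consumer blocks only while it is empty. */
class BatchQueue<T> {
    private static final class Node<U> {
        final U data;
        Node<U> next;
        Node(U data) { this.data = data; }
    }

    private final AtomicReference<Node<T>> top = new AtomicReference<>();
    private final Object signal = new Object();

    /** Producer side: one CAS loop; the lock is taken only on the empty -> non-empty transition. */
    void add(T item) {
        Node<T> node = new Node<>(item);
        do {
            node.next = top.get();
        } while (!top.compareAndSet(node.next, node));
        if (node.next == null) {
            synchronized (signal) {
                signal.notifyAll();
            }
        }
    }

    /** Consumer side: detaches the whole stack at once, waiting while there is nothing to take. */
    List<T> takeAll() throws InterruptedException {
        Node<T> batch = top.getAndSet(null);
        while (batch == null) {
            synchronized (signal) {
                // Re-check under the lock so a notify sent between the check and wait() is not lost.
                while (top.get() == null) {
                    signal.wait();
                }
            }
            batch = top.getAndSet(null);
        }
        List<T> items = new ArrayList<>();
        for (Node<T> n = batch; n != null; n = n.next) {
            items.add(n.data);
        }
        Collections.reverse(items); // the stack is LIFO; restore insertion order
        return items;
    }
}
```

After takeAll() returns, the consumer walks a plain ArrayList, so no volatile or CAS operation is hit per element.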
Currently we do this via a synchronised linked list, we are wondering if there is a non-blocking way to do this. The drainTo operation on JDK classes will iterate the stack, we just want to take everything in the stack in one operation, without iterating
Maybe I don't understand but it seems like using a BlockingQueue.drainTo(...) method would be better than your implementation. For example the LinkedBlockingQueue.drainTo(...) method just has one lock around that method -- there's no iterating overhead that I see.
If this is not an academic discussion then I doubt that your performance problems are with the queue itself, and I would concentrate your efforts in other areas. If it is academic then @Matt's answer might be better, although certainly there's a lot more code to be written to support the full Collection method list.
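For reference, a minimal sketch of the drainTo approach. DrainExample and takeBatch are hypothetical names; take() and drainTo() are the standard BlockingQueue methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class DrainExample {
    /** Blocks until at least one element is available, then grabs everything queued so far. */
    static <T> List<T> takeBatch(BlockingQueue<T> queue) throws InterruptedException {
        List<T> batch = new ArrayList<>();
        batch.add(queue.take());   // waits while the queue is empty
        queue.drainTo(batch);      // moves the remaining elements in a single call
        return batch;              // the caller iterates this plain list afterwards
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("a");
        queue.add("b");
        queue.add("c");
        System.out.println(takeBatch(queue)); // prints [a, b, c]
    }
}
```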
I have this piece of code:
private ConcurrentLinkedQueue<Interval> intervals = new ConcurrentLinkedQueue<>();
@Override
public void run(){
while(!intervals.isEmpty()){
//remove one interval
//do calculations
//add some intervals
}
}
This code is being executed by a specific number of threads at the same time. As you see, loop should go on until there are no more intervals left in the collection, but there is a problem. In the beginning of each iteration an interval gets removed from collection and in the end some number of intervals might get added back into same collection.
Problem is, that while one thread is inside the loop the collection might become empty, so other threads that are trying to enter the loop won't be able to do that and will finish their work prematurely, even though collection might be filled with values after the first thread will finish the iteration. I want the thread count to remain constant (or not more than some number n) until all work is really finished.
That means that no threads are currently working in the loop and there are no elements left in the collection. What are possible ways of accomplishing that? Any ideas are welcomed.
One way to solve this problem in my specific case is to give every thread a different piece of the original collection. But after one thread would finish its work it wouldn't be used by the program anymore, even though it could help other threads with their calculations, so I don't like this solution, because it's important to utilize all cores of the machine in my problem.
This is the simplest minimal working example I could come up with. It might be too lengthy.
public class Test{
private ConcurrentLinkedQueue<Interval> intervals = new ConcurrentLinkedQueue<>();
private int threadNumber;
private Thread[] threads;
private double result;
public Test(int threadNumber){
intervals.add(new Interval(0, 1));
this.threadNumber = threadNumber;
threads = new Thread[threadNumber];
}
public double find(){
for(int i = 0; i < threadNumber; i++){
threads[i] = new Thread(new Finder());
threads[i].start();
}
try{
for(int i = 0; i < threadNumber; i++){
threads[i].join();
}
}
catch(InterruptedException e){
System.err.println(e);
}
return result;
}
private class Finder implements Runnable{
@Override
public void run(){
while(!intervals.isEmpty()){
Interval interval = intervals.poll();
if(interval.high - interval.low > 1e-6){
double middle = (interval.high + interval.low) / 2;
boolean something = true;
if(something){
intervals.add(new Interval(interval.low + 0.1, middle - 0.1));
intervals.add(new Interval(middle + 0.1, interval.high - 0.1));
}
else{
intervals.add(new Interval(interval.low + 0.1, interval.high - 0.1));
}
}
}
}
}
private class Interval{
double low;
double high;
public Interval(double low, double high){
this.low = low;
this.high = high;
}
}
}
What you might need to know about the program: After every iteration interval should either disappear (because it's too small), become smaller or split into two smaller intervals. Work is finished after no intervals are left. Also, I should be able to limit number of threads that are doing this work with some number n. The actual program looks for a maximum value of some function by dividing the intervals and throwing away the parts of those intervals that can't contain the maximum value using some rules, but this shouldn't really be relevant to my problem.
The CompletableFuture class is also an interesting solution for this kind of task.
It automatically distributes workload over a number of worker threads.
static CompletableFuture<Integer> fibonacci(int n) {
if(n < 2) return CompletableFuture.completedFuture(n);
else {
return CompletableFuture.supplyAsync(() -> {
System.out.println(Thread.currentThread());
CompletableFuture<Integer> f1 = fibonacci(n - 1);
CompletableFuture<Integer> f2 = fibonacci(n - 2);
return f1.thenCombineAsync(f2, (a, b) -> a + b);
}).thenComposeAsync(f -> f);
}
}
public static void main(String[] args) throws Exception {
int fib = fibonacci(10).get();
System.out.println(fib);
}
You can use an atomic flag, i.e.:
private ConcurrentLinkedQueue<Interval> intervals = new ConcurrentLinkedQueue<>();
private AtomicBoolean inUse = new AtomicBoolean();
@Override
public void run() {
while (!intervals.isEmpty() && inUse.compareAndSet(false, true)) {
// work
inUse.set(false);
}
}
UPD
The question has been updated, so here is a better solution. It is a more "classic" solution using a blocking queue:
private BlockingQueue<Interval> intervals = new LinkedBlockingQueue<>(); // unbounded; an ArrayBlockingQueue would need an explicit capacity
private volatile boolean finished = false;
@Override
public void run() {
try {
while (!finished) {
Interval next = intervals.take();
// put work there
// after you decide work is finished just set finished = true
intervals.put(next); // anyway, return interval to queue
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}
UPD2
Now it seems better to rewrite the solution and divide the range into sub-ranges, one for each thread.
Your problem looks like a recursive one - processing one task (interval) might produce some sub-tasks (sub intervals).
For that purpose I would use ForkJoinPool and RecursiveAction:
class Interval {
...
}
class IntervalAction extends RecursiveAction {
private Interval interval;
private IntervalAction(Interval interval) {
this.interval = interval;
}
@Override
protected void compute() {
if (...) {
// we need two sub-tasks
IntervalAction sub1 = new IntervalAction(new Interval(...));
IntervalAction sub2 = new IntervalAction(new Interval(...));
sub1.fork();
sub2.fork();
sub1.join();
sub2.join();
} else if (...) {
// we need just one sub-task
IntervalAction sub3 = new IntervalAction(new Interval(...));
sub3.fork();
sub3.join();
} else {
// current task doesn't need any sub-tasks, just return
}
}
}
public static void compute(Interval initial) {
ForkJoinPool pool = new ForkJoinPool();
pool.invoke(new IntervalAction(initial));
// invoke will return when all the processing is completed
}
I had the same problem, and I tested the following solution.
In my test example I have a queue (the equivalent of your intervals) filled with integers. For the test, at each iteration one number is taken from the queue, incremented and placed back in the queue if the new value is below 7 (arbitrary). This has the same impact as your interval generation on the mechanism.
Here is a working code example (note that I develop in Java 1.8 and use the Executor framework to handle my thread pool):
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
public class Test {
final int numberOfThreads;
final BlockingQueue<Integer> queue;
final BlockingQueue<Integer> availableThreadsTokens;
final BlockingQueue<Integer> sleepingThreadsTokens;
final ThreadPoolExecutor executor;
public static void main(String[] args) {
final Test test = new Test(2); // arbitrary number of thread => 2
test.launch();
}
private Test(int numberOfThreads){
this.numberOfThreads = numberOfThreads;
this.queue = new PriorityBlockingQueue<Integer>();
this.availableThreadsTokens = new LinkedBlockingQueue<Integer>(numberOfThreads);
this.sleepingThreadsTokens = new LinkedBlockingQueue<Integer>(numberOfThreads);
this.executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(numberOfThreads);
}
public void launch() {
// put some elements in queue at the beginning
queue.add(1);
queue.add(2);
queue.add(3);
for(int i = 0; i < numberOfThreads; i++){
availableThreadsTokens.add(1);
}
System.out.println("Start");
boolean algorithmIsFinished = false;
while(!algorithmIsFinished){
if(sleepingThreadsTokens.size() != numberOfThreads){
try {
availableThreadsTokens.take();
} catch (final InterruptedException e) {
e.printStackTrace();
// some treatment should be put there in case of failure
break;
}
if(!queue.isEmpty()){ // Continuation condition
sleepingThreadsTokens.drainTo(availableThreadsTokens);
executor.submit(new Loop(queue.poll(), queue, availableThreadsTokens));
}
else{
sleepingThreadsTokens.add(1);
}
}
else{
algorithmIsFinished = true;
}
}
executor.shutdown();
System.out.println("Finished");
}
public static class Loop implements Runnable{
int element;
final BlockingQueue<Integer> queue;
final BlockingQueue<Integer> availableThreadsTokens;
public Loop(Integer element, BlockingQueue<Integer> queue, BlockingQueue<Integer> availableThreadsTokens){
this.element = element;
this.queue = queue;
this.availableThreadsTokens = availableThreadsTokens;
}
@Override
public void run(){
System.out.println("taking element "+element);
for(Long l = (long) 0; l < 500000000L; l++){
}
for(Long l = (long) 0; l < 500000000L; l++){
}
for(Long l = (long) 0; l < 500000000L; l++){
}
if(element < 7){
this.queue.add(element+1);
System.out.println("Inserted element"+(element + 1));
}
else{
System.out.println("no insertion");
}
this.availableThreadsTokens.offer(1);
}
}
}
I ran this code as a check, and it seems to work properly. However, there are certainly some improvements that can be made:
sleepingThreadsTokens does not have to be a BlockingQueue, since only main accesses it. I used this interface because it allowed a nice sleepingThreadsTokens.drainTo(availableThreadsTokens);
I'm not sure whether queue has to be blocking or not, since only main takes from it and does not wait for elements (it waits only for tokens).
...
The idea is that the main thread checks for termination, and for this it has to know how many threads are currently working (so that it does not prematurely stop the algorithm because the queue is empty). To do so, two specific queues are created: availableThreadsTokens and sleepingThreadsTokens. Each element in availableThreadsTokens symbolizes a thread that has finished an iteration and is waiting to be given another one. Each element in sleepingThreadsTokens symbolizes a thread that was available to take a new iteration, but the queue was empty, so it had no job and went to "sleep". So at each moment availableThreadsTokens.size() + sleepingThreadsTokens.size() = numberOfThreads - threadsExecutingIteration.
Note that the elements in availableThreadsTokens and sleepingThreadsTokens only symbolize thread activity; they are not threads, nor do they designate a specific thread.
Case of termination: let's suppose we have N threads (arbitrary, fixed number). The N threads are waiting for work (N tokens in availableThreadsTokens), there is only 1 remaining element in the queue, and the treatment of this element won't generate any other element. Main takes the first token, finds that the queue is not empty, polls the element and sends the thread to work. The N-1 next tokens are consumed one by one, and since the queue is empty the tokens are moved into sleepingThreadsTokens one by one. Main knows that there is 1 thread working in the loop since there is no token in availableThreadsTokens and only N-1 in sleepingThreadsTokens, so it waits (.take()). When the thread finishes and releases the token, Main consumes it, discovers that the queue is now empty and puts the last token in sleepingThreadsTokens. Since all tokens are now in sleepingThreadsTokens, Main knows that 1) all threads are inactive and 2) the queue is empty (else the last token wouldn't have been transferred to sleepingThreadsTokens, since the thread would have taken the job).
Note that if the working thread finishes the treatment before all the availableThreadsTokens are moved to sleepingThreadsTokens, it makes no difference.
Now if we suppose that the treatment of the last element had generated M new elements in the queue, then Main would have put all the tokens from sleepingThreadsTokens back into availableThreadsTokens and started to assign them treatments again. We put all the tokens back even if M < N because we don't know how many elements will be inserted in the future, so we have to keep all the threads available.
I would suggest a master/worker approach then.
The master process goes through the intervals and assigns the calculations of that interval to a different process. It also removes/adds as necessary. This way, all the cores are utilized, and only when all intervals are finished, the process is done. This is also known as dynamic work allocation.
A possible example:
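public void run() {
    while (!intervals.isEmpty()) {
        // remove one interval
        Thread t = new Thread(new Runnable() {
            @Override
            public void run() {
                // do calculations
            }
        });
        // start(), not run(): calling run() directly would execute on the current thread
        t.start();
        // add some intervals
    }
}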
The solution you proposed is known as static allocation, and you're correct: it will only finish as fast as the slowest processor, whereas the dynamic approach keeps all the cores utilized.
I've run into this problem as well. The way I solved it was to use an AtomicInteger to track how many items are in the queue. Before each offer(), increment the integer; after each poll() that returns an element, decrement it. ConcurrentLinkedQueue (CLQ) has no reliable isEmpty(), since it has to look at the head/tail nodes, and those can change atomically (CAS) at any moment.
This isn't 100% safe: one thread may still increment the counter after another thread has decremented it, so you need to check again before ending the thread. It is still better than relying on while(...isEmpty()).
Other than that, you may need to synchronize.
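A minimal sketch of the counter idea, with one deliberate variation that should be read as an assumption rather than as what is described above: the counter is decremented only after the polled item has been fully processed (including offering its children), so it counts queued plus in-progress items and reaching zero means there is really nothing left. The names CountedWorkQueue, done(), finished() and drain() are hypothetical.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;

// Hypothetical wrapper: a ConcurrentLinkedQueue plus a counter of unfinished items.
class CountedWorkQueue<T> {
    private final Queue<T> queue = new ConcurrentLinkedQueue<>();
    private final AtomicInteger pending = new AtomicInteger();

    void offer(T item) {
        pending.incrementAndGet(); // count first, so the counter never under-reports
        queue.offer(item);
    }

    T poll() {
        return queue.poll(); // may return null while other threads are still producing
    }

    void done() {
        pending.decrementAndGet(); // call once per polled item, after processing it
    }

    boolean finished() {
        return pending.get() == 0; // nothing queued and nothing in progress
    }

    /** Runs nThreads workers until all work is finished; returns the number of processed items. */
    static <E> int drain(E seed, int nThreads, BiConsumer<E, CountedWorkQueue<E>> treatment) {
        CountedWorkQueue<E> work = new CountedWorkQueue<>();
        work.offer(seed);
        AtomicInteger processed = new AtomicInteger();
        Thread[] workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            workers[i] = new Thread(() -> {
                while (!work.finished()) {
                    E item = work.poll();
                    if (item == null) {
                        Thread.yield(); // queue momentarily empty, another thread is still working
                        continue;
                    }
                    try {
                        treatment.accept(item, work); // may offer new items
                        processed.incrementAndGet();
                    } finally {
                        work.done();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return processed.get();
    }
}
```

The idle workers spin here, which is acceptable for a sketch but wastes CPU; a blocking queue or a condition would be kinder if threads are often idle.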
In short: I want to process a large graph with circular references in parallel. I also don't have access to the full graph; I have to crawl through it. I want to organize an efficient queue to do that. Are there any best practices for this?
I'm trying to organize an endless data-processing flow with the following strategy: each thread takes a node to process from the queue and processes it; processing may produce new nodes, which the thread then has to put into the queue. No node should be processed more than once. Nodes are immutable entities.
As I understand it, I have to use thread-safe implementations of a queue and of a set (for already visited instances).
I'm trying to avoid synchronized methods. So, my implementation of this flow:
When a thread adds nodes to the queue, it checks each node: if the visited-nodes set contains this node, the thread doesn't add it to the queue. But that's not all.
When a thread takes a node from the queue, it checks whether the visited-nodes set contains this node. If it does, the thread takes another node from the queue, until it gets a node which hasn't been processed yet. After finding an unprocessed node, the thread also adds it to the visited-nodes set.
I've tried to use LinkedBlockingQueue and ConcurrentHashMap (as a set). I chose ConcurrentHashMap because it has the method putIfAbsent(key, value), which, as I understand it, atomically checks whether the map contains the key and, if it doesn't, adds it.
Here is the implementation of the described algorithm:
public class ParallelDataQueue {
private LinkedBlockingQueue<String> dataToProcess = new LinkedBlockingQueue<String>();
// using map as a set
private ConcurrentHashMap<String, Object> processedData = new ConcurrentHashMap<String, Object>( 1000000 );
private final Object value = new Object();
public String getNextDataInstance() {
while ( true ) {
try {
String data = this.dataToProcess.take();
Boolean dataIsAlreadyProcessed = ( this.processedData.putIfAbsent( data, this.value ) != null );
if ( dataIsAlreadyProcessed ) {
continue;
} else {
return data;
}
} catch ( InterruptedException e ) {
e.printStackTrace();
}
}
}
public void addData( Collection<String> data ) {
for ( String d : data ) {
if ( !this.processedData.containsKey( d ) ) {
try {
this.dataToProcess.put( d );
} catch ( InterruptedException e ) {
e.printStackTrace();
}
}
}
}
}
So my question: does the current implementation avoid processing nodes more than once? And is there perhaps a more elegant solution?
Thanks
P.S.
I understand that this implementation doesn't prevent duplicate nodes from appearing in the queue. But that is not critical for me; all I need is to avoid processing each node more than once.
Your current implementation does not avoid repeated data instances. Assume that "Thread A" checks whether the data exists in the concurrent map and finds that it does not, so it reports that the data does not exist. But just before executing the if after the putIfAbsent line, "Thread A" is suspended. At that point another thread, "Thread B", is scheduled by the CPU, checks for the same data element, finds that it does not exist, reports it as absent, and adds it to the queue. When Thread A is rescheduled, it continues from the if line and adds the element to the queue again.
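One possible way to close that window, sketched under assumptions: mark an element as seen at the moment it is enqueued, using a single atomic Set.add() instead of a separate containsKey() check followed by put(). The class name DedupQueue, the field seen and the helper queued() are hypothetical; dataToProcess, addData and getNextDataInstance mirror the names from the question.

```java
import java.util.Collection;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical variant of ParallelDataQueue: each element can enter the queue at most once.
class DedupQueue {
    private final BlockingQueue<String> dataToProcess = new LinkedBlockingQueue<>();
    // Concurrent set backed by a ConcurrentHashMap; add() is an atomic check-and-insert.
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    void addData(Collection<String> data) {
        for (String d : data) {
            // add() returns true only for the single thread that inserted d first.
            if (seen.add(d)) {
                dataToProcess.add(d);
            }
        }
    }

    String getNextDataInstance() throws InterruptedException {
        return dataToProcess.take(); // no second check needed: the queue holds no duplicates
    }

    int queued() {
        return dataToProcess.size();
    }
}
```

With this layout the set means "already scheduled" rather than "already processed", which is enough when the only requirement is that each node is handled once.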
Yes. Use ConcurrentLinkedQueue ( http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/ConcurrentLinkedQueue.html )
also
When thread adding data to the queue, it checking each instance of data: if set contains instance of this data, thread don't add it to the queue. But that's not all
is not a thread-safe approach, unless the underlying Collection is thread-safe. (which means it's synchronized internally) But then it's pointless to do the check, because it's already thread-safe...
If you need to process data in a multithreaded manner, you may not need collections at all. Have you considered using the Executors framework?
public static void main(String[] args) throws InterruptedException {
ExecutorService exec = Executors.newFixedThreadPool(100);
while (true) { // provide data infinitely
for (int i = 0; i < 1000; i++)
exec.execute(new DataProcessor(UUID.randomUUID(), exec));
Thread.sleep(10000); // wait a bit, then continue;
}
}
static class DataProcessor implements Runnable {
Object data;
ExecutorService exec;
public DataProcessor(Object data, ExecutorService exec) {
this.data = data;
this.exec = exec;
}
@Override
public void run() {
System.out.println(data); // process data
if (new Random().nextInt(100) < 50) // add new data piece for execution if needed
exec.execute(new DataProcessor(UUID.randomUUID(), exec));
}
}
I am writing a multithreaded parser.
Parser class is as follows.
public class Parser extends HTMLEditorKit.ParserCallback implements Runnable {
private static List<Item> itemList = Collections.synchronizedList(new ArrayList<Item>());
private boolean h2Tag = false;
private int count;
private static int threadCount = 0;
public static List<Item> parse() {
for (int i = 1; i <= 1000; i++) { //1000 of the same type of pages that need to parse
while (threadCount == 20) { //limit the number of simultaneous threads
try {
Thread.sleep(50);
} catch (InterruptedException ex) {
ex.printStackTrace();
}
}
Thread thread = new Thread(new Parser());
thread.setName(Integer.toString(i));
threadCount++; //increase the number of working threads
thread.start();
}
return itemList;
}
public void run() {
//Here is a piece of code responsible for creating links based on
//the thread name and passed as a parameter remained i,
//connection, start parsing, etc.
//In general, nothing special. Therefore, I won't paste it here.
threadCount--; //reduce the number of running threads when current stops
}
private static void addItem(Item item) {
itemList.add(item);
}
//This method retrieves the necessary information after the H2 tag is detected
@Override
public void handleText(char[] data, int pos) {
if (h2Tag) {
String itemName = new String(data).trim();
//Item - the item on which we receive information from a Web page
Item item = new Item();
item.setName(itemName);
item.setId(count);
addItem(item);
//Display information about an item in the console
System.out.println(count + " = " + itemName);
}
}
@Override
public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
if (HTML.Tag.H2 == t) {
h2Tag = true;
}
}
@Override
public void handleEndTag(HTML.Tag t, int pos) {
if (HTML.Tag.H2 == t) {
h2Tag = false;
}
}
}
From another class parser runs as follows:
List<Item> list = Parser.parse();
All is good, but there is a problem. At the end of parsing, the final list itemList contains 980 elements instead of 1000. But the console shows all 1000 elements (items). That is, some threads for some reason did not call the addItem method from handleText.
I have already tried changing the type of itemList to ArrayList, CopyOnWriteArrayList and Vector. I made the addItem method synchronized, and I replaced its call with a synchronized block. All this only changes the number of elements a little; the full thousand is never reached.
I also tried parsing a smaller number of pages (ten). As a result the list is empty, but the console shows all 10.
If I remove the multithreading, everything works fine but, of course, slowly. That's not good.
If I decrease the number of concurrent threads, the number of items in the list gets closer to the desired 1000; if I increase it, the number moves a little further from 1000. So I think there is contention for writing to the list. But then why is synchronization not working?
What's the problem?
After your parse() call returns, all of your 1000 threads have been started, but it is not guaranteed that they have finished. In fact, they haven't, and that's the problem you see. I would strongly recommend not writing this yourself but using the tools the SDK provides for this kind of job.
The Thread Pools documentation and ThreadPoolExecutor are good starting points. Again, don't implement this yourself unless you are absolutely sure you have to, because writing such multithreaded code is pure pain.
Your code should look something like this:
ExecutorService executor = Executors.newFixedThreadPool(20);
List<Future<?>> futures = new ArrayList<Future<?>>(1000);
for (int i = 0; i < 1000; i++) {
futures.add(executor.submit(new Runnable() {...}));
}
for (Future<?> f : futures) {
f.get();
}
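A self-contained sketch of the same idea, under assumptions: each task returns its parsed item through a Future, so no shared list is needed, and the list is only returned after every get() has completed. PooledParse, parseAll and parsePage are hypothetical names, and parsePage is just a stand-in for the real download and parse step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PooledParse {
    // Hypothetical stand-in for fetching and parsing page number i.
    static String parsePage(int i) {
        return "item-" + i;
    }

    static List<String> parseAll(int pages, int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>(pages);
            for (int i = 1; i <= pages; i++) {
                final int page = i;
                Callable<String> task = () -> parsePage(page);
                futures.add(executor.submit(task));
            }
            List<String> items = new ArrayList<>(pages);
            for (Future<String> f : futures) {
                items.add(f.get()); // blocks until that task has finished
            }
            return items; // every task is done by the time we get here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }
}
```

Because the futures are read in submission order, the resulting list is also in page order, regardless of which thread finished first.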
There is no problem with the code itself; it works as you coded it. The problem is with the last batch. All earlier iterations work properly, but during the last ones, from 980 to 1000, the threads are created and the main thread does not wait for them to complete before returning the list. That is why you get some number between 980 and 1000 when working with 20 threads at a time.
You could try adding Thread.sleep(50) before returning the list; the main thread would then wait for a while, and the other threads might have finished processing by then.
Alternatively, use one of Java's synchronization APIs. Instead of Thread.sleep(), use a CountDownLatch, which lets you wait for the threads to complete their processing and only then continue (for example, return the list or create new threads).
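A rough sketch of the CountDownLatch variant, assuming one thread per task and a synchronized list for the results. LatchedParse and runAll are hypothetical names, and the task body (adding the task id) only stands in for the real parsing; the 20-thread limit from the question is left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class LatchedParse {
    static List<Integer> runAll(int tasks) {
        List<Integer> results = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch done = new CountDownLatch(tasks); // one count per task
        for (int i = 1; i <= tasks; i++) {
            final int id = i;
            new Thread(() -> {
                try {
                    results.add(id); // stand-in for parsing page number id
                } finally {
                    done.countDown(); // always signal completion, even on failure
                }
            }).start();
        }
        try {
            done.await(); // blocks until every task has counted down
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return results;
    }
}
```

Unlike a fixed sleep, await() returns as soon as the last thread has finished, and not before.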