I have tried to create a parallel quicksort in Java, which I assume is a naive one (because I haven't studied the Executor interface etc. yet).
I needed a way to print the sorted array once all the threads are done, but I didn't know in advance how many threads I was going to have, so I made each level wait recursively with the join() method. That way the first join() that was invoked has to wait until all the other threads are done, right?
That way, when I execute the last two lines in main() (which print the array), I can be sure that all my threads are done.
So I have two questions:
Is this a multi-threaded program that runs in parallel, or have I made a mistake so that it actually runs linearly, thread after thread?
Is my solution for displaying the sorted array in the main method correct?
Here is my code:
public class Main {
public static void main(String[] args) {
ArrayList<Integer> array = new ArrayList<>();
//please assume that I have invoked the input for the array from the user
QuickSortWithThreads obj = new QuickSortWithThreads(array,0 ,array.size()-1 );
for(int i = 0; i < array.size(); i++)
System.out.println(array.get(i));
}
}
public class QuickSortWithThreads {
public QuickSortWithThreads(ArrayList <Integer> arr, int left, int right){
quicksort(arr, left, right);
}
static void quicksort(ArrayList <Integer> arr, int left, int right) {
int pivot;
if(left<right){
pivot = partition(arr, left, right);
QuickSortThread threadLeftSide = new QuickSortThread(arr, pivot + 1, right);
threadLeftSide.start();
quicksort(arr, left, pivot - 1);
try {
threadLeftSide.join();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
static int partition(ArrayList<Integer> arr, int left, int right) {
int pivot = arr.get(right);
int i = left -1;
for( int j = left; j <= right - 1; j++) {
if (arr.get(j) <= pivot){
i = i + 1;
exchange(arr, i, j);
}
}
exchange(arr, i + 1, right);
return i + 1;
}
static void exchange(ArrayList<Integer> arr, int i, int j) {
int swap = arr.get(i);
arr.set(i, arr.get(j));
arr.set(j, swap);
}
private static class QuickSortThread extends Thread {
int right;
int left;
ArrayList<Integer> refArray;
public QuickSortThread(ArrayList<Integer> array, int left, int right) {
this.right = right;
this.left = left;
refArray = new ArrayList<Integer>();
refArray = array;
}
public void run() {
quicksort(refArray, left, right);
}
}
}
If we knew the overall number of threads, we could use a CountDownLatch initialized with that number. But since we don't know the number of threads in advance, we need an extended CountDownLatch that allows the counter to be increased after creation. Unfortunately we cannot just extend CountDownLatch, because its underlying counter is private. One way is to duplicate the original code of CountDownLatch to gain access to the underlying counter. A less verbose way is to extend Semaphore to get access to its protected reducePermits method, as is done in a "reducible semaphore". In principle, CountDownLatch and Semaphore are similar tools that differ in how the internal counter is interpreted: the former counts vetoes and the latter counts permits.
The whole idea is to reduce the number of permits when a thread is created or started, and to release a permit when it finishes, at the end of its run() method. The initial number of permits is 1, so that if no threads are started, the main procedure finishes freely. Note that reducing the number of permits at the beginning of run() would be too late.
To get really good working code, you also need to use a thread pool with a fixed number of threads, and sort small arrays serially.
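A minimal sketch of that idea (the class and method names here are mine, not from any library; the "reducible semaphore" simply exposes the protected reducePermits):
import java.util.concurrent.Semaphore;

// Exposes the protected reducePermits() so the "count" can grow after creation.
class ReducibleSemaphore extends Semaphore {
    ReducibleSemaphore(int permits) { super(permits); }
    void reduce(int n) { reducePermits(n); }
}

class ThreadTracker {
    // Start with 1 permit: if no worker threads are ever spawned,
    // awaitAll() below returns immediately.
    private final ReducibleSemaphore done = new ReducibleSemaphore(1);

    Thread spawn(Runnable work) {
        done.reduce(1);                      // register the thread BEFORE starting it
        Thread t = new Thread(() -> {
            try {
                work.run();
            } finally {
                done.release();              // de-register at the very end of run()
            }
        });
        t.start();
        return t;
    }

    void awaitAll() throws InterruptedException {
        done.acquire();   // succeeds only once every spawned thread has released its permit
    }
}
Each recursive quicksort call would then use spawn() instead of creating threads directly, and main would call awaitAll() before printing the array.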
General opinion
Yes, your code runs in parallel. And the result printing looks all right as well.
Limiting number of threads via depth
One problem is the fact that you create a huge number of threads: at the lowest level, you'll have approximately as many threads as there are list elements. And you don't catch exceptions resulting from this, so you'll not know (in your main thread) that this didn't work as intended.
You should probably limit the number of levels for which you spawn new threads. Once you have passed, say, 3 levels, you'll have about 2³ = 8 threads, which should be enough to keep all cores busy on most reasonable machines. You can then let the rest of the computation proceed without branching off further threads. You could do that by passing an additional parameter branching to your quicksort method: set it to 3 in the invocation from the QuickSortWithThreads constructor, and decrement it on every call. Don't branch once the count reaches 0. This will give you the following calls (a sketch of the modified method follows the call tree below):
quicksort(3, …)
    quicksort(2, …)
        quicksort(1, …)
            quicksort(0, …)
            quicksort(0, …)
        quicksort(1, …)
            quicksort(0, …)
            quicksort(0, …)
    quicksort(2, …)
        quicksort(1, …)
            quicksort(0, …)
            quicksort(0, …)
        quicksort(1, …)
            quicksort(0, …)
            quicksort(0, …)
Since each non-leaf call shares a thread with one of its children, you can deduce the maximum of 8 threads from the number of leaves above.
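A rough sketch of the depth-limited variant (assuming the partition() method from your class; I spawn the extra thread with a lambda instead of your QuickSortThread so the branching counter can be passed along):
static void quicksort(ArrayList<Integer> arr, int left, int right, int branching) {
    if (left < right) {
        int pivot = partition(arr, left, right);
        if (branching > 0) {
            // spawn a thread for one half, recurse in the current thread for the other
            Thread t = new Thread(() -> quicksort(arr, pivot + 1, right, branching - 1));
            t.start();
            quicksort(arr, left, pivot - 1, branching - 1);
            try {
                t.join();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        } else {
            // depth budget exhausted: plain sequential recursion from here on
            quicksort(arr, left, pivot - 1, 0);
            quicksort(arr, pivot + 1, right, 0);
        }
    }
}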
Limiting number of threads via Executors
As an alternative to this home-made way of restricting the number of threads, you might of course use the Executor interface you mentioned. You could create a ThreadPoolExecutor to manage your threads and pass each recursive invocation as a Runnable (which might look similar to your QuickSortThread). One major problem with this approach is detecting termination, particularly if you want to avoid deadlock in case of an error. So it might be better to use a ForkJoinTask instead: in that case each task can wait for the completion of its other child, very similar to what you wrote, and you can still limit the number of actual threads in the associated ForkJoinPool. Your implementation would best use RecursiveAction, the specialization of ForkJoinTask for tasks with no return value, for which the documentation contains an example very similar to your scenario.
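A rough RecursiveAction sketch of the same quicksort (not your exact code; the partition is the same Lomuto scheme as in the question, and the ForkJoinPool caps the real thread count):
import java.util.ArrayList;
import java.util.Collections;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class QuickSortAction extends RecursiveAction {
    private static final int THRESHOLD = 1_000;   // sort small ranges serially
    private final ArrayList<Integer> arr;
    private final int left, right;

    QuickSortAction(ArrayList<Integer> arr, int left, int right) {
        this.arr = arr;
        this.left = left;
        this.right = right;
    }

    @Override
    protected void compute() {
        if (right - left < THRESHOLD) {
            sequentialSort(arr, left, right);
            return;
        }
        int pivot = partition(arr, left, right);
        invokeAll(new QuickSortAction(arr, left, pivot - 1),
                  new QuickSortAction(arr, pivot + 1, right));
    }

    // same Lomuto partition as in the question
    private static int partition(ArrayList<Integer> arr, int left, int right) {
        int pivot = arr.get(right), i = left - 1;
        for (int j = left; j < right; j++) {
            if (arr.get(j) <= pivot) Collections.swap(arr, ++i, j);
        }
        Collections.swap(arr, i + 1, right);
        return i + 1;
    }

    private static void sequentialSort(ArrayList<Integer> arr, int left, int right) {
        if (left < right) {
            int pivot = partition(arr, left, right);
            sequentialSort(arr, left, pivot - 1);
            sequentialSort(arr, pivot + 1, right);
        }
    }
}

// usage:
// new ForkJoinPool().invoke(new QuickSortAction(list, 0, list.size() - 1));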
The way your threads behave depends on your hardware. With a single-core CPU and no hyper-threading, the computer processes one thread at a time, switching between them. If you have hyper-threading and/or multiple cores, several threads can run simultaneously. A call to exampleThread.join() makes the calling thread wait until exampleThread finishes its job (by returning from its run() method).
If you create a thread and two lines later call join() on it, you pretty much have a synchronized multi-threaded task that behaves very much like a single-threaded one.
I'd suggest creating an ArrayList and adding each thread to it; after all threads are set up and working, you call
for(Thread t : mythreadlist) {
try {
t.join();
} catch (InterruptedException e) { System.err.println("Interrupted Thread"); }
}
to make your application wait for all threads to exit.
edit:
// [...]
public class QuickSortWithThreads {
static List<QuickSortThread> threads = Collections.synchronizedList(new ArrayList<>()); // static so the static quicksort() can reach it; synchronized because worker threads add to it concurrently
public QuickSortWithThreads(ArrayList <Integer> arr, int left, int right){
quicksort(arr, left, right); // Pretty much make your threads start their jobs
for (int i = 0; i < threads.size(); i++) {   // index-based loop: the list may still grow while we join
    try {
        threads.get(i).join();               // threads spawned by entry i were added before it finished
    } catch (InterruptedException e) { System.err.println("Interrupted Thread"); }
}
}
// [...]
static void quicksort(ArrayList <Integer> arr, int left, int right) {
int pivot;
if(left<right){
pivot = partition(arr, left, right);
QuickSortThread threadLeftSide = new QuickSortThread(arr, pivot + 1, right);
threadLeftSide.start();
threads.add(threadLeftSide); // keep a reference so the constructor can join() it later
//
quicksort(arr, left, pivot - 1);
}
}
// [...]
Related
Whenever I run this program it gives me a different result. Can someone explain this to me, or point me to some topics where I can find the answer, so I can understand what happens in the code?
class IntCell {
private int n = 0;
public int getN() {return n;}
public void setN(int n) {this.n = n;}
}
public class Count extends Thread {
static IntCell n = new IntCell();
public void run() {
int temp;
for (int i = 0; i < 200000; i++) {
temp = n.getN();
n.setN(temp + 1);
}
}
public static void main(String[] args) {
Count p = new Count();
Count q = new Count();
p.start();
q.start();
try { p.join(); q.join(); }
catch (InterruptedException e) { }
System.out.println("The value of n is " + n.getN());
}
}
The reason is simple: you don't get and modify your counter atomically, so your code is prone to race conditions.
Here is an example that illustrates the problem:
Thread #1 calls n.getN() gets 0
Thread #2 calls n.getN() gets 0
Thread #1 calls n.setN(1) to set n to 1
Thread #2 is not aware that thread #1 has already set n to 1, so it still calls n.setN(1), setting n to 1 instead of 2 as you would expect; this is called a race condition.
Your final result then depends on the total number of race conditions encountered while executing your code, which is unpredictable, so it changes from one run to another.
One way to fix it is to get and set your counter in a synchronized block so that it is done atomically, as shown below; this forces the threads to acquire an exclusive lock on the IntCell instance assigned to n before they can execute this section of code.
synchronized (n) {
temp = n.getN();
n.setN(temp + 1);
}
Output:
The value of n is 400000
You could also consider using an AtomicInteger instead of an int for your counter, and rely on methods such as addAndGet(int delta) or incrementAndGet() to increment it atomically.
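A small sketch of that variant (AtomicCell and increment() are illustrative names, not part of the question's code):
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical replacement for IntCell: the read-modify-write is a single atomic step.
class AtomicCell {
    private final AtomicInteger n = new AtomicInteger(0);
    public int getN() { return n.get(); }
    public void increment() { n.incrementAndGet(); }   // atomic, no synchronized needed
}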
Access to the static variable IntCell n is concurrent between your two threads:
static IntCell n = new IntCell();
public void run() {
int temp;
for (int i = 0; i < 200000; i++) {
temp = n.getN();
n.setN(temp + 1);
}
}
Race conditions mean that you cannot have predictable behavior when n.setN(temp + 1); is performed, as the result depends on which thread previously called temp = n.getN();.
If it was the current thread, you get the value it put there; otherwise you get the last value put by the other thread.
You could add a synchronization mechanism to avoid this unexpected behavior.
You are running two threads in parallel and updating a shared variable from both of them; that is why your result is different each time. It is not good practice to update a shared variable like this without synchronization.
To understand why, you should first read up on multithreading in general, and then on wait and notify, starting with simple cases.
You modify the same number n from two concurrent threads. If Thread1 reads n = 2, and Thread2 also reads n = 2 before Thread1 has written its increment, then Thread1 will increment n to 3, but Thread2 will not increment it further; it just writes another 3 to n. If Thread1 finishes its increment before Thread2 reads, both increments take effect.
Both threads are concurrent, and you can never tell which one gets which CPU cycles; this depends on what else is running on your machine. So you always lose a different number of increments to the overwriting situation described above.
Note that writing the increment as n++ would not solve it either: n++ is still a separate read, modify and write, so it is not atomic. You need synchronization or an AtomicInteger, as described in the other answers.
Hello, I've never tried using threads before; this is my first attempt, but it doesn't stop. The normal (sequential) version works.
If I remove awaitTermination it looks like it works, but I need the method to return only when everything is all sorted out (pun intended XD).
Can you tell me what I did wrong?
Thank you.
public class Sorting {
private Sorting() {};
private static Random r = new Random();
private static int cores = Runtime.getRuntime().availableProcessors();
private static ExecutorService executor = Executors.newFixedThreadPool(cores);
public static void qsortP(int[] a) {
qsortParallelo(a, 0, a.length - 1);
}
private static void qsortParallelo(int[] a, int first, int last) {
while (first < last) {
int p = first + r.nextInt(last - first + 1);
int px = a[p];
int i = first, j = last;
do {
while (a[i] < px)
i++;
while (a[j] > px)
j--;
if (i <= j) {
scambia(a, i++, j--);
}
} while (i <= j);
executor.submit(new qsortThread(a, first, j));
first = i;
}
try {
executor.awaitTermination(1, TimeUnit.DAYS);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
private static void scambia(int[] a, int x, int y) {
int temp = a[x];
a[x] = a[y];
a[y] = temp;
}
public static class qsortThread implements Runnable {
final int a[], first, last;
public qsortThread(int[] a, int first, int last) {
this.a = a;
this.first = first;
this.last = last;
}
public void run() {
qsortParallelo(a, first, last);
}
}
}
Instead of waiting for termination of the entire executor service (which probably isn't what you want at all), you should save all the Futures returned by executor.submit() and wait until they're all done (by calling get() on them, for example).
And though it's tempting to do this in the qsortParallelo() method, that would actually lead to a deadlock by exhaustion of the thread pool: parent tasks would hog the worker threads waiting for their child tasks to complete, but the child tasks would never be scheduled to run because there would be no available worker threads.
So you have to collect all the Future objects into a concurrent collection first, return the result to qsortP() and wait there for the Futures to finish.
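A rough sketch of that idea, assuming the Sorting class above is changed so that qsortParallelo() adds every Future it submits to a shared queue instead of calling awaitTermination() (the field name pending is mine; imports for ConcurrentLinkedQueue, Future and ExecutionException are assumed):
// shared between qsortP() and qsortParallelo(); qsortParallelo() must do
// pending.add(executor.submit(new qsortThread(...))) and must NOT call awaitTermination()
private static final ConcurrentLinkedQueue<Future<?>> pending = new ConcurrentLinkedQueue<>();

public static void qsortP(int[] a) {
    pending.add(executor.submit(new qsortThread(a, 0, a.length - 1)));
    Future<?> f;
    while ((f = pending.poll()) != null) {      // tasks may enqueue new futures while we wait
        try {
            f.get();                            // a task finishes only after enqueueing its children
        } catch (InterruptedException | ExecutionException e) {
            e.printStackTrace();
        }
    }
}
When the queue runs dry, every submitted task has been waited on, and each task enqueues its children before it completes, so no work can still be in flight.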
Or use a ForkJoinPool, which was designed for exactly this kind of task and does all the donkey work for you. Recursively submitting tasks to an executor from application code is generally not a very good idea; it's very easy to get wrong.
As an aside, the reason your code deadlocks as written is that every worker thread gets stuck in executor.awaitTermination(), thereby preventing the executor service from ever terminating.
In general, the two most useful tools for designing and debugging multi-threaded applications are:
A thread dump. You can generate one with jstack, VisualVM or any other tool; it's invaluable in deadlock situations, as it gives you an accurate picture of what is (or is not) going on with your threads.
A pen, a piece of paper and drawing a good old fashioned swimlane chart.
You are calling executor.awaitTermination inside a task that was launched by your executor. That thread will not stop until the executor comes out of awaitTermination, and the executor will not come out of awaitTermination until the thread terminates. You need to move this code:
try {
executor.awaitTermination(1, TimeUnit.DAYS);
} catch (InterruptedException e) {
e.printStackTrace();
}
to the end of the qsortP method.
The mistake in this code is simply the while-loop in qsortParallelo. first and last are never modified. Apart from that you don't need the while-loop, since you already do that the further sorting in the executor. And you'll need to start another task for the second half of the array.
I have this piece of code:
private ConcurrentLinkedQueue<Interval> intervals = new ConcurrentLinkedQueue();
@Override
public void run(){
while(!intervals.isEmpty()){
//remove one interval
//do calculations
//add some intervals
}
}
This code is executed by a specific number of threads at the same time. As you can see, the loop should go on until there are no more intervals left in the collection, but there is a problem: at the beginning of each iteration an interval gets removed from the collection, and at the end some number of intervals might get added back into the same collection.
The problem is that while one thread is inside the loop, the collection might become empty, so other threads trying to enter the loop won't be able to and will finish their work prematurely, even though the collection might be filled with values again once the first thread finishes its iteration. I want the thread count to remain constant (or not more than some number n) until all the work is really finished.
That means that no threads are currently working in the loop and there are no elements left in the collection. What are possible ways of accomplishing that? Any ideas are welcome.
One way to solve this problem in my specific case would be to give every thread a different piece of the original collection. But once a thread finished its work it wouldn't be used by the program anymore, even though it could help the other threads with their calculations, so I don't like this solution; it's important in my problem to utilize all cores of the machine.
This is the simplest minimal working example I could come up with. It might be too lengthy.
public class Test{
private ConcurrentLinkedQueue<Interval> intervals = new ConcurrentLinkedQueue();
private int threadNumber;
private Thread[] threads;
private double result;
public Test(int threadNumber){
intervals.add(new Interval(0, 1));
this.threadNumber = threadNumber;
threads = new Thread[threadNumber];
}
public double find(){
for(int i = 0; i < threadNumber; i++){
threads[i] = new Thread(new Finder());
threads[i].start();
}
try{
for(int i = 0; i < threadNumber; i++){
threads[i].join();
}
}
catch(InterruptedException e){
System.err.println(e);
}
return result;
}
private class Finder implements Runnable{
@Override
public void run(){
while(!intervals.isEmpty()){
Interval interval = intervals.poll();
if(interval.high - interval.low > 1e-6){
double middle = (interval.high + interval.low) / 2;
boolean something = true;
if(something){
intervals.add(new Interval(interval.low + 0.1, middle - 0.1));
intervals.add(new Interval(middle + 0.1, interval.high - 0.1));
}
else{
intervals.add(new Interval(interval.low + 0.1, interval.high - 0.1));
}
}
}
}
}
private class Interval{
double low;
double high;
public Interval(double low, double high){
this.low = low;
this.high = high;
}
}
}
What you might need to know about the program: after every iteration an interval should either disappear (because it's too small), become smaller, or split into two smaller intervals. Work is finished when no intervals are left. Also, I should be able to limit the number of threads doing this work to some number n. The actual program looks for a maximum value of some function by dividing the intervals and throwing away the parts of those intervals that can't contain the maximum value using some rules, but this shouldn't really be relevant to my problem.
The CompletableFuture class is also an interesting solution for this kind of task.
It automatically distributes the workload over a number of worker threads.
static CompletableFuture<Integer> fibonacci(int n) {
if(n < 2) return CompletableFuture.completedFuture(n);
else {
return CompletableFuture.supplyAsync(() -> {
System.out.println(Thread.currentThread());
CompletableFuture<Integer> f1 = fibonacci(n - 1);
CompletableFuture<Integer> f2 = fibonacci(n - 2);
return f1.thenCombineAsync(f2, (a, b) -> a + b);
}).thenComposeAsync(f -> f);
}
}
public static void main(String[] args) throws Exception {
int fib = fibonacci(10).get();
System.out.println(fib);
}
You can use an atomic flag, e.g.:
private ConcurrentLinkedQueue<Interval> intervals = new ConcurrentLinkedQueue<>();
private AtomicBoolean inUse = new AtomicBoolean();
@Override
public void run() {
while (!intervals.isEmpty() && inUse.compareAndSet(false, true)) {
// work
inUse.set(false);
}
}
UPD
The question has been updated, so I'll give you a better solution. It is a more "classic" solution using a blocking queue:
private BlockingQueue<Interval> intervals = new ArrayBlockingQueue<>(1024); // ArrayBlockingQueue needs a capacity; 1024 is an arbitrary bound
private volatile boolean finished = false;
@Override
public void run() {
try {
while (!finished) {
Interval next = intervals.take();
// put work there
// after you decide work is finished just set finished = true
intervals.put(next); // anyway, return the interval to the queue
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}
UPD2
Now it seems better to rewrite the solution and divide the range into sub-ranges, one for each thread.
Your problem looks like a recursive one: processing one task (interval) might produce some sub-tasks (sub-intervals).
For that purpose I would use ForkJoinPool and RecursiveTask:
class Interval {
...
}
class IntervalAction extends RecursiveAction {
private Interval interval;
private IntervalAction(Interval interval) {
this.interval = interval;
}
@Override
protected void compute() {
if (...) {
// we need two sub-tasks
IntervalAction sub1 = new IntervalAction(new Interval(...));
IntervalAction sub2 = new IntervalAction(new Interval(...));
sub1.fork();
sub2.fork();
sub1.join();
sub2.join();
} else if (...) {
// we need just one sub-task
IntervalAction sub3 = new IntervalAction(new Interval(...));
sub3.fork();
sub3.join();
} else {
// current task doesn't need any sub-tasks, just return
}
}
}
public static void compute(Interval initial) {
ForkJoinPool pool = new ForkJoinPool();
pool.invoke(new IntervalAction(initial));
// invoke will return when all the processing is completed
}
I had the same problem, and I tested the following solution.
In my test example I have a queue (the equivalent of your intervals) filled with integers. For the test, at each iteration one number is taken from the queue, incremented and placed back in the queue if the new value is below 7 (arbitrary). This has the same impact as your interval generation on the mechanism.
Here is an example of working code (note that I develop in Java 1.8 and I use the Executor framework to handle my thread pool):
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
public class Test {
final int numberOfThreads;
final BlockingQueue<Integer> queue;
final BlockingQueue<Integer> availableThreadsTokens;
final BlockingQueue<Integer> sleepingThreadsTokens;
final ThreadPoolExecutor executor;
public static void main(String[] args) {
final Test test = new Test(2); // arbitrary number of thread => 2
test.launch();
}
private Test(int numberOfThreads){
this.numberOfThreads = numberOfThreads;
this.queue = new PriorityBlockingQueue<Integer>();
this.availableThreadsTokens = new LinkedBlockingQueue<Integer>(numberOfThreads);
this.sleepingThreadsTokens = new LinkedBlockingQueue<Integer>(numberOfThreads);
this.executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(numberOfThreads);
}
public void launch() {
// put some elements in queue at the beginning
queue.add(1);
queue.add(2);
queue.add(3);
for(int i = 0; i < numberOfThreads; i++){
availableThreadsTokens.add(1);
}
System.out.println("Start");
boolean algorithmIsFinished = false;
while(!algorithmIsFinished){
if(sleepingThreadsTokens.size() != numberOfThreads){
try {
availableThreadsTokens.take();
} catch (final InterruptedException e) {
e.printStackTrace();
// some treatment should be put there in case of failure
break;
}
if(!queue.isEmpty()){ // Continuation condition
sleepingThreadsTokens.drainTo(availableThreadsTokens);
executor.submit(new Loop(queue.poll(), queue, availableThreadsTokens));
}
else{
sleepingThreadsTokens.add(1);
}
}
else{
algorithmIsFinished = true;
}
}
executor.shutdown();
System.out.println("Finished");
}
public static class Loop implements Runnable{
int element;
final BlockingQueue<Integer> queue;
final BlockingQueue<Integer> availableThreadsTokens;
public Loop(Integer element, BlockingQueue<Integer> queue, BlockingQueue<Integer> availableThreadsTokens){
this.element = element;
this.queue = queue;
this.availableThreadsTokens = availableThreadsTokens;
}
@Override
public void run(){
System.out.println("taking element "+element);
for(Long l = (long) 0; l < 500000000L; l++){
}
for(Long l = (long) 0; l < 500000000L; l++){
}
for(Long l = (long) 0; l < 500000000L; l++){
}
if(element < 7){
this.queue.add(element+1);
System.out.println("Inserted element"+(element + 1));
}
else{
System.out.println("no insertion");
}
this.availableThreadsTokens.offer(1);
}
}
}
I ran this code as a check, and it seems to work properly. However, there are certainly some improvements that could be made:
sleepingThreadsTokens does not have to be a BlockingQueue, since only the main thread accesses it. I used this interface because it allowed a nice sleepingThreadsTokens.drainTo(availableThreadsTokens);
I'm not sure whether queue has to be blocking or not, since only the main thread takes from it and it does not wait for elements (it waits only for tokens).
...
The idea is that the main thread checks for termination, and for this it has to know how many threads are currently working (so that it does not prematurely stop the algorithm just because the queue is empty). To do so, two specific queues are created: availableThreadsTokens and sleepingThreadsTokens. Each element in availableThreadsTokens represents a thread that has finished an iteration and is waiting to be given another one. Each element in sleepingThreadsTokens represents a thread that was available to take a new iteration, but the queue was empty, so it had no job and went to "sleep". So at each moment availableThreadsTokens.size() + sleepingThreadsTokens.size() = numberOfThreads - threadsExecutingIterations.
Note that the elements in availableThreadsTokens and sleepingThreadsTokens only represent thread activity; they are not threads, nor do they designate any specific thread.
Termination case: suppose we have N threads (an arbitrary, fixed number). The N threads are waiting for work (N tokens in availableThreadsTokens), there is only one remaining element in the queue, and processing this element won't generate any other element. Main takes the first token, finds that the queue is not empty, polls the element and sends a thread to work. The next N-1 tokens are consumed one by one, and since the queue is empty the tokens are moved into sleepingThreadsTokens one by one. Main knows that there is one thread working in the loop, since there is no token in availableThreadsTokens and only N-1 in sleepingThreadsTokens, so it waits (take()). When the thread finishes and releases its token, Main consumes it, discovers that the queue is still empty, and puts the last token into sleepingThreadsTokens. Since all tokens are now in sleepingThreadsTokens, Main knows that 1) all threads are inactive and 2) the queue is empty (otherwise the last token wouldn't have been transferred to sleepingThreadsTokens, since the thread would have taken the job).
Note that if the working thread finishes its processing before all the availableThreadsTokens have been moved to sleepingThreadsTokens, it makes no difference.
Now if we suppose that processing the last element had generated M new elements in the queue, then Main would have put all the tokens from sleepingThreadsTokens back into availableThreadsTokens and started to assign work again. We put all the tokens back even if M < N because we don't know how many elements will be inserted in the future, so we have to keep all the threads available.
I would suggest a master/worker approach, then.
The master goes through the intervals and assigns the calculation of each interval to a different worker, removing and adding intervals as necessary. This way, all the cores are utilized, and the process is done only when all intervals are finished. This is also known as dynamic work allocation.
A possible example:
public void run(){
    while(!intervals.isEmpty()){
        //remove one interval
        Thread t = new Thread(new Runnable() {
            public void run() {
                //do calculations
            }
        });
        t.start();   // start(), not run(): run() would execute on the current thread
        //add some intervals
    }
}
The possible solution you described is known as static allocation, and you're correct: it will only finish as fast as the slowest worker, whereas the dynamic approach keeps all the cores utilized.
I've run into this problem as well. The way I solved it was to use an AtomicInteger to track what is in the queue: before each offer(), increment the integer; after each poll(), decrement it. The CLQ has no really cheap emptiness check, since it must look at the head/tail nodes, and these can change concurrently (via CAS).
This doesn't guarantee 100% correctness: some thread may increment just after another thread decrements, so you need to check again before ending the thread. It is still better than relying on while(!...isEmpty()).
Other than that, you may need to synchronize.
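As a rough sketch of how the Finder class from the question could be adapted (slightly changed from the description above: the counter is decremented only after an item has been fully processed, so it also covers in-flight work; the names addInterval and pending are illustrative):
private final ConcurrentLinkedQueue<Interval> intervals = new ConcurrentLinkedQueue<>();
private final AtomicInteger pending = new AtomicInteger();   // items queued or being processed

private void addInterval(Interval iv) {
    pending.incrementAndGet();       // count before offering, so the count never under-reports
    intervals.offer(iv);
}

@Override
public void run() {
    while (pending.get() > 0) {                  // work may be in flight even if the queue looks empty
        Interval iv = intervals.poll();
        if (iv == null) continue;                // another thread holds the remaining work; retry (or sleep briefly)
        try {
            // ... process iv, possibly calling addInterval() for new sub-intervals ...
        } finally {
            pending.decrementAndGet();           // only after any children have been added
        }
    }
}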
To simplify my case, let's assume that I'm implementing a Binary Search using Java's Fork-Join framework. My goal is to find a specific integer value (the target integer) in an array of integers. This can be done by breaking the array by half until it's small enough to perform a serial search. The result of the algorithm needs to be a boolean value indicating whether the target integer was found in the array or not.
A similar problem is explored in Klaus Kreft's presentation in slide 28 onward. However, Kreft's goal is to find the largest number in the array so all entries have to be scanned. In my case, it is not necessary to scan the full array because once the target integer was found, the search can be stopped.
My problem is that once I encounter the target integer, many tasks have already been submitted to the thread pool, and I need to cancel them since there is no point in continuing the search. I tried to call getPool().terminate() from inside a RecursiveTask but that didn't help much, since many tasks were already queued, and I even noticed that new ones were being queued even after shutdown was called.
My current solution is to use a static volatile boolean that is initiated as 'false' and to check its value at the beginning of the task. If it's still 'false' then the task begins its works, if it's 'true', the task immediately returns. I can actually use a RecursiveAction for that.
So I think that this solution should work, but I wonder if the framework offers some standard way of handling cases like that - i.e. defining a stop condition to the recursion that consequently cancels all queued tasks.
Note that if I want to stop all running tasks immediately when the target integer was found (by one of the running tasks) I have to check the boolean after each line in these tasks and that can affect performance since the value of that boolean cannot be cached (it's defined as volatile).
So indeed, I think that some standard solution is needed and could be provided in the form of clearing the queue and interrupting the running tasks. But I haven't found such a solution, and I wonder if anyone else knows about it or has a better idea.
Thank you for your time,
Assaf
EDIT: here is my testing code:
package xxx;
import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
public class ForkJoinTest {
static final int ARRAY_SIZE = 1000;
static final int THRESHOLD = 10;
static final int MIN_VALUE = 0;
static final int MAX_VALUE = 100;
static Random rand = new Random();
// a function for retrieving a random int in a specific range
public static int randInt(int min, int max) {
return rand.nextInt((max - min) + 1) + min;
}
static volatile boolean result = false;
static int[] array = new int[ARRAY_SIZE];
static int target;
@SuppressWarnings("serial")
static class MyAction extends RecursiveAction {
int startIndex, endIndex;
public MyAction(int startIndex, int endIndex) {
this.startIndex = startIndex;
this.endIndex = endIndex;
}
// if the target integer was not found yet: we first check whether
// the entries to search are too few. In that case, we perform a
// sequential search and update the result if the target was found.
// Otherwise, we break the search into two parts and invoke the
// search in these two tasks.
@Override
protected void compute() {
if (!result) {
if (endIndex-startIndex<THRESHOLD) {
//
for (int i=startIndex ; i<endIndex ; i++) {
if (array[i]==target) {
result = true;
}
}
} else {
int middleIndex = (startIndex + endIndex) / 2;
RecursiveAction action1 = new MyAction(startIndex, middleIndex);
RecursiveAction action2 = new MyAction(middleIndex+1, endIndex);
invokeAll(Arrays.asList(action1,action2));
}
}
}
}
public static void main(String[] args) throws InterruptedException, ExecutionException {
for (int i=0 ; i<ARRAY_SIZE ; i++) {
array[i] = randInt(MIN_VALUE, MAX_VALUE);
}
target = randInt(MIN_VALUE, MAX_VALUE);
ForkJoinPool pool = new ForkJoinPool();
pool.invoke(new MyAction(0,ARRAY_SIZE));
System.out.println(result);
}
}
I think you may be inventing a barrier to the correct solution.
You say that your boolean stop flag must be volatile and so will interfere with the speed of the solution. Well, yes and no: accessing a volatile does have a cost, but have you considered an AtomicBoolean?
I believe the correct solution is to use an AtomicBoolean flag to get all tasks to stop. You should check it in as fine-grained a fashion as is reasonable, so that your system stops quickly.
It would be a mistake to attempt to clear all queues and interrupt all threads - this would lead to a horrible mess.
static AtomicBoolean finished = new AtomicBoolean();
....
protected void compute() {
if (!finished.get()) {
if (endIndex - startIndex < THRESHOLD) {
//
for (int i = startIndex; i < endIndex && !finished.get(); i++) {
if (array[i] == target) {
finished.set(true);
System.out.print("Found at " + i);
}
}
} else {
...
}
}
}
I left a comment above on how to do this, pointing at an open source product that does this in many of its built-in functions. Let me put some detail here.
If you want to cancel tasks that are beginning or are currently executing, then each task needs to know about every other task. When one task finds what it wants, that task needs to inform every other task to stop. You cannot do this with dyadic recursive division (RecursiveTask, etc.), since you create new tasks recursively and the old tasks will never know about the new ones. I'm sure you could pass a reference to a stop-me field to each new task, but it will get very messy and debugging would be "interesting."
You can do this with the Java 8 CountedCompleter. The framework was butchered to support this class, so things that should be done by the framework need doing manually, but it can work.
Each task needs a volatile boolean and a method to set it to true. Each task needs an array of references to all the other tasks. Create all the tasks up front, each with an as-yet-empty array of references to the other tasks. Then fill in each array with references to every other task. Now submit each task (see the docs for this class: fork(), addToPendingCount(), etc.)
When one task finds what it wants, it uses its array of references to set the other tasks' booleans to true. If there is a race between multiple threads, it doesn't matter, since all of them set "true". You will also need to handle tryComplete(), onCompletion(), etc. This class is very muddled; it is used for Java 8 stream processing, which is a story in itself.
What you cannot do is purge pending tasks from the deques before they begin. You need to wait until the task starts and check the boolean for true. If the execution is lengthy, then you may also want to check the boolean for true periodically. The overhead of a volatile read is not that bad and there really is no other way.
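A bare-bones sketch of the "every task can signal every other task" part (the CountedCompleter plumbing described above is omitted; all names here are illustrative):
import java.util.List;

class SearchTask {
    volatile boolean stop;                 // set by a peer once the target has been found
    List<SearchTask> peers;                // filled in after all tasks have been created

    void signalAllToStop() {
        for (SearchTask t : peers) {
            t.stop = true;                 // races are harmless: every writer writes "true"
        }
    }

    void search(int[] array, int from, int to, int target) {
        for (int i = from; i < to && !stop; i++) {   // periodic check of the shared flag
            if (array[i] == target) {
                signalAllToStop();
                return;
            }
        }
    }
}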
This code should produce both even and odd output because no method is synchronized. Yet the output on my JVM is always even. I am really confused, as this example comes straight out of Doug Lea.
public class TestMethod implements Runnable {
private int index = 0;
public void testThisMethod() {
index++;
index++;
System.out.println(Thread.currentThread().toString() + " "
+ index );
}
public void run() {
while(true) {
this.testThisMethod();
}
}
public static void main(String args[]) {
int i = 0;
TestMethod method = new TestMethod();
while(i < 20) {
new Thread(method).start();
i++;
}
}
}
Output
Thread[Thread-8,5,main] 135134
Thread[Thread-8,5,main] 135136
Thread[Thread-8,5,main] 135138
Thread[Thread-8,5,main] 135140
Thread[Thread-8,5,main] 135142
Thread[Thread-8,5,main] 135144
I tried with volatile and got the following (with an if to print only if odd):
Thread[Thread-12,5,main] 122229779
Thread[Thread-12,5,main] 122229781
Thread[Thread-12,5,main] 122229783
Thread[Thread-12,5,main] 122229785
Thread[Thread-12,5,main] 122229787
Answer to comments:
The index is in fact shared, because we have one TestMethod instance but many threads that call testThisMethod() on that single instance.
Code (no changes besides the one mentioned above):
public class TestMethod implements Runnable {
volatile private int index = 0;
public void testThisMethod() {
index++;
index++;
if(index % 2 != 0){
System.out.println(Thread.currentThread().toString() + " "
+ index );
}
}
public void run() {
while(true) {
this.testThisMethod();
}
}
public static void main(String args[]) {
int i = 0;
TestMethod method = new TestMethod();
while(i < 20) {
new Thread(method).start();
i++;
}
}
}
First of all: as others have noted, there's no guarantee at all that your threads are interrupted between the two increment operations.
Note that printing to System.out quite likely forces some kind of synchronization on your threads, so a thread is likely to have just started a fresh time slice when it returns from that call; it will then probably complete the two increment operations and wait again on the shared System.out resource.
Try replacing the System.out.println() with something like this:
int snapshot = index;
if (snapshot % 2 != 0) {
System.out.println("Oh noes! " + snapshot);
}
You don't know that. The point of automatic scheduling is that it makes no guarantees. It might treat two threads that run the same code completely differently. Or completely the same. Or completely the same for an hour and then suddenly differently...
The point is, even if you fix the problems mentioned in the other answers, you still cannot rely on things coming out a particular way; you must always be prepared for any possible interleaving that the Java memory and threading model allows, and that includes the possibility that the println always happens after an even number of increments, even if that seems unlikely to you on the face of it.
The result is exactly as I would expect. index is being incremented twice between outputs, and there is no interaction between threads.
To turn the question around - why would you expect odd outputs?
EDIT: Whoops. I wrongly assumed a new runnable was being created per Thread, and therefore there was a distinct index per thread, rather than shared. Disturbing how such a flawed answer got 3 upvotes though...
You have not marked index as volatile. This means that the compiler is allowed to optimize accesses to it, and it probably merges your 2 increments to one addition.
You get the output of the very first thread you start, because this thread loops and gives the other threads no chance to run.
So you should call Thread.sleep() or (not recommended) Thread.yield() in the loop.