Threaded quicksort


Hello, I've never used threads before. This is my first attempt, but it never stops. The normal (single-threaded) version works.
If I remove awaitTermination it looks like it works, but I need the method to return only when everything is sorted out (pun intended XD).
Can you tell me what I did wrong?
Thank you.
public class Sorting {
private Sorting() {};
private static Random r = new Random();
private static int cores = Runtime.getRuntime().availableProcessors();
private static ExecutorService executor = Executors.newFixedThreadPool(cores);
public static void qsortP(int[] a) {
qsortParallelo(a, 0, a.length - 1);
}
private static void qsortParallelo(int[] a, int first, int last) {
while (first < last) {
int p = first + r.nextInt(last - first + 1);
int px = a[p];
int i = first, j = last;
do {
while (a[i] < px)
i++;
while (a[j] > px)
j--;
if (i <= j) {
scambia(a, i++, j--);
}
} while (i <= j);
executor.submit(new qsortThread(a, first, j));
first = i;
}
try {
executor.awaitTermination(1, TimeUnit.DAYS);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
private static void scambia(int[] a, int x, int y) {
int temp = a[x];
a[x] = a[y];
a[y] = temp;
}
public static class qsortThread implements Runnable {
final int a[], first, last;
public qsortThread(int[] a, int first, int last) {
this.a = a;
this.first = first;
this.last = last;
}
public void run() {
qsortParallelo(a, first, last);
}
}
}

Instead of waiting for termination of the entire executor service (which probably isn't what you want at all), you should save all the Futures returned by executor.submit() and wait until they're all done (by calling get() on them, for example).
And though it's tempting to do this in the qsortParallelo() method, that would actually lead to a deadlock by exhaustion of the thread pool: parent tasks would hog the worker threads waiting for their child tasks to complete, but the child tasks would never be scheduled to run because there would be no available worker threads.
So you have to collect all the Future objects into a concurrent collection first, return the result to qsortP() and wait there for the Futures to finish.
Or use a ForkJoinPool, which was designed for exactly this kind of task and does all the donkey work for you. Recursively submitting tasks to an executor from application code is generally not a very good idea: it's very easy to get it wrong.
As an aside, the reason your current code deadlocks is that every worker thread ends up stuck in executor.awaitTermination(), thereby preventing the termination of the executor service.
In general, the two most useful tools for designing and debugging multi-threaded applications are:
A thread dump. You can generate that with jstack, VisualVM or any other tool, but it's invaluable in deadlock situations, it gives you an accurate image of what's (not) going on with your threads.
A pen, a piece of paper and drawing a good old fashioned swimlane chart.
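The Future-collecting approach described at the start of this answer could look roughly like the sketch below. The class name FutureQuicksort and its helper methods are made up for illustration, and the pivot choice (middle element) is an assumption rather than the question's random pivot. Tasks never wait on anything; they only register the Futures of their children, and the calling thread alone does the waiting.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper class, not part of the question's code.
class FutureQuicksort {
    private final ExecutorService executor =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
    // Each task adds the Futures of its children here before it completes.
    private final Queue<Future<?>> pending = new ConcurrentLinkedQueue<>();

    static void sort(int[] a) {
        FutureQuicksort s = new FutureQuicksort();
        try {
            s.submit(a, 0, a.length - 1);
            Future<?> f;
            // Only the caller waits, so no pool worker is ever blocked.
            while ((f = s.pending.poll()) != null) {
                f.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sorting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            s.executor.shutdown();
        }
    }

    private void submit(int[] a, int first, int last) {
        if (first < last) {
            pending.add(executor.submit(() -> partition(a, first, last)));
        }
    }

    // Hoare-style partition, then one new task per half.
    private void partition(int[] a, int first, int last) {
        int px = a[first + (last - first) / 2];
        int i = first, j = last;
        while (i <= j) {
            while (a[i] < px) i++;
            while (a[j] > px) j--;
            if (i <= j) {
                int t = a[i]; a[i] = a[j]; a[j] = t;
                i++; j--;
            }
        }
        submit(a, first, j);
        submit(a, i, last);
    }
}
```

By the time get() returns for a task, that task has already queued the Futures of its children, so an empty queue means there is nothing left to wait for. One task per partition is wasteful for small ranges; a real version would probably switch to a sequential sort below some size.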

You are calling executor.awaitTermination inside a thread that was launched by your executor. That thread will not stop until awaitTermination returns, and awaitTermination will not return until the thread terminates. You need to move this code:
try {
executor.awaitTermination(1, TimeUnit.DAYS);
} catch (InterruptedException e) {
e.printStackTrace();
}
to the end of the qsortP method.

Another thing to look at is the while-loop in qsortParallelo. Only first is updated inside it (first = i); last is never modified, so the loop keeps sorting the right-hand part on the current thread while the left-hand part goes to the executor. Strictly speaking you don't need the while-loop, since the further sorting can be done in the executor, but then you'll need to start another task for the second half of the array.

Related

Why does a FIFO array queue lock not seem fair?

In §7.5.1 of The Art of Multiprocessor Programming by Herlihy et al. (2nd ed., 2020), the authors present a simple lock that uses an array queue to achieve FIFO locking. Intuitively, the nth thread has a (thread-local) index into an array, and then spins on that array element until the (n - 1)th thread unlocks the lock. Its code looks like this:
public class ALock {
ThreadLocal<Integer> mySlotIndex = new ThreadLocal<>() {
@Override protected Integer initialValue() { return 0; }
};
AtomicInteger tail;
volatile boolean[] flag;
int size;
public ALock(int capacity) {
size = capacity;
tail = new AtomicInteger(0);
flag = new boolean[capacity];
flag[0] = true;
}
public void lock() {
int slot = tail.getAndIncrement() % size;
mySlotIndex.set(slot);
while (!flag[slot]) {};
}
public void unlock() {
int slot = mySlotIndex.get();
flag[slot] = false;
flag[(slot + 1) % size] = true;
}
}
I am using a minimal test program to check that this lock is fair. In a nutshell, I create NUM_THREADS threads and map each one to an array index id. Each thread tries to acquire the same lock. Once it succeeds, it increments a global COUNT and also increments RUNS_PER_THREAD[id].
If the lock is correct, the final value of COUNT should equal the sum of the values in RUNS_PER_THREAD. If the lock is fair, the elements of RUNS_PER_THREAD should be approximately equal.
public class Main {
static long COUNT = 0;
static int NUM_THREADS = 16;
// static Lock LOCK = new ReentrantLock(true);
static ALock LOCK = new ALock(NUM_THREADS);
static long[] RUNS_PER_THREAD = new long[NUM_THREADS];
static Map<Long, Integer> THREAD_IDS = new HashMap<>();
public static void main(String[] args) {
var threads = IntStream.range(0, NUM_THREADS).mapToObj(Main::makeWorker).toArray(Thread[]::new);
for (int i = 0; i < threads.length; i++) THREAD_IDS.put(threads[i].getId(), i);
for (var thread: threads) thread.start();
try { Thread.sleep(300L); } catch (InterruptedException e) {}
for (var thread: threads) thread.interrupt();
try { Thread.sleep(100L); } catch (InterruptedException e) {}
for (int i = 0; i < NUM_THREADS; i++) System.out.printf("Thread %d:\t%12d%n", i, RUNS_PER_THREAD[i]);
System.out.println("Counted up to: \t\t\t" + COUNT);
System.out.println("Sum for all threads: \t" + Arrays.stream(RUNS_PER_THREAD).sum());
}
private static Thread makeWorker(int i) {
return new Thread(() -> {
while (true) {
if (Thread.interrupted()) return;
LOCK.lock();
try {
COUNT++;
var id = THREAD_IDS.get(Thread.currentThread().getId());
RUNS_PER_THREAD[id]++;
} finally {
LOCK.unlock();
}}});
}
}
If the test program is run with a fair ReentrantLock, the final count of runs per thread with 16 threads (on my M1 Max Mac with Java 17) is almost exactly equal. If the same test is run with ALock, the first few threads seem to acquire the lock approximately 10 times more frequently than the last few threads.
Is ALock, as presented, unfair, and if so, why? Alternatively, is my minimal test flawed, and if so, why does it seem to demonstrate the fairness of ReentrantLock?

Your test code has a non-thread-safe update in COUNT++. Switch to COUNT.incrementAndGet() and declare:
static AtomicLong COUNT = new AtomicLong();
ALock will give unfair results, especially when the number of threads exceeds the number of CPUs. The implementation relies on a high-CPU spin loop, while (!flag[slot]), and not all threads get the same opportunity to enter their spin loops, so the first few threads perform more of the lock-unlock cycles. Adding Thread.yield() should balance out thread access to the boolean array so that all threads have similar opportunities to run through their own spin loop:
while (!flag[slot]) {
Thread.yield();
}
You should see different results if you set NUM_THREADS to be the same as or less than Runtime.getRuntime().availableProcessors(). In that case Thread.yield() may not make as much of a difference as it does when NUM_THREADS > Runtime.getRuntime().availableProcessors().
Using this lock class will lead to lower throughput, because at any one time up to N-1 threads are in a high-CPU spin loop waiting for the current lock holder to call unlock(). In an ideal lock implementation, the N-1 waiters would not consume CPU.
The ALock locking strategy only works if no more threads use the lock than the capacity passed to new ALock(NUM_THREADS), because otherwise int slot = tail.getAndIncrement() % size; may result in two threads waiting on the same slot.
Note that any code relying on a spin loop or Thread.yield() to work is not an effective implementation and should not be used in production code. Both can be avoided with the classes of java.util.concurrent.*.
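For experimenting with the yield-based variant, here is a minimal sketch of the same array-queue idea. It is not the book's code: the class name ArrayQueueLock and the countUnderLock helper are made up, and it swaps the boolean[] for an AtomicIntegerArray. That swap is deliberate, because in Java a volatile array field only makes the array reference volatile, not reads and writes of the individual elements. Whether that has anything to do with the skew seen in the question is not something this sketch establishes.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical variant of the array-queue lock; supports at most `capacity` threads.
class ArrayQueueLock {
    private final ThreadLocal<Integer> mySlot = ThreadLocal.withInitial(() -> 0);
    private final AtomicInteger tail = new AtomicInteger(0);
    private final AtomicIntegerArray flag; // 1 = owner of this slot may enter
    private final int size;

    ArrayQueueLock(int capacity) {
        size = capacity;
        flag = new AtomicIntegerArray(capacity);
        flag.set(0, 1);
    }

    void lock() {
        int slot = Math.floorMod(tail.getAndIncrement(), size);
        mySlot.set(slot);
        while (flag.get(slot) == 0) {
            Thread.yield(); // give other threads a chance to reach their slot
        }
    }

    void unlock() {
        int slot = mySlot.get();
        flag.set(slot, 0);
        flag.set((slot + 1) % size, 1); // hand over to the next slot in line
    }

    // Small self-check: each worker adds perThread to a plain counter under the lock.
    static long countUnderLock(int threads, int perThread) {
        ArrayQueueLock lock = new ArrayQueueLock(threads);
        long[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }
}
```

If the lock provides mutual exclusion, countUnderLock(4, 5000) should come out as exactly 20000. This is still a spin lock, so the remark above about production code applies to it as well.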

Multithreading with a variable number of tasks

I have a class that needs to compute n tasks as quickly as possible (up to 625). Therefore, I want to utilize multithreading so that these computations are run in parallel. After some research, I found the fork/join framework but have not been able to figure out how to implement this.
For example, let there be some class Foo (which will be used as an object elsewhere) with some methods and variables:
public class Foo {
int n;
int[][] fooArray;
public Foo(int x) {
n = x;
fooArray = new int[n][];
}
public void fooFunction(int x, int y) {
//Assume (n > x >= 0).
fooArray[x] = new int[y];
}
//Implement multithreading here.
}
I read a basic tutorial on the Java documentation that uses ForkJoinPool to split a task into 2 parts and use recursion to pass them into the invokeAll method. Ideally, I want to do something similar except implement it as a subclass of Foo and split the task (in this case, running fooFunction) into n parts. How should I accomplish this?

After days of extensive trial-and-error, I finally figured out how to do this myself:
Let there be some class foo that needs many similar (if not identical) tasks to be done in parallel. Let there be some number n that represents the number of times that this task should be run, where n is more than zero and less than the maximum number of threads that you can create.
public class foo {
//do normal class stuff.
public void fooFunction(int n) throws InterruptedException {
//do normal function things.
executeThreads(n);
}
public void executeThreads(int n) throws InterruptedException {
ExecutorService exec = Executors.newFixedThreadPool(n);
List<Callable<Object>> tasks = new ArrayList<Callable<Object>>();
for(int i = 0; i < n; i++)
tasks.add(Executors.callable(new Task(i)));
exec.invokeAll(tasks);
exec.shutdown();
}
public class Task implements Runnable {
int taskNumber;
public Task(int i) {
taskNumber = i;
}
public void run() {
try {
//this gets run in a thread
System.out.println("Thread number " + taskNumber);
} catch (Exception e) {
e.printStackTrace();
}
}
}
}
This is almost certainly not the most efficient method: because the pool is sized to n, it creates a thread for EVERY task that needs to be done. In other words, it does NOT give you the usual benefit of a thread pool, where a few threads are reused for many tasks. Make sure that you do not create too many threads and that the tasks are large enough to justify running them in parallel. If there are better alternatives, please post an answer.
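One possible alternative, sketched under the assumption that the n tasks are independent of each other: size the pool to the number of processors rather than to n, and let invokeAll queue the tasks. The class name BoundedTasks and the per-task work (allocating a row of length x + 1) are placeholders standing in for fooFunction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; the task body is a stand-in for real work.
class BoundedTasks {
    static int[][] runAll(int n) {
        int[][] result = new int[n][];
        // A few reusable threads, no matter how large n is.
        ExecutorService exec =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                final int x = i;
                tasks.add(() -> {
                    result[x] = new int[x + 1]; // each task writes only its own row
                    return null;
                });
            }
            // invokeAll blocks until every task has finished.
            for (Future<Void> f : exec.invokeAll(tasks)) {
                f.get(); // surfaces any exception thrown inside a task
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            exec.shutdown();
        }
        return result;
    }
}
```

Calling BoundedTasks.runAll(625) runs all 625 tasks on a handful of threads. Whether this is faster than a plain loop depends on how expensive each task really is.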

Concurrent checking if collection is empty

I have this piece of code:
private ConcurrentLinkedQueue<Interval> intervals = new ConcurrentLinkedQueue<>();
@Override
public void run(){
while(!intervals.isEmpty()){
//remove one interval
//do calculations
//add some intervals
}
}
This code is executed by a fixed number of threads at the same time. As you can see, the loop should go on until there are no more intervals left in the collection, but there is a problem. At the beginning of each iteration an interval gets removed from the collection, and at the end some number of intervals might get added back into the same collection.
The problem is that while one thread is inside the loop, the collection might become empty, so other threads trying to enter the loop won't be able to and will finish their work prematurely, even though the collection might be refilled once the first thread finishes its iteration. I want the thread count to remain constant (or not more than some number n) until all work is really finished.
That means that no threads are currently working in the loop and there are no elements left in the collection. What are possible ways of accomplishing that? Any ideas are welcome.
One way to solve this problem in my specific case is to give every thread a different piece of the original collection. But once a thread finished its piece, it would no longer be used by the program, even though it could help other threads with their calculations. I don't like this solution, because it's important to utilize all cores of the machine in my problem.
This is the simplest minimal working example I could come up with. It might be too lengthy.
public class Test{
private ConcurrentLinkedQueue<Interval> intervals = new ConcurrentLinkedQueue<>();
private int threadNumber;
private Thread[] threads;
private double result;
public Test(int threadNumber){
intervals.add(new Interval(0, 1));
this.threadNumber = threadNumber;
threads = new Thread[threadNumber];
}
public double find(){
for(int i = 0; i < threadNumber; i++){
threads[i] = new Thread(new Finder());
threads[i].start();
}
try{
for(int i = 0; i < threadNumber; i++){
threads[i].join();
}
}
catch(InterruptedException e){
System.err.println(e);
}
return result;
}
private class Finder implements Runnable{
@Override
public void run(){
while(!intervals.isEmpty()){
Interval interval = intervals.poll();
if(interval.high - interval.low > 1e-6){
double middle = (interval.high + interval.low) / 2;
boolean something = true;
if(something){
intervals.add(new Interval(interval.low + 0.1, middle - 0.1));
intervals.add(new Interval(middle + 0.1, interval.high - 0.1));
}
else{
intervals.add(new Interval(interval.low + 0.1, interval.high - 0.1));
}
}
}
}
}
private class Interval{
double low;
double high;
public Interval(double low, double high){
this.low = low;
this.high = high;
}
}
}
What you might need to know about the program: after every iteration an interval should either disappear (because it's too small), become smaller, or split into two smaller intervals. Work is finished when no intervals are left. Also, I should be able to limit the number of threads doing this work to some number n. The actual program looks for the maximum value of some function by dividing the intervals and throwing away the parts that can't contain the maximum according to some rules, but this shouldn't really be relevant to my problem.

The CompletableFuture class is also an interesting solution for this kind of task.
It automatically distributes workload over a number of worker threads.
static CompletableFuture<Integer> fibonacci(int n) {
if(n < 2) return CompletableFuture.completedFuture(n);
else {
return CompletableFuture.supplyAsync(() -> {
System.out.println(Thread.currentThread());
CompletableFuture<Integer> f1 = fibonacci(n - 1);
CompletableFuture<Integer> f2 = fibonacci(n - 2);
return f1.thenCombineAsync(f2, (a, b) -> a + b);
}).thenComposeAsync(f -> f);
}
}
public static void main(String[] args) throws Exception {
int fib = fibonacci(10).get();
System.out.println(fib);
}

You can use an atomic flag, e.g.:
private ConcurrentLinkedQueue<Interval> intervals = new ConcurrentLinkedQueue<>();
private AtomicBoolean inUse = new AtomicBoolean();
@Override
public void run() {
while (!intervals.isEmpty() && inUse.compareAndSet(false, true)) {
// work
inUse.set(false);
}
}
UPD
The question has been updated, so here is a better solution. It is a more "classic" solution using a blocking queue:
private BlockingQueue<Interval> intervals = new LinkedBlockingQueue<>(); // unbounded; an ArrayBlockingQueue would need a capacity argument
private volatile boolean finished = false;
@Override
public void run() {
try {
while (!finished) {
Interval next = intervals.take();
// put work there
// after you decide work is finished just set finished = true
intervals.put(next); // anyway, return interval to queue
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}
UPD2
Now it seems better to rewrite the solution and divide the range into sub-ranges, one for each thread.

Your problem looks like a recursive one - processing one task (interval) might produce some sub-tasks (sub intervals).
For that purpose I would use ForkJoinPool and RecursiveAction:
class Interval {
...
}
class IntervalAction extends RecursiveAction {
private Interval interval;
private IntervalAction(Interval interval) {
this.interval = interval;
}
@Override
protected void compute() {
if (...) {
// we need two sub-tasks
IntervalAction sub1 = new IntervalAction(new Interval(...));
IntervalAction sub2 = new IntervalAction(new Interval(...));
sub1.fork();
sub2.fork();
sub1.join();
sub2.join();
} else if (...) {
// we need just one sub-task
IntervalAction sub3 = new IntervalAction(new Interval(...));
sub3.fork();
sub3.join();
} else {
// current task doesn't need any sub-tasks, just return
}
}
}
public static void compute(Interval initial) {
ForkJoinPool pool = new ForkJoinPool();
pool.invoke(new IntervalAction(initial));
// invoke will return when all the processing is completed
}
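To make the skeleton above concrete without guessing at the real pruning rules, here is a small runnable sketch that uses a purely hypothetical rule: bisect an interval until it is no wider than minWidth, and count the intervals that are left. The class name BisectAction, the counter and the countLeaves helper are all assumptions for illustration. The parallelism argument of the ForkJoinPool constructor is what limits the number of worker threads.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical splitting rule: bisect until the interval is at most minWidth wide.
class BisectAction extends RecursiveAction {
    private final double low, high, minWidth;
    private final AtomicInteger leaves; // counts intervals that were not split further

    BisectAction(double low, double high, double minWidth, AtomicInteger leaves) {
        this.low = low;
        this.high = high;
        this.minWidth = minWidth;
        this.leaves = leaves;
    }

    @Override
    protected void compute() {
        if (high - low <= minWidth) {
            leaves.incrementAndGet(); // too small: no sub-tasks
            return;
        }
        double middle = (low + high) / 2;
        // invokeAll forks the sub-tasks and returns once both have completed.
        invokeAll(new BisectAction(low, middle, minWidth, leaves),
                  new BisectAction(middle, high, minWidth, leaves));
    }

    static int countLeaves(double low, double high, double minWidth, int parallelism) {
        AtomicInteger leaves = new AtomicInteger();
        ForkJoinPool pool = new ForkJoinPool(parallelism); // at most `parallelism` workers
        try {
            pool.invoke(new BisectAction(low, high, minWidth, leaves));
        } finally {
            pool.shutdown();
        }
        return leaves.get();
    }
}
```

For example, BisectAction.countLeaves(0.0, 1.0, 0.125, 4) splits [0, 1] three times and ends with 8 leaf intervals. pool.invoke returns only after the whole tree of sub-tasks has finished, which is exactly the termination condition the question asks for.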

I had the same problem, and I tested the following solution.
In my test example I have a queue (the equivalent of your intervals) filled with integers. For the test, at each iteration one number is taken from the queue, incremented and placed back in the queue if the new value is below 7 (arbitrary). This has the same impact as your interval generation on the mechanism.
Here is a working example (note that I develop in Java 1.8 and use the Executor framework to handle my thread pool):
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
public class Test {
final int numberOfThreads;
final BlockingQueue<Integer> queue;
final BlockingQueue<Integer> availableThreadsTokens;
final BlockingQueue<Integer> sleepingThreadsTokens;
final ThreadPoolExecutor executor;
public static void main(String[] args) {
final Test test = new Test(2); // arbitrary number of thread => 2
test.launch();
}
private Test(int numberOfThreads){
this.numberOfThreads = numberOfThreads;
this.queue = new PriorityBlockingQueue<Integer>();
this.availableThreadsTokens = new LinkedBlockingQueue<Integer>(numberOfThreads);
this.sleepingThreadsTokens = new LinkedBlockingQueue<Integer>(numberOfThreads);
this.executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(numberOfThreads);
}
public void launch() {
// put some elements in queue at the beginning
queue.add(1);
queue.add(2);
queue.add(3);
for(int i = 0; i < numberOfThreads; i++){
availableThreadsTokens.add(1);
}
System.out.println("Start");
boolean algorithmIsFinished = false;
while(!algorithmIsFinished){
if(sleepingThreadsTokens.size() != numberOfThreads){
try {
availableThreadsTokens.take();
} catch (final InterruptedException e) {
e.printStackTrace();
// some treatment should be put there in case of failure
break;
}
if(!queue.isEmpty()){ // Continuation condition
sleepingThreadsTokens.drainTo(availableThreadsTokens);
executor.submit(new Loop(queue.poll(), queue, availableThreadsTokens));
}
else{
sleepingThreadsTokens.add(1);
}
}
else{
algorithmIsFinished = true;
}
}
executor.shutdown();
System.out.println("Finished");
}
public static class Loop implements Runnable{
int element;
final BlockingQueue<Integer> queue;
final BlockingQueue<Integer> availableThreadsTokens;
public Loop(Integer element, BlockingQueue<Integer> queue, BlockingQueue<Integer> availableThreadsTokens){
this.element = element;
this.queue = queue;
this.availableThreadsTokens = availableThreadsTokens;
}
@Override
public void run(){
System.out.println("taking element "+element);
for(Long l = (long) 0; l < 500000000L; l++){
}
for(Long l = (long) 0; l < 500000000L; l++){
}
for(Long l = (long) 0; l < 500000000L; l++){
}
if(element < 7){
this.queue.add(element+1);
System.out.println("Inserted element"+(element + 1));
}
else{
System.out.println("no insertion");
}
this.availableThreadsTokens.offer(1);
}
}
}
I ran this code as a check, and it seems to work properly. However, there are certainly some improvements that can be made:
sleepingThreadsTokens does not have to be a BlockingQueue, since only main accesses it. I used this interface because it allowed a nice sleepingThreadsTokens.drainTo(availableThreadsTokens);
I'm not sure whether queue has to be blocking or not, since only main takes from it and it does not wait for elements (it waits only for tokens).
...
The idea is that the main thread checks for termination, and for this it has to know how many threads are currently working (so that it does not prematurely stop the algorithm because the queue is empty). To do so, two specific queues are created: availableThreadsTokens and sleepingThreadsTokens. Each element in availableThreadsTokens symbolizes a thread that has finished an iteration and is waiting to be given another one. Each element in sleepingThreadsTokens symbolizes a thread that was available to take a new iteration, but the queue was empty, so it had no job and went to "sleep". So at each moment availableThreadsTokens.size() + sleepingThreadsTokens.size() = numberOfThreads - threadsExecutingIteration.
Note that the elements of availableThreadsTokens and sleepingThreadsTokens only symbolize thread activity; they are not threads, nor do they designate a specific thread.
Case of termination: let's suppose we have N threads (an arbitrary, fixed number). The N threads are waiting for work (N tokens in availableThreadsTokens), there is only 1 remaining element in the queue, and the treatment of this element won't generate any other element. Main takes the first token, finds that the queue is not empty, polls the element and sends the thread to work. The next N-1 tokens are consumed one by one, and since the queue is empty the tokens are moved into sleepingThreadsTokens one by one. Main knows that there is 1 thread working in the loop, since there is no token in availableThreadsTokens and only N-1 in sleepingThreadsTokens, so it waits (.take()). When the thread finishes and releases the token, Main consumes it, discovers that the queue is now empty and puts the last token in sleepingThreadsTokens. Since all tokens are now in sleepingThreadsTokens, Main knows that 1) all threads are inactive and 2) the queue is empty (otherwise the last token wouldn't have been transferred to sleepingThreadsTokens, since the thread would have taken the job).
Note that if the working thread finishes the treatment before all the availableThreadsTokens are moved to sleepingThreadsTokens, it makes no difference.
Now if we suppose that the treatment of the last element had generated M new elements in the queue, then Main would have put all the tokens from sleepingThreadsTokens back into availableThreadsTokens and started to assign them treatments again. We put all the tokens back even if M < N because we don't know how many elements will be inserted in the future, so we have to keep all the threads available.

I would suggest a master/worker approach then.
The master process goes through the intervals and assigns the calculations of that interval to a different process. It also removes/adds as necessary. This way, all the cores are utilized, and only when all intervals are finished, the process is done. This is also known as dynamic work allocation.
A possible example:
public void run(){
while(!intervals.isEmpty()){
//remove one interval
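Thread t = new Thread(new Runnable() {
@Override
public void run() {
//do calculations
}
});
t.start(); // start(), not run(): calling run() would execute on the calling thread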
//add some intervals
}
}
The possible solution you provided is known as static allocation, and you're correct: it will only finish as fast as the slowest processor, whereas the dynamic approach keeps all processors busy.

I've run into this problem as well. The way I solved it was to use an AtomicInteger to track what is in the queue. Before each offer(), increment the integer. After each poll(), decrement the integer. The CLQ has no reliable isEmpty(), since it must look at the head/tail nodes and these can change atomically (CAS) at any moment.
This doesn't give a 100% guarantee, because one thread may increment right after another thread decrements, so you need to check again before ending the thread. It is still better than relying on while(...isEmpty()).
Other than that, you may need to synchronize.
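A minimal sketch of the counter idea, with one deliberate change from the description above: the counter is decremented only after an item has been fully processed (including adding any follow-up items), not right after poll(). That way the counter covers items that are queued or still in progress, and a worker may stop as soon as it reads zero. The class name PendingCounterWorkers and the toy workload (each item below limit produces two successors) are made up for illustration.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; the "work" is a toy stand-in for interval processing.
class PendingCounterWorkers {
    // Returns how many items were processed in total.
    static int run(int threadCount, int limit) {
        Queue<Integer> queue = new ConcurrentLinkedQueue<>();
        AtomicInteger pending = new AtomicInteger();   // queued + in-progress items
        AtomicInteger processed = new AtomicInteger();

        pending.incrementAndGet(); // count the item before it becomes visible
        queue.add(0);

        Runnable worker = () -> {
            while (pending.get() > 0) {
                Integer item = queue.poll();
                if (item == null) {
                    Thread.yield(); // nothing queued, but someone is still working
                    continue;
                }
                processed.incrementAndGet();
                if (item < limit) {
                    pending.addAndGet(2); // register both successors first
                    queue.add(item + 1);
                    queue.add(item + 1);
                }
                pending.decrementAndGet(); // this item is now completely done
            }
        };

        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(worker);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return processed.get();
    }
}
```

With run(4, 3) the items form a binary tree of depth 3, so 1 + 2 + 4 + 8 = 15 items get processed. Idle workers spin with Thread.yield() here; a blocking queue with a poison pill would avoid that, at the cost of more code.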

Continuing a Thread After Sleeping

I am currently writing a Java application that requires quite a lot of calls to the Twitter API. Because of this I have to worry about exceeding the rate limit. I figured out that I can make 180 calls per 14 minutes and then I have to wait a period of time before I can start calls to the API again (this number is returned in the application). So, when calls reach a certain number I have my thread sleep. My intention is to have the thread pick up where it left off automatically when sleep() is over. Does this work or do I have to worry about CPU scheduling and things like that!?
Maybe I don't fully understand how sleep is supposed to work. Any help in seeing whether or not what I am doing is right would be greatly appreciated. Thank you!
Below is just a couple of lines of pseudo code:
for (int i = 0; i < arr.length; i++)
{
if (calls are a certain number)
{
Thread.sleep(840*1000);
continue;
}
//CALL TO METHOD THAT REQUESTS INFORMATION FROM TWITTER API
}
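For reference, here is a runnable sketch of that loop. Thread.sleep() simply suspends the current thread, and execution continues with the next statement once the pause is over, so the loop does pick up where it left off. One thing to watch in the pseudocode: the continue after sleep() jumps to the next value of i, so the element at the current index would be skipped. The sketch below sleeps and then carries on with the same element. The names RateLimitedLoop, fetchAll and callApi are placeholders, and callApi stands in for the real Twitter request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example; callApi is a placeholder for the real request.
class RateLimitedLoop {
    static String callApi(String id) {
        return "result:" + id;
    }

    static List<String> fetchAll(String[] ids, int callsPerWindow, long pauseMillis) {
        List<String> results = new ArrayList<>();
        int callsInWindow = 0;
        for (String id : ids) {
            if (callsInWindow == callsPerWindow) {
                try {
                    Thread.sleep(pauseMillis); // the thread resumes right here afterwards
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return results; // stop early if someone interrupts the wait
                }
                callsInWindow = 0;
            }
            // No continue: the current element is still processed after the pause.
            results.add(callApi(id));
            callsInWindow++;
        }
        return results;
    }
}
```

With the numbers from the question this would be called as fetchAll(ids, 180, 840 * 1000L).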

Use the CyclicBarrier class.
Example from the CyclicBarrier's javadoc:
class Solver {
final int N;
final float[][] data;
final CyclicBarrier barrier;
class Worker implements Runnable {
int myRow;
Worker(int row) { myRow = row; }
public void run() {
while (!done()) {
processRow(myRow);
try {
barrier.await();
} catch (InterruptedException ex) {
return;
} catch (BrokenBarrierException ex) {
return;
}
}
}
}
public Solver(float[][] matrix) {
data = matrix;
N = matrix.length;
barrier = new CyclicBarrier(N,
new Runnable() {
public void run() {
mergeRows(...);
}
});
for (int i = 0; i < N; ++i)
new Thread(new Worker(i)).start();
waitUntilDone();
}
}
You can solve this task with only two threads and simple Locks (from java.util.concurrent.locks). CyclicBarrier just provides a more extensible solution.

IIRC, in Java you can object.wait() with a timeout. Is this not what you want? If you want to change the timeout from another thread, change some 'waitValue' variable and notify(). The thread will then 'immediately' run and then wait again with the new timeout value. No explicit sleep required.

Multi - threading

I have tried to create a parallel quicksort in Java, which I assume is a naive one (because I haven't studied the Executor interface etc. yet).
I needed a way to print the sorted array once all the threads are done, but I didn't know in advance how many threads I was going to have. So I made each call wait recursively with the join() method, so that the first join that was invoked has to wait until all the other threads are done. Right?
That way, when I execute the last two lines in main() (printing the array), I can be sure that all my threads are done.
So I have two questions:
Is it a multithreaded program that runs in parallel? Or am I making some mistake so that it actually runs linearly, thread after thread?
Was I correct with my solution for displaying the sorted array in the main method?
Here is my code:
public class Main {
public static void main(String[] args) {
ArrayList<Integer> array = new ArrayList<>();
//please assume that I have invoked the input for the array from the user
QuickSortWithThreads obj = new QuickSortWithThreads(array,0 ,array.size()-1 );
for(int i = 0; i < array.size(); i++)
System.out.println(array.get(i));
}
}
public class QuickSortWithThreads {
public QuickSortWithThreads(ArrayList <Integer> arr, int left, int right){
quicksort(arr, left, right);
}
static void quicksort(ArrayList <Integer> arr, int left, int right) {
int pivot;
if(left<right){
pivot = partition(arr, left, right);
QuickSortThread threadLeftSide = new QuickSortThread(arr, pivot + 1, right);
threadLeftSide.start();
quicksort(arr, left, pivot - 1);
try {
threadLeftSide.join();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
static int partition(ArrayList<Integer> arr, int left, int right) {
int pivot = arr.get(right);
int i = left -1;
for( int j = left; j <= right - 1; j++) {
if (arr.get(j) <= pivot){
i = i + 1;
exchange(arr, i, j);
}
}
exchange(arr, i + 1, right);
return i + 1;
}
static void exchange(ArrayList<Integer> arr, int i, int j) {
int swap = arr.get(i);
arr.set(i, arr.get(j));
arr.set(j, swap);
}
private static class QuickSortThread extends Thread {
int right;
int left;
ArrayList<Integer> refArray;
public QuickSortThread(ArrayList<Integer> array, int left, int right) {
this.right = right;
this.left = left;
refArray = new ArrayList<Integer>();
refArray = array;
}
public void run() {
quicksort(refArray, left, right);
}
}
}

If we knew the overall number of threads, we could use a CountDownLatch initialized with that number. But as we don't know the number of threads, we need an extended CountDownLatch that allows the counter to be increased after its creation. Unfortunately we cannot just extend the class CountDownLatch, as the underlying counter is private. One way is to duplicate the original code of CountDownLatch to get access to the underlying counter. A less verbose way is to extend Semaphore to get access to the reducePermits method, as is done in Reduceable Semaphore. In principle, CountDownLatch and Semaphore are similar tools but differ in their interpretation of the internal counter: the first counts vetoes and the latter counts permits.
The whole idea is to reduce the number of permits when a thread is created or started, and to release a permit when it is finished, at the end of the method run(). The initial number of permits is 1, so that if no threads are started, the main procedure finishes freely. Note that reducing the number of permits at the beginning of the method run() is too late.
To get really good working code, you also need to use a thread pool with a fixed number of threads, and sort small arrays serially.
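The permit bookkeeping described above could be sketched as follows. The class name ThreadTracker and its method names are made up; reducePermits, release and acquireUninterruptibly are the real java.util.concurrent.Semaphore methods. The spawnTree helper is only a toy workload in which every thread starts two children until the given depth is used up.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: a semaphore used as a latch whose count can grow.
class ThreadTracker extends Semaphore {
    ThreadTracker() {
        super(1); // one permit: with no threads started, awaitAll() passes at once
    }

    // Must be called BEFORE the thread is started, not inside run().
    void threadStarting() {
        reducePermits(1);
    }

    // Must be the last thing a tracked thread does.
    void threadFinished() {
        release();
    }

    void awaitAll() {
        acquireUninterruptibly(); // succeeds only when the count is back at 1
    }

    // Toy workload: returns how many threads ran in a binary tree of the given depth.
    static int spawnTree(int depth) {
        ThreadTracker tracker = new ThreadTracker();
        AtomicInteger ran = new AtomicInteger();
        startNode(tracker, ran, depth);
        tracker.awaitAll();
        return ran.get();
    }

    private static void startNode(ThreadTracker tracker, AtomicInteger ran, int depth) {
        tracker.threadStarting();
        new Thread(() -> {
            try {
                ran.incrementAndGet();
                if (depth > 0) {
                    startNode(tracker, ran, depth - 1);
                    startNode(tracker, ran, depth - 1);
                }
            } finally {
                tracker.threadFinished();
            }
        }).start();
    }
}
```

Because a parent reduces the permit count for its children before it releases its own permit, the count cannot return to 1 while any thread is still running. ThreadTracker.spawnTree(3) starts 1 + 2 + 4 + 8 = 15 threads and returns 15 once all of them have finished.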

General opinion
Yes, your code runs in parallel. And the result printing looks all right as well.
Limiting number of threads via depth
One problem is the fact that you create a huge number of threads: at the lowest level, you'll have approximately as many threads as there are list elements. And you don't catch exceptions resulting from this, so you'll not know (in your main thread) that this didn't work as intended.
You should probably limit the number of levels for which you spawn new threads. Once you have passed, say, 3 levels, you'll have about 2^3 = 8 threads, which should be enough to keep all cores busy on most reasonable machines. You can then let the rest of the computation proceed without branching off further threads. You could do that by passing an additional parameter branching to your quicksort method. Set that to 3 in the invocation from the QuickSortWithThreads constructor, and decrement it on every call. Don't branch once the count reaches 0. This will give you the following calls:
quicksort(3, …)
  quicksort(2, …)
    quicksort(1, …)
      quicksort(0, …)
      quicksort(0, …)
    quicksort(1, …)
      quicksort(0, …)
      quicksort(0, …)
  quicksort(2, …)
    quicksort(1, …)
      quicksort(0, …)
      quicksort(0, …)
    quicksort(1, …)
      quicksort(0, …)
      quicksort(0, …)
Since each non-leaf call shares a thread with one of its children, you can deduce the maximum of 8 threads from the number of leaves above.
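A minimal sketch of the branching parameter, written against a plain int[] instead of the question's ArrayList<Integer> to keep it short. The class name DepthLimitedQuicksort is made up; the partition step follows the question's scheme (last element as pivot).

```java
// Hypothetical example of limiting thread creation by recursion depth.
class DepthLimitedQuicksort {
    static void sort(int[] a) {
        quicksort(3, a, 0, a.length - 1); // 3 levels of branching: at most 8 threads
    }

    static void quicksort(int branching, int[] a, int left, int right) {
        if (left >= right) {
            return;
        }
        int p = partition(a, left, right);
        if (branching > 0) {
            // Still allowed to branch: one half on a new thread, the other half here.
            Thread other = new Thread(() -> quicksort(branching - 1, a, p + 1, right));
            other.start();
            quicksort(branching - 1, a, left, p - 1);
            try {
                other.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted before sorting finished", e);
            }
        } else {
            // Branching budget used up: plain sequential recursion.
            quicksort(0, a, left, p - 1);
            quicksort(0, a, p + 1, right);
        }
    }

    private static int partition(int[] a, int left, int right) {
        int pivot = a[right];
        int i = left - 1;
        for (int j = left; j < right; j++) {
            if (a[j] <= pivot) {
                i++;
                swap(a, i, j);
            }
        }
        swap(a, i + 1, right);
        return i + 1;
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

The value 3 is just the example figure from above; something derived from Runtime.getRuntime().availableProcessors() would be a reasonable alternative.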
Limiting number of threads via Executors
As an alternative to this home-made way of restricting the number of threads, you might of course do this using the Executor interface you mentioned. You could create a ThreadPoolExecutor to manage your threads, and pass each recursive invocation as an instance of Runnable (which might look similar to your QuickSortThread). One major problem with this approach is detecting termination. Particularly if you want to avoid deadlock in case of an error. So it might be better to use a ForkJoinTask instead, since in that case you can have each task wait on the conclusion of its other child, very similar to what you wrote, and you can still limit the number of actual threads in the associated ForkJoinPool. Your actual implementation would best use RecursiveAction, a specialization of ForkJoinTask if you have no return value, for which the documentation contains an example very similar to your scenario.
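A sketch of the RecursiveAction variant, assuming an int[] and a hypothetical cut-off of 1,000 elements below which the range is simply handed to Arrays.sort. The class name QuicksortAction and the threshold are assumptions, not taken from the documentation example mentioned above.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical fork/join quicksort over the inclusive range [left, right].
class QuicksortAction extends RecursiveAction {
    private static final int THRESHOLD = 1_000; // assumed cut-off for sequential sorting
    private final int[] a;
    private final int left, right;

    QuicksortAction(int[] a, int left, int right) {
        this.a = a;
        this.left = left;
        this.right = right;
    }

    @Override
    protected void compute() {
        if (right - left < THRESHOLD) {
            Arrays.sort(a, left, right + 1); // small (or empty) range: sort directly
            return;
        }
        int p = partition(a, left, right);
        // Fork both halves and wait for them; the pool keeps the thread count bounded.
        invokeAll(new QuicksortAction(a, left, p - 1),
                  new QuicksortAction(a, p + 1, right));
    }

    private static int partition(int[] a, int left, int right) {
        int pivot = a[right];
        int i = left - 1;
        for (int j = left; j < right; j++) {
            if (a[j] <= pivot) {
                i++;
                int t = a[i]; a[i] = a[j]; a[j] = t;
            }
        }
        int t = a[i + 1]; a[i + 1] = a[right]; a[right] = t;
        return i + 1;
    }

    static void sort(int[] a) {
        ForkJoinPool.commonPool().invoke(new QuicksortAction(a, 0, a.length - 1));
    }
}
```

invoke() returns only after the root task and all of its sub-tasks have completed, so the array can be printed right after QuicksortAction.sort(array) without any explicit join() bookkeeping.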

How your threads behave depends on your hardware. With a single-core CPU and no hyperthreading, the computer processes one thread at a time, switching between the threads in turn. If you have hyperthreading and/or multiple cores, several threads can run simultaneously. A call to examplethread.join() makes the calling thread wait until examplethread finishes its job (by returning from its run() method).
If you create a thread and call join two lines later, you will pretty much have a synchronized multithreaded task that behaves very much like a single-threaded one.
I'd suggest creating an ArrayList and adding each thread to it; once all threads are set up and working, you call
for(Thread t : mythreadlist) {
try {
t.join();
} catch (InterruptedException e) { System.err.println("Interrupted Thread"); }
}
to make your application wait for all threads to exit.
edit:
// [...]
public class QuickSortWithThreads {
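static ArrayList<QuickSortThread> threads = new ArrayList<>(); // static so the static quicksort method can reach it; note that ArrayList itself is not thread-safe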
public QuickSortWithThreads(ArrayList <Integer> arr, int left, int right){
quicksort(arr, left, right); // Pretty much make your threads start their jobs
for(Thread t : threads) { // Then wait them to leave.
try {
t.join();
} catch (InterruptedException e) { System.err.println("Interrupted Thread"); }
}
}
// [...]
static void quicksort(ArrayList <Integer> arr, int left, int right) {
int pivot;
if(left<right){
pivot = partition(arr, left, right);
QuickSortThread threadLeftSide = new QuickSortThread(arr, pivot + 1, right);
threadLeftSide.start();
threads.add(threadLeftSide);
//
quicksort(arr, left, pivot - 1);
}
}
// [...]
