How to schedule threads in java whose tasks depend on one another?

This is my first attempt at multithreading after learning about threads (in theory, heh) in my OS class at school, and I think the way I've gone about doing what I'm trying to do is bad practice/sloppy.
I'm parallelizing a minimax algorithm by spawning a separate thread for each branch of the game on which the algorithm is set to run. The scheduling part of this is a little tricky, as the algorithm deepens iteratively and I want depth consistency across all the threads.
So, first I have a master thread which spawns a subthread for each available move in the game:
public void run(){
// initializes all the threads
for (AlphaBetaMultiThread t : threadList) {
t.start();
}
try { //This won't ever happen; the subthreads run forever
evals = 0;
for (AlphaBetaMultiThread t : threadList) {
t.join();
evals += t.evals;
}
} catch (Exception e) {
System.out.println("Error joining threads: " + e);
}
}
The master thread passes itself to the constructor of each subthread so that the subthreads can access the master thread's maxDepth property and signalDepth method:
public synchronized void signalDepth(){
signals++;
if (signals % threadList.length() == 0){
if (verbose)
System.out.println(toString());
depth++;
}
}
And finally, here's the subthread evaluation process. Whenever it's ahead of the rest of the threads, it lowers its own priority, and then yields until all the subthreads have signalled.
public void run() {
startTime = System.currentTimeMillis();
while(true){
if (depth >= master.maxDepth) {
this.setPriority(4);
this.yield();
break;
} else {
this.setPriority(5);
}
eval = -1*alphabeta(0, infHolder.MIN, infHolder.MAX);
manager.signalDepth();
depth += 1;
}
}
Besides the fact that my implementation seems not to work at all right now (still trying to figure out why), I really feel as if what I'm doing isn't the standard way of doing things. My intuition is that there are probably all kinds of built-in multithreading libraries that could make my life a lot easier, but I don't really know what I'm looking for.
Oh, I'm also getting a warning that Thread.destroy() is deprecated (which is how I'm planning on destroying everything after the computer player finally plays its move).
I guess my question is this: what should I be using to manage my subthreads?
edit: Oh, and if there's stuff I've left out that is relevant to my question, feel free to look at my complete code on github: https://github.com/cowpig/MagneticCave
The relevant files are GameThread and AlphaBetaMultiThread.
I apologize for my cluelessness!
Another edit: I want the threads to deepen iteratively forever, until the gamePlayer object (the one creating the master thread) decides it is time to choose a move, at which point it will access the list of moves and find the one with the highest evaluation. This means .join() won't work unless I create a new set of threads for every depth iteration, but that would require a lot more overhead (I think), so I don't really want to do that.

Your intuition is correct. Java 5 introduced a slew of useful concurrency constructs. One in particular you might want to research is CyclicBarrier. It is a synchronization aid that allows multiple threads to wait for each other to reach a common barrier point (in your case, that would be the master depth).
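As a rough sketch of that idea (not the asker's actual code): each worker calls await() after finishing a depth, and the barrier action, which runs once all workers have arrived, plays the role of signalDepth(). The class name BarrierSketch, the method runInLockStep and the empty "search" step are assumptions made for illustration.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

public class BarrierSketch {
    // Runs `workers` threads through `maxDepth` iterations in lock step and
    // returns how many times the barrier action (the "depth finished" hook) ran.
    static int runInLockStep(int workers, int maxDepth) throws InterruptedException {
        AtomicInteger completedDepths = new AtomicInteger();
        // The barrier action runs once per depth, after every worker has arrived.
        CyclicBarrier barrier = new CyclicBarrier(workers, completedDepths::incrementAndGet);
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    for (int depth = 0; depth < maxDepth; depth++) {
                        // Placeholder: search this worker's branch at the current depth.
                        barrier.await(); // block until all workers finish this depth
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // asked to stop
                } catch (BrokenBarrierException e) {
                    // Another worker was interrupted or failed; stop as well.
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return completedDepths.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Depths completed: " + runInLockStep(4, 3));
    }
}
```

No priorities or yield() calls are needed here: a worker that is ahead of the others simply blocks in await() until they catch up.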

You should be using Thread.join() to wait for the sub-threads, which should just exit when done. No need for Thread.destroy() at all, and no need for fiddling with priorities either.

Instead of using collections of raw Thread instances for this, you should be submitting Runnable instances to an ExecutorService. If all your threads are computing results of some kind, you probably want to use Callable<?> and ExecutorCompletionService.
If you have a dependency graph of Runnables, I'd recommend using ListenableFuture from the excellent Guava library. See the ListenableFuture explained wiki article for details.
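A minimal sketch of the Callable plus ExecutorCompletionService suggestion, assuming each move can be scored independently. The evaluate method is a hypothetical stand-in for the real alphabeta call, and the pool size of 4 is arbitrary.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CompletionSketch {
    // Hypothetical stand-in for the real evaluation of one move.
    static int evaluate(int move) {
        return -move * move + 10 * move;
    }

    // Submits one task per move and returns the best score, consuming
    // results in the order in which the tasks complete.
    static int bestScore(List<Integer> moves) throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            ExecutorCompletionService<Integer> ecs = new ExecutorCompletionService<>(pool);
            for (int move : moves) {
                ecs.submit(() -> evaluate(move));
            }
            int best = Integer.MIN_VALUE;
            for (int i = 0; i < moves.size(); i++) {
                best = Math.max(best, ecs.take().get()); // take() blocks until a task finishes
            }
            return best;
        } finally {
            pool.shutdown(); // worker threads exit once their tasks are done
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(bestScore(Arrays.asList(1, 5, 6)));
    }
}
```

The pool owns the threads, so nothing has to be destroyed by hand; shutdown() lets them exit on their own.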

If dataflow graph representation is suitable for your task, you can make use of my dataflow library df4j.

Related

Java Thread Interruption in a general sense

One of the most suggested ways to pause a thread is to extend the Runnable interface by adding a pause() method:
interface RunnablePausable extends Runnable {
public void pause();
}
This never made sense to me since you don't actually want to pause the runnable but the Thread that runs it, in the same way you start/interrupt a Thread, not a Runnable.
A more elegant approach: since the interrupt() functionality is well built-in and supported by multiple methods, what if we interrupt() a Thread not just to terminate it, but to make a general request instead (like you would interrupt a CPU, in a way)? And then let the Runnable handle this specific request.
As an example: interrupt() the thread and, instead of straight up terminating it, handle its request to pause, stop, resume, or do anything else you like.
Not sure if this makes sense.
Something like this:
public void run() {
try {
//...
} catch (InterruptedException ie) { //interrupted
if (i_wanted_to_pause) { //manage request
//wait
}
if (i_wanted_to_stop) {
//return
}
if (any_other_request) {
//handle it
}
}
}
And:
public void run() {
if (Thread.currentThread().isInterrupted()) { //interrupted
if (request_to_pause) { //manage request
//wait
}
if (request_to_stop) {
//return
}
if (any_request) {
//handle it
}
}
}
Now the problem is: how to make a specific request to the interrupted thread?
How can I communicate my request to the interrupted thread, i.e. whether it was meant to stop, pause, or do anything else?
Ideas:
Subclass InterruptedException into InterruptedExceptionStop and InterruptedExceptionPause (no idea how I can throw them)
Create a separate object containing the request. Don't know what would be the best way to achieve this without over-complicating things
Other?

Yes, as @markspace said in the comments, there is no practical reason to ask a thread to pause, resume, etc. at an arbitrary moment.
The thread is just the calculation in general meaning. You know, there is the following popular pattern for CPU intensive executions:
ExecutorService es = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
This means a thread ~= a CPU core. See a thread as a CPU core here. I believe this is an idiomatic view: a long-living sequential calculation, a conveyor. Do you think there should be a way for a user to pause a CPU core? To stop the conveyor with a button stuck to each box on it? I don't think so. So, if you want to keep the CPU from doing the calculation, just don't ask it to do the calculation. A classical example is a job/task queue backed by a thread and a BlockingQueue. You split your calculation into several jobs and submit them to the queue. If you don't submit new ones (and optionally clear the queue), your thread is 'paused' naturally on take(). The same goes for IO, unless you are OK with burning CPU on completely non-blocking solutions. With your code, you also have to take care that the 3rd party things/objects you use in your run() are not accidentally broken by the interruption, since "interruption == termination" is the commonplace semantics.
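The job queue idea could look roughly like this. It is only a sketch: the STOP sentinel (a "poison pill") and the names JobQueueSketch and runJobs are assumptions of this example, not part of any library.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class JobQueueSketch {
    // Sentinel job that tells the worker to exit (an assumption of this sketch).
    private static final Runnable STOP = () -> { };

    // Feeds `jobCount` small jobs to one worker thread and returns how many ran.
    static int runJobs(int jobCount) throws InterruptedException {
        BlockingQueue<Runnable> jobs = new LinkedBlockingQueue<>();
        AtomicInteger done = new AtomicInteger();
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    Runnable job = jobs.take(); // idles here while the queue is empty
                    if (job == STOP) {
                        return;
                    }
                    job.run();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // interruption still means "terminate"
            }
        });
        worker.start();
        for (int i = 0; i < jobCount; i++) {
            jobs.put(done::incrementAndGet); // each job is one small unit of work
        }
        jobs.put(STOP);
        worker.join();
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runJobs(5));
    }
}
```

"Pausing" the worker is then just a matter of not putting new jobs on the queue for a while.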
Another possible argument against the approach is mostly an architectural one. Runnable, Callable are examples of the IoC (Inversion of Control) pattern. But we introduce a control/execution management method into them, and this smells IMO.
If you had explained what was the specific problem you were trying to address, we would suggest a more suitable, more idiomatic than RunnablePausable approach.
Now, why do I like the question. It's inspiring to me when developers think about their things so deeply. It's nice when they invent something, even if these are their own homebrewed Continuations and Schedulers:) It may be an instructive game/experiment.

Java: two threads executing until the boolean flag is false: the second thread's first run stops the first thread

I have this threaded program that has three threads: main, lessonThread, questionThread.
It works like this:
Lesson continues keeps getting printed while the finished variable is false;
every 5 seconds the questionThread asks Finish the lesson? and if the answer is y, it sets finished to true.
The problem is that Lesson continues never gets printed after the question is asked the first time.
Also, sometimes lessonThread sneaks in with its Lesson continues before the user can enter the answer to the questionThread's question.
public class Lesson {
private boolean finished;
private boolean waitingForAnswer;
private Scanner scanner = new Scanner(System.in);
private Thread lessonThread;
private Thread questionThread;
public static void main(String[] args) {
Lesson lesson = new Lesson();
lesson.lessonThread = lesson.new LessonThread();
lesson.questionThread = lesson.new QuestionThread();
lesson.lessonThread.start();
lesson.questionThread.start();
}
class LessonThread extends Thread {
@Override
public void run() {
while (!finished && !waitingForAnswer) {
System.out.println("Lesson continues");
}
}
}
class QuestionThread extends Thread {
private Instant sleepStart = Instant.now();
private boolean isAsleep = true;
@Override
public void run() {
while (!finished) {
if (isAsleep && Instant.now().getEpochSecond() - sleepStart.getEpochSecond() >= 5) {
System.out.print("Finish a lesson? y/n");
waitingForAnswer = true;
String reply = scanner.nextLine().substring(0, 1);
switch (reply.toLowerCase()) {
case "y":
finished = true;
}
waitingForAnswer = false;
isAsleep = true;
sleepStart = Instant.now();
}
}
}
}
}
I think the waitingForAnswer = true might be at fault here, but then, the lessonThread has 5 seconds until the questionThread asks the question again, during which the waitingForAnswer is false.
Any help is greatly appreciated.
EDIT: I found a bug in the loop in the lessonThread and changed it to:
@Override
public void run() {
while (!finished) {
if (!waitingForAnswer) {
System.out.println("Lesson continues");
}
}
}
However, I get the same result.
EDIT: I can get it working when running inside a debugger.

This just isn't how you're supposed to work with threads. You have 2 major problems here:
The Java Memory Model.
Imagine that one thread writes to some variable, and a fraction of a second later, another thread reads it. If that would be guaranteed to work the way you want it to, that means that write has to propagate all the way through any place that could ever see it before code can continue.. and because you have absolutely no idea which fields are read by some thread until a thread actually reads it (java is not in the business of attempting to look ahead and predict what the code will be doing later), that means every single last write to any variable needs a full propagate sync across all threads that can see it... which is all of them! Modern CPUs have multiple cores and each core has their own cache, and if we apply that rule (all changes must be visible immediately everywhere) you might as well take all that cache and chuck it in the garbage because you wouldn't be able to use it.
If it worked like that - java would be slower than molasses.
So java does not work like that. Any thread is free to make a copy of any field or not, at its discretion. If thread A writes 'true' to some instance's variable, and thread B reads that boolean from the exact same instance many seconds later, java is entirely free to act as if the value is 'false'... even if when code in thread A looks at it, it sees 'true'. At some arbitrary later point the values will sync up. It may take a long time, no guarantees are available to you.
So how do you work with threads in java?
The JMM (Java Memory Model) works by describing so-called comes-before/comes-after relationships (the language specification calls this "happens-before"): only if code is written to clearly indicate that some event in thread A comes before some other event in thread B will java guarantee that any effects performed in thread A and visible there are also visible in thread B once B's event (the one that 'came after') has finished.
For example, if thread A does:
synchronized (someRef) {
someRef.intVal1 = 1;
someRef.intVal2 = 2;
}
and thread B does:
synchronized(someRef) {
System.out.println(someRef.intVal1 + someRef.intVal2);
}
then you are guaranteed to witness in B either 0 (which will be the case where B 'won' the fight and got to the synchronized statement first), or 3, which is always printed if B got there last; that synchronized block is establishing a CBCA relationship: The 'winning' thread's closing } 'comes before' the losing thread's opening one, as far as execution is concerned, therefore any writes done by thread A will be visible to thread B by the time it enters its synchronized block.
Your code does not establish any such relationships, therefore, you have no guarantees.
You establish them with writes/reads from volatile fields, with synchronized(), and with any code that itself uses these, which is a lot of code: Most classes in the java.util.concurrent package, starting threads, and many other things do some sync/volatile access internally.
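To illustrate the volatile case: a small sketch, assuming a single boolean stop flag shared by two threads. The class and method names are made up for this example, and the timings are arbitrary.

```java
public class VolatileFlagSketch {
    // volatile: a write by one thread is guaranteed to be seen by
    // subsequent reads of the field in other threads.
    private static volatile boolean finished = false;

    // Starts a worker that loops until the flag is set, sets the flag from
    // the calling thread, and reports whether the worker noticed and exited.
    static boolean workerSeesFlag() throws InterruptedException {
        finished = false;
        Thread worker = new Thread(() -> {
            while (!finished) {
                try {
                    Thread.sleep(10); // sleep between checks instead of spinning
                } catch (InterruptedException e) {
                    return;
                }
            }
        });
        worker.start();
        Thread.sleep(50);
        finished = true;   // volatile write, visible to the worker's next read
        worker.join(2000); // bounded wait for the worker to exit
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Worker stopped: " + workerSeesFlag());
    }
}
```

Without the volatile modifier, nothing in this code would oblige the worker to ever observe the write.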
The flying laptop issue.
It's not the 1980s anymore. Your CPU is capable of doing enough calculations at any given moment to draw enough power to heat a small house comfortably. The reason your laptop or desktop or phone isn't a burning ball of lava is because the CPU is almost always doing entirely nothing whatsoever, and thus not drawing any current and heating up. In fact, once a CPU gets going, it will very very quickly overheat itself and throttle down and run slower. That's because 95%+ of common PC workloads involve a 'burst' of calculations to be done, which the CPU can do in a fraction of a second at full turboboosted power, and then it can go back to idling again whilst the fans and the cooling paste and the heat fins dissipate the heat that this burst of power caused. That's why if you try to do something that causes the CPU to be engaged for a long time, such as encoding video, it seems to go a little faster at first before it slows down to a stable level.. whilst your battery is almost visibly draining down and your fans sound like the laptop is about to take off for higher orbit and follow Doug and Bob to the ISS - because that stable level is 'as fast as the fans and heat sinks can draw the heat away from the CPU so that it doesn't explode'. Which is not as fast as when it was still colder, but still pretty fast. Especially if you have powerful fans.
The upshot of all this?
You must idle that CPU.
something like:
while (true) {}
is a so-called 'busy loop': It does nothing, looping forever, whilst keeping the CPU occupied, burning a hole into the laptop and causing the fans to go ape. This is not a good thing. If you want execution to wait for some event before continuing, then wait for it. Keyword: wait. If you just want to wait for 5 seconds, Thread.sleep(5000) is what you want. Not a busy-loop. If you want to wait until some other thread has performed a job, use the core wait/notifyAll system (these are methods on j.l.Object and interact with the synchronized keyword), or better yet, use a latch or a lock object from java.util.concurrent, those classes are fantastic. If you just want to ensure that 2 threads don't conflict while they touch the same data, use synchronized. All these features will let the CPU idle down. endlessly spinning away in a while loop, checking an if clause - that is a bad idea.
And you get CBCA relationships to boot, which is what is required for any 2 threads to communicate with each other.
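For the "wait until another thread has done its job" case, a latch from java.util.concurrent might be used as follows. This is a sketch under the assumption that one thread produces a single result; the names and the 5-second timeout are illustrative.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchSketch {
    // Blocks (without spinning) until the producer thread has published its
    // result, then returns it.
    static int waitForResult() throws InterruptedException {
        CountDownLatch ready = new CountDownLatch(1);
        int[] result = new int[1];
        Thread producer = new Thread(() -> {
            result[0] = 42;    // written before countDown() ...
            ready.countDown(); // ... so it is visible after await() returns
        });
        producer.start();
        // The waiting thread is parked here; the CPU is free to idle.
        if (!ready.await(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("producer did not finish in time");
        }
        return result[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(waitForResult());
    }
}
```

The latch both idles the waiting thread and establishes the ordering needed for the plain array write to be visible.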
And because you're overloading the CPU with work, the sync point where your '= false' writes get synced back over to the other thread probably isn't happening. Normally it's relatively hard to observe JMM issues (which is what makes multithreaded programming so tricky: it is complex, you will mess up, it's hard to test for errors, and it's plausible you'll never personally run into this problem today. But tomorrow, with another song on the winamp, on another system, it happens all the time). This is a fine way to observe it a lot.

I managed to make it work with making waitingForAnswer volatile:
private volatile boolean waitingForAnswer;

Java Threads Not Running Asynchronously - Some not completing at all

I am attempting to complete an experiment for a university project. I want to run it in parallel, utilising multiple cores, so that I can increase the sample size. To achieve this I am creating multiple (up to 7, but I have tried using as few as 2) java threads and executing my class in all of those threads at once. My PC has 8 cores.
The problem I am having is that Java seems to be haphazard in how those threads are executed. All 7 threads start fine. They run asynchronously for a while. In a typical run maybe 3 of them will finish in the expected time, a fourth might finish a few minutes later and the final 3 don't finish at all.
The experimental class is designed to run for a certain amount of wall-clock time (not cpu-clock time). This factor is outside of my control. So I need my threads to be running simultaneously on separate cores at all times.
The following code snippet exhibits the method I am using to create the threads and kick them off. It is obviously not calling the class I am using for my experiment and if you copy it and run it yourself, you will see that it works fine. I have provided it here simply to demonstrate that I am creating and using the threads correctly. I have been searching for an answer to this for days and can't see that I am doing anything wrong.
This is a test class demonstrating the means used by my experimental class. It just concatenates some string data to ensure a long enough running process.
public class ThreadTestClass implements Runnable {
@Override
public void run() {
Thread.currentThread().setPriority(Thread.MAX_PRIORITY);
System.out.println("This thread is underway");
int i=0;
String a="a";
while(i<25){
a=a+a;
i++;
}
System.out.println("This thread ran fine");
}
}
This is how it is called:
private static void ThreatTestMethod(){
Thread[] threads = new Thread[7];
int i=0;
while(i<threads.length){
threads[i] = new Thread(new ThreadTestClass());
threads[i].start();
i++;
}
while(threads[0].isAlive() || threads[1].isAlive() || threads[2].isAlive() || threads[3].isAlive() || threads[4].isAlive() || threads[5].isAlive() || threads[6].isAlive()){
try {
Thread.sleep(5000);
}
catch (InterruptedException e) {
System.out.println("Interrupted Exception Occurred");
}
}
}
My understanding is that Java should be automatically utilising all of the cores, and that when I execute my threads they should utilise all available cores. That is normally what happens. It is not happening when I run my experiment. Is there anything I can do to force the threads to run simultaneously on separate cores?

Your example code is fine.
If some of your threads aren't returning, either they're blocking (e.g. on a read(), write(), wait()), or they're stuck in a loop; just like any other program that doesn't return. Attach a debugger, or just get a stack trace dump, to find out what they're doing.
The Java API gives you no means to specify how threads are allocated to cores. It's implementation dependent, depending both on the Java implementation and on the operating system.
However in practice, you'll find that as long as you have a reasonably up-to-date Java, threads will be spread between cores.
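One way to get such a stack trace from inside the program, as a sketch: wait for a worker with a bounded join() and, if it is still alive, print where it currently is. The thread name "stuck-worker" and the 60-second sleep are placeholders for a real hung task. From outside the process, the JDK's jstack tool gives similar information.

```java
public class StackDumpSketch {
    // Formats the current state and stack frames of one thread.
    static String describe(Thread t) {
        StringBuilder sb = new StringBuilder();
        sb.append(t.getName()).append(" [").append(t.getState()).append("]\n");
        for (StackTraceElement frame : t.getStackTrace()) {
            sb.append("    at ").append(frame).append('\n');
        }
        return sb.toString();
    }

    // Starts a deliberately stuck worker, waits briefly, then reports on it.
    static String demo() throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000); // placeholder for a task that hangs
            } catch (InterruptedException e) {
                // asked to stop
            }
        }, "stuck-worker");
        worker.setDaemon(true);
        worker.start();
        worker.join(200); // bounded wait instead of polling isAlive()
        String report = worker.isAlive() ? describe(worker) : "finished";
        worker.interrupt(); // clean up the demo thread
        return report;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.print(demo());
    }
}
```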

In short - no.
There are multiple factors influencing the scheduling of different tasks and threads over the processors available. Most obvious may be the JVM, the hardware and the actual scheduler you are running. There is no simple way to force this or to guarantee to run each thread on a different core.
There may be different ways of making it more probable that they will end up on different cores, though, but I think you might have reached the end of the line with threads.

Your string concat is trivial - almost a no-op.
Your threads are probably spending most of their time blocked on the System.out I/O lock, so not running at all.
If you are going to start threads, give them some reasonable work to do.

Why Thread.sleep is bad to use

Apologies for this repeated question, but I haven't found any satisfactory answers yet. Most of the questions had their own specific use case:
Java - alternative to thread.sleep
Is there any better or alternative way to skip/avoid using Thread.sleep(1000) in Java?
My question is for the very generic use case. Wait for a condition to complete. Do some operation. Check for a condition. If the condition is not true, wait for some time and again do the same operation.
For example, consider a method that creates a DynamoDB table by calling its CreateTable API. A DynamoDB table takes some time to become active, so the method would call the DescribeTable API to poll for the status at regular intervals, up to some time limit (let's say 5 mins; deviation due to thread scheduling is acceptable). It returns if the table becomes active within 5 mins, else throws an exception.
Here is pseudo code:
public void createDynamoDBTable(String name) {
//call create table API to initiate table creation
//wait for table to become active
long endTime = System.currentTimeMillis() + MAX_WAIT_TIME_FOR_TABLE_CREATE;
while(System.currentTimeMillis() < endTime) {
boolean status = //call DescribeTable API to get status;
if(status) {
//status is now true, return
return
} else {
try {
Thread.sleep(10*1000);
} catch(InterruptedException e) {
}
}
}
throw new RuntimeException("Table still not created");
}
I understand that using Thread.sleep blocks the current thread, thereby consuming resources. But in a fairly mid-size application, is one thread a big concern?
I read somewhere that one should use a ScheduledThreadPoolExecutor and do the status polling there. But again, we would have to initialize this pool with at least 1 thread on which the runnable that does the polling would run.
Any suggestions on why using Thread.sleep is said to be such a bad idea, and what are the alternative options for achieving the same as above?
http://msmvps.com/blogs/peterritchie/archive/2007/04/26/thread-sleep-is-a-sign-of-a-poorly-designed-program.aspx

It's fine to use Thread.sleep in that situation. The reason people discourage Thread.sleep is that it's frequently used in an ill-advised attempt to fix a race condition, or used where notification-based synchronization is a much better choice, etc.
In this case, AFAIK you don't have an option but to poll, because the API doesn't provide you with notifications. I can also see it's an infrequent operation, because presumably you are not going to create thousands of tables.
Therefore, I find it fine to use Thread.sleep here. As you said, spawning a separate thread when you are going to block the current thread anyways seems to complicate things without merit.
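A generic version of that polling loop might look like the sketch below. The helper name pollUntil and its parameters are assumptions of this example; the BooleanSupplier stands in for the DescribeTable status check, which is not shown. Unlike the pseudo code in the question, it lets InterruptedException propagate instead of swallowing it.

```java
import java.util.function.BooleanSupplier;

public class PollSketch {
    // Checks `condition` every `intervalMillis` until it returns true or
    // `timeoutMillis` has elapsed. Returns true if the condition was met.
    static boolean pollUntil(BooleanSupplier condition, long timeoutMillis, long intervalMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMillis <= 0) {
                return false;
            }
            // Sleep for the interval, but never past the deadline.
            Thread.sleep(Math.min(intervalMillis, remainingMillis));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[] calls = {0};
        // Placeholder condition: becomes true on the third check.
        boolean ok = pollUntil(() -> ++calls[0] >= 3, 1000, 10);
        System.out.println(ok + " after " + calls[0] + " checks");
    }
}
```

The caller decides what a timeout means, for example by throwing the RuntimeException from the question when pollUntil returns false.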

Yes, one should try to avoid using Thread.sleep(x), but it shouldn't be totally forgotten:
Why it should be avoided:
It doesn't release any lock the thread holds
It doesn't guarantee that execution will resume right after the sleeping time (so it may keep waiting forever, though obviously that is a rare case)
If we mistakenly put a foreground processing thread to sleep, we won't be able to close the application until x milliseconds have passed.
We are now fully equipped with the newer concurrency package (java.util.concurrent), which offers constructs for specific problems (a bit like design patterns, though of course not exactly), so why use Thread.sleep(x) then?
Where to use Thread.sleep(x):
For providing delays in background running threads
And a few others.

Balancing multiple queues

I suspect this is really easy but I’m unsure if there’s a naïve way of doing it in Java. Here’s my problem, I have two scripts for processing data and both have the same inputs/outputs except one is written for the single CPU and the other is for GPUs. The work comes from a queue server and I’m trying to write a program that sends the data to either the CPU or GPU script depending on which one is free.
I do not understand how to do this.
I know with executorservice I can specify how many threads I want to keep running but not sure how to balance between two different ones. I have 2 GPU’s and 8 CPU cores on the system and thought I could have threadexecutorservice keep 2 GPU and 8 CPU processes running but unsure how to balance between them since the GPU will be done a lot quicker than the CPU tasks.
Any suggestions on how to approach this? Should I create two queues and keep polling them to see which one is less busy? Or is there a way to just put all the work units (all the same) into one queue and have the GPU or CPU processes take from the same queue as they become free?
UPDATE: Just to clarify, the CPU/GPU programs are outside the scope of the program I'm making; they are simply scripts that I call via two different methods. I guess the simplified version of what I'm asking is whether two methods can take work from the same queue.

Can two methods take work from the same queue?
Yes, but you should use a BlockingQueue to save yourself some synchronization heartache.
Basically, one option would be to have a producer which places tasks into the queue via BlockingQueue.offer. Then design your CPU/GPU threads to call BlockingQueue.take and perform work on whatever they receive.
For example:
main (...) {
BlockingQueue<Task> queue = new LinkedBlockingQueue<>();
for (int i=0;i<CPUs;i++) {
new CPUThread(queue).start();
}
for (int i=0;i<GPUs;i++) {
new GPUThread(queue).start();
}
for (/*all data*/) {
queue.offer(task);
}
}
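class CPUThread extends Thread {
private final BlockingQueue<Task> queue;
CPUThread(BlockingQueue<Task> queue) {
this.queue = queue;
}
@Override
public void run() {
try {
while(/*some condition*/) {
Task task = queue.take(); // blocks until a task is available
//do task work
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt(); // restore the flag and exit
}
}
}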
//etc...
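A self-contained variant of the same idea, as a sketch: both kinds of worker take from one LinkedBlockingQueue, so whichever is free picks up the next task. The runCpu/runGpu stubs, the Integer task type and the STOP sentinel are assumptions made to keep the example runnable; the real script calls are not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedQueueSketch {
    private static final int STOP = -1; // sentinel telling a worker to exit

    // Hypothetical stubs standing in for the calls to the external scripts.
    static void runCpu(int task) { }
    static void runGpu(int task) { }

    // Processes `taskCount` tasks with CPU and GPU workers sharing one queue
    // and returns how many tasks were handled in total.
    static int process(int taskCount, int cpuWorkers, int gpuWorkers) throws InterruptedException {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        AtomicInteger processed = new AtomicInteger();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < cpuWorkers + gpuWorkers; i++) {
            boolean gpu = i >= cpuWorkers;
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        int task = queue.take(); // whichever worker is free gets the next task
                        if (task == STOP) {
                            return;
                        }
                        if (gpu) {
                            runGpu(task);
                        } else {
                            runCpu(task);
                        }
                        processed.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, (gpu ? "gpu-" : "cpu-") + i);
            workers.add(t);
            t.start();
        }
        for (int task = 0; task < taskCount; task++) {
            queue.put(task);
        }
        for (int i = 0; i < workers.size(); i++) {
            queue.put(STOP); // one sentinel per worker
        }
        for (Thread t : workers) {
            t.join();
        }
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(process(20, 8, 2));
    }
}
```

Because faster workers return to take() sooner, the GPU workers naturally end up handling more tasks without any explicit balancing.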

Obviously there is more than one way to do it; usually the simplest is the best. I would suggest thread pools: one with 8 threads for CPU tasks, and a second with 2 threads for GPU tasks. Your work unit manager can submit work to whichever pool has idle threads at the moment (I would recommend synchronizing that block of code). The standard Java ThreadPoolExecutor has a getActiveCount() method you can use for this, see
http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ThreadPoolExecutor.html#getActiveCount().

Use Runnables like this:
class CPUGPURunnable implements Runnable {
@Override
public void run() {
if (Thread.currentThread() instanceof CPUGPUThread) {
CPUGPUThread t = (CPUGPUThread) Thread.currentThread();
if (t.isGPU())
runGPU();
else
runCPU();
}
}
}
CPUGPUThread is a Thread subclass that knows, via a flag, whether it runs in CPU or GPU mode. Have a ThreadFactory for the ThreadPoolExecutor that creates either a CPU or a GPU thread. Set up a ThreadPoolExecutor with two workers. Make sure the ThreadFactory creates a CPU and then a GPU thread instance.

I suppose you have two objects that represent the two GPUs, with methods like boolean isFree() and void execute(Runnable). Then you should start 8 threads which, in a loop, take the next job from the queue and hand it to a free GPU if there is one, or otherwise execute the job themselves.
