I am a beginner in multithreading and came across the following in OCJP7 edition:
Avoid using methods such as Object.wait, Object.notify, and Object.notifyAll in tasks (Runnable and Callable instances) that are submitted to an Executor or ExecutorService.
Can someone please explain why this is the case?
With an Executor you do not know how tasks are scheduled on threads. Indeed, it is possible for there to be only a single thread.
In the pathological case, a task submitted to a single-threaded executor calls Object.wait()... and nothing can ever run to notify() it, precisely because it is a single-threaded executor. The result is deadlock.
Even with more threads you can still end up in a similar deadlock situation if the relevant tasks happen to be scheduled on the same thread.
In any case, the blocking behaviour of Object.wait() stalls a thread, whereas the whole idea of an executor is to farm out as many jobs as possible to a much smaller number of threads. So even at best you are significantly reducing throughput by blocking an entire thread, along with all the other waiting tasks that were scheduled for it. That is, you are not just blocking the task that calls wait(); you are also blocking every task scheduled behind it on the same thread, since those tasks cannot run until your blocking task finishes.
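A minimal sketch of that pathological case (the lock object and task bodies here are made up for illustration, not taken from the book):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class WaitInExecutorDeadlock {
    public static void main(String[] args) {
        final Object lock = new Object();
        ExecutorService executor = Executors.newSingleThreadExecutor();

        // Task 1 occupies the executor's only thread and waits on the lock.
        executor.submit(() -> {
            synchronized (lock) {
                try {
                    lock.wait(); // blocks forever
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        // Task 2 would notify, but it can never start: the single thread
        // is stuck inside Task 1's wait(). The program hangs. Deadlock.
        executor.submit(() -> {
            synchronized (lock) {
                lock.notify();
            }
        });
    }
}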
Regarding switching context (synchronizing with wait/notify) between threads versus re-submitting a task (Callable/Runnable) to an executor service: which is better for performance? As I understand it, a context switch needs to save and reload thread data, but if I re-submit a task to an executor service, the JVM needs to allocate a stack for the submitted task, so I think it has the same cost as a context switch?
I am designing a task queue: worker threads put tasks into it, and a monitor thread takes tasks from the queue and submits them to a thread pool (executor service). But I am unsure when the monitor thread should do its work:
Option 1: The monitor thread calls wait(), and worker threads notify() it after putting a task into the queue.
Option 2: Use a scheduled executor service for the monitor thread to check the queue periodically.
-> Which option is better (for speed and performance)? And with option 2: how often should the queue be checked?
Many thanks for your help!
An executor is usually backed by a thread pool. So the stacks are already allocated. Furthermore, a context switch will only take place if the same CPU core has to execute both threads. Modern CPUs have multiple cores, so there would be no context switch involved.
Having said that, there is surely some overhead in transferring work to another thread. So tasks should be sufficiently coarse-grained and there should be other work that the main thread can perform in the meantime for the pool to be beneficial.
There is no need to have a monitor thread if the queue itself is synchronized. Have a look at ArrayBlockingQueue for example; producers can call the put() method (which blocks when no space in the queue is available), and consumers (threads in the pool) call take(), which blocks when no work is available. This is how ThreadPoolExecutor is implemented (to be precise, it actually calls offer(), so by default doesn't block on the producer side).
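A minimal sketch of that approach, assuming Runnable work items and an arbitrary capacity of 100 (both are illustrative choices):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueWithoutMonitor {
    public static void main(String[] args) {
        BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(100);

        // Consumer: blocks in take() until work arrives. In a real pool this
        // loop would be one of several worker threads with proper shutdown.
        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    Runnable task = queue.take(); // blocks when the queue is empty
                    task.run();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // Producer: blocks in put() only when the queue is full.
        try {
            queue.put(() -> System.out.println("task executed"));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

No monitor thread is needed: the queue's own blocking semantics coordinate the producers and consumers.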
In a ForkJoinPool ForkJoinTask, does the current worker thread participate in work stealing?
I have read implications that a fork/join pool can work-steal from blocked or waiting threads. The current worker seems an obvious candidate: once the worker calls .join() on another task, that worker is essentially blocked.
On the other hand, I see many articles that imply different conclusions. For example, the general consensus seems to be that the current worker thread should do some of the work itself before waiting for forked tasks.
There are a few articles that discuss the use of ForkJoinTask.getSurplusQueuedTaskCount as a method of balancing the work in the queue by having the current worker do some of the work. If the current worker is also stealing, then this doesn't seem necessary.
Naturally, I would like to maximize throughput and keep all workers as busy as possible. Understanding whether the current thread also steals work (for example, when .join() is called) would help clarify this.
It is the responsibility of the ForkJoinPool to manage threads. Client code should feed it tasks, not micromanage the threading. Note that tasks and threads are two different things; tasks are units of work to be executed, and threads execute that work.
ForkJoinTask.compute() should fork() into smaller subtasks if the task is large enough to benefit from running parts of the task in parallel, and simply process the task if the task is small enough that it would better be run in a single thread. If the work turns out to be more than expected, it can fork() some of the work and do the rest of it.
If ForkJoinTask.compute() forks into smaller subtasks, it can call join() before returning. The ForkJoinPool will then either free the thread to work on other tasks, or spawn a temporary thread to work on other tasks to ensure the available parallelism is fully utilized.
I think it's reasonable to assume that the appropriate number of worker threads are kept busy for as long as there are uncompleted tasks, unless you explicitly block the thread in the compute() method.
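For a concrete picture, a compute() method following that pattern might look roughly like this (the array-summing work and the threshold are placeholder choices, not from the question):

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sums a range of an array, splitting large ranges into subtasks.
class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // illustrative cutoff
    private final long[] data;
    private final int from, to;

    SumTask(long[] data, int from, int to) {
        this.data = data; this.from = from; this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;                       // small enough: just do the work
            for (int i = from; i < to; i++) sum += data[i];
            return sum;
        }
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        left.fork();                            // run the left half asynchronously
        long rightResult = new SumTask(data, mid, to).compute(); // right half in this thread
        return left.join() + rightResult;       // the pool keeps threads busy while joining
    }
}

It would be invoked with something like new ForkJoinPool().invoke(new SumTask(array, 0, array.length)). Note how the task does half the work itself before joining, which matches the consensus mentioned in the question.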
The Oracle tutorial provides more specifics on how to use these classes:
https://docs.oracle.com/javase/tutorial/essential/concurrency/forkjoin.html
I read a great article about the fork/join framework in Java 7. The idea is that, with ForkJoinPool and ForkJoinTask, threads in the pool can take subtasks from other tasks, so the pool is able to handle more tasks with fewer threads.
Then I tried to use a normal ExecutorService to do the same work, and found I can't tell the difference, since when I submit a new task to the pool, the task will be run on another available thread.
The only difference I can tell is that with ForkJoinPool I don't need to pass the pool to the tasks, because I can call task.fork() to make it run on another thread. With a normal ExecutorService, I have to pass the pool to the task (or make it static) so that, inside the task, I can call pool.submit(newTask).
Am I missing something?
(You can view the live code at https://github.com/freewind/fork-join-test/tree/master/src)
Although ForkJoinPool implements ExecutorService, it is conceptually different from 'normal' executors.
You can easily see the difference if your tasks spawn more tasks and wait for them to complete, e.g. by calling
executor.invoke(new Task()); // blocks this thread until new task completes
In a normal executor service, waiting for other tasks to complete blocks the current thread. There are two possible outcomes: if your executor service has a fixed number of threads, it might deadlock when the last running thread waits for another task to complete; if your executor dynamically creates new threads on demand, the number of threads might explode, and you end up with thousands of threads, which can cause starvation.
In contrast, the fork/join framework reuses the thread in the meantime to execute other tasks, so it won't deadlock even though the number of threads is fixed:
new MyForkJoinTask().invoke();
So if you have a problem that you can solve recursively, consider using a ForkJoinPool, as you can easily implement each level of recursion as a ForkJoinTask.
Just check the number of running threads in your examples.
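To make the deadlock case concrete, here is a sketch of a nested task on a fixed pool of one thread (the timeout and task bodies are arbitrary). The equivalent ForkJoinTask using fork()/join() completes even with parallelism 1, because the pool reuses the joining thread:

import java.util.concurrent.*;

public class NestedTaskDeadlock {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);

        Future<String> outer = pool.submit(() -> {
            // The outer task submits an inner task and waits for it.
            Future<String> inner = pool.submit(() -> "inner done");
            return inner.get(); // the pool's only thread blocks here forever
        });

        try {
            System.out.println(outer.get(2, TimeUnit.SECONDS));
        } catch (TimeoutException e) {
            System.out.println("deadlocked: no thread left to run the inner task");
            pool.shutdownNow();
        }
    }
}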
I am implementing the following functionality in a load test tool to simulate heavy load on a target application:
Multiple threads are launched concurrently to perform the same kind of operations.
Each thread will loop for n times. At the end of each loop, test results are available and are added to a list which is returned after all loops finish running.
I'm currently using Callable and Future, and putting the lists of results returned by all the threads into another list after all threads finish running and hand me their Futures. The problem is that I can lose everything that is already available if the execution of the program is interrupted. I want to be able to save the results of finished loops while the threads are still processing the remaining loops.
Is there something in the Java concurrency library suitable for this purpose? Or is there a better design for the load-test functionality I am building?
Thanks in advance!
You can pass your results to a BlockingQueue as they are produced. They can then be picked up by another thread, or by the one which triggered the tasks in the first place.
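A minimal sketch of that idea, assuming String results and illustrative thread/loop counts:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

public class IncrementalResults {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> results = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        int threads = 4, loops = 10;

        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                for (int i = 0; i < loops; i++) {
                    // ... perform one test iteration here ...
                    results.add("thread " + id + " loop " + i); // publish immediately
                }
            });
        }
        pool.shutdown();

        // Collector: drains results as they arrive, so finished loops
        // are already saved even if the remaining loops are interrupted.
        List<String> collected = new ArrayList<>();
        while (!pool.isTerminated() || !results.isEmpty()) {
            String r = results.poll(100, TimeUnit.MILLISECONDS);
            if (r != null) collected.add(r);
        }
        System.out.println(collected.size() + " results collected");
    }
}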
The java.util.concurrent.CyclicBarrier class is a synchronization mechanism that can synchronize threads progressing through some algorithm. In other words, it is a barrier that all threads must wait at, until all threads reach it, before any of the threads can continue.
Creating a CyclicBarrier
When you create a CyclicBarrier you specify how many threads are to wait at it, before releasing them. Here is how you create a CyclicBarrier:
CyclicBarrier barrier = new CyclicBarrier(2);
Waiting at a CyclicBarrier
Here is how a thread waits at a CyclicBarrier:
barrier.await();
You can also specify a timeout for the waiting thread. When the timeout has passed, the waiting thread is released (via a TimeoutException, which breaks the barrier), even if not all N threads have reached the CyclicBarrier. Here is how you specify a timeout:
barrier.await(10, TimeUnit.SECONDS);
A waiting thread waits at the CyclicBarrier until either:
The last thread arrives (calls await() )
The thread is interrupted by another thread (another thread calls its interrupt() method)
Another waiting thread is interrupted
Another waiting thread times out while waiting at the CyclicBarrier
The CyclicBarrier.reset() method is called by some external thread.
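Putting these pieces together, here is a small runnable example (the party count of 2, the thread names, and the messages are illustrative). It also shows the optional barrier action, which runs once when the last thread arrives:

import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

public class CyclicBarrierExample {
    public static void main(String[] args) {
        // Optional barrier action: runs once when the last thread arrives.
        CyclicBarrier barrier = new CyclicBarrier(2,
                () -> System.out.println("Both threads reached the barrier"));

        Runnable worker = () -> {
            try {
                System.out.println(Thread.currentThread().getName() + " waiting at barrier");
                barrier.await(); // blocks until 2 threads have called await()
                System.out.println(Thread.currentThread().getName() + " released");
            } catch (InterruptedException | BrokenBarrierException e) {
                Thread.currentThread().interrupt();
            }
        };

        new Thread(worker, "thread-1").start();
        new Thread(worker, "thread-2").start();
    }
}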
The docs for ScheduledThreadPoolExecutor say:
Tasks scheduled for exactly the same execution time are enabled in first-in-first-out (FIFO) order of submission.
Does this mean that tasks which should run at the same time are never actually run at the same time, but instead are executed in FIFO order?
If that is true, which class should I use that is better than Timer and does not have this FIFO problem?
The way a ScheduledThreadPoolExecutor works is that there is a single "scheduling" or master thread which checks for tasks to execute.
If it finds a task, it delegates it to a "worker" thread from the pool.
If multiple tasks are ready to be executed, they are "kicked off" one at a time, though once "kicked off", subsequent processing is concurrent, per Java's definition.
If you have two tasks that are both scheduled through the executor for the same time, the order in which they complete can vary from run to run. Unless you put in specific controls such as locks, waits, etc. to handle this, it's up to Java's thread scheduling (how Java allots time to threads on a core) to determine how and when each gets processed. Please note that setting up such locks and waits is a deceptively complex task, prone to race conditions leading to unexpected deadlocks, livelocks, etc.
It depends on the size of your thread pool. If you schedule 1000 tasks to fire at midnight, and you only have 25 threads, then only 25 can be executed initially, while the rest must wait for available threads. FIFO here refers to the order in which the executor will hand tasks off to the execution threads.
Please note that the docs talk about "enabling" the tasks and that we are talking about a threadpool executor. :-)
That means the tasks will wait until the designated time, and are then treated as if put into a normal ThreadPoolExecutor. If there are enough threads available in the pool, all these tasks will run in parallel.
Only if more tasks become active than there are available threads in the pool will some tasks have to wait.
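A small sketch to see this in action (the pool size and delay are arbitrary): with 3 threads, all three tasks scheduled for the same instant start essentially in parallel; shrink the pool to 1 and they run one after another.

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SameTimeScheduling {
    public static void main(String[] args) throws InterruptedException {
        // Pool of 3: all three tasks below can run concurrently.
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(3);

        for (int i = 1; i <= 3; i++) {
            final int id = i;
            scheduler.schedule(
                    () -> System.out.println("task " + id + " started on "
                            + Thread.currentThread().getName()),
                    1, TimeUnit.SECONDS); // all fire at the same scheduled time
        }

        scheduler.shutdown();
        scheduler.awaitTermination(5, TimeUnit.SECONDS);
    }
}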