I have many threads that run concurrently. For a few of these threads, execution depends on the completion of other threads.
For eg,
Thread - 1, Thread - 2, Thread - 3
These can run independently.
Thread - 4, depends on the completion of 1 and 2
Thread - 5, depends on the completion of 1 and 3
Thread - 6, depends on the completion of 1, 2, and 3
All the threads are submitted to the executor. Threads 4, 5, and 6 have to implement some waiting mechanism before starting. What are the possible blocking mechanisms available in Java for the above situation?
You get a Future<T> object when you use
Future<?> ExecutorService.submit(Runnable task)
Just pass the future to the thread that must wait for the completion (e.g. via its constructor) and do:
future.get();
which will block until the task behind this future has finished.
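For example, here is a minimal sketch of that idea. The names task1, task2, and the lambda standing in for task 4 are illustrative placeholders, not code from the question:

ExecutorService executor = Executors.newCachedThreadPool();
Future<?> f1 = executor.submit(task1);   // task 1 runs independently
Future<?> f2 = executor.submit(task2);   // task 2 runs independently

executor.submit(() -> {                  // task 4 waits for 1 and 2 before doing its own work
    try {
        f1.get();                        // blocks until task 1 completes
        f2.get();                        // blocks until task 2 completes
        // ... work of task 4 ...
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    } catch (ExecutionException e) {
        e.printStackTrace();
    }
});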
Proposals to use blocking facilities like CountDownLatch or Future only work with an unbounded Executor, or at least one able to run all 6 threads simultaneously; otherwise there is a risk of thread starvation. In that case, using an Executor has no advantage over starting all 6 threads directly, without an Executor.
Meanwhile, the dependencies dictate that no more than 3 threads will run at any moment. If the actual number of threads matters, you should use event-driven facilities instead of blocking ones: an object which collects signals and, when the signals from both tasks 1 and 2 have arrived, submits task 4, and so on. From a theoretical point of view, this resembles Petri nets. Unfortunately, the JDK does not provide standard classes for event-driven task coordination. It is not hard to implement such a signal collector/task emitter from scratch, or you can use the dataflow library df4j.
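For illustration, a rough sketch of such a signal collector. This is not a standard JDK class; all names here are made up:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.atomic.AtomicInteger;

class DependentTask {
    private final AtomicInteger remaining;
    private final Runnable task;
    private final ExecutorService executor;

    DependentTask(int dependencies, Runnable task, ExecutorService executor) {
        this.remaining = new AtomicInteger(dependencies);
        this.task = task;
        this.executor = executor;
    }

    // Each prerequisite task calls signal() when it finishes;
    // the last signal submits the dependent task for execution.
    void signal() {
        if (remaining.decrementAndGet() == 0) {
            executor.submit(task);
        }
    }
}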
You may use an ExecutorService for the threads and, where a task needs to wait for the other jobs (the tasks you want to accomplish first) to finish, invoke the await() method of the CountDownLatch class, as follows:
public void finishWork() {
    try {
        System.out.println("START WAITING for thread");
        countDownLatch.await();
        System.out.println("DONE WAITING for thread");
    } catch (InterruptedException ex) {
        Thread.currentThread().interrupt();
    }
}
And in each monitored thread, when its job completes, invoke the countDown() method of the same class. So, before calling shutdown() on the ExecutorService (that is, when finishing the job on those threads), the above method can be used to wait.
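For example, a minimal sketch of that wiring for task 4, which depends on tasks 1 and 2. The task bodies and pool size are placeholders:

ExecutorService executor = Executors.newFixedThreadPool(3);
CountDownLatch latch = new CountDownLatch(2);           // tasks 1 and 2 must finish first

executor.execute(() -> { /* work of task 1 */ latch.countDown(); });
executor.execute(() -> { /* work of task 2 */ latch.countDown(); });
executor.execute(() -> {
    try {
        latch.await();                                   // task 4 blocks until both count-downs arrive
        /* work of task 4 */
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
});
executor.shutdown();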
The Problem:
I am parsing a large log file (around 625_000_000 lines) and saving it into the database.
public class Importer implements Runnable {

    static int fileNumber = 1;
    private final IRequestService service;

    public Importer(IRequestService service) {
        this.service = service;
    }

    @Override
    public void run() {
        try {
            service.saveAll(getRequestListFromFile("segment_directory/Log_segment_" + fileNumber + ".txt"));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
The method that runs this thread is:
public void scheduledDataSave() throws InterruptedException {
    int availableCores = Runtime.getRuntime().availableProcessors();
    String directory = "segment_directory";
    int filesInDirectory = Objects.requireNonNull(new File(directory).list()).length;

    ExecutorService executorService = Executors.newFixedThreadPool(availableCores);
    for (int i = 1; i <= filesInDirectory; i++) {
        executorService.execute(new Importer(service));
    }
    executorService.shutdown();
}
Inserting a Thread.sleep() call after executorService.execute(new Importer(service)) makes it sleep after every single submission, and not after every 8 threads as it should, since they are in the ExecutorService.
And I have no idea why that happens since it should not behave like that.
From what I understand, the ExecutorService should run 8 threads in parallel, finish them, sleep, and start the pool again.
How to "sleep" after every 8 threads?
Sleeping the thread submitting tasks does not sleep the submitted tasks
Your question is not clear, but apparently centers around your expectation that adding a Thread.sleep after each call to executorService.execute would sleep all the threads of the executor service.
for ( int i = 1 ; i <= filesInDirectory ; i++ ) {
    executorService.execute( new Importer( service ) ); // Executor service assigns this task to one of the background threads in its backing pool of threads.
    Thread.sleep( Duration.ofMillis( 100 ).toMillis() ) ; // Sleeping this thread doing the looping. *Not* sleeping the background threads managed by the executor service.
}
Your expectation is incorrect.
That Thread.sleep is sleeping the thread doing the for loop.
The executor service has its own backing pool of threads. Those threads are not affected by a Thread.sleep in some other thread. Those background threads will only sleep if you call Thread.sleep within the code running on each of those threads.
So you are feeding the first task to the executor service. The executor service immediately dispatches that work to one of its backing threads. That task is immediately executed (if a thread is available immediately, and not otherwise occupied by previous tasks).
After assigning that task, your for loop sleeps for a hundred milliseconds, in this example code shown here. While the for loop is asleep, no further tasks are being assigned to the executor service. But while the for loop is asleep, the submitted task is executing on a background thread. That background thread is not sleeping.
Eventually, your for loop thread wakes, assigns a second task, and goes back to sleep. Meanwhile the background thread executes at full speed ahead.
So sleeping the thread submitting tasks does not sleep tasks already submitted.
Waiting for submitted tasks to complete
Your title asks:
ExecutorService should wait until batch of tasks is finished before starting again
After submitting your tasks, call shutdown and awaitTermination on your executor service. After those calls, your code blocks, waiting until all the submitted tasks are completed/canceled/failed.
ExecutorService executorService = Executors.newVirtualThreadExecutor() ;  // Project Loom early-access API.
… submit tasks to that executor service …
executorService.shutdown() ;
executorService.awaitTermination( 1 , TimeUnit.HOURS ) ; // At this point, the flow-of-control blocks (up to the timeout) until the submitted tasks are done.
System.out.println( "INFO - Tasks on background threads are done. " + Instant.now() );
I would suggest using the ExecutorService#submit method rather than ExecutorService#execute method. The difference is that the first method returns a Future object. You can collect these Future objects as you submit tasks to the executor service. After the shutdown & awaitTermination, you can examine your collection of Future objects to check their completion status.
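A sketch of that approach, reusing the Importer, service, and filesInDirectory names from the question. The one-hour timeout is an arbitrary choice for illustration, and the code assumes it sits in a method that declares InterruptedException, as scheduledDataSave does:

List<Future<?>> futures = new ArrayList<>();
for (int i = 1; i <= filesInDirectory; i++) {
    futures.add(executorService.submit(new Importer(service)));   // submit returns a Future, unlike execute
}
executorService.shutdown();
executorService.awaitTermination(1, TimeUnit.HOURS);
for (Future<?> future : futures) {
    System.out.println("done: " + future.isDone() + ", cancelled: " + future.isCancelled());
}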
Project Loom
If Project Loom succeeds, such code will be a bit simpler and more clear. Experimental builds of Project Loom technology are available now, based on early-access Java 17. The Loom team is seeking feedback.
With Project Loom, ExecutorService becomes AutoCloseable. This means we can use try-with-resources syntax to automatically call a new close method on ExecutorService. This close method first blocks until all the tasks are completed/canceled/failed, then shuts down the executor service. No need to call shutdown nor awaitTermination.
By the way, Project Loom also brings virtual threads (fibers). This is likely to dramatically increase the performance of your code, because your code spends much of its time blocking on storage I/O and database access.
try (
ExecutorService executorService = Executors.newVirtualThreadExecutor() ;
)
{
… submit tasks to that executor service …
}
// At this point, with Project Loom technology, the flow-of-control blocks until the submitted tasks are done.
// Also, the `ExecutorService` is automatically closed/shutdown by this point, via try-with-resources syntax.
System.out.println( "INFO - Tasks on background threads are done. " + Instant.now() );
With Project Loom, you can collect the returned Future objects in the same manner as discussed above to examine completion status.
You have other issues in your code. But you've not disclosed enough to address them all.
How to "sleep" after every 8 threads?
So if you are doing something like this, then it isn't doing what you think.
for (int i = 1; i <= filesInDirectory; i++) {
    executorService.execute(new Importer(service));
    Thread.sleep(...);
}
This causes the thread which is starting the background jobs to sleep and does not affect the running of each of the jobs. I believe what you are missing is to wait for the thread-pool to finish:
executorService.shutdown();
executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.MILLISECONDS);
This waits for all of the jobs in the thread-pool to complete before continuing.
One more thing. I use executorService.submit(...) versus execute(...). Here's a description of their difference. For me, one additional difference is that any exceptions thrown by tasks run with execute(...) cause the running thread to terminate and possibly be restarted. With submit(...), you can retrieve that exception if needed, and it stops the pool's threads from having to be respawned unnecessarily.
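A small sketch of that difference, with a deliberately failing task; the Callable here is made up for illustration and executorService stands for your pool:

Callable<String> failing = () -> { throw new RuntimeException("task failed"); };
Future<String> future = executorService.submit(failing);
try {
    future.get();
} catch (ExecutionException e) {
    System.err.println("Task threw: " + e.getCause());   // the RuntimeException thrown inside the task
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}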
If you explain a bit more about what you are trying to accomplish, we should be able to help.
I have a ThreadPoolExecutor:
ThreadPoolExecutor service = new ThreadPoolExecutor(N_THREADS, N_THREADS, 0L, TimeUnit.MILLISECONDS, blockingQueue, rejectedExecutionHandler);
The service executes threads implementing the Runnable interface. Each thread processes a file on disk. I found that after several hours, two threads (or cores depending on what htop shows in Linux) were running and had been running for 13 hours. What's even worse is that the remaining cores showed no activity as if they were waiting for the two threads to complete.
Questions:
1 - I have read a lot on how this problem may be resolved but nothing conclusive. As far as I can work out, you CANNOT stop a Runnable using the ThreadPoolExecutor because it is an independent thread that just runs. Using the Future framework:
Future<?> f = service.submit(submittedtask);
f.get(XX, TimeUnit.SECONDS);
allows you to set a timeout and fetch the future result, but get blocks the calling thread, effectively making the implementation serial. Is it possible to interrupt a Runnable after a given time using the ThreadPoolExecutor and get the thread back to the pool so it can pick up a new task and carry on?
2 - My big concern is why, in htop, I see two threads/cores running and no other core/thread active even though many tasks are still waiting to execute (i.e. there are many files left to process). Any insight?
You could create a second scheduled thread pool to which you would submit cancellation tasks for each of the returned Futures. Each of these tasks, after a given timeout, would check if its associated Future is done and, if not, cancel it. Cancellation would trigger thread interruption, so you might need to support it in your tasks by checking the interrupted flag: Thread.interrupted().
The size of this second thread pool could be minimal, i.e. 1, as this job takes minimal CPU time.
Code example:
ScheduledExecutorService service = Executors.newScheduledThreadPool(1);
...
while (...) {
    final Future<?> f = pool.submit(...);
    service.schedule(new Runnable() {
        @Override
        public void run() {
            if (!f.isDone()) {
                f.cancel(true);
            }
        }
    }, 1, TimeUnit.MINUTES);
}
service.shutdown();
service.awaitTermination(1, TimeUnit.MINUTES);
You can tell a thread that you wish to interrupt it:
An interrupt is an indication to a thread that it should stop what it is doing and do something else.
You can interrupt your thread with Future.cancel(true). It is your responsibility to implement the Runnable in a manner that obeys that wish, by checking the thread's interrupted status (e.g. Thread.currentThread().isInterrupted()).
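For instance, a minimal sketch of a task that cooperates with cancellation; the loop body is a placeholder for your real per-file work:

Runnable task = () -> {
    while (!Thread.currentThread().isInterrupted()) {
        // process the next unit of work here
    }
    // interrupted: clean up and return, so the pool thread is free for the next task
};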
In order to see per-thread details for a process, run:
ps -eLf | grep <PROCESS_PID>
htop shows you the list of running processes, where each process has at least one thread.
Is there an ExecutorService that allows an existing thread to perform the executions instead of spawning new threads? Bonus if it’s a ScheduledExecutor. Most executors spawn worker threads to do the execution, but I want the worker thread to be an existing thread that I’m on. Here's the API that I imagine:
while (!executor.isTerminated()) {
    Runnable r = executor.take();
    r.run();
}
This is similar to the way that SWT and JavaFX allow the main thread to dispatch events, as opposed to Swing, which requires its own event dispatch thread to be spawned to handle events.
Motivation: I currently have lots of places where a thread spawns a new executor and then just calls awaitTermination() to wait for it to finish. I’d like to save some resources and keep the stack traces from being split in two.
Note that I don’t want an executor that runs tasks in execute(Runnable)’s caller threads, which is what this answer and Guava’s MoreExecutors.sameThreadExecutor() do.
Most executors from java.util.concurrent behave exactly as you supposed. Some spawn additional threads when there are too many tasks, but usually they can be configured to set a limit.
To exploit such behaviour, do not start a new executor each time; use the same executor. To wait for a set of tasks to finish, use invokeAll(), or submit() and then future.get().
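A short sketch of the invokeAll() approach; the Callable list is illustrative and checked exceptions are declared rather than handled:

static void runAll() throws InterruptedException, ExecutionException {
    ExecutorService executor = Executors.newFixedThreadPool(4);
    List<Callable<Integer>> tasks = Arrays.asList(() -> 1, () -> 2, () -> 3);
    List<Future<Integer>> results = executor.invokeAll(tasks);   // blocks until every task has finished
    for (Future<Integer> result : results) {
        System.out.println(result.get());
    }
    executor.shutdown();
}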
I'm assuming what you want is control over the creation of new threads, such as name, daemon-status, etc. Use a ThreadFactory:
public class MyThreadFactory implements ThreadFactory {
    @Override
    public Thread newThread(Runnable runnable) {
        Thread t = new Thread(runnable, "MyThreadName");
        t.setDaemon(true);
        return t;
    }
}
This allows you to control thread creation so that the execution happens in threads that you manufacture instead of some default thread from a default ThreadFactory.
Then to use it, note that the factory methods in Executors have overloads that take a ThreadFactory:
Executors.newExecutorOfSomeKind(new MyThreadFactory());
Edit: I see what you mean now. Unfortunately, the behavior of all Executor implementations (as far as I'm aware) is to create new threads to run the task, except the sameThreadExecutor you mentioned. Short of reworking the Thread objects that create executors just to execute one task (which is a horrible design -- see comments for what I mean by this), there's no easy way to accomplish what you want. I would recommend changing the code to use a single Executor with something like an ExecutorCompletionService (see this question) or to use a fork/join pattern. Fork/join is made easier in Java 7 (see this Java trail). For pre-Java 7 code, read up on the counting Semaphore in Java (and in general).
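For reference, a brief sketch of the ExecutorCompletionService idea mentioned above; the tasks are made up, and checked exceptions are declared rather than handled:

static void runWithCompletionService() throws InterruptedException, ExecutionException {
    ExecutorService executor = Executors.newFixedThreadPool(4);
    CompletionService<String> completionService = new ExecutorCompletionService<>(executor);
    for (int i = 0; i < 10; i++) {
        final int id = i;
        completionService.submit(() -> "result of task " + id);
    }
    for (int i = 0; i < 10; i++) {
        System.out.println(completionService.take().get());   // results arrive in completion order
    }
    executor.shutdown();
}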
I am using an ExecutorService (a ThreadPoolExecutor) to run (and queue) a lot of tasks. I am attempting to write some shut down code that is as graceful as possible.
ExecutorService has two ways of shutting down:
I can call ExecutorService.shutdown() and then ExecutorService.awaitTermination(...).
I can call ExecutorService.shutdownNow().
According to the JavaDoc, the shutdown command:
Initiates an orderly shutdown in which previously submitted
tasks are executed, but no new tasks will be accepted.
And the shutdownNow command:
Attempts to stop all actively executing tasks, halts the
processing of waiting tasks, and returns a list of the tasks that were
awaiting execution.
I want something in between these two options.
I want to call a command that:
a. Completes the currently active task or tasks (like shutdown).
b. Halts the processing of waiting tasks (like shutdownNow).
For example: suppose I have a ThreadPoolExecutor with 3 threads. It currently has 50 tasks in the queue with the first 3 actively running. I want to allow those 3 active tasks to complete but I do not want the remaining 47 tasks to start.
I believe I could shut down the ExecutorService this way by keeping a list of Future objects around and then calling cancel on all of them. But since tasks are being submitted to this ExecutorService from multiple threads, there would not be a clean way to do this.
I'm really hoping I'm missing something obvious or that there's a way to do it cleanly.
Thanks for any help.
I ran into this issue recently. There may be a more elegant approach, but my solution is to first call shutdown(), then pull out the BlockingQueue being used by the ThreadPoolExecutor and call clear() on it (or else drain it to another Collection for storage). Finally, calling awaitTermination() allows the thread pool to finish what's currently on its plate.
For example:
public static void shutdownPool(boolean awaitTermination) throws InterruptedException {
    //call shutdown to prevent new tasks from being submitted
    executor.shutdown();
    //get a reference to the Queue
    final BlockingQueue<Runnable> blockingQueue = executor.getQueue();
    //clear the Queue
    blockingQueue.clear();
    //or else copy its contents here with a while loop and remove()
    //wait for active tasks to be completed
    if (awaitTermination) {
        executor.awaitTermination(SHUTDOWN_TIMEOUT, TimeUnit.SECONDS);
    }
}
This method would be implemented in the directing class that wraps the ThreadPoolExecutor, referenced here by the executor field.
It's important to note the following from the ThreadPoolExecutor.getQueue() javadoc:
Access to the task queue is intended primarily for debugging and
monitoring. This queue may be in active use. Retrieving the task queue
does not prevent queued tasks from executing.
This highlights the fact that additional tasks may be polled from the BlockingQueue while you drain it. However, all BlockingQueue implementations are thread-safe according to that interface's documentation, so this shouldn't cause problems.
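If you would rather keep the pending tasks than discard them, here is a sketch of the drain-to-a-collection variant mentioned in the code comment above, using the same executor reference:

List<Runnable> pendingTasks = new ArrayList<>();
executor.getQueue().drainTo(pendingTasks);   // moves queued, never-started tasks into the list
System.out.println(pendingTasks.size() + " queued tasks were removed before they started");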
shutdownNow() is exactly what you need. You've missed the first word, Attempts, and the entire second paragraph of its javadoc:
There are no guarantees beyond best-effort attempts to stop processing actively executing tasks. For example, typical implementations will cancel via Thread.interrupt(), so any task that fails to respond to interrupts may never terminate.
So, only tasks which check Thread#isInterrupted() on a regular basis (e.g. in a while (!Thread.currentThread().isInterrupted()) loop or something) will be terminated. But if you aren't checking that flag in your task, it will keep running.
You can wrap each submitted task with a little extra logic:
Runnable wrapper = new Runnable() {
    @Override
    public void run() {
        if (executorService.isShutdown())
            throw new Error("shutdown");
        task.run();
    }
};
executorService.submit(wrapper);
The overhead of the extra check is negligible. After the executor is shut down, the wrappers will still be executed, but the original tasks won't.
I was trying to run ExecutorService object with FixedThreadPool and I ran into problems.
I expected the program to finish almost immediately, but it hung. I found that I needed to use a Semaphore along with it so that the items in the queue do not pile up.
Is there any way I can tell whether all the threads of the pool are in use?
Basic code ...
static ExecutorService pool = Executors.newFixedThreadPool(4);
static Semaphore permits = new Semaphore(4);

try {
    permits.acquire();
    pool.execute(p); // Assuming p is runnable on large number of objects
    permits.release();
} catch (InterruptedException ex) {
}
This code hangs and I really don't know why. How can I know whether the pool is currently waiting for all the threads to finish?
By default, if you submit more than 4 tasks to your pool then the extra tasks will be queued until a thread becomes available.
The blog you referenced in your comment uses the semaphore to limit the amount of work that can be queued at once, which won't be a problem for you until you have many thousands of tasks queued up and they start eating into the available memory. There's an easier way to do this, anyway - construct a ThreadPoolExecutor with a bounded queue.* But this isn't your problem.
If you want to know when a task completes, notice that ExecutorService.submit() returns a Future object which can be used to wait for the task's completion:
Future<?> f = pool.submit(p);
f.get();
System.out.println("task complete");
If you have several tasks and want to wait for all of them to complete, either store each Future in a list and then call get() on each in turn, or investigate ExecutorService.invokeAll() (which essentially does the same but in a single method call).
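A sketch of the list-of-futures variant; the tasks collection is a placeholder for whatever work you have, and checked exceptions are declared rather than handled:

static void waitForAll(ExecutorService pool, List<Runnable> tasks) throws InterruptedException, ExecutionException {
    List<Future<?>> futures = new ArrayList<>();
    for (Runnable task : tasks) {
        futures.add(pool.submit(task));
    }
    for (Future<?> future : futures) {
        future.get();          // blocks until that particular task has completed
    }
    System.out.println("all tasks complete");
}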
You can also tell whether a task has completed or not:
Future<?> f = pool.submit(p);
while (!f.isDone()) {
    // do something else, task not complete
}
f.get();
Finally, note that even if your tasks are complete, your program may not exit (and thus appears to "hang") if you haven't called shutdown() on the thread pool; the reason is that the threads are still running, waiting to be given more work to do.
*Edit: sorry, I just re-read my answer and realised this part is incorrect - ThreadPoolExecutor offers tasks to the queue and rejects them if they aren't accepted, so a bounded queue has different semantics to the semaphore approach.
You do not need the Semaphore.
If it is hanging, it is probably because the threads are blocking elsewhere.
Run the code in a debugger and, when it hangs, pause it and see what the threads are doing.
You could change to using a ThreadPoolExecutor. It contains a getActiveCount() method which returns an approximate count of the active threads. Why it is approximate I'm not sure.
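A small sketch of polling those counts; the cast relies on Executors.newFixedThreadPool returning a ThreadPoolExecutor, which is the case in the standard JDK implementation:

ThreadPoolExecutor pool = (ThreadPoolExecutor) Executors.newFixedThreadPool(4);
System.out.println("active threads:  " + pool.getActiveCount());        // approximate count of busy workers
System.out.println("queued tasks:    " + pool.getQueue().size());       // tasks waiting for a free thread
System.out.println("completed tasks: " + pool.getCompletedTaskCount()); // approximate count of finished tasks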