ExecutorService never stops when a new task is executed from inside another running task - java

Good day.
I have a blocking issue with my web crawler project.
The logic is simple. The first Runnable downloads an HTML document, scans it for links, and creates a new Runnable for every link it finds. Each newly created Runnable in turn creates and executes new Runnable objects for the links on its own page.
The problem is that the ExecutorService never stops.
CrawlerTest.java
public class CrawlerTest {
public static void main(String[] args) throws InterruptedException {
new CrawlerService().crawlInternetResource("https://jsoup.org/");
}
}
CrawlerService.java
import java.io.IOException;
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
public class CrawlerService {
private Set<String> uniqueUrls = Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>(10000));
private ExecutorService executorService = Executors.newFixedThreadPool(8);
private String baseDomainUrl;
public void crawlInternetResource(String baseDomainUrl) throws InterruptedException {
this.baseDomainUrl = baseDomainUrl;
System.out.println("Start");
executorService.execute(new Crawler(baseDomainUrl)); // Run the first task, which scans the main page of the domain and submits further tasks.
executorService.awaitTermination(10, TimeUnit.MINUTES);
System.out.println("End");
}
private class Crawler implements Runnable { // Inner class that wraps one crawl task and scans a page for links
private String urlToCrawl;
public Crawler(String urlToCrawl) {
this.urlToCrawl = urlToCrawl;
}
public void run() {
try {
findAllLinks();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
private void findAllLinks() throws InterruptedException {
/* Try to add the URL to the collection. If it is new, scan the document
* and submit a new task for every link found. */
if (uniqueUrls.add(urlToCrawl)) {
System.out.println(urlToCrawl);
Document htmlDocument = loadHtmlDocument(urlToCrawl);
Elements findedLinks = htmlDocument.select("a[href]");
for (Element link : findedLinks) {
String absLink = link.attr("abs:href");
if (absLink.contains(baseDomainUrl) && !absLink.contains("#")) { // Stay inside the base domain
executorService.execute(new Crawler(absLink)); // Submit a new task for each link found
}
}
}
}
private Document loadHtmlDocument(String internetResourceUrl) {
Document document = null;
try {
document = Jsoup.connect(internetResourceUrl).ignoreHttpErrors(true).ignoreContentType(true)
.userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:48.0) Gecko/20100101 Firefox/48.0")
.timeout(10000).get();
} catch (IOException e) {
System.out.println("Page load error");
e.printStackTrace();
}
return document;
}
}
}
This app needs about 20 seconds to scan jsoup.org for all unique links. But it simply waits the full 10 minutes in executorService.awaitTermination(10, TimeUnit.MINUTES),
and then I see that the main thread is dead while the executor's threads are still alive.
[Screenshot: Threads]
How can I make the ExecutorService work correctly?
I think the problem is that executorService.execute is invoked inside another task instead of in the main thread.

You are misusing awaitTermination. According to the Javadoc, you should call shutdown first:
Blocks until all tasks have completed execution after a shutdown request, or the timeout occurs, or the current thread is interrupted, whichever happens first.
To achieve your goal, I'd suggest using a CountDownLatch (or a latch that supports increments, like this one) to determine the exact moment when there are no tasks left, so that you can safely call shutdown.

I see your comment from earlier:
I can't use CountDownLatch because I don't know beforehand how many unique links I will collect from resource.
First off, vsminkov is spot on with the answer as to why awaitTermination will sit and wait for 10 minutes. I will offer an alternative solution.
Instead of using a CountDownLatch, use a Phaser. For each new task you can register a party, and then await completion.
Create a single Phaser, register each time executorService.execute is invoked, and arrive each time a Runnable completes.
public void crawlInternetResource(String baseDomainUrl) {
this.baseDomainUrl = baseDomainUrl;
Phaser phaser = new Phaser();
int phase = phaser.getPhase(); // read the phase before any task can arrive
executorService.execute(new Crawler(phaser, baseDomainUrl));
phaser.awaitAdvance(phase); // blocks until every registered task has arrived
}
private class Crawler implements Runnable {
private final Phaser phaser;
private String urlToCrawl;
public Crawler(Phaser phaser, String urlToCrawl) {
this.urlToCrawl = urlToCrawl;
this.phaser = phaser;
phaser.register(); // register new task
}
public void run() {
try {
... // crawl the page; each new Crawler(phaser, link) registers itself
} finally {
phaser.arrive(); // always arrive, even if the crawl throws
}
}
}
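For anyone who wants to try the Phaser idea without hitting the network, here is a minimal self-contained sketch. It is not the asker's crawler: the LINKS map is a hypothetical in-memory stand-in for the Jsoup download, the class and method names are made up for illustration, and it assumes Java 9+ for Map.of/List.of. It also uses arriveAndDeregister() rather than arrive(), with the calling thread registered as one extra party.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;

class PhaserCrawlSketch {
    // Hypothetical "site": page -> links found on that page (stands in for Jsoup).
    static final Map<String, List<String>> LINKS = Map.of(
            "/", List.of("/a", "/b"),
            "/a", List.of("/b", "/c"),
            "/b", List.of("/"),
            "/c", List.of());

    static Set<String> crawl(String start) {
        Set<String> visited = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Phaser phaser = new Phaser(1); // the calling thread is the first party
        submit(pool, phaser, visited, start);
        phaser.arriveAndAwaitAdvance(); // returns once every task has arrived
        pool.shutdown(); // nothing is left to run, so shutdown is safe now
        return visited;
    }

    static void submit(ExecutorService pool, Phaser phaser, Set<String> visited, String url) {
        phaser.register(); // one party per task, registered before it is queued
        pool.execute(() -> {
            try {
                if (visited.add(url)) {
                    for (String link : LINKS.getOrDefault(url, List.of())) {
                        submit(pool, phaser, visited, link);
                    }
                }
            } finally {
                phaser.arriveAndDeregister(); // runs even if the task throws
            }
        });
    }
}
```

Because a task registers its children before it arrives itself, the phase cannot advance while any work is still queued or running.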

You are not calling shutdown.
This may work: keep an AtomicLong counter in the CrawlerService and increment it before every new subtask is submitted to the executor service.
Modify your run() method to decrement this counter and, if it reaches 0, shut down the executor service:
public void run() {
try {
findAllLinks();
} catch (InterruptedException e) {
e.printStackTrace();
} finally {
// decrement the counter
// If it is 0, shut down the executor from here, or just notify CrawlerService, which would be waiting in wait().
}
}
In the finally block, decrement the counter; when it reaches zero, shut down the executor or just notify CrawlerService. Zero means that this is the last task: no other task is running, none is pending in the queue, and so no task will submit any new subtasks.
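A rough sketch of that counter idea, under some assumptions: the LINKS map is a hypothetical in-memory replacement for the real page download, the class name is invented, and Java 9+ is assumed for Map.of/List.of. The point it shows is that awaitTermination returns early once the last task calls shutdown().

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class CountingCrawlSketch {
    // Hypothetical "site": page -> links found on that page (stands in for Jsoup).
    static final Map<String, List<String>> LINKS = Map.of(
            "/", List.of("/a", "/b"),
            "/a", List.of("/b", "/c"),
            "/b", List.of("/"),
            "/c", List.of());

    private final Set<String> visited = ConcurrentHashMap.newKeySet();
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final AtomicLong pending = new AtomicLong(); // queued + running tasks

    Set<String> crawl(String start) {
        submit(start);
        try {
            // Returns as soon as the last task has called shutdown();
            // the timeout is only a safety net.
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
        return visited;
    }

    private void submit(String url) {
        pending.incrementAndGet(); // count the task before it is queued
        pool.execute(() -> {
            try {
                if (visited.add(url)) {
                    for (String link : LINKS.getOrDefault(url, List.of())) {
                        submit(link);
                    }
                }
            } finally {
                // Children were counted before this decrement, so zero really
                // means that nothing is running and nothing is queued.
                if (pending.decrementAndGet() == 0) {
                    pool.shutdown();
                }
            }
        });
    }
}
```

The increment has to happen in the submitting task, before execute, otherwise the counter could touch zero while a child is still waiting in the queue.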

How can I make the ExecutorService work correctly?
I think the problem is that executorService.execute is invoked inside another task instead of in the main thread.
No. The problem is not with ExecutorService. You are using the APIs incorrectly and hence not getting the right result.
You have to call three APIs in a certain order to get the right result:
1. shutdown
2. awaitTermination
3. shutdownNow
Recommended way from the Oracle documentation page for ExecutorService:
void shutdownAndAwaitTermination(ExecutorService pool) {
pool.shutdown(); // Disable new tasks from being submitted
try {
// Wait a while for existing tasks to terminate
if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
pool.shutdownNow(); // Cancel currently executing tasks
// Wait a while for tasks to respond to being cancelled
if (!pool.awaitTermination(60, TimeUnit.SECONDS))
System.err.println("Pool did not terminate");
}
} catch (InterruptedException ie) {
// (Re-)Cancel if current thread also interrupted
pool.shutdownNow();
// Preserve interrupt status
Thread.currentThread().interrupt();
}
}
shutdown(): Initiates an orderly shutdown in which previously submitted tasks are executed, but no new tasks will be accepted.
shutdownNow(): Attempts to stop all actively executing tasks, halts the processing of waiting tasks, and returns a list of the tasks that were awaiting execution.
awaitTermination(): Blocks until all tasks have completed execution after a shutdown request, or the timeout occurs, or the current thread is interrupted, whichever happens first.
On a different note: If you want to wait for all tasks to complete, refer to this related SE question:
wait until all threads finish their work in java
I prefer using invokeAll() or a ForkJoinPool, which are best suited for your use case.
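To make the ForkJoinPool suggestion a bit more concrete, here is a small sketch, with the caveat that it is only an illustration: LINKS is a hypothetical in-memory link graph instead of real downloads, the class names are invented, and Java 9+ is assumed for Map.of/List.of. ForkJoinPool is designed for CPU-bound work, so for blocking HTTP calls you would want to check whether it really fits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class ForkJoinCrawlSketch {
    // Hypothetical "site": page -> links found on that page (stands in for Jsoup).
    static final Map<String, List<String>> LINKS = Map.of(
            "/", List.of("/a", "/b"),
            "/a", List.of("/b", "/c"),
            "/b", List.of("/"),
            "/c", List.of());

    static class CrawlAction extends RecursiveAction {
        private final String url;
        private final Set<String> visited;

        CrawlAction(String url, Set<String> visited) {
            this.url = url;
            this.visited = visited;
        }

        @Override
        protected void compute() {
            if (!visited.add(url)) {
                return; // already crawled by another task
            }
            List<CrawlAction> children = new ArrayList<>();
            for (String link : LINKS.getOrDefault(url, List.of())) {
                children.add(new CrawlAction(link, visited));
            }
            invokeAll(children); // forks the subtasks and waits for all of them
        }
    }

    static Set<String> crawl(String start) {
        Set<String> visited = ConcurrentHashMap.newKeySet();
        ForkJoinPool pool = new ForkJoinPool(4);
        try {
            // invoke() blocks until the root action and all of its subtasks are done.
            pool.invoke(new CrawlAction(start, visited));
        } finally {
            pool.shutdown();
        }
        return visited;
    }
}
```

Here no counter or latch is needed, because each action only completes after its own subtasks have completed.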

Related

What is the right way to use Java executor?

I am using a Java executor in the following way, but I am not sure whether every line is necessary and whether this is the correct way to use it:
ExecutorService executor=Executors.newFixedThreadPool(30);
...
int N=200;
CountDownLatch doneSignal=new CountDownLatch(N);
for (int i=0;i<N;i++) executor.execute(new Test_Runner(doneSignal,...));
doneSignal.await();
executor.shutdown();
while (!executor.isTerminated()) { Thread.sleep(1000); }
// Blocks until all tasks have completed execution after a shutdown request
executor.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
...
class Test_Runner implements Runnable
{
private CountDownLatch doneSignal;
Thread Test_Runner_Thread;
public Test_Runner(CountDownLatch doneSignal,...)
{
this.doneSignal=doneSignal;
}
// Define some methods
public void run()
{
try
{
// do some work
}
catch (Exception e)
{
e.printStackTrace();
}
doneSignal.countDown();
}
public void start()
{
if (Test_Runner_Thread==null)
{
Test_Runner_Thread=new Thread(this);
Test_Runner_Thread.setPriority(Thread.NORM_PRIORITY);
Test_Runner_Thread.start();
}
}
public void stop() { if (Test_Runner_Thread!=null) Test_Runner_Thread=null; }
}
Looks correct to me. In the past I have followed the suggested implementation from the Java 7 Javadoc for ExecutorService for stopping it. You can get it from the Java 7 Javadoc, but I provide it below for convenience. Edit it to fit your needs; for example, you might want to pass the number of seconds to wait. The good thing about using a CountDownLatch is that by the time it is done waiting you know the ExecutorService will terminate right away. Also, you might want to add a timeout to your latch's await if needed in future real-world cases. Also, put your latch.countDown() in a try's finally block when using this in a real-world application.
void shutdownAndAwaitTermination(ExecutorService pool) {
pool.shutdown(); // Disable new tasks from being submitted
try {
// Wait a while for existing tasks to terminate
if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
pool.shutdownNow(); // Cancel currently executing tasks
// Wait a while for tasks to respond to being cancelled
if (!pool.awaitTermination(60, TimeUnit.SECONDS))
System.err.println("Pool did not terminate");
}
} catch (InterruptedException ie) {
// (Re-)Cancel if current thread also interrupted
pool.shutdownNow();
// Preserve interrupt status
Thread.currentThread().interrupt();
}
}
You can further simplify the code.
You can remove the CountDownLatch.
Change Test_Runner to a Callable task.
Create an ArrayList of Callable tasks.
List<Test_Runner> callables = new ArrayList<Test_Runner>();
for (int i=0;i<N;i++) {
callables.add(new Test_Runner());
}
Use invokeAll() on executorService.
List<Future<String>> futures = executorService.invokeAll(callables);
From javadocs,
<T> List<Future<T>> invokeAll(Collection<? extends Callable<T>> tasks)
throws InterruptedException
Executes the given tasks, returning a list of Futures holding their status and results when all complete. Future.isDone() is true for each element of the returned list. Note that a completed task could have terminated either normally or by throwing an exception. The results of this method are undefined if the given collection is modified while this operation is in progress.
And you can shut down the executorService as proposed by Jose Martinez.
Related SE question: How to shutdown an ExecutorService?
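Putting those steps together, a minimal runnable sketch could look like the following. The class name and the squaring tasks are placeholders I made up; they stand in for whatever Test_Runner actually does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class InvokeAllSketch {
    // Runs n placeholder tasks (task i returns i * i) and sums their results.
    static int sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final int value = i;
                tasks.add(() -> value * value);
            }
            int sum = 0;
            // invokeAll returns only after every task has completed,
            // so no CountDownLatch is needed.
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                sum += future.get();
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling InvokeAllSketch.sumOfSquares(10) returns 385.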

How can I get a RejectedExecutionException

Anybody able to provide me with an example of getting a RejectedExecutionException
Possibly a real life example.
Thanks in advance.
Anybody able to provide me with an example of getting a RejectedExecutionException Possibly a real life example.
Sure. The following code submits 2 jobs into a thread-pool with only 1 thread running. It uses a SynchronousQueue which means that no jobs will be stored in the job queue.
Since each job takes a while to run, the second execute finds the only thread busy, cannot hand the job over to the zero-capacity queue, and throws a RejectedExecutionException.
// create a 1 thread pool with no buffer for the runnable jobs
ExecutorService threadPool =
new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
new SynchronousQueue<Runnable>());
// submit 2 jobs that take a while to run
// this job takes the only thread
threadPool.execute(new SleepRunnable());
// this tries to put the job into the queue, throws RejectedExecutionException
threadPool.execute(new SleepRunnable());
public class SleepRunnable implements Runnable {
public void run() {
try {
// this just sleeps for a while which pauses the thread
Thread.sleep(10000);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
return;
}
}
}
Sending tasks to an executor after calling shutdown() on it will throw this exception.
In addition, if the executor uses a bounded blocking queue and the queue is full, submitting a task will not block but will fail fast with the exception.
This question has already been asked and answered:
What could be the cause of RejectedExecutionException
Submitting tasks to a thread-pool gives RejectedExecutionException
This code gives you the error because we try to launch a task after the executor has been shut down. You can refer to the links above for further explanation; the answers there look pretty complete:
public class Executorz {
public static void main(String[] args) {
Executorz ex = new Executorz();
ExecutorService es = Executors.newFixedThreadPool(10);
for (int i = 0; i<100 ; i++){
System.out.println("Executed");
es.execute(ex.getNewCountin());
if (i==20)
es.shutdown();
}
}
public Countin getNewCountin(){
return new Countin();
}
public class Countin implements Runnable {
@Override
public void run() {
for (double i =0; i<1000000000 ; i++){
}
System.out.println("Done");
}
}
}

Waiting list of threads in java

I'm writing a swing application with HttpClient and I need a way to make a download list because I need to wait 1 minute (for example) before starting a new download.
So I would like to create a waiting list of threads (downloads).
I would have a class that takes a time parameter and contains a list of threads and when I add a thread in the list it starts if there is no running thread. Otherwise it waits for its turn.
Is there any tool to do that ?
Thanks a lot for your help.
Yes: ScheduledExecutorService. You can create a fixed-size service via Executors.newScheduledThreadPool(corePoolSize). When you are ready to submit a task that should wait for the given amount of time, just submit it to ScheduledExecutorService.schedule:
ScheduledExecutorService e = Executors.newScheduledThreadPool(10);
private final long defaultWaitTimeInMinutes = 1;
public void submitTaskToWait(Runnable r){
e.schedule(r, defaultWaitTimeInMinutes, TimeUnit.MINUTES);
}
Here the task will launch 1 minute after it is submitted. To address your last point: if downloads are already in progress (with this configuration, 10 tasks being downloaded) when the 1 minute is up, the submitted Runnable will have to wait until one of the other downloads completes.
Keep in mind that this deviates a bit from the way you are designing it. You wouldn't create a new thread for each new task; rather, you would submit it to a service that already has thread(s) waiting. For instance, if you only want one task to download at a time, you change from Executors.newScheduledThreadPool(10) to Executors.newScheduledThreadPool(1).
Edit: I'll leave my previous answer but update it with a solution for submitting a task that starts exactly 1 minute after the previous task completes. You would use two ExecutorServices: one to submit to the scheduled executor and the other to do the timed executions. The first executor waits for completion and then continues with the other queued tasks.
ExecutorService e = Executors.newSingleThreadExecutor();
ScheduledExecutorService scheduledService = Executors.newScheduledThreadPool(1);
public void submitTask(final Runnable r){
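e.submit(new Runnable(){
public void run(){
ScheduledFuture<?> future = scheduledService.schedule(r, defaultWaitTimeInMinutes, TimeUnit.MINUTES);
try {
future.get(); // block until the delayed task has finished
} catch (InterruptedException ex) {
Thread.currentThread().interrupt(); // preserve interrupt status
} catch (ExecutionException ex) {
ex.printStackTrace(); // the task itself threw
}
}
});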
}
Now, when future.get() completes, the next Runnable submitted through submitTask will be picked up and scheduled to run a minute later. Note that this only fits if you require every task to wait the 1 minute even when no other tasks have been submitted.
I think this would be the wrong way of going about the problem. A more logical way would be to create "download job" objects which are added to a job queue. Create a TimerTask which queries this queue every minute, picks up the Runnable/Callable jobs and submits them to the ExecutorService.
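A possible shape for that job queue, as a sketch only: I have swapped the TimerTask for a single-threaded ScheduledExecutorService with scheduleWithFixedDelay, and the job runs on the ticker thread itself instead of being handed to a second ExecutorService. The class name and the demo() helper are hypothetical, and the 20 ms period in demo() is only there to keep the example quick (the real case would use 1 minute).

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class PacedJobQueueSketch {
    private final Queue<Runnable> jobs = new ConcurrentLinkedQueue<>();
    private final ScheduledExecutorService ticker =
            Executors.newSingleThreadScheduledExecutor();

    PacedJobQueueSketch(long pause, TimeUnit unit) {
        // On every tick run at most one queued job. The delay is measured from
        // the end of the previous tick, so jobs are at least `pause` apart.
        ticker.scheduleWithFixedDelay(() -> {
            Runnable job = jobs.poll();
            if (job != null) {
                try {
                    job.run();
                } catch (RuntimeException e) {
                    // An escaping exception would cancel all further ticks.
                    e.printStackTrace();
                }
            }
        }, 0, pause, unit);
    }

    void add(Runnable job) {
        jobs.add(job);
    }

    void stop() {
        ticker.shutdownNow();
    }

    // Queues three jobs and returns the order in which they actually ran.
    static List<Integer> demo() {
        List<Integer> order = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(3);
        PacedJobQueueSketch queue = new PacedJobQueueSketch(20, TimeUnit.MILLISECONDS);
        for (int i = 1; i <= 3; i++) {
            final int id = i;
            queue.add(() -> {
                order.add(id);
                done.countDown();
            });
        }
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        queue.stop();
        return order;
    }
}
```

Jobs run in submission order, one per tick, and a tick that finds the queue empty does nothing.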
You could use the built-in ExecutorService. You can queue up tasks as Runnables and they will run on the available threads. If you want only a single task to run at a time use newFixedThreadPool(1);
ExecutorService executor = Executors.newFixedThreadPool(1);
You could then append an artificial Thread.sleep at the beginning of each Runnable run method to ensure that it waits the necessary amount of time before starting (not the most elegant choice, I know).
The Java concurrency package contains classes for doing what you ask. The general construct you're talking about is an Executor which is backed by a thread pool. You generate a list of Runnables and send them to an Executor. The Executor has a thread pool behind it which will run the Runnables as the threads become available.
So as an example here, you could have a Runnable like:
private static class Downloader implements Runnable {
private String file;
public Downloader(String file) {
this.file = file;
}
@Override
public void run() {
// Use HttpClient to download file.
}
}
Then you can use it by creating Downloader objects and submitting them to an ExecutorService:
public static void main(String[] args) throws Exception {
ExecutorService executorService = Executors.newFixedThreadPool(5);
for (String file : args) {
executorService.submit(new Downloader(file));
}
executorService.shutdown(); // stop accepting new tasks so that awaitTermination can return early
executorService.awaitTermination(100, TimeUnit.SECONDS);
}
It is maybe not the best solution but here is what I came up with thanks to the answer of John Vint. I hope it will help someone else.
package tests;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
public class RunnableQueue
{
private long waitTime;
private TimeUnit unit;
ExecutorService e;
public RunnableQueue(long waitTime, TimeUnit unit) {
e = Executors.newSingleThreadExecutor();
this.waitTime = waitTime;
this.unit = unit;
}
public void submitTask(final Runnable r){
e.submit(new Runnable(){
public void run(){
Thread t = new Thread(r);
t.start();
try {
t.join();
Thread.sleep(unit.toMillis(waitTime));
} catch (InterruptedException e) {
e.printStackTrace();
}
}
});
}
public static void main(String[] args) {
RunnableQueue runQueue = new RunnableQueue(3, TimeUnit.SECONDS);
for(int i=1; i<11; i++)
{
runQueue.submitTask(new DownloadTask(i));
System.out.println("Submitted task " + i);
}
}
}

How to make ThreadPoolExecutor's submit() method block if it is saturated?

I want to create a ThreadPoolExecutor such that when it has reached its maximum size and the queue is full, the submit() method blocks when trying to add new tasks. Do I need to implement a custom RejectedExecutionHandler for that or is there an existing way to do this using a standard Java library?
One of the possible solutions I've just found:
public class BoundedExecutor {
private final Executor exec;
private final Semaphore semaphore;
public BoundedExecutor(Executor exec, int bound) {
this.exec = exec;
this.semaphore = new Semaphore(bound);
}
public void submitTask(final Runnable command)
throws InterruptedException, RejectedExecutionException {
semaphore.acquire();
try {
exec.execute(new Runnable() {
public void run() {
try {
command.run();
} finally {
semaphore.release();
}
}
});
} catch (RejectedExecutionException e) {
semaphore.release();
throw e;
}
}
}
Are there any other solutions? I'd prefer something based on RejectedExecutionHandler since it seems like a standard way to handle such situations.
You can use ThreadPoolExecutor and a blockingQueue:
public class ImageManager {
BlockingQueue<Runnable> blockingQueue = new ArrayBlockingQueue<Runnable>(blockQueueSize);
RejectedExecutionHandler rejectedExecutionHandler = new ThreadPoolExecutor.CallerRunsPolicy();
private ExecutorService executorService = new ThreadPoolExecutor(numOfThread, numOfThread,
0L, TimeUnit.MILLISECONDS, blockingQueue, rejectedExecutionHandler);
private void downloadThumbnail(String fileListPath){
executorService.submit(new yourRunnable());
}
}
You should use the CallerRunsPolicy, which executes the rejected task in the calling thread. This way, it can't submit any new tasks to the executor until that task is done, at which point there will be some free pool threads or the process will repeat.
http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/ThreadPoolExecutor.CallerRunsPolicy.html
From the docs:
Rejected tasks
New tasks submitted in method execute(java.lang.Runnable) will be rejected when the Executor has been shut down, and also when the Executor uses finite bounds for both maximum threads and work queue capacity, and is saturated. In either case, the execute method invokes the RejectedExecutionHandler.rejectedExecution(java.lang.Runnable, java.util.concurrent.ThreadPoolExecutor) method of its RejectedExecutionHandler. Four predefined handler policies are provided:
1. In the default ThreadPoolExecutor.AbortPolicy, the handler throws a runtime RejectedExecutionException upon rejection.
2. In ThreadPoolExecutor.CallerRunsPolicy, the thread that invokes execute itself runs the task. This provides a simple feedback control mechanism that will slow down the rate that new tasks are submitted.
3. In ThreadPoolExecutor.DiscardPolicy, a task that cannot be executed is simply dropped.
4. In ThreadPoolExecutor.DiscardOldestPolicy, if the executor is not shut down, the task at the head of the work queue is dropped, and then execution is retried (which can fail again, causing this to be repeated.)
Also, make sure to use a bounded queue, such as ArrayBlockingQueue, when calling the ThreadPoolExecutor constructor. Otherwise, nothing will get rejected.
Edit: in response to your comment, set the size of the ArrayBlockingQueue to be equal to the max size of the thread pool and use the AbortPolicy.
Edit 2: Ok, I see what you're getting at. What about this: override the beforeExecute() method to check that getActiveCount() doesn't exceed getMaximumPoolSize(), and if it does, sleep and try again?
I know it is a hack, but in my opinion it is the cleanest hack among those offered here ;-)
Because ThreadPoolExecutor calls the blocking queue's offer() instead of put(), let's override the behaviour of offer():
class BlockingQueueHack<T> extends ArrayBlockingQueue<T> {
BlockingQueueHack(int size) {
super(size);
}
public boolean offer(T task) {
try {
this.put(task);
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
return true;
}
}
ThreadPoolExecutor tp = new ThreadPoolExecutor(1, 2, 1, TimeUnit.MINUTES, new BlockingQueueHack<Runnable>(5));
I tested it and it seems to work.
Implementing some timeout policy is left as a reader's exercise.
Hibernate has a BlockPolicy that is simple and may do what you want:
See: Executors.java
/**
* A handler for rejected tasks that will have the caller block until
* space is available.
*/
public static class BlockPolicy implements RejectedExecutionHandler {
/**
* Creates a <tt>BlockPolicy</tt>.
*/
public BlockPolicy() { }
/**
* Puts the Runnable to the blocking queue, effectively blocking
* the delegating thread until space is available.
* @param r the runnable task requested to be executed
* @param e the executor attempting to execute this task
*/
public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
try {
e.getQueue().put( r );
}
catch (InterruptedException e1) {
log.error( "Work discarded, thread was interrupted while waiting for space to schedule: {}", r );
}
}
}
The BoundedExecutor answer quoted above from Java Concurrency in Practice only works correctly if you use an unbounded queue for the Executor, or the semaphore bound is no greater than the queue size. The semaphore is state shared between the submitting thread and the threads in the pool, making it possible to saturate the executor even if queue size < bound <= (queue size + pool size).
Using CallerRunsPolicy is only valid if your tasks don't run forever; otherwise your submitting thread will remain in rejectedExecution forever. It is also a bad idea if your tasks take a long time to run, because the submitting thread can't submit any new tasks or do anything else while it is running a task itself.
If that's not acceptable then I suggest checking the size of the executor's bounded queue before submitting a task. If the queue is full, then wait a short time before trying to submit again. The throughput will suffer, but I suggest it's a simpler solution than many of the other proposed solutions and you're guaranteed no tasks will get rejected.
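A sketch of that check-then-submit idea follows. Note the assumption baked in: it is only free of rejections when a single thread does all the submitting (with several submitters, another thread could fill the queue between the check and the execute call). The class name, the 10 ms back-off and the demo() helper are my own placeholders.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class WaitForRoomSketch {
    // Assumes ONE submitting thread: workers only ever remove items from the
    // queue, so free space seen here is still free when execute() runs.
    static void submitWhenRoom(ThreadPoolExecutor pool, Runnable task)
            throws InterruptedException {
        while (pool.getQueue().remainingCapacity() == 0) {
            Thread.sleep(10); // queue is full: back off briefly, then look again
        }
        pool.execute(task);
    }

    // Pushes 20 short tasks through a 2-thread pool with a queue of size 2
    // and returns how many of them completed.
    static int demo() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(2));
        AtomicInteger completed = new AtomicInteger();
        try {
            for (int i = 0; i < 20; i++) {
                submitWhenRoom(pool, () -> {
                    try {
                        Thread.sleep(5); // simulate a little work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    completed.incrementAndGet();
                });
            }
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }
}
```

The polling interval is the throughput cost mentioned above: a worker can sit idle for up to one back-off period before the submitter notices the free slot.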
The following class wraps around a ThreadPoolExecutor and uses a Semaphore to block when the work queue is full:
public final class BlockingExecutor {
private final Executor executor;
private final Semaphore semaphore;
public BlockingExecutor(int queueSize, int corePoolSize, int maxPoolSize, int keepAliveTime, TimeUnit unit, ThreadFactory factory) {
BlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>();
this.executor = new ThreadPoolExecutor(corePoolSize, maxPoolSize, keepAliveTime, unit, queue, factory);
this.semaphore = new Semaphore(queueSize + maxPoolSize);
}
private void execImpl (final Runnable command) throws InterruptedException {
semaphore.acquire();
try {
executor.execute(new Runnable() {
@Override
public void run() {
try {
command.run();
} finally {
semaphore.release();
}
}
});
} catch (RejectedExecutionException e) {
// will never be thrown with an unbounded buffer (LinkedBlockingQueue)
semaphore.release();
throw e;
}
}
public void execute (Runnable command) throws InterruptedException {
execImpl(command);
}
}
This wrapper class is based on a solution given in the book Java Concurrency in Practice by Brian Goetz. The solution in the book only takes two constructor parameters: an Executor and a bound used for the semaphore. This is shown in the answer given by Fixpoint. There is a problem with that approach: it can get in a state where the pool threads are busy, the queue is full, but the semaphore has just released a permit. (semaphore.release() in the finally block). In this state, a new task can grab the just released permit, but is rejected because the task queue is full. Of course this is not something you want; you want to block in this case.
To solve this, we must use an unbounded queue, as JCiP clearly mentions. The semaphore acts as a guard, giving the effect of a virtual queue size. This has the side effect that it is possible that the unit can contain maxPoolSize + virtualQueueSize + maxPoolSize tasks. Why is that? Because of the semaphore.release() in the finally block. If all pool threads call this statement at the same time, then maxPoolSize permits are released, allowing the same number of tasks to enter the unit. If we were using a bounded queue, it would still be full, resulting in a rejected task. Now, because we know that this only occurs when a pool thread is almost done, this is not a problem. We know that the pool thread will not block, so a task will soon be taken from the queue.
You are able to use a bounded queue though. Just make sure that its size equals virtualQueueSize + maxPoolSize. Greater sizes are useless, because the semaphore will not let more items in. Smaller sizes will result in rejected tasks, and the chance of tasks getting rejected increases as the size decreases. For example, say you want a bounded executor with maxPoolSize=2 and virtualQueueSize=5. Then take a semaphore with 5+2=7 permits and an actual queue size of 5+2=7. The real number of tasks that can be in the unit is then 2+5+2=9. When the executor is full (5 tasks in queue, 2 in thread pool, so 0 permits available) and ALL pool threads release their permits, then exactly 2 permits can be taken by tasks coming in.
Now the solution from JCiP is somewhat cumbersome to use as it doesn't enforce all these constraints (unbounded queue, or bounded with those math restrictions, etc.). I think that this only serves as a good example to demonstrate how you can build new thread safe classes based on the parts that are already available, but not as a full-grown, reusable class. I don't think that the latter was the author's intention.
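For what it's worth, the bounded-queue variant with the sizing rule above (queue size = semaphore permits = virtualQueueSize + maxPoolSize) might be sketched like this. The class name and demo() are hypothetical, and I assume corePoolSize equals maxPoolSize to keep the reasoning simple.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SizedBoundedExecutorSketch {
    private final ThreadPoolExecutor pool;
    private final Semaphore permits;

    SizedBoundedExecutorSketch(int poolSize, int virtualQueueSize) {
        int bound = virtualQueueSize + poolSize; // e.g. 5 + 2 = 7
        this.permits = new Semaphore(bound);
        // The real queue is as large as the permit count. Every queued task
        // holds a permit, so the queue can never be full when offer() is called.
        this.pool = new ThreadPoolExecutor(poolSize, poolSize, 0L,
                TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(bound));
    }

    // Blocks the caller while all permits are taken.
    void execute(Runnable command) throws InterruptedException {
        permits.acquire();
        try {
            pool.execute(() -> {
                try {
                    command.run();
                } finally {
                    permits.release();
                }
            });
        } catch (RejectedExecutionException e) {
            permits.release(); // only expected after shutdown
            throw e;
        }
    }

    // Submits 50 tasks with maxPoolSize=2 and virtualQueueSize=5 and
    // returns how many of them ran.
    static int demo() {
        SizedBoundedExecutorSketch exec = new SizedBoundedExecutorSketch(2, 5);
        AtomicInteger completed = new AtomicInteger();
        try {
            for (int i = 0; i < 50; i++) {
                exec.execute(completed::incrementAndGet);
            }
            exec.pool.shutdown();
            exec.pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            exec.pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }
}
```

With these numbers the semaphore has 7 permits and the queue holds 7 entries, so a submitter either blocks in acquire() or gets its task accepted.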
You can use a custom RejectedExecutionHandler like this:
ThreadPoolExecutor tp= new ThreadPoolExecutor(core_size, // core size
max_handlers, // max size
timeout_in_seconds, // idle timeout
TimeUnit.SECONDS, queue, new RejectedExecutionHandler() {
public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
// This will block if the queue is full
try {
executor.getQueue().put(r);
} catch (InterruptedException e) {
System.err.println(e.getMessage());
}
}
});
I don't always like the CallerRunsPolicy, especially since it allows the rejected task to 'skip the queue' and get executed before tasks that were submitted earlier. Moreover, executing the task on the calling thread might take much longer than waiting for the first slot to become available.
I solved this problem using a custom RejectedExecutionHandler, which simply blocks the calling thread for a little while and then tries to submit the task again:
public class BlockWhenQueueFull implements RejectedExecutionHandler {
public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
// The pool is full. Wait, then try again.
try {
long waitMs = 250;
Thread.sleep(waitMs);
} catch (InterruptedException interruptedException) {}
executor.execute(r);
}
}
This class can be used in the thread-pool executor as a RejectedExecutionHandler like any other, for example:
executorPool = new ThreadPoolExecutor(1, 1, 10,
TimeUnit.SECONDS, new SynchronousQueue<Runnable>(),
new BlockWhenQueueFull());
The only downside I see is that the calling thread might get locked slightly longer than strictly necessary (up to 250ms). Moreover, since this executor is effectively being called recursively, very long waits for a thread to become available (hours) might result in a stack overflow.
Nevertheless, I personally like this method. It's compact, easy to understand, and works well.
Create your own blocking queue to be used by the Executor, with the blocking behavior you are looking for, while always returning available remaining capacity (ensuring the executor will not try to create more threads than its core pool, or trigger the rejection handler).
I believe this will get you the blocking behavior you are looking for. A rejection handler will never fit the bill, since that indicates the executor can not perform the task. What I could envision there is that you get some form of 'busy waiting' in the handler. That is not what you want, you want a queue for the executor that blocks the caller...
To avoid the issues with @FixPoint's solution, one could use a ListeningExecutorService and release the semaphore in onSuccess and onFailure inside a FutureCallback.
Recently I found this question having the same problem. The OP does not say so explicitly, but we do not want to use the RejectedExecutionHandler which executes a task on the submitter's thread, because this will under-utilize the worker threads if this task is a long running one.
Reading all the answers and comments, in particular the flawed solution with the semaphore or using afterExecute I had a closer look at the code of the ThreadPoolExecutor to see if there is some way out. I was amazed to see that there are more than 2000 lines of (commented) code, some of which make me feel dizzy. Given the rather simple requirement I actually have --- one producer, several consumers, let the producer block when no consumers can take work --- I decided to roll my own solution. It is not an ExecutorService but just an Executor. And it does not adapt the number of threads to the work load, but holds a fixed number of threads only, which also fits my requirements. Here is the code. Feel free to rant about it :-)
package x;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
/**
* distributes {@code Runnable}s to a fixed number of threads. To keep the
* code lean, this is not an {@code ExecutorService}. In particular there is
* only very simple support to shut this executor down.
*/
public class ParallelExecutor implements Executor {
// other bounded queues work as well and are useful to buffer peak loads
private final BlockingQueue<Runnable> workQueue =
new SynchronousQueue<Runnable>();
private final Thread[] threads;
/*+**********************************************************************/
/**
* creates the requested number of threads and starts them to wait for
* incoming work
*/
public ParallelExecutor(int numThreads) {
this.threads = new Thread[numThreads];
for(int i=0; i<numThreads; i++) {
// could reuse the same Runner all over, but keep it simple
Thread t = new Thread(new Runner());
this.threads[i] = t;
t.start();
}
}
/*+**********************************************************************/
/**
* returns immediately without waiting for the task to be finished, but may
* block if all worker threads are busy.
*
* @throws RejectedExecutionException if we got interrupted while waiting
* for a free worker
*/
@Override
public void execute(Runnable task) {
try {
workQueue.put(task);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
throw new RejectedExecutionException("interrupt while waiting for a free "
+ "worker.", e);
}
}
/*+**********************************************************************/
/**
* Interrupts all workers and joins them. Tasks susceptible to an interrupt
* will preempt their work. Blocks until the last thread surrendered.
*/
public void interruptAndJoinAll() throws InterruptedException {
for(Thread t : threads) {
t.interrupt();
}
for(Thread t : threads) {
t.join();
}
}
/*+**********************************************************************/
private final class Runner implements Runnable {
@Override
public void run() {
while (!Thread.currentThread().isInterrupted()) {
Runnable task;
try {
task = workQueue.take();
} catch (InterruptedException e) {
// canonical handling despite exiting right away
Thread.currentThread().interrupt();
return;
}
try {
task.run();
} catch (RuntimeException e) {
// production code to use a logging framework
e.printStackTrace();
}
}
}
}
}
I believe there is a quite elegant way to solve this problem: use java.util.concurrent.Semaphore and delegate to the executor returned by Executors.newFixedThreadPool.
The new executor service only accepts a new task when there is a thread free to run it. Blocking is managed by a Semaphore whose number of permits equals the number of threads. When a task finishes, it returns its permit.
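import java.util.List;
import java.util.concurrent.AbstractExecutorService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
public class FixedThreadBlockingExecutorService extends AbstractExecutorService {
private final ExecutorService executor;
private final Semaphore blockExecution;
public FixedThreadBlockingExecutorService(int nThreads) {
this.executor = Executors.newFixedThreadPool(nThreads);
blockExecution = new Semaphore(nThreads);
}
@Override
public void shutdown() {
executor.shutdown();
}
@Override
public List<Runnable> shutdownNow() {
return executor.shutdownNow();
}
@Override
public boolean isShutdown() {
return executor.isShutdown();
}
@Override
public boolean isTerminated() {
return executor.isTerminated();
}
@Override
public boolean awaitTermination(long timeout, TimeUnit unit) throws InterruptedException {
return executor.awaitTermination(timeout, unit);
}
@Override
public void execute(Runnable command) {
// blocks the caller until one of the nThreads permits is free
blockExecution.acquireUninterruptibly();
try {
executor.execute(() -> {
try {
command.run();
} finally {
blockExecution.release();
}
});
} catch (RejectedExecutionException e) {
// the task will never run, so hand the permit back
blockExecution.release();
throw e;
}
}
}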
I had the same need in the past: a kind of blocking queue with a fixed size for each client backed by a shared thread pool. I ended up writing my own kind of ThreadPoolExecutor:
UserThreadPoolExecutor
(blocking queue (per client) + threadpool (shared amongst all clients))
See: https://github.com/d4rxh4wx/UserThreadPoolExecutor
Each UserThreadPoolExecutor is given a maximum number of threads from a shared ThreadPoolExecutor
Each UserThreadPoolExecutor can:
submit a task to the shared thread pool executor if its quota is not reached. If its quota is reached, the job is queued (non-consumptive blocking waiting for CPU). Once one of its submitted task is completed, the quota is decremented, allowing another task waiting to be submitted to the ThreadPoolExecutor
wait for the remaining tasks to complete
I found this rejection policy in the Elasticsearch client. It blocks the caller thread on the blocking queue. Code below:
static class ForceQueuePolicy implements XRejectedExecutionHandler
{
public void rejectedExecution(Runnable r, ThreadPoolExecutor executor)
{
try
{
executor.getQueue().put(r);
}
catch (InterruptedException e)
{
//should never happen since we never wait
throw new EsRejectedExecutionException(e);
}
}
@Override
public long rejected()
{
return 0;
}
}
I recently had a need to achieve something similar, but on a ScheduledExecutorService.
I also had to account for the delay passed to the method, so that the task either gets scheduled to run at the time the caller expects or fails with a RejectedExecutionException.
The other ScheduledThreadPoolExecutor methods that execute or submit a task call schedule internally, so they still go through the overridden methods.
import java.util.concurrent.*;
public class BlockingScheduler extends ScheduledThreadPoolExecutor {
private final Semaphore maxQueueSize;
public BlockingScheduler(int corePoolSize,
ThreadFactory threadFactory,
int maxQueueSize) {
super(corePoolSize, threadFactory, new AbortPolicy());
this.maxQueueSize = new Semaphore(maxQueueSize);
}
@Override
public ScheduledFuture<?> schedule(Runnable command,
long delay,
TimeUnit unit) {
final long newDelayInMs = beforeSchedule(command, unit.toMillis(delay));
return super.schedule(command, newDelayInMs, TimeUnit.MILLISECONDS);
}
@Override
public <V> ScheduledFuture<V> schedule(Callable<V> callable,
long delay,
TimeUnit unit) {
final long newDelayInMs = beforeSchedule(callable, unit.toMillis(delay));
return super.schedule(callable, newDelayInMs, TimeUnit.MILLISECONDS);
}
@Override
public ScheduledFuture<?> scheduleAtFixedRate(Runnable command,
long initialDelay,
long period,
TimeUnit unit) {
final long newDelayInMs = beforeSchedule(command, unit.toMillis(initialDelay));
return super.scheduleAtFixedRate(command, newDelayInMs, unit.toMillis(period), TimeUnit.MILLISECONDS);
}
@Override
public ScheduledFuture<?> scheduleWithFixedDelay(Runnable command,
long initialDelay,
long period,
TimeUnit unit) {
final long newDelayInMs = beforeSchedule(command, unit.toMillis(initialDelay));
return super.scheduleWithFixedDelay(command, newDelayInMs, unit.toMillis(period), TimeUnit.MILLISECONDS);
}
@Override
protected void afterExecute(Runnable runnable, Throwable t) {
super.afterExecute(runnable, t);
try {
if (t == null && runnable instanceof Future<?>) {
try {
((Future<?>) runnable).get();
} catch (CancellationException | ExecutionException e) {
t = e;
} catch (InterruptedException ie) {
Thread.currentThread().interrupt(); // ignore/reset
}
}
if (t != null) {
System.err.println(t);
}
} finally {
releaseQueueUsage();
}
}
private long beforeSchedule(Runnable runnable, long delay) {
try {
return getQueuePermitAndModifiedDelay(delay);
} catch (InterruptedException e) {
getRejectedExecutionHandler().rejectedExecution(runnable, this);
return 0;
}
}
private long beforeSchedule(Callable callable, long delay) {
try {
return getQueuePermitAndModifiedDelay(delay);
} catch (InterruptedException e) {
getRejectedExecutionHandler().rejectedExecution(new FutureTask(callable), this);
return 0;
}
}
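private long getQueuePermitAndModifiedDelay(long delay) throws InterruptedException {
final long beforeAcquireTimeStamp = System.currentTimeMillis();
// wait at most the requested delay for a queue slot; otherwise the task cannot run on time
if (!maxQueueSize.tryAcquire(delay, TimeUnit.MILLISECONDS)) {
throw new RejectedExecutionException("no queue permit available within " + delay + " ms");
}
final long waitedMs = System.currentTimeMillis() - beforeAcquireTimeStamp;
// schedule with whatever is left of the requested delay
return Math.max(0, delay - waitedMs);
}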
private void releaseQueueUsage() {
maxQueueSize.release();
}
}
I have the code here, will appreciate any feedback.
https://github.com/AmitabhAwasthi/BlockingScheduler

Is it a good way to use java.util.concurrent.FutureTask?

First of all, I must say that I am quite new to the API java.util.concurrent, so maybe what I am doing is completely wrong.
What do I want to do?
I have a Java application that basically runs two separate processes (called myFirstProcess and mySecondProcess), and they must run at the same time.
So, I tried to do that:
public void startMyApplication() {
ExecutorService executor = Executors.newFixedThreadPool(2);
FutureTask<Object> futureOne = new FutureTask<Object>(myFirstProcess);
FutureTask<Object> futureTwo = new FutureTask<Object>(mySecondProcess);
executor.execute(futureOne);
executor.execute(futureTwo);
while (!(futureOne.isDone() && futureTwo.isDone())) {
try {
// I wait until both processes are finished.
Thread.sleep(1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
logger.info("Processing finished");
executor.shutdown();
// Do some processing on results
...
}
myFirstProcess and mySecondProcess are instances of classes that implement Callable<Object>, and all their processing is done in the call() method.
It works quite well, but I am not sure it is the correct way to do it.
Is this a good way to do what I want? If not, can you give me some hints to improve my code (while keeping it as simple as possible)?
You'd be better off using the get() method.
futureOne.get();
futureTwo.get();
Both calls wait for notification from the worker thread that it has finished processing. This saves you the busy-wait-with-timer you are using now, which is neither efficient nor elegant.
As a bonus, there is get(long timeout, TimeUnit unit), which lets you set a maximum time to wait for a result; if the timeout elapses first, it throws a TimeoutException so the caller can carry on.
See the Java API for more info.
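A minimal sketch of the get() approach, assuming the two processes are passed in as Callable<String> instances. The class name GetWithTimeoutDemo, the method runBoth and the String results are made up for illustration and are not from the question.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class GetWithTimeoutDemo {
    /** Runs both callables in parallel and joins their results with a comma. */
    static String runBoth(Callable<String> first, Callable<String> second, long timeoutMs) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            Future<String> futureOne = executor.submit(first);
            Future<String> futureTwo = executor.submit(second);
            // get() blocks until the result is ready, so no sleep loop is needed
            String one = futureOne.get(timeoutMs, TimeUnit.MILLISECONDS);
            String two = futureTwo.get(timeoutMs, TimeUnit.MILLISECONDS);
            return one + "," + two;
        } catch (TimeoutException e) {
            return "timeout"; // a task took longer than timeoutMs
        } catch (ExecutionException e) {
            return "failed: " + e.getCause(); // a task threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            executor.shutdownNow(); // also interrupts a task that is still running
        }
    }
}
```

Note that each get() gets its own timeout here, so the total wait can be up to twice timeoutMs.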
The uses of FutureTask above are tolerable, but definitely not idiomatic. You're actually wrapping an extra FutureTask around the one you submitted to the ExecutorService. Your FutureTask is treated as a Runnable by the ExecutorService. Internally, it wraps your FutureTask-as-Runnable in a new FutureTask and returns it to you as a Future<?>.
Instead, you should submit your Callable<Object> instances to a CompletionService. You drop two Callables in via submit(Callable<V>), then turn around and call CompletionService#take() twice (once for each submitted Callable). Those calls will block until one and then the other submitted tasks are complete.
Given that you already have an Executor in hand, construct a new ExecutorCompletionService around it and drop your tasks in there. Don't spin and sleep waiting; CompletionService#take() will block until either one of your tasks is complete (finished running or canceled) or the thread waiting on take() is interrupted.
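The CompletionService approach could look roughly like this. It is only a sketch: CompletionServiceDemo and runInCompletionOrder are hypothetical names, and the tasks are assumed to be Callable<String>.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CompletionServiceDemo {
    /** Runs the tasks in parallel and returns their results in completion order. */
    static List<String> runInCompletionOrder(List<Callable<String>> tasks) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        CompletionService<String> completion = new ExecutorCompletionService<>(executor);
        List<String> results = new ArrayList<>();
        try {
            for (Callable<String> task : tasks) {
                completion.submit(task);
            }
            // one take() per submitted task; each call blocks until some task is done
            for (int i = 0; i < tasks.size(); i++) {
                results.add(completion.take().get());
            }
            return results;
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Because take() hands back futures as they complete, a fast task's result is available before a slow one's, regardless of submission order.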
Yuval's solution is fine. As an alternative you can also do this:
ExecutorService executor = Executors.newFixedThreadPool(2);
FutureTask<Object> futureOne = new FutureTask<Object>(myFirstProcess);
FutureTask<Object> futureTwo = new FutureTask<Object>(mySecondProcess);
executor.execute(futureOne);
executor.execute(futureTwo);
executor.shutdown();
try {
executor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
} catch (InterruptedException e) {
// interrupted
}
What is the advantage of this approach? There's not a lot of difference really except that this way you stop the executor accepting any more tasks (you can do that the other way too). I tend to prefer this idiom to that one though.
Also, if either get() throws an exception you may end up in a part of your code that assumes both tasks are done, which might be bad.
You can use the invokeAll(Collection) method:
package concurrent.threadPool;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
public class InvokeAll {
public static void main(String[] args) throws Exception {
ExecutorService service = Executors.newFixedThreadPool(5);
List<Future<String>> futureList = service.invokeAll(Arrays.<Callable<String>>asList(new Task1(), new Task2()));
// Task2 fails, so this get() throws an ExecutionException wrapping its RuntimeException
System.out.println(futureList.get(1).get());
System.out.println(futureList.get(0).get());
}
private static class Task1 implements Callable<String> {
@Override
public String call() throws Exception {
Thread.sleep(1000 * 10);
return "1000 * 5";
}
}
private static class Task2 implements Callable<String> {
@Override
public String call() throws Exception {
Thread.sleep(1000 * 2);
int i=3;
if(i==3)
throw new RuntimeException("Its Wrong");
return "1000 * 2";
}
}
}
You may want to use a CyclicBarrier if you are interested in starting the threads at the same time, or waiting for them to finish and then do some further processing.
See the javadoc for more information.
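A rough sketch of the barrier idea, assuming plain threads and a trivial stand-in for the real processing. BarrierDemo and runTogether are hypothetical names. The same CyclicBarrier is used twice: once to release all workers together and once to wait until they have all finished.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

class BarrierDemo {
    /** Starts {@code workers} threads together and returns once all have finished. */
    static int runTogether(int workers) {
        // one extra party for the coordinating (calling) thread
        CyclicBarrier barrier = new CyclicBarrier(workers + 1);
        AtomicInteger finished = new AtomicInteger();
        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                try {
                    barrier.await();            // wait for the common start
                    finished.incrementAndGet(); // stand-in for the real processing
                    barrier.await();            // signal completion
                } catch (InterruptedException | BrokenBarrierException e) {
                    // interrupted or barrier broken: give up and let the thread end
                }
            }).start();
        }
        try {
            barrier.await(); // trips the barrier and releases all workers at once
            barrier.await(); // trips again once every worker has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }
        return finished.get();
    }
}
```

A CyclicBarrier resets itself after each trip, which is why one instance can serve as both the start and the finish line here.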
If you have more than two FutureTasks, consider Guava's ListenableFuture:
When several operations should begin as soon as another operation
starts -- "fan-out" -- ListenableFuture just works: it triggers all of
the requested callbacks. With slightly more work, we can "fan-in," or
trigger a ListenableFuture to get computed as soon as several other
futures have all finished.
