ExecutorService SingleThreadExecutor - java

I have a list of objects, some of which need to do work asynchronously depending on user interaction. Something like this:
for(TheObject o : this.listOfObjects) {
o.doWork();
}
The class TheObject owns an ExecutorService (single thread!), which is used to do the work. Every object of type TheObject instantiates its own ExecutorService. I don't want to make lasagna code. I don't have enough objects alive at the same time to justify an extra abstraction layer with thread pooling.
I want to cite the Java Documentation about CachedThreadPools:
Threads that have not been used for sixty seconds are terminated and
removed from the cache. Thus, a pool that remains idle for long enough
will not consume any resources.
First question: Is this also true for a SingleThreadExecutor? Does the thread get terminated? JavaDoc doesn't say anything about SingleThreadExecutor. It wouldn't even matter in this application, as I have an amount of objects I can count on one hand. Just curiosity.
Furthermore, the doWork() method of TheObject needs to call ExecutorService#submit() to do the work asynchronously. Is it possible (I bet it is) to call the doWork() method implicitly? Is this a viable way of designing an async method?
void doWork() {
if(!isRunningAsync) {
myExecutor.submit(() -> doWork()); // pass the call as a task; submit(doWork()) does not compile because doWork() returns void
} else {
// Do Work...
}
}

First question: Is this also true for a SingleThreadExecutor? Does the thread get terminated?
Take a look at the source code of Executors, comparing the implementations of newCachedThreadPool and newSingleThreadExecutor:
public static ExecutorService newCachedThreadPool() {
return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
60L, TimeUnit.SECONDS,
new SynchronousQueue<Runnable>());
}
public static ExecutorService newSingleThreadExecutor() {
return new FinalizableDelegatedExecutorService
(new ThreadPoolExecutor(1, 1,
0L, TimeUnit.MILLISECONDS,
new LinkedBlockingQueue<Runnable>()));
}
The primary difference (of interest here) is the 60L, TimeUnit.SECONDS and 0L, TimeUnit.MILLISECONDS.
Effectively (but not actually), these parameters are passed to ThreadPoolExecutor.setKeepAliveTime. Looking at the Javadoc of that method:
A time value of zero will cause excess threads to terminate immediately after executing tasks.
where "excess threads" actually refers to "threads in excess of the core pool size".
The cached thread pool is created with zero core threads, and an (effectively) unlimited number of non-core threads; as such, any of its threads can be terminated after the keep alive time.
The single thread executor is created with 1 core thread and zero non-core threads; as such, there are no threads which can be terminated after the keep alive time: its one core thread remains active until you shut down the entire ThreadPoolExecutor.
(Thanks to @GPI for pointing out that my earlier interpretation was wrong.)
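To see the difference for yourself, here is a minimal sketch that builds the two ThreadPoolExecutor configurations by hand (with a short keep-alive instead of sixty seconds) and checks getPoolSize() after the pool has been idle. The class name KeepAliveDemo and its helper methods are made up for this illustration; only the ThreadPoolExecutor calls come from the JDK.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
final class KeepAliveDemo {

    /** Runs one trivial task, leaves the pool idle, and reports how many threads remain. */
    static int poolSizeAfterIdle(ThreadPoolExecutor pool, long idleMillis) {
        try {
            pool.submit(() -> { }).get();   // make sure a worker thread was started
            Thread.sleep(idleMillis);       // leave the pool idle
            return pool.getPoolSize();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    /** Same shape as newSingleThreadExecutor(): one core thread, no extra threads. */
    static ThreadPoolExecutor singleThreadLike() {
        return new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    /** Same shape as newCachedThreadPool(), but with a configurable keep-alive. */
    static ThreadPoolExecutor cachedLike(long keepAliveMillis) {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                keepAliveMillis, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>());
    }

    /** One core thread that is explicitly allowed to time out when idle. */
    static ThreadPoolExecutor coreTimeoutLike(long keepAliveMillis) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1,
                keepAliveMillis, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        pool.allowCoreThreadTimeOut(true);  // requires a keep-alive greater than zero
        return pool;
    }
}
```

Calling poolSizeAfterIdle(singleThreadLike(), 500) should still report one thread, while the cached-like and core-timeout variants with a 100 ms keep-alive should have dropped back to zero by then. Exact timings depend on the machine, so treat the numbers as illustrative.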

First question:
Threads that have not been used for sixty seconds are terminated and removed from the cache. Thus, a pool that remains idle for long enough will not consume any resources.
Is this also true for a SingleThreadExecutor?
SingleThreadExecutor works differently. It has no time-out concept, because of the values configured during creation.
Its single thread can still terminate (for example because a task fails), but the executor guarantees that a thread is always available to handle tasks from the task queue.
From newSingleThreadExecutor documentation:
public static ExecutorService newSingleThreadExecutor()
Creates an Executor that uses a single worker thread operating off an unbounded queue. (Note however that if this single thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.)
Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time. Unlike the otherwise equivalent newFixedThreadPool(1) the returned executor is guaranteed not to be reconfigurable to use additional threads.
Second question:
Furthermore, the doWork() method of TheObject needs to call ExecutorService#submit() to do the work asynchronously
for(TheObject o : this.listOfObjects) {
o.doWork();
}
can be changed to
ExecutorService executorService = Executors.newSingleThreadExecutor();
executorService.execute(new Runnable() {
public void run() {
System.out.println("Asynchronous task");
}
});
executorService.shutdown();
using the Runnable or Callable interface: put your doWork() code in the run() or call() method. The task is then executed asynchronously on the executor's thread.
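Applied to the class from the question, this could look roughly like the sketch below. It splits the public doWork(), which only submits, from a private method that does the actual work, so no isRunningAsync flag is needed. The method name doWorkBlocking, the name field, and close() are assumptions made for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of the asker's class; everything except the executor idea is hypothetical.
final class TheObject {
    private final ExecutorService myExecutor = Executors.newSingleThreadExecutor();
    private final String name;

    TheObject(String name) {
        this.name = name;
    }

    /** Returns immediately; the work runs on this object's own executor thread. */
    Future<String> doWork() {
        Callable<String> task = this::doWorkBlocking;
        return myExecutor.submit(task);
    }

    /** The actual work (placeholder). Runs sequentially per object. */
    private String doWorkBlocking() {
        return name + " done on " + Thread.currentThread().getName();
    }

    /** Must be called when the object is no longer needed, or the thread stays alive. */
    void close() {
        myExecutor.shutdown();
    }
}
```

The caller keeps the loop from the question (o.doWork() for each object) and can call get() on the returned Future if it ever needs the result.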

Related

How to avoid congesting/stalling/deadlocking an executorservice with recursive callable

All the threads in an ExecutorService are busy with tasks that wait for tasks that are stuck in the queue of the executor service.
Example code:
ExecutorService es=Executors.newFixedThreadPool(8);
Set<Future<Set<String>>> outerSet=new HashSet<>();
for(int i=0;i<8;i++){
outerSet.add(es.submit(new Callable<Set<String>>() {
@Override
public Set<String> call() throws Exception {
Thread.sleep(10000); //to simulate work
Set<Future<String>> innerSet=new HashSet<>();
for(int j=0;j<8;j++) {
int k=j;
innerSet.add(es.submit(new Callable<String>() {
@Override
public String call() throws Exception {
return "number "+k+" in inner loop";
}
}));
}
Set<String> out=new HashSet<>();
while(!innerSet.isEmpty()) { //we are stuck in this loop because all the
for(Future<String> f:innerSet) { //callables in innerSet are stuck in the queue
if(f.isDone()) { //of es and can't start, since all the threads
out.add(f.get()); //in es are busy waiting for them to finish
}
}
}
return out;
}
}));
}
Is there any way to avoid this other than creating a separate thread pool for each layer or using a thread pool that is not fixed in size?
A practical example would be if some callables are submitted to ForkJoinPool.commonPool() and then these tasks use objects that also submit to the commonPool in one of their methods.
You should use a ForkJoinPool. It was made for this situation.
Whereas your solution blocks a thread permanently while it's waiting for its subtasks to finish, the work stealing ForkJoinPool can perform work while in join(). This makes it efficient for these kinds of situations where you may have a variable number of small (and often recursive) tasks that are being run. With a regular thread-pool you would need to oversize it, to make sure that you don't run out of threads.
With CompletableFuture you need to handle a lot more of the actual planning/scheduling yourself, and it will be more complex to tune if you decide to change things. With FJP the only thing you need to tune is the number of threads in the pool; with CF you also need to think about the then...() vs. then...Async() variants of each stage.
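As a rough sketch of the ForkJoinPool approach, the example from the question could be restated with RecursiveTask as below. The class names NestedForkJoin, Inner and Outer are made up for this illustration, and the pool is deliberately smaller than the number of outer tasks to show that joining does not starve the inner tasks.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical restatement of the question's nested tasks.
final class NestedForkJoin {

    /** Inner task: produces one string. */
    static final class Inner extends RecursiveTask<String> {
        private final int k;
        Inner(int k) { this.k = k; }
        @Override protected String compute() {
            return "number " + k + " in inner loop";
        }
    }

    /** Outer task: forks eight inner tasks and collects their results. */
    static final class Outer extends RecursiveTask<Set<String>> {
        @Override protected Set<String> compute() {
            List<Inner> inner = new ArrayList<>();
            for (int j = 0; j < 8; j++) {
                inner.add(new Inner(j));
            }
            invokeAll(inner);              // returns once every inner task is done
            Set<String> out = new HashSet<>();
            for (Inner task : inner) {
                out.add(task.join());      // already complete, so this does not wait
            }
            return out;
        }
    }

    /** Runs eight outer tasks on a pool with the given parallelism. */
    static List<Set<String>> run(int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            List<Outer> outer = new ArrayList<>();
            for (int i = 0; i < 8; i++) {
                Outer task = new Outer();
                outer.add(task);
                pool.execute(task);
            }
            List<Set<String>> results = new ArrayList<>();
            for (Outer task : outer) {
                results.add(task.join());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

NestedForkJoin.run(2) finishes even though only two worker threads serve eight outer tasks, because a worker waiting in invokeAll or join runs pending subtasks itself instead of just blocking.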
I would recommend trying to decompose the work into completion stages via CompletableFuture:
CompletableFuture.supplyAsync(outerTask)
.thenCompose(result -> CompletableFuture.allOf(innerTasks));
That way your outer task doesn’t hog the execution thread while processing inner tasks, but you still get a Future that resolves when the entire job is done. It can be hard to split those stages up if they’re too tightly coupled though.
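A slightly fuller sketch of that idea, assuming the outer and inner work can be written as plain lambdas, might look like the following. ComposedStages and its method names are hypothetical; the point is only that every stage runs on the same fixed pool and no pool thread ever sits waiting for another task.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical example class.
final class ComposedStages {

    /** One outer stage followed by eight inner stages, all on the given executor. */
    static CompletableFuture<Set<String>> outerThenInner(ExecutorService es) {
        return CompletableFuture
                .supplyAsync(() -> "outer work done", es)   // placeholder outer task
                .thenCompose(ignored -> {
                    List<CompletableFuture<String>> inner = new ArrayList<>();
                    for (int j = 0; j < 8; j++) {
                        final int k = j;
                        inner.add(CompletableFuture.supplyAsync(
                                () -> "number " + k + " in inner loop", es));
                    }
                    // allOf completes once every inner future has completed.
                    return CompletableFuture
                            .allOf(inner.toArray(new CompletableFuture<?>[0]))
                            .thenApply(done -> {
                                Set<String> out = new HashSet<>();
                                for (CompletableFuture<String> f : inner) {
                                    out.add(f.join());   // f is already complete here
                                }
                                return out;
                            });
                });
    }

    /** Starts eight composed jobs on a fixed pool and waits for all of them. */
    static List<Set<String>> run(int threads) {
        ExecutorService es = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<Set<String>>> jobs = new ArrayList<>();
            for (int i = 0; i < 8; i++) {
                jobs.add(outerThenInner(es));
            }
            List<Set<String>> results = new ArrayList<>();
            for (CompletableFuture<Set<String>> job : jobs) {
                results.add(job.join());   // only the calling thread blocks
            }
            return results;
        } finally {
            es.shutdown();
        }
    }
}
```

Only the thread that calls run() blocks; the pool threads just execute short stages, so even a pool of two threads gets through all eight jobs.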
The approach you are suggesting rests on the hypothesis that the problem goes away once there are more threads than tasks. That will not work here as long as everything is submitted to a single thread pool; you can try it and see. It is a simple case of deadlock, as you have stated in the comments of your code.
In such a case, use two separate thread pools, one for the outer tasks and another for the inner ones. When a task from the inner pool completes, simply return its value to the outer task.
Or you can simply create a thread on the fly, do the work in it, and return the result to the outer task.
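For illustration, a two-pool version of the question's example could be sketched as follows. The class name TwoPools and the pool sizes are assumptions; the structure of the loops follows the original code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class.
final class TwoPools {

    static List<Set<String>> run() throws Exception {
        ExecutorService outerPool = Executors.newFixedThreadPool(8);
        ExecutorService innerPool = Executors.newFixedThreadPool(8);
        try {
            List<Future<Set<String>>> outer = new ArrayList<>();
            for (int i = 0; i < 8; i++) {
                Callable<Set<String>> outerTask = () -> {
                    List<Future<String>> inner = new ArrayList<>();
                    for (int j = 0; j < 8; j++) {
                        final int k = j;
                        Callable<String> innerTask = () -> "number " + k + " in inner loop";
                        inner.add(innerPool.submit(innerTask));
                    }
                    Set<String> out = new HashSet<>();
                    for (Future<String> f : inner) {
                        out.add(f.get());   // blocks an outer thread, never an inner one
                    }
                    return out;
                };
                outer.add(outerPool.submit(outerTask));
            }
            List<Set<String>> results = new ArrayList<>();
            for (Future<Set<String>> f : outer) {
                results.add(f.get());
            }
            return results;
        } finally {
            outerPool.shutdown();
            innerPool.shutdown();
        }
    }
}
```

The inner pool's threads never wait on anything, so they always make progress, and the outer threads are eventually released. The cost is one extra pool per nesting level.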

Frequent concurrent method calls in Java data-logger

I'm implementing a Java data logger which reads, at precise intervals, some data from different production machines. To avoid having one call block the following ones, I was thinking of creating a new thread for every call to the parser class.
However, this would require creating many threads, and then stopping them, every 10 seconds (which is my reading interval). A non-concurrent approach would cause many delays whenever the parser gets an exception (due to possible timeouts of the IoT devices I'm using), delaying the subsequent calls.
while(!error){
//JDBC connections and other calls here
//Queryresult is a ResultSet that returns all the machine addresses needing to be read
while(queryresult.next()){
//Parser.ParseSpeedV is the method I need to call concurrently
Double v = Parser.ParseSpeedV(..Params..);
Double s = v*queryresult.getDouble("const");
st = conn.createStatement();
st.executeUpdate("INSERT INTO ...");
}
st.close();
Thread.sleep(10000);
}
What is the best way to make concurrent calls to the method ParseSpeedV without the overhead of thousands of threads starting every day?
What you want to use is a ScheduledExecutorService. It allows you to add tasks that are repeated at a fixed rate or with a fixed delay. So you can, e.g., add a task that fetches data from a device every 10 seconds. The executor service then makes sure that it runs at that interval with reasonably low deviation.
final ScheduledExecutorService myScheduledExecutor = Executors.newScheduledThreadPool(16);
myScheduledExecutor.scheduleAtFixedRate(myTask, 0L, 10L, TimeUnit.SECONDS);
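As a sketch of how this could be applied per machine, the example below schedules one repeating task for each address, so a slow or failing device does not delay the others. PollingScheduler, readSpeed and the parameters are invented for the example; readSpeed stands in for Parser.ParseSpeedV, whose real signature is not shown in the question.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical example class.
final class PollingScheduler {

    /** Stand-in for Parser.ParseSpeedV(...); a real call may block or time out. */
    static double readSpeed(String machineAddress) {
        return machineAddress.length();
    }

    /**
     * Polls every machine at a fixed rate until machines.size() * rounds reads
     * have happened in total. Returns false if that takes longer than timeoutMillis.
     */
    static boolean pollAll(List<String> machines, long periodMillis, int rounds, long timeoutMillis) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(4);
        CountDownLatch remaining = new CountDownLatch(machines.size() * rounds);
        try {
            for (String machine : machines) {
                scheduler.scheduleAtFixedRate(() -> {
                    try {
                        double v = readSpeed(machine);
                        System.out.println(machine + " -> " + v);   // the INSERT would go here
                    } catch (RuntimeException e) {
                        // An exception escaping the task would suppress all its later runs.
                        System.err.println(machine + " failed: " + e);
                    } finally {
                        remaining.countDown();
                    }
                }, 0L, periodMillis, TimeUnit.MILLISECONDS);
            }
            return remaining.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow();
        }
    }
}
```

In a real logger you would not stop after a number of rounds; the latch is only there so the example terminates. Catching exceptions inside the task matters, because scheduleAtFixedRate stops rescheduling a task once one of its runs throws.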
Your situation is the perfect use case for a thread pool. This is the part of Java's library that is built on top of plain Threads and lets you create a fixed-size pool of threads and reuse them over and over:
ExecutorService executor = Executors.newFixedThreadPool(5);
Any time you want to do some work you add it to the executor
executor.execute(new Runnable() {
@Override
public void run() {
// Do some work
}
});
If you call execute while all 5 threads are busy, the extra runnables are held in a queue until a thread becomes free.
Now, if you need to receive information from these running tasks, you need to write a class that implements Runnable and accepts some kind of object that wants the information your runnable produces:
public class Worker implements Runnable {
Consumer consumer;
public Worker(Consumer consumer) {
this.consumer = consumer;
}
@Override public void run() {
// Do work
Object value = null; // replace null with the value produced by the work
consumer.put(value);
}
}
Now all you have to do is define a Consumer class that operates on the value (has that put() method, or whatever) and create your Workers like this:
Consumer consumer = new Consumer();
Worker worker = new Worker(consumer);
executor.execute(worker);
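Here is a compilable sketch of the same pattern. Instead of a hand-written Consumer class it uses java.util.function.Consumer, which is a substitution on my part, and a thread-safe queue as the sink. The names WorkerDemo and squares, and the squaring "work", are placeholders.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical example class.
final class WorkerDemo {

    /** Runnable that computes a value and hands it to whoever wants it. */
    static final class Worker implements Runnable {
        private final int input;
        private final Consumer<Integer> consumer;

        Worker(int input, Consumer<Integer> consumer) {
            this.input = input;
            this.consumer = consumer;
        }

        @Override
        public void run() {
            int value = input * input;   // placeholder for the real work
            consumer.accept(value);
        }
    }

    /** Runs `count` workers on a pool of five threads and returns what they produced. */
    static Queue<Integer> squares(int count) {
        Queue<Integer> results = new ConcurrentLinkedQueue<>();   // safe to fill from many threads
        ExecutorService executor = Executors.newFixedThreadPool(5);
        for (int i = 1; i <= count; i++) {
            executor.execute(new Worker(i, results::add));
        }
        executor.shutdown();
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return results;
    }
}
```

Whatever object receives the values must be thread-safe, since several workers may call it at the same time.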

How does newCachedThreadPool cache thread

Per the Javadoc of the method public static ExecutorService newCachedThreadPool() in the Executors class:
Threads that have not been used for sixty seconds are terminated and
removed from the **cache**.
I was wondering where the cache is and how it functions, as I didn't see any static Collection variable in ThreadPoolExecutor or its superclass.
Technically Worker is a Runnable containing a reference to a Thread and not a Thread by itself.
Let us dig deeper into the mechanics of this class.
Executors.newCachedThreadPool uses this constructor of ThreadPoolExecutor:
new ThreadPoolExecutor(0, Integer.MAX_VALUE,
60L, TimeUnit.SECONDS,
new SynchronousQueue<Runnable>());
where 60s corresponds to the keepAliveTime.
Worker Addition / Task submission
A RunnableFuture is created out of the submitted Callable or Runnable.
This is passed down to the execute() method.
The execute method tries to insert the task into the workQueue, which in our case is the SynchronousQueue. On a fresh pool this will fail and return false, because a SynchronousQueue only accepts an element when a consumer is already waiting for it.
(Just hold on to this thought, we will revisit this when we talk about caching aspect)
The call goes on to the addIfUnderMaximumPoolSize method within execute, which creates a java.util.concurrent.ThreadPoolExecutor.Worker runnable, creates a Thread for it, and adds the Worker to the workers HashSet (the one others have mentioned in their answers).
It then calls thread.start().
The run method of Worker is very important and should be noted.
public void run() {
try {
Runnable task = firstTask;
firstTask = null;
while (task != null || (task = getTask()) != null) {
runTask(task);
task = null;
}
} finally {
workerDone(this);
}
}
At this point you have submitted a task, and a thread has been created and is running it.
Worker Removal
As you may have noticed, there is a while loop in the run method.
It is an incredibly interesting piece of code.
If the task is not null, the condition short-circuits and the second part is not checked.
Once the task has run via runTask and the task reference has been set to null, the second condition is evaluated, which takes us into the getTask method.
Here is the part that decides whether a worker should be purged or not:
workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS);
In this case the workQueue is polled for a minute to check for any new task arriving on the queue.
If none arrives, poll returns null and getTask checks whether the worker can exit.
Returning null means we break out of the while loop and reach the finally block.
There the worker is removed from the HashSet, and the Thread it references goes away as well.
Caching aspect
Coming back to the SynchronousQueue we discussed in Task submission.
If I submit a task at a moment when workQueue.offer and workQueue.poll can work in tandem, i.e. a task arrives within those 60s while an idle worker is still polling, the thread is re-used.
This can be seen in action if I put a sleep of 59s vs. 61s between task submissions.
With 59s I can see the thread being re-used; with 61s I can see a new thread being created in the pool.
N.B. The actual timings could vary from machine to machine and my run() is just printing out Thread.currentThread().getName()
Please let me know in comments if I have missed something or misinterpreted the code.
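The 59s vs. 61s experiment can be reproduced much faster with a hand-built pool that has the same shape as newCachedThreadPool but a short keep-alive. This is a sketch under that assumption; CacheReuseDemo and sameThread are made-up names, and the timings are only meant to be comfortably on either side of the keep-alive.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical example class.
final class CacheReuseDemo {

    /**
     * Submits two tasks with a pause between them and reports whether
     * both were executed by the same thread.
     */
    static boolean sameThread(long keepAliveMillis, long pauseMillis) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                keepAliveMillis, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>());
        Callable<Thread> whoAmI = Thread::currentThread;
        try {
            Thread first = pool.submit(whoAmI).get();
            Thread.sleep(pauseMillis);   // idle worker is polling the queue meanwhile
            Thread second = pool.submit(whoAmI).get();
            return first == second;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With a 500 ms keep-alive and a 50 ms pause the second task should land on the waiting worker; with a 100 ms keep-alive and a one second pause the first worker should already be gone, so a new thread is created.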
The word "cache" is only an abstraction. Internally the pool uses a HashSet to hold its worker threads. As per the code:
/**
* Set containing all worker threads in pool. Accessed only when
* holding mainLock.
*/
private final HashSet<Worker> workers = new HashSet<Worker>();
And in case you are interested in the runnables you submit or execute:
newCachedThreadPool uses a SynchronousQueue<Runnable> to hand them over.
If you go through the code of ThreadPoolExecutor, you will see this:
/**
* Set containing all worker threads in pool. Accessed only when
* holding mainLock.
*/
private final HashSet<Worker> workers = new HashSet<Worker>();
and this:
/**
* The queue used for holding tasks and handing off to worker
* threads. We do not require that workQueue.poll() returning
* null necessarily means that workQueue.isEmpty(), so rely
* solely on isEmpty to see if the queue is empty (which we must
* do for example when deciding whether to transition from
* SHUTDOWN to TIDYING). This accommodates special-purpose
* queues such as DelayQueues for which poll() is allowed to
* return null even if it may later return non-null when delays
* expire.
*/
private final BlockingQueue<Runnable> workQueue;
And this:
try {
Runnable r = timed ?
// here keepAliveTime is passed as sixty seconds from
// Executors#newCachedThreadPool()
workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
workQueue.take();
if (r != null)
return r;
timedOut = true;
} catch (InterruptedException retry) {
timedOut = false;
}
I sincerely recommend walking through the actual implementation code; keeping these pointers in mind will help you understand it more clearly.

ThreadPoolExecutor's getActiveCount()

I have a ThreadPoolExecutor that seems to be lying to me when I call getActiveCount(). I haven't done a lot of multithreaded programming however, so perhaps I'm doing something incorrectly.
Here's my TPE
@Override
public void afterPropertiesSet() throws Exception {
BlockingQueue<Runnable> workQueue;
int maxQueueLength = threadPoolConfiguration.getMaximumQueueLength();
if (maxQueueLength == 0) {
workQueue = new LinkedBlockingQueue<Runnable>();
} else {
workQueue = new LinkedBlockingQueue<Runnable>(maxQueueLength);
}
pool = new ThreadPoolExecutor(
threadPoolConfiguration.getCorePoolSize(),
threadPoolConfiguration.getMaximumPoolSize(),
threadPoolConfiguration.getKeepAliveTime(),
TimeUnit.valueOf(threadPoolConfiguration.getTimeUnit()),
workQueue,
// Default thread factory creates normal-priority,
// non-daemon threads.
Executors.defaultThreadFactory(),
// Run any rejected task directly in the calling thread.
// In this way no records will be lost due to rejection
// however, no records will be added to the workQueue
// while the calling thread is processing a Task, so set
// your queue-size appropriately.
//
// This also means MaxThreadCount+1 tasks may run
// concurrently. If you REALLY want a max of MaxThreadCount
// threads don't use this.
new ThreadPoolExecutor.CallerRunsPolicy());
}
In this class I also have a DAO that I pass into my Runnable (FooWorker), like so:
@Override
public void addTask(FooRecord record) {
if (pool == null) {
throw new FooException(ERROR_THREAD_POOL_CONFIGURATION_NOT_SET);
}
pool.execute(new FooWorker(context, calculator, dao, record));
}
FooWorker runs record (the only non-singleton) through a state machine via calculator then sends the transitions to the database via dao, like so:
public void run() {
calculator.calculate(record);
dao.save(record);
}
Once my main thread is done creating new tasks I try and wait to make sure all threads finished successfully:
while (pool.getActiveCount() > 0) {
recordHandler.awaitTermination(terminationTimeout,
terminationTimeoutUnit);
}
What I'm seeing from output logs (which are presumably unreliable due to the threading) is that getActiveCount() is returning zero too early, and the while() loop is exiting while my last threads are still printing output from calculator.
Note I've also tried calling pool.shutdown() then using awaitTermination but then the next time my job runs the pool is still shut down.
My only guess is that inside a thread, when I send data into the dao (since it's a singleton created by Spring in the main thread...), java is considering the thread inactive since (I assume) it's processing in/waiting on the main thread.
Intuitively, based only on what I'm seeing, that's my guess. But... Is that really what's happening? Is there a way to "do it right" without manually incrementing a variable at the top of run() and decrementing it at the end to track the number of threads?
If the answer is "don't pass in the dao", then wouldn't I have to "new" a DAO for every thread? My process is already a (beautiful, efficient) beast, but that would really suck.
As the JavaDoc of getActiveCount states, it's an approximate value: you should not base any major business logic decisions on this.
If you want to wait for all scheduled tasks to complete, then you should simply use
pool.shutdown();
pool.awaitTermination(terminationTimeout, terminationTimeoutUnit);
If you need to wait for a specific task to finish, you should use submit() instead of execute() and then check the Future object for completion (either using isDone() if you want to do it non-blocking or by simply calling get() which blocks until the task is done).
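Since the pool in the question has to stay usable for the next job run, collecting the Futures and waiting on them avoids shutdown() altogether. A minimal sketch, with BatchRunner and runBatch as hypothetical names and a fixed pool standing in for the configured ThreadPoolExecutor:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class.
final class BatchRunner {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    /** Submits every task and returns only after all of them have finished. */
    void runBatch(List<Runnable> tasks) throws InterruptedException, ExecutionException {
        List<Future<?>> futures = new ArrayList<>();
        for (Runnable task : tasks) {
            futures.add(pool.submit(task));
        }
        for (Future<?> future : futures) {
            future.get();   // blocks until this task is done; rethrows its failure, if any
        }
    }

    /** Call once, when the pool is really no longer needed. */
    void close() {
        pool.shutdown();
    }
}
```

runBatch can be called repeatedly on the same instance, and unlike polling getActiveCount() it cannot return while a submitted task is still running.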
The documentation suggests that the method getActiveCount() on ThreadPoolExecutor is not an exact number:
getActiveCount
public int getActiveCount()
Returns the approximate number of threads that are actively executing tasks.
Returns: the number of threads
Personally, when I am doing multithreaded work such as this, I use a variable that I increment as I add tasks, and decrement as I grab their output.

Waiting on threads

I have a method that contains the following (Java) code:
doSomeThings();
doSomeOtherThings();
doSomeThings() creates some threads, each of which will run for only a finite amount of time. The problem is that I don't want doSomeOtherThings() to be called until all the threads launched by doSomeThings() are finished. (Also doSomeThings() will call methods that may launch new threads and so on. I don't want to execute doSomeOtherThings() until all these threads have finished.)
This is because doSomeThings(), among other things will set myObject to null, while doSomeOtherThings() calls myObject.myMethod() and I do not want myObject to be null at that time.
Is there some standard way of doing this kind of thing (in Java)?
You may want to have a look at the java.util.concurrent package. In particular, you might consider using the CountDownLatch as in
package de.grimm.game.ui;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
public class Main {
public static void main(String[] args)
throws Exception {
final ExecutorService executor = Executors.newFixedThreadPool(5);
final CountDownLatch latch = new CountDownLatch(3);
for( int k = 0; k < 3; ++k ) {
executor.submit(new Runnable() {
public void run() {
// ... lengthy computation...
latch.countDown();
}
});
}
latch.await();
// ... reached only after all threads spawned have
// finished and acknowledged so by counting down the
// latch.
System.out.println("Done");
}
}
Obviously, this technique only works if you know the number of forked threads beforehand, since you need to initialize the latch with that number.
Another way would be to use condition variables, for example:
boolean done = false;
void functionRunInThreadA() {
synchronized( commonLock ) {
while( !done ) commonLock.wait();
}
// Here it is safe to set the variable to null
}
void functionRunInThreadB() {
// Do something...
synchronized( commonLock ) {
done = true;
commonLock.notifyAll();
}
}
You will need to add exception handling (InterruptedException) and the like.
Take a look at Thread.join() method.
I'm not clear on your exact implementation but it seems like doSomeThings() should wait on the child threads before returning.
Inside of doSomeThings() method, wait on the threads by calling Thread.join() method.
When you create a thread and call that thread's join() method, the calling thread waits until that thread object dies.
Example:
// Create an instance of my custom thread class
MyThread myThread = new MyThread();
// Tell the custom thread object to run
myThread.start();
// Wait for the custom thread object to finish
myThread.join();
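With several threads the same idea becomes two loops: start them all first, then join them all. The sketch below assumes doSomeThings() keeps references to the threads it creates; JoinAll and runAndJoin are made-up names, and the counter is only there to make the effect observable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class.
final class JoinAll {

    /** Starts threadCount threads, waits for all of them, and returns how many finished. */
    static int runAndJoin(int threadCount) throws InterruptedException {
        AtomicInteger finished = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();

        // Start every thread before joining any, so they actually run in parallel.
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(finished::incrementAndGet);
            threads.add(t);
            t.start();
        }

        // join() returns once the given thread has terminated.
        for (Thread t : threads) {
            t.join();
        }
        return finished.get();   // safe to read: all threads are done
    }
}
```

Joining inside the first loop would serialize the threads, which is why the starts and the joins are kept separate.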
What you are looking for is the ExecutorService, used together with futures :)
See http://java.sun.com/docs/books/tutorial/essential/concurrency/exinter.html
So basically collect the futures for all the runnables that you submit to the executor service. Loop all the futures and call the get() methods. These will return when the corresponding runnable is done.
Another useful and more robust synchronization barrier that offers functionality similar to a CountDownLatch is a CyclicBarrier. Like a CountDownLatch, it requires you to know how many parties (threads) are involved, but it allows you to reuse the barrier, as opposed to creating a new CountDownLatch instance every time.
I do like momania's suggestion of using an ExecutorService, collecting the futures and invoking get on all of them until they complete.
Another option is to sleep your main thread, and have it check every so often whether the other threads have finished. However, I like Dirk's and Marcus Adams' answers better; I'm just throwing this out here for completeness' sake.
It depends on what exactly you are trying to do here. Is your main concern the ability to dynamically determine the various threads that get spawned by the successive methods called from within doSomeThings(), and then to wait until they finish before calling doSomeOtherThings()? Or is it possible to know at compile time which threads are spawned? In the latter case there are a number of solutions, but all basically involve calling Thread.join() on all these threads from wherever they are created.
If it is indeed the former, then you are better off using ThreadGroup and its enumerate() method. This gives you an array of all threads spawned by doSomeThings(), provided you have properly added the new threads to the ThreadGroup. You can then loop through all the thread references in the returned array and call join() on each of them from the main thread just before you call doSomeOtherThings().
