How does Java's ThreadPoolExecutor use a custom ThreadFactory? - java

The ThreadPoolExecutor class lets you supply a custom ThreadFactory to create new threads. However, I do not understand how these threads are used within Sun's implementation of ThreadPoolExecutor.
This implementation creates a new thread as follows:
private Thread addThread(Runnable firstTask) {
Worker w = new Worker(firstTask);
Thread t = threadFactory.newThread(w);
if (t != null) {
w.thread = t;
...
} ...
But in Worker's implementation, I do not see how field "thread" is used as the runner.
Furthermore, I do not understand how one can provide Threads with custom "run" methods that are reused (where "reused", as in ThreadPoolExecutor, means "they run multiple Runnables"). How can ThreadPoolExecutor reuse such threads to run multiple Runnables, given that the "target" Runnable in class Thread is set at construction time and there is no setter? ThreadPoolExecutor's documentation says:
By supplying a different ThreadFactory, you can alter the thread's name, thread group, priority, daemon status, etc.
Does this mean that the "run" method of threads created by a custom ThreadFactory is not used ? That's the only way I would understand the "custom threads creation + thread 'reuse'" mechanism.

But in Worker's implementation, I do not see how field "thread" is used as the runner.
As the newThread() documentation states, the newly created thread must run the Runnable supplied as the argument to threadFactory.newThread(Runnable).
This is how the new thread is used as the runner.
Furthermore, I do not understand how one can provide Threads with custom "run" methods that are reused
Again, the thread's run method must run the supplied Runnable.run() once. The thread pool supplies a Worker, which runs tasks in a loop.
Example of a correct user-implemented thread:
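public class ThreadTL extends Thread {
    public ThreadTL(Runnable r) {
        super(r);  // the pool's Worker becomes this thread's target
        // "executor" is not defined in this snippet; it is assumed to be a field of the enclosing class
        setName(getName() + " DF " + executor.getClass().getSimpleName());
    }

    @Override
    public void run() {
        super.run();  // must delegate to the target, otherwise the Worker never runs
    }
}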

One Runnable can call another. The Runnable passed to your factory is one that takes Runnables from the thread pool's queue and runs them.
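To make the reuse mechanism concrete, here is a minimal sketch of the idea. It is not the JDK's Worker; MiniPool, its POISON sentinel, and the thread name "custom-worker" are hypothetical names invented for illustration. The factory is handed a single Runnable (the worker loop), and that loop pulls the real tasks from a queue, so one thread created by a custom ThreadFactory ends up running many tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;

// Hypothetical, heavily simplified stand-in for ThreadPoolExecutor and its Worker.
class MiniPool {
    private static final Runnable POISON = () -> { };  // sentinel that stops the worker loop
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread thread;

    MiniPool(ThreadFactory factory) {
        // The factory receives the worker loop, not an individual task.
        Runnable worker = () -> {
            try {
                for (Runnable task = queue.take(); task != POISON; task = queue.take()) {
                    task.run();  // many tasks, one thread
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        thread = factory.newThread(worker);
        thread.start();
    }

    void execute(Runnable task) {
        queue.add(task);
    }

    void shutdownAndWait() {
        queue.add(POISON);
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs three tasks and records the name of the thread that executed each one.
    static List<String> demo() {
        List<String> names = new ArrayList<>();  // written by the worker, read only after join()
        MiniPool pool = new MiniPool(r -> new Thread(r, "custom-worker"));
        for (int i = 0; i < 3; i++) {
            pool.execute(() -> names.add(Thread.currentThread().getName()));
        }
        pool.shutdownAndWait();
        return names;
    }
}
```

All three recorded names are "custom-worker", which shows that the thread's own run() executed exactly once while the loop inside it ran every task.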

Related

ExecutorService SingleThreadExecutor

I have a list of objects, some of which need to do work asynchronously depending on user interaction. Something like this:
for(TheObject o : this.listOfObjects) {
o.doWork();
}
The class TheObject implements an ExecutorService (SingleThread!), which is used to do the work. Every object of type TheObject instantiates an ExecutorService. I don't want to make lasagna code. I don't have enough Objects at the same time, to make an extra extraction layer with thread pooling needed.
I want to cite the Java Documentation about CachedThreadPools:
Threads that have not been used for sixty seconds are terminated and
removed from the cache. Thus, a pool that remains idle for long enough
will not consume any resources.
First question: Is this also true for a SingleThreadExecutor? Does the thread get terminated? JavaDoc doesn't say anything about SingleThreadExecutor. It wouldn't even matter in this application, as I have an amount of objects I can count on one hand. Just curiosity.
Furthermore the doWork() method of TheObject needs to call the ExecutorService#.submit() method to do the work async. Is it possible (I bet it is) to call the doWork() method implicitly? Is this a viable way of designing an async method?
void doWork() {
if(!isRunningAsync) {
myExecutor.submit(doWork());
} else {
// Do Work...
}
}
First question: Is this also true for a SingleThreadExecutor? Does the thread get terminated?
Take a look at the source code of Executors, comparing the implementations of newCachedThreadPool and newSingleThreadExecutor:
public static ExecutorService newCachedThreadPool() {
return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
60L, TimeUnit.SECONDS,
new SynchronousQueue<Runnable>());
}
public static ExecutorService newSingleThreadExecutor() {
return new FinalizableDelegatedExecutorService
(new ThreadPoolExecutor(1, 1,
0L, TimeUnit.MILLISECONDS,
new LinkedBlockingQueue<Runnable>()));
}
The primary difference (of interest here) is the 60L, TimeUnit.SECONDS and 0L, TimeUnit.MILLISECONDS.
Effectively (but not actually), these parameters are passed to ThreadPoolExecutor.setKeepAliveTime. Looking at the Javadoc of that method:
A time value of zero will cause excess threads to terminate immediately after executing tasks.
where "excess threads" actually refers to "threads in excess of the core pool size".
The cached thread pool is created with zero core threads, and an (effectively) unlimited number of non-core threads; as such, any of its threads can be terminated after the keep alive time.
The single thread executor is created with 1 core thread and zero non-core threads; as such, there are no threads which can be terminated after the keep alive time: its one core thread remains active until you shut down the entire ThreadPoolExecutor.
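The difference can be observed with a small experiment. This is only a sketch: KeepAliveDemo is a hypothetical class, and the cached-style pool uses a 100 ms keep-alive instead of 60 s purely so the demo finishes quickly. Each pool runs one trivial task, sits idle for a second, and then reports getPoolSize().

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class KeepAliveDemo {
    // Runs one trivial task, idles, and reports how many threads the pool still holds.
    static int poolSizeAfterIdle(ThreadPoolExecutor pool, long idleMillis) {
        try {
            pool.submit(() -> { }).get();  // forces one worker thread to be created
            Thread.sleep(idleMillis);
            return pool.getPoolSize();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    // Same shape as newCachedThreadPool(), but with a shortened keep-alive (assumption for the demo).
    static int cachedStyle() {
        return poolSizeAfterIdle(
                new ThreadPoolExecutor(0, Integer.MAX_VALUE, 100L, TimeUnit.MILLISECONDS,
                        new SynchronousQueue<Runnable>()),
                1000);
    }

    // Same core/max/keep-alive values as the pool inside newSingleThreadExecutor().
    static int singleStyle() {
        return poolSizeAfterIdle(
                new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                        new LinkedBlockingQueue<Runnable>()),
                1000);
    }
}
```

With these settings the cached-style pool should report 0 threads after idling, while the single-thread configuration still reports its 1 core thread.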
(Thanks to @GPI for pointing out that I was wrong in my interpretation before).
First question:
Threads that have not been used for sixty seconds are terminated and removed from the cache. Thus, a pool that remains idle for long enough will not consume any resources.
Is this also true for a SingleThreadExecutor?
SingleThreadExecutor works differently. It has no idle time-out because of the values it is configured with at creation.
The single thread can still terminate (for example, when a task fails), but the executor replaces it as needed, so there is always one thread to handle tasks from the task queue.
From newSingleThreadExecutor documentation:
public static ExecutorService newSingleThreadExecutor()
Creates an Executor that uses a single worker thread operating off an unbounded queue. (Note however that if this single thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.)
Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time. Unlike the otherwise equivalent newFixedThreadPool(1) the returned executor is guaranteed not to be reconfigurable to use additional threads.
Second question:
Furthermore the doWork() method of TheObject needs to call the ExecutorService#.submit() method to do the work async
for(TheObject o : this.listOfObjects) {
o.doWork();
}
can be changed to
ExecutorService executorService = Executors.newSingleThreadExecutor();
executorService.execute(new Runnable() {
public void run() {
System.out.println("Asynchronous task");
}
});
executorService.shutdown();
using the Callable or Runnable interface: put your doWork() code in the run() or call() method. The task will be executed asynchronously with respect to the calling thread.
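Applied to the question's design, one possible shape is to keep the executor inside the object and split the method in two, rather than having doWork() call itself. This is a sketch under assumptions: doWorkInternal(), getWorkerThreadName(), shutdown(), and demo() are hypothetical names, and the "work" here only records which thread ran it.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class TheObject {
    private final ExecutorService myExecutor = Executors.newSingleThreadExecutor();
    private volatile String workerThreadName;

    // Public entry point: returns immediately, the work runs on the executor's thread.
    Future<?> doWork() {
        Runnable task = this::doWorkInternal;
        return myExecutor.submit(task);
    }

    // The actual work; here it only records the name of the thread it ran on.
    private void doWorkInternal() {
        workerThreadName = Thread.currentThread().getName();
    }

    String getWorkerThreadName() {
        return workerThreadName;
    }

    void shutdown() {
        myExecutor.shutdown();
    }

    // Submits the work, waits for it, and returns the name of the thread that did it.
    static String demo() {
        TheObject o = new TheObject();
        try {
            o.doWork().get();
            return o.getWorkerThreadName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            o.shutdown();
        }
    }
}
```

The caller never needs an isRunningAsync flag: doWork() always submits, and the private method always does the work.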

Differences between these 2 factory methods

I would like to know the difference between these 2 methods:
public static ExecutorService newFixedThreadPool(int nThreads)
and
public static ExecutorService newFixedThreadPool(int nThreads, ThreadFactory tf)
Obviously one takes a specified ThreadFactory for thread creation. However, I would like to know which standard ThreadFactory the former uses.
When is it preferable to use the latter rather than the former, or vice versa?
Thanks in advance.
The former uses the DefaultThreadFactory. From the ThreadPoolExecutor documentation:
New threads are created using a ThreadFactory. If not otherwise
specified, a Executors.defaultThreadFactory() is used, that creates
threads to all be in the same java.lang.ThreadGroup and with the same
NORM_PRIORITY priority and non-daemon status. By supplying a different
ThreadFactory, you can alter the thread's name, thread group,
priority, daemon status, etc. If a ThreadFactory fails to create a
thread when asked by returning null from newThread, the executor will
continue, but might not be able to execute any tasks. Threads should
possess the "modifyThread" RuntimePermission. If worker threads or
other threads using the pool do not possess this permission, service
may be degraded: configuration changes may not take effect in a timely
manner, and a shutdown pool may remain in a state in which termination
is possible but not completed.
You can also encapsulate thread creation in your own ThreadFactory, which is a straightforward use of the Factory pattern.
For example:
class SimpleThreadFactory implements ThreadFactory {
public Thread newThread(Runnable r) {
// do something
return new Thread(r);
}
}
The first one uses the DefaultThreadFactory which is an inner class of Executors. When you define your own ThreadFactory you can influence the created Threads. You can choose their name, priority, etc.
The first version uses Executors.defaultThreadFactory() to create its threads. Use the first version if you don't care how the threads are created, and the second if you need to impose custom settings on the threads when they are created.
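As an illustration of the second overload, here is a small sketch of a factory that sets the name and daemon status. NamedDaemonFactory and the "report" prefix are hypothetical; only ThreadFactory, Executors.newFixedThreadPool(int, ThreadFactory), and the Thread setters are standard API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory: numbered names with a common prefix, daemon threads.
class NamedDaemonFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    NamedDaemonFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        // r is the pool's worker loop; it must be passed on to the Thread.
        Thread t = new Thread(r, prefix + "-" + counter.getAndIncrement());
        t.setDaemon(true);  // do not keep the JVM alive for these threads
        t.setPriority(Thread.NORM_PRIORITY);
        return t;
    }

    // Runs one task on a pool built with the factory and reports "name:isDaemon".
    static String demo() {
        ExecutorService pool = Executors.newFixedThreadPool(2, new NamedDaemonFactory("report"));
        try {
            return pool.submit(() ->
                    Thread.currentThread().getName() + ":" + Thread.currentThread().isDaemon()).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Meaningful thread names like these mainly pay off in logs and thread dumps, which is a common reason to prefer the two-argument overload.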

ThreadPoolExecutor's getActiveCount()

I have a ThreadPoolExecutor that seems to be lying to me when I call getActiveCount(). I haven't done a lot of multithreaded programming however, so perhaps I'm doing something incorrectly.
Here's my TPE
@Override
public void afterPropertiesSet() throws Exception {
    BlockingQueue<Runnable> workQueue;
    int maxQueueLength = threadPoolConfiguration.getMaximumQueueLength();
    if (maxQueueLength == 0) {
        workQueue = new LinkedBlockingQueue<Runnable>();
    } else {
        workQueue = new LinkedBlockingQueue<Runnable>(maxQueueLength);
    }
    pool = new ThreadPoolExecutor(
        threadPoolConfiguration.getCorePoolSize(),
        threadPoolConfiguration.getMaximumPoolSize(),
        threadPoolConfiguration.getKeepAliveTime(),
        TimeUnit.valueOf(threadPoolConfiguration.getTimeUnit()),
        workQueue,
        // Default thread factory creates normal-priority,
        // non-daemon threads.
        Executors.defaultThreadFactory(),
        // Run any rejected task directly in the calling thread.
        // In this way no records will be lost due to rejection
        // however, no records will be added to the workQueue
        // while the calling thread is processing a Task, so set
        // your queue-size appropriately.
        //
        // This also means MaxThreadCount+1 tasks may run
        // concurrently. If you REALLY want a max of MaxThreadCount
        // threads don't use this.
        new ThreadPoolExecutor.CallerRunsPolicy());
}
In this class I also have a DAO that I pass into my Runnable (FooWorker), like so:
@Override
public void addTask(FooRecord record) {
    if (pool == null) {
        throw new FooException(ERROR_THREAD_POOL_CONFIGURATION_NOT_SET);
    }
    pool.execute(new FooWorker(context, calculator, dao, record));
}
FooWorker runs record (the only non-singleton) through a state machine via calculator then sends the transitions to the database via dao, like so:
public void run() {
calculator.calculate(record);
dao.save(record);
}
Once my main thread is done creating new tasks I try and wait to make sure all threads finished successfully:
while (pool.getActiveCount() > 0) {
recordHandler.awaitTermination(terminationTimeout,
terminationTimeoutUnit);
}
What I'm seeing from output logs (which are presumably unreliable due to the threading) is that getActiveCount() is returning zero too early, and the while() loop is exiting while my last threads are still printing output from calculator.
Note I've also tried calling pool.shutdown() then using awaitTermination but then the next time my job runs the pool is still shut down.
My only guess is that inside a thread, when I send data into the dao (since it's a singleton created by Spring in the main thread...), java is considering the thread inactive since (I assume) it's processing in/waiting on the main thread.
Intuitively, based only on what I'm seeing, that's my guess. But... Is that really what's happening? Is there a way to "do it right" without putting a manual incremented variable at the top of run() and a decremented at the end to track the number of threads?
If the answer is "don't pass in the dao", then wouldn't I have to "new" a DAO for every thread? My process is already a (beautiful, efficient) beast, but that would really suck.
As the JavaDoc of getActiveCount states, it's an approximate value: you should not base any major business logic decisions on this.
If you want to wait for all scheduled tasks to complete, then you should simply use
pool.shutdown();
pool.awaitTermination(terminationTimeout, terminationTimeoutUnit);
If you need to wait for a specific task to finish, you should use submit() instead of execute() and then check the Future object for completion (either using isDone() if you want to do it non-blocking or by simply calling get() which blocks until the task is done).
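Since the pool in the question has to stay usable for the next job run, the Future-based variant may be the better fit. The following is a sketch under assumptions: WaitForBatch, runBatch, and the counting task are hypothetical stand-ins for FooWorker and the job loop. It waits for one batch of tasks without shutting the pool down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class WaitForBatch {
    // Submits taskCount tasks and blocks until every one of them has finished.
    static int runBatch(ExecutorService pool, int taskCount) {
        AtomicInteger completed = new AtomicInteger();
        List<Future<?>> futures = new ArrayList<>();
        for (int i = 0; i < taskCount; i++) {
            Runnable task = () -> completed.incrementAndGet();  // stand-in for the real work
            futures.add(pool.submit(task));
        }
        for (Future<?> f : futures) {
            try {
                f.get();  // returns only once that task is done
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        }
        return completed.get();
    }

    // Runs two batches on the same pool to show that it stays usable.
    static int demo() {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            int first = runBatch(pool, 10);
            int second = runBatch(pool, 5);
            return first + second;
        } finally {
            pool.shutdown();
        }
    }
}
```

Unlike polling getActiveCount(), get() gives a definite answer per task, and a task that threw surfaces as an ExecutionException instead of disappearing silently.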
The documentation suggests that the method getActiveCount() on ThreadPoolExecutor is not an exact number:
getActiveCount
public int getActiveCount()
Returns the approximate number of threads that are actively executing tasks.
Returns: the number of threads
Personally, when I am doing multithreaded work such as this, I use a variable that I increment as I add tasks, and decrement as I grab their output.

In Java, you must have a class with shared variables that threads will access?

I'm still learning threads and don't know much yet.
I see that I need to implement the Runnable interface and create several instances of the same class, one for each thread to execute. Is that correct?
If so, do I need to create another class to hold the variables that will be accessed/shared by all threads?
EDIT: I need to maintain some variables to coordinate the threads' work; otherwise they will all do the same work. This will be one variable shared by all threads.
EDIT 2: This question is related to this one: How I make result of SQL querys with LIMIT different in each query? I will need to keep track of how many threads have queried the database in order to set the OFFSET parameter.
Each thread needs an instance of a Runnable to do its work, yes. In some cases the threads could share the same instance, but only if there is no state held within the instance that needs to differ between threads. Generally you will want different instances in each thread.
Threads should share as little state as possible to avoid problems, but if you do want to share state, in general you are right that you will need an instance or instances somewhere to hold that state.
Note that this shared state could also be held in class variables rather than instance variables.
There are many ways to solve this...this is really a question about Design Patterns.
Each thread could be provided, via its constructor, with an object or objects that describe its unique work.
Or you could provide the thread with a reference to a work queue from which they could query the next available task.
Or you could put a method in the class that implements Runnable that could be called by a master thread...
Many ways to skin this cat...I'm sure there are existing libraries for thread work distribution, configuration, etc.
Let's put everything in its place.
The statement new Thread(r) creates a thread, but that thread is not running yet. If you write:
Thread t = new Thread(r);
t.start();
you make the thread run, i.e. execute the run() method of your runnable.
Another (equivalent) way to create and run a thread is to inherit from class Thread and override the default implementation of its run() method.
Now, if you have specific logic and you wish to run it simultaneously in different threads, you have to create different threads and call their start() methods.
If you prefer to implement the Runnable interface and your logic does not require any parameters, you can even create only one instance of your runnable implementation and run it in different threads.
public class MyLogic implements Runnable {
public void run() {
// do something.
}
}
//// ................
Runnable r = new MyLogic();
Thread t1 = new Thread(r);
Thread t2 = new Thread(r);
t1.start();
t2.start();
Now this logic is running simultaneously in 2 separate threads, even though we created only one instance of MyLogic.
If, however, your logic requires parameters, you should create separate instances.
public class MyLogic implements Runnable {
private int p;
public MyLogic(int p) {
this.p = p;
}
public void run() {
// this logic uses value of p.
}
}
//// ................
Thread t1 = new Thread(new MyLogic(111));
Thread t2 = new Thread(new MyLogic(222));
t1.start();
t2.start();
These 2 threads run the same logic with different arguments (111 and 222).
BTW, this example shows how to pass values to a thread. To get information back from it, use a similar approach: define a member variable result, have the run() method assign it, and provide an appropriate getter. You can then hand the result from the thread to anyone interested in it, once the thread has finished.
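A minimal sketch of that result-passing pattern follows. SquareTask and its squaring logic are hypothetical; the important detail is that the caller reads the result only after join(), which guarantees the write made in run() is visible.

```java
// Hypothetical task: receives its input through the constructor, exposes its result through a getter.
class SquareTask implements Runnable {
    private final int input;
    private int result;  // written by run(), read only after join()

    SquareTask(int input) {
        this.input = input;
    }

    @Override
    public void run() {
        result = input * input;
    }

    int getResult() {
        return result;
    }

    // Runs two tasks in two threads and returns the sum of their results.
    static int demo() {
        SquareTask a = new SquareTask(111);
        SquareTask b = new SquareTask(222);
        Thread t1 = new Thread(a);
        Thread t2 = new Thread(b);
        t1.start();
        t2.start();
        try {
            t1.join();  // wait until both threads have finished
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return a.getResult() + b.getResult();  // 12321 + 49284
    }
}
```

Reading getResult() before join() returns would be a race: the value might still be 0.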
Obviously, what is described above is only the basics. I did not say anything about synchronization, thread pools, executors etc., but I hope this helps you get started. Then find a Java thread tutorial and work through it. In a couple of days you will be a world-class specialist in Java threads. :)
Happy threading.

Waiting on threads

I have a method that contains the following (Java) code:
doSomeThings();
doSomeOtherThings();
doSomeThings() creates some threads, each of which will run for only a finite amount of time. The problem is that I don't want doSomeOtherThings() to be called until all the threads launched by doSomeThings() are finished. (Also doSomeThings() will call methods that may launch new threads and so on. I don't want to execute doSomeOtherThings() until all these threads have finished.)
This is because doSomeThings(), among other things will set myObject to null, while doSomeOtherThings() calls myObject.myMethod() and I do not want myObject to be null at that time.
Is there some standard way of doing this kind of thing (in Java)?
You may want to have a look at the java.util.concurrent package. In particular, you might consider using the CountDownLatch as in
package de.grimm.game.ui;

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Main {
    public static void main(String[] args)
    throws Exception {
        final ExecutorService executor = Executors.newFixedThreadPool(5);
        final CountDownLatch latch = new CountDownLatch(3);
        for( int k = 0; k < 3; ++k ) {
            executor.submit(new Runnable() {
                public void run() {
                    // ... lengthy computation...
                    latch.countDown();
                }
            });
        }
        latch.await();
        // ... reached only after all threads spawned have
        // finished and acknowledged so by counting down the
        // latch.
        System.out.println("Done");
    }
}
Obviously, this technique will only work, if you know the number of forked threads beforehand, since you need to initialize the latch with that number.
Another way would be to use condition variables, for example:
boolean done = false;
void functionRunInThreadA() {
synchronized( commonLock ) {
while( !done ) commonLock.wait();
}
// Here it is safe to set the variable to null
}
void functionRunInThreadB() {
// Do something...
synchronized( commonLock ) {
done = true;
commonLock.notifyAll();
}
}
You might need to add exception handling (InterruptedException) and the like.
Take a look at the Thread.join() method.
I'm not clear on your exact implementation, but it seems like doSomeThings() should wait on the child threads before returning.
Inside the doSomeThings() method, wait on the threads by calling Thread.join().
When you create a thread and call that thread's join() method, the calling thread waits until that thread terminates.
Example:
// Create an instance of my custom thread class
MyThread myThread = new MyThread();
// Tell the custom thread object to run
myThread.start();
// Wait for the custom thread object to finish
myThread.join();
What you are looking for is the ExecutorService; use its Futures :)
See http://java.sun.com/docs/books/tutorial/essential/concurrency/exinter.html
So basically, collect the futures for all the runnables that you submit to the executor service. Loop over all the futures and call their get() methods. These will return when the corresponding runnable is done.
Another useful synchronization barrier that offers functionality similar to a CountDownLatch is a CyclicBarrier. Like a CountDownLatch, it requires you to know how many parties (threads) are involved, but it lets you reuse the barrier, as opposed to creating a new CountDownLatch instance every time.
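The reuse property can be sketched as follows. BarrierDemo is a hypothetical class, and the choice of three workers and two rounds is arbitrary. The same CyclicBarrier instance is awaited once per round, and its barrier action counts how many rounds completed.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

class BarrierDemo {
    // Three workers meet at the same barrier twice; returns the number of completed rounds.
    static int demo() {
        int parties = 3;
        int rounds = 2;
        AtomicInteger completedRounds = new AtomicInteger();
        // The barrier action runs once each time all parties have arrived.
        CyclicBarrier barrier = new CyclicBarrier(parties, completedRounds::incrementAndGet);

        Thread[] workers = new Thread[parties];
        for (int i = 0; i < parties; i++) {
            workers[i] = new Thread(() -> {
                try {
                    for (int r = 0; r < rounds; r++) {
                        // ... one round of work would go here ...
                        barrier.await();  // the same barrier instance is reused every round
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return completedRounds.get();
    }
}
```

Note that in this sketch the waiting main thread still uses join(); a CyclicBarrier synchronizes the workers with each other, so for the original question a latch or futures is often the more direct tool.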
I do like momania's suggestion of using an ExecutorService, collecting the futures and invoking get on all of them until they complete.
Another option is to sleep your main thread and have it check every so often whether the other threads have finished. However, I like Dirk's and Marcus Adams' answers better; I'm just throwing this out here for completeness' sake.
It depends on what exactly you are trying to do here. Is your main concern the ability to dynamically determine the various threads that get spawned by the successive methods called from within doSomeThings(), and then wait until they finish before calling doSomeOtherThings()? Or is it possible to know at compile time which threads are spawned? In the latter case there are a number of solutions, but all basically involve calling the Thread.join() method on all these threads from wherever they are created.
If it is indeed the former, then you are better off using ThreadGroup and its enumerate() method. This gives you an array of all threads spawned by doSomeThings(), provided you have properly added new threads to the ThreadGroup. Then you can loop through all thread references in the returned array and call join() on each of them from the main thread just before you call doSomeOtherThings().
