I created multiple ExecutorService instances in my code; usually each UI page has its own ExecutorService instance, and each instance executes a few HTTP GET request tasks.
private ExecutorService m_threadPool = Executors.newCachedThreadPool();
Is it OK to do that?
The problem I ran into is that the HTTP GET requests sometimes get response code -1 from the HttpURLConnection getResponseCode() call. I don't know whether this is caused by having multiple thread pool instances.
Thanks.
An ExecutorService per se is just another object, so there's no big overhead in having several. What does cost you is the idle threads each pool may keep alive (a cached pool, for instance, keeps idle threads around for 60 seconds before reclaiming them), and across many pools those idle threads are a real resource waste. I would suggest keeping the number of pre-created threads in each pool small (1, or 0 if you are not sure whether any requests will be sent at all) in order to reduce the cost of the extra objects. Threads are then created on demand and you can keep your code clean.
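As a minimal sketch of that idea, you could construct the per-page pool directly instead of using newCachedThreadPool; the class name and the bound of 2 threads are just illustrative choices:

import java.util.concurrent.*;

public class PagePool {
    // Small bounded pool per page: at most 2 threads, created on demand and
    // reclaimed after 30 seconds of inactivity, so an idle page holds no threads.
    private final ThreadPoolExecutor m_threadPool =
            new ThreadPoolExecutor(2, 2, 30, TimeUnit.SECONDS,
                                   new LinkedBlockingQueue<Runnable>());

    public PagePool() {
        m_threadPool.allowCoreThreadTimeOut(true); // let even the core threads die when idle
    }

    public void submitGet(Runnable httpGetTask) {
        m_threadPool.execute(httpGetTask);
    }
}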
Another solution is to use a single thread pool but to maintain a separate list of tasks for each UI window. In that case, when a window gets closed you iterate over its tasks and cancel the running ones manually (this can also be done in a separate thread). A task can be represented by a Future<?>, which has handy isDone() and cancel() methods.
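A minimal sketch of that approach; the shared pool, the per-window list and closeWindow() are illustrative names rather than anything from your code:

import java.util.*;
import java.util.concurrent.*;

class WindowTasks {
    private static final ExecutorService SHARED_POOL = Executors.newCachedThreadPool();
    private final List<Future<?>> tasks = new ArrayList<>();

    // Submit work for this window and remember the Future so it can be cancelled later.
    synchronized Future<?> submit(Runnable work) {
        Future<?> f = SHARED_POOL.submit(work);
        tasks.add(f);
        return f;
    }

    // Called when the window is closed: cancel whatever is still running.
    synchronized void closeWindow() {
        for (Future<?> f : tasks) {
            if (!f.isDone()) {
                f.cancel(true); // interrupt the task if it is already running
            }
        }
        tasks.clear();
    }
}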
It shouldn't be caused by your thread pool instances. However, I'd say that having more than one thread pool is questionable. Why would you need it? It could lead to a lot of unnecessary threads, and thereby unnecessary memory use.
Our current course assignment specifies that we are supposed to create a manager for a thread pool, using the "Object Pool Manager" design pattern, which spawns a set number of threads. Ownership of these threads shall be transferred to the client and then back to the pool after the client has finished using them. If no thread is available in the pool, the client has to wait.
My confusion comes from the fact that a thread is supposedly not reusable, which defeats the purpose of pooling them. Have I understood the assignment incorrectly?
Threads are reusable as long as they have not ended. A pool of threads generally involves threads that do work as it is given to them, and then wait for more work. Thus, they never end until explicitly told to do so. The trick is designing them in a way such that the work they are given ends, but the thread itself does not. Thread pools are useful because it is often relatively expensive to create/destroy threads.
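To make the "work ends, thread doesn't" idea concrete, here is a minimal sketch of a single pooled worker that pulls Runnable tasks from a queue until it is explicitly told to stop; the POISON sentinel is just one illustrative shutdown mechanism:

import java.util.concurrent.*;

class Worker extends Thread {
    // Sentinel task used to tell the worker to end.
    static final Runnable POISON = () -> { };

    private final BlockingQueue<Runnable> queue;

    Worker(BlockingQueue<Runnable> queue) {
        this.queue = queue;
    }

    @Override
    public void run() {
        try {
            while (true) {
                Runnable task = queue.take();   // wait for the next piece of work
                if (task == POISON) {
                    return;                     // the thread ends only when told to
                }
                task.run();                     // the *work* ends here, the thread lives on
            }
        } catch (InterruptedException e) {
            // interrupted while waiting: treat it as a shutdown request
        }
    }
}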
@Kaliatech has already explained the concept behind re-use of threads. Also, "the ownership of these threads shall be transferred to the client" is slightly misleading: ownership generally remains with the thread pool / object pool, since it is the manager of the pool, and the client simply submits a task to it, which then either completes successfully or fails. The thread continues to run, ready to pick up the next task submitted to the pool. By design, the task object (Runnable/Callable) and the object representing thread execution (Thread) are kept separate. Should the need arise, the thread pool is responsible for ramping the number of threads up or down, since threads are expensive to create and manage. Java's ThreadPoolExecutor is a good example of how such a thread pool typically works.
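As a small illustration of that separation, the client only ever hands a task to the pool and at most gets a Future back; it never touches the Thread objects themselves (a minimal sketch):

import java.util.concurrent.*;

public class PoolDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);

        // The client submits a Callable (the task); the pool decides which thread runs it.
        Future<Integer> result = pool.submit(() -> 6 * 7);
        System.out.println("Result: " + result.get());

        pool.shutdown(); // the pool, not the client, ends its threads
    }
}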
I'm working on a Java server application with the general following architecture:
Clients make RPC requests to the server
The RPC server (gRPC) I believe has its own thread pool for handling requests
Requests are immediately inserted into Thread Pool 1 for more processing
A specific request type, we'll call it Request R, needs to run a few asynchronous tasks in parallel, judging the results to form a consensus that it will return to the client. These tasks are longer-running, so I use a separate Thread Pool 2 to handle them. Importantly, each Request R will need to run the same 2-3 asynchronous tasks. Thread Pool 2 therefore services ALL currently executing Request R's. However, a Request R should only be able to see and retrieve the asynchronous tasks that belong to it.
To achieve this, every incoming Request R, while it's in Thread Pool 1, creates a new CompletionService for the request, backed by Thread Pool 2. It submits its 2-3 async tasks and retrieves the results. These should be strictly isolated from anything else that might be running in Thread Pool 2 on behalf of other requests.
My questions:
Firstly, is Java's CompletionService isolated? I couldn't find good documentation on this after checking the JavaDocs. In other words, if two or more CompletionServices are backed by the same thread pool, are any of them at risk of pulling a future belonging to another CompletionService?
Secondly, is it bad practice to be creating this many CompletionServices, one for each request? Is there a better way to handle this? Of course it would be a bad idea to create a new thread pool for each request, so is there a more canonical/correct way to isolate futures within a CompletionService, or is what I'm doing okay?
Thanks in advance for the help. Any pointers to helpful documentation or examples would be greatly appreciated.
Code, for reference, although trivial:
public static final ExecutorService THREAD_POOL_2 =
        new ThreadPoolExecutor(16, 64, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());

// Gets created to handle a RequestR; RequestRHandler is run in Thread Pool 1
public class RequestRHandler {
    CompletionService<String> cs;

    RequestRHandler() {
        cs = new ExecutorCompletionService<>(THREAD_POOL_2);
    }

    String execute() throws InterruptedException, ExecutionException {
        cs.submit(asyncTask1);
        cs.submit(asyncTask2);
        cs.submit(asyncTask3);
        // Let's say asyncTask3 completes first
        Future<String> asyncTask3Result = cs.take();
        // asyncTask3's result indicates asyncTask1 & asyncTask2 results don't matter, cancel them
        // without checking the result
        // Cancels all futures; I track all futures submitted within this request and cancel them,
        // so it shouldn't affect any other requests in the TP 2 pool
        cancelAllFutures(cs);
        return asyncTask3Result.get();
    }
}
Firstly, is Java's CompletionService isolated?
That's not guaranteed, as CompletionService is just an interface, so the implementation decides. But since the only implementation is ExecutorCompletionService, I'd just say the answer is: yes. Every instance of ExecutorCompletionService internally has a BlockingQueue in which its finished tasks are queued. When you call take on the service, it simply passes the call on to the queue by calling take on it. Every submitted task is wrapped in another object that puts the task into the queue when it finishes. So each instance manages its submitted tasks in isolation from other instances.
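Here is a small sketch of that isolation, assuming a shared pool and two services submitting trivially distinguishable tasks:

import java.util.concurrent.*;

public class IsolationDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);

        CompletionService<String> csA = new ExecutorCompletionService<>(pool);
        CompletionService<String> csB = new ExecutorCompletionService<>(pool);

        csA.submit(() -> "A1");
        csA.submit(() -> "A2");
        csB.submit(() -> "B1");

        // Each service only ever hands back its own completed tasks,
        // even though all of them ran on the same pool.
        System.out.println(csA.take().get()); // "A1" or "A2", never "B1"
        System.out.println(csA.take().get()); // the other of "A1"/"A2"
        System.out.println(csB.take().get()); // "B1"

        pool.shutdown();
    }
}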
Secondly, is this bad practice to be creating this many CompletionServices for each request?
I'd say it's okay. A CompletionService is nothing but a rather thin wrapper around an executor. You have to live with the "overhead" (an internal BlockingQueue and wrapper instances for the tasks), but it's small and you probably gain far more from it than it costs. One could ask whether you need one for just 2 to 3 tasks, but that really depends on the tasks. At this point it becomes a question of whether a CompletionService is worth it in general, which is up to you to decide, as it's out of scope of your question.
Does setting ThreadPoolExecutor's keepAliveTime and corePoolSize to 0 make it create a new Thread for every task? Is it guaranteed no Thread will ever be reused for any task?
BTW I want to set the maximumPoolSize to 100 or so; I cannot afford an unlimited number of threads. In case I reach the thread limit (e.g. 100), I want the server to fall back to 'synchronous' mode (no parallelism). See ThreadPoolExecutor.CallerRunsPolicy.
Background (read only in case you are interested in my motivation):
We have a project which relies on usage of ThreadLocals (e.g. we use Spring and its SecurityContextHolder). We would like to make 10 calls to backend systems in parallel. We like the ThreadPoolExecutor.CallerRunsPolicy, which runs the callable in the caller thread in case thread pool and its task queue is full. That's why we would like to use ThreadPoolExecutor. I am not able to change the project not to use ThreadLocals, please do not suggest doing so.
I was thinking about how to do it with the least amount of work. SecurityContextHolder can be switched to use InheritableThreadLocal instead of ThreadLocal; the thread-local variables are then passed on to child threads when the child threads are created. The only problem is how to make ThreadPoolExecutor create a new Thread for every task. Will setting its keepAliveTime and corePoolSize to 0 work? Am I sure none of the threads will be reused for the next task? I can live with the performance hit of creating new Threads, because the parallel tasks each take much more time.
Other possible solutions considered:
Extend ThreadPoolExecutor's execute method and wrap the Runnable command parameter in a different Runnable which remembers the thread-locals in its final fields and then installs them in its run method before calling the target command. I think this might work, but it is slightly more code to maintain than the solution my original question is about.
Pass thread-locals in a parameter of asynchronous methods. This is more verbose to use. Also people can forget to do it, and the security context would be shared between asynchronous tasks :-(
Extend ThreadPoolExecutor's beforeExecute and afterExecute methods and copy thread-locals using reflection. This requires ~50 lines of ugly reflection code and I am not sure how safe it is.
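For concreteness, the setup I have in mind looks roughly like this; the SynchronousQueue and the class name are just my current sketch, not settled code:

import java.util.concurrent.*;
import org.springframework.security.core.context.SecurityContextHolder;

public class ParallelBackendCalls {
    static {
        // Let child threads inherit the caller's SecurityContext.
        SecurityContextHolder.setStrategyName(SecurityContextHolder.MODE_INHERITABLETHREADLOCAL);
    }

    // 0 core threads, up to 100 threads, direct hand-off queue; once all 100 are
    // busy, CallerRunsPolicy runs further tasks in the submitting thread.
    static final ExecutorService EXECUTOR = new ThreadPoolExecutor(
            0, 100,
            0L, TimeUnit.SECONDS,
            new SynchronousQueue<Runnable>(),
            new ThreadPoolExecutor.CallerRunsPolicy());
}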
Nope, this does not work! ThreadPoolExecutor wraps your Callable/Runnable into an internal Worker object and executes it in its runWorker() method. The Javadoc of this method says:
Main worker run loop. Repeatedly gets tasks from queue and executes them, while coping with a number of issues: ...
You can also take a look at the code of this method and see that it does not exit until the task queue is empty (or something bad happens which causes the thread to exit).
So setting keepAliveTime to 0 will not necessarily give you a new thread for each submitted task.
You should probably go with your solution 3 as the beforeExecute() and afterExecute() methods are exactly meant for dealing with ThreadLocals.
Alternatively, if you insist on having a new thread for each task, you may take a look at Spring's SimpleAsyncTaskExecutor. It guarantees a new thread for each task and allows you to set a concurrency limit, i.e. the equivalent of ThreadPoolExecutor#maxPoolSize.
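A minimal sketch of that alternative; the limit of 100 mirrors the maximumPoolSize from the question and the thread name prefix is arbitrary:

import org.springframework.core.task.SimpleAsyncTaskExecutor;

public class NewThreadPerTask {
    public static void main(String[] args) {
        // Spawns a brand-new thread for every task, so InheritableThreadLocal values
        // are inherited from the submitting thread each time.
        SimpleAsyncTaskExecutor executor = new SimpleAsyncTaskExecutor("backend-call-");
        executor.setConcurrencyLimit(100); // at most 100 tasks run at once; further callers block

        executor.execute(() -> System.out.println(
                "running in " + Thread.currentThread().getName()));
    }
}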
When using a thread pool, is it still beneficial to use individual Thread objects for specific tasks? I'm wondering, in terms of a server in Java, whether the thread which is listening for connections should share its resources with any other threads which are then allocated from this one listening thread. I may also be missing the point, as I'm not familiar with this concept.
Yes, singular tasks that have to run concurrently can have their own threads outside of the thread pool. Forcing every thread to be part of the pool might obscure your design because you need all kinds of machinery to make concurrent tasks look like worker threads.
I'd create two pools, one for listening and one for internal tasks. This way you're never putting your server at risk of not being able to listen for connections.
The pool for internal tasks can be small if it's only a thread now and then, but at least it's safely isolated.
Resource sharing might be necessary in cases where your server needs to maintain global application state (e.g. an AtomicLong holding the number of requests served by your server). Your main thread would typically wait, ready to accept incoming connections/requests. You then update the global state (like the hit counter), create a new "job" based on the request (typically a Runnable or Callable) and submit it to a thread pool (java.util.concurrent provides them).
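A minimal sketch of that pattern, assuming a plain ServerSocket and a hypothetical handle() method for the actual request processing:

import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicLong;

public class SimpleServer {
    private static final AtomicLong HIT_COUNTER = new AtomicLong(); // shared application state
    private static final ExecutorService WORKERS = Executors.newFixedThreadPool(8);

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(8080)) {
            while (true) {
                Socket client = server.accept();      // the main thread only listens
                HIT_COUNTER.incrementAndGet();        // update global state
                WORKERS.submit(() -> handle(client)); // hand the job to the pool
            }
        }
    }

    private static void handle(Socket client) {
        // placeholder for the real request handling
        try { client.close(); } catch (IOException ignored) { }
    }
}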
The purpose of a thread pool is just to help you manage your threads. In other words, a thread pool handles the creation and termination of threads for you as well as giving work to idle threads. Threads that are blocked or waiting will not receive new tasks.
Your connection listener will probably be in an infinite loop waiting for connections and thus never be idle (although it could be in a wait state). Since this is the case, the connection listener thread will never be able to receive new tasks so it wouldn't make sense to pool it with the other threads.
Connection listening and connection handling are also two different things. From that perspective the connection listener shouldn't be pooled with the connection handlers either.
Similar to @larsman's comment, I would do whatever you feel is simpler and clearer. I have tended to use one thread pool for everything because it appeared to be easier to manage. You don't have to do it that way, and the listening task can be its own thread.
When writing a multithreaded internet server in Java, the main thread starts new ones to serve incoming requests in parallel.
Is there any problem if the main thread does not wait (with .join()) for them? (It is obviously absurd to create a new thread and then wait for it.)
I know that, in a practical situation, you should (or "must"?) implement a pool of threads to "re-use" them for new requests when they become idle. But for small applications, should we use a pool of threads?
You don't need to wait for threads.
They can either complete running on their own (if they've been spawned to perform one particular task), or run indefinitely (e.g. in a server-type environment).
They should handle interrupts and respond to shutdown requests, however. See this article on how to do this correctly.
If you need a set of threads I would use a pool and the executor methods, since they'll look after thread resource management for you. If you're writing a multi-threaded network server then I would investigate using (say) a servlet container or a framework such as Mina.
The only problem in your approach is that it does not scale well beyond a certain request rate. If the requests are coming in faster than your server is able to handle them, the number of threads will rise continuously. As each thread adds some overhead and uses CPU time, the time for handling each request will get longer, so the problem will get worse (because the number of threads rises even faster). Eventually no request will be able to get handled anymore because all of the CPU time is wasted with overhead. Probably your application will crash.
The alternative is to use a ThreadPool with a fixed upper bound of threads (which depends on the power of the hardware). If there are more requests than the threads are able to handle, some requests will have to wait too long in the request queue, and will fail due to a timeout. But the application will still be able to handle the rest of the incoming requests.
Fortunately the Java API already provides a nice and flexible ThreadPool implementation, see ThreadPoolExecutor. Using this is probably even easier than implementing everything with your original approach, so no reason not to use it.
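A sketch of such a bounded setup; the pool and queue sizes are illustrative, and a full server would also need to decide what to do with rejected requests:

import java.util.concurrent.*;

public class BoundedHandlerPool {
    // A fixed upper bound of worker threads plus a bounded request queue:
    // when both are full, further requests are rejected instead of piling up.
    static final ExecutorService REQUEST_HANDLERS = new ThreadPoolExecutor(
            8, 8,                                  // fixed number of workers
            0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<Runnable>(100), // at most 100 queued requests
            new ThreadPoolExecutor.AbortPolicy()); // reject the rest
}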
Thread.join() lets you wait for a Thread to end, which is mostly the opposite of what you want when starting a new thread. After all, you start the new thread to do work in parallel with the original one.
You should only join() a spawned thread if you really need to wait for it to finish.
You should wait for your threads if you need their results or need to do some cleanup which is only possible after all of them are dead, otherwise not.
For the Thread-Pool: I would use it whenever you have some non-fixed number of tasks to run, i.e. if the number depends on the input.
I would like to collect the main ideas of this interesting (for me) question.
I can't totally agree with "you don't need to wait for threads": only in the sense that if you don't join a thread (and don't keep a reference to it), its resources are freed once the thread is done (right? I'm not sure).
The use of a thread pool is only necessary to avoid the overhead of thread creation, because ...
You can limit the number of parallel running threads by accounting, with shared variables (and without a thread pool), how many of them were started but not yet finished; one way to do this is sketched below.
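One idiomatic way to do that accounting is a Semaphore that caps how many handler threads exist at once; this is a sketch, and the limit of 10 and the spawn() helper are purely illustrative:

import java.util.concurrent.Semaphore;

public class ThrottledSpawner {
    private static final Semaphore SLOTS = new Semaphore(10); // at most 10 handlers at once

    // Blocks until a slot is free, then runs the work in a fresh thread.
    static void spawn(Runnable work) throws InterruptedException {
        SLOTS.acquire();
        new Thread(() -> {
            try {
                work.run();
            } finally {
                SLOTS.release(); // free the slot when the thread is done
            }
        }).start();
    }
}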