I am currently using 5 thread pools and I want to find an optimal size for each of them, as a kind of up-front analysis. The pools are divided by usage: one for handling commands (cmdPool), one for handling inventory transactions (invPool), one for database transactions (dbPool), one for common things that simply need to run asynchronously such as I/O (fastPool), and one for scheduled tasks (timerPool). I do not yet have any statistical data I could use to solve this.
For database queries I am using HikariCP with default values. I will later try changing the maximum connection count and the minimum number of idle connections to find the optimal settings. For now, whenever the Hikari pool is used, it is always called from one of my thread pools so that the main thread is not affected. A typical database query runs under dbPool, but only when the code block is not already part of a Runnable submitted to one of the other thread pools.
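For reference, the kind of HikariCP tuning I mean would roughly look like this (the URL and numbers are placeholders, not my current values):
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

// Rough idea of the HikariCP settings I plan to experiment with later;
// the URL and numbers below are placeholders, not my current values.
HikariConfig config = new HikariConfig();
config.setJdbcUrl("jdbc:mysql://localhost:3306/mydb"); // placeholder connection URL
config.setMaximumPoolSize(10); // HikariCP default is 10
config.setMinimumIdle(2);      // defaults to maximumPoolSize when not set
HikariDataSource dataSource = new HikariDataSource(config);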
The current setup seems to work fine in the application. So my questions are:
1.) What impact on performance and resources should I expect if I stop using a cached thread pool and instead use a pool with a minimum number of idle threads, like timerPool, or should I stick with the cached pool?
2.) Is it the right approach to set a maximum pool size to absorb spikes, e.g. when around 100 clients join within a short period, and let tasks wait a short while until other tasks complete?
3.) Is there a better way to manage many different kinds of tasks?
cmdPool = Executors.newFixedThreadPool(3);
invPool = Executors.newFixedThreadPool(2);
dbPool = Executors.newCachedThreadPool();
fastPool = Executors.newCachedThreadPool();
timerPool = new ScheduledThreadPoolExecutor(5);
timerPool.allowCoreThreadTimeOut(true);
timerPool.setKeepAliveTime(3, TimeUnit.MINUTES);
First of all, every action depends on how many clients are connected; let's assume something like 5-25 clients. The pools should be designed to cope even with extremes like 100 clients without creating too many threads within a short period.
Expected usage varies, is not the same every second, and there may be moments when no task arrives at all. Expected usage of cmdPool is about 3-8 tasks per second (lightweight tasks). invPool is nearly the same as cmdPool, 2-6 tasks per second (also lightweight tasks). dbPool is more unpredictable than all the others, but expected usage is still 5-20 tasks per second (lightweight and medium-weight tasks), also depending on how busy the network is. The timer and fast pools are designed to take any kind of task and just run it; expected usage there is 20-50 tasks per second.
I appreciate any suggestions, thank you.
The best solution is to adapt your application to the expected traffic.
You can do that in many ways:
Design it with a microservice architecture, leaving the orchestrator to handle traffic peaks
Design the application to read the thread pool sizes on the fly (from a database, from a file, from a configuration server...), so you can change the values when needed
If you only need to tune your application and don't need to change the values on the fly, put your configuration in a file (or database) and try different configurations to find the one best adapted to your needs
What is important is to move away from code like this:
cmdPool = Executors.newFixedThreadPool(3);
and replace it with code similar to this:
@Value("${cmdPoolSize}")
private int cmdPoolSize;
...
cmdPool = Executors.newFixedThreadPool(cmdPoolSize);
where the size of the pool is not taken from the code, but from an external configuration.
An even better way is to also define the kind of pool with parameters:
#Value("${cmdPoolType}")
private String cmtPoolType;
#Value("${cmdPoolSize}")
private int cmdPoolSize;
...
if (cmdPoolType.equals("cached")) {
cmdPool = Executors.newCachedThreadPool();
} else if (cmdPoolType.equals("fixed")) {
cmdPool = Executors.newFixedThreadPool(cmdPoolSize);
}
where you choose among the reasonable kinds of available pools.
In this last case you can also use a Spring configuration file and change it before starting the application.
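For example, the external configuration read by the @Value placeholders above could be a simple properties file (the keys are just the ones invented in this answer):
cmdPoolType=fixed
cmdPoolSize=3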
I have a parallel stream with a few database queries inside like this:
private void processParallel() {
    List<Result> results = objects.parallelStream()
            .peek(object -> doSomething(object))
            .collect(Collectors.toList());
}
private void doSomething(Object object) {
    CompletableFuture<String> param = CompletableFuture
            .supplyAsync(() -> objectService.getParam(object.getField()), executor)
            .thenApply(result -> Optional.ofNullable(result)
                    .map(Object::getParam)
                    .orElse(null));
}
I need to specify the pool size, but setting the property "java.util.concurrent.ForkJoinPool.common.parallelism" to "20" is not working, probably because of locking in the stream. Is there any way to limit the maximum number of threads?
Since parallel streams use the Fork/Join framework under the hood, to limit the number of threads employed by the stream you can wrap the stream in a Callable and define a ForkJoinPool with the required level of parallelism, as described.
The threads occupied by the parallel stream will then be taken from the newly created ForkJoinPool to which the Callable task was submitted (not from the common pool), as described here.
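A minimal sketch of that approach, applied to the code in the question (the parallelism of 4 is just an example value):
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;

// Sketch: run the parallel stream inside a dedicated ForkJoinPool so that this
// pool's parallelism, not the common pool's, bounds the number of worker threads.
private void processParallelLimited() throws InterruptedException, ExecutionException {
    ForkJoinPool customPool = new ForkJoinPool(4); // example parallelism; tune to your connection limit
    try {
        customPool.submit(() ->
                objects.parallelStream().forEach(this::doSomething)
        ).get(); // block until the whole stream pipeline has finished
    } finally {
        customPool.shutdown();
    }
}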
The downside of this approach is that you're relying on an implementation detail of the Stream API.
And also, as @Louis Wasserman pointed out in the comments,
you probably need another way of limiting the number of threads used in the stream. Because you're performing database queries, each thread requires a database connection to do its job, hence the number of threads should not be greater than the number of connections the data source can provide at that moment. And if you have multiple processes that can fire these asynchronous tasks simultaneously (for instance in a web application), it doesn't seem like a good idea to develop your own solution. If that's the case, you might consider using a framework like Spring WebFlux.
I have a system where currently every job has its own Runnable class, and I predefined a fixed number of threads for every job.
My understanding is that it is a wrong practice, because:
You have to tailor the number of threads with respect to the machine running the process.
Each thread can only take one type of job.
Would you agree on that? (current solution is wrong)
So, I'd like to use something like Java's ThreadPool instead. I was confronted with an argument claiming that by doing so, slow jobs will take over most of the thread pool, leaving no room for the other jobs. Whereas, with the current solution, a fixed number of threads is assigned to the slow worker and it won't hurt the others.
(Notice that you can't know a-priori if a job will be "slow")
How can a system be both adaptive in the number of threads it uses, but at the same time not be bounded to the most slow job?
You could try measuring the time it takes for each job to complete (with a hand-made Timer class of sorts). Then you normalize this value by dividing it by the maximum time any job has taken. Finally, you multiply this number by a fixed factor that varies depending on how many threads you want running per job per second. This gives the number of threads this job should be using, and you can adjust it accordingly.
Edit: You can set minimum and maximum values that regulate how many threads a job is entitled to. You could alternatively take threads away from a very spacious job when another job enters the system.
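A rough sketch of what that heuristic could look like (every name and the scaling factor are made up for illustration):
// Hypothetical sketch of the heuristic described above: normalize a job's
// duration against the slowest job seen, scale by a fixed factor, then clamp.
private int requestedThreads(long jobMillis, long slowestJobMillis,
                             int threadsPerJobFactor, int minThreads, int maxThreads) {
    double normalized = (double) jobMillis / Math.max(1, slowestJobMillis); // 0..1
    int threads = (int) Math.round(normalized * threadsPerJobFactor);
    return Math.max(minThreads, Math.min(maxThreads, threads)); // clamp to the limits from the edit above
}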
Hope that helps!
It's more of a business problem. Let's say I am a telecom operator. I bar my subscribers from making outgoing calls when they don't clear their dues. When they make payment I clear a flag and in a second the subscriber can make calls. But a lot of other activities go on in my system like usage processing, billing, bill formatting etc.
Now let's assume I have a system wide common pool of threads and I started the billing of 50K subscribers. All my threads are now processing the relatively long running billing jobs and a huge queue is building up.
A poor customer now makes a payment and wants to make an urgent call. But I have no thread left in my pool to clear the flag. The customer has to wait for an hour before he can make the call. That's an SLA breach.
What I should have done is create separate thread pools. If the call-unblocking jobs are not very frequent and are short, I can create a separate pool for them with a core size of maybe 5. For billing jobs I'd rather create a pool with core size 25 and max size 30.
So my system limits won't be exceeded anyway, because I know that even in the worst situation I won't have more than 30 threads.
This will also make debugging easier. If I use a different thread name pattern for each pool and my system has some issue, I can easily take a thread dump and understand whether the billing or the payment work is the culprit.
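A rough sketch of what those two pools could look like (sizes, queue capacities and the factory helper are illustrative, not a recommendation):
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Two dedicated pools with distinct thread name prefixes, so a thread dump
// immediately shows whether billing or payment work is the culprit.
// Note: extra threads beyond the core size are only created once the queue is full.
ThreadPoolExecutor paymentPool = new ThreadPoolExecutor(
        5, 5, 60, TimeUnit.SECONDS,
        new LinkedBlockingQueue<>(1_000), named("payment-"));
ThreadPoolExecutor billingPool = new ThreadPoolExecutor(
        25, 30, 60, TimeUnit.SECONDS,
        new LinkedBlockingQueue<>(10_000), named("billing-"));

private static ThreadFactory named(String prefix) {
    AtomicInteger counter = new AtomicInteger();
    return runnable -> new Thread(runnable, prefix + counter.incrementAndGet());
}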
So, I think the existing design is based on some business use case which you need to thoroughly understand before proposing a solution.
This may be a more general question about how to decide on a thread pool size, but let's use Spring's ThreadPoolTaskExecutor for this case. I have the following configuration for the pool core and max size and the queue capacity. I've already read about what all these settings mean - there is a good answer here.
@SpringBootApplication
@EnableAsync
public class MySpringBootApp {
public static void main(String[] args) {
ApplicationContext ctx = SpringApplication.run(MySpringBootApp.class, args);
}
@Bean
public TaskExecutor taskExecutor() {
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(5);
executor.setMaxPoolSize(10);
executor.setQueueCapacity(25);
return executor;
}
}
The above numbers look random to me and I want to understand how to set them up correctly based on my environment. I will outline the following constraints that I have:
the application will be running on a two-core CPU box
the executor will work on a task which usually takes about 1-2 seconds to finish.
Usually I expect about 800 tasks per minute to be submitted to my executor, spiking at 2,500 per minute
The task will construct some objects and make an HTTP call to Google pubsub.
Ideally I'd like to understand what other constraints I need to consider and based on them what will be a reasonable configuration for my pools and queue sizes.
Update : This answer got a few votes over the years so I'm adding a shortened version for people who don't have the time to read my weird metaphor :
TL;DR answer :
The actual constraint is that a (logical) CPU core can only run a single thread at the same time. Thus:
Core pool size: number of logical cores of your CPUs * 1/(ratio of time your thread is runnable when doing your task)
So, if you have 8 logical cores on your machine, you can safely put 8 threads in your thread pool (well, remember to exclude the other threads that may be in use). Then you need to ask yourself whether you can add more: you need to benchmark the kind of task you intend to run on your thread pool. If you notice the threads are, on average, running only 50% of the time, that means your CPU is free to go work on another thread 50% of the time, and you can add more threads.
Queue size: as many as you can afford to have waiting.
The queue size is the number of items your thread pool will accept before rejecting them. It is business logic. It depends on what behavior you expect: is there any point in accepting a billion tasks? When do you throw in the towel?
If one task takes one second to complete, and you have 10 threads, then the 10,000th task in the queue will hopefully be done in about 1,000 seconds. Is that acceptable?
The worst thing that can happen is clients timing out and re-submitting the same tasks before you could complete the first ones.
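A tiny sketch of applying that rule of thumb (the 0.5 runnable ratio is an assumed value that you would have to measure yourself):
// Sketch: derive the pool size from the number of logical cores and the
// measured fraction of time a task is actually on-CPU (runnable).
int logicalCores = Runtime.getRuntime().availableProcessors();
double runnableRatio = 0.5; // assumption: threads are on-CPU about half the time; measure this
int poolSize = (int) Math.ceil(logicalCores * (1.0 / runnableRatio));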
Original ELI12 answer :
It may not be the most accurate answer, but I'll try :
A simple approach is to be aware that your 2-core CPU will only work on two threads at the same time.
If you have relatively modern Intel CPU, and you have Hyper Threading (aka. HT(TM), HTT(TM), SMT) turned on (via setting in BIOS), your operating system will see the number of available cores as double the number of the physical cores within your CPU.
Either way, to detect from Java how many cores (or simultaneous, non-preempting threads) you can work with, just call int cores = Runtime.getRuntime().availableProcessors();
If you try to see your application as a Workshop (an actual one) :
A processor would be represented by an employee. It is the physical unit that will add value to a product.
A task would be a lump of raw material (plus some instructions list)
Your thread is a desk on which the employee can put the task on and work.
The queue size is the length of the conveyor belt that brings the raw materials to the desk.
Thus, your question becomes "How can I choose how many desks and how long can my conveyor belt be inside my factory, given an unchanging number of employees ?".
For the how many desks (Threads) part :
An employee can only work at one desk at a time, and you can only have a single employee per desk. Thus, the basic setup would be to have at least as many desks as you have employees (to avoid having any employee (processor) left without any possibility to work).
But, depending on your activity, you may afford more desks per employee :
If your employees are expected to put mail inside envelopes constantly, an operation that requires their full attention (in programming: sorting collections, creating objects, incrementing counters), having more desks wouldn't help, and may even be detrimental, because your employees would sometimes have to change desks (context switching, which takes some time), leaving the one they were working on to make progress on the other.
But, if your task is making pottery and relies on your employee waiting for the clay to cook in an oven (understand: getting access to an external resource, such as a file system, a web service, etc.), your employee can afford to go model clay at another desk and get back to the first one later.
Thus, you can afford more desks per employee as long as your tasks have a big enough work/wait ratio (running/waiting). The number of extra desks corresponds to how many tasks your employee can make progress on during the waiting time.
For the conveyor belt (queue) size part :
The queue size represents how many items you allow to be queued before starting to reject any new tasks (by throwing an exception) - the threshold at which you start to say "ok, I'm already overbooked and won't ever be able to comply".
First, I'd say your conveyor belt needs to fit inside the workshop, meaning the collection should be small enough to prevent out-of-memory errors (obviously).
After that, it is based on your company policy. Let's assume a task is added to the belt every time a client makes an order (another service calls your API). If the caller doesn't care how much time you take to comply and trusts you with the execution, there's no point in limiting the size of the belt.
But if you can expect that your client gets annoyed after waiting a month for their pottery, leaves you for a competitor or orders another pottery, assuming the first order was lost, and never bothers to check whether the first order was completed... that first order was done for nothing, you won't get paid, and if your client places another order whenever you're too slow to comply, you'll enter a feedback loop because every new order slows down the whole process.
Thus, in that case, you should put up a sign telling your client "sorry, we're overbooked, you shouldn't make any new order now, as we won't be able to comply within an acceptable time range".
Then, the queue size would be : acceptable time range / time to complete a task.
Concrete example: if your client service expects that the tasks it submits will be completed in less than 100 seconds, and knowing that every task takes 1-2 seconds, you should limit the queue to 50-100 tasks, because once you have 100 tasks waiting in the queue, you're pretty sure the next one won't be completed in less than 100 seconds, so you reject the task to prevent the client from waiting for nothing.
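Transposed to the executor from the question, that reasoning could give something like the following; the concrete numbers are illustrative and should be benchmarked, not copied:
@Bean
public TaskExecutor taskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    // The 1-2 s tasks are mostly waiting on the pubsub HTTP call, so far more
    // threads than the 2 cores are reasonable; the exact figures need benchmarking.
    executor.setCorePoolSize(20);
    executor.setMaxPoolSize(40);
    // Roughly: (acceptable wait / time per task) * threads; reject beyond that
    // instead of letting callers time out while their task is still queued.
    executor.setQueueCapacity(1000);
    executor.setThreadNamePrefix("pubsub-");
    return executor;
}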
I have a List<Object> objectsToProcess. Let's say it contains 1,000,000 items. You then process each item like this:
for (Object object : objectsToProcess) {
    // go to database, retrieve data
    // process
    // save data
}
My question is: would multi-threading improve performance? I would have thought that multiple threads are allocated by the processor by default anyway?
In the described scenario, given that process is a time-consuming task, and given that the CPU has more than one core, multi-threading will indeed improve the performance.
The processor is not the one that allocates threads. The processor provides the resources (virtual CPUs / virtual processors) that threads can use, by offering more than one execution unit / execution context. Programs need to create multiple threads themselves in order to utilize multiple CPU cores at the same time.
The two major reasons for multi-threading are:
Making use of multiple CPU cores which would otherwise be unused or at least not contribute to reducing the time it takes to solve a given problem - if the problem can be divided into subproblems which can be processed independently of each other (parallelization possible).
Making the program act and react on multiple things at the same time (i.e. Event Thread vs. Swing Worker).
There are programming languages and execution environments in which threads are created automatically to process problems that can be parallelized. Java is not (yet) one of them, but since Java 8 it is well on the way, and Java 9 may bring even more.
Usually you do not want significantly more threads than the CPU provides cores, for the simple reason that thread switching and thread synchronization are overhead that slows things down.
The package java.util.concurrent provides many classes that help with typical multithreading problems. What you want is an ExecutorService to which you assign the tasks that should be run and completed in parallel. The class Executors provides factory methods for creating popular types of ExecutorService. If your problem just needs to be solved in parallel, you might go for Executors.newCachedThreadPool(). If your problem is urgent, you might go for Executors.newWorkStealingPool().
Your code thus could look like this:
final ExecutorService service = Executors.newWorkStealingPool();
for (final Object object : objectsToProcess) {
    service.submit(() -> {
        // go to database, retrieve data
        // process
        // save data
    });
}
Please note that the sequence in which the objects would be processed is no longer guaranteed if you go for this approach of multithreading.
If your objectsToProcess are something which can provide a parallel stream, you could also do this:
objectsToProcess.parallelStream().forEach(object -> {
    // go to database, retrieve data
    // process
    // save data
});
This will leave the decisions about how to handle the threads to the VM, which often will be better than implementing the multi-threading ourselves.
Further reading:
http://docs.oracle.com/javase/tutorial/collections/streams/parallelism.html#executing_streams_in_parallel
http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/package-summary.html
Depends on where the time is spent.
If you have a load of calculations to do then allocating work to more threads can help, as you say each thread may execute on a separate CPU. In such a situation there is no value in having more threads than CPUs. As Corbin says you have to figure out how to split the work across the threads and have responsibility for starting the threads, waiting for completion and aggregating the results.
If, as in your case, you are waiting for a database, then there can be additional value in using threads. A database can serve several requests in parallel (the database server itself is multi-threaded), so instead of coding
for (Object object : objectsToProcess) {
    // go to database, retrieve data
    // process
    // save data
}
Where you wait for each response before issuing the next, you want to have several worker threads each performing
Go to database retrieve data.
process
save data
Then you get better throughput. The trick though is not to have too many worker threads. Several reasons for that:
Each thread uses some resources: it has its own stack and its own connection to the database. You would not want 10,000 such threads.
Each request uses resources on the server, each connection uses memory, and a database server will only serve so many requests in parallel. There is no benefit in submitting thousands of simultaneous requests if it can only serve tens of them in parallel. Also, if the database is shared, you probably don't want to saturate it with your requests; you need to be a "good citizen".
Net: you will almost certainly get benefit by having a number of worker threads. The number of threads that helps will be determined by factors such as the number of CPUs you have and the ratio between the amount of processing you do and the response time from the DB. You can only really determine that by experiment, so make the number of threads configurable and investigate. Start with say 5, then 10. Keep your eye on the load on the DB as you increase the number of threads.
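A small sketch of that suggestion, with the worker count made configurable (the property name is invented for illustration):
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: a configurable number of worker threads, each doing
// "retrieve, process, save" for the objects it is handed.
int workers = Integer.getInteger("app.dbWorkerThreads", 5); // hypothetical property; start around 5, then tune
ExecutorService dbWorkers = Executors.newFixedThreadPool(workers);
for (Object object : objectsToProcess) {
    dbWorkers.submit(() -> {
        // go to database, retrieve data for this object
        // process
        // save data
    });
}
dbWorkers.shutdown(); // no new tasks accepted; already-submitted ones still run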
This question is about the fallouts of using SingleThreadExecutor (JDK 1.6). Related questions have been asked and answered in this forum before, but I believe the situation I am facing, is a bit different.
Various components of the application (let's call the components C1, C2, C3 etc.) generate (outbound) messages, mostly in response to messages (inbound) that they receive from other components. These outbound messages are kept in queues which are usually ArrayBlockingQueue instances - fairly standard practice perhaps. However, the outbound messages must be processed in the order they are added. I guess use of a SingleThreadExector is the obvious answer here. We end up having a 1:1 situation - one SingleThreadExecutor for one queue (which is dedicated to messages emanating from one component).
Now, the number of components (C1,C2,C3...) is unknown at a given moment. They will come into existence depending on the need of the users (and will be eventually disposed of too). We are talking about 200-300 such components at the peak load. Following the 1:1 design principle stated above, we are going to arrange for 200 SingleThreadExecutors. This is the source of my query here.
I am uncomfortable with the thought of having to create so many SingleThreadExecutors. I would rather try and use a pool of SingleThreadExecutors, if that makes sense and is plausible (any ready-made, seen-before classes/patterns?). I have read many posts on recommended use of SingleThreadExecutor here, but what about a pool of the same?
What do learned women and men here think? I would like to be directed, corrected or simply, admonished :-).
If your requirement is that the messages be processed in the order that they're posted, then you want one and only one SingleThreadExecutor. If you have multiple executors, then messages will be processed out-of-order across the set of executors.
If messages need only be processed in the order that they're received for a single producer, then it makes sense to have one executor per producer. If you try pooling executors, then you're going to have to put a lot of work into ensuring affinity between producer and executor.
Since you indicate that your producers will have defined lifetimes, one thing that you have to ensure is that you properly shut down your executors when they're done.
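A minimal sketch of that per-producer arrangement, including the shutdown (all names are illustrative):
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// One single-threaded executor per producer keeps each producer's outbound
// messages in order; the executor is shut down when the producer is disposed of.
private final Map<String, ExecutorService> executorsByProducer = new ConcurrentHashMap<>();

void submit(String producerId, Runnable outboundMessageTask) {
    executorsByProducer
            .computeIfAbsent(producerId, id -> Executors.newSingleThreadExecutor())
            .submit(outboundMessageTask);
}

void producerDisposed(String producerId) {
    ExecutorService executor = executorsByProducer.remove(producerId);
    if (executor != null) {
        executor.shutdown(); // already-queued messages still get processed
    }
}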
Messaging and batch jobs is something that has been solved time and time again. I suggest not attempting to solve it again. Instead, look into Quartz, which maintains thread pools, persisting tasks in a database etc. Or, maybe even better look into JMS/ActiveMQ. But, at the very least look into Quartz, if you have not already. Oh, and Spring makes working with Quartz so much easier...
I don't see any problem there. Essentially you have independent queues and each has to be drained sequentially, one thread for each is a natural design. Anything else you can come up with are essentially the same. As an example, when Java NIO first came out, frameworks were written trying to take advantage of it and get away from the thread-per-request model. In the end some authors admitted that to provide a good programming model they are just reimplementing threading all over again.
It's impossible to say whether 300 or even 3,000 threads will cause any issues without knowing more about your application. I strongly recommend that you profile your application before adding more complexity.
The first thing you should check is that the number of concurrently running threads is not much higher than the number of cores available to run them. The more active threads you have, the more time is wasted managing those threads (context switches are expensive) and the less work gets done.
The easiest way to limit the number of running threads is to use a semaphore: acquire it before starting work and release it after the work is done.
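A minimal sketch of the semaphore approach (the permit count here simply mirrors the number of available cores):
import java.util.concurrent.Semaphore;

// Sketch: no matter how many threads exist, at most 'permits' tasks make progress at once.
private final Semaphore running = new Semaphore(Runtime.getRuntime().availableProcessors());

void runLimited(Runnable work) throws InterruptedException {
    running.acquire();        // wait for a free slot
    try {
        work.run();
    } finally {
        running.release();    // always free the slot, even if the task throws
    }
}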
Unfortunately, limiting the number of running threads may not be enough. While it may help, the overhead may still be too great if the time spent per context switch is a major part of the total cost of one unit of work. In this scenario, often the most efficient way is to have a fixed number of queues: each component gets a queue from a global pool of queues when it initializes, using an algorithm such as round-robin for queue selection.
If you are in one of those unfortunate cases where the most obvious solutions do not work, I would start with something relatively simple: one thread pool, one concurrent queue, a lock, a list of queues, and a temporary queue for each thread in the pool.
Posting work to the queue is simple: add the payload and the identity of the producer.
Processing is relatively straightforward as well. First you get the next item from the queue. Then you acquire the lock. While you hold the lock, you check whether any other thread is running a task for the same producer. If not, you register the current thread by adding a temporary queue to the list of queues; otherwise you add the task to the existing temporary queue. Finally you release the lock. Now you either run the task or poll for the next one and start over, depending on whether the current thread was registered to run tasks. After running a task, you take the lock again and check whether there is more work to be done in the temporary queue. If not, you remove the queue from the list; otherwise you take the next task. Finally you release the lock. Again, you choose whether to run the task or to start over.
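A simplified sketch of the same idea, serializing tasks per producer on one shared pool; it uses a per-producer queue drained by at most one pool thread at a time, rather than the exact lock-and-list bookkeeping described above:
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Tasks for one producer run in submission order, while all producers share one pool.
class KeySerializedExecutor {
    private final ExecutorService pool = Executors.newFixedThreadPool(8); // shared pool, size illustrative
    private final Map<String, Queue<Runnable>> queues = new HashMap<>();

    synchronized void submit(String producerId, Runnable task) {
        Queue<Runnable> queue = queues.get(producerId);
        if (queue == null) {                          // no drain running for this producer yet
            queue = new ArrayDeque<>();
            queues.put(producerId, queue);
            queue.add(task);
            pool.submit(() -> drain(producerId));     // schedule exactly one drain per active producer
        } else {
            queue.add(task);                          // a drain is already scheduled; just enqueue
        }
    }

    private void drain(String producerId) {
        while (true) {
            Runnable next;
            synchronized (this) {
                Queue<Runnable> queue = queues.get(producerId);
                next = queue.poll();
                if (next == null) {                   // nothing left: unregister this producer and stop
                    queues.remove(producerId);
                    return;
                }
            }
            next.run();                               // run outside the lock to keep submissions flowing
        }
    }
}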