I have a sort of complex problem like below.
- we have a real time system with large number threads requirement. In order to optimize the performance, we are thinking of following design.
create a thread pool executor with max number of threads
each thread is used to create scheduled executor service.
now the tasks are being assigned to these executor services evenly based on load
BUT the biggest problem is, if one of the task in the queue contains a sleep (for few secs), it blocks the corresponding Schedule executor service thread for that duration and subsequently all the following tasks in that queue.
In this regard, please suggest me how to suspend the execution of the task with sleep OR overriding the sleep somehow and rejoin/schedule the task again to the queue.
Thanks in advance
Seshu
Assuming I understand your question, your Schedule Executor service threads have a deadline requirement, but the actual workers can sleep for an unknown length of time, possibly throwing off the timing of the Schedule Executors. From your description I'm guessing what you want is for a task that needs to sleep to actually stop, save progress information and then requeue itself for the remainder of the work to be rescheduled at some future time. You'd have to build this into your application architecture.
Alternatively, you could have the scheduler threads launch the worker tasks in their own separate threads, letting them sleep as necessary, with one scheduler thread collecting all the worker terminations.
To get a better answer you're going to have to provide more information about what you're trying to accomplish.
Tasks which sleep are inherently unfriendly for running in any kind of bounded thread pool. The sleep is explicitly telling the thread that it must do nothing for a period of time.
If possible, split the task into 2 (or more parts), eliminating the sleep completely. Get the first half-task to schedule the second task with an appropriate delay.
Failing that, you could consider increasing the size of your thread pool somewhat - either setting a much larger cap to its size, or possibly even eliminating the cap altogether (not recommended for a server than might end up with many clients).
Alternatively, move the tasks with sleep statements in them into their own Scheduled executor. Then, they'll delay each other, but better-behaved tasks, with no wait statements in them, will get preferential treatment.
Related
In my Java application I have a Runnable such as:
this.runner = new Runnable({
#Override
public void run() {
// do something that takes roughly 5 seconds.
}
});
I need to run this roughly every 30 seconds (although this can vary) in a separate thread. The nature of the code is such that I can run it and forget about it (whether it succeeds or fails). I do this as follows as a single line of code in my application:
(new Thread(this.runner)).start()
Now, this works fine. However, I'm wondering if there is any sort of cleanup I should be doing on each of the thread instances after they finish running? I am doing CPU profiling of this application in VisualVM and I can see that, over the course of 1 hour runtime, a lot of threads are being created. Is this concern valid or is everything OK?
N.B. The reason I start a new Thread instead of simply defining this.runner as a Thread, is that I sometimes need to run this.runner twice simultaneously (before the first run call has finished), and I can't do that if I defined this.runner as a Thread since a single Thread object can only be run again once the initial execution has finished.
Java objects that need to be "cleaned up" or "closed" after use conventionally implement the AutoCloseable interface. This makes it easy to do the clean up using try-with-resources. The Thread class does not implement AutoCloseable, and has no "close" or "dispose" method. So, you do not need to do any explicit clean up.
However
(new Thread(this.runner)).start()
is not guaranteed to immediately start computation of the Runnable. You might not care whether it succeeds or fails, but I guess you do care whether it runs at all. And you might want to limit the number of these tasks running concurrently. You might want only one to run at once, for example. So you might want to join() the thread (or, perhaps, join with a timeout). Joining the thread will ensure that the thread will completes its computation. Joining the thread with a timeout increases the chance that the thread starts its computation (because the current thread will be suspended, freeing a CPU that might run the other thread).
However, creating multiple threads to perform regular or frequent tasks is not recommended. You should instead submit tasks to a thread pool. That will enable you to control the maximum amount of concurrency, and can provide you with other benefits (such as prioritising different tasks), and amortises the expense of creating threads.
You can configure a thread pool to use a fixed length (bounded) task queue and to cause submitting threads to execute submitted tasks itself themselves when the queue is full. By doing that you can guarantee that tasks submitted to the thread pool are (eventually) executed. The documentation of ThreadPool.execute(Runnable) says it
Executes the given task sometime in the future
which suggests that the implementation guarantees that it will eventually run all submitted tasks even if you do not do those specific tasks to ensure submitted tasks are executed.
I recommend you to look at the Concurrency API. There are numerous pre-defined methods for general use. By using ExecutorService you can call the shutdown method after submitting tasks to the executor which stops accepting new tasks, waits for previously submitted tasks to execute, and then terminates the executor.
For a short introduction:
https://www.baeldung.com/java-executor-service-tutorial
I'm trying to set up a job that will run every x minutes/seconds/milliseconds/whatever and poll an Amazon SQS queue for messages to process. My question is what the best approach would be for this. Should I create a ScheduledThreadPoolExecutor with x number of threads and schedule a single task with scheduleAtFixedRate method and just run it very often (like 10 ms) so that multiple threads will be used when needed, or, as I am proposing to colleagues, create a ScheduledThreadPoolExecutor with x number of threads and then create multiple scheduled tasks at slightly offset intervals but running less often. This to me sounds like how the STPE was meant to be used.
Typically I use Spring/Quartz for this type of thing but that's out of at this point.
So what are your thoughts?
I recommend that you use long polling on SQS, which makes your ReceiveMessage calls behave more like calls to take on a BlockingQueue (which means that you won't need to use a scheduled task to poll from the queue - you just need a single thread that polls in an infinite loop, retrying if the connection times out)
Well it depends on the frequency of tasks. If you just have to poll on timely interval and the interval is not very small, then ScheduledThreadPoolExecutor with scheduleAtFixedRate is a good alternative.
Else I will recommend using netty's HashedWheelTimer. Under heavy tasks it gives the best performance. Akka and play uses this for scheduling. This is because STPE for every task adding takes O(log(n)) where as HWT takes O(1).
If you have to use STPE, I will recommend one task at a rate else it results in excess resource.
Long Polling is like a blocking queue only for a max of 20 seconds after which the call returns. Long polling is sufficient if that is the max delay required between poll cycles. Beyond that you will need a scheduledExector.
The number of threads really depends on how fast you can process the received messages. If you can process the message really fast you need only a single thread. I have a setup as follows
SingleThreadScheduledExecutor with scheduleWithFixedDelay executes 5 mins after the previous completion
In each execution messages are retrieved in batch from SQS till there are no more messages to process (remember each batch receive a max of 10 messages).
The messages are processed and then deleted from queue.
For my scenario single thread is sufficient. If the backlog is increasing (for example, a network operation is required for each message which may involve waits), you might want to use multiple threads. If one processing node become resource constrained you could always start another instance (EC2 perhaps) to add more capacity.
The docs for ScheduledThreadPoolExecutor says that -
Tasks scheduled for exactly the same execution time are enabled in first-in-first-out (FIFO) order of submission.
Does this mean that the tasks which SHOULD be done at the same time are never done at the same time. Instead they are executed in FIFO order ?
If that is true then which class do I use which is better than Timer and also does not have this FIFO problem ?
The way a ScheduledThreadPoolExecutor works is there is a single "scheduling" or master thread which checks for tasks to execute.
If it finds a task, it delegates it to a "worker" thread from the pool.
If multiple tasks are ready to be executed, they are "kicked off" one at a time, though once "kicked off", subsequent processing is concurrent, per Java's definition.
If you have two tasks that are both scheduled through the executor for the same time, the order in which they complete could vary from run to run and unless you put in specific controls such as locks, waits, etc... to handle this, it's up to java's thread scheduling (how java allots time to threads on a core) to determine how and when what gets processed. Please note that setting up such locks, waits, etc... is a deceptively complex task prone to race conditions leading to unexpected deadlocks, live locks, etc...
It depends on the size of your thread pool. If you schedule 1000 tasks to fire at midnight, and you only have 25 threads, then only 25 can be executed initially, while the rest must wait for available threads. FIFO here refers to the order in which the executor will hand tasks off to the execution threads.
Please note that the docs talk about "enabling" the tasks and that we are talking about a threadpool executor. :-)
That means the tasks will wait until the designated time, then they are treated as if put into a normal ThreadPoolExecutor. If there are enough threads available in the pool all these tasks will be run in parallel.
Only if you have more tasks becoming active than available threads in the pool some tasks will have to wait.
I have a number of tasks that I would like to execute periodically at different rates for most tasks. Some of the tasks may be scheduled for simultaneous execution though. Also, a task may need to start executing while another is currently executing.
I would also like to customize each task by setting an object for it, on which the task will operate while it is being executed.
Usually, the tasks will execute in periods of 2 to 30 minutes and will take around 4-5 seconds, sometimes up to 30 seconds when they are executed.
I've found Executors.newSingleThreadedScheduledExecutor(ThreadFactory) to be almost exactly what I want, except that it might cause me problems if a new task happens to be scheduled for execution while another is already executing. This is due to the fact that the Executor is backed up by a single execution thread.
The alternative is to use Executors.newScheduledThreadPool(corePoolSize, ThreadFactory), but this requires me to create a number of threads in a pool. I would like to avoid creating threads until it is necessary, for instance if I have two or more tasks that happen to need parallell executing due to their colliding execution schedules.
For the case above, the Executors.newCachedThreadPool(ThreadFactory) appears to do what I want, but then I can't schedule my tasks. A combination of both cached and scheduled executors would be best I think, but I am unable to find something like that in Java.
What would be the best way to implement the above do you think?
Isn't ScheduledThreadPoolExecutor.ScheduledThreadPoolExecutor(int):
ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(0);
what you need? 0 is the corePoolSize:
corePoolSize - the number of threads to keep in the pool, even if they are idle, unless allowCoreThreadTimeOut is set
I guess you will not able to do that with ScheduledExecutor, because it uses DelayedWorkQueue where as newCachedThreadPool uses ThreadPoolExecutor SynchronousQueue as a work queue.
So you can not change implementation of ScheduledThreadPoolExecutor to act like that.
I'd like to have a ScheduledThreadPoolExecutor which also stops the last thread if there is no work to do, and creates (and keeps threads alive for some time) if there are new tasks. But once there is no more work to do, it should again discard all threads.
I naivly created it as new ScheduledThreadPoolExecutor(0) but as a consequence, no thread is ever created, nor any scheduled task is ever executed.
Can anybody tell me if I can achieve my goal without writing my own wrapper around the ScheduledThreadpoolExecutor?
Thanks in advance!
Actually you can do it, but its non-obvious:
Create a new ScheduledThreadPoolExecutor
In the constructor set the core threads to the maximum number of threads you want
set the keepAliveTime of the executor
and at last, allow the core threads to timeout
m_Executor = new ScheduledThreadPoolExecutor ( 16,null );
m_Executor.setKeepAliveTime ( 5, TimeUnit.SECONDS );
m_Executor.allowCoreThreadTimeOut ( true );
This works only with Java 6 though
I suspect that nothing provided in java.util.concurrent will do this for you, just because if you need a scheduled execution service, then you often have recurring tasks to perform. If you have a recurring task, then it usually makes more sense to just keep the same thread around and use it for the next recurrence of the task, rather than tearing down your thread and having to build a new one at the next recurrence.
Of course, a scheduled executor could be used for inserting delays between non-recurring tasks, or it could be used in cases where resources are so scarce and recurrence is so infrequent that it makes sense to tear down all your threads until new work arrives. So, I can see cases where your proposal would definitely make sense.
To implement this, I would consider trying to wrap a cached thread pool from Executors.newCachedThreadPool together with a single-threaded scheduled executor service (i.e. new ScheduledThreadPoolExecutor(1)). Tasks could be scheduled via the scheduled executor service, but the scheduled tasks would be wrapped in such a way that rather than having your single-threaded scheduled executor execute them, the single-threaded executor would hand them over to the cached thread pool for actual execution.
That compromise would give you a maximum of one thread running when there is absolutely no work to do, and it would give you as many threads as you need (within the limits of your system, of course) when there is lots of work to do.
Reading the ThreadPoolExecutor javadocs might suggest that Alex V's solution is okay. However, doing so will result in unnecessarily creating and destroying threads, nothing like a cashed thread-pool. The ScheduledThreadPool is not designed to work with a variable number of threads. Having looked at the source, I'm sure you'll end up spawning a new thread almost every time you submit a task. Joe's solution should work even if you are ONLY submitting delayed tasks.
PS. I'd monitor your threads to make sure your not wasting resources in your current implementation.
This problem is a known bug in ScheduledThreadPoolExecutor (Bug ID 7091003) and has been fixed in Java 7u4. Though looking at the patch, the fix is that "at least one thread is started even if corePoolSize is 0."