In my application I am calling a third-party vendor web service. I need to delay my thread's processing to stay within the throughput the vendor's web service supports.
I have two options:
1. Use Thread.sleep
2. Use ScheduledThreadPoolExecutor, as mentioned in the post "How to start a thread after specified time delay in java"
I want to know which is the better option, as we are sending time-critical information (text messages) through the vendor's web service.
Any help is appreciated.
They're pretty much the same for this purpose, since ScheduledThreadPoolExecutor.scheduleWithFixedDelay handles the waiting between runs for you.
Since the delay is 100 ms, the performance difference is negligible. I'd go with ScheduledThreadPoolExecutor.scheduleWithFixedDelay because of the pooled threads. The load on the system stays manageable, since you don't have multiple threads waking up from sleep at the same time and competing for resources.
Also, from the documentation:
Thread pools address two different problems: they usually provide
improved performance when executing large numbers of asynchronous
tasks, due to reduced per-task invocation overhead, and they provide a
means of bounding and managing the resources, including threads,
consumed when executing a collection of tasks. Each ThreadPoolExecutor
also maintains some basic statistics, such as the number of completed
tasks.
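A minimal sketch of the scheduleWithFixedDelay approach, assuming the vendor call can be represented as a Consumer<String>. The class name ThrottledSender, the method sendAll, and the timeout parameter are hypothetical names chosen for illustration; they are not part of the original question.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class ThrottledSender {

    /**
     * Passes each message to sender, waiting delayMillis after one call
     * finishes before the next one starts. Returns true if every message
     * was handed to the sender before timeoutMillis elapsed.
     */
    static boolean sendAll(List<String> messages, Consumer<String> sender,
                           long delayMillis, long timeoutMillis) {
        Queue<String> pending = new ConcurrentLinkedQueue<>(messages);
        CountDownLatch done = new CountDownLatch(messages.size());
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            scheduler.scheduleWithFixedDelay(() -> {
                String next = pending.poll();
                if (next == null) {
                    return; // nothing left to send on this tick
                }
                try {
                    sender.accept(next); // stand-in for the vendor web service call
                } catch (RuntimeException e) {
                    // An uncaught exception would cancel all later runs, so handle it here.
                    System.err.println("send failed: " + e.getMessage());
                } finally {
                    done.countDown();
                }
            }, 0, delayMillis, TimeUnit.MILLISECONDS);
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        List<String> sent = new CopyOnWriteArrayList<>();
        boolean ok = sendAll(List.of("a", "b", "c"), sent::add, 100, 5_000);
        System.out.println(ok + " " + sent);
    }
}
```

With fixed delay, the 100 ms gap is measured from the end of one call to the start of the next, so a slow vendor response lowers the send rate rather than causing calls to pile up.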
Use the scheduler; it lets you choose between fixed-rate and fixed-delay execution.
Look at the source code:
/**
* Period in nanoseconds for repeating tasks. A positive
* value indicates fixed-rate execution. A negative value
* indicates fixed-delay execution. A value of 0 indicates a
* non-repeating task.
*/
private final long period;
I am trying to implement a divide-and-conquer solution for some large data. I use fork and join to break things down into threads. However, I have a question regarding the fork mechanism. Suppose I set my divide-and-conquer condition as:
@Override
protected SomeClass compute() {
    if (list.size() < LIMIT) {
        // Do something here
        ...
    } else {
        // Divide the list in half and invoke the sub-tasks
        int mid = list.size() / 2;
        SomeRecursiveTaskClass subWorker1 = new SomeRecursiveTaskClass(list.subList(0, mid));
        SomeRecursiveTaskClass subWorker2 = new SomeRecursiveTaskClass(list.subList(mid, list.size()));
        invokeAll(subWorker1, subWorker2);
        ...
    }
}
What will happen if there are not enough resources to invoke a subWorker (e.g. not enough threads in the pool)? Does the Fork/Join framework maintain a pool size for available threads? Or should I add this condition to my divide-and-conquer logic?
Each ForkJoinPool has a configured target parallelism. This does not exactly match the number of threads: for example, if a worker thread is going to wait via a ManagedBlocker, the pool may start additional threads to compensate. The parallelism of the commonPool defaults to “number of CPU cores minus one”, so when the initiating non-pool thread is counted as a helper, the resulting parallelism utilizes all CPU cores.
When you submit more jobs than there are threads, they will be enqueued. Enqueuing a few jobs can help keep the threads busy, as not all jobs take exactly the same time, so threads running out of work may steal jobs from other threads; but splitting the work too finely may create unnecessary overhead.
Therefore, you may use ForkJoinTask.getSurplusQueuedTaskCount() to get the current number of pending jobs that are unlikely to be stolen by other threads and split only when it is below a small threshold. As its documentation states:
This value may be useful for heuristic decisions about whether to fork other tasks. In many usages of ForkJoinTasks, at steady state, each worker should aim to maintain a small constant surplus (for example, 3) of tasks, and to process computations locally if this threshold is exceeded.
So this is the condition to decide whether to split your jobs further. Since this number reflects when idle threads steal your created jobs, it will cause balancing when the jobs have different CPU load. Also, it works the other way round, if the pool is shared (like the common pool) and threads are already busy, they will not pick up your jobs, the surplus count will stay high and you will automatically stop splitting then.
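As a rough illustration of this heuristic, here is a sketch of a RecursiveTask that sums an array and stops splitting once the surplus count exceeds 3. The class name SurplusSum and the MIN_CHUNK cutoff are assumptions made for the example; only getSurplusQueuedTaskCount() and the threshold of 3 come from the quoted documentation.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class SurplusSum extends RecursiveTask<Long> {
    private static final int MIN_CHUNK = 1_000; // assumed: below this, splitting is not worthwhile
    private static final int MAX_SURPLUS = 3;   // example value from the ForkJoinTask documentation

    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    SurplusSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        int size = to - from;
        // Process locally when the chunk is small or when this worker already
        // holds more queued tasks than other threads are likely to steal.
        if (size <= MIN_CHUNK || getSurplusQueuedTaskCount() > MAX_SURPLUS) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = from + size / 2;
        SurplusSum left = new SurplusSum(data, from, mid);
        SurplusSum right = new SurplusSum(data, mid, to);
        left.fork();                     // make the left half available for stealing
        long rightSum = right.compute(); // work on the right half in this thread
        return left.join() + rightSum;
    }

    static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new SurplusSum(data, 0, data.length));
    }
}
```

Calling SurplusSum.sum(array) runs the task in the common pool; when the pool is busy with other work, the surplus stays high and the task falls back to summing sequentially.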
I'm working on a redelivery system. The system attempts to execute an action; if the action fails, it tries again up to two more times at five-minute intervals. I use an ExecutorService to perform the first execution and a ScheduledExecutorService to schedule the retries, depending on the result (failure).
What should I consider when working out how many threads I need? At the moment I use a single-thread model (created by the newSingleThreadScheduledExecutor method).
Without knowing the load on your system, the environment it runs in, and how long it takes to process one message, it is hard to say how many threads you need. However, you can keep the following basic principles in mind:
Having too many threads is bad, because you spend a significant amount of time on context switches, and the chance of starvation and of wasting system resources is higher.
Each thread consumes memory for its stack. On a 64-bit JVM the default is typically 1 MB per thread.
I would probably create 2 thread pools (one scheduled, one non-scheduled) for both sending and redelivery and test them under high load varying number of threads from 2 to 10 to see which number suits best.
You should only need the one thread as only one action is running at a time. You could use a CachedThreadPool and not worry about it.
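To show why few threads are often enough here, the following sketch runs the first attempt and the retries on one scheduler. The waiting between attempts is handled by the scheduler's queue, so no thread is blocked for five minutes. The class Redelivery, the method deliver, and the use of BooleanSupplier for the action are hypothetical choices for illustration, not the asker's actual code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

class Redelivery {

    /**
     * Runs action once and, while it reports failure, up to maxRetries more
     * times with the given delay between attempts. The returned future
     * completes with true on success and false when all attempts failed.
     */
    static CompletableFuture<Boolean> deliver(ScheduledExecutorService scheduler,
                                              BooleanSupplier action,
                                              int maxRetries, long delay, TimeUnit unit) {
        CompletableFuture<Boolean> result = new CompletableFuture<>();
        scheduler.execute(() -> attempt(scheduler, action, maxRetries, delay, unit, result));
        return result;
    }

    private static void attempt(ScheduledExecutorService scheduler, BooleanSupplier action,
                                int retriesLeft, long delay, TimeUnit unit,
                                CompletableFuture<Boolean> result) {
        boolean succeeded;
        try {
            succeeded = action.getAsBoolean();
        } catch (RuntimeException e) {
            succeeded = false; // treat an exception as a failed attempt
        }
        if (succeeded) {
            result.complete(true);
        } else if (retriesLeft == 0) {
            result.complete(false);
        } else {
            // Re-queue the next attempt instead of sleeping in this thread.
            scheduler.schedule(
                    () -> attempt(scheduler, action, retriesLeft - 1, delay, unit, result),
                    delay, unit);
        }
    }
}
```

For the scenario in the question, a call might look like deliver(scheduler, action, 2, 5, TimeUnit.MINUTES). How many threads the scheduler needs then depends mainly on how long each attempt runs and how many attempts become due at the same time, which is best measured under load.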
I'm trying to set up a job that will run every x minutes/seconds/milliseconds/whatever and poll an Amazon SQS queue for messages to process. My question is what the best approach would be for this. Should I create a ScheduledThreadPoolExecutor with x number of threads and schedule a single task with scheduleAtFixedRate method and just run it very often (like 10 ms) so that multiple threads will be used when needed, or, as I am proposing to colleagues, create a ScheduledThreadPoolExecutor with x number of threads and then create multiple scheduled tasks at slightly offset intervals but running less often. This to me sounds like how the STPE was meant to be used.
Typically I use Spring/Quartz for this type of thing, but that's not an option at this point.
So what are your thoughts?
I recommend that you use long polling on SQS, which makes your ReceiveMessage calls behave more like calls to take() on a BlockingQueue. That means you won't need a scheduled task to poll the queue; you just need a single thread that polls in an infinite loop, retrying if the connection times out.
Well, it depends on the frequency of the tasks. If you just have to poll at a regular interval and the interval is not very small, then ScheduledThreadPoolExecutor with scheduleAtFixedRate is a good choice.
Otherwise I would recommend Netty's HashedWheelTimer, which tends to perform better under a heavy task load. Akka and Play use it for scheduling. The reason is that adding a task to an STPE takes O(log n), whereas with HWT it takes O(1).
If you have to use STPE, I would recommend a single task at a given rate; otherwise you use more resources than necessary.
Long polling behaves like a blocking queue only for a maximum of 20 seconds, after which the call returns. Long polling is sufficient if that is the longest delay you need between poll cycles. Beyond that you will need a scheduled executor.
The number of threads really depends on how fast you can process the received messages. If you can process the messages really fast, you need only a single thread. I have a setup as follows:
A SingleThreadScheduledExecutor with scheduleWithFixedDelay executes 5 minutes after the previous completion.
In each execution, messages are retrieved in batches from SQS until there are no more messages to process (remember that each batch receives at most 10 messages).
The messages are processed and then deleted from the queue.
For my scenario a single thread is sufficient. If the backlog is increasing (for example, because each message requires a network operation that may involve waits), you might want to use multiple threads. If one processing node becomes resource constrained, you could always start another instance (EC2 perhaps) to add more capacity.
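The setup described above could be sketched roughly as follows. To keep the example self-contained it does not use the AWS SDK: MessageSource is a hypothetical interface standing in for the SQS receive and delete calls, and QueueDrainer, drain, and start are names invented for this sketch.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class QueueDrainer {

    /** Hypothetical stand-in for the queue client; not an AWS SDK type. */
    interface MessageSource {
        List<String> receiveBatch(int maxMessages); // returns an empty list when the queue is empty
        void delete(String message);
    }

    private static final int BATCH_SIZE = 10; // per-batch maximum mentioned in the answer

    /** Receives batches until the source is empty; returns the number of messages handled. */
    static int drain(MessageSource source, Consumer<String> handler) {
        int processed = 0;
        while (true) {
            List<String> batch = source.receiveBatch(BATCH_SIZE);
            if (batch.isEmpty()) {
                return processed;
            }
            for (String message : batch) {
                handler.accept(message); // process first
                source.delete(message);  // then delete from the queue
                processed++;
            }
        }
    }

    /** Starts a single-threaded poller that drains the queue, then waits before the next cycle. */
    static ScheduledExecutorService start(MessageSource source, Consumer<String> handler,
                                          long delay, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                drain(source, handler);
            } catch (RuntimeException e) {
                // Swallow and log so that one failed cycle does not cancel the schedule.
                System.err.println("poll cycle failed: " + e.getMessage());
            }
        }, 0, delay, unit);
        return scheduler;
    }
}
```

A call such as start(source, handler, 5, TimeUnit.MINUTES) would match the five-minute fixed delay from the answer. In this sketch a handler exception ends the current cycle, and the message that caused it is not deleted.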
I have used multithreading in many of the applications I have written. While reading more, I came across ThreadPoolExecutor. I could not tell, scenario-wise, when to use which.
My understanding so far is that I should use multithreading when I have a task that I want to divide into multiple small tasks to utilize the CPU and finish the work faster, and use ThreadPoolExecutor when I have a set of tasks that can each run independently of the others.
Please correct me if I am wrong. Thanks.
A ThreadPoolExecutor is just a high level API that enables you to run tasks in multiple threads while not having to deal with the low level Thread API. So it does not really make sense to differentiate between multithreading and ThreadPoolExecutor.
There are many flavours of ThreadPoolExecutor, but most of them allow more than one thread to run in parallel. Typically, you would use an ExecutorService created through the Executors factory.
For example, ExecutorService executor = Executors.newFixedThreadPool(10); creates a pool that runs the tasks you submit on up to 10 threads.
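A small sketch of that usage, assuming the work consists of independent computations. The class FixedPoolDemo and the squares method are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FixedPoolDemo {

    /** Squares each input on a fixed-size pool and returns the results in input order. */
    static List<Integer> squares(List<Integer> inputs, int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (Integer n : inputs) {
                tasks.add(() -> n * n); // each task is independent of the others
            }
            List<Integer> results = new ArrayList<>();
            // invokeAll blocks until all tasks are done and keeps the task order.
            for (Future<Integer> future : executor.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown(); // let the pool threads exit once the work is done
        }
    }

    public static void main(String[] args) {
        System.out.println(squares(List.of(1, 2, 3, 4), 10));
    }
}
```

The calling code never touches a Thread object; it only submits tasks and collects Future results.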
ThreadPoolExecutor is one way of doing multithreading. It's typically used when you:
have independent operations that don't require coordination (nothing prevents you from coordinating, but you have to be careful)
want to limit how many operations you're executing at once, and (optionally) want to queue operations for execution when all threads in the pool are busy.
Java 7 has another built-in class called ForkJoinPool, which is typically used for map-reduce style operations. For instance, one can imagine implementing a merge sort using a ForkJoinPool by splitting the array in half at each fork point, waiting for the results, and merging the results together.
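The merge sort idea can be sketched as below. This is one possible implementation, not a tuned one: the class name ParallelMergeSort and the THRESHOLD cutoff of 1,024 elements are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class ParallelMergeSort extends RecursiveAction {
    private static final int THRESHOLD = 1_024; // assumed cutoff for sorting sequentially

    private final int[] a;
    private final int lo; // inclusive
    private final int hi; // exclusive

    ParallelMergeSort(int[] a, int lo, int hi) {
        this.a = a;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected void compute() {
        if (hi - lo <= THRESHOLD) {
            Arrays.sort(a, lo, hi); // small range: sort directly
            return;
        }
        int mid = (lo + hi) >>> 1;
        // Fork both halves and wait until both are sorted.
        invokeAll(new ParallelMergeSort(a, lo, mid), new ParallelMergeSort(a, mid, hi));
        merge(mid);
    }

    /** Merges the sorted ranges [lo, mid) and [mid, hi) back into a. */
    private void merge(int mid) {
        int[] left = Arrays.copyOfRange(a, lo, mid);
        int i = 0;   // next unread element of the copied left half
        int j = mid; // next unread element of the right half (still in a)
        int k = lo;  // next write position in a
        while (i < left.length && j < hi) {
            a[k++] = (left[i] <= a[j]) ? left[i++] : a[j++];
        }
        while (i < left.length) {
            a[k++] = left[i++];
        }
        // Any remaining right-half elements are already in their final place.
    }

    static void sort(int[] a) {
        ForkJoinPool.commonPool().invoke(new ParallelMergeSort(a, 0, a.length));
    }
}
```

Each compute call splits its range in half, waits for both halves through invokeAll, and then merges, which mirrors the fork, wait, and merge steps described above.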
Thread pools (executors) are one form of multithreading, specifically an implementation of the single producer - multiple consumer pattern, in which a thread repeatedly puts work in a queue for a team of worker threads to execute. It is implemented using regular threads and brings several benefits:
thread anonymity - you don't explicitly control which thread does what; just fire off tasks and they'll be handled by the pool.
it encapsulates a work queue and thread team - no need to bother implementing your own thread-safe queue and looping threads.
load-balancing - since workers take new tasks as they finish previous ones, work is uniformly distributed, provided there is a sufficiently large number of tasks available.
thread recycling - just create a single pool at the beginning and keep feeding it tasks. No need to keep starting and killing threads every time work needs to be done.
Given the above, it is true that pools are suited to tasks that are usually independent of each other and usually short-lived (long I/O operations will just tie up threads from the pool, which then cannot do other tasks).
ThreadPoolExecutor is a form of multithreading with a simpler API than using Threads directly: you just submit tasks. However, tasks can submit other tasks, so they need not be independent. As for the division of tasks into sub-tasks, you may be thinking of the new fork/join API in JDK 7.
From source code documentation of ThreadPoolExecutor
/*
* <p>Thread pools address two different problems: they usually
* provide improved performance when executing large numbers of
* asynchronous tasks, due to reduced per-task invocation overhead,
* and they provide a means of bounding and managing the resources,
* including threads, consumed when executing a collection of tasks.
* Each {@code ThreadPoolExecutor} also maintains some basic
* statistics, such as the number of completed tasks.
*
* <p>To be useful across a wide range of contexts, this class
* provides many adjustable parameters and extensibility
* hooks. However, programmers are urged to use the more convenient
* {@link Executors} factory methods {@link
* Executors#newCachedThreadPool} (unbounded thread pool, with
* automatic thread reclamation), {@link Executors#newFixedThreadPool}
* (fixed size thread pool) and {@link
* Executors#newSingleThreadExecutor} (single background thread), that
* preconfigure settings for the most common usage
* scenarios.
*/
ThreadPoolExecutor is one way of achieving concurrency; there are many others.
The Executors framework provides several factory methods. Some of the important ones are listed below:
static ExecutorService newFixedThreadPool(int nThreads)
Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue.
static ExecutorService newCachedThreadPool()
Creates a thread pool that creates new threads as needed, but will reuse previously constructed threads when they are available.
static ScheduledExecutorService newScheduledThreadPool(int corePoolSize)
Creates a thread pool that can schedule commands to run after a given delay, or to execute periodically.
static ExecutorService newWorkStealingPool()
Creates a work-stealing thread pool using all available processors as its target parallelism level.
Have a look at the SE questions below:
java Fork/Join pool, ExecutorService and CountDownLatch
How to properly use Java Executor?
I have a rather complex problem, described below.
We have a real-time system that requires a large number of threads. To optimize performance, we are considering the following design:
create a thread pool executor with the maximum number of threads
each thread is used to create a scheduled executor service
tasks are then assigned to these executor services evenly, based on load
BUT the biggest problem is that if one of the tasks in a queue contains a sleep (of a few seconds), it blocks the corresponding scheduled executor service thread for that duration, and with it all the following tasks in that queue.
Please suggest how I can suspend the execution of a task that sleeps, or somehow override the sleep, and then reschedule the task on the queue.
Thanks in advance
Seshu
Assuming I understand your question, your scheduled executor service threads have a deadline requirement, but the actual workers can sleep for an unknown length of time, possibly throwing off the timing of the scheduled executors. From your description I'm guessing that what you want is for a task that needs to sleep to actually stop, save its progress information, and then requeue itself so that the remainder of the work is rescheduled at some future time. You'd have to build this into your application architecture.
Alternatively, you could have the scheduler threads launch the worker tasks in their own separate threads, letting them sleep as necessary, with one scheduler thread collecting all the worker terminations.
To get a better answer you're going to have to provide more information about what you're trying to accomplish.
Tasks which sleep are inherently unfriendly for running in any kind of bounded thread pool. The sleep is explicitly telling the thread that it must do nothing for a period of time.
If possible, split the task into two (or more) parts, eliminating the sleep completely. Get the first half-task to schedule the second task with an appropriate delay.
Failing that, you could consider increasing the size of your thread pool somewhat - either setting a much larger cap on its size, or possibly even eliminating the cap altogether (not recommended for a server that might end up with many clients).
Alternatively, move the tasks with sleep statements in them into their own Scheduled executor. Then, they'll delay each other, but better-behaved tasks, with no wait statements in them, will get preferential treatment.
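The suggestion to split a sleeping task into two halves might look like the sketch below. The class SplitTask, the method run, and the string results are placeholders invented for illustration; the real work before and after the pause would take their place.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class SplitTask {

    /**
     * Runs the first half of a task right away and schedules the second half
     * after the pause, instead of sleeping inside the task. The scheduler
     * thread is free to run other tasks during the pause.
     */
    static CompletableFuture<String> run(ScheduledExecutorService scheduler,
                                         long pause, TimeUnit unit) {
        CompletableFuture<String> result = new CompletableFuture<>();
        scheduler.execute(() -> {
            String partial = "first half"; // placeholder for the work done before the former sleep
            // Hand the intermediate state to the second half and return immediately.
            scheduler.schedule(
                    () -> result.complete(partial + ", second half"), // placeholder for the remaining work
                    pause, unit);
        });
        return result;
    }
}
```

Even on a single-threaded scheduler, a task submitted right after run(scheduler, 3, TimeUnit.SECONDS) can execute during the three-second pause, because nothing is sleeping on the worker thread.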