I am currently working on a project involving lots of asynchronous tasks running independently. I have one Spring configuration file.
<task:executor id="taskScheduler" pool-size="5-20">
<task:executor id="specificTaskScheduler" pool-size="5-50" queue-capacity="100">
<!-- integration beans and
several object pools, with a total number of 100 beans created
using CommonsPoolTargetSource -->
I specifically created two executors: one to be used for Spring Integration needs, and a custom executor intended to run only my tasks, which I feed to the integration beans with an explicit reference. After that I supplied a long-running task to be processed. My EAR runs on WebLogic, and when I dumped a stack trace of the running threads I was very disappointed to find that most of the fifty threads in my custom executor were waiting in the executor's queue for an object to become available from the pool. I did not want CommonsPoolTargetSource to use my executor as a platform for managing its sources. What can I do here? Maybe creating a separate Spring file with the CommonsPoolTargetSource beans will solve it? Thank you for any ideas.
Thanks guys. It turned out the pool was not the problem; I just had to add more instances to it and slightly increase the pool size, with the queue capacity set to zero and the rejection policy set to execute the call in the caller's thread. I have yet to test it under heavy load though.
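For reference, a minimal Java-config sketch of that setup (the bean name and pool sizes mirror the XML above; treat it as an assumption, not the exact production config). With queue capacity zero a submission never waits in a queue: it either gets a pool thread immediately or, under CallerRunsPolicy, runs in the submitting thread.

import java.util.concurrent.ThreadPoolExecutor;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
public class ExecutorConfig {

    @Bean
    public ThreadPoolTaskExecutor specificTaskScheduler() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(5);
        executor.setMaxPoolSize(50);
        // queue capacity 0: never park tasks in a queue, grow the pool instead
        executor.setQueueCapacity(0);
        // when the pool is saturated, run the task in the caller's thread
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        return executor;
    }
}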
Related
I have a Spring Boot app running in embedded Tomcat. There are around 50 concurrent HTTP sessions, and each of them is served by 5-7 concurrently running async backend calls (@Async). There is no specific thread configuration for Tomcat or Spring Boot.
I found that a long-running thread (it does not matter whether it is a Tomcat thread or an async call) seriously decreases the performance of the others. For example, if I generate a report using CR JRC, which takes 20-40 seconds, most of the async threads look paralyzed.
How can I optimize the code and configuration to resolve the performance issue?
From your description, there could be several bottlenecks in your configuration, but one could be the number of threads available in your system. The best you can do from here is profile your application and check what threads are available, how they are used, and where they block.
Furthermore, assuming the number of threads is the issue, then when you say
There is no specific threads configuration for Tomcat or Spring Boot.
if it means you are running on the default ThreadPoolExecutor, then you should check the documentation for the default values and how to configure your thread pool, and scale accordingly.
The @Async annotation also allows you to specify which Executor bean to use.
// uses the default Executor
@Async
public void asyncMethodUsingDefaultExecutor() {}

// uses the Executor bean with qualifier "specificExecutorBeanQualifier"
@Async("specificExecutorBeanQualifier")
public void asyncMethodUsingSpecificExecutor() {}
You could use this to have a separate thread pool handle the long-running tasks and another one for the rest.
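As a hedged sketch of that separation (bean names, pool sizes, and method names below are made up for the example): give the slow report generation its own small pool so it cannot starve the pool used by the short async calls.

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.Async;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
import org.springframework.stereotype.Service;

@Configuration
@EnableAsync
class AsyncConfig {

    // pool for the many short backend calls
    @Bean("shortTaskExecutor")
    public ThreadPoolTaskExecutor shortTaskExecutor() {
        ThreadPoolTaskExecutor ex = new ThreadPoolTaskExecutor();
        ex.setCorePoolSize(20);
        ex.setMaxPoolSize(50);
        return ex;
    }

    // small, separate pool for long-running report generation
    @Bean("reportExecutor")
    public ThreadPoolTaskExecutor reportExecutor() {
        ThreadPoolTaskExecutor ex = new ThreadPoolTaskExecutor();
        ex.setCorePoolSize(2);
        ex.setMaxPoolSize(4);
        return ex;
    }
}

@Service
class ReportService {

    @Async("reportExecutor") // the 20-40 second CR JRC report generation
    public void generateReport() { /* ... */ }

    @Async("shortTaskExecutor") // fast async backend call
    public void quickBackendCall() { /* ... */ }
}

This way a burst of report requests can tie up at most four threads, and the short calls keep their own pool.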
From this article we can learn that Spring-Batch holds the Job's status in some SQL repository.
And from this article we can learn that the location of the JobRepository can be configured - it can be in-memory or a remote DB.
So if we need to scale a batch job, should we run several different Spring-batch JARs, all configured to use the same shared DB in order to keep them synchronized?
Is this the right pattern / architecture?
Yes, this is the way to go. The problem that might happen when you launch the same job from different physical nodes is that you can create the same job instance twice. In that case, Spring Batch will not know which instance to pick up when restarting a failed execution. A shared job repository acts as a safeguard to prevent this kind of concurrency issue.
The job repository achieves this synchronization thanks to the transactional capabilities of the underlying database. The IsolationLevelForCreate can be set to an aggressive value (SERIALIZABLE is the default) in order to avoid the aforementioned issue.
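As a sketch, this is roughly what pointing every node at the same repository database looks like with JobRepositoryFactoryBean (the shared DataSource and transaction manager beans are assumed to exist and to target the common DB):

import javax.sql.DataSource;

import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.repository.support.JobRepositoryFactoryBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class SharedJobRepositoryConfig {

    @Bean
    public JobRepository jobRepository(DataSource sharedDataSource,
                                       PlatformTransactionManager transactionManager) throws Exception {
        JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
        factory.setDataSource(sharedDataSource);        // the same DB on every node
        factory.setTransactionManager(transactionManager);
        // SERIALIZABLE is the default; set explicitly here only for illustration
        factory.setIsolationLevelForCreate("ISOLATION_SERIALIZABLE");
        factory.afterPropertiesSet();
        return factory.getObject();
    }
}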
We run several Spring Batch jobs within Tomcat in the same web application that serves up our UI. Lately we have been adding many more jobs, and we are noticing that when we patch our app, several jobs may get stuck in a STARTING or STARTED status. Many of those jobs ensure that another job is not running before they start up, so after we patch the server, some of our jobs are broken until we manually run SQL to update the statuses of the jobs to ABANDONED or STOPPED.
I have read here that JobScope and StepScope jobs don't play nicely with shutting down.
That article suggests not using JobScope or StepScope, but I can't help thinking this is a solved problem and that people must be doing something when the application exits to prevent it.
Are there some best practices for handling this scenario? What are you doing in your applications?
We are using spring-batch version 3.0.3.RELEASE
I will give you an idea of how to solve this scenario; it is not necessarily a Spring Batch solution.
Every time I need to add jobs to an application, I do this:
Create a table to control the jobs (queue, priority, status, etc.)
Create a JobController class to manage all jobs
All jobs are defined by a status: R (running), F (finished), Q (queued); you can add more as you need, such as aborted or cancelled (the jobs control these statuses)
The JobController must be loaded only once; you can define it as a Spring bean for this
Add a boolean attribute to JobController that records whether you have already checked the jobs when it was instantiated; set it to false
Check if there are jobs with the R status, which means they were running when the server last stopped; update every job with this R status to Q and increase its priority so it gets executed first after the server restarts. This check goes inside an if on that boolean attribute; after the check, set it to true
That way, the first time you call the JobController after a server crash with unfinished jobs, you will be able to set them all to a status where they can be executed again. And this check will happen only once, since you are guarding it with that boolean attribute.
One thing you should be aware of: be careful with your job priorities; if you manage them wrong, you may run into a starvation problem.
You can easily adapt this solution to spring-batch.
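A rough sketch of that idea in Java (JobRecord, JobStore, and the status letters are illustrative, not Spring Batch APIs):

import java.util.List;

// hypothetical row in the job control table
class JobRecord {
    String status;   // R, F, Q, ...
    int priority;
}

// hypothetical DAO over the control table
interface JobStore {
    List<JobRecord> findByStatus(String status);
    void update(JobRecord job);
}

public class JobController {

    private final JobStore store;
    private boolean recoveryChecked = false; // the boolean attribute described above

    public JobController(JobStore store) {
        this.store = store;
    }

    // re-queue jobs left in R status by a crash; runs only once per startup
    public synchronized void recoverIfNeeded() {
        if (recoveryChecked) {
            return;
        }
        for (JobRecord job : store.findByStatus("R")) {
            job.status = "Q";   // back to the queue
            job.priority++;     // execute first after the restart
            store.update(job);
        }
        recoveryChecked = true;
    }
}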
Hope it helps.
In my application I have one cron job which connects to an FTP server and transfers files. It is a very simple piece of functionality, configured using the Spring @Scheduled annotation with a cron expression as a parameter.
It was running fine for a few months and then suddenly it stopped; I got a ConnectException.
Maybe the FTP server was down, or something else happened which caused the cron thread to stop.
I looked (googled) for the reasons but didn't find any (nothing much in the logs either, just the exception name). It may be a one-time thing :)
My question is: can I put some check or watcher on the @Scheduled cron job to know whether it is running or not?
Sorry for my bad explanation/English
Thanks
My question is: can I put some check or watcher on the @Scheduled cron job to know whether it is running or not?
Basically, you can't.
When you use @Scheduled, Spring uses a ScheduledAnnotationBeanPostProcessor to register the tasks you specify (annotated methods). It registers them with a ScheduledTaskRegistrar. The ScheduledAnnotationBeanPostProcessor is an ApplicationListener<ContextRefreshedEvent>. When it receives the ContextRefreshedEvent from the ApplicationContext, it schedules the tasks registered in the ScheduledTaskRegistrar.
During this step, these tasks are scheduled with a TaskScheduler which typically wraps a ScheduledExecutorService. If an exception is uncaught in a submitted task, then the task is removed from the ScheduledExecutorService queue.
The TaskScheduler class does not provide a public API to retrieve the scheduled tasks, i.e. the ScheduledFuture objects. So you can't use it to find out whether your tasks are running or not.
And you probably shouldn't. Develop your tasks, your @Scheduled methods, to withstand an exception being thrown. Sometimes, obviously, that's not possible. With a network error, for example, you would probably have to restart your application. Without knowing anything else about your application, I would say more logging is your best bet.
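For example, a minimal sketch of that advice applied to the FTP task (the class name, method names, and cron expression here are illustrative): catch and log inside the @Scheduled method, so one failed transfer cannot kill subsequent runs.

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class FtpTransferTask {

    private static final Logger log = LoggerFactory.getLogger(FtpTransferTask.class);

    @Scheduled(cron = "0 0 2 * * *") // every day at 02:00; adjust as needed
    public void transferFiles() {
        try {
            doTransfer(); // the actual FTP work
            log.info("FTP transfer completed");
        } catch (Exception e) {
            // log and swallow so the next scheduled run still fires
            log.error("FTP transfer failed, will retry on next schedule", e);
        }
    }

    private void doTransfer() {
        // connect to the FTP server and move files (omitted)
    }
}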
I have two beans which run schedulers:
<bean id="eventService" class="xxx.xxxx.xxxxx.EventSchedulerImpl">
</bean>
<bean id="UpdateService" class="xxx.xxxx.xxxxx.UpdateSchedulerImpl">
</bean>
I want to make sure only one scheduler is running at a time:
when EventSchedulerImpl is running, UpdateSchedulerImpl should not run. I also implemented "StatefulJob" on both schedulers.
Will that work? Do I need to do more?
Appreciate your ideas, guys.
One way would be to configure a special task executor so that it contains only one thread in its thread pool, and configure its queue capacity so that jobs can be kept "on hold". That way only one task can run at a time with this task executor, and the other task will be queued.
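That would look roughly like this (bean name, pool size of one, and queue capacity are illustrative):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
public class SerialExecutorConfig {

    @Bean
    public ThreadPoolTaskExecutor serialTaskExecutor() {
        ThreadPoolTaskExecutor ex = new ThreadPoolTaskExecutor();
        ex.setCorePoolSize(1);
        ex.setMaxPoolSize(1);     // only one task can run at a time
        ex.setQueueCapacity(10);  // the other task waits here "on hold"
        return ex;
    }
}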
But I don't like this approach. Having a single-threaded task executor seems like a recipe for problems down the road.
What I would do is simply write a wrapper service that calls your target services in the order you need. Then schedule the wrapper service instead.
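A sketch of that wrapper, assuming your two services expose some run-style methods (the method names and cron expression below are made up for the example):

import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;

@Service
public class SchedulerWrapper {

    private final EventSchedulerImpl eventService;
    private final UpdateSchedulerImpl updateService;

    public SchedulerWrapper(EventSchedulerImpl eventService,
                            UpdateSchedulerImpl updateService) {
        this.eventService = eventService;
        this.updateService = updateService;
    }

    // schedule the wrapper instead of the two services; running them
    // sequentially in one method guarantees they never overlap
    @Scheduled(cron = "0 0/15 * * * *") // illustrative schedule
    public void runBoth() {
        eventService.runEvents();   // hypothetical method name
        updateService.runUpdates(); // hypothetical method name
    }
}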