Spring Boot - @EnableScheduling combined with spring-boot-starter-quartz - java

We have a project that runs about ten Quartz jobs via the aforementioned dependency. Everything seemed to work fine for a while, but two jobs randomly stop working (all jobs are scheduled to run every five seconds). Our mitigation strategy was to migrate these two jobs to Spring's own scheduling, as they do not require any input data. After adding @EnableScheduling and the appropriate @Scheduled annotations, they work fine and run every five seconds.
The problem is that the 'old' Quartz jobs now seem to have stopped working (at least in our integration tests, which wait 20 seconds for an execution). The Quartz jobs just never fire. When we increase the timeout, the Quartz jobs sometimes start about 30 seconds after being scheduled. While they are running, the Spring jobs seem to wait. We have tried setting the thread count for both Quartz and Spring to more than 50, but nothing seems to help. We're somewhat out of ideas. Does anybody know a solution?
We're using Spring Boot 2.3.3 and the latest Quartz, 2.3.2.
Thanks.

OK, so the problem seems to be with our @Transactional management: we seem to have somehow acquired a lock that blocks Quartz's scheduler from finding triggers that were scheduled for immediate execution (the question "Quartz trigger does not fire immediately" seems to describe the same problem, though it is unrelated to Spring scheduling). We 'fixed' the problem by setting
spring.quartz.properties.org.quartz.scheduler.idleWaitTime=5000
This is obviously not a fix. But instead of waiting for 30 seconds (the default for Quartz's idle wait time, which is why the jobs initially ran only after that delay), the scheduler now waits only five seconds before searching for triggers again after not finding one. This makes our immediate jobs trigger up to five seconds late, but that isn't a problem in our use case.
Of course, this isn't the real solution. Refining our @Transactional usage would probably be the correct fix, but we cannot do that right now; maybe sometime in the future.
Why the problem started occurring after adding @EnableScheduling and not beforehand, we have no idea.
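For reference, the settings discussed above could sit together in application.properties roughly as follows. This is only a sketch: the idle wait value is the one from the workaround, while the two thread counts are placeholder values chosen for illustration, not the ones from our project.

```properties
# Quartz scheduler thread: how long to wait (ms) before re-scanning for
# triggers when none were found; the Quartz default is 30000
spring.quartz.properties.org.quartz.scheduler.idleWaitTime=5000

# Number of worker threads Quartz uses to execute jobs (placeholder value)
spring.quartz.properties.org.quartz.threadPool.threadCount=10

# Number of threads for Spring's @Scheduled methods (placeholder value;
# Spring Boot's default pool size is 1)
spring.task.scheduling.pool.size=5
```

Raising the thread counts did not help in our case, which is consistent with the cause being a lock rather than thread starvation.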

Related

Cron based jobs, which are in Database

I am trying to write a process that reads cron expressions from a set of records stored in a database and runs a job (executes a program) if an expression triggers within the next hour. The records can have different triggering times (for example every Friday, or hourly).
Example of the table with cron expressions (cron expression, description, job name):
----------
0 0 12 * * ? , 12 noon, AJOB
----------
0 11 11 11 11 ? , Nov 11, BJOB
----------
0 15 10 * * ? , Every day 10:15, XJOB
----------
Users can update these cron expressions in the table.
What's the best way to design this kind of application?
The major problem I see is the following. Suppose I run my process every hour, pick up the records scheduled to fire within the next hour, and run those jobs. This works well while the application is up. But if the application is down for 2 hours, we might miss jobs that should have been triggered during that time.
How can I write this kind of application so that, even though the application can fail, we do not miss any cron triggers during the downtime?
The cron API provides nextTriggerTime, but has much less support for finding the previous trigger time.
Since Akka is one of the question tags, I assume you're looking for some Akka based solution.
For cron-like job scheduler in JVM, maybe go straight with Quartz (as suggested by Ashiq), it is Java and should be straightforward to integrate into a Scala+Akka project. Also, take a look at akka-quartz-scheduler. It integrates with Akka well and it provides the cron utils.
In terms of the application:
1) Maintain a checkpoint table in the DB or on disk, whichever fits better, to record whether cron job ID=X has been run.
2) On startup, your app loads all cron expressions into memory and registers them with the Quartz scheduler. As long as the app doesn't crash, it'll keep running.
3) Every time your app runs a cron job successfully, update the checkpoint table.
4) If your app crashes and restarts, look up the checkpoint table first to see how far it got, and schedule the jobs that have not finished yet.
You can use Quartz, a job scheduling library that can be integrated into Java applications. It can store and trigger cron jobs and simple jobs. It also has misfire policies that you can use in case any triggers are missed.
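The checkpoint idea from the answer above can be sketched in plain Java. The sketch assumes some function that returns the next fire time after a given instant; a cron library would normally supply it, and here a hypothetical top-of-the-hour schedule (nextTopOfHour) stands in for it. The class and method names are made up for illustration.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

class MissedRunCatchUp {

    /**
     * Returns every fire time that falls after lastCheckpoint and not after now.
     * nextFireAfter is a stand-in for a cron library's "next fire time after t".
     */
    static List<Instant> missedRuns(Instant lastCheckpoint, Instant now,
                                    UnaryOperator<Instant> nextFireAfter) {
        List<Instant> missed = new ArrayList<>();
        Instant next = nextFireAfter.apply(lastCheckpoint);
        while (next != null && !next.isAfter(now)) {
            missed.add(next);
            next = nextFireAfter.apply(next);
        }
        return missed;
    }

    /** Hypothetical schedule for the demo: fires at the top of every hour (UTC). */
    static Instant nextTopOfHour(Instant t) {
        return t.truncatedTo(ChronoUnit.HOURS).plus(1, ChronoUnit.HOURS);
    }

    public static void main(String[] args) {
        Instant checkpoint = Instant.parse("2020-09-01T10:00:00Z"); // last recorded run
        Instant restart = Instant.parse("2020-09-01T12:30:00Z");    // app comes back up
        // Prints the 11:00 and 12:00 runs that were missed during the downtime
        System.out.println(missedRuns(checkpoint, restart, MissedRunCatchUp::nextTopOfHour));
    }
}
```

On restart, the app would run (or deliberately skip) each returned fire time and then advance the checkpoint. Walking forward from the checkpoint also sidesteps the need for a "previous trigger time" call.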

Run a Web Service at a Particular Time of the Day Every Day (Get Time From Tomcat Parameters)

I have a requirement to send a SOAP message to a lot of devices every day at a certain time. I will get the time from a Tomcat parameter in web.xml. Something like:
<context-param>
<param-name>DailyTime</param-name>
<param-value>04:00</param-value>
</context-param>
I must create a separate thread that sends the messages. The time will be in 24-hour format.
The problem is that, as a beginner, I have no idea where to start or how to do it. Can you please point me in the right direction or give me some tips? That would help me greatly.
Thank You Everyone :)
You have several options. The two I've used most in the past are:
1) Schedule a cron job to run at the time(s) you want, and have it call an executable java class / jar file.
2) Use a scheduler library like Quartz
Regarding #1 - this assumes you're using a *nix system. If you're using Windows, you can schedule tasks through the Task Scheduler.
Regarding #2 - this gives you more flexibility on the conditions of running a task/job. For example, you could schedule a job to run every 1 minute, but not to start a new job until any existing job is complete.
Anecdotal remark from a version of Quartz circa 2006: on WebSphere, it seemed that my Quartz jobs were getting executed by some background thread that made jobs take hours which should have taken only a few seconds. But that was almost a decade ago, and Quartz (and hopefully WebSphere) has certainly improved vastly since.
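A third possibility, not mentioned in the options above, is the JDK's own ScheduledExecutorService, which needs no extra library. The sketch below is an assumption about how this could look, with made-up class and method names: it parses a 24-hour "HH:mm" value such as the DailyTime parameter, works out the delay until the next occurrence, and repeats the task every 24 hours on a background thread.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class DailySender {

    /** Time to wait from 'now' until the next occurrence of 'dailyTime' (today or tomorrow). */
    static Duration delayUntilNext(LocalTime dailyTime, LocalDateTime now) {
        LocalDateTime next = now.toLocalDate().atTime(dailyTime);
        if (!next.isAfter(now)) {
            next = next.plusDays(1); // today's slot has passed, so use tomorrow's
        }
        return Duration.between(now, next);
    }

    /** Schedules 'task' once a day on a single background thread and returns the scheduler. */
    static ScheduledExecutorService scheduleDaily(String dailyTimeParam, Runnable task) {
        LocalTime dailyTime = LocalTime.parse(dailyTimeParam); // accepts 24-hour "HH:mm", e.g. "04:00"
        long initialDelayMs = delayUntilNext(dailyTime, LocalDateTime.now()).toMillis();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(task, initialDelayMs,
                TimeUnit.DAYS.toMillis(1), TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

In a web application, the parameter value would typically be read with ServletContext.getInitParameter("DailyTime") inside a ServletContextListener, and the returned scheduler shut down when the context is destroyed. Note that a fixed 24-hour period drifts by an hour across daylight saving changes, which a cron-style trigger avoids.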

How to make the Quartz scheduler execute jobs asynchronously (in parallel)?

I have 56 jobs scheduled with cron triggers, all at exactly the same time.
I expect these jobs to start executing together, in no particular sequence, each one in its own thread. However, the Quartz scheduler executes them one by one.
I did some research and found the question "Quartz Thread Execution Parallel or Sequential?", which suggests setting the following properties in the quartz.properties file:
org.quartz.scheduler.batchTriggerAcquisitionMaxCount = 60
org.quartz.threadPool.threadCount = 60
Sad to say, it didn't work for me. When one of my jobs encounters an exception, Quartz keeps retrying that job, which is fine, but the other jobs never execute until this one finally fails after some attempts.
Do you know how to make the scheduler run jobs in parallel?
Thanks.

Sling scheduler periodic job -- will the job overlap?

I am using Sling's scheduler to schedule periodic jobs. Suppose I schedule job A to run every 5 minutes and, on an unlikely occasion, a run takes more than 5 minutes. I have specified that the job cannot run in parallel. What will happen?
1) Job A will run again immediately after the previous run finishes.
2) Job A will run 5 minutes after the previous run finishes.
Under the hood, Sling's scheduler is using QuartzScheduler, so if you know how QuartzScheduler will behave in this case please do share your knowledge as well.
Any help is much appreciated!
In Quartz Scheduler 2.x, the @DisallowConcurrentExecution annotation is used to prevent concurrent execution of the same job.
In older versions (Quartz 1.x), you avoid concurrent execution of a job by implementing the StatefulJob interface, which is deprecated in Quartz 2.x in favor of the annotation.
The decision on whether the misfired execution will execute when the previous job completes or it will be ignored depends on the trigger's misfire policy. By default when the scheduler starts, it searches for any persistent triggers that have misfired, and it then updates each of them based on their individually configured misfire instructions.
So in my opinion Job A will run again immediately after the previous run finishes. I suppose that Sling uses the default misfire policy. Otherwise the answer depends on the misfire policy selection.
That's how Quartz Scheduler works. I don't know how Sling's scheduler works.
I hope this helps.
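The two outcomes from the question can be reproduced with the JDK's ScheduledExecutorService. This is an analogy only, not Quartz or Sling code: scheduleAtFixedRate starts an overdue run as soon as the previous one finishes (the first outcome), while scheduleWithFixedDelay waits a full period after the previous run ends (the second). The class name and the millisecond values are made up to keep the demo short.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class OverrunDemo {

    /**
     * Runs a job that takes jobMs on a single-threaded scheduler with the given
     * period, and returns the idle gap in milliseconds between the end of the
     * first run and the start of the second.
     */
    static long gapMillis(boolean fixedRate, long periodMs, long jobMs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        List<Long> starts = new CopyOnWriteArrayList<>();
        List<Long> ends = new CopyOnWriteArrayList<>();
        CountDownLatch twoRuns = new CountDownLatch(2);
        Runnable job = () -> {
            starts.add(System.nanoTime());
            try {
                Thread.sleep(jobMs); // simulate a run that outlasts the period
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            ends.add(System.nanoTime());
            twoRuns.countDown();
        };
        try {
            if (fixedRate) {
                scheduler.scheduleAtFixedRate(job, 0, periodMs, TimeUnit.MILLISECONDS);
            } else {
                scheduler.scheduleWithFixedDelay(job, 0, periodMs, TimeUnit.MILLISECONDS);
            }
            twoRuns.await(); // wait until two runs have completed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            scheduler.shutdownNow();
        }
        return TimeUnit.NANOSECONDS.toMillis(starts.get(1) - ends.get(0));
    }

    public static void main(String[] args) {
        // Period 100 ms, job 250 ms: the job always overruns its period
        System.out.println("fixed rate gap (ms):  " + gapMillis(true, 100, 250));
        System.out.println("fixed delay gap (ms): " + gapMillis(false, 100, 250));
    }
}
```

In both modes the runs never overlap. With Quartz, which of the two behaviours you get for an overrunning non-concurrent job depends on the trigger's misfire policy, as described above.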

Quick and Dirty Solution to Load Balancing Batch Jobs

We're developing a web app and are coming to the end of development, and the client we're working with has suddenly sprung the fact on us that we will need to be able to handle load balancing.
We have batch jobs running which would need to run on both servers, but we don't want them to overlap. They are selecting rows from the database, processing the objects, and merging them back into the database. One of these jobs MUST run at the same time each day, though the others run every n minutes. We have about a week at most to get something working, and it'll become technical debt for us.
My question is: what quick and dirty hacks exist to get this working properly? We have a SQLServer 2008 instance and are running Java EE 6 on JBoss 5, which will be load balanced between two servers. We're using Spring 3 and JPA2 backed by Hibernate, and using the stock spring scheduler to schedule and run the jobs. Help me Obi Wan Kenobi; you're my only hope!
On JBoss 5, the simplest solution is to use the Scheduler API. The implementation is built on top of Quartz, and you would generally use a clustered configuration as described here:
http://quartz-scheduler.org/documentation/quartz-2.x/configuration/ConfigJDBCJobStoreClustering
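As a rough sketch of what such a clustered setup might look like in quartz.properties for the SQL Server database from the question: the property keys are standard Quartz settings, but the instance name, the data source name batchDS, and the check-in interval are placeholder values, and the data source itself still has to be defined separately.

```properties
# Same scheduler name on every node; AUTO gives each node a unique instance id
org.quartz.scheduler.instanceName=BatchScheduler
org.quartz.scheduler.instanceId=AUTO

# Store jobs and triggers in the shared database instead of in memory
org.quartz.jobStore.class=org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.MSSQLDelegate
org.quartz.jobStore.dataSource=batchDS

# Turn on clustering; nodes check in with the database at this interval (ms)
org.quartz.jobStore.isClustered=true
org.quartz.jobStore.clusterCheckinInterval=20000
```

With this in place, each trigger firing is picked up by only one node, so the batch jobs do not overlap across the two servers. The Quartz documentation also asks that the clocks of all nodes be kept synchronized.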
Almost 10 years after this question was asked, I had the same need, and the quickest and dirtiest solution for me was a load balancer using a shared file system without any master.
Each worker locks and then picks jobs from the shared file system, independently of the other workers. To balance load, each worker sleeps X seconds between job polling iterations, where X is proportional to the load on the worker (in my case, load is the number of processes the worker has started in the background). Thus a heavily loaded worker sleeps longer, which gives the other workers a higher probability of picking up the next job. The worker loops run under supervisor (Linux).
My use case was the execution of sparklyr client-mode jobs on a Spark/Hadoop cluster without overloading the edge nodes. It was implemented as a bash script within a few hours and then scaled to 3 hosts, and it has been stable for some months now, until there is time to invest in a better solution.
