I observed that jobs which are normally scheduled at midnight (but could not be executed because the server went into energy saving mode) get executed after the server comes out of energy saving mode. This leads to unexpected execution times.
Is there a way to tell Quartz to not execute jobs after they are too far behind their target time?
Yes. You just need to tell Quartz what to do about job misfires:
Another important property of a Trigger is its “misfire instruction”. A misfire occurs if a persistent trigger “misses” its firing time because of the scheduler being shutdown, or because there are no available threads in Quartz’s thread pool for executing the job. The different trigger types have different misfire instructions available to them. By default they use a ‘smart policy’ instruction - which has dynamic behavior based on trigger type and configuration. When the scheduler starts, it searches for any persistent triggers that have misfired, and it then updates each of them based on their individually configured misfire instructions. When you start using Quartz in your own projects, you should make yourself familiar with the misfire instructions that are defined on the given trigger types, and explained in their JavaDoc.
The specific misfire instruction depends on the Trigger type you're using. For a daily job it could be something like this:
trigger = newTrigger()
    .withIdentity("trigger1", "group1")
    .withSchedule(dailyAtHourAndMinute(0, 0)
        .withMisfireHandlingInstructionDoNothing()) // set misfire instruction
    .build();
But again, it depends on the type of trigger. Use your IDE to see which withMisfire*() methods the schedule builder offers: withMisfireHandlingInstructionDoNothing for cron-based schedules such as the one above, or withMisfireHandlingInstructionNextWithRemainingCount for simple repeating triggers. Both ignore the missed execution and wait for the next scheduled one.
When I had to learn about the different types of misfire instructions, besides Quartz's tutorials and API documentation, I used this blog entry.
Related
I have a situation where a specific Quartz job must execute only once, at a specific time of day.
I want to know if there is a workaround to make sure this job is executed even if the server, for any reason, is not up and running at that point in time.
If I switch from RAMJobStore to a database job store, does that guarantee the job will still run?
EDIT:
To make things clear, I'll use @Vala. D's comment:
if the server was down at this time to run it immediately after server-start
This behavior will cover my requirements
I have a scheduled job that runs at the end of every month. After running, it saves some data to the database.
When I scale the app (for example, to 2 instances), both instances run the scheduled job and both save the data, so at the end of the day my database contains the same data twice.
So I want the scheduled job to run only once, regardless of the number of instances in the cloud.
In my project, I have maintained a database table to hold a lock for each job which needs to be executed only once in the cluster.
When a job is triggered, it first tries to acquire its lock from the database. Only if it gets the lock does it execute; if it fails to acquire the lock, the job is skipped.
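The lock-table idea above can be sketched roughly as follows. This is only an illustration under assumptions, not the poster's actual code: `JobLockStore`, `InMemoryJobLockStore`, and `LockedJobRunner` are hypothetical names, and the in-memory map stands in for a database table with a unique constraint on the lock key (where a failed `INSERT` would mean "someone else holds the lock"). The sketch also assumes the lock key includes the scheduled fire time, so every run gets its own lock and nothing has to be released.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical abstraction over the lock table.
interface JobLockStore {
    // Returns true if the caller acquired the lock, false if it was already taken.
    boolean tryAcquire(String lockKey, String owner);
}

// In-memory stand-in for a table with a unique constraint on the lock key.
// A real implementation would INSERT a row and treat a duplicate-key error as "not acquired".
final class InMemoryJobLockStore implements JobLockStore {
    private final ConcurrentMap<String, String> rows = new ConcurrentHashMap<>();

    @Override
    public boolean tryAcquire(String lockKey, String owner) {
        // putIfAbsent is atomic: only the first caller for a given key gets null back.
        return rows.putIfAbsent(lockKey, owner) == null;
    }
}

final class LockedJobRunner {
    private final JobLockStore locks;
    private final String instanceId;

    LockedJobRunner(JobLockStore locks, String instanceId) {
        this.locks = locks;
        this.instanceId = instanceId;
    }

    // Runs the job only if this instance wins the lock for this fire time.
    // Returns true if the job ran on this instance.
    boolean runOnce(String jobName, String fireTime, Runnable job) {
        String lockKey = jobName + "@" + fireTime;
        if (!locks.tryAcquire(lockKey, instanceId)) {
            return false; // another instance already took this run
        }
        job.run();
        return true;
    }
}
```

Each application instance would call `runOnce` from its own scheduled method with the same job name and fire time; only one of them does the work.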
You can also look at Quartz's clustering feature, which relies on a shared JDBC job store.
http://www.quartz-scheduler.org/documentation/2.4.0-SNAPSHOT/introduction.html
I agree with the comments. If you can utilize a scheduler that's going to be your best, most flexible option. In addition, a scheduler should be executing your job as a "task" on Cloud Foundry. The task will only run on one instance, so you won't need to worry about how many instances your application is using (the two are separate in that regard).
If you're using Pivotal Cloud Foundry/Tanzu Cloud Foundry there is a scheduler you can ask your operations team to install. I don't know about other variants of CF, but I assume there are other schedulers.
https://network.pivotal.io/products/p-scheduler/
If using a scheduler is not an option then this is a concern you'll need to handle in your application. The solution of using a shared lock is a good one, but there is also a little trick you can do on Cloud Foundry that I feel is a little simpler.
When your application runs, certain environment variables are set by the platform. There is one called INSTANCE_INDEX which has a number indicating the instance on which the app is running. It's zero-based, so your first app instance will be running on instance zero, the second instance one, etc.
In your code, simply look at the instance index and see if it's zero. If the index is non-zero, have your task end without doing anything. If it's zero, then let the task proceed and do its work. The task will execute on every application instance, but it will only do work on the first instance. It's an easy way to guarantee something like a database migration or background job only runs once.
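A minimal sketch of that check, assuming the task is plain Java code that can read environment variables. `InstanceGuard` and its methods are hypothetical names. As far as I know, Cloud Foundry documents the variable as `CF_INSTANCE_INDEX` and also sets `INSTANCE_INDEX`, so the sketch tries both. Treating a missing or unparsable value as "run the task" is an assumption that suits local development; you may prefer the opposite default.

```java
import java.util.Map;

final class InstanceGuard {

    // Returns true only when the environment says this is instance 0.
    // Assumption: if no index is present or it cannot be parsed, we are probably
    // not running on Cloud Foundry (e.g. a developer machine), so the task may run.
    static boolean isPrimaryInstance(Map<String, String> env) {
        String raw = env.get("CF_INSTANCE_INDEX");
        if (raw == null) {
            raw = env.get("INSTANCE_INDEX"); // name used in the answer above
        }
        if (raw == null) {
            return true;
        }
        try {
            return Integer.parseInt(raw.trim()) == 0;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    // Runs the work on the first instance only; every other instance returns immediately.
    static void runOnPrimaryOnly(Map<String, String> env, Runnable work) {
        if (!isPrimaryInstance(env)) {
            return;
        }
        work.run();
    }

    public static void main(String[] args) {
        runOnPrimaryOnly(System.getenv(), () -> System.out.println("running monthly job"));
    }
}
```

Passing the environment in as a map, rather than calling `System.getenv()` inside the check, keeps the decision easy to test.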
One final option would be to use multiple processes. This is a feature of Cloud Foundry that enables you to have different processes running, like your web process and a background worker process.
https://docs.cloudfoundry.org/devguide/multiple-processes.html
The interesting thing about this feature is that you can scale the different processes independently of each other. Thus you could have as many web processes running as you need, but only one background worker, which would guarantee that your background process only runs once.
That said, the downside of this approach is that you end up with separate containers for each process and the background process would need to continue running. The foundation expects it to be a long-running process, not a finite duration batch job. You could get around this by wrapping your periodic task in a loop or something else which keeps the process running forever.
I wouldn't really recommend this option but I wanted to throw it out there just in case.
You can use the @SnapLock annotation on your method, which guarantees that the task only runs once. See the documentation in this repo: https://github.com/luismpcosta/snap-scheduler
Example:
Import the Maven dependency:
<dependency>
    <groupId>io.opensw.scheduler</groupId>
    <artifactId>snap-scheduler-core</artifactId>
    <version>0.3.0</version>
</dependency>
After importing the Maven dependency, you'll need to create the required tables.
Finally, see below how to annotate a method with @SnapLock so that it is guaranteed to run only once:
import io.opensw.scheduler.core.annotations.SnapLock;
...
@SnapLock(key = "UNIQUE_TASK_KEY", time = 60)
@Scheduled(fixedRate = 30000)
public void reportCurrentTime() {
    ...
}
With this approach you also get an audit record of the task executions.
I am using Sling's scheduler to schedule periodic jobs. Suppose I schedule job A to run every 5 minutes and, on some unlikely occasion, a run takes more than 5 minutes. What will happen? I have specified that the job cannot run in parallel.
Job A will run again immediately after the previous run finishes.
Job A will run 5 minutes after the previous run finishes.
Under the hood, Sling's scheduler is using QuartzScheduler, so if you know how QuartzScheduler will behave in this case please do share your knowledge as well.
Any help is much appreciated!
In Quartz Scheduler 2.x, the @DisallowConcurrentExecution annotation on the job class is used to prevent concurrent execution of the same Job.
In Quartz Scheduler 1.x, you have to implement the StatefulJob interface to avoid concurrent execution of a Job (the interface is deprecated in 2.x).
The decision on whether the misfired execution will execute when the previous job completes or it will be ignored depends on the trigger's misfire policy. By default when the scheduler starts, it searches for any persistent triggers that have misfired, and it then updates each of them based on their individually configured misfire instructions.
So in my opinion Job A will run again immediately after the previous run finishes. I suppose that Sling uses the default misfire policy. Otherwise the answer depends on the misfire policy selection.
That's how Quartz Scheduler works. I don't know how Sling's scheduler works.
I hope this helps.
Following is my requirement:
On a periodical basis, a "request" needs to be enqueued into a queue
The different parameters of the "request", the periodicity of execution, start date of execution, and the last run date are stored in a database.
There can be few thousands of such requests in the database.
Now, I want to write a scheduler which polls the database regularly, and when the current date is equal to the "last run date" + "periodicity", the request should be enqueued.
Please suggest the alternatives available for schedulers
This scheduler should be capable of running on multiple hosts
Thanks,
Hima
Have you looked at Quartz Scheduler? It may well do all you need very simply. I haven't used it myself, but I've heard good things about it.
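Independently of which scheduler ends up doing the polling, the due-date check described in the question can be sketched like this. `PeriodicRequest` and `DuePoller` are hypothetical names, and the fields mirror the columns the question mentions. One deliberate difference: the check uses "at or after" rather than "equal to", on the assumption that a poller will rarely hit the exact instant of "last run date" + "periodicity".

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory view of one row from the requests table.
final class PeriodicRequest {
    final String id;
    final Instant startDate;
    final Instant lastRun;   // null if the request has never run
    final Duration period;

    PeriodicRequest(String id, Instant startDate, Instant lastRun, Duration period) {
        this.id = id;
        this.startDate = startDate;
        this.lastRun = lastRun;
        this.period = period;
    }

    // Due when the start date has passed and a full period has elapsed since the last run.
    boolean isDue(Instant now) {
        if (now.isBefore(startDate)) {
            return false;
        }
        if (lastRun == null) {
            return true;
        }
        return !lastRun.plus(period).isAfter(now);
    }
}

final class DuePoller {
    // Returns the ids of all requests that should be enqueued at 'now'.
    static List<String> dueIds(List<PeriodicRequest> requests, Instant now) {
        List<String> due = new ArrayList<>();
        for (PeriodicRequest r : requests) {
            if (r.isDue(now)) {
                due.add(r.id);
            }
        }
        return due;
    }
}
```

With a few thousand rows it would likely be better to push this comparison into the database query; the sketch only shows the rule itself. Running it on multiple hosts still needs some form of locking or clustering so that two hosts do not enqueue the same request.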
We need a scheduled job in a Java EE server and we know how to use Quartz or the Timer service.
But our question is, if we want to change the schedule on production or manually trigger the batch, how to do it?
In the traditional solution, we use a servlet to run the job, and then use a cron job with an HTTP client (e.g. lynx) to trigger the servlet. It's easy to implement and the schedule can be changed in production.
I have never found Timers to be entirely satisfactory because of this exact problem: you can't really monitor their status or modify them.
What I recommend is a second layer job manager class. When you call this class, it schedules a Java EE timer for time 'X', and it also records the fact that you want to execute a 'job' at time 'X'. When that time arrives, the Java EE timer calls this job manager class, which finds the job, and calls the job.
What this allows you to do is to write an "unschedule" function. Calling unschedule would remove the job. When the Java EE timer calls at time 'X' this class does not find any job, and so ignores it.
You can also implement a "change schedule" function that removes the old entry and creates a new entry at time 'Y', scheduling a Java EE timer for time 'Y'. Java EE timer events will arrive at both time 'X' and time 'Y', but only the one at time 'Y' will have any effect.
Thus manual triggering is a matter of having a servlet that calls "change schedule" with the new time set to right now.
One other detail to be careful of: because timer events are not completely reliable, we implement this class to find all the jobs that were scheduled before the current time and run all of them at that moment. We then schedule extra Java EE timer events every 5 minutes or so. Those timers pick up any jobs that, for one reason or another, were left behind. This matters if your job queue is persistent: the server might be restarting, and therefore down, at exactly the moment the timer was supposed to go off. No problem: Java EE timer events themselves have no meaning, they just serve to wake up the job handler so it can run all the overdue jobs.
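The job manager described above could look roughly like the following sketch. `JobManager`, its method names, and the in-memory map are all assumptions made for illustration; a persistent queue would use a database table instead, and the actual creation of Java EE timers is left out. The key point is that `onTimer` ignores which timer woke it up and simply runs everything that is due.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

final class JobManager {

    // One recorded job: when it should run and what to run.
    private static final class Pending {
        final Instant dueAt;
        final Runnable job;

        Pending(Instant dueAt, Runnable job) {
            this.dueAt = dueAt;
            this.job = job;
        }
    }

    private final Map<String, Pending> jobs = new HashMap<>();

    // Records the job. A real implementation would also create a Java EE timer for dueAt here.
    synchronized void schedule(String id, Instant dueAt, Runnable job) {
        jobs.put(id, new Pending(dueAt, job));
    }

    // Removes the job; a timer that fires later finds nothing to do for it.
    synchronized boolean unschedule(String id) {
        return jobs.remove(id) != null;
    }

    // Replaces the recorded time; the timer for the old time becomes a harmless wake-up.
    synchronized boolean changeSchedule(String id, Instant newDueAt) {
        Pending old = jobs.get(id);
        if (old == null) {
            return false;
        }
        jobs.put(id, new Pending(newDueAt, old.job));
        return true;
    }

    // Called from every timer callback, including the periodic "catch-up" timers.
    // Runs and removes every job whose time is at or before 'now'; returns how many ran.
    int onTimer(Instant now) {
        List<Runnable> due = new ArrayList<>();
        synchronized (this) {
            Iterator<Pending> it = jobs.values().iterator();
            while (it.hasNext()) {
                Pending p = it.next();
                if (!p.dueAt.isAfter(now)) {
                    due.add(p.job);
                    it.remove();
                }
            }
        }
        for (Runnable job : due) {
            job.run(); // run outside the lock so jobs may call schedule() themselves
        }
        return due.size();
    }
}
```

A manual trigger from a servlet is then just `changeSchedule(id, Instant.now())` followed by `onTimer(Instant.now())`.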