I'm trying to understand how Quartz Scheduler works in a clustered environment. I believe that pointing multiple instances of the scheduler app at the same DB and setting isClustered=true ensures that only one scheduler fires a given job at a time. However, I have the following questions:
Who ensures that only one scheduler executes the job and how?
Can two scheduler instances have the same name? (Instance IDs are auto-generated, so I assume those will be distinct.) (org.quartz.scheduler.instanceName = MyScheduler)
Who sets DB parameters like next fire time?
Ideally, should any of the 11 or so predefined tables (e.g. QRTZ_TRIGGERS) be pre-populated? Or are they populated from the beans in the application on app startup?
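For reference, the kind of per-node configuration I have in mind looks roughly like this (the values are just illustrative, not from an actual deployment):

    # same instanceName on every node; a unique instanceId is generated per node
    org.quartz.scheduler.instanceName = MyScheduler
    org.quartz.scheduler.instanceId = AUTO

    org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
    org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
    org.quartz.jobStore.tablePrefix = QRTZ_
    org.quartz.jobStore.isClustered = true
    org.quartz.jobStore.clusterCheckinInterval = 20000
    org.quartz.jobStore.dataSource = myDS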
Related: Quartz Clustering - triggers duplicated when the server starts
I'm using Quartz Scheduler to manage scheduled jobs in a Java-based clustered environment. There are a handful of nodes in the cluster at any given time, and they all run Quartz, backed by a data store in a PostgreSQL database that all nodes connect to.
When an instance is initialized, it tries to create or update the jobs and triggers in the Quartz data store by executing this code:
private void createOrUpdateJob(JobKey jobKey, Class<? extends org.quartz.Job> clazz, Trigger trigger) throws SchedulerException {
    JobBuilder jobBuilder = JobBuilder.newJob(clazz).withIdentity(jobKey);
    if (!scheduler.checkExists(jobKey)) {
        // if the job doesn't already exist, we can create it, along with its trigger. this prevents us
        // from creating multiple instances of the same job when running in a clustered environment
        scheduler.scheduleJob(jobBuilder.build(), trigger);
        log.error("SCHEDULED JOB WITH KEY " + jobKey.toString());
    } else {
        // if the job has exactly one trigger, we can just reschedule it, which allows us to update the
        // schedule for that trigger.
        List<? extends Trigger> triggers = scheduler.getTriggersOfJob(jobKey);
        if (triggers.size() == 1) {
            scheduler.rescheduleJob(triggers.get(0).getKey(), trigger);
            return;
        }
        // if for some reason the job has multiple triggers, it's easiest to just delete and re-create the job,
        // since we want to enforce a one-to-one relationship between jobs and triggers
        scheduler.deleteJob(jobKey);
        scheduler.scheduleJob(jobBuilder.build(), trigger);
    }
}
This approach solves a number of problems:
If the environment is not properly configured (i.e. jobs/triggers don't exist), then they will be created by the first instance that launches
If the job already exists, but I want to modify its schedule (change a job that used to run every 7 minutes to now run every 5 minutes), I can define a new trigger for it, and a redeploy will reschedule the triggers in the database
Exactly one instance of a job will be created, because we always refer to jobs by the specified JobKey, which is defined by the job itself. This means that jobs (and their associated triggers) are created exactly once, regardless of how many nodes are in the cluster, or how many times we deploy.
This is all well and good, but I'm concerned about a potential race condition when two instances are started at exactly the same time. Because there's no global lock around this code that all nodes in the cluster will respect, if two instances come online at the same time, I could end up with duplicate jobs or triggers, which kind of defeats the point of this code.
Is there a best practice for automatically defining Quartz jobs and triggers in a clustered environment? Or do I need to resort to setting my own lock?
I am not sure if there is a better way to do this in Quartz. But in case you are already using Redis or Memcached, I would recommend letting all instances perform an atomic increment against a well-known key. If the code you pasted is supposed to run only one job per cluster per hour, you could do the following:
long timestamp = System.currentTimeMillis() / 1000 / 60 / 60;
String key = String.format("%s_%d", jobId, timestamp);
// this will only be true for one instance in the cluster per (job, timestamp) tuple
boolean shouldExecute = redis.incr(key) == 1;
if (shouldExecute) {
    // run the mutually exclusive code
}
The timestamp gives you a moving window within which the instances compete to execute the job.
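A more concrete sketch of that idea, assuming the Jedis client and a caller-supplied task (the class, key layout and expiry are my own illustration, not part of the original suggestion):

    import redis.clients.jedis.Jedis;

    public class ClusterOnceGuard {

        private final Jedis jedis;

        public ClusterOnceGuard(Jedis jedis) {
            this.jedis = jedis;
        }

        // Runs the given task at most once per cluster within the current one-hour window.
        // The first node whose INCR returns 1 "wins"; every other node skips the work.
        public void runOncePerHour(String jobId, Runnable task) {
            long window = System.currentTimeMillis() / 1000 / 60 / 60;   // hour-sized window
            String key = String.format("%s_%d", jobId, window);

            long count = jedis.incr(key);      // atomic across the whole cluster
            jedis.expire(key, 2 * 60 * 60);    // let old window keys expire on their own

            if (count == 1) {
                task.run();                    // only the winning node executes
            }
        }
    }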
I had (almost) the same problem: how to create triggers and jobs exactly once per software version in a clustered environment. I solved it by designating one of the cluster nodes as the lead node during start-up and letting it re-create the Quartz jobs. The lead node is the one that first successfully inserts the Git revision number of the running software into the database. The other nodes use the Quartz configuration created by the lead node. Here's the complete solution: https://github.com/perttuta/quartz
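The core of that trick is just a unique-key insert: a minimal JDBC sketch, assuming a DEPLOYMENT_LEADER table with a unique constraint on its GIT_REVISION column (table and column names are my own illustration):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class LeaderElection {

        // Returns true only on the node whose INSERT succeeds; a duplicate-key
        // failure means another node already claimed leadership for this revision.
        public static boolean tryBecomeLeader(Connection conn, String gitRevision) {
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO DEPLOYMENT_LEADER (GIT_REVISION) VALUES (?)")) {
                ps.setString(1, gitRevision);
                ps.executeUpdate();
                return true;   // we won: (re-)create the Quartz jobs on this node
            } catch (SQLException e) {
                return false;  // unique constraint violated: another node is the lead
            }
        }
    }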
I want a scheduler that creates a job/task/thread per entry in my database table.
Further, I want a mechanism to start, pause, stop, and restart each job without affecting the other jobs/tasks/threads. At any moment, I should be able to create a new job or delete one.
I am planning to handle all the job-related operations mentioned above through a web application hosted on a Tomcat server.
Which java scheduler should I opt for and how do I start with this?
You can use Quartz Scheduler
In Quartz I would suggest using JDBCJobStore; it allows you to store your jobs in the database.
As for creating a job per entry in the DB, you can have a job creator factory which reads each entry from your DB and creates a job for it; that job will then be stored in the database.
Ideally it is better to have a separate job class (implementing Job) for every job you want to create. But since you want to create jobs dynamically, you can have a general job class which takes the context of that job (via its JobDataMap) and performs the respective operation, based on that context, inside the overridden execute method of your general job class.
Using Quartz it's possible to start, stop, and pause your jobs without affecting other jobs.
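A rough sketch of that idea, assuming a jobs table read elsewhere and a generic job class driven by its JobDataMap (the class names, group name and "taskType" key are all illustrative):

    import org.quartz.*;

    // Generic job: the actual work is chosen at execution time from the JobDataMap.
    public class GenericDatabaseJob implements Job {
        @Override
        public void execute(JobExecutionContext context) throws JobExecutionException {
            String taskType = context.getMergedJobDataMap().getString("taskType");
            // dispatch on taskType and perform the corresponding work here
        }
    }

    class JobRegistrar {
        // Called once per row read from the jobs table.
        static void scheduleFromRow(Scheduler scheduler, String rowId, String taskType, String cron)
                throws SchedulerException {
            JobDetail job = JobBuilder.newJob(GenericDatabaseJob.class)
                    .withIdentity("job-" + rowId, "db-jobs")
                    .usingJobData("taskType", taskType)
                    .build();

            Trigger trigger = TriggerBuilder.newTrigger()
                    .withIdentity("trigger-" + rowId, "db-jobs")
                    .withSchedule(CronScheduleBuilder.cronSchedule(cron))
                    .build();

            scheduler.scheduleJob(job, trigger);

            // Later, pausing/resuming/deleting this one job leaves the others untouched:
            // scheduler.pauseJob(JobKey.jobKey("job-" + rowId, "db-jobs"));
            // scheduler.resumeJob(JobKey.jobKey("job-" + rowId, "db-jobs"));
            // scheduler.deleteJob(JobKey.jobKey("job-" + rowId, "db-jobs"));
        }
    }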
Hope this helps!
One problem: after jobs execute, Quartz deletes them from the QRTZ_TRIGGERS table in the database, but in a specific situation I need to repeat a job that failed.
Are there any configuration options, or another way, to keep jobs stored in a table after they execute?
Thanks
If you are using JDBCJobStore, your jobs are stored in the QRTZ_JOB_DETAILS table, your simple triggers are stored in QRTZ_SIMPLE_TRIGGERS, your cron triggers are stored in QRTZ_CRON_TRIGGERS, and all triggers are stored in QRTZ_TRIGGERS.
If you want your job to be durable and remain stored even when no triggers are associated with it, you should call storeDurably(true) when building your JobDetail. For example:
JobDetail jobDetail = JobBuilder.newJob()
        .ofType(DataMapJob.class)
        .withIdentity("dataJob", "dataJobGroup")
        .storeDurably(true)
        .requestRecovery(true)
        .build();
Hope it helps.
This is exactly what the durable flag is for. Durable jobs remain registered in Quartz even if there are no triggers associated with them. Non-durable jobs, on the other hand, are automatically deleted by Quartz once there are no associated triggers (e.g. after all associated triggers have fired and been deleted by Quartz).
For details, you can refer to JobDetailImpl javadoc.
I have 5 different Quartz schedulers which run 5 different jobs. If I stop one scheduler, the remaining schedulers get stopped as well. Why?
I'm pretty sure you're actually creating references to the same Scheduler; you need to give each scheduler a different SchedulerName. At the moment it looks like each time you create a new scheduler it's defaulting the SchedulerName.
The "job executor" is actually not the SchedulerFactoryBean. It is the Scheduler bean(to be precise calling its start method invokes the aggregated QuartzScheduler.start method which fires the Trigger-s), provided by the SchedulerFactoryBean. As a matter of fact this Scheduler is stored(and looked-up) under the schedulerName(which if not explicitly set has the same default value for every configured SchedulerFactoryBean) in the SchedulerRepository singleton(SchedulerRepository.getInstance()).
That's how unless you set a different schedulerName for your SchedulerFactoryBean-s, you will always get the same scheduler by each and every SchedulerFactoryBean-s
http://forum.springsource.org/showthread.php?40945-Multiple-Quartz-SchedulerFactoryBean-instances
I know this refers to Spring beans, but I still think the same applies here.
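A minimal sketch of the fix, assuming the schedulers are built with the plain StdSchedulerFactory (with Spring's SchedulerFactoryBean, calling setSchedulerName achieves the same thing); the property values below are illustrative:

    import java.util.Properties;
    import org.quartz.Scheduler;
    import org.quartz.SchedulerException;
    import org.quartz.impl.StdSchedulerFactory;

    public class NamedSchedulers {

        // Each scheduler gets its own instanceName, so the SchedulerRepository holds
        // five distinct Scheduler instances instead of five references to the same one.
        public static Scheduler createScheduler(String name) throws SchedulerException {
            Properties props = new Properties();
            props.setProperty("org.quartz.scheduler.instanceName", name);
            props.setProperty("org.quartz.threadPool.class", "org.quartz.simpl.SimpleThreadPool");
            props.setProperty("org.quartz.threadPool.threadCount", "3");
            return new StdSchedulerFactory(props).getScheduler();
        }
    }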
I am working on an application where we have 100s of jobs that need to be scheduled for execution.
Here is my sample quartz.properties file:
org.quartz.scheduler.instanceName=QuartzScheduler
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.threadPool.threadCount=7
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.MSSQLDelegate
org.quartz.jobStore.tablePrefix = QRTZ_
org.quartz.jobStore.dataSource = myDS
org.quartz.dataSource.myDS.driver=com.mysql.jdbc.Driver
org.quartz.dataSource.myDS.URL=jdbc:mysql://localhost:3306/quartz
org.quartz.dataSource.myDS.user=root
org.quartz.dataSource.myDS.password=root
org.quartz.dataSource.myDS.maxConnections=5
This is working fine, but we are planning to separate the jobs into different groups so that they are easier to maintain.
Groups will be unique, and we want that when a user (admin) creates a new group, a new scheduler instance is created, and all jobs within that group are handled by that scheduler instance from then on.
This means that if the admin creates a new group, say NewProductNotification, we should be able to create a scheduler instance with the same name, NewProductNotification, and all jobs that are part of the NewProductNotification group should be handled by that scheduler instance.
How is this possible, and how can we store this information in the database so that the next time the server is up, Quartz knows about all the scheduler instances? Or do we need to add the information about each new instance to the properties file?
As the properties file above shows, we are using the JDBC job store to handle everything in the database.
I don't think dynamically creating schedulers is a good design approach in Quartz. You can share the same database tables between multiple schedulers (job details and triggers have the scheduler name as part of their primary key), but a Scheduler is fairly heavyweight.
Can you explain why you really need separate schedulers? Maybe you can simply use job groups and trigger groups (you are in fact already using the term group) to distinguish your jobs and triggers. You can also use different priorities for each trigger.
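For instance, a sketch of the group-based approach with a single scheduler (the job class, names and schedule below are illustrative):

    import org.quartz.*;
    import org.quartz.impl.matchers.GroupMatcher;

    public class GroupedScheduling {

        // One scheduler; jobs are partitioned by group name instead of by scheduler instance.
        public void scheduleNewProductNotification(Scheduler scheduler) throws SchedulerException {
            JobDetail job = JobBuilder.newJob(NotificationJob.class)   // NotificationJob is hypothetical
                    .withIdentity("newProductMail", "NewProductNotification")
                    .build();

            Trigger trigger = TriggerBuilder.newTrigger()
                    .withIdentity("newProductMailTrigger", "NewProductNotification")
                    .withSchedule(SimpleScheduleBuilder.repeatMinutelyForever(5))
                    .withPriority(7)   // triggers can also be prioritised individually
                    .build();

            scheduler.scheduleJob(job, trigger);

            // Group-level operations act on every job in the group at once:
            scheduler.pauseJobs(GroupMatcher.jobGroupEquals("NewProductNotification"));
            scheduler.resumeJobs(GroupMatcher.jobGroupEquals("NewProductNotification"));
        }
    }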
As a side note:
I'm using JobStoreCMT and I'm seeing deadlocks, what can I do?
Make sure you have at least number-of-threads-in-thread-pool + 2 connections in your datasources.
And in your configuration (reverse the values and it will be fine):
org.quartz.threadPool.threadCount=7
org.quartz.dataSource.myDS.maxConnections=5
From: I'm using JobStoreCMT and I'm seeing deadlocks, what can I do?
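In other words, something along these lines satisfies the threads + 2 rule (the exact numbers are just one possible choice):

    org.quartz.threadPool.threadCount=5
    org.quartz.dataSource.myDS.maxConnections=7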
Dynamically creating schedules is very much possible. You need to create JobDetail and Trigger objects and pass them to the Scheduler (which you can obtain from the SchedulerFactoryBean); it will take care of the rest.
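A minimal sketch, assuming an already-configured Spring SchedulerFactoryBean (the ReportJob class and cron expression are purely illustrative):

    import org.quartz.*;
    import org.springframework.scheduling.quartz.SchedulerFactoryBean;

    public class DynamicScheduling {

        public void scheduleNightlyReport(SchedulerFactoryBean schedulerFactoryBean) throws SchedulerException {
            Scheduler scheduler = schedulerFactoryBean.getScheduler();

            JobDetail job = JobBuilder.newJob(ReportJob.class)
                    .withIdentity("nightlyReport", "reports")
                    .build();

            Trigger trigger = TriggerBuilder.newTrigger()
                    .withIdentity("nightlyReportTrigger", "reports")
                    .withSchedule(CronScheduleBuilder.cronSchedule("0 0 2 * * ?"))
                    .build();

            scheduler.scheduleJob(job, trigger);   // persisted by whichever job store is configured
        }
    }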