Requirement is
1. Pick up tasks from database and call web service for those tasks
2. Need to do this in a WebLogic cluster where only a single instance of the scheduler/executor should run.
We have Hazelcast support, so I am thinking of getting a Java ExecutorService from Hazelcast. This ExecutorService will "pick tasks from DB and execute web service call", so each node will have to go through this ExecutorService.
Is this the right approach?
My main concern is to avoid making duplicate calls in the cluster.
The reason I do not want to use the Quartz scheduler is that I cannot store the Quartz scheduler in Hazelcast.
Hazelcast doesn't support a ScheduledExecutorService. There is an open issue for it here.
In my opinion, you should use a queue: put tasks on this queue, and poll the queue on each node. You'll be sure to invoke each task only once, and the work will be distributed. This kind of implementation is not fully fault-tolerant, though. If a node crashes during the execution of a task, that task will be lost.
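The queue approach above can be sketched roughly as follows. This is a minimal illustration, not a Hazelcast setup: it uses an in-JVM LinkedBlockingQueue and threads to stand in for cluster nodes. Hazelcast's IQueue extends java.util.concurrent.BlockingQueue, so the same worker loop should apply to a queue obtained from HazelcastInstance.getQueue(...). The class name QueueWorkerSketch, the method names, and the idle timeout are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch: threads play the role of cluster nodes.
class QueueWorkerSketch {

    /** Polls the shared queue until it stays empty for idleMillis, passing each task to the handler. */
    static Runnable worker(BlockingQueue<String> queue, Consumer<String> handler, long idleMillis) {
        return () -> {
            try {
                String task;
                // poll() removes the head atomically, so two workers never receive the same task.
                while ((task = queue.poll(idleMillis, TimeUnit.MILLISECONDS)) != null) {
                    handler.accept(task); // e.g. the web service call
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
    }

    /** Lets `nodes` workers drain one queue and returns the tasks that were handled. */
    static List<String> drain(List<String> tasks, int nodes) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(tasks);
        List<String> handled = new CopyOnWriteArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(nodes);
        for (int i = 0; i < nodes; i++) {
            pool.execute(worker(queue, handled::add, 100));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining", e);
        }
        return handled;
    }

    public static void main(String[] args) {
        List<String> handled = drain(Arrays.asList("task-1", "task-2", "task-3", "task-4"), 2);
        System.out.println(handled.size() + " tasks handled");
    }
}
```

As noted above, a task that has been polled but not finished when its node dies is gone; the sketch makes no attempt to recover it.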
Related
I am working on an application which is deployed on WebSphere Application Server 8.0. This application inserts records into one table and obtains its data source via JNDI lookup.
I need to create a batch job which will read data from the above table and insert it into another table at a fixed interval. It will be deployed on the same WAS server and use the same JNDI lookup for the data source.
I read on the internet that WebSphere Application Server scheduling is an option and is done using EJB and session beans.
I also read about the JDK's ScheduledThreadPoolExecutor. I can create a WAR containing a ScheduledThreadPoolExecutor implementation and deploy it on WAS for this.
I tried to find the difference between these two in terms of usage, complexity, performance and maintainability but could not.
Please help me decide which approach is better for creating the scheduler for insert batch jobs, and why. If the WAS scheduler is better, please provide a link that explains how to create and deploy it.
Thanks!
Some major differences between WAS Scheduler and Java SE ScheduledThreadPoolExecutor are that WAS Scheduler is transactional (task execution can roll back or commit), persistent (tasks are stored in a database), and can coordinate across members of a cluster (such that tasks can be scheduled from any member but only run on one member). ScheduledThreadPoolExecutor is a much lighter-weight approach because it doesn't have any of these capabilities and does all of its scheduling within a single JVM. Task executions neither roll back nor retry, and they are not kept externally in a database in case the server goes down. It should be noted that WebSphere Application Server also has CommonJ TimerManager (and AlarmManager via WorkManager), which are more similar to what you get with ScheduledThreadPoolExecutor if that is what you want. In that case, the application server still manages the threads and ensures that the context of the scheduling thread is available on the thread of execution. Hope this helps with your decision.
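For comparison, the lighter-weight option might look something like the sketch below, assuming plain Java SE with no container involvement. The class name BatchSchedulerSketch, the runBatches helper, and the fixed run count are hypothetical and exist only to keep the example finite; the batchStep Runnable is where the read-and-insert work would go. Nothing here is persisted or coordinated across JVMs, which is exactly the limitation described above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of in-JVM scheduling with ScheduledThreadPoolExecutor.
class BatchSchedulerSketch {

    /**
     * Runs batchStep up to `runs` times with a fixed delay between executions,
     * then shuts the scheduler down. Returns how many runs completed normally.
     */
    static int runBatches(Runnable batchStep, int runs, long delayMillis) {
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(runs);

        scheduler.scheduleWithFixedDelay(() -> {
            if (done.getCount() == 0) {
                return; // requested number of runs already reached
            }
            try {
                batchStep.run(); // e.g. copy rows from one table to another
                completed.incrementAndGet();
            } catch (RuntimeException e) {
                // An exception escaping the task would cancel all later executions,
                // and nothing retries or rolls back for us, so handle it here.
                System.err.println("batch run failed: " + e);
            } finally {
                done.countDown();
            }
        }, 0, delayMillis, TimeUnit.MILLISECONDS);

        try {
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        int n = runBatches(() -> System.out.println("copying rows..."), 3, 200);
        System.out.println(n + " batch runs completed");
    }
}
```

scheduleWithFixedDelay waits for one run to finish before timing the next, so runs on this executor do not overlap. The threads it creates are not managed by the application server, which is the gap the CommonJ TimerManager mentioned above is meant to fill.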
When I trigger a job in Quartz in a clustered setup, does that trigger job only on the same machine, or any machine in the clustered setup?
Quartz documentation on Clustering says (emphasis mine):
Only one node will fire the job for each firing. For example, if the job has a repeating trigger that tells it to fire every 10 seconds, then at 12:00:00 exactly one node will run the job, and at 12:00:10 exactly one node will run the job, etc. It won't necessarily be the same node each time - it will more or less be random which node runs it. The load balancing mechanism is near-random for busy schedulers (lots of triggers) but favors the same node that was just active for non-busy (e.g. one or two triggers) schedulers.
Basically, once a job is scheduled to run, this information is written to the database. Any node from the cluster can read from this database and run the job.
I have Quartz schedulers load balanced across a four-node cluster. Could you please let me know whether there is some simple operation that will block the Quartz scheduler from running on one instance?
All my jobs use the JDBC job store. My requirement is to prevent jobs from kicking off on one particular instance of the cluster.
Any Suggestions?
I need to have a set of jobs running, but they won't be queued the way something like RabbitMQ (or similar software) works. They will run continuously and perform some action periodically (like a cron job) while making sure they don't overlap. So if a job hasn't finished by the time it is scheduled to run again, it must not start again; otherwise we end up with the same job running twice.
Is there any software that can handle this and provide such features, so I don't end up with a script like while (true) {do...}?
It seems that DisallowConcurrentExecution, which is part of the Quartz Scheduler API, is what you are looking for.
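To make the "don't start again while still running" requirement concrete, here is a plain-JDK sketch of a guard that skips a run when the previous one is still active. It is not Quartz code and does not reproduce Quartz's exact behaviour (Quartz holds back the next firing instead of silently dropping it); in Quartz you would annotate the job class with @DisallowConcurrentExecution. The class name NonOverlappingTask and the tryRun method are made up for this illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical wrapper: at most one execution of the delegate is active at a time.
class NonOverlappingTask implements Runnable {
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final Runnable delegate;

    NonOverlappingTask(Runnable delegate) {
        this.delegate = delegate;
    }

    /** Returns true if the delegate ran, false if it was skipped because a run was still active. */
    boolean tryRun() {
        // compareAndSet succeeds for exactly one caller while the flag is false.
        if (!running.compareAndSet(false, true)) {
            return false;
        }
        try {
            delegate.run();
            return true;
        } finally {
            running.set(false); // allow the next scheduled run
        }
    }

    @Override
    public void run() {
        tryRun();
    }

    public static void main(String[] args) {
        NonOverlappingTask task = new NonOverlappingTask(() -> System.out.println("working"));
        System.out.println("ran: " + task.tryRun());
    }
}
```

A guard like this only covers a single JVM; preventing overlap across several machines still needs shared state such as the Quartz JDBC job store.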
I have a webapp which will run on 2 different machines. From the webapp it is possible to "order" jobs to be executed at specific times by quartz. quartz is running inside the webapp. Thus quartz is running on each of the two machines.
I am using JDBC datastore to persist the jobs, triggers etc.
However, the idea is that only one of the machines will run jobs and the other will only use quartz to schedule jobs. Thus, the scheduler will only be started (scheduler.start()) on one of the machines.
In the documentation it says
Never run clustering on separate machines, unless their clocks are synchronized using some form of time-sync service (daemon) that runs very regularly (the clocks must be within a second of each other). See http://www.boulder.nist.gov/timefreq/service/its.htm if you are unfamiliar with how to do this.
Never start (scheduler.start()) a non-clustered instance against the same set of database tables that any other instance is running (start()ed) against. You may get serious data corruption, and will definitely experience erratic behavior.
And I'm not sure that the two machines on which my webapp is running have their clocks synchronized.
My question is this: should I still run Quartz in clustering mode for this setup, when only one of the Quartz instances will be started and run jobs while the other instance will only be used for scheduling jobs to be executed by the first instance?
What about simply starting the scheduler on one node only and accessing it remotely on another machine? You can schedule jobs using RMI/JMX. Or you can use a RemoteScheduler adapter.
Basically instead of having two clustered schedulers where one is working and another one is only accessing the shared database you have only a single scheduler (server) and you access it from another machine, scheduling and monitoring jobs via API.
If you will never call the start() method on the second node, then you shouldn't need to worry about clock synchronization.
However, you will want to set the isClustered config property to true to make sure that table-based locking is used when data is inserted/updated/deleted by both nodes.