Quartz clustering setup when running in Tomcat on multiple machines - java

I have a webapp which will run on 2 different machines. From the webapp it is possible to "order" jobs to be executed at specific times by Quartz. Quartz runs inside the webapp, so it is running on each of the two machines.
I am using the JDBC job store to persist the jobs, triggers, etc.
However, the idea is that only one of the machines will run jobs while the other will only use Quartz to schedule them. Thus, the scheduler will be started (scheduler.start()) on only one of the machines.
In the documentation it says
Never run clustering on separate machines, unless their clocks are synchronized using some form of time-sync service (daemon) that runs very regularly (the clocks must be within a second of each other). See http://www.boulder.nist.gov/timefreq/service/its.htm if you are unfamiliar with how to do this.
Never start (scheduler.start()) a non-clustered instance against the same set of database tables that any other instance is running (start()ed) against. You may get serious data corruption, and will definitely experience erratic behavior.
And I'm not sure that the two machines on which my webapp runs have their clocks synchronized.
My question is this: should I still run Quartz in clustering mode for this setup, when only one of the Quartz instances will be started and run jobs while the other instance will only be used to schedule jobs for the first instance to execute?

What about simply starting the scheduler on one node only and accessing it remotely on another machine? You can schedule jobs using RMI/JMX. Or you can use a RemoteScheduler adapter.
Basically, instead of having two clustered schedulers where one is working and the other only accesses the shared database, you have a single scheduler (the server) and access it from the other machine, scheduling and monitoring jobs via its API.
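A sketch of the quartz.properties for the RMI approach described above; the host name and scheduler name are placeholders, and the instanceName on the client must match the one on the server:

```properties
# --- server node (the only one that calls scheduler.start()) ---
org.quartz.scheduler.instanceName = MyScheduler
org.quartz.scheduler.rmi.export = true
org.quartz.scheduler.rmi.registryHost = localhost
org.quartz.scheduler.rmi.registryPort = 1099
org.quartz.scheduler.rmi.createRegistry = true

# --- client node (schedules jobs remotely, never starts the scheduler) ---
org.quartz.scheduler.instanceName = MyScheduler
org.quartz.scheduler.rmi.proxy = true
org.quartz.scheduler.rmi.registryHost = scheduler-host.example.com
org.quartz.scheduler.rmi.registryPort = 1099
```

On the client, StdSchedulerFactory then hands back a proxy to the remote scheduler, and calls like scheduleJob() are executed on the server.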

If you will never call the start() method on the second node, then you shouldn't need to worry about clock synchronization.
However, you will want to set the isClustered configuration property to true to make sure that table-based locking is used when data is inserted, updated, or deleted by both nodes.
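Concretely, that means both nodes would carry something like the following in quartz.properties (the check-in interval value is illustrative):

```properties
org.quartz.jobStore.isClustered = true
# how often (ms) a started node checks in with the cluster
org.quartz.jobStore.clusterCheckinInterval = 20000
```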

Related

Run Scheduled Cron method in Spring from single pod if multiple pods exist on Kubernetes

I am moving from a single pod (Docker image) to multiple pods for my Spring application on Kubernetes for load handling. But I am facing an issue because I have a cron scheduler method in my application which runs daily at a particular time. If I deploy multiple pods, they all run it simultaneously, and as a result multiple entries get saved into my DB, but I want only a single pod to execute that function.
I have thought of generating a Java UUID and saving it in the DB as the function starts execution on each pod, then, in the same function, putting a sleep timer of, say, 5 seconds and comparing the UUID from the DB in each pod. The pod which updated the value in the database last will match that value and will execute the rest of the method.
Is this a good approach? If not, please give suggestions that can be done to solve this issue.
You can use a quartz scheduler with JDBC store. This will take care of your requirement automatically.
In short: "Load-balancing occurs automatically, with each node of the cluster firing jobs as quickly as it can. When a trigger’s firing time occurs, the first node to acquire it (by placing a lock on it) is the node that will fire it."
Please refer to the official link for details: http://www.quartz-scheduler.org/documentation/quartz-2.3.0/configuration/ConfigJDBCJobStoreClustering.html
Also, you can move the workload to an isolated process, which is cleaner; see https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/ to get an idea of how.
A better approach is to configure your application with multiple Spring profiles and enable the cron job in only one profile. Follow this tutorial: https://faun.pub/spring-scheduler-in-multi-node-environment-49814e031e7c
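For the Kubernetes CronJob route mentioned above, a minimal manifest could look like the following; the name, image, and schedule are placeholders. The key detail is concurrencyPolicy: Forbid, which prevents overlapping runs:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-batch
spec:
  schedule: "0 3 * * *"        # run daily at 03:00
  concurrencyPolicy: Forbid    # never start a run while one is in progress
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: batch
            image: myregistry/my-batch-image:latest
            command: ["java", "-jar", "/app/batch.jar"]
          restartPolicy: OnFailure
```

Because Kubernetes creates exactly one Job per scheduled run, the "which pod executes it" question disappears entirely.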

Tomcat In-Memory Caching - replication on Cluster

I have a question on Tomcat clustering. I have a Java application in which we have implemented in-memory caching. Basically, when Tomcat starts, it loads a few objects from the database. These objects are stored in Tomcat's memory like static objects, so whenever we update something from the application, it writes to the database and also updates the object in memory.
My question is: if we implement clustering in Tomcat with 2 or more nodes, will those cached objects also be shared? Is that possible? I don't think it is. HttpSession objects can be shared using the session replication provided by Tomcat's DeltaManager or BackupManager, but can the in-memory objects also be shared?
Additionally, what happens to batch jobs that are running? Will they run multiple times, since there will be multiple Tomcat instances in the cluster and each would trigger the job? That would be a failure as well.
Any thoughts / ideas?
If you save something in memory, it will not be replicated unless you implement something specifically to send it to the other machines. Each JVM keeps its memory independent of the others.
In general, if you want to have replicated caching, a good solution is to use ehcache (http://www.ehcache.org/).
With regard to batch jobs, it depends on the library you use, but generally, if you use an established library (like http://www.quartz-scheduler.org/), it should be capable of making sure that only one instance runs the job. You may need to configure it accordingly.
The important thing is to test to make sure that any solution you put in place actually does what you expect it to do.
Good luck!
Whenever moving to a cluster or a cluster-like topology you need to revise your application solution design/architecture to make sure it will support multiple instance execution.
Data cached in memory by a given Tomcat instance WILL NOT be shared across instances in the cluster. You will need to move such data outside the Tomcat instance to a shared cache instance - Redis seems to be a popular option these days.
Job execution probably needs to be revised and customized to be driven by configuration. Create a boolean flag your app can read and kick off batch processing only if it is set. Select the node within the cluster you need the job to run on and set the flag to true there; set it to false on all other nodes. Note that Quartz will not ensure that only one instance of a job runs across the cluster unless it is configured with the clustered JDBC job store.
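The configuration-flag approach above could be sketched as follows; the property name batch.enabled is hypothetical, and in practice you would set it per node (e.g. via -Dbatch.enabled=true on exactly one instance):

```java
// Minimal sketch of gating batch processing on a per-node flag.
// The "batch.enabled" property name is an assumption, not a standard key.
public class BatchGate {

    // Returns true only on the node where the flag was explicitly enabled.
    public static boolean shouldRunBatch() {
        return Boolean.parseBoolean(System.getProperty("batch.enabled", "false"));
    }

    public static void main(String[] args) {
        if (shouldRunBatch()) {
            System.out.println("This node runs the batch job.");
        } else {
            System.out.println("Batch processing disabled on this node.");
        }
    }
}
```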

WebSphere application server scheduler or java scheduler for insert

I am working on an application which is deployed on WebSphere Application Server 8.0. This application inserts records into one table and obtains its data source via JNDI lookup.
I need to create a batch job which will read data from the above table and insert it into another table continuously at a fixed time interval. It will be deployed on the same WAS server and use the same JNDI lookup for the data source.
I read on the internet that WebSphere Application Server scheduling is an option, implemented using EJB session beans.
I also read about the JDK's ScheduledThreadPoolExecutor. I could create a WAR containing a ScheduledThreadPoolExecutor implementation and deploy it on WAS.
I tried to find the differences between the two in terms of usage, complexity, performance, and maintainability but could not.
Please help me decide which approach will be better for creating the scheduler for insert batch jobs, and why. And if the WAS scheduler is better, please provide a link describing how to create and deploy it.
Thanks!
Some major differences between the WAS Scheduler and Java SE's ScheduledThreadPoolExecutor are that the WAS Scheduler is transactional (task execution can roll back or commit), persistent (tasks are stored in a database), and can coordinate across members of a cluster (such that tasks can be scheduled from any member but only run on one member).
ScheduledThreadPoolExecutor is a much lighter-weight approach because it has none of these capabilities and does all of its scheduling within a single JVM. Task executions neither roll back nor retry, and tasks are not kept externally in a database in case the server goes down.
It should be noted that WebSphere Application Server also has the CommonJ TimerManager (and AlarmManager via WorkManager), which are more similar to what you get with ScheduledThreadPoolExecutor, if that is what you want. In that case, the application server still manages the threads and ensures that the context of the scheduling thread is available on the thread of execution. Hope this helps with your decision.
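The lighter-weight ScheduledThreadPoolExecutor approach described above can be sketched as follows; the interval, repeat count, and task body are illustrative, not part of any WAS API:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BatchScheduler {

    // Runs 'task' at a fixed interval within this JVM, returning after it
    // has executed 'times' times. No persistence, no transactions, no
    // cluster coordination - everything is lost if the server goes down.
    public static int runFixedRate(Runnable task, long periodMillis, int times)
            throws InterruptedException {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        CountDownLatch latch = new CountDownLatch(times);
        executor.scheduleAtFixedRate(() -> {
            task.run();
            latch.countDown();
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        latch.await();          // block until the task has fired 'times' times
        executor.shutdownNow(); // cancel any further executions
        return times - (int) latch.getCount();
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical batch step: in the real app this would copy rows
        // from one table to another via the JNDI-looked-up data source.
        runFixedRate(() -> System.out.println("copying rows..."), 100, 3);
    }
}
```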

How to stop quartz from running in one instance

I have Quartz schedulers load-balanced across a four-node cluster. Could you please let me know whether there is some simple operation that will block the Quartz scheduler from running on one of the instances?
All my jobs are JDBC-store jobs. My requirement is to disable jobs from kicking off on one instance alone in the cluster.
Any Suggestions?

Using Quartz in a clustered environment

I'm looking to use the Quartz scheduler in my application because I have a clustered environment and want to guarantee that only one instance of my job runs each hour. My question is: do I have to use a JDBC job store or some sort of "outside" storage of job data to guarantee that only one instance in my cluster runs the job at any given hour, or is there more magic to Quartz than I am aware of?
Yes, you need to use the JDBC-JobStore, or else the TerracottaJobStore, to enable a mechanism for the nodes to communicate with each other (in the one case they communicate through the DB tables, in the other via Terracotta's networking features).
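A clustered JDBC-JobStore setup could look like the quartz.properties sketch below; the scheduler name and data source name (myDS) are placeholders, and the delegate class may differ depending on your database:

```properties
# Same instanceName on every node; unique auto-generated id per node.
org.quartz.scheduler.instanceName = MyClusteredScheduler
org.quartz.scheduler.instanceId = AUTO

# JDBC job store with clustering enabled, so trigger firing is
# coordinated through row locks in the shared Quartz tables.
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.dataSource = myDS
org.quartz.jobStore.isClustered = true
org.quartz.jobStore.clusterCheckinInterval = 20000
```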