I have the same piece of software running on 2 servers. Each instance runs Quartz, which launches a job at a specific time.
Presuming that both servers have the clocks synchronized, these 2 jobs will start at the same time doing the same thing...
How can I make sure that only one of the jobs runs and the other one doesn't?
I also have a database, and my first thought was to create a table where a row is inserted when the job starts, and to check whether there is already a record for the current day (if so, skip the job execution...).
But again, if the clocks on the servers are perfectly synchronized, then both apps will write and check at the same time, making this mechanism useless.
What other solution can I implement?
Cluster 'em!
(source: quartz-scheduler.org)
Basically, both of your Quartz schedulers use the same database to synchronize, and the job will only run on one (idle) machine.
But again, if the clocks on the servers are perfectly synchronized, then both apps will write and check at the same time, making this mechanism useless.
This is exactly what Quartz does! However, it uses database locking/transaction mechanisms so that when one instance fetches new jobs, the other one must wait.
Presuming that both servers have the clocks synchronized
They must have synchronized clocks to run in a cluster:
Never run clustering on separate machines, unless their clocks are synchronized using some form of time-sync service
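For reference, a minimal quartz.properties sketch for such a clustered, JDBC-backed setup could look like this (the instance name, data source name, and table prefix are placeholders, and the data source itself must be configured separately):

org.quartz.scheduler.instanceName = MyClusteredScheduler
org.quartz.scheduler.instanceId = AUTO

# JDBC job store shared by every node in the cluster
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.tablePrefix = QRTZ_
org.quartz.jobStore.dataSource = myDS

# Let the nodes coordinate through the database
org.quartz.jobStore.isClustered = true
org.quartz.jobStore.clusterCheckinInterval = 20000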
I have a situation where a specific Quartz job must execute only once, at a specific time of day.
I want to know if there is a workaround to make sure this job will be executed even if the server, for whatever reason, is not up and running at that point in time.
If I switch from the RAMJobStore to a database-backed job store, does this guarantee a solution to the mentioned problem?
EDIT:
To make things clear, I'll use @Vala. D's comment:
if the server was down at this time to run it immediately after server-start
This behavior would cover my requirements.
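For what it's worth, that behavior maps to Quartz's misfire handling. A minimal sketch, assuming Quartz 2.x with a JDBC job store and an already-initialized Scheduler (the job class, names, and cron expression are placeholders):

import org.quartz.JobDetail;
import org.quartz.Trigger;
import static org.quartz.CronScheduleBuilder.cronSchedule;
import static org.quartz.JobBuilder.newJob;
import static org.quartz.TriggerBuilder.newTrigger;

JobDetail job = newJob(MyDailyJob.class)              // MyDailyJob is a placeholder
        .withIdentity("dailyJob", "examples")
        .requestRecovery(true)                        // re-run if a node died mid-execution
        .build();

Trigger trigger = newTrigger()
        .withIdentity("dailyTrigger", "examples")
        .withSchedule(cronSchedule("0 0 2 * * ?")     // 02:00 every day, placeholder
                // fire as soon as possible after a missed scheduled time,
                // e.g. immediately after server start
                .withMisfireHandlingInstructionFireAndProceed())
        .build();

scheduler.scheduleJob(job, trigger);

With the RAMJobStore this information is lost on restart, so persisting to a database is indeed required for this to work.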
I have a scheduled job that runs at the end of every month. After running, it saves some data to the database.
When I scale the app (for example, to 2 instances), both instances run the scheduled job, both save the data, and at the end of the day my database contains duplicate data.
So I want the scheduled job to run only once, regardless of the number of instances in the cloud.
In my project, I have maintained a database table to hold a lock for each job which needs to be executed only once in the cluster.
When a job gets triggered, it first tries to acquire the lock from the database, and only if it gets the lock does it execute. If it fails to acquire the lock, the job does not execute.
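As an illustration only (the table and column names here are made up), the acquire-or-skip logic can be as simple as an INSERT into a table with a unique constraint; the instance whose INSERT succeeds runs the job:

import java.sql.Connection;
import java.sql.Date;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.time.LocalDate;

// JOB_LOCK(job_key, run_date) has a unique constraint on (job_key, run_date)
boolean acquireLock(Connection con, String jobKey) {
    try (PreparedStatement ps = con.prepareStatement(
            "INSERT INTO JOB_LOCK (job_key, run_date) VALUES (?, ?)")) {
        ps.setString(1, jobKey);
        ps.setDate(2, Date.valueOf(LocalDate.now()));
        ps.executeUpdate();
        return true;   // this instance got the lock: run the job
    } catch (SQLException duplicateKey) {
        return false;  // another instance inserted first: skip
    }
}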
You can also look at the clustering feature of Quartz.
http://www.quartz-scheduler.org/documentation/2.4.0-SNAPSHOT/introduction.html
I agree with the comments. If you can utilize a scheduler that's going to be your best, most flexible option. In addition, a scheduler should be executing your job as a "task" on Cloud Foundry. The task will only run on one instance, so you won't need to worry about how many instances your application is using (the two are separate in that regard).
If you're using Pivotal Cloud Foundry/Tanzu Cloud Foundry there is a scheduler you can ask your operations team to install. I don't know about other variants of CF, but I assume there are other schedulers.
https://network.pivotal.io/products/p-scheduler/
If using a scheduler is not an option then this is a concern you'll need to handle in your application. The solution of using a shared lock is a good one, but there is also a little trick you can do on Cloud Foundry that I feel is a little simpler.
When your application runs, certain environment variables are set by the platform. There is one called INSTANCE_INDEX which holds a number indicating the instance on which the app is running. It's zero-based, so your first app instance will be running on instance zero, the second on instance one, etc.
In your code, simply look at the instance index and see if it's zero. If the index is non-zero, have your task end without doing anything. If it's zero, then let the task proceed and do its work. The task will execute on every application instance, but it will only do work on the first instance. It's an easy way to guarantee something like a database migration or background job only runs once.
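A minimal sketch of that check (the variable name is the one given in this answer; on newer foundations it may be CF_INSTANCE_INDEX, so verify which one your platform sets):

String index = System.getenv("INSTANCE_INDEX");  // "0" on the first instance
if (index == null || !index.equals("0")) {
    return;  // not instance zero: end the task without doing anything
}
// instance zero: proceed with the migration/background job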
One final option would be to use multiple processes. This is a feature of Cloud Foundry that enables you to have different processes running, like your web process and a background worker process.
https://docs.cloudfoundry.org/devguide/multiple-processes.html
The interesting thing about this feature is that you can scale the different processes independently of each other. Thus you could have as many web processes running, but only one background worker which would guarantee that your background process only runs once.
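As a rough sketch of what this could look like in a manifest (the app and process names are placeholders, and the processes block requires a platform/CLI version that supports it):

applications:
- name: my-app
  processes:
  - type: web
    instances: 3        # scale the web process freely
  - type: worker        # background job runner
    instances: 1        # exactly one copy, so the job only runs once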
That said, the downside of this approach is that you end up with separate containers for each process, and the background process would need to keep running. The foundation expects it to be a long-running process, not a finite-duration batch job. You could get around this by wrapping your periodic task in a loop or something else that keeps the process alive forever.
I wouldn't really recommend this option but I wanted to throw it out there just in case.
You can use the @SnapLock annotation on your method, which guarantees that the task only runs once. See the documentation in this repo: https://github.com/luismpcosta/snap-scheduler
Example:
Import the Maven dependency:
<dependency>
  <groupId>io.opensw.scheduler</groupId>
  <artifactId>snap-scheduler-core</artifactId>
  <version>0.3.0</version>
</dependency>
After importing the Maven dependency, you'll need to create the required tables.
Finally, see below how to annotate methods with the @SnapLock annotation to guarantee that they only run once:
import io.opensw.scheduler.core.annotations.SnapLock;
...
// the key identifies the task across nodes; see the repo docs for the time parameter
@SnapLock(key = "UNIQUE_TASK_KEY", time = 60)
@Scheduled(fixedRate = 30000)
public void reportCurrentTime() {
  ...
}
With this approach you also get an audit trail of task executions.
I have a Java application named 'X'. In a Windows environment, at a given point in time there might be more than one instance of the application.
I want a common piece of code to be executed sequentially in application 'X' no matter how many instances of the application are running. Is that possible, and how can it be achieved? Any suggestions will help.
Example: I have a class named Executor where a method execute() will be invoked. Assuming there might be two or more instances of the application at any given point in time, how can I have the method execute() run sequentially across the different instances?
Is there something like a lock which can be accessed from two instances, to check whether the lock is currently held? Any help?
I think what you are looking for is a distributed lock (i.e. a lock which is visible and controllable from many processes). There are quite a few 3rd party libraries that have been developed with this in mind and some of them are discussed on this page.
Distributed Lock Service
There are also some other suggestions in this post which use a file on the underlying system as a synchronization mechanism.
Cross process synchronization in Java
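For the file-based approach, java.nio file locks are held at the operating-system level and are therefore visible across JVMs. A rough sketch (the lock-file path is arbitrary):

import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

public class CrossProcessLock {
    public static void main(String[] args) throws Exception {
        File file = new File(System.getProperty("java.io.tmpdir"), "executor.lock");
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            FileLock lock = channel.lock();  // blocks until no other process holds it
            try {
                new Executor().execute();    // the Executor from the question
            } finally {
                lock.release();
            }
        }
    }
}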
To my knowledge, you cannot do this easily. You could implement TCP calls between the processes... but I wouldn't advise it.
You would be better off creating an external process in charge of executing the tasks, and requesting each task to execute by sending a message to a JMS queue that your executor process consumes.
...Or maybe you don't really need several processes running at the same time; what you might require is just one application with several threads performing things at the same time, with one thread dedicated to the Executor. That way, synchronizing the execute() method (or the whole Executor) would be enough and would spare you some work.
You cannot achieve this with Executors or anything like that, because the Java virtual machines will be separate.
If you really need to synchronize between multiple independent instances, one approach would be to dedicate an internal port and implement a simple internal server within the application. Look into ServerSocket, or into RMI as a full-blown solution if you need extensive communications. The first instance binds to the dedicated application port and becomes the master node. All later instances find the application port taken, but can then use it to make HTTP (or plain TCP/IP) calls to the master node, reporting the activities they need done.
As you only need to execute some action sequentially, any slave node may ask the master to do it rather than executing it itself.
A potential problem with this approach is that if the user shuts down the master node, it may be complex for another running node to take its place. If only one node is active at any time (receiving input from the user), it may take the role of master after discovering that the master is not responding and that the port is no longer occupied.
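A rough sketch of the bind-or-connect idea (the port number is arbitrary): the instance that wins the bind becomes the master, and the others detect the BindException and send their requests to it instead.

import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;
import java.net.Socket;

public class MasterElection {
    private static final int PORT = 45678;  // arbitrary dedicated port

    public static void main(String[] args) throws IOException {
        try {
            ServerSocket server = new ServerSocket(PORT);  // first instance wins
            System.out.println("I am the master node");
            while (true) {
                try (Socket slave = server.accept()) {
                    // ... read the slave's request and execute the action here ...
                }
            }
        } catch (BindException portTaken) {
            System.out.println("Master already running; forwarding my request");
            try (Socket toMaster = new Socket("localhost", PORT)) {
                // ... write the request for the master to execute ...
            }
        }
    }
}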
A distributed queue could be used for this type of load balancing. You put one or more 'request messages' into a queue, and the next available consumer application picks one up and processes it. Each such request message could describe your task to process.
This type of queue could be implemented as a JMS queue (e.g. using ActiveMQ http://activemq.apache.org/), or on Windows there is also MSMQ: https://msdn.microsoft.com/en-us/library/ms711472(v=vs.85).aspx.
If performance is an issue and you have C/C++ developers available, the 'shared memory queue' could also be interesting: shmemq API
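As a sketch of the JMS variant (the broker URL and queue name are placeholders; uses the ActiveMQ client): each message is delivered to exactly one of the competing consumers, which is what gives you the load balancing.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageConsumer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import org.apache.activemq.ActiveMQConnectionFactory;

// Consumer side: several instances can run this against the same queue.
ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
Connection connection = factory.createConnection();
connection.start();
Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
Queue queue = session.createQueue("task.requests");        // placeholder queue name
MessageConsumer consumer = session.createConsumer(queue);
TextMessage task = (TextMessage) consumer.receive();       // next available task
handleTask(task.getText());                                // hypothetical handler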
We're developing a web app and are coming to the end of development, and the client we're working with has suddenly sprung the fact on us that we will need to be able to handle load balancing.
We have batch jobs running which would need to run on both servers, but we don't want them to overlap. They are selecting rows from the database, processing the objects, and merging them back into the database. One of these jobs MUST run at the same time each day, though the others run every n minutes. We have about a week at most to get something working, and it'll become technical debt for us.
My question is: what quick and dirty hacks exist to get this working properly? We have a SQLServer 2008 instance and are running Java EE 6 on JBoss 5, which will be load balanced between two servers. We're using Spring 3 and JPA2 backed by Hibernate, and using the stock spring scheduler to schedule and run the jobs. Help me Obi Wan Kenobi; you're my only hope!
On JBoss 5 you need to use the Scheduler API as the simplest solution - the implementation is built on top of Quartz, and generally you would use a clustered configuration as described here:
http://quartz-scheduler.org/documentation/quartz-2.x/configuration/ConfigJDBCJobStoreClustering
Almost 10 years after this question was asked, I had the same need, and the 'quickest and dirtiest' solution for me was a load balancer using a shared file system, without any master.
Each worker locks and picks jobs from the shared file system, independently of the other workers. To balance load, each worker sleeps X seconds between job-polling iterations, where X is proportional to the load on that worker (in my case, load is the count of processes the worker has started in the background). Thus a heavily loaded worker sleeps longer, giving the other workers a higher probability of picking up the next job. The worker loops run under supervisor (Linux).
My use case was the execution of sparklyr client-mode jobs on a Spark/Hadoop cluster without overloading the edge nodes. It was implemented as a bash script within a few hours, was then scaled to 3 hosts, and has been stable for some months now - until there is time to invest in a better solution.
I'm working on a multi-user Java webapp, where it is possible for clients to use the webapp API to do potentially naughty things, by passing code which will execute on our server in a sandbox.
For example, it is possible for a client to write a tight while(true) loop that impacts the performance of other clients.
Can you guys think of ways to limit the damage caused by these sorts of behaviors to other clients' performance?
We are using Glassfish for our application server.
The halting problem shows that there is no way a computer can reliably identify code that will not terminate.
The only way to do this reliably is to execute the code in a separate JVM, which you then ask the operating system to shut down when it times out. A JVM that doesn't time out can process more tasks, so you can just reuse it.
One more idea would be byte-code instrumentation. Before you load the code sent by your client, manipulate it so it adds a short sleep in every loop and for every method call (or method entry).
This prevents clients from clogging a whole CPU until they are done. Of course, they still block a Thread object (which takes some memory), and the slowdown applies to every client, not only the malicious ones. Maybe make the first few tries free, then scale the waiting time up with each try (and scale it down again if the thread has to wait for other reasons).
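As a sketch of the method-entry variant using the ASM library (Throttle is a hypothetical helper whose static tick() method sleeps briefly):

import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

// Rewrites every method in a client class so it calls Throttle.tick() on entry.
public class ThrottlingClassVisitor extends ClassVisitor {
    public ThrottlingClassVisitor(ClassVisitor next) {
        super(Opcodes.ASM9, next);
    }

    @Override
    public MethodVisitor visitMethod(int access, String name, String descriptor,
                                     String signature, String[] exceptions) {
        MethodVisitor mv = super.visitMethod(access, name, descriptor, signature, exceptions);
        return new MethodVisitor(Opcodes.ASM9, mv) {
            @Override
            public void visitCode() {
                super.visitCode();
                // inject: Throttle.tick();  (hypothetical static method that sleeps)
                visitMethodInsn(Opcodes.INVOKESTATIC, "sandbox/Throttle", "tick", "()V", false);
            }
        };
    }
}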
Modern app servers use thread pooling for better performance. The problem is that one bad apple can spoil the bunch. What you need is an app server with one thread, or maybe one process, per request. Of course there are going to be trade-offs, but the OS will handle making sure that processing time gets allocated evenly.
NOTE: After researching a little more, what you need is an engine that will create another process per request. If not, a user can cripple your servlet engine by deploying servlets with infinite loops and then posting multiple requests. Or he could simply call System.exit in his code and bring everybody down.
You could use a parent thread to launch each request in a separate thread as suggested already, but then monitor the CPU time used by the threads using the ThreadMXBean class. You could then have the parent thread kill any threads that are misbehaving. This is if, of course, you can establish some kind of reasonable criteria for how much CPU time a thread should or should not be using. Maybe the rule could be that a certain initial amount of time plus a certain additional amount per second of wall clock time is OK?
I would make these client request threads have lower priority than the thread responsible for monitoring them.
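A rough sketch of such a watchdog (the CPU budget is made up, and interrupt() is only a cooperative kill - the sandboxed code has to observe it):

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class CpuWatchdog implements Runnable {
    private static final long CPU_BUDGET_NANOS = 5_000_000_000L;  // 5s of CPU, arbitrary
    private final Thread worker;

    public CpuWatchdog(Thread worker) {
        this.worker = worker;
    }

    @Override
    public void run() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        while (worker.isAlive()) {
            long cpu = mx.getThreadCpuTime(worker.getId());  // nanoseconds, -1 if unsupported
            if (cpu > CPU_BUDGET_NANOS) {
                worker.interrupt();  // misbehaving: ask it to stop
                return;
            }
            try {
                Thread.sleep(500);   // poll twice a second
            } catch (InterruptedException e) {
                return;
            }
        }
    }
}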