I am working on a project in which I can make at most 15k requests a day to a Google API. So I want to stop the job after 15k requests and resume it the next day.
How can I achieve this? Right now I am thinking of using the Quartz scheduler to schedule the job every day.
If anyone needs a fuller explanation, I can explain it in more detail.
Thanks in advance.
You can stop a step execution (and its surrounding job) using StepExecution#setTerminateOnly. So in your case you can use, for example, an ItemReadListener#afterRead or ItemWriteListener#afterWrite that has access to the step execution and sets the terminateOnly flag after processing 15k items. When you stop the job gracefully like this, its status will be STOPPED and you will be able to restart it the next day, as you mentioned.
You can find an example in the Stopping a Job Manually for Business Reasons section of the reference documentation.
Hope this helps.
I had something similar where I needed to stop a 24/7 job 5 minutes before server maintenance was scheduled to start.
The easiest approach I found was to have the Reader return null to indicate that the job should stop. In your case, return null once 15k API requests have been processed.
This will likely mean you'll need a bean (could be just an AtomicInteger) available to the Reader and updated by the Processor. But also a Job Listener (sorry, I don't have the code) which also knows about the bean. If the maximum is reached the Listener sets up a custom job exit value to be returned to the scheduler when the job stops. The scheduler has to be configurable enough to know the particular exit value means to start the job again the next day. (Any other non-zero value was treated as an error.)
This means there is a small possibility the job hits 15k but also that it is the last item, so the job is scheduled again for the next day even though there is nothing more to be processed. It shouldn't matter though - the job will start the next day and stop immediately with a normal complete status so the scheduler will not schedule again.
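The counting-reader idea above can be sketched in plain Java, without any framework. This is only an illustration under assumptions: `ItemSource` is a hypothetical stand-in for a reader contract in which null means "no more input", and `QuotaLimitedReader`, `apiCalls` and `dailyLimit` are made-up names, not Spring Batch APIs.

```java
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reader contract: returning null means "no more input".
interface ItemSource<T> {
    T read();
}

// Hands out items until either the input is exhausted or the shared
// counter of API calls has reached the daily limit.
class QuotaLimitedReader<T> implements ItemSource<T> {
    private final Iterator<T> source;
    private final AtomicInteger apiCalls; // incremented by the processing step
    private final int dailyLimit;

    QuotaLimitedReader(Iterator<T> source, AtomicInteger apiCalls, int dailyLimit) {
        this.source = source;
        this.apiCalls = apiCalls;
        this.dailyLimit = dailyLimit;
    }

    @Override
    public T read() {
        if (apiCalls.get() >= dailyLimit) {
            return null; // quota used up: signal "stop" to the caller
        }
        return source.hasNext() ? source.next() : null;
    }

    // A job listener could use this to choose the custom exit value.
    boolean quotaReached() {
        return apiCalls.get() >= dailyLimit;
    }
}
```

Note that `quotaReached()` is also true in the edge case described above, where the limit is hit exactly on the last item.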
I basically want to know whether the scheduler itself deletes triggers once they have fired and there are no other points in time at which they would ever fire again.
I need to know this so that I know how to tidy up after a job has been executed.
I have already read through many posts about triggers and jobs. I have also read through all the official Quartz lessons. The only thing I found there was that jobs are deleted when no more triggers point to them if their "durable" property is set to false. That is also how my question came up about how or when the scheduler deletes its triggers.
Yes, it automatically removes these triggers.
I've found some documentation for this topic: https://www.quartz-scheduler.org/api/2.1.7/org/quartz/SimpleTrigger.html
there is a line stating:
int getRepeatCount()
Get the the number of times the SimpleTrigger should repeat, after which it will be automatically deleted.
According to this Google article,
"You can also use the Task Queue to do the write at a later time, which has the added benefit that the Task Queue automatically retries failures."
Suppose I'm trying to keep my daily spend on Google App Engine under a certain budget. Let's say I start to detect I'm getting low on quota for the day so I want to reschedule the work for tomorrow. It would be great to use Task Queues for this instead of Cron jobs because the initiation of the work and the rescheduling of the work can be handled pretty similarly.
How do I put a task on the Task Queue and specify that it should not begin until a particular time? I can see how I might use RetryOptions to get part of what I want, namely to delay the work. But RetryOptions doesn't seem to provide a way to specify not to retry until 24 hours have passed since "now" or don't retry until midnight.
Thanks for your help.
Looks like I can use TaskOptions.countdownMillis(long) to specify how long to wait before executing the task.
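To turn "not before midnight" into a countdown, the delay can be computed with java.time. A minimal sketch, assuming the quota resets at midnight in some known time zone (which zone applies is an assumption you would need to check for your account); `QuotaReset` and `millisUntilNextMidnight` are hypothetical names.

```java
import java.time.Duration;
import java.time.ZonedDateTime;

class QuotaReset {
    // Milliseconds from 'now' until the start of the next day,
    // evaluated in the time zone carried by 'now'.
    static long millisUntilNextMidnight(ZonedDateTime now) {
        ZonedDateTime nextMidnight =
                now.toLocalDate().plusDays(1).atStartOfDay(now.getZone());
        return Duration.between(now, nextMidnight).toMillis();
    }
}
```

The resulting value is what would be handed to TaskOptions.countdownMillis(long) when enqueuing the task.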
The documentation says "later time", in the sense that your application doesn't stop to wait for your write to go through, so you work in parallel.
If you want to control WHEN to start a cleanup or something similar, look into CRON jobs
Suppose I have a Spring job that runs every 5 minutes. Usually the job takes about one minute to complete, but if something goes wrong it can last more than 5 minutes, so another run will start before the previous one has finished. Will the two runs interfere with each other?
ps: I use the spring schedule annotation to schedule jobs.
You can control this behavior. If you want to leave a fixed amount of time between the end of one job and the start of the next, use fixedDelay: http://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/scheduling/annotation/Scheduled.html#fixedDelay--
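If you use fixedRate, runs are scheduled relative to the start of the previous run, so a slow run can make the next one start late, and jobs may overlap if they are executed asynchronously. Whether that's "ok" depends on what your job does. But you can prevent this from happening with fixedDelay if you want.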
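To keep the example dependency-free, the sketch below does not use Spring. It uses the JDK's ScheduledExecutorService.scheduleWithFixedDelay, which measures the delay from the end of one run to the start of the next, as fixedDelay is documented to do. `FixedDelayDemo` and its parameters are hypothetical names; the sleep stands in for the real job.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class FixedDelayDemo {
    // Runs a simulated job 'runs' times and returns {startNanos, endNanos}
    // for each run, so the gaps between runs can be inspected.
    static List<long[]> run(int runs, long workMillis, long delayMillis) {
        List<long[]> spans = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(runs);
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        pool.scheduleWithFixedDelay(() -> {
            if (done.getCount() == 0) {
                return; // ignore any extra run that starts before shutdown
            }
            long start = System.nanoTime();
            try {
                Thread.sleep(workMillis); // stand-in for the real job
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            spans.add(new long[] {start, System.nanoTime()});
            done.countDown();
        }, 0, delayMillis, TimeUnit.MILLISECONDS);
        try {
            done.await(); // the latch also makes 'spans' safely visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            pool.shutdownNow();
        }
        return spans;
    }
}
```

With this scheduling mode, however long a run takes, the next one does not start until the configured delay has passed after it finished.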
I'm looking for a way to simulate or force a trigger misfire programmatically. Here's the scenario:
I have a job set to trigger but the job requires some underlying resource that may be unavailable at times. If the resource is unavailable, I would like Quartz to re-fire the trigger later based on the misfire policy.
I've explored two options that are similar but not quite what I'm looking for:
Throwing a JobExecutionException with refireImmediately set to true: works, but doesn't delay execution based on the misfire policy; this would hammer the resource availability check.
Scheduling a second trigger at some fixed interval of time in the future: also works, but doesn't take the misfire policy into account. This means a job could wind up with a bunch of retries queued up stemming from different failed runs.
Any ideas or anything I'm missing? Thanks!
If I got it right, you don't need to force or simulate a misfire because the resource availability is "something your Job can handle".
Misfires exist and are managed by Quartz for situations like server shutdown or other "unexpected problems" that prevent Job execution.
You have two options for implementing simple fault-tolerant retry logic.
Your Job can execute its logic only when the underlying resource is available, so you can:
Wait for the resource to become available:
in this case your job waits and repeats the check for resource availability; eventually, after a short timeout, the job can give up and end.
Just do nothing if the resource is not available:
in this case the job ends without doing anything, and it will fire normally and retry according to the Trigger definition.
In both cases, if the resource is available, the Job can execute its inner logic and use the underlying resource. (No misfires, because the Job was actually executed.)
This can be done using a Trigger, setting the misfire policy you need in case the Job cannot be executed at all. See this great and detailed article on Quartz misfires.
In your situation, if you want to execute a job once every day:
Define a Cron trigger that fires multiple times a day within a given span of time, for example, every 15 minutes from 8:00 AM to 12:45 PM:
0 0/15 8-12 * * ?
Build a Job that uses one of the two approaches described before.
The first time your Job's inner logic executes (that is, the resource is available), your job will save a "job executed" flag with the day of execution somewhere in the DB.
On the following trigger executions, the Job will check the flag and will not execute its inner logic again.
Also, if your job takes a long time to finish, you may want to prevent concurrent execution of the same job using the following annotation on the job implementation:
@DisallowConcurrentExecution
See the Quartz tutorials for more information on job execution.
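The "executed flag" steps above can be sketched in plain Java. This is a simplified illustration, not Quartz code: `OncePerDayJob` and `fire` are hypothetical names, the flag is kept in memory rather than in a DB, and in a real Quartz Job the body of `fire` would live in the execute method.

```java
import java.time.LocalDate;
import java.util.function.BooleanSupplier;

class OncePerDayJob {
    // Day of the last successful run. Kept in memory here for brevity;
    // the approach described above persists this flag in a database.
    private LocalDate lastSuccess;
    private int runs;

    // Called on every trigger firing. Returns true only if the inner
    // logic actually ran on this firing.
    boolean fire(LocalDate today, BooleanSupplier resourceAvailable) {
        if (today.equals(lastSuccess)) {
            return false; // already done today: skip the remaining firings
        }
        if (!resourceAvailable.getAsBoolean()) {
            return false; // do nothing; the next firing will try again
        }
        runs++; // stand-in for the job's inner logic
        lastSuccess = today;
        return true;
    }

    int runs() {
        return runs;
    }
}
```

Each firing of the cron trigger then acts as a retry until the first success of the day, after which later firings are no-ops.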
I have an application that checks a resource on the internet for new mails. If there are new mails, it does some processing on them. This means that, depending on the number of mails, processing might take anywhere from a few seconds to hours.
Now the object/program that does the processing is already a singleton. So right now I already took care of there really only being 1 instance that's handling the checking and processing.
However I only have it running once now and I'd like to have it continuously running, checking for new mails more or less every 10 minutes or so to handle them in a timely manner.
I understand I can take care of this with Timer/TimerTask, or even better, I found a resource here: http://www.ibm.com/developerworks/java/library/j-schedule/index.html that uses Scheduler/SchedulerTask. But what I am afraid of is that if I set it to run every 10 minutes and a previous session is still processing data, it will put the new task in a stack waiting to be executed once the previous one is done. So, for instance, if the first run takes 5 hours, then because it was busy all that time it will afterwards launch 5*6-1=29 runs immediately after each other, checking for mails and/or doing some processing without giving the server a break.
Does anyone know how I can solve this?
P.S. the way I have my application set up right now is I'm using a Java Servlet on my tomcat server that's launched upon server start where it creates a Singleton instance of my main program, then calls some method to do the fetching/processing. And what I want is to repeat that fetching/processing every "x" amount of time (10 minutes or so), making sure that really only 1 instance is doing this and that really after each run 10 minutes or so are given to rest.
Actually, Timer + TimerTask can deal with this pretty cleanly. If you schedule something with Timer.scheduleAtFixedRate(), you will notice that the docs say it will attempt to "make up" late events to maintain the long-term period of execution. However, this can be overcome by using TimerTask.scheduledExecutionTime(). The example in its documentation lets you figure out whether the task is too tardy to run, in which case you can just return instead of doing anything. This will, in effect, "clear the queue" of TimerTask executions.
Of note: a Timer uses a single thread to execute its tasks, so it won't run two copies of your task side by side.
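A minimal sketch of the tardiness check, following the pattern shown in the TimerTask.scheduledExecutionTime() documentation. `MailCheckTask`, `MAX_TARDINESS_MILLIS`, `tooLate` and `checkAndProcessMail` are hypothetical names, and the one-minute tolerance is an assumed value.

```java
import java.util.TimerTask;

class MailCheckTask extends TimerTask {
    // Assumed tolerance: a run that starts more than a minute late is skipped.
    static final long MAX_TARDINESS_MILLIS = 60_000;

    // True if the run is starting too long after its scheduled time.
    static boolean tooLate(long nowMillis, long scheduledMillis, long maxTardinessMillis) {
        return nowMillis - scheduledMillis >= maxTardinessMillis;
    }

    @Override
    public void run() {
        if (tooLate(System.currentTimeMillis(), scheduledExecutionTime(), MAX_TARDINESS_MILLIS)) {
            return; // a late "make-up" run: skip it instead of doing the work
        }
        checkAndProcessMail();
    }

    // Placeholder for the real fetching and processing.
    void checkAndProcessMail() {
    }
}
```

It would be scheduled with something like `new Timer().scheduleAtFixedRate(new MailCheckTask(), 0, 10 * 60 * 1000L)`; after a very long run, the backlog of late runs then returns immediately.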
As a side note, you don't have to process all 10k emails in the queue in a single run. I would suggest processing for a fixed amount of time, using TimerTask.scheduledExecutionTime() to figure out how long you have, and then returning. That keeps your process more limber, cleans up the stack between runs, and if you are doing aggregates, ensures that you don't have to rebuild too much data if, for example, the server is restarted in the middle of the task. But this recommendation is based on generalities, since I don't know what you're doing in the task :)
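Processing for a fixed amount of time could look like the sketch below. `TimeBoxedProcessor` and `drain` are hypothetical names, and the clock is passed in as a parameter so the loop can be exercised without waiting. Inside a TimerTask, the deadline could be computed as scheduledExecutionTime() plus whatever time budget you choose.

```java
import java.util.Queue;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

class TimeBoxedProcessor {
    // Handles queued items until the queue is empty or the clock reaches
    // the deadline; returns how many items were handled in this run.
    static <T> int drain(Queue<T> queue, Consumer<? super T> handler,
                         long deadlineMillis, LongSupplier clock) {
        int handled = 0;
        while (!queue.isEmpty() && clock.getAsLong() < deadlineMillis) {
            handler.accept(queue.poll());
            handled++;
        }
        return handled; // leftover items stay queued for the next run
    }
}
```

In production the clock argument would simply be `System::currentTimeMillis`. The deadline is only checked between items, so a single slow item can still overrun it.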