Scheduling periodic tasks and clock drift


I would like to schedule a periodic task which executes every X hours. I have a service which is written in Java and I was thinking of creating a long running background thread that runs forever as long as the service is up. How can I ensure that we are executing the task once every X hours? Is clock drift on my host an issue I should be worried about? I know that frequency of the clock ticks may change if the CPUs are working hard.
Edit: I was thinking of adding a bean to my spring configuration to spin up the thread which will periodically perform my task.

Java provides a java.util.Timer class that is designed to execute a task on a background thread. One of the modes of operation is "repeated execution at regular intervals". There are fixed-delay and fixed-rate execution methods that can be used, depending on your exact needs.
Java 5 added a java.util.concurrent.ScheduledThreadPoolExecutor class that is more flexible than Timer, but also offers fixed-delay and fixed-rate execution methods.
If you need such precise timing that these aren't suitable, I'm not sure that Java is an appropriate solution. You would be starting to enter the realm of a real-time system. At this point, you should likely be looking for other options.
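To make the fixed-rate mode concrete, here is a minimal sketch using ScheduledExecutorService. The class name PeriodicTaskDemo and the helper runAtFixedRate are made up for illustration, and the demo uses a 200 ms period so that it finishes quickly; a real service would pass something like (X, TimeUnit.HOURS) and keep the scheduler alive for its whole lifetime.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of any library.
final class PeriodicTaskDemo {

    /**
     * Runs a task at a fixed rate until it has executed `runs` times and
     * returns, for each execution, the milliseconds elapsed since scheduling.
     */
    static List<Long> runAtFixedRate(long period, TimeUnit unit, int runs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        List<Long> startedAtMillis = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(runs);
        long origin = System.nanoTime();
        try {
            // Fixed rate: run n is due at (initial delay + n * period),
            // regardless of how long earlier runs took.
            scheduler.scheduleAtFixedRate(() -> {
                if (done.getCount() > 0) {
                    startedAtMillis.add(TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - origin));
                    done.countDown();
                }
            }, 0, period, unit);
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return startedAtMillis;
    }

    public static void main(String[] args) {
        // Short period for the demo only; a service would use hours.
        System.out.println(runAtFixedRate(200, TimeUnit.MILLISECONDS, 3));
    }
}
```

If each run should instead start a fixed time after the previous run finished, scheduleWithFixedDelay takes the same arguments.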

If you are worried, write a test process and run it on the target platform. Using the feature you plan to use for the real process (like ScheduledExecutorService), schedule a task to log the host time every 24 hours. If the host doesn't use NTP to keep its clock synchronized, you could also make a call to a time-keeping web service and log that too. After a few days, you should have a good sense of whether you need a method to correct for drift.
My guess is that the built-in scheduler will be accurate to less than a second per day.
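A test process along those lines could look like the sketch below. DriftProbe and deviationMillis are hypothetical names. The probe logs the host's wall-clock time on every run, together with how far that time is from where a perfect 24-hour schedule would have put it. It only compares the scheduler against the host clock; checking the host clock itself would need an external time source, which is left out here.

```java
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical probe; the names are illustrative only.
final class DriftProbe {

    /**
     * Difference between the wall-clock time at which run number `runIndex`
     * actually started and the time a perfect schedule would have started it.
     * Positive means the run was late relative to the host clock.
     */
    static long deviationMillis(long firstRunWallMillis, long periodMillis,
                                long runIndex, long actualWallMillis) {
        return actualWallMillis - (firstRunWallMillis + runIndex * periodMillis);
    }

    public static void main(String[] args) {
        long periodMillis = TimeUnit.HOURS.toMillis(24);
        AtomicLong runIndex = new AtomicLong();
        AtomicLong firstRun = new AtomicLong();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Runs until the process is stopped; inspect the log after a few days.
        scheduler.scheduleAtFixedRate(() -> {
            long now = System.currentTimeMillis();
            long n = runIndex.getAndIncrement();
            if (n == 0) {
                firstRun.set(now);
            }
            System.out.println(Instant.ofEpochMilli(now) + " run=" + n + " deviation="
                    + deviationMillis(firstRun.get(), periodMillis, n, now) + "ms");
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
    }
}
```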

Is clock drift on my host an issue I should be worried about?
Yes, clock drift can be an issue when using ScheduledThreadPoolExecutor.
CronScheduler is specifically designed to be proof against clock drift.
Example usage:
Duration syncPeriod = Duration.ofMinutes(1);
CronScheduler cron = CronScheduler.create(syncPeriod);
// If you need just precisely "once every X hours", irrespective of the
// starting time
cron.scheduleAtFixedRate(0, X, TimeUnit.HOURS, runTimeMillis -> {
    // Do the task
});
// If you need "once every X hours" in terms of wall clock time,
// in some time zone:
ZoneId myTZ = ZoneId.systemDefault();
cron.scheduleAtRoundTimesInDay(Duration.ofHours(X), myTZ, runTimeMillis -> {
    // Do the task
});
See Javadocs for scheduleAtRoundTimesInDay.

Related

How to schedule a task for new data created?

In my system, a user can create a schedule with a time and conditions. Thirty minutes before the scheduled time, if the conditions are not satisfied, the system raises an alarm to notify users.
My system consists of Spring Boot applications and uses Spring scheduled tasks to trigger alarms. The problem is that when users create many schedules in the future, creating one scheduled task per schedule entry causes memory problems.
My current solution is a job that runs at a fixed time every day, scans all schedule data for the next 24 hours, and creates scheduled tasks to trigger their alarms. This reduces the number of scheduled tasks, but if a user creates a new schedule that falls within the next 24 hours after the scan has run, it will not trigger any alarm.
So what should I do?

Is there a reason that you are scheduling all of this in JVM memory? If the JVM crashes (or is simply rebooted), the timers would then be lost as if the user never scheduled any alarm. As you mentioned, creating a timer per request would likely not be a scalable solution.
Without knowing the specific details of your system, the most common approach would be to persist the data (e.g. in a DB or a flat file) each time a user requests to schedule an event. This way, in the event of a crash or reboot, you won't lose events. Similarly, this approach can scale to multiple servers if necessary. Then, at whatever granularity you support (e.g. minute, hour, day), a single monitor process or thread finds all of the events which have expired since it last ran. Finally, once this thread has identified events that need an "alarm," it controls sending these events for active processing. It can either handle each event itself or submit them to a work queue for parallel processing.
More specifically, if you have alarms which could go off at any minute, you should schedule a monitor thread to run every minute. This thread should find all the events which require an alarm and then actually send that alarm.
Remember that how often you should schedule your monitor thread is a function of the resolution you want for your alarms and your tolerance for late alarms. If late alarms are totally unacceptable, then your monitor must run at least as often as the finest granularity for scheduling an alarm event. This is, of course, assuming alarms are always scheduled in the future-- otherwise, you will probably want to double the frequency of your monitoring checks. To see why, consider the following example:
minute 0: Run monitor
minute 0: User schedules alarm for minute 0
minute 1: Run monitor
If we run the monitor once per minute but allow the user to schedule an alarm in the current minute, it's quite possible that we'll miss the event (as shown in the example above). I can go into this more deeply if necessary, but this is here mostly for completeness as I have no indications from your description that this will actually pose any problems.
Good luck.
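The monitor described above can be sketched roughly as follows. Everything here is an assumption made for illustration: AlarmMonitor, schedule and pollDue are invented names, and the ConcurrentHashMap merely stands in for the database or file where the events would really be persisted. Because pollDue selects everything due at or before the current time rather than one exact minute, an alarm that was registered just after a poll is still picked up by the next one.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical monitor; a real system would query its persistent store instead.
final class AlarmMonitor {
    // Stand-in for a database table: alarm id -> time at which the alarm is due.
    private final Map<String, Instant> pending = new ConcurrentHashMap<>();

    void schedule(String id, Instant dueAt) {
        pending.put(id, dueAt);
    }

    /** Removes and returns the ids of all alarms due at or before `now`. */
    List<String> pollDue(Instant now) {
        List<String> due = new ArrayList<>();
        for (Map.Entry<String, Instant> e : pending.entrySet()) {
            // remove(key, value) guards against an entry changed concurrently.
            if (!e.getValue().isAfter(now) && pending.remove(e.getKey(), e.getValue())) {
                due.add(e.getKey());
            }
        }
        return due;
    }

    public static void main(String[] args) {
        AlarmMonitor monitor = new AlarmMonitor();
        monitor.schedule("demo", Instant.now().plusSeconds(90));
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // One monitor thread, run once a minute, regardless of how many alarms exist.
        scheduler.scheduleWithFixedDelay(
                () -> monitor.pollDue(Instant.now()).forEach(id -> System.out.println("ALARM " + id)),
                0, 1, TimeUnit.MINUTES);
    }
}
```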

How can a task be scheduled in Java to occur at midnight, even if a leap second has just occurred?

First, the definition of "occur at midnight" is that when the task is run, new DateTime() or similar will show 00:00:00 or later for the time portion when converted to a human-readable format. The important point is that it must not show 23:59:59 of the previous day.
A common way to achieve this would be to calculate how many milliseconds are between now and the desired point in time, and then use a ScheduledExecutorService to execute the task at the correct time. However, when a leap second is inserted this will result in the task running a second early (or a few milliseconds early depending on how the leap second is 'smeared' and when you scheduled the task):
Runnable task = ...
long numberOfMillisUntilMidnight = ...
ScheduledExecutorService executor = ...
// task runs too early when leap seconds are inserted
executor.schedule(task, numberOfMillisUntilMidnight, TimeUnit.MILLISECONDS);
The reason is that executor.schedule() is based on System.nanoTime() which obviously ignores the leap seconds. I guess what I need is some scheduler based on "run at this time" rather than "run after this amount of time".
For those who are interested, the reason the task must run at midnight is that all events in my system must be categorized according to the day on which they occurred, and as far as possible this needs to be in sync with another system. Of course it would be better if the other system stamped each event with the day, but we are not there yet.

I guess what I need is some scheduler based on "run at this time" rather than "run after this amount of time"
That would be the all-singing, all-dancing solution. But:
First, definition of "occur at midnight" is that when task is run, new DateTime() or similar will show 00:00:00 or later for the time portion...Important point is that it must not show 23:59:59 of the previous day.
(my emphasis)
The simple way is to always add a second, or even two. So it'd be 00:00:01 (00:00:00 or later) in the common case, and 00:00:00 (not 23:59:59) in the leap second case.

Based on the resulting discussions it seems clear that, in general, it is unwise to rely on your scheduler to run a task at the "correct" time if "wall time" is important to you. This is also true when running daily tasks at the same "wall time" across daylight savings shifts, although unlike the leap second case, the daylight savings case is well supported by existing tools (by Quartz for example).
Instead I think the best approach for such "wall time sensitive" processes is that when the task is run, check the system clock at that point. If your schedule was inaccurate for whatever reason (leap seconds are not the only time your system clock is adjusted relative to the elapsed time measured by System.nanoTime()) and the time has not yet been reached, then do nothing and reschedule the task for the correct time. This approach would also work for schedules that respond to daylight savings changes but as mentioned above this is already supported by common tools.
This approach was inspired by the comment by Jonathon Reinhart above. Rescheduling rather than sleeping though seems better.
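A rough sketch of this check-and-reschedule idea follows, using only java.time and ScheduledExecutorService. WallClockTask, nextMidnight, remainingMillis and runAt are illustrative names, not part of any library. Each time the scheduler fires, the code reads the wall clock again and either runs the task or schedules another attempt for whatever time remains.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; names are illustrative only.
final class WallClockTask {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    /** Start of the next calendar day in the zone of `now`. */
    static Instant nextMidnight(ZonedDateTime now) {
        return now.toLocalDate().plusDays(1).atStartOfDay(now.getZone()).toInstant();
    }

    /** 0 once the target has been reached, otherwise the wait in ms (at least 1). */
    static long remainingMillis(Instant now, Instant target) {
        if (!now.isBefore(target)) {
            return 0;
        }
        return Math.max(1, Duration.between(now, target).toMillis());
    }

    /** Runs the task only when the wall clock says the target has been reached. */
    void runAt(Instant target, Runnable task) {
        long wait = remainingMillis(Instant.now(), target);
        if (wait == 0) {
            task.run();
        } else {
            // Woken too early (or not yet due): check the wall clock again later.
            scheduler.schedule(() -> runAt(target, task), wait, TimeUnit.MILLISECONDS);
        }
    }

    public static void main(String[] args) {
        WallClockTask t = new WallClockTask();
        Instant midnight = nextMidnight(ZonedDateTime.now(ZoneId.systemDefault()));
        t.runAt(midnight, () -> {
            System.out.println("Ran at " + ZonedDateTime.now());
            t.scheduler.shutdown();
        });
    }
}
```

If the executor fires a second early because of a leap second, the recheck sees that midnight has not arrived and simply schedules one more short wait.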

Assuming that your concrete ScheduledExecutorService implementation relies on System.nanoTime() (as you said), and that the initial delay passed to schedule(...) must count the elapsed milliseconds until the next midnight including a possible leap second, I suggest using a leap-second-aware solution.
An example using my library Time4J shows how to calculate the delay parameter:
Moment now = SystemClock.currentMoment(); // should not be called during a leap second
Moment nextMidnight =
    now.with(
        PlainTime.COMPONENT.setToNext(PlainTime.midnightAtStartOfDay()).in(
            Timezone.ofSystem().with(
                GapResolver.NEXT_VALID_TIME.and(OverlapResolver.EARLIER_OFFSET)
            )
        )
    );
long delayInMilliseconds = SI.NANOSECONDS.between(now, nextMidnight) / 1_000_000;
This code will also choose the earliest valid local time after midnight in case of daylight-saving-change (standard-Java would rather push the time forward by the size of the gap possibly resulting in a local time later than first valid time). For most zones, this is only relevant if choosing an arbitrary start time after midnight (dependent on business requirements).
In any case, you should also take care to connect your systems to the same NTP clock. Either you rely on the OS NTP configuration, or you can use a Java-based clock connecting to NTP (Time4J offers a solution here, too).
If your chosen clock makes arbitrary jumps (e.g. if someone has manually adjusted it, or in case of a bad NTP configuration), then rescheduling the task after having checked the local wall time again is probably safer. However, I still think that calculating the delay parameter with the Time4J code above is a good idea, because the chance of hitting midnight on the first attempt is higher than if you just run the task and recheck the local time.
You could also combine both approaches: exact calculation of delay AND check/reschedule.

What should Timer.scheduleAtFixedRate do if the clock changes?

We want to run a task every 1000 seconds (say).
So we have
timer.scheduleAtFixedRate(task, delay, interval);
Mostly, this works fine. However, this is an embedded system and the user can change the real time clock. If they set it to a time in the past after we set up the timer, it seems the timer doesn't execute until the original real-time date/time. So if they set it back 3 days, the timer doesn't execute for 3 days :(
Is this permissible behaviour, or a defect in the Java library? The Oracle javadocs don't seem to mention anything about the dependency or not on the underlying value of the system clock.
If it's permissible, how do we spot this clock change and reschedule our timers?

Looking at the source of Timer for Java 1.7, it appears that it uses System.currentTimeMillis() to determine the next execution of a task.
However, looking at the source of ScheduledThreadPoolExecutor, it uses System.nanoTime().
This means you won't see that behaviour if you use a ScheduledThreadPoolExecutor in place of a Timer. To create one, use, for instance, Executors.newScheduledThreadPool().
The reason you won't see this behaviour is what the documentation for System.nanoTime() says:
This method can only be used to measure elapsed time and is not related to any other notion of system or wall-clock time. The value returned represents nanoseconds since some fixed but arbitrary origin time [emphasis mine].
As to whether this is a bug in Timer, maybe...
Note that unlike a ScheduledExecutorService, a Timer supports absolute time, and maybe this explains its use of System.currentTimeMillis(); also, Timer has been there since Java 1.3 while System.nanoTime() only appears in 1.5.
But a consequence of using System.currentTimeMillis() is that Timer is sensitive to the system date/time... And that is not documented in the javadoc.
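On the question of how to spot such a clock change: one possible approach, sketched below, is to compare how much System.currentTimeMillis() and System.nanoTime() advanced over the same interval. ClockJumpDetector, jumpMillis and check are hypothetical names, and the one-second polling interval and threshold are arbitrary choices. The polling itself runs on a ScheduledExecutorService, so it keeps running even when the wall clock is set back.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical detector; names and thresholds are illustrative only.
final class ClockJumpDetector {
    private long lastWallMillis = System.currentTimeMillis();
    private long lastMonoNanos = System.nanoTime();

    /** Wall-clock progress minus monotonic progress over the same interval, in ms. */
    static long jumpMillis(long wallElapsedMillis, long monoElapsedNanos) {
        return wallElapsedMillis - TimeUnit.NANOSECONDS.toMillis(monoElapsedNanos);
    }

    /** Returns the wall-clock jump since the previous call (negative = set back). */
    synchronized long check() {
        long wall = System.currentTimeMillis();
        long mono = System.nanoTime();
        long jump = jumpMillis(wall - lastWallMillis, mono - lastMonoNanos);
        lastWallMillis = wall;
        lastMonoNanos = mono;
        return jump;
    }

    public static void main(String[] args) {
        ClockJumpDetector detector = new ClockJumpDetector();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            long jump = detector.check();
            if (Math.abs(jump) > 1000) {
                // This is where Timer tasks could be cancelled and rescheduled.
                System.out.println("System clock changed by about " + jump + " ms");
            }
        }, 1, 1, TimeUnit.SECONDS);
    }
}
```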

It is reported here http://bugs.sun.com/view_bug.do?bug_id=4290274
Similarly, when the system clock is set to a later time, the task may be run multiple times without any delay to "catch up" the missed executions. Exactly this happens when the computer is set to standby/hibernate and the application is resumed (this is how I found out).
This behavior can also be seen in a Java debugger by suspending the timer thread and resuming it.

Java bug in sleep() when changing OS time : any workaround?

The bug that annoys me is the same as the one in this ticket. Basically, if you change the OS clock to a date in the past, all the threads that were sleeping at the time of the change won't wake up.
The application I am developing is meant to run 24/7, and we would like to be able to change the OS date without stopping it (for example, to switch from summer time to winter time). At the moment, when we change the date to the past, some parts of the application just freeze. I observed this on multiple machines, on Windows XP and Linux 2.6.37, and with a recent JVM (1.6.0.22).
I tried many Java sleeping primitives, but they all have the same behavior:
Thread.sleep(long)
Thread.sleep(long, int)
Object.wait(long)
Object.wait(long, int)
Thread.join(long)
Thread.join(long, int)
LockSupport.parkNanos(long)
java.util.Timer
javax.swing.Timer
Now I am out of ideas for working around this problem. I think there is nothing I can do to prevent the sleeping threads from freezing. But I would like, at least, to warn the user when a dangerous system clock change is detected.
I came up with a monitoring thread that detects such changes:
Thread t = new Thread(new Runnable() {
    @Override
    public void run() {
        long ms1 = System.currentTimeMillis();
        long ms2;
        while (true) {
            ms2 = ms1;
            ms1 = System.currentTimeMillis();
            // The wall clock went backwards since the previous reading.
            if (ms1 < ms2) {
                warnUserOfPotentialFreeze();
            }
            Thread.yield();
        }
    }
});
t.setName("clock monitor");
t.setPriority(Thread.MIN_PRIORITY);
t.setDaemon(true);
t.start();
The problem is that this makes the application grow from 2% CPU usage to 15% when idle.
Do you have an idea to work around the original problem, or can you think of another way to monitor the appearance of thread freeze ?
Edit
Ingo suggested not touching the system clock. I agree that it's generally not needed. The problem is that we don't control what our clients do with their computers (we plan to sell hundreds of copies).
Worse: one of our machines exhibits this problem without any manual intervention. I guess the OS (Windows XP) regularly synchronizes its clock to the RTC clock, and this makes the OS clock go back in time naturally.
Epilogue
I found out that some statements in my question were wrong. There are actually two separate causes involved in my initial problem. Now I can say two things for sure:
On my machine only (Arch Linux with kernel 2.6.37 and a 64-bit OpenJDK 1.6.0_22), Thread.sleep, Object.wait, Thread.join and LockSupport.parkNanos have the same problem: they wake up only when the system clock reaches the "target" time of awakening. However, a simple sleep in my shell does not exhibit the problem.
On all the machines I tested (including mine), java.util.Timer and javax.swing.Timer have the same problem (they are blocked until the "target" time is reached).
So what I've done is replace all of Java's Timers with a simpler implementation. This solves the problem for all the machines but mine (I just hope my machine is the exception rather than the rule).

According to the bug ticket, your threads aren't frozen, they will resume once the clock catches up to where it was before it was modified (so if they moved it back an hour, your threads will resume in 1 hour).
Of course, that is still not very useful. The root cause seems to be that Thread.sleep() resolves to a system call that puts the thread to sleep until some specific timestamp in the future, rather than for a specified duration. To work around it you would need to implement your own version of Thread.sleep() that uses System.nanoTime() instead of System.currentTimeMillis() or any other time-dependent API. How to do that without using the built-in Thread.sleep() I can't say, however.
Edit:
Or, what if you create some external app in another language (like C or whatever else you prefer) that does nothing but wait for a specified duration and then exit. Then instead of calling Thread.sleep() in Java, you can spawn a new instance of this external process, and then call waitFor() on it. This will "sleep" the Java thread for all practical purposes, and so long as your external app is able to sleep for the correct duration, it will resume at the correct time without getting frozen and without thrashing the CPU.
Seems like a long way to go to fix the issue, but it's the only feasible workaround that I can think of. Also, given that spawning an external process is a relatively expensive operation, it probably works best if you are sleeping for a relatively long time (like several hundred ms or more). For shorter durations it might just continue thrashing the CPU.
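As a rough illustration of the external-process idea, the sketch below uses the standard Unix sleep utility as the helper program instead of a custom C application. That substitution is an assumption, and so is sleep being on the PATH; on Windows a different helper would be needed. ExternalSleep, command and sleepSeconds are made-up names.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical wrapper around an external helper process.
final class ExternalSleep {

    /** Command line of the helper; assumes the Unix `sleep` utility is on the PATH. */
    static List<String> command(long seconds) {
        return List.of("sleep", Long.toString(seconds));
    }

    /** Blocks until the helper exits; returns true if it exited with status 0. */
    static boolean sleepSeconds(long seconds) {
        Process helper;
        try {
            helper = new ProcessBuilder(command(seconds)).start();
        } catch (IOException e) {
            return false; // the helper could not be started
        }
        try {
            // waitFor() without a timeout blocks until the child process exits.
            return helper.waitFor() == 0;
        } catch (InterruptedException e) {
            helper.destroy();
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        boolean ok = sleepSeconds(2);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        System.out.println("helper ok=" + ok + ", elapsed=" + elapsedMillis + "ms");
    }
}
```

Whether the helper itself is immune to clock changes depends on the platform, so this would need to be verified on the affected machines.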

As others have said, you definitely shouldn't have to change the system clock. The timestamp (milliseconds since the epoch) is consistent across all computers in the world, but the local time depends on your location, whether daylight saving time is observed, and so on. Therefore, the problem is with the OS locale and time/date settings.
(Still, I agree that if the system clock does change, the JVM should detect this and update or awaken sleeping threads to combat the problem.)

Please test the latest JRE, 1.7.0_60. It resolves the described problem caused by a system time shift to the past, at least for systems with a glibc version released since 2009.
The related bug http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6900441 has been fixed, and therefore all functions you mentioned (Thread.sleep, Object.wait, Thread.join, LockSupport.parkNanos, java.util.Timer and javax.swing.Timer) should work as expected.
I have tested it with Linux kernel 3.7.10.

@OP: You have implemented something that looks like "busy waiting", and that will always consume lots of resources.
I agree with the others, I don't see why you need to change the system clock when you go from summer to winter time.

You don't change OS time for DST adjustments! It's nothing more than a Time Zone change. System clock should always be GMT. And the wall clock time that you display to the user is derived from that with the proper time zone offset.

ScheduledThreadPoolExecutor executing at the wrong time because of CPU time discrepancy

I'm scheduling a task using a ScheduledThreadPoolExecutor object. I use the following method:
public ScheduledFuture<?> schedule(Runnable command, long delay,TimeUnit unit)
and set the delay to 30 seconds (delay = 30,000 and unit = TimeUnit.MILLISECONDS). Sometimes my task runs immediately and other times it takes 70 seconds.
I believe the ScheduledThreadPoolExecutor uses CPU-specific clocks. When I run tests comparing System.currentTimeMillis() and System.nanoTime() [which is CPU-specific], I see the following:
schedule: 1272637682651ms, 7858346157228410ns
execute: 1272637682667ms, 7858386270968425ns
The difference is 16ms by System.currentTimeMillis() but 40113740015ns (about 40,113ms) by System.nanoTime(),
so it looks like there is a discrepancy of about 40 seconds between the two CPU clocks.
How do I resolve this issue in Java code? Unfortunately this is a client's machine and I can't modify their system.

Yes, you're right that ScheduledThreadPoolExecutor uses System.nanoTime(). And you're also right that System.nanoTime() is dependent on the particular system instance. If your process happens to migrate between schedule and execute, then you're out of luck. (I wouldn't think that migrating between CPUs on a multi-CPU system would matter, but maybe it does? Certainly it would matter if you're running in a VM and the VM migrated between hosts).
I think the only real solution in this case is to use something other than ScheduledThreadPoolExecutor... It's not as simple as just changing ScheduledThreadPoolExecutor.now() either. AbstractQueuedSynchronizer$ConditionObject.awaitNanos() uses System.nanoTime() too.
One of my projects uses Quartz for job scheduling and I've never seen the problem you describe with that library. I don't know the implementation details (maybe it just uses System.nanoTime() too, but maybe not?).
