How to achieve a guaranteed sleep time on a thread - java

I have a requirement for a class method to be called every 50 milliseconds. I don't use Thread.sleep because it's very important that it happens as precisely as possible to the milli, whereas sleep only guarantees a minimum time. The basic set up is this:
public class ClassA{
public void setup(){
ScheduledExecutorService se = Executors.newScheduledThreadPool(20);
se.scheduleAtFixedRate(this::onCall, 2000, 50, TimeUnit.MILLISECONDS);
}
protected void onCall() { // no parameters, so that this::onCall matches Runnable
// do something
}
}
Now this by and large works fine. I have put System.out.println(System.nanoTime()) in onCall to check that it's being called as precisely as I hope. I have found that there is a drift of 1-5 milliseconds over the course of hundreds of calls, which corrects itself now and again.
A 5 ms drift unfortunately is pretty hefty for me. 1 milli drift is ok but at 5ms it messes up the calculation I'm doing in onCall because of states of other objects. It would be almost OK if I could get the scheduler to auto-correct such that if it's 5ms late on one call, the next one would happen in 45ms instead of 50.
My question is: Is there a more precise way to achieve this in Java? The only solution I can think of at the moment is to call a check method every 1ms and check the time to see if it's at the 50ms mark. But then I'd need to maintain some logic for the case where, on the off-chance, the precise 50ms mark is missed (49, 51).
Thanks

Can I achieve a guaranteed sleep time on a thread?
Sorry, but no.
There is no way to get reliable, precise delay timing in a Java SE JVM. You need to use a Real time Java implementation running on a real time operating system.
Here are a couple of reasons why Java SE on a normal OS cannot do this.
At certain points, the GC in a Java SE JVM needs to "stop the world". While this is happening, no user thread can run. If your timer goes off in a "stop the world" pause, it can't be scheduled until the pause is over.
Scheduling of threads in a JVM is actually done by the host operating system. If the system is busy, the host OS may decide not to schedule the JVM's threads when your application needs this to happen.
The java.util.Timer.scheduleAtFixedRate approach is probably as good as you will get on Java SE. It should address long-term drift, but you can't get rid of the "jitter". And that jitter could easily be hundreds of milliseconds ... or even seconds.
Spinlocks won't help if the system is busy and the OS is preempting or not scheduling your threads. (And spinlocking in user code is wasteful ...)

According to the comment, the primary goal is not to concurrently execute multiple tasks at this precise interval. Instead, the goal is to execute a single task at this interval as precisely as possible.
Unfortunately, neither the ScheduledExecutorService nor any manual constructs involving Thread#sleep or LockSupport#parkNanos are very precise in that sense. And as pointed out in the other answers: There may always be influencing factors that are beyond your control - namely, details of the JVM implementation, garbage collection, JIT runs etc.
Nevertheless, a comparatively simple approach to achieve a high precision here is busy waiting. (This was already mentioned in an answer that is now deleted). But of course, this has several caveats. Most importantly, it will burn processing resources of one CPU. (And on a single-CPU-system, this may be particularly bad).
But in order to show that it may be far more precise than other waiting approaches, here is a simple comparison of the ScheduledExecutorService approach and the busy waiting:
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
public class PreciseSchedulingTest
{
public static void main(String[] args)
{
long periodMs = 50;
PreciseSchedulingA a = new PreciseSchedulingA();
a.setup(periodMs);
PreciseSchedulingB b = new PreciseSchedulingB();
b.setup(periodMs);
}
}
class CallTracker implements Runnable
{
String name;
long expectedPeriodMs;
long baseTimeNs;
long callTimesNs[];
int numCalls;
int currentCall;
CallTracker(String name, long expectedPeriodMs)
{
this.name = name;
this.expectedPeriodMs = expectedPeriodMs;
this.baseTimeNs = System.nanoTime();
this.numCalls = 50;
this.callTimesNs = new long[numCalls];
}
@Override
public void run()
{
callTimesNs[currentCall] = System.nanoTime();
currentCall++;
if (currentCall == numCalls)
{
currentCall = 0;
double maxErrorMs = 0;
for (int i = 1; i < numCalls; i++)
{
long ns = callTimesNs[i] - callTimesNs[i - 1];
double ms = ns * 1e-6;
double errorMs = ms - expectedPeriodMs;
if (Math.abs(errorMs) > Math.abs(maxErrorMs))
{
maxErrorMs = errorMs;
}
//System.out.println(errorMs);
}
System.out.println(name + ", maxErrorMs : " + maxErrorMs);
}
}
}
class PreciseSchedulingA
{
public void setup(long periodMs)
{
CallTracker callTracker = new CallTracker("A", periodMs);
ScheduledExecutorService se = Executors.newScheduledThreadPool(20);
se.scheduleAtFixedRate(callTracker, periodMs,
periodMs, TimeUnit.MILLISECONDS);
}
}
class PreciseSchedulingB
{
public void setup(long periodMs)
{
CallTracker callTracker = new CallTracker("B", periodMs);
Thread thread = new Thread(new Runnable()
{
@Override
public void run()
{
while (true)
{
long periodNs = periodMs * 1000 * 1000;
long endNs = System.nanoTime() + periodNs;
while (System.nanoTime() < endNs)
{
// Busy waiting...
}
callTracker.run();
}
}
});
thread.setDaemon(true);
thread.start();
}
}
Again, this should be taken with a grain of salt, but the results on My Machine® are as follows:
A, maxErrorMs : 1.7585339999999974
B, maxErrorMs : 0.06753599999999693
A, maxErrorMs : 1.7669149999999973
B, maxErrorMs : 0.007193999999998368
A, maxErrorMs : 1.7775299999999987
B, maxErrorMs : 0.012780999999996823
showing that the error for the busy-waiting times is in the microsecond range (about 7 to 68 µs in these runs), compared with almost 2 ms for the ScheduledExecutorService.
In order to apply such an approach in practice, a more sophisticated infrastructure would be necessary. E.g. the bookkeeping that is necessary to compensate for waiting times that have been too high. (I think they can't be too low). Also, all this still does not guarantee a precisely timed execution. But it may be an option to consider, at least.
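The bookkeeping mentioned above could be sketched roughly as follows. This is only an illustration, not a tested production scheduler: the class name DriftFreeBusyLoop and its methods are hypothetical. The idea is to compute every deadline from a fixed base time (base + n * period) instead of "now + period", so a call that fires late does not push back all later calls.

```java
/** Hypothetical sketch: busy waiting against absolute deadlines so lateness does not accumulate. */
class DriftFreeBusyLoop {

    /** Deadline of the n-th call (n starts at 1), measured from a fixed base time. */
    static long deadlineNs(long baseNs, long periodNs, long n) {
        return baseNs + n * periodNs;
    }

    /** Runs the task the given number of times; returns how late each call started, in ns. */
    static long[] run(Runnable task, long periodNs, int calls) {
        long[] lateNs = new long[calls];
        long baseNs = System.nanoTime();
        for (int n = 1; n <= calls; n++) {
            long deadline = deadlineNs(baseNs, periodNs, n);
            // Busy wait; the subtraction form stays correct even if nanoTime wraps around.
            while (System.nanoTime() - deadline < 0) {
                // spinning burns one CPU core, as noted above
            }
            lateNs[n - 1] = System.nanoTime() - deadline;
            task.run();
        }
        return lateNs;
    }

    public static void main(String[] args) {
        long[] late = run(() -> { }, 50_000_000L, 20); // 20 calls, 50 ms apart
        for (long l : late) {
            System.out.println("late by " + (l / 1000) + " us");
        }
    }
}
```

If one call is 5 ms late, the next deadline is still a multiple of the period from the base time, so the following wait is shorter by the same amount. A GC pause or OS preemption can still delay any individual call.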

If you really have hard time constraints, you want to use a real-time operating system. General computing does not have hard time constraints; if your OS goes to virtual memory in one of your intervals, then you can miss your sleep interval. The real-time OS makes the tradeoff that you may get less done overall, but the work can be scheduled more predictably.
If you need to do this on a normal OS, you can spinlock instead of sleeping. This is really inefficient, but if you really have hard time constraints, it's the best way to approximate that.

That will be hard: think about GC. What I would do is read the time with System.nanoTime() on each call and use the actual elapsed time in the calculations, instead of assuming that exactly 50 ms have passed.
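A minimal sketch of that idea, assuming the calculation in onCall can be expressed in terms of elapsed time. The class ElapsedTimeIntegrator, the position field, and the rate parameter are hypothetical and stand in for whatever state the real onCall updates.

```java
/** Hypothetical sketch: scale each update by the measured elapsed time, not the nominal period. */
class ElapsedTimeIntegrator {
    private long lastNs;
    private double position;              // hypothetical state updated on every call
    private final double unitsPerSecond;  // hypothetical rate of change

    ElapsedTimeIntegrator(double unitsPerSecond, long startNs) {
        this.unitsPerSecond = unitsPerSecond;
        this.lastNs = startNs;
    }

    /** Advances the state by the real time that has passed since the previous call. */
    void onCall(long nowNs) {
        double dtSeconds = (nowNs - lastNs) * 1e-9;
        lastNs = nowNs;
        position += unitsPerSecond * dtSeconds;
    }

    double position() {
        return position;
    }

    public static void main(String[] args) {
        ElapsedTimeIntegrator integrator = new ElapsedTimeIntegrator(10.0, System.nanoTime());
        integrator.onCall(System.nanoTime()); // in real code this is called by the scheduler
        System.out.println(integrator.position());
    }
}
```

With this approach a call that arrives 5 ms late simply integrates over 55 ms, and the next one over roughly 45 ms, so the jitter does not accumulate in the result.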

Yes (assuming you only want to prevent long term drifts and don't worry about each delay individually). java.util.Timer.scheduleAtFixedRate:
...In fixed-rate execution, each execution is scheduled relative to the scheduled execution time of the initial execution. If an execution is delayed for any reason (such as garbage collection or other background activity), two or more executions will occur in rapid succession to "catch up." In the long run, the frequency of execution will be exactly the reciprocal of the specified period (assuming the system clock underlying Object.wait(long) is accurate). ...
Basically, do something like this:
new Timer().scheduleAtFixedRate(new TimerTask() {
@Override
public void run() {
// Calls the enclosing class's method; "this" here would refer to the TimerTask.
onCall();
}
}, 2000, 50);

Related

Thread.sleep() optimization for small sleep intervals

I am writing a library that involves a caller-defined temporal resolution. In the implementation, this value ends up being the interval that some background thread will sleep before doing some housekeeping and going back to sleep again. I am allowing this resolution to be as small as 1 millisecond, which translates to Thread.sleep(1). My hunch is that this may be more wasteful and less precise than busy-waiting for 1 ms. If that's the case:
Should I fall back to busy-waiting for small enough (how small) time intervals?
Does anyone know if the JVM is already doing this optimization anyway and I don't need to do anything at all?
That's easy to test:
public class Test {
static int i = 0;
static long[] measurements = new long[0x100];
static void report(long value) {
measurements[i++ & 0xff] = value;
if (i > 10_000) {
for (long m : measurements) {
System.out.println(m);
}
System.exit(0);
}
}
static void sleepyWait() throws Exception {
while (true) {
long before = System.nanoTime();
Thread.sleep(1);
long now = System.nanoTime();
report(now - before);
}
}
static void busyWait() {
while (true) {
long before = System.nanoTime();
long now;
do {
now = System.nanoTime();
} while (before + 1_000_000 >= now);
report(now - before);
}
}
public static void main(String[] args) throws Exception {
busyWait();
}
}
Run on my Windows system, this shows that busyWait has microsecond accuracy, but fully uses one CPU core.
In contrast, sleepyWait causes no measurable CPU load, but only achieves millisecond accuracy (often taking as much as 2 ms to fire, rather than the 1 ms requested).
At least on Windows, this is therefore a straightforward tradeoff between accuracy and CPU use.
It's also worth noting that there are often alternatives to running a CPU at full speed obsessively checking the time. In many cases, there is some other signal you could be waiting for, and offering an API that focuses on time-based resolution may steer the users of your API in a bad direction.
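As one possible reading of that last point, a housekeeping thread can block on a queue with a timeout, so it wakes up as soon as there is work and otherwise at most once per interval. The sketch below assumes the "signal" is an item arriving on a BlockingQueue; the class SignalWait and the method awaitWork are hypothetical names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: wait for a signal with a timeout instead of polling the clock. */
class SignalWait {

    /** Returns the next work item, or null if none arrived within the timeout. */
    static String awaitWork(BlockingQueue<String> queue, long timeoutMs) {
        try {
            return queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.offer("job-1");
        System.out.println(awaitWork(queue, 1)); // an item is available immediately
        System.out.println(awaitWork(queue, 1)); // times out and returns null
    }
}
```

The thread consumes no CPU while blocked, and the timeout only serves as an upper bound on how long it stays idle.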

Multithreaded vs Asynchronous programming in a single core

If in real time the CPU performs only one task at a time then how is multithreading different from asynchronous programming (in terms of efficiency) in a single processor system?
Let's say, for example, we have to count from 1 to IntegerMax. In the following program on my multicore machine, the two-thread run takes almost half the time of the single-thread run. What if we ran this on a single-core machine? And is there any way we could achieve the same result there?
class Demonstration {
public static void main( String args[] ) throws InterruptedException {
SumUpExample.runTest();
}
}
class SumUpExample {
long startRange;
long endRange;
long counter = 0;
static long MAX_NUM = Integer.MAX_VALUE;
public SumUpExample(long startRange, long endRange) {
this.startRange = startRange;
this.endRange = endRange;
}
public void add() {
for (long i = startRange; i <= endRange; i++) {
counter += i;
}
}
static public void twoThreads() throws InterruptedException {
long start = System.currentTimeMillis();
SumUpExample s1 = new SumUpExample(1, MAX_NUM / 2);
SumUpExample s2 = new SumUpExample(1 + (MAX_NUM / 2), MAX_NUM);
Thread t1 = new Thread(() -> {
s1.add();
});
Thread t2 = new Thread(() -> {
s2.add();
});
t1.start();
t2.start();
t1.join();
t2.join();
long finalCount = s1.counter + s2.counter;
long end = System.currentTimeMillis();
System.out.println("Two threads final count = " + finalCount + " took " + (end - start));
}
static public void oneThread() {
long start = System.currentTimeMillis();
SumUpExample s = new SumUpExample(1, MAX_NUM );
s.add();
long end = System.currentTimeMillis();
System.out.println("Single thread final count = " + s.counter + " took " + (end - start));
}
public static void runTest() throws InterruptedException {
oneThread();
twoThreads();
}
}
Output:
Single thread final count = 2305843008139952128 took 1003
Two threads final count = 2305843008139952128 took 540
For a purely CPU-bound operation you are correct. Most programs (99.9999%) need to do input, output, and invoke other services. Those are orders of magnitude slower than the CPU, so while waiting for the results of an external operation, the OS can schedule and run other (many other) processes in time slices.
Hardware multithreading benefits primarily when 2 conditions are met:
CPU-intensive operations;
That can be efficiently divided into independent subsets
Or you have lots of different tasks to run that can be efficiently divided among multiple hardware processors.
In the following program on my multicore machine, the two-thread run takes almost half the time of the single-thread run.
That is what I would expect from a valid benchmark when the application is using two cores.
However, looking at your code, I am somewhat surprised that you are getting those results ... so reliably.
Your benchmark doesn't take account of JVM warmup effects, particularly JIT compilation.
Your benchmark's add method could potentially be optimized by the JIT compiler to get rid of the loop entirely. (But at least the counts are "used" ... by printing them out.)
I guess you got lucky ... but I'm not convinced those results will be reproducible for all versions of Java, or if you tweaked the benchmark.
Please read this:
How do I write a correct micro-benchmark in Java?
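As a rough illustration of the warmup point only (a harness such as JMH is the more robust option), the sketch below runs the workload several times before any timed round and keeps every result "used". The class name WarmupBenchmark, the workload size, and the number of rounds are assumptions chosen for the example.

```java
/** Hypothetical sketch: warm up the JIT before timing, and keep the results observable. */
class WarmupBenchmark {

    /** The workload under test: sums all values in [from, to]. */
    static long sumRange(long from, long to) {
        long sum = 0;
        for (long i = from; i <= to; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        final long n = 200_000_000L; // assumed workload size
        long sink = 0;               // accumulates results so the work cannot be discarded

        // Warm-up rounds: give the JIT a chance to compile sumRange before timing starts.
        for (int i = 0; i < 5; i++) {
            sink += sumRange(1, n);
        }

        // Timed rounds: several measurements show the spread instead of a single number.
        for (int round = 1; round <= 5; round++) {
            long start = System.nanoTime();
            long result = sumRange(1, n);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            sink += result;
            System.out.println("round " + round + ": " + elapsedMs + " ms");
        }
        System.out.println("(ignore) sink = " + sink);
    }
}
```

Even with warmup, the JIT may still simplify such a simple loop, so the numbers should be read as indicative at best.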
What if we ran this in a single core machine?
Assuming the following:
You rewrote the benchmark to correct the flaws above.
You are running on a system where hardware hyper-threading [1] is disabled [2].
Then ... I would expect the two-thread version to take more than twice as long as the one-thread version.
Q: Why "more than"?
A: Because there is a significant overhead in starting a new thread. Depending on your hardware, OS and Java version, it could be more than a millisecond. Certainly, the time taken is significant if you repeatedly use and discard threads.
And is there any way we could achieve the same result there?
Not sure what you are asking here. But if you are asking how to simulate the behavior of one core on a multi-core machine, you would probably need to do this at the OS level. See https://superuser.com/questions/309617 for Windows and https://askubuntu.com/questions/483824 for Linux.
[1] Hyperthreading is a hardware optimization where a single core's processing hardware supports (typically) two hyper-threads. Each hyperthread has its own sets of registers, but it shares functional units such as the ALU with the other hyperthread. So the two hyperthreads behave like (typically) two cores, except that they may be slower, depending on the precise instruction mix. A typical OS will treat a hyperthread as if it is a regular core. Hyperthreading is typically enabled / disabled at boot time; e.g. via a BIOS setting.
[2] If hyperthreading is enabled, it is possible that two Java threads won't be twice as fast as one in a CPU-intensive computation like this ... due to possible slowdown caused by the "other" hyperthread on respective cores. Did someone mention that benchmarking is complicated?

Odd c++/java multi threading performance results compared to single thread

I have been struggling for two days to understand what is going on with C++ thread pool performance compared to a single thread. Then I decided to do the same in Java, and that is when I noticed that the behaviour is the same in C++ and Java. Basically, my code is simple and straightforward.
package com.examples.threading;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;
public class ThreadPool {
final static AtomicLong lookups = new AtomicLong(0);
final static AtomicLong totalTime = new AtomicLong(0);
public static class Task implements Runnable
{
int start = 0;
Task(int s) {
start = s;
}
@Override
public void run()
{
for (int j = start ; j < start + 3000; j++ ) {
long st = System.nanoTime();
boolean a = false;
long et = System.nanoTime();
totalTime.getAndAdd((et - st));
lookups.getAndAdd(1L);
}
}
}
public static void main(String[] args)
{
// change threads from 1 -> 100 then you will get different numbers
ExecutorService executor = Executors.newFixedThreadPool(1);
for (int i = 0; i <= 1000000; i++)
{
if (i % 3000 == 0) {
Task task = new Task(i);
executor.execute(task);
System.out.println("in time " + (totalTime.doubleValue()/lookups.doubleValue()) + " lookups: " + lookups.toString());
}
}
executor.shutdown();
while (!executor.isTerminated()) {
;
}
System.out.println("in time " + (totalTime.doubleValue()/lookups.doubleValue()) + " lookups: " + lookups.toString());
}
}
Now, when you run the same code with a different pool size, say 100 threads, the overall elapsed time will change.
one thread:
in time 36.91493612774451 lookups: 1002000
100 threads:
in time 141.47934530938124 lookups: 1002000
The question is: the code is the same, so why is the overall elapsed time different? What exactly is going on here?
You have a couple of obvious possibilities here.
One is that System.nanoTime may serialize internally, so even though each thread is making its call separately, it may internally execute those calls in sequence (and, for example, queue up calls as they come in). This is particularly likely when nanoTime directly accesses a hardware clock, such as on Windows (where it uses Windows' QueryPerformanceCounter).
Another point at which you get essentially sequential execution is your atomic variables. Even though you're using lock-free atomics, the basic fact is that each has to execute a read/modify/write as an atomic sequence. With locked variables, that's done by locking, then reading, modifying, writing, and unlocking. With lock-free, you eliminate some of the overhead in doing that, but you're still stuck with the fact that only one thread can successfully read, modify, and write a particular memory location at a given time.
In this case the only "work" each thread is doing is trivial, and the result is never used, so the optimizer can (and probably will) eliminate it entirely. So all you're really measuring is the time to read the clock and increment your variables.
To gain at least some of the speed back, you could (for one example) give each thread its own lookups and totalTime variable. Then when all the threads finish, you can add together the values for the individual threads to get an overall total for each.
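One way to sketch the per-thread totals is to let each task count into a local variable and return it, then sum the results once all tasks are done. The class PerTaskCounters and the method countWithLocalTotals are hypothetical names, and only the lookup count is shown; the same pattern would apply to the time total.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: each task keeps a local total; totals are combined at the end. */
class PerTaskCounters {

    static long countWithLocalTotals(int threads, int tasks, int iterationsPerTask) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < tasks; t++) {
                Callable<Long> task = () -> {
                    long local = 0; // not shared, so no contention inside the loop
                    for (int j = 0; j < iterationsPerTask; j++) {
                        local++;
                    }
                    return local;
                };
                results.add(executor.submit(task));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // blocks until that task has finished
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        // 334 tasks of 3000 iterations each, as in the question's loop.
        System.out.println(countWithLocalTotals(100, 334, 3000));
    }
}
```

The shared state is touched once per task instead of once per iteration, which removes most of the contention on the atomic variables.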
Preventing serialization of the timing is a little more difficult (to put it mildly). At least in the obvious design, each call to nanoTime directly accesses a hardware register, which (at least with most typical hardware) can only happen sequentially. It could be fixed at the hardware level (provide a high-frequency timer register that's directly readable per-core, guaranteed to be synced between cores). That's a somewhat non-trivial task, and (more importantly) most current hardware just doesn't include such a thing.
Other than that, do some meaningful work in each thread, so when you execute in multiple threads, you have something that can actually use the resources of your multiple CPUs/cores to run faster.

Wait for system time to continue application

I've written a class to continue a started Java application if the current second is a multiple of 5 (i.e. Calendar.SECOND % 5 == 0)
The class code is presented below, what I'm curious about is, am I doing this the right way? It doesn't seem like an elegant solution, blocking the execution like this and getting the instance over and over.
public class Synchronizer{
private static Calendar c;
public static void timeInSync(){
do{
c = Calendar.getInstance();
}
while(c.get(Calendar.SECOND) % 5 != 0);
}
}
Synchronizer.timeInSync() is called in another class's constructor and an instance of that class is created at the start of the main method. Then the application runs forever with a TimerTask that's called every 5 seconds.
Is there a cleaner solution for synchronizing the time?
Update:
I think I did not state this clearly, but what I'm looking for here is a way to synchronize with the system time without busy waiting.
So I need to be able to get
12:19:00
12:19:05
12:19:10
...
What you have now is called busy waiting (also sometimes referred to as polling), and yes, it's inefficient in terms of processor usage and also in terms of energy usage. Your code executes whenever the OS allows it, and in doing so it prevents the use of a CPU for other work, or when there is no other work it prevents the CPU from taking a nap, wasting energy (heating the CPU, draining the battery...).
What you should do is put your thread to sleep until the time where you want to do something arrives. This allows the CPU to perform other tasks or go to sleep.
There is a method on java.lang.Thread to do just that: Thread.sleep(long milliseconds) (it also has a cousin taking an additional nanos parameter, but the nanos may be ignored by the VM, and that kind of precision is rarely needed).
So first you determine when you need to do some work. Then you sleep until then. A naive implementation could look like that:
public static void waitUntil(long timestamp) {
long millis = timestamp - System.currentTimeMillis();
// return immediately if time is already in the past
if (millis <= 0)
return;
try {
Thread.sleep(millis);
} catch (InterruptedException e) {
throw new RuntimeException(e.getMessage(), e);
}
}
This works fine if your requirements on hitting the time precisely are not too strict. You can expect it to return reasonably close to the specified time (probably within a few tens of milliseconds) if the time isn't too far in the future (a few seconds at most). However, you have no guarantee: when the OS is really busy, it may occasionally return much later.
A slightly more accurate method is to determine the required sleep time, sleep for half the time, evaluate the required sleep again, sleep again for half the time, and so on until the required sleep time becomes very small, then busy wait the remaining few milliseconds.
However System.currentTimeMillis() does not guarantee the actual resolution of time; it may change once every millisecond, but it might as well only change every ten ms by 10 (this depends on the platform). Same goes for System.nanoTime().
Waiting for an exact point in time is not possible in high level programming languages in a multi-tasking environment (practically everywhere nowadays). If you have strict requirements, you need to turn to the operating system specifics to create an interrupt at the specified time and handle the event in the interrupt (that means assembler or at least C for the interrupt handler). You won't need that in most normal applications, a few ms +/- usually don't matter in a game/application.
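The halving strategy described above could look roughly like this. It is a sketch under assumptions: the class HybridWait, the 2 ms spin threshold, and the use of System.nanoTime() deadlines (rather than wall-clock timestamps) are choices made for the example.

```java
/** Hypothetical sketch: sleep for half the remaining time repeatedly, then spin at the end. */
class HybridWait {
    static final long SPIN_THRESHOLD_NS = 2_000_000L; // assumed: spin for the last 2 ms

    /** Returns how many milliseconds to sleep next, or 0 if the remainder should be spun. */
    static long nextSleepMillis(long remainingNs) {
        if (remainingNs <= SPIN_THRESHOLD_NS) {
            return 0;
        }
        return (remainingNs / 2) / 1_000_000L;
    }

    /** Blocks until System.nanoTime() reaches deadlineNs, or returns early if interrupted. */
    static void waitUntilNanos(long deadlineNs) {
        long remainingNs;
        while ((remainingNs = deadlineNs - System.nanoTime()) > 0) {
            long sleepMs = nextSleepMillis(remainingNs);
            if (sleepMs > 0) {
                try {
                    Thread.sleep(sleepMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
            // When sleepMs is 0 the loop just re-checks the clock, i.e. it busy waits.
        }
    }

    public static void main(String[] args) {
        long deadline = System.nanoTime() + 50_000_000L; // 50 ms from now
        waitUntilNanos(deadline);
        System.out.println("late by " + (System.nanoTime() - deadline) / 1000 + " us");
    }
}
```

The CPU is only burned during the last couple of milliseconds, but a sleep that overshoots or a busy system can still make the wait return late.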
As @ChrisK suggests, you could simplify by just making a direct call to System.currentTimeMillis().
For example:
long time = 0;
do
{
time = System.currentTimeMillis();
} while (time % 5000 != 0);
Note that you need to change the comparison value to 5000 because the representation of the time is in milliseconds.
Also, there are possible pitfalls to doing any comparison so directly like this, as the looping call depends on processor availability and whatnot, so there is a chance that an implementation such as this could make one call that returns:
`1411482384999`
And then the next call in the loop return
`1411482385001`
Meaning that your condition has been skipped by virtue of hardware availability.
If you want to use a built in scheduler, I suggest looking at the answer to a similar question here java: run a function after a specific number of seconds
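For the specific goal in the question (firing at :00, :05, :10, ...), a built-in scheduler can be aligned by computing the initial delay to the next 5-second boundary. This is a sketch, not the approach from the linked answer: the class AlignedScheduler and its methods are hypothetical, and it assumes that multiples of 5000 ms since the epoch coincide with the wall-clock seconds you care about.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: start a fixed-rate task on the next multiple of the period. */
class AlignedScheduler {

    /** Milliseconds from nowMs until the next multiple of periodMs (0 if already on one). */
    static long delayToNextBoundary(long nowMs, long periodMs) {
        long remainder = nowMs % periodMs;
        return remainder == 0 ? 0 : periodMs - remainder;
    }

    /** Schedules the task so that its first run lands on a period boundary. */
    static ScheduledFuture<?> scheduleAligned(ScheduledExecutorService scheduler,
                                              Runnable task, long periodMs) {
        long initialDelayMs = delayToNextBoundary(System.currentTimeMillis(), periodMs);
        return scheduler.scheduleAtFixedRate(task, initialDelayMs, periodMs,
                TimeUnit.MILLISECONDS);
    }
}
```

A caller would pass, for example, Executors.newSingleThreadScheduledExecutor(), a Runnable, and 5000. No thread spins while waiting, but over long runs the schedule may drift relative to the wall clock, so re-aligning occasionally may be needed.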
You should use
System.nanoTime()
instead of
System.currentTimeMillis()
because it returns the measured elapsed time instead of the system time, so nanoTime is not influenced by system time changes.
public class Synchronizer
{
public static void timeInSync()
{
long lastNanoTime = System.nanoTime();
long nowTime = System.nanoTime();
while(nowTime/1000000 - lastNanoTime /1000000 < 5000 )
{
nowTime = System.nanoTime();
}
}
}
The first main point is that you must never use busy-waiting. In Java you can avoid busy-waiting by using either Object.wait(timeout) or Thread.sleep(timeout). The latter is more suitable for your case, because your case doesn't require releasing a monitor lock.
Next, you can use two approaches to wait until your time condition is satisfied. You can either precalculate your whole wait time or wait for small time intervals in loop, checking the condition.
I will illustrate both approaches here:
private static long nextWakeTime(long time) {
if (time / 1000 % 5 == 0) { // current time is multiple of five seconds
return time;
}
return (time / 1000 / 5 + 1) * 5000;
}
private static void waitUsingCalculatedTime() {
long currentTime = System.currentTimeMillis();
long wakeTime = nextWakeTime(currentTime);
while (currentTime < wakeTime) {
try {
System.out.printf("Current time: %d%n", currentTime);
System.out.printf("Wake time: %d%n", wakeTime);
System.out.printf("Waiting: %d ms%n", wakeTime - currentTime);
Thread.sleep(wakeTime - currentTime);
} catch (InterruptedException e) {
// ignore
}
currentTime = System.currentTimeMillis();
}
}
private static void waitUsingSmallTime() {
while (System.currentTimeMillis() / 1000 % 5 != 0) {
try {
System.out.printf("Current time: %d%n", System.currentTimeMillis());
Thread.sleep(100);
} catch (InterruptedException e) {
// ignore
}
}
}
As you can see, waiting for the precalculated time is more complex, but it is more precise and more efficient (since in general case it will be done in single iteration). Waiting iteratively for small time interval is simpler, but less efficient and precise (precision is dependent on the selected size of the time interval).
Also please note how I calculate if the time condition is satisfied:
(time / 1000 % 5 == 0)
In the first step you need to calculate the seconds, and only then check whether they are a multiple of five. Checking by time % 5000 == 0, as suggested in another answer, is wrong, as it is true only for the first millisecond of each fifth second.

(Java) Ticker that adds to counter variable

I'm trying to get a timer to work in my current Java project that adds 1 to an integer variable every n microseconds (e.g. 500 for 1/2 a second), within an infinite loop, so that it is always running while the program runs.
Here's the code I have currently:
public class Ticker
{
public int time = 0;
long t0, t1;
public void tick(int[] args)
{
for (int i = 2; i < 1; i++)
{
t0 = System.currentTimeMillis();
do
{
t1 = System.currentTimeMillis();
}
while (t1 - t0 < 500);
time = time + 1;
}
}
}
Everyone was so helpful with my last question, hopefully this one is just as easy
Here is a comparable ScheduledExecutorService example which will update the time variable at a 500 millisecond interval:
ScheduledExecutorService exec = Executors.newScheduledThreadPool(1);
exec.scheduleAtFixedRate(new Runnable(){
private int time = 0;
@Override
public void run(){
time++;
System.out.println("Time: " + time);
}
}, 0, 500, TimeUnit.MILLISECONDS);
This approach is preferred over using Timer.
I think you want
Thread.sleep(500);
At the moment you're consuming CPU cycles waiting for 500ms (you mention microseconds but I believe you want milliseconds). The above puts your current thread to sleep for 500ms and your process won't consume any CPU (or minimal at least - garbage collection will still be running). If you watch the CPU when you run your version you should see the difference.
See here for more info.
If you need to do it in a different thread, take a look at Timer (javax.swing.Timer in this example):
int delay = 500; //milliseconds
ActionListener taskPerformer = new ActionListener() {
public void actionPerformed(ActionEvent evt) {
time++; // time must be a field here, not a local variable
}
};
new Timer(delay, taskPerformer).start();
Note that the code above cannot utilize a local variable (they must be declared as final to access them in an anonymous class). It can be a member however.
What you have is rather inefficient, since it wastes CPU cycles waiting for the next wakeup time. If I were you, I'd rewrite the function using Thread.sleep().
As to why your current code doesn't work, your for loop conditions are off, so the loop is never entered.
To have the timer code run concurrently with whatever other logic you have in your program, you'll need to look into threading.
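Putting these three points together, a sleeping ticker on its own thread might look like the sketch below. The class SleepingTicker and its methods are hypothetical, and an AtomicInteger is used as an assumption so that other threads can read the counter safely.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: a background thread that sleeps between ticks instead of spinning. */
class SleepingTicker {
    private final AtomicInteger time = new AtomicInteger();
    private final Thread thread;

    SleepingTicker(long intervalMs) {
        thread = new Thread(() -> {
            try {
                while (true) {                // runs until the thread is interrupted
                    Thread.sleep(intervalMs); // no CPU is used while sleeping
                    time.incrementAndGet();
                }
            } catch (InterruptedException e) {
                // stop() was called: leave the loop and let the thread end
            }
        });
        thread.setDaemon(true); // do not keep the JVM alive just for the ticker
    }

    void start() {
        thread.start();
    }

    void stop() {
        thread.interrupt();
    }

    int time() {
        return time.get();
    }
}
```

Typical use would be new SleepingTicker(500), then start(), with the rest of the program calling time() whenever it needs the current count. Because each sleep can run slightly long, the count drifts slowly behind real time.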
It sounds like you might want to look into multithreading. If you search SO for this, you will find several good question/answer threads. There are also tutorials elsewhere on the web...
Have a look at Timer or better ScheduledExecutorService. They enable you to execute some action periodically and handle the computations surrounding that.
