My current setup:
public void launchBenchmark() throws Exception {
Options opt = new OptionsBuilder()
.include(this.getClass().getName())
.mode(Mode.Throughput) //Calculate number of operations in a time unit.
.mode(Mode.AverageTime) //Calculate an average running time per operation
.timeUnit(TimeUnit.MILLISECONDS)
.warmupIterations(1)
.measurementIterations(30)
.threads(Runtime.getRuntime().availableProcessors())
.forks(1)
.shouldFailOnError(true)
.shouldDoGC(true)
.build();
new Runner(opt).run();
}
How can I know or control (if possible) the number of operations performed per benchmark?
And is it important to set warmUp time and measurementIteration time?
Thank you.
You cannot control the number of operations per iteration. The whole point of JMH is to correctly measure that number.
You can configure the warmup using the annotation:
@Warmup(iterations = 10, time = 500, timeUnit = MILLISECONDS)
And the measurement by:
@Measurement(iterations = 200, time = 200, timeUnit = MILLISECONDS)
Just set the appropriate values for your use case
Related
In JMH(Java Microbenchmark Harness), we can use
@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 10)
@Measurement(iterations = 10)
to evaluate the average time of an execution after JVM warms up.
Also we can use
@BenchmarkMode(Mode.SingleShotTime)
@Measurement(iterations = 1)
to estimate the cold start time of an execution. But this only executes the benchmark once, which may introduce bias. So is there any method to evaluate the average time of the cold start in JMH?
According to Aleksey Shipilev himself (though from 2014):
Single-shot benchmarks were originally destined to run a single
measurement iteration over multiple forks -- the scenarios to estimate
"cold" performance. But for many cases, you might want more measurement
iterations there especially if you are running only a single fork,
because more samples would be generated.
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class AverageSingleShot {
public static void main(String[] args) throws Exception {
Options opt = new OptionsBuilder()
.include(AverageSingleShot.class.getSimpleName())
.build();
new Runner(opt).run();
}
@Fork(100)
@Benchmark
@BenchmarkMode(Mode.SingleShotTime)
public int test() {
return ThreadLocalRandom.current().nextInt() + ThreadLocalRandom.current().nextInt();
}
}
Besides the fact that this will tell you the average (see that 100):
Benchmark Mode Cnt Score Error Units
AverageSingleShot.test ss 100 41173.540 ± 2871.546 ns/op
you will also get Percentiles and a Histogram.
I have a generator which generates events for Flink CEP; the code is given below. Basically, I am using Thread.sleep(), and I have read somewhere that Java can't sleep for less than 1 millisecond even if we use System.nanoTime(). The code for the generator is:
public class RR_interval_Gen extends RichParallelSourceFunction<RRIntervalStreamEvent> {
Integer InputRate ; // events/second
Integer Sleeptime ;
Integer NumberOfEvents;
public RR_interval_Gen(Integer inputRate, Integer numberOfEvents ) {
this.InputRate = inputRate;
Sleeptime = 1000 / InputRate;
NumberOfEvents = numberOfEvents;
}
@Override
public void run(SourceContext<RRIntervalStreamEvent> sourceContext) throws Exception {
long currentTime;
Random random = new Random();
int RRInterval;
int Sensor_id;
for(int i = 1 ; i <= NumberOfEvents ; i++) {
Sensor_id = 2;
currentTime = System.currentTimeMillis();
// int randomNum = rand.nextInt((max - min) + 1) + min;
RRInterval = 10 + random.nextInt((20-10)+ 1);
RRIntervalStreamEvent stream = new RRIntervalStreamEvent(Sensor_id,currentTime,RRInterval);
synchronized (sourceContext.getCheckpointLock())
{
sourceContext.collect(stream);
}
Thread.sleep(Sleeptime);
}
}
@Override
public void cancel() {
}
}
I will specify my requirement here in simple words.
I want generator class to generate events, let's say an ECG stream at 1200 Hz. This generator will accept parameters like input rate and total time for which we have to generate the stream.
So far so good. The issue is that I need to send more than 1000 events/second. How can I do this with a generator function that produces values from U[10,20]?
Also, please let me know if the calculation below, taken from the code above, is the wrong way to generate x events/second.
Sleeptime = 1000 / InputRate;
Thanks in advance
The minimum sleep time is about 10 ms on Windows systems and about 1 millisecond on Linux and macOS, as the following quote explains:
The granularity of sleep is generally bound by the thread scheduler's
interrupt period. In Linux, this interrupt period is generally 1ms in
recent kernels. In Windows, the scheduler's interrupt period is
normally around 10 or 15 milliseconds
Through my research, I learned that using nanosecond-precision sleep in Java will not help, as the issue is at the OS level. If you want to send data at an arrival rate above 1000 events/second in a controlled way, it can be done using a real-time operating system (RTOS), as these can sleep for less than a millisecond. I have come up with another way of doing it, but in this solution the interarrival times will not be evenly spaced.
Let's say you want arrival rate of 3000 events/ second, then you can create a for loop which iterates 3 times to send data in each iteration and then sleep for 1ms. So for the 3 tuples, the interarrival time will be close to one another, but the issue will be solved. This may be a stupid solution but it works.
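The batching idea can be sketched as follows. This is only a minimal sketch under assumptions: the class and method names (BatchedRateEmitter, eventsDue, emit) are hypothetical and not part of Flink. Instead of a fixed batch size, it computes how many events are due from the elapsed System.nanoTime(), so an overslept tick is compensated by a larger batch on the next one. It also avoids the integer division in Sleeptime = 1000 / InputRate, which yields 0 for any rate above 1000.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.IntConsumer;

class BatchedRateEmitter {

    /** Number of events that should have been emitted after elapsedNanos at the given rate. */
    static long eventsDue(long elapsedNanos, int eventsPerSecond) {
        // Fine for runs of hours or days; the product overflows only for extremely long runs.
        return elapsedNanos * eventsPerSecond / 1_000_000_000L;
    }

    /** Emits totalEvents at roughly eventsPerSecond, waking up about once per millisecond. */
    static void emit(int eventsPerSecond, int totalEvents, IntConsumer sink) {
        long start = System.nanoTime();
        int sent = 0;
        while (sent < totalEvents) {
            long due = Math.min(totalEvents, eventsDue(System.nanoTime() - start, eventsPerSecond));
            while (sent < due) {
                sink.accept(sent++); // events within one batch arrive back to back
            }
            if (sent < totalEvents) {
                try {
                    Thread.sleep(1); // may sleep longer; the next batch catches up
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] count = {0};
        long t0 = System.nanoTime();
        emit(3000, 6000, i -> count[0]++);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - t0);
        System.out.println(count[0] + " events in " + elapsedMs + " ms");
    }
}
```

In a Flink source, the sink callback would be the place to build the event and call sourceContext.collect(...) under the checkpoint lock, as in the question's code.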
Please let me know if there is some better solution to this.
I have a requirement for a class method to be called every 50 milliseconds. I don't use Thread.sleep because it's very important that it happens as precisely as possible to the milli, whereas sleep only guarantees a minimum time. The basic set up is this:
public class ClassA{
public void setup(){
ScheduledExecutorService se = Executors.newScheduledThreadPool(20);
se.scheduleAtFixedRate(this::onCall, 2000, 50, TimeUnit.MILLISECONDS);
}
protected void onCall(Event event) {
// do something
}
}
Now this by and large works fine. I have put System.out.println(System.nanoTime()) in onCall to check that it's being called as precisely as I hope it is. I have found that there is a drift of 1-5 milliseconds over the course of hundreds of calls, which corrects itself now and again.
A 5 ms drift unfortunately is pretty hefty for me. 1 milli drift is ok but at 5ms it messes up the calculation I'm doing in onCall because of states of other objects. It would be almost OK if I could get the scheduler to auto-correct such that if it's 5ms late on one call, the next one would happen in 45ms instead of 50.
My question is: Is there a more precise way to achieve this in Java? The only solution I can think of at the moment is to call a check method every 1ms and check the time to see if its at the 50ms mark. But then I'd need to maintain some logic if, on the off-chance, the precise 50ms interval is missed (49,51).
Thanks
Can I achieve a guaranteed sleep time on a thread?
Sorry, but No.
There is no way to get reliable, precise delay timing in a Java SE JVM. You need to use a Real time Java implementation running on a real time operating system.
Here are a couple of reasons why Java SE on a normal OS cannot do this.
At certain points, the GC in a Java SE JVM needs to "stop the world". While this is happening, no user thread can run. If your timer goes off in a "stop the world" pause, it can't be scheduled until the pause is over.
Scheduling of threads in a JVM is actually done by the host operating system. If the system is busy, the host OS may decide not to schedule the JVM's threads when your application needs this to happen.
The java.util.Timer.scheduleAtFixedRate approach is probably as good as you will get on Java SE. It should address long-term drift, but you can't get rid of the "jitter". And that jitter could easily be hundreds of milliseconds ... or even seconds.
Spinlocks won't help if the system is busy and the OS is preempting or not scheduling your threads. (And spinlocking in user code is wasteful ...)
According to the comment, the primary goal is not to concurrently execute multiple tasks at this precise interval. Instead, the goal is to execute a single task at this interval as precisely as possible.
Unfortunately, neither the ScheduledExecutorService nor any manual constructs involving Thread#sleep or LockSupport#parkNanos are very precise in that sense. And as pointed out in the other answers: There may always be influencing factors that are beyond your control - namely, details of the JVM implementation, garbage collection, JIT runs etc.
Nevertheless, a comparatively simple approach to achieve a high precision here is busy waiting. (This was already mentioned in an answer that is now deleted). But of course, this has several caveats. Most importantly, it will burn processing resources of one CPU. (And on a single-CPU-system, this may be particularly bad).
But in order to show that it may be far more precise than other waiting approaches, here is a simple comparison of the ScheduledExecutorService approach and the busy waiting:
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
public class PreciseSchedulingTest
{
public static void main(String[] args)
{
long periodMs = 50;
PreciseSchedulingA a = new PreciseSchedulingA();
a.setup(periodMs);
PreciseSchedulingB b = new PreciseSchedulingB();
b.setup(periodMs);
}
}
class CallTracker implements Runnable
{
String name;
long expectedPeriodMs;
long baseTimeNs;
long callTimesNs[];
int numCalls;
int currentCall;
CallTracker(String name, long expectedPeriodMs)
{
this.name = name;
this.expectedPeriodMs = expectedPeriodMs;
this.baseTimeNs = System.nanoTime();
this.numCalls = 50;
this.callTimesNs = new long[numCalls];
}
@Override
public void run()
{
callTimesNs[currentCall] = System.nanoTime();
currentCall++;
if (currentCall == numCalls)
{
currentCall = 0;
double maxErrorMs = 0;
for (int i = 1; i < numCalls; i++)
{
long ns = callTimesNs[i] - callTimesNs[i - 1];
double ms = ns * 1e-6;
double errorMs = ms - expectedPeriodMs;
if (Math.abs(errorMs) > Math.abs(maxErrorMs))
{
maxErrorMs = errorMs;
}
//System.out.println(errorMs);
}
System.out.println(name + ", maxErrorMs : " + maxErrorMs);
}
}
}
class PreciseSchedulingA
{
public void setup(long periodMs)
{
CallTracker callTracker = new CallTracker("A", periodMs);
ScheduledExecutorService se = Executors.newScheduledThreadPool(20);
se.scheduleAtFixedRate(callTracker, periodMs,
periodMs, TimeUnit.MILLISECONDS);
}
}
class PreciseSchedulingB
{
public void setup(long periodMs)
{
CallTracker callTracker = new CallTracker("B", periodMs);
Thread thread = new Thread(new Runnable()
{
@Override
public void run()
{
while (true)
{
long periodNs = periodMs * 1000 * 1000;
long endNs = System.nanoTime() + periodNs;
while (System.nanoTime() < endNs)
{
// Busy waiting...
}
callTracker.run();
}
}
});
thread.setDaemon(true);
thread.start();
}
}
Again, this should be taken with a grain of salt, but the results on My Machine® are as follows:
A, maxErrorMs : 1.7585339999999974
B, maxErrorMs : 0.06753599999999693
A, maxErrorMs : 1.7669149999999973
B, maxErrorMs : 0.007193999999998368
A, maxErrorMs : 1.7775299999999987
B, maxErrorMs : 0.012780999999996823
showing that the error for busy waiting (B) is in the range of a few tens of microseconds, compared with almost 2 ms for the ScheduledExecutorService (A).
In order to apply such an approach in practice, a more sophisticated infrastructure would be necessary. E.g. the bookkeeping that is necessary to compensate for waiting times that have been too high. (I think they can't be too low). Also, all this still does not guarantee a precisely timed execution. But it may be an option to consider, at least.
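The bookkeeping mentioned above could look roughly like this. It is a sketch under assumptions, not a guaranteed-precise scheduler: the names (DeadlineLoop, deadlineNanos, waitUntil, runPeriodically) and the 2 ms spin window are hypothetical choices. Each deadline is derived from the start time rather than from the previous call, so a late tick shortens the next wait instead of shifting every tick after it. Most of the wait is spent in LockSupport.parkNanos, and only the last stretch is busy-waited.

```java
import java.util.concurrent.locks.LockSupport;

class DeadlineLoop {

    /** Absolute deadline of the given tick, derived from the start time so errors do not accumulate. */
    static long deadlineNanos(long startNanos, long periodNanos, long tick) {
        return startNanos + tick * periodNanos;
    }

    /** Parks for most of the wait, then busy-waits for the last spinNanos. */
    static void waitUntil(long deadlineNanos, long spinNanos) {
        long remaining;
        while ((remaining = deadlineNanos - System.nanoTime()) > 0) {
            if (remaining > spinNanos) {
                // May return early or late; the loop re-checks the clock either way.
                LockSupport.parkNanos(remaining - spinNanos);
            }
            // Otherwise spin: fall through and read the clock again.
        }
    }

    /** Runs the task once per period for the given number of ticks. */
    static void runPeriodically(Runnable task, long periodNanos, int ticks) {
        long start = System.nanoTime();
        for (int tick = 1; tick <= ticks; tick++) {
            waitUntil(deadlineNanos(start, periodNanos, tick), 2_000_000L); // assumed 2 ms spin window
            task.run();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        runPeriodically(
                () -> System.out.printf("offset: %.3f ms%n", (System.nanoTime() - start) * 1e-6),
                50_000_000L, 10);
    }
}
```

A smaller spin window burns less CPU but leaves more of the wait to the less precise parkNanos; GC pauses and OS preemption can still delay any individual tick.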
If you really have hard time constraints, you want to use a real-time operating system. General computing does not have hard time constraints; if your OS goes to virtual memory in one of your intervals, then you can miss your sleep interval. The real-time OS makes the tradeoff that you may get less done, but that work can be scheduled more predictably.
If you need to do this on a normal OS, you can spinlock instead of sleeping. This is really inefficient, but if you really have hard time constraints, it's the best way to approximate that.
That will be hard - think about GC... What I would do is to grab time with nanoTime, and use it in calculations. Or in other words I would get exact time and use it in calculations.
Yes (assuming you only want to prevent long term drifts and don't worry about each delay individually). java.util.Timer.scheduleAtFixedRate:
...In fixed-rate execution, each execution is scheduled relative to the scheduled execution time of the initial execution. If an execution is delayed for any reason (such as garbage collection or other background activity), two or more executions will occur in rapid succession to "catch up." In the long run, the frequency of execution will be exactly the reciprocal of the specified period (assuming the system clock underlying Object.wait(long) is accurate). ...
Basically, do something like this:
new Timer().scheduleAtFixedRate(new TimerTask() {
@Override
public void run() {
onCall(); // calls the enclosing class's method; "this" here would refer to the TimerTask
}
}, 2000, 50);
So I'm trying to play a bit with microbenchmarks, have chosen JMH, have read some articles. How JMH measures execution of methods below system's timer granularity?
A more detailed explanation:
These are the benchmarks I'm running (method names speak for themselves):
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;
import java.util.concurrent.TimeUnit;
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
@Warmup(iterations = 10, time = 200, timeUnit = TimeUnit.NANOSECONDS)
@Measurement(iterations = 20, time = 200, timeUnit = TimeUnit.NANOSECONDS)
public class RandomBenchmark {
public long lastValue;
@Benchmark
@Fork(1)
public void blankMethod() {
}
@Benchmark
@Fork(1)
public void simpleMethod(Blackhole blackhole) {
int i = 0;
blackhole.consume(i++);
}
@Benchmark
@Fork(1)
public void granularityMethod(Blackhole blackhole) {
long initialTime = System.nanoTime();
long measuredTime;
do {
measuredTime = System.nanoTime();
} while (measuredTime == initialTime);
blackhole.consume(measuredTime);
}
}
Here are results:
# Run complete. Total time: 00:00:02
Benchmark Mode Cnt Score Error Units
RandomBenchmark.blankMethod avgt 20 0,887 ? 0,274 ns/op
RandomBenchmark.granularityMethod avgt 20 407,002 ? 26,297 ns/op
RandomBenchmark.simpleMethod avgt 20 6,979 ? 0,743 ns/op
This currently runs on Windows 7, whose timer, as described in various articles, has coarse granularity (407 ns here). A check with the basic code below confirms that a new timer value indeed arrives only every ~400 ns:
final int sampleSize = 100;
long[] timeMarks = new long[sampleSize];
for (int i=0; i < sampleSize; i++) {
timeMarks[i] = System.nanoTime();
}
for (long timeMark : timeMarks) {
System.out.println(timeMark);
}
It's hard to fully understand how generated methods exactly work but looking through decompiled JMH generated code it seems like it's using the same System.nanoTime() before and after execution and measures the difference. How is it able to measure method execution of couple nanoseconds while granularity is 400 ns?
You are totally right. You cannot measure something that is faster than your system's timer granularity.
JMH doesn't measure each invocation of the benchmark method. It calls System.nanoTime() before the start of an iteration, executes the benchmark method X times, and calls System.nanoTime() again after the iteration. The result is then the time difference divided by the number of operations (you can declare more than one operation per invocation on the method with @OperationsPerInvocation).
Aleksey Shipilev discussed measurement problems with Nanotime in his article Nanotrusting the Nanotime. Section 'Latency' contains a code example that shows how JMH measures one benchmark iteration.
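The batch measurement described in this answer can be illustrated with plain Java. This is a simplified sketch, not JMH's generated code: the names (BatchTiming, measure, averageNanosPerOp) are hypothetical, and it uses a fixed invocation count where JMH iterations are bounded by time. It also has none of JMH's protections (forking, warmup, Blackhole), so the numbers it prints are only indicative.

```java
class BatchTiming {

    // Written to by the measured operation so the JIT cannot discard the work entirely.
    static volatile long sink;

    /** Average cost per operation from one timer read before and one after the whole batch. */
    static double averageNanosPerOp(long startNanos, long endNanos, long operations) {
        return (double) (endNanos - startNanos) / operations;
    }

    /** Times the whole batch, so timer granularity is amortized over all operations. */
    static double measure(Runnable op, long operations) {
        long start = System.nanoTime();
        for (long i = 0; i < operations; i++) {
            op.run();
        }
        long end = System.nanoTime();
        return averageNanosPerOp(start, end, operations);
    }

    public static void main(String[] args) {
        double nsPerOp = measure(() -> sink++, 50_000_000L);
        System.out.printf("~%.2f ns/op%n", nsPerOp);
    }
}
```

With a timer step of about 400 ns and a batch lasting hundreds of milliseconds, the granularity error contributes only a tiny fraction of a nanosecond per operation, which is why sub-granularity averages are meaningful.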
I've written a class that lets an already started Java application continue once the current second is a multiple of 5 (i.e. Calendar.SECOND % 5 == 0).
The class code is presented below, what I'm curious about is, am I doing this the right way? It doesn't seem like an elegant solution, blocking the execution like this and getting the instance over and over.
public class Synchronizer{
private static Calendar c;
public static void timeInSync(){
do{
c = Calendar.getInstance();
}
while(c.get(Calendar.SECOND) % 5 != 0);
}
}
Synchronizer.timeInSync() is called in another class's constructor and an instance of that class is created at the start of the main method. Then the application runs forever with a TimerTask that's called every 5 seconds.
Is there a cleaner solution for synchronizing the time?
Update:
I think I did not state it clearly, but what I'm looking for here is to synchronize with the system time without busy waiting.
So I need to be able to get
12:19:00
12:19:05
12:19:10
...
What you have now is called busy waiting (sometimes also referred to as polling), and yes, it's inefficient in terms of both processor usage and energy usage. Your code executes whenever the OS allows it, and in doing so it prevents the CPU from being used for other work; when there is no other work, it prevents the CPU from taking a nap, wasting energy (heating the CPU, draining the battery...).
What you should do is put your thread to sleep until the time where you want to do something arrives. This allows the CPU to perform other tasks or go to sleep.
There is a method on java.lang.Thread to do just that: Thread.sleep(long milliseconds) (it also has a cousin taking an additional nanos parameter, but the nanos may be ignored by the VM, and that kind of precision is rarely needed).
So first you determine when you need to do some work. Then you sleep until then. A naive implementation could look like that:
public static void waitUntil(long timestamp) {
long millis = timestamp - System.currentTimeMillis();
// return immediately if time is already in the past
if (millis <= 0)
return;
try {
Thread.sleep(millis);
} catch (InterruptedException e) {
throw new RuntimeException(e.getMessage(), e);
}
}
This works fine if you don't have strict requirements on hitting the time precisely. You can expect it to return reasonably close to the specified time (probably within a few tens of ms) if the time isn't too far in the future (a few seconds at most). You have no guarantee, however; when the OS is really busy, it may occasionally return much later.
A slightly more accurate method is to determine the required sleep time, sleep for half of it, re-evaluate the required sleep time, sleep for half of that, and so on until the required sleep time becomes very small, then busy-wait for the remaining few milliseconds.
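The halving strategy might be sketched like this. It is a minimal sketch under assumptions: the names (HalvingWait, nextSleepMillis) and the 2 ms cut-over to busy waiting are hypothetical choices, and Thread.onSpinWait() requires Java 9 or later.

```java
class HalvingWait {

    // Assumed cut-over point: below this, stop sleeping and busy-wait instead.
    static final long SPIN_THRESHOLD_MS = 2;

    /** How long to sleep next: half the remaining time, or 0 once it is time to spin. */
    static long nextSleepMillis(long remainingMillis) {
        return remainingMillis > SPIN_THRESHOLD_MS ? remainingMillis / 2 : 0;
    }

    /** Waits until the given wall-clock timestamp (milliseconds since the epoch). */
    static void waitUntil(long timestampMillis) {
        long remaining;
        while ((remaining = timestampMillis - System.currentTimeMillis()) > 0) {
            long sleep = nextSleepMillis(remaining);
            if (sleep == 0) {
                Thread.onSpinWait(); // hint to the CPU that this is a spin loop
                continue;
            }
            try {
                Thread.sleep(sleep); // an oversleep is absorbed by re-evaluating afterwards
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) {
        long target = System.currentTimeMillis() + 1000;
        waitUntil(target);
        System.out.println("late by " + (System.currentTimeMillis() - target) + " ms");
    }
}
```

Because each sleep covers only half of what remains, a single oversleep is unlikely to overshoot the target; the cost is a few extra wake-ups and a short busy-wait at the end.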
However System.currentTimeMillis() does not guarantee the actual resolution of time; it may change once every millisecond, but it might as well only change every ten ms by 10 (this depends on the platform). Same goes for System.nanoTime().
Waiting for an exact point in time is not possible in high level programming languages in a multi-tasking environment (practically everywhere nowadays). If you have strict requirements, you need to turn to the operating system specifics to create an interrupt at the specified time and handle the event in the interrupt (that means assembler or at least C for the interrupt handler). You won't need that in most normal applications, a few ms +/- usually don't matter in a game/application.
As @ChrisK suggests, you could simplify by just making a direct call to System.currentTimeMillis().
For example:
long time = 0;
do
{
time = System.currentTimeMillis();
} while (time % 5000 != 0);
Note that you need to change the comparison value to 5000 because the representation of the time is in milliseconds.
Also, there are possible pitfalls to doing any comparison so directly like this, as the looping call depends on processor availability and whatnot, so there is a chance that an implementation such as this could make one call that returns:
`1411482384999`
And then the next call in the loop return
`1411482385001`
Meaning that your condition has been skipped by virtue of hardware availability.
If you want to use a built in scheduler, I suggest looking at the answer to a similar question here java: run a function after a specific number of seconds
You should use
System.nanoTime()
instead of
System.currentTimeMillis()
because it returns the measured elapsed time instead of the system time, so nanoTime is not influenced by system time changes.
public class Synchronizer
{
public static void timeInSync()
{
long lastNanoTime = System.nanoTime();
long nowTime = System.nanoTime();
while(nowTime/1000000 - lastNanoTime /1000000 < 5000 )
{
nowTime = System.nanoTime();
}
}
}
The first main point is that you should never use busy-waiting. In Java you can avoid busy-waiting by using either Object.wait(timeout) or Thread.sleep(timeout). The latter is more suitable for your case, because your case doesn't require releasing a monitor lock.
Next, you can use two approaches to wait until your time condition is satisfied. You can either precalculate your whole wait time or wait for small time intervals in loop, checking the condition.
I will illustrate both approaches here:
private static long nextWakeTime(long time) {
if (time / 1000 % 5 == 0) { // current time is multiple of five seconds
return time;
}
return (time / 1000 / 5 + 1) * 5000;
}
private static void waitUsingCalculatedTime() {
long currentTime = System.currentTimeMillis();
long wakeTime = nextWakeTime(currentTime);
while (currentTime < wakeTime) {
try {
System.out.printf("Current time: %d%n", currentTime);
System.out.printf("Wake time: %d%n", wakeTime);
System.out.printf("Waiting: %d ms%n", wakeTime - currentTime);
Thread.sleep(wakeTime - currentTime);
} catch (InterruptedException e) {
// ignore
}
currentTime = System.currentTimeMillis();
}
}
private static void waitUsingSmallTime() {
while (System.currentTimeMillis() / 1000 % 5 != 0) {
try {
System.out.printf("Current time: %d%n", System.currentTimeMillis());
Thread.sleep(100);
} catch (InterruptedException e) {
// ignore
}
}
}
As you can see, waiting for the precalculated time is more complex, but it is more precise and more efficient (since in the general case it completes in a single iteration). Waiting iteratively for a small time interval is simpler, but less efficient and less precise (the precision depends on the selected size of the time interval).
Also please note how I calculate if the time condition is satisfied:
(time / 1000 % 5 == 0)
First you need to convert to whole seconds, and only then check whether they are a multiple of five. Checking time % 5000 == 0, as suggested in another answer, is wrong, as it is true only for the first millisecond of each fifth second.
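Since the application then runs a task every 5 seconds anyway, a variant of the precalculated approach is to hand the computed delay to a scheduler instead of sleeping in the constructor. The sketch below assumes hypothetical names (AlignedSchedule, delayToNextBoundary). Unlike nextWakeTime above, it targets the exact boundary rather than any millisecond within a matching second. It also relies on 5-second boundaries of epoch time coinciding with seconds :00, :05, :10 and so on, which holds for time zones with whole-minute offsets.

```java
import java.util.Date;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class AlignedSchedule {

    /** Milliseconds from nowMillis until the next multiple of periodMillis (0 if already on one). */
    static long delayToNextBoundary(long nowMillis, long periodMillis) {
        long remainder = nowMillis % periodMillis;
        return remainder == 0 ? 0 : periodMillis - remainder;
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        long periodMillis = 5000;
        long delay = delayToNextBoundary(System.currentTimeMillis(), periodMillis);
        // First run at the next 5-second boundary, then every 5 seconds.
        scheduler.scheduleAtFixedRate(
                () -> System.out.println(new Date()),
                delay, periodMillis, TimeUnit.MILLISECONDS);
        Thread.sleep(16_000); // let a few ticks happen, then stop the demo
        scheduler.shutdownNow();
    }
}
```

The scheduler paces itself on elapsed time rather than on the wall clock, so over a very long run, or after a system clock adjustment, the ticks can drift away from the :00/:05 marks; recomputing the delay periodically would be one way to handle that.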