Hazelcast Ringbuffer readManyAsync returns Empty Results - java

I'm trying to read N items from a Ringbuffer using readManyAsync, but it always returns an empty result set. If I use readOne I get data.
I'm using readManyAsync as the documentation specifies. Is there another way to do this?
Environment:
Java 8
Hazelcast 3.5.3
Example:
Ringbuffer<String> buffer = this.hazelcastInstance.getRingbuffer("testBuffer");
buffer.add("a");
buffer.add("b");
buffer.add("c");
Long sequence = buffer.headSequence();
ICompletableFuture<ReadResultSet<String>> resultSetFuture = buffer.readManyAsync(sequence, 0, 3, null);
ReadResultSet<String> resultSet = resultSetFuture.get();
System.out.println("*** readManyAsync *** readCount: " + resultSet.readCount());
int count = 0;
for (String s : resultSet) {
System.out.println(count + " - " + s);
count++;
}
System.out.println("*** readOne ***");
for (int i = 0; i < 3; i++) {
System.out.println(i + " - " + buffer.readOne(i));
}
Output:
*** readManyAsync *** readCount: 0
*** readOne ***
0 - a
1 - b
2 - c

With a minCount of 0, you are telling it that you are happy with receiving zero results:
buffer.readManyAsync(sequence, 0, 3, null);
Try changing 0 to 1.
buffer.readManyAsync(sequence, 1, 3, null);
Now the call will block until there is at least 1 result.
You can probably make things more efficient by asking for more than 3 items. In most cases retrieving the data is cheap, but the I/O and operation scheduling are expensive, so try to batch as much as possible and request as many results per call as you can, e.g. 100, or 1000 (which is the maximum).
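For example (a sketch, assuming the Hazelcast 3.x API used in the question; exception handling omitted), a single call can drain up to 1000 items, and ReadResultSet.readCount() tells you how far to advance the sequence:

ICompletableFuture<ReadResultSet<String>> f = buffer.readManyAsync(sequence, 1, 1000, null);
ReadResultSet<String> rs = f.get(); // blocks until at least 1 item is available
for (String item : rs) {
    System.out.println(item); // handle each item
}
sequence += rs.readCount(); // advance past everything just read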

OK, but how do you use readManyAsync in a non-blocking way, with minCount set to 0?
I made a minimal test case, and I really can't figure it out. I posted a support topic here:
https://groups.google.com/forum/#!topic/hazelcast/FGnLWDGrzb8
As an answer: I use readManyAsync with a timeout, like so:
try{
buffer.readManyAsync(sequence, 1, 3, null).get(500, TimeUnit.MILLISECONDS);
} catch (TimeoutException e){
// We timed out, shame, let's move on
}
That seems to be the only way to make a graceful non-blocking reader, but from reading the docs I really thought a minCount of 0 would do the trick.
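For completeness, the timeout approach can be combined with sequence bookkeeping so that the reader only advances past items it actually received; a sketch under the same Hazelcast 3.x assumptions:

try {
    ReadResultSet<String> rs = buffer.readManyAsync(sequence, 1, 100, null)
            .get(500, TimeUnit.MILLISECONDS);
    sequence += rs.readCount(); // only advance past items actually read
} catch (TimeoutException e) {
    // nothing arrived within 500 ms: do other work, then poll again
} catch (InterruptedException | ExecutionException e) {
    throw new RuntimeException(e); // a real consumer would log and decide whether to retry
}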

Related

Stop all threads of a parallelStream [duplicate]

I'm trying locate the first (any) member of a list that matches a given predicate like so:
Item item = items.parallelStream()
.map(i -> i.doSomethingExpensive())
.filter(predicate)
.findAny()
.orElse(null);
I would expect that once findAny() gets a match, it would return immediately, but that doesn't appear to be the case. Instead it seems to wait for the map method to finish on most of the elements before returning. How can I return the first result immediately and cancel the other parallel streams? Is there a better way to do this than using streams, such as CompletableFuture?
Here's a simple example to show the behavior:
private static void log(String msg) {
SimpleDateFormat sdf = new SimpleDateFormat("HH:mm:ss.SSS");
System.out.println(sdf.format(new Date()) + " " + msg);
}
Random random = new Random();
List<Integer> nums = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14);
Optional<Integer> num = nums.parallelStream()
.map(n -> {
long delay = Math.abs(random.nextLong()) % 10000;
log("Waiting on " + n + " for " + delay + " ms");
try { Thread.sleep(delay); }
catch (InterruptedException e) { System.err.println("Interruption error"); }
return n * n;
})
.filter(n -> n < 30)
.peek(n -> log("Found match: " + n))
.findAny();
log("First match: " + num);
Log output:
14:52:27.061 Waiting on 9 for 2271 ms
14:52:27.061 Waiting on 2 for 1124 ms
14:52:27.061 Waiting on 13 for 547 ms
14:52:27.061 Waiting on 4 for 517 ms
14:52:27.061 Waiting on 1 for 1210 ms
14:52:27.061 Waiting on 6 for 2646 ms
14:52:27.061 Waiting on 0 for 4393 ms
14:52:27.061 Waiting on 12 for 5520 ms
14:52:27.581 Found match: 16
14:52:27.582 Waiting on 3 for 5365 ms
14:52:28.188 Found match: 4
14:52:28.275 Found match: 1
14:52:31.457 Found match: 0
14:52:32.950 Found match: 9
14:52:32.951 First match: Optional[0]
Once a match is found (in this case 16), findAny() does not return immediately, but instead blocks until the remaining threads finish. In this case, the caller is waiting an extra 5 seconds before returning after a match has already been found.
Instead it seems to wait for the map method to finish on most of the elements before returning.
This is not correct.
When speaking of the elements which are already being processed, it will wait for the completion of all of them, as the Stream API allows concurrent processing of data structures which are not intrinsically thread safe. It must ensure that all potential concurrent access has been finished before returning from the terminal operation.
When talking about the entire stream, it's simply not fair to test a stream of only 14 elements on an 8-core machine. Of course there will be at least 8 concurrent operations started; that's what it is all about. You are adding fuel to the flames by using findFirst() instead of findAny(): findFirst() doesn't mean returning the first found element in processing order, but the first element in encounter order, i.e. exactly zero in your example, so threads processing chunks other than the first can't assume that their result is the correct answer and are even more willing to help process other candidates than with findAny().
When you use
List<Integer> nums = IntStream.range(0, 200).boxed().collect(Collectors.toList());
Optional<Integer> num = nums.parallelStream()
.map(n -> {
long delay = ThreadLocalRandom.current().nextInt(10_000);
log("Waiting on " + n + " for " + delay + " ms");
LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(delay));
return n * n;
})
.filter(n -> n < 40_000)
.peek(n -> log("Found match: " + n))
.findAny();
log("First match: " + num);
You will get a similar number of tasks running to completion, despite the far bigger number of stream elements.
Note that CompletableFuture also doesn't support interruption, so the only built-in feature for returning any result and canceling the other jobs that comes to mind is the good old ExecutorService.invokeAny.
To build the mapping-and-filtering feature on top of it, we can use the following helper function:
static <T,R> Callable<R> mapAndfilter(T t, Function<T,R> f, Predicate<? super R> p) {
return () -> {
R r = f.apply(t);
if(!p.test(r)) throw new NoSuchElementException();
return r;
};
}
Unfortunately, there’s only the option of completing with a value or exceptionally, therefore we have to use an exception for non-matching elements.
Then we can use it like
ExecutorService es = ForkJoinPool.commonPool();
Integer result = es.invokeAny(IntStream.range(0, 100)
.mapToObj(i -> mapAndfilter(i,
n -> {
long delay = ThreadLocalRandom.current().nextInt(10_000);
log("Waiting on " + n + " for " + delay + " ms");
LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(delay));
return n * n;
},
n -> n < 10_000))
.collect(Collectors.toList()));
log("result: "+result);
and it will not only cancel the pending tasks, it will return without waiting for them to finish.
Of course, this implies that the source data the jobs operate upon must be either immutable or thread-safe.
You can use this code to illustrate how parallelStream works:
final List<String> list = Arrays.asList("first", "second", "third", "4th", "5th", "7th", "8th", "9th", "10th", "11th", "12th", "13th");
String result = list.parallelStream()
.map(s -> {
System.out.println("map: " + s);
return s;
})
.filter(s -> {
System.out.println("fiter: " + s);
return s.equals("8th");
})
.findFirst()
.orElse(null);
System.out.println("result=" + result);
There are two options to achieve what you're looking for, i.e. to stop the expensive operation via a filter:
Don't use streams at all; use a simple for or enhanced for loop
Filter first, then map with the expensive operation (a sketch of this follows below)
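A minimal sketch of the filter-first option, assuming the predicate can be rephrased in terms of the unmapped input (here n * n < 30 is checked on n directly, and expensiveComputation is a hypothetical stand-in for the costly step):

Optional<Integer> num = nums.parallelStream()
        .filter(n -> n * n < 30)           // cheap test first
        .map(n -> expensiveComputation(n)) // expensive work runs only on survivors
        .findAny();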
There are several things at play here. The first thing is that parallelStream() uses the common ForkJoinPool by default, which makes the calling thread participate as well. This means that if one of the slow tasks is currently running on the calling thread, it has to finish before the caller gets the control back.
You can see this by modifying the code a little bit to log the thread names, and to log when the waiting has finished:
private static void log(String msg) {
SimpleDateFormat sdf = new SimpleDateFormat("HH:mm:ss.SSS");
System.out.println(sdf.format(new Date()) + " [" + Thread.currentThread().getName() + "] " + " " + msg);
}
public static void main(String[] args) {
Random random = new Random();
List<Integer> nums = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14);
Optional<Integer> num = nums.parallelStream()
.map(n -> {
long delay = Math.abs(random.nextLong()) % 10000;
log("Waiting on " + n + " for " + delay + " ms");
try {
Thread.sleep(delay);
} catch (InterruptedException e) {
System.err.println("Interruption error");
}
log("finished waiting");
return n * n;
})
.filter(n -> n < 30)
.peek(n -> log("Found match: " + n))
.findAny();
log("First match: " + num);
}
Sample output:
13:56:52.954 [main] Waiting on 9 for 9936 ms
13:56:52.956 [ForkJoinPool.commonPool-worker-1] Waiting on 4 for 7436 ms
13:56:52.970 [ForkJoinPool.commonPool-worker-2] Waiting on 1 for 6523 ms
13:56:52.983 [ForkJoinPool.commonPool-worker-3] Waiting on 6 for 7488 ms
13:56:59.494 [ForkJoinPool.commonPool-worker-2] finished waiting
13:56:59.496 [ForkJoinPool.commonPool-worker-2] Found match: 1
13:57:00.392 [ForkJoinPool.commonPool-worker-1] finished waiting
13:57:00.392 [ForkJoinPool.commonPool-worker-1] Found match: 16
13:57:00.471 [ForkJoinPool.commonPool-worker-3] finished waiting
13:57:02.892 [main] finished waiting
13:57:02.894 [main] First match: Optional[1]
Here, as you can see, 2 matches are found, but the main thread is still busy, so it cannot return the match yet.
This does not always explain all cases though:
13:58:52.116 [main] Waiting on 9 for 5256 ms
13:58:52.143 [ForkJoinPool.commonPool-worker-1] Waiting on 4 for 4220 ms
13:58:52.148 [ForkJoinPool.commonPool-worker-2] Waiting on 1 for 2136 ms
13:58:52.158 [ForkJoinPool.commonPool-worker-3] Waiting on 6 for 7262 ms
13:58:54.294 [ForkJoinPool.commonPool-worker-2] finished waiting
13:58:54.295 [ForkJoinPool.commonPool-worker-2] Found match: 1
13:58:56.364 [ForkJoinPool.commonPool-worker-1] finished waiting
13:58:56.364 [ForkJoinPool.commonPool-worker-1] Found match: 16
13:58:57.399 [main] finished waiting
13:58:59.422 [ForkJoinPool.commonPool-worker-3] finished waiting
13:58:59.424 [main] First match: Optional[1]
This might be explained by the way the fork-join pool merges the results. It seems some improvements are possible.
As an alternative, you could indeed do this using CompletableFuture:
// you should probably also pass your own executor to supplyAsync()
List<CompletableFuture<Integer>> futures = nums.stream().map(n -> CompletableFuture.supplyAsync(() -> {
long delay = Math.abs(random.nextLong()) % 10000;
log("Waiting on " + n + " for " + delay + " ms");
try {
Thread.sleep(delay);
} catch (InterruptedException e) {
System.err.println("Interruption error");
}
log("finished waiting");
return n * n;
})).collect(Collectors.toList());
CompletableFuture<Integer> result = CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
.thenApply(unused -> futures.stream().map(CompletableFuture::join).filter(n -> n < 30).findAny().orElse(null));
// shortcircuiting
futures.forEach(f -> f.thenAccept(r -> {
if (r < 30) {
log("Found match: " + r);
result.complete(r);
}
}));
// cancelling remaining tasks
result.whenComplete((r, t) -> futures.forEach(f -> f.cancel(true)));
log("First match: " + result.join());
Output:
14:57:39.815 [ForkJoinPool.commonPool-worker-1] Waiting on 0 for 7964 ms
14:57:39.815 [ForkJoinPool.commonPool-worker-3] Waiting on 2 for 5743 ms
14:57:39.817 [ForkJoinPool.commonPool-worker-2] Waiting on 1 for 9179 ms
14:57:45.562 [ForkJoinPool.commonPool-worker-3] finished waiting
14:57:45.563 [ForkJoinPool.commonPool-worker-3] Found match: 4
14:57:45.564 [ForkJoinPool.commonPool-worker-3] Waiting on 3 for 7320 ms
14:57:45.566 [main] First match: 4
Note that cancel(true) does not actually cancel the ongoing tasks (no interruption will occur, for example), but it prevents further tasks from being run (you can even see that it might not be immediate, since worker 3 still started to execute the next one).
You should also use your own executor, with the appropriate size based on whether the work is more CPU- or I/O-intensive. As you can see, the default uses the common pool and thus does not use all cores.
The allOf() is mainly needed in case no match is found. If you can guarantee that there is at least one match, you could simply use a new CompletableFuture<>() instead.
Finally, as a simple approach I repeated the filter check, but it's easy to move that logic inside the main logic, return null or a marker, and then test for that in both places.
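A sketch of that marker idea (a hypothetical refactor of the snippet above, reusing its nums and result variables): let each task evaluate the predicate itself and return null as the "no match" marker, so the check is written only once:

List<CompletableFuture<Integer>> futures = nums.stream()
        .map(n -> CompletableFuture.supplyAsync(() -> {
            int squared = n * n;
            return squared < 30 ? Integer.valueOf(squared) : null; // null marks "no match"
        }))
        .collect(Collectors.toList());
futures.forEach(f -> f.thenAccept(r -> {
    if (r != null) {
        result.complete(r); // complete on the first matching value
    }
}));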
See also How to make a future that gets completed when any of the given CompletableFutures is completed with a result that matches a certain predicate?

Algorithm for total idle time for a process based on average

This was asked during an online interview process many days back; the question goes like this:
A computer system uses a preemptive process scheduling methodology called Less than average first which works as follows:
A process whose remaining execution time is less than the average remaining execution time of all processes is executed.
If multiple processes satisfy the first condition, the one that arrived earliest is executed. If no process satisfies the first condition, the process with the smallest remaining execution time is chosen.
Given the arrival times and the total execution times of each process, find the total time for which the processes remain idle before their execution is completed.
Example: Each line contains arrival time and the remaining execution time:
1 4
2 2
3 1
Output:
Total time for which each process remains idle before its execution is completed.
4
Explanation:
At time = 1, only one process exists so it will be executed.
At time = 2, the avg. execution time is (3+2)/2 = 2.5; the remaining execution time of the 2nd process is less than the average. Hence the 2nd process will be executed.
The final process execution sequence for each time unit is therefore:
1 2 2 3 1 1 1
Method signature is:
int process(int input[][]) {
}
I have read this question many times, but I am not able to understand it. Can you please help me with how to solve this?
You can implement a method getNextProcess to determine the next process to run, and then build a scheduler that actually processes the input and adds up the waiting times. This way you can simulate the whole processing and have the result at the end.
Determining the next process to run (implementing the rules as stated in the question):
static Optional<Process> getNextProcess(List<Process> processes) {
if (processes.isEmpty()) {
return Optional.empty();
}
// calculate average
double avg = processes.stream()
.mapToInt(p -> p.getRemainingTime())
.average().getAsDouble();
// return the first process with 'remaining time < avg';
// simultaneously track the process with the smallest remaining time
Process nextProcess = null;
int minRemainingTime = Integer.MAX_VALUE;
for (Process p : processes) {
if (p.getRemainingTime() < avg) {
return Optional.of(p);
}
if (p.getRemainingTime() < minRemainingTime) {
nextProcess = p;
minRemainingTime = p.getRemainingTime();
}
}
return Optional.ofNullable(nextProcess);
}
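This relies on a simple Process value class that the answer does not define; a minimal sketch of what it presumably looks like (the scheduler below uses it too):

static class Process {
    private final int id;
    private final int arrivalTime;
    private int remainingTime;

    Process(int id, int arrivalTime, int remainingTime) {
        this.id = id;
        this.arrivalTime = arrivalTime;
        this.remainingTime = remainingTime;
    }

    int getId() { return id; }
    int getArrivalTime() { return arrivalTime; }
    int getRemainingTime() { return remainingTime; }
    void setRemainingTime(int remainingTime) { this.remainingTime = remainingTime; }
}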
This is what the simulator/scheduler can look like:
static int process(int[][] input) {
// transform the 2d-array into a custom object for convenience
List<Process> incomingProcesses = new ArrayList<>();
for (int i = 0; i < input.length; i++) {
incomingProcesses.add(new Process(i + 1, input[i][0], input[i][1]));
}
// simulate scheduling
int time = 1;
int totalWaitingTime = 0;
List<Process> currentProcesses = new ArrayList<>();
while (!incomingProcesses.isEmpty() || !currentProcesses.isEmpty()) {
// handle new processes that arrive at this time step
final int finalTime = time;
List<Process> newProcesses = incomingProcesses.stream()
.filter(p -> p.getArrivalTime() == finalTime)
.collect(Collectors.toList());
currentProcesses.addAll(newProcesses);
incomingProcesses.removeAll(newProcesses);
// remove processes with no remaining time
currentProcesses.removeIf(p -> p.getRemainingTime() <= 0);
// increase total waiting time (reduction follows later)
totalWaitingTime += currentProcesses.size();
// get next process and if found,
// reduce its remaining time and decrease total waiting time
Optional<Process> nextProcess = getNextProcess(currentProcesses);
if (nextProcess.isPresent()) {
Process p = nextProcess.get();
System.out.println("Process " + p.getId());
p.setRemainingTime(p.getRemainingTime() - 1);
totalWaitingTime -= 1; // reduction since this process actually ran
}
time++; // move to next time step
}
return totalWaitingTime;
}
Let's test that:
public static void main(String[] args) {
int result = process(new int[][]{
{1, 4},
{2, 2},
{3, 1}
});
System.out.println("The result is: " + result);
}
Output:
Process 1
Process 2
Process 2
Process 3
Process 1
Process 1
Process 1
The result is: 4

Random for-loop in Java?

I have 25 batch jobs that are executed constantly, that is, when number 25 is finished, 1 is immediately started.
These batch jobs are started using a URL that contains a value from 1 to 25. Basically, I use a for loop from 1 to 25 where I, in each round, call a URL with the current value of i: http://batchjobserver/1, http://batchjobserver/2 and so on.
The problem is that some of these batch jobs are a bit unstable and sometimes crash, which causes the for-loop to restart at 1. As a consequence, batch job 1 is run every time the loop is initiated, while 25 runs much less frequently.
I like my current solution because it is so simple (in pseudo-code):
for (i=1; i < 26; i++) {
getURL ("http://batchjob/" + Integer.toString(i));
}
However, I would like i to be a random number between 1 and 25 so that, in case something crashes, all the batch jobs, in the long run, are run approximately the same number of times.
Is there some nice hack/algorithm that allows me to achieve this?
Other requirements:
The number 25 changes frequently
This is not an absolute requirement, but it would be nice if one batch job wasn't run again until all other jobs have been attempted once. This doesn't mean that they have to "wait" 25 loops before they can run again; instead, if job 8 is executed in the 25th loop (the last loop of the first "set" of loops), the 26th loop (the first loop of the second set) can be 8 as well.
Randomness has another advantage: it is desirable if the execution of these jobs looks a bit manual.
To handle errors, you should use a try-catch statement. It should look something like this:
for(int i = 1, i<26, i++){
try{
getURL();
}
catch (Exception e){
System.out.print(e);
}
}
This is a very basic example of what can be done. This will, however, only skip the failed attempts, print the error, and continue to the next iteration of the loop.
There are two parts of your requirement:
Randomness: For this, you can use Random#nextInt.
Skip the problematic call and continue with the remaining ones: For this, you can use a try-catch block.
Code:
Random random = new Random();
for (i = 1; i < 26; i++) {
try {
getURL ("http://batchjob/" + Integer.toString(random.nextInt(25) + 1));
} catch (Exception e) {
System.out.println("Error: " + e.getMessage());
}
}
Note: random.nextInt(25) returns an int value from 0 to 24 and thus, when 1 is added to it, the range becomes 1 to 25.
You could use a set and draw random numbers in the range of your batches; while doing this, you track which batches you have already run by adding them to the set. Something like this:
int numberOfBatches = 25;
Set<Integer> set = new HashSet<>();
List<Integer> failedBatches = new ArrayList<>();
Random random = new Random();
while (set.size() < numberOfBatches)
{
int ran = random.nextInt(numberOfBatches) + 1;
if (set.contains(ran)) continue;
set.add(ran);
try
{
getURL("http://batchjob/" + Integer.toString(ran));
} catch (Exception e)
{
failedBatches.add(ran);
}
}
As an extra, you can save which batches failed.
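An alternative that also satisfies the "each job once per pass" requirement is to shuffle the list of job IDs on every pass; a sketch (getURL is the method from the question):

List<Integer> ids = new ArrayList<>();
for (int i = 1; i <= 25; i++) {
    ids.add(i);
}
while (true) {
    Collections.shuffle(ids); // random order, but each id exactly once per pass
    for (int id : ids) {
        try {
            getURL("http://batchjob/" + id);
        } catch (Exception e) {
            System.out.println("Job " + id + " failed: " + e.getMessage());
        }
    }
}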
The following is an example of a single-threaded, infinitely looping (also called round-robin) scheduler with simple retry capabilities. I called the routine that invokes your batch job "scrape" (scraping means indexing a website's contents):
public static void main(String... args) throws Exception {
Runnable[] jobs = new Runnable[]{
() -> scrape("https://www.stackoverfow.com"),
() -> scrape("https://www.github.com"),
() -> scrape("https://www.facebook.com"),
() -> scrape("https://www.twitter.com"),
() -> scrape("https://www.wikipedia.org"),
};
for (int i = 0; true; i++) {
int remainingAttempts = 3;
while (remainingAttempts > 0) {
try {
jobs[i % jobs.length].run();
break;
} catch (Throwable err) {
err.printStackTrace();
remainingAttempts--;
}
}
}
}
private static void scrape(String website) {
System.out.printf("Doing my job against %s%n", website);
try {
Thread.sleep(100); // Simulate network work
} catch (InterruptedException e) {
throw new RuntimeException("Requested interruption");
}
if (Math.random() > 0.5) { // Simulate network failure
throw new RuntimeException("Ooops! I'm a random error");
}
}
You may want to add multi-threading (which can be achieved by simply adding an ExecutorService guarded by a Semaphore) and more refined retry logic (for example, only for certain types of errors, and with an exponential backoff).

Trying to measure how much time an insert takes in a database

I have a multithreaded program which inserts into one of my tables, and I am running that program like this:
java -jar CannedTest.jar 100 10000
which means:
Number of threads is 100
Number of tasks is 10000
So each thread will insert 10000 records into my table, which means the total count in the table should be 1,000,000 (100 * 10000) after the program has finished executing.
I am trying to measure how much time an insert into my table takes, as part of our LnP testing. I am storing these timings in a ConcurrentHashMap, like below:
long start = System.nanoTime();
callableStatement[pos].executeUpdate(); // flush the records
long end = System.nanoTime() - start;
// bucket by elapsed milliseconds: create the counter on first sight of a bucket,
// otherwise increment the existing one
final AtomicLong before = insertHistogram.putIfAbsent(end / 1000000L, new AtomicLong(1L));
if (before != null) {
before.incrementAndGet();
}
When all the threads have finished executing all the tasks, I print out the numbers from the ConcurrentHashMap insertHistogram, sorted on the key (milliseconds), and I get a result like below:
Milliseconds Number
0 2335
1 62488
2 60286
3 54967
4 52374
5 93034
6 123083
7 179355
8 118686
9 87126
10 42305
.. ..
.. ..
.. ..
And also, from the same ConcurrentHashMap insertHistogram, I tried to make a histogram, like below.
17:46:06,112 INFO LoadTest:195 - Insert Histogram List:
17:46:06,112 INFO LoadTest:212 - 64823 came back between 1 and 2 ms
17:46:06,112 INFO LoadTest:212 - 115253 came back between 3 and 4 ms
17:46:06,112 INFO LoadTest:212 - 447846 came back between 5 and 8 ms
17:46:06,112 INFO LoadTest:212 - 330533 came back between 9 and 16 ms
17:46:06,112 INFO LoadTest:212 - 29188 came back between 17 and 32 ms
17:46:06,112 INFO LoadTest:212 - 6548 came back between 33 and 64 ms
17:46:06,112 INFO LoadTest:212 - 3821 came back between 65 and 128 ms
17:46:06,113 INFO LoadTest:212 - 1988 came back greater than 128 ms
NOTE: the database into which I am inserting records is currently in Memory Only mode.
Problem Statement:
Take a look at this line in my above result, which is printed out sorted on the key:
0 2335
I am not sure how it is possible that 2335 inserts completed in 0 milliseconds, especially since I am using System.nanoTime() to measure each insert.
Below is the code which prints out the above logs:
private static void logHistogramInfo() {
int[] definition = { 0, 2, 4, 8, 16, 32, 64, 128 };
long[] buckets = new long[definition.length];
System.out.println("Milliseconds Number");
SortedSet<Long> keys = new TreeSet<Long>(Task.insertHistogram.keySet());
for (long key : keys) {
AtomicLong value = Task.insertHistogram.get(key);
System.out.println(key+ " " + value);
}
LOG.info("Insert Histogram List: ");
for (Long time : Task.insertHistogram.keySet()) {
for (int i = definition.length - 1; i >= 0; i--) {
if (time >= definition[i]) {
buckets[i] += Task.insertHistogram.get(time).get();
break;
}
}
}
for (int i = 0; i < definition.length; i++) {
String period = "";
if (i == definition.length - 1) {
period = "greater than " + definition[i] + " ms";
} else {
period = "between " + (definition[i] + 1) + " and " + definition[i + 1] + " ms";
}
LOG.info(buckets[i] + " came back " + period);
}
}
I am not sure why 0 milliseconds gets shown when I print the values from the map directly, sorted on the key,
but the same 0 milliseconds doesn't get shown when I build the histogram in the same logHistogramInfo method.
Is there anything wrong with my calculation in the above method?

Why does sleeping between iterations cause operations in a loop to take longer than when there is no sleep

The attached program (shown at the end), when executed, yields the following output:
..........
with sleep time of 0ms
times= [1, 1, 1, 0, 1, 1, 0, 1, 1, 0]
average= 0.7
..........
with sleep time of 2000ms
times= [2, 2, 2, 2, 2, 1, 2, 2, 2, 2]
average= 1.9
In both cases the exact same code is executed: repeatedly getting the next value from a Random object instantiated at the start of the program. The warm-up method executed first is supposed to trigger any JIT optimizations before the actual testing begins.
Can anyone explain the reason for this difference? I have been able to reproduce this result on my machine every time so far; it was executed on a multi-core Windows system with Java 7.
One interesting thing is that if the order in which the tests are executed is reversed, that is, if we run the loop with the delay before the loop without the delay, then the timings are more similar (with the no delay loop actually taking longer):
..........
with sleep time of 2000ms
times= [2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
average= 2.0
..........
with sleep time of 0ms
times= [2, 3, 3, 2, 3, 3, 2, 3, 2, 3]
average= 2.6
As far as I can tell, no object is created inside the operation method, and when running this through a profiler, garbage collection never seems to be triggered. A wild guess is that some value gets cached in a processor-local cache, which gets flushed out when the thread is put to sleep, so that when the thread wakes up it needs to retrieve the value from main memory, which is not as fast. That, however, does not explain why inverting the order makes a difference...
The real-life situation where I initially observed this behavior (and which prompted me to write this sample test class) was XML unmarshalling: unmarshalling the same document repeatedly in quick succession yielded better times than doing the same thing with a delay between calls to unmarshal (whether the delay was generated through sleep or manually).
Here is the code:
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
public class Tester
{
public static void main(String[] args) throws InterruptedException
{
warmUp(10000);
int numRepetitions = 10;
runOperationInALoop(numRepetitions, 0);
runOperationInALoop(numRepetitions, 2000);
}
private static void runOperationInALoop(int numRepetitions, int sleepTime) throws InterruptedException
{
List<Long> times = new ArrayList<Long>(numRepetitions);
long totalDuration = 0;
for(int i=0; i<numRepetitions; i++)
{
Thread.sleep(sleepTime);
long before = System.currentTimeMillis();
someOperation();
long duration = System.currentTimeMillis() - before;
times.add(duration);
totalDuration = totalDuration + duration;
System.out.print(".");
}
System.out.println();
double averageTimePerOperation = totalDuration/(double)numRepetitions;
System.out.println("with sleep time of " + sleepTime + "ms");
System.out.println(" times= " + times);
System.out.println(" average= " + averageTimePerOperation);
}
private static void warmUp(int warmUpRepetitions)
{
for(int i=0; i<warmUpRepetitions; i++)
{
someOperation();
}
}
public static int someInt;
public static Random random = new Random(123456789L);
private static void someOperation()
{
for(int j=0; j<50000; j++)
{
someInt = ((int)random.nextInt()*10) + 1;
}
}
}
When you sleep for even a short period of time (you may find that 10 ms is long enough), you give up the CPU, and the data, instruction and branch-prediction caches are disturbed or even cleared. Even making a system call like System.currentTimeMillis(), or the much more accurate System.nanoTime(), can do this to a small degree.
AFAIK, the only way to avoid giving up the core is to busy-wait and use thread affinity to lock your thread to a core. This minimises such disturbances and means your program can run 2-5x faster in low-latency situations, i.e. when sub-millisecond tasks matter.
For your interest
http://vanillajava.blogspot.co.uk/2012/01/java-thread-affinity-support-for-hyper.html
http://vanillajava.blogspot.co.uk/2012/02/how-much-difference-can-thread-affinity.html
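One quick way to test this theory against the posted program (a sketch; it replaces the Thread.sleep(sleepTime) call so the thread stays on its core instead of yielding it):

// busy-wait instead of sleeping, keeping the core and its caches warm
long deadline = System.nanoTime() + sleepTime * 1_000_000L;
while (System.nanoTime() < deadline) {
    // spin
}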
When your thread goes to sleep, you are essentially saying to the JVM: this thread is doing nothing for the next X milliseconds. The JVM is likely at that point to wake up various background threads to do their thing (GC, for example), which may well cause updates to the data stored in the processor cache. When your thread reawakes, some of its data may no longer be in the cache (fast), but may well have been shifted out to main memory (slow).
Take a look at http://mechanical-sympathy.blogspot.co.uk/ for more discussion of low level caching effects.
There is no guarantee that sleep() sleeps for exactly the length of time you specify. There is a specific statement in the Javadoc to that effect.
System.currentTimeMillis() has a system-dependent granularity which you are exposing by running relatively few iterations. You should multiply the iteration count by at least 10 to get out of the granularity region. On Windows I believe the granularity is as high as 16 ms.
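A quick probe of that granularity (a sketch; it spins until the reported millisecond value changes):

long t0 = System.currentTimeMillis();
long t1;
while ((t1 = System.currentTimeMillis()) == t0) {
    // spin until the clock ticks over
}
System.out.println("currentTimeMillis granularity is roughly " + (t1 - t0) + " ms");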
