I came across a simple Java program with two for loops. The question was whether these for loops would take the same time to execute, or whether the first would execute faster than the second.
Below is the program:
public static void main(String[] args) {
    Long t1 = System.currentTimeMillis();
    for (int i = 999; i > 0; i--) {
        System.out.println(i);
    }
    t1 = System.currentTimeMillis() - t1;

    Long t2 = System.currentTimeMillis();
    for (int j = 0; j < 999; j++) {
        System.out.println(j);
    }
    t2 = System.currentTimeMillis() - t2;

    System.out.println("for loop1 time : " + t1);
    System.out.println("for loop2 time : " + t2);
}
After executing this I found that the first for loop takes more time than the second. But after swapping their locations the result was the same: whichever for loop is written first always takes more time than the other. I was quite surprised by this result. Can anybody explain how the above program behaves?
The time taken by either loop will be dominated by I/O (i.e. printing to screen), which is highly variable. I don't think you can learn much from your example.
The first loop will allocate 1000 Strings in memory, while the second loop, regardless of whether it works forwards or backwards, can use the already allocated objects.
That said, when working with System.out.println, any allocation should be negligible in comparison.
Long (and the other primitive wrapper classes) has a cache (see the LongCache class) for values in the range -128...127. It is populated during the first loop's run.
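To illustrate (a standalone sketch, separate from the program above): values inside the cached range are reused, while larger values are boxed into fresh objects.

public class LongCacheDemo {
    public static void main(String[] args) {
        // Values in -128..127 are served from the cache, so the same object is reused.
        Long a = 100L;
        Long b = 100L;
        System.out.println(a == b);   // true: same cached instance

        // Values outside that range are boxed into fresh objects each time.
        Long c = 200L;
        Long d = 200L;
        System.out.println(c == d);   // false: two distinct objects
    }
}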
I think that if you are going to do a real benchmark, you should run the loops in different threads, use a much higher count (not just 1000), avoid I/O (no printing during the timed section), and not run them sequentially but one by one.
In my experience, executing the same code a few times can give different execution times on each run.
And in my opinion, the two tests won't differ.
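A minimal sketch along those lines, skipping the separate-threads part (the loop bound, warm-up count, and method names are my own choices): no I/O in the timed section, a larger bound, an untimed warm-up, and a result that is kept so the loops cannot be optimised away.

public class LoopDirectionBenchmark {
    static long countDown(int n) {
        long sum = 0;
        for (int i = n; i > 0; i--) {
            sum += i;                      // some work, no I/O
        }
        return sum;
    }

    static long countUp(int n) {
        long sum = 0;
        for (int j = 0; j < n; j++) {
            sum += j;
        }
        return sum;
    }

    public static void main(String[] args) {
        final int n = 10000000;
        long keep = 0;
        // Warm-up: let the JIT compile both methods before timing.
        for (int r = 0; r < 10; r++) {
            keep += countDown(n) + countUp(n);
        }

        long t1 = System.nanoTime();
        keep += countDown(n);
        t1 = System.nanoTime() - t1;

        long t2 = System.nanoTime();
        keep += countUp(n);
        t2 = System.nanoTime() - t2;

        // Printing keep ensures the results are used, so the loops are not removed.
        System.out.println("countDown: " + t1 + " ns, countUp: " + t2 + " ns (" + keep + ")");
    }
}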
Related
In the following code:
long startingTime = System.nanoTime();
int max = (int) Math.pow(2, 19);
for (int i = 0; i < max; ) {
    i++;
}
long timePass = System.nanoTime() - startingTime;
System.out.println("Time pass " + timePass / 1000000F);
I am trying to calculate how much time it takes to perform simple actions on my machine.
All the calculations up to the power of 19 increase the time it takes to run this code, but when I went above 19 (up to an exponent of 31, the int limit) I was amazed to discover that it has no effect on the time taken.
It always shows 5 milliseconds on my machine!
How can this be?
You have just witnessed HotSpot optimizing your entire loop to oblivion. It's smart. You need to do some real action inside the loop. I recommend introducing an int accumulator var and doing some bitwise operations on it, and finally printing the result to ensure it's needed after the loop.
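For example, something along these lines (the accumulator and the bit operations are arbitrary choices) keeps the loop body observable so the JIT cannot discard it:

long startingTime = System.nanoTime();
int max = (int) Math.pow(2, 19);
int acc = 0;
for (int i = 0; i < max; i++) {
    acc ^= i;              // cheap bitwise work whose result is used later
    acc += acc << 1;
}
long timePass = System.nanoTime() - startingTime;
// Printing acc makes the accumulated value observable, so the loop cannot be removed.
System.out.println("Time pass " + timePass / 1000000F + " (acc=" + acc + ")");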
On the HotSpot JVM, -XX:CompileThreshold=10000 by default. This means a loop which iterates 10K times can trigger the whole method to be optimised. In your case you are timing how long it takes to detect and compile (in the background) your method.
Use another System.nanoTime() call inside the loop; nothing can optimize that away:
long dummy = 0;
for (int i = 0; i < max; ) {
    i++;
    dummy += System.nanoTime();   // a side effect the JIT cannot remove
}
Don't forget to add:
System.out.println(dummy);
after the loop; using the accumulated value ensures the loop is not optimized away.
I would like to compare the performance (if there is any difference) of the two readDataMethod() variants illustrated below.
private void readDataMethod1(List<Integer> numbers) {
    final long startTime = System.nanoTime();
    for (int i = 0; i < numbers.size(); i++) {
        numbers.get(i);
    }
    final long endTime = System.nanoTime();
    System.out.println("method 1 : " + (endTime - startTime));
}

private void readDataMethod2(List<Integer> numbers) {
    final long startTime = System.nanoTime();
    int i = numbers.size();
    while (i-- > 0) {
        numbers.get(i);
    }
    final long endTime = System.nanoTime();
    System.out.println("method 2 : " + (endTime - startTime));
}
Most of the time the result I get shows that method 2 has a "lower" value.
Run readDataMethod1 readDataMethod2
1 636331 468876
2 638256 479269
3 637485 515455
4 716786 420756
Does this test prove that the readDataMethod2 is faster than the earlier one ?
Does this test prove that the readDataMethod2 is faster than the earlier one ?
You are on the right track in that you're measuring comparative performance, rather than making assumptions.
However, there are lots of potential issues to be aware of when writing micro-benchmarks in Java. I would recommend that you read
How do I write a correct micro-benchmark in Java?
In the first one, you are calling numbers.size() for each iteration.
Try storing it in a variable, and check again.
The reason the second version runs faster is that you are calling numbers.size() on each iteration in the first one. Storing it in a local variable would make the two almost the same.
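In other words, a sketch of the suggested change to the first method:

private void readDataMethod1(List<Integer> numbers) {
    final long startTime = System.nanoTime();
    final int size = numbers.size();      // hoisted out of the loop
    for (int i = 0; i < size; i++) {
        numbers.get(i);
    }
    final long endTime = System.nanoTime();
    System.out.println("method 1 : " + (endTime - startTime));
}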
Does this test prove that the readDataMethod2 is faster than the earlier one ?
As #aix says, you are on the right track. However, there are a couple of specific issues with your methodology:
It doesn't look like you are "warming up" the JVM. Therefore it is conceivable that your figures could be distorted by startup effects (JIT compilation) or that none of the code has been JIT compiled.
I'd also argue that your runs are doing too little work. 500000 nanoseconds is 0.0005 seconds, and that's not much work. The risk is that "other things" external to your application could be introducing noise into the measurements. I'd have more confidence in runs that take tens of seconds.
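A rough sketch of both points, using a helper that does the same reads as the question's methods but returns a value instead of printing (the names are mine): warm up untimed, then time a run large enough to take a noticeable amount of time.

// Same read pattern as readDataMethod1, but returning a value instead of printing,
// so repeated calls stay cheap and the work cannot be optimised away entirely.
private int readForward(List<Integer> numbers) {
    int sum = 0;
    for (int i = 0; i < numbers.size(); i++) {
        sum += numbers.get(i);
    }
    return sum;
}

private void benchmarkForward(List<Integer> numbers) {
    int keep = 0;
    // Warm-up: let the JIT compile readForward before the timed section.
    for (int r = 0; r < 1000; r++) {
        keep += readForward(numbers);
    }
    // Timed section: repeat enough times that the run takes well over a millisecond.
    final long start = System.nanoTime();
    for (int r = 0; r < 10000; r++) {
        keep += readForward(numbers);
    }
    System.out.println("forward: " + (System.nanoTime() - start) + " ns (" + keep + ")");
}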
I want to test the time of adding and getting items in a plain and a generic HashMap:
public void TestHashGeneric() {
    Map hashsimple = new HashMap();

    long startTime = System.currentTimeMillis();
    for (int i = 0; i < 100000; i++) {
        hashsimple.put("key" + i, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx");
    }
    for (int i = 0; i < 100000; i++) {
        String ret = (String) hashsimple.get("key" + i);
    }
    long endTime = System.currentTimeMillis();
    System.out.println("Hash Time " + (endTime - startTime) + " millisec");

    Map<String, String> hm = new HashMap<String, String>();
    startTime = System.currentTimeMillis();
    for (int i = 0; i < 100000; i++) {
        hm.put("key" + i, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx");
    }
    for (int i = 0; i < 100000; i++) {
        String ret = hm.get("key" + i);
    }
    endTime = System.currentTimeMillis();
    System.out.println("Hash generic Time " + (endTime - startTime) + " millisec");
}
The problem is that I get different times if I swap the two HashMap code sections!
If I put the loops (with the time printing, of course) for the generic map below the plain one, I get a better time for the generic one, and if I put the plain one below the generic one, I get a better time for the plain one!
The same happens if I use separate methods for this.
The JIT will compile and optimise your program while it is running, so the 2nd run will always be faster.
You should make the following modifications:
Run both tests untimed first, then re-run them timed so that you don't get affected by the JIT (see the sketch after this list).
You should use System.nanoTime() as this is more accurate for timing (you should never get a diff of 0).
You should also test against some empty methods, as you are also timing the string concatenation operation in each loop.
Also note that in Java generic types are erased, so there should be no runtime difference at all.
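A minimal sketch of the first two suggestions (the class and method names are mine; the loop bodies mirror the question):

import java.util.HashMap;
import java.util.Map;

public class HashTimingSketch {
    // Same work as in the question, factored into one reusable method.
    static void exercise(Map<String, String> map) {
        for (int i = 0; i < 100000; i++) {
            map.put("key" + i, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx");
        }
        for (int i = 0; i < 100000; i++) {
            map.get("key" + i);
        }
    }

    public static void main(String[] args) {
        // Untimed warm-up run so the JIT has compiled the code before measuring.
        exercise(new HashMap<String, String>());

        // Timed run, using nanoTime for better resolution.
        long start = System.nanoTime();
        exercise(new HashMap<String, String>());
        System.out.println("Hash Time " + (System.nanoTime() - start) / 1000000 + " millisec");
    }
}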
The Java runtime is pretty sophisticated: it does some learning/optimisation while it's running. You might be able to get the test you want by "warming up" the JVM first. Try calling TestHashGeneric() twice and see what the second set of results gives you.
You've also got twice as much stuff in memory on the second run. There are all sorts of variables that can affect this test.
This is not the correct way to perform micro-benchmarks, as doing it properly is very involved (Why?). I suggest you use the Caliper framework for this.
When you have a single loop which reaches the compile threshold (about 10K iterations), the whole method is compiled. This can make either the first loop appear faster (as it is optimised correctly while the second loop has no profile information yet) or the second loop appear faster (as it is compiled before it even starts).
The simplest way to fix this is to place each test in its own method, so that they are compiled and optimised independently. (The order can still matter, but it is less important.) I would still suggest running the test a number of times to see how the results vary.
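Concretely, that structure could look something like this (the method names and repetition count are my own choices):

import java.util.HashMap;
import java.util.Map;

public class SplitTests {
    // Each variant gets its own method, so HotSpot compiles and optimises it on its own.
    static long timeRaw() {
        Map map = new HashMap();
        long start = System.nanoTime();
        for (int i = 0; i < 100000; i++) { map.put("key" + i, "value"); }
        for (int i = 0; i < 100000; i++) { map.get("key" + i); }
        return System.nanoTime() - start;
    }

    static long timeGeneric() {
        Map<String, String> map = new HashMap<String, String>();
        long start = System.nanoTime();
        for (int i = 0; i < 100000; i++) { map.put("key" + i, "value"); }
        for (int i = 0; i < 100000; i++) { map.get("key" + i); }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Repeat the whole comparison several times; the early iterations include JIT work.
        for (int run = 0; run < 5; run++) {
            System.out.println("run " + run + ": raw=" + timeRaw() + " ns, generic=" + timeGeneric() + " ns");
        }
    }
}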
I'm writing an algorithm which does a big loop over an integer array from the end to the beginning, with an if condition inside. The first time the condition is false, the loop can be terminated.
So, with a for loop, when the condition is false it continues to iterate, only performing simple variable updates.
With a while loop that uses the condition as its test, the loop stops as soon as the condition is false, which should save some iterations.
However, the while loop remains a little slower than the for loop!
But if I add an int counter and count iterations, the for loop, as expected, performs many more iterations.
Yet this time the execution of the modified for method with the counter is much slower than the while method with a counter!
Any explanations?
Here is the code with the for loop:
for (int i = pairs.length - 1; i >= 0; i -= 2) {
    //cpt++;
    u = pairs[i];
    v = pairs[i - 1];
    duv = bfsResult.distanceMatrix.getDistance(u, v);
    if (duv > delta) {
        execute();
    }
}
Execution time: 6473
Execution time with a counter: 8299
Iterations counted: 2584401
Here is the code with the while loop:
int i = pairs.length - 1;
u = pairs[i];
v = pairs[i - 1];
duv = bfsResult.distanceMatrix.getDistance(u, v);
while (duv > delta) {
    //cpt++;
    execute();
    u = pairs[i -= 2];
    v = pairs[i - 1];
    duv = bfsResult.distanceMatrix.getDistance(u, v);
}
Execution time: 6632
Execution time with a counter: 7163
Iterations counted: 9793
Time is in ms. I repeated the experiment several times with different-sized instances, and the measurements remained almost the same. The execute() method updates the delta value. The getDistance() method is just an int[][] matrix access.
Thanks for any help.
Before you try to perform any performance tests in Java, I highly recommend reading this article:
http://www.ibm.com/developerworks/java/library/j-benchmark1/index.html
In a few words: when running for some time, a HotSpot-enabled JVM can optimize your code, which will affect the results of such tests. So you need a proper technique to test the performance of your code.
To ease the pain there is a library used for performing proper tests: http://ellipticgroup.com/html/benchmarkingArticle.html
You can find links to both parts of the article on this page.
Update: to help you start quicker, here is what you need to do:
Download bb.jar, jsci-core.jar and mt-13.jar, which can be found on that page
Put them on the classpath
Rewrite your code so that the while-loop approach and the for-loop approach are separate implementations of the Runnable or Callable interface
In your main method just invoke
System.out.println(new Benchmark(new WhileApproach()));
to show the execution time for the while loop and, obviously,
System.out.println(new Benchmark(new ForApproach()));
to get the info for the for loop.
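A rough sketch of step 3 for the for-loop side (WhileApproach would look analogous); the loop body here is a stand-in, and the real getDistance()/execute() logic from the question would go in its place:

class ForApproach implements Runnable {
    private final int[] pairs;
    private long result;    // kept so the work is not optimised away

    ForApproach(int[] pairs) {
        this.pairs = pairs;
    }

    public void run() {
        for (int i = pairs.length - 1; i >= 1; i -= 2) {
            result += pairs[i] + pairs[i - 1];   // stand-in for the getDistance()/execute() logic
        }
    }
}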
You do not have the same termination condition. For the while loop it's:
duv > delta
and for the for loop it's
i >= 0
The two scenarios are not equivalent. My guess is that the while-loop condition becomes false much sooner than the for condition, and therefore it executes far fewer iterations.
Once duv <= delta the while loop stops, but the for loop keeps iterating. Both get the same result, but the for loop keeps checking elements it no longer needs. You should modify the for loop body like this:
if (duv > delta) {
    execute();
} else {
    break;
}
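With that change the for loop becomes (same variables as in the question):

for (int i = pairs.length - 1; i >= 0; i -= 2) {
    u = pairs[i];
    v = pairs[i - 1];
    duv = bfsResult.distanceMatrix.getDistance(u, v);
    if (duv > delta) {
        execute();
    } else {
        break;   // same early exit as the while version
    }
}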
I'm trying to alter some code so it can work with multithreading. I stumbled upon a performance loss when putting a Runnable around some code.
For clarification: The original code, let's call it
//doSomething
got a Runnable around it like this:
Runnable r = new Runnable() {
    public void run() {
        //doSomething
    }
};
Then I submit the Runnable to a CachedThreadPool ExecutorService. This is my first step towards multithreading this code, to see if the code runs as fast with one thread as the original code does.
However, this is not the case. Where //doSomething executes in about 2 seconds, the Runnable executes in about 2.5 seconds. I should mention that some other code, say //doSomethingElse, inside a Runnable had no performance loss compared to its original version.
My guess is that //doSomething has some operations that are not as fast when running in a thread, but I don't know what they could be or what, in that respect, the difference with //doSomethingElse is.
Could it be the use of final int[]/float[] arrays that makes a Runnable so much slower? The //doSomethingElse code also used some finals, but //doSomething uses more. This is the only thing I could think of.
Unfortunately, the //doSomething code is quite long and out of context, but I will post it here anyway. For those who know the Mean Shift segmentation algorithm, this is a part of the code where the mean shift vector is calculated for each pixel. The for loop
for (int i = 0; i < L; i++)
runs through each pixel.
timer.start(); // this is where I start the timer

// Initialize mode table used for basin of attraction
char[] modeTable = new char[L]; // (L is a class property and is about 100,000)
Arrays.fill(modeTable, (char) 0);
int[] pointList = new int[L];

// Allocate memory for yk (current vector)
double[] yk = new double[lN]; // (lN is a final int, defined earlier)
// Allocate memory for Mh (mean shift vector)
double[] Mh = new double[lN];

int idxs2 = 0;
int idxd2 = 0;

for (int i = 0; i < L; i++) {
    // if a mode was already assigned to this data point
    // then skip this point, otherwise proceed to
    // find its mode by applying mean shift...
    if (modeTable[i] == 1) {
        continue;
    }

    // initialize point list...
    int pointCount = 0;

    // Assign window center (window centers are
    // initialized by createLattice to be the point
    // data[i])
    idxs2 = i * lN;
    for (int j = 0; j < lN; j++) {
        yk[j] = sdata[idxs2 + j]; // (sdata is an earlier defined final float[] of about 100,000 items)
    }

    // Calculate the mean shift vector using the lattice
    /*****************************************************/

    // Initialize mean shift vector
    for (int j = 0; j < lN; j++) {
        Mh[j] = 0;
    }
    double wsuml = 0;
    double weight;

    // find bucket of yk
    int cBucket1 = (int) yk[0] + 1;
    int cBucket2 = (int) yk[1] + 1;
    int cBucket3 = (int) (yk[2] - sMinsFinal) + 1;
    int cBucket = cBucket1 + nBuck1 * (cBucket2 + nBuck2 * cBucket3);
    for (int j = 0; j < 27; j++) {
        idxd2 = buckets[cBucket + bucNeigh[j]]; // (buckets is a final int[] of about 75,000 items)
        // list parse, crt point is cHeadList
        while (idxd2 >= 0) {
            idxs2 = lN * idxd2;
            // determine if inside search window
            double el = sdata[idxs2 + 0] - yk[0];
            double diff = el * el;
            el = sdata[idxs2 + 1] - yk[1];
            diff += el * el;
            //...
            idxd2 = slist[idxd2]; // (slist is a final int[] of about 100,000 items)
        }
    }
    //...
}
timer.end(); // this is where I stop the timer.
There is more code, but the last while loop was where I first noticed the difference in performance.
Could anyone think of a reason why this code runs slower inside a Runnable than in its original form?
Thanks.
Edit: The measured time is taken inside the code, so it excludes the startup of the thread.
All code always runs "inside a thread".
The slowdown you see is most likely caused by the overhead that multithreading adds. Try parallelizing different parts of your code - the tasks should neither be too large, nor too small. For example, you'd probably be better off running each of the outer loops as a separate task, rather than the innermost loops.
There is no single correct way to split up tasks, though, it all depends on how the data looks and what the target machine looks like (2 cores, 8 cores, 512 cores?).
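As an illustration of that idea, here is a sketch of splitting the outer per-pixel loop into a handful of chunks; processPixel() is a placeholder for the question's loop body, and this assumes the per-pixel work can safely run in parallel:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ChunkedWork {
    // Placeholder for the per-pixel work done by the question's outer loop.
    static void processPixel(int i) {
        // mean shift computation for pixel i would go here
    }

    public static void main(String[] args) throws Exception {
        final int L = 100000;   // roughly the size mentioned in the question
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        // One task per chunk of the outer loop, not one task per pixel:
        // tasks that are too small drown in scheduling overhead.
        final int chunk = (L + threads - 1) / threads;
        List<Callable<Void>> tasks = new ArrayList<Callable<Void>>();
        for (int start = 0; start < L; start += chunk) {
            final int from = start;
            final int to = Math.min(start + chunk, L);
            tasks.add(new Callable<Void>() {
                public Void call() {
                    for (int i = from; i < to; i++) {
                        processPixel(i);
                    }
                    return null;
                }
            });
        }
        pool.invokeAll(tasks);   // blocks until every chunk has finished
        pool.shutdown();
    }
}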
Edit: What happens if you run the test repeatedly? E.g., if you do it like this:
Executor executor = ...;

for (int i = 0; i < 10; i++) {
    final int lap = i;
    Runnable r = new Runnable() {
        public void run() {
            long start = System.currentTimeMillis();
            //doSomething
            long duration = System.currentTimeMillis() - start;
            System.out.printf("Lap %d: %d ms%n", lap, duration);
        }
    };
    executor.execute(r);
}
Do you notice any difference in the results?
I personally do not see any reason for this. Any program has at least one thread, and all threads are equal: by default they are created with normal priority (5). So the code should show the same performance in both the main application thread and any other thread that you start.
Are you sure you are measuring the time of "doSomething" and not the overall time that your program runs? I believe that you are measuring the time of the operation together with the time that is required to create and start the thread.
When you create a new thread you always have an overhead. If you have a small piece of code, you may experience a performance loss.
Once you have more code (bigger tasks), you may get a performance improvement from your parallelization (the code on the thread will not necessarily run faster, but you are doing two things at once).
Just a detail: deciding how small a task can be while still being worth parallelizing is a well-known topic in parallel computing :)
You haven't explained exactly how you are measuring the time taken. Clearly there are thread start-up costs but I infer that you are using some mechanism that ensures that these costs don't distort your picture.
Generally speaking when measuring performance it's easy to get mislead when measuring small pieces of work. I would be looking to get a run of at least 1,000 times longer, putting the whole thing in a loop or whatever.
Here the one difference between the "no thread" and "threaded" cases is actually that you have gone from one thread (as has been pointed out, you always have a thread) to two threads, so now the JVM has to mediate between two threads. For this kind of work I can't see why that should make a difference, but it is a difference.
I would want to be using a good profiling tool to really dig into this.