I read this in an article, but the answer given there is not clear.
1. Is it true?
2. Can anyone explain it better?
3. Is there a document that explains the Java/JVM caching mechanism overall?
**Which one is faster in Java?**
for(int i = 100000; i > 0; i--) {}
for(int i = 1; i < 100001; i++) {}
Answer: Whichever is run second will be the fastest. The server JVM can detect and
eliminate loops which don't do anything. A method with either loop is compiled once
the loop has iterated about 10,000 times (based on -XX:CompileThreshold=10000). The
first loop will take time while the JVM detects that it doesn't do anything, whereas
the second will already have been compiled.
Java is a high-level language, so there will be a difference between the code you write and the code that actually runs: the compiler and the JVM try to optimize your code. Once a loop has iterated about 10,000 times, the containing method is compiled, and the JIT can detect that an empty loop does nothing and eliminate it. Whichever loop runs first pays the cost of that detection; the loop that runs second benefits from code that has already been compiled.
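For illustration, here is a deliberately naive timing sketch of the effect (my own example, not from the article; a real measurement would use a harness such as JMH, and on a modern JVM both empty loops may be eliminated entirely, making both times near zero):

public class LoopOrder {
    public static void main(String[] args) {
        long t1 = System.nanoTime();
        for (int i = 100000; i > 0; i--) { }
        long t2 = System.nanoTime();
        for (int i = 1; i < 100001; i++) { }
        long t3 = System.nanoTime();
        // whichever loop runs second tends to benefit from JIT warm-up
        System.out.println("first loop:  " + (t2 - t1) + " ns");
        System.out.println("second loop: " + (t3 - t2) + " ns");
    }
}

Swapping the order of the two loops should swap which one looks faster, which is the point of the quoted answer.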
Related
Out of curiosity, I recently wrote two nested for loops in Java and both of them simply counted to 1 billion (1'000'000'000).
Surprisingly, Java had this task done in less than one second. In other languages this would never be done that quickly.
Another weird thing is that when I added a third nested for loop, the program did not seem to come to an end.
Can someone tell me where that speed comes from?
Edit:
Following is my code:
for (int i = 0; i < 1000000000; i++) {
    for (int r = 0; r < 1000000000; r++) { }
}
System.out.println("done");
The JIT compiler optimizes the loop and removes it, since it has no observable effect. But this isn't the case if you use a volatile int:
static volatile int i;

public static void main(String[] args) {
    for (i = 0; i < 1000000000; i++);
}
The above loop will take a long time, because now the JIT compiler can't optimize the loop away: every read and write of a volatile field must actually be performed.
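A rough way to see the difference (an illustrative sketch, not a rigorous benchmark; exact numbers will vary by JVM and hardware):

public class VolatileLoop {
    static volatile int v;
    static int plain;

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        for (plain = 0; plain < 1000000000; plain++); // may be collapsed by the JIT
        long t1 = System.nanoTime();
        for (v = 0; v < 1000000000; v++);             // volatile accesses must happen
        long t2 = System.nanoTime();
        System.out.println("plain:    " + (t1 - t0) / 1000000 + " ms");
        System.out.println("volatile: " + (t2 - t1) / 1000000 + " ms");
    }
}

Expect the volatile version to take seconds while the plain version finishes almost instantly once the JIT has kicked in.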
Assuming you're using Java 8: a great deal of performance optimization has gone into Java over the years. As benchmarks show, for and while loops are much faster than iterators; just going through a loop is one of the fastest operations you can perform. As mentioned by @kevin in the comments, the compiler is most likely eliminating iterations of the loop to optimize the code. This is also why the poor man's sleep functions (counting to a large number) were never really accurate: compilers optimize the counting away, which makes these do-it-yourself sleep functions unreliable.
The reason the third loop does not finish is that you are exponentially increasing the amount of work: you are effectively trying to iterate 1 billion times, then 1 billion to the second power, then 1 billion to the third power.
Also, what languages are you comparing these results to, and are you sure the code is equivalent? For your third case you may just need to wait for it to finish executing, assuming you don't get an Error/Exception.
The compiler is optimizing the code by removing your loops entirely, since they have no effect whatsoever on the program.
Try it this way and see what happens:
int i, r = 0;
for (i = 0; i < 1000000000; i++) {
    for (r = 0; r < 1000000000; r++) { }
}
System.out.println(String.format("Done. i=%d r=%d", i, r));
Now I'm forcing the compiler to keep the loops, since i and r are accessed both inside and outside the loops.
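An alternative sketch of the same idea (my example): make the loop produce a value that is used afterwards, so the JIT cannot discard the body as dead code. A sufficiently clever compiler could still replace this particular sum with a closed-form expression, but HotSpot generally does not.

long sum = 0;
for (int i = 0; i < 1000000000; i++) {
    sum += i; // the result is printed below, so the loop body is observable
}
System.out.println("sum=" + sum);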
I've seen that the JITC uses unsigned comparison for checking array bounds (the test 0 <= x < LIMIT is equivalent to x ≺ LIMIT, where ≺ treats the numbers as unsigned quantities). So I was curious whether it works for arbitrary comparisons of the form 0 <= x < LIMIT as well.
The results of my benchmark are pretty confusing. I've created three experiments of the form
for (int i = 0; i < LENGTH; ++i) {
    int x = data[i];
    if (condition) result += x;
}
with different conditions:
0 <= x, called above
x < LIMIT, called below
0 <= x && x < LIMIT, called inRange
0 <= x & x < LIMIT, called inRange2
and prepared the data so that the probabilities of the condition being true are the same.
The results should be fairly similar, just above might be slightly faster as it compares against zero. Even if the JITC couldn't use the unsigned comparison for the test, the results for above and below should still be similar.
Can anyone explain what's going on here? It's quite possible that I did something wrong...
Update
I'm using Java build 1.7.0_51-b13 on Ubuntu (kernel 2.6.32-54-generic) with an i5-2400 CPU @ 3.10GHz, in case anybody cares. As the results for inRange and inRange2 near 0.00 are especially confusing, I re-ran the benchmark with more steps in this area.
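A self-contained reconstruction of the benchmark described above might look like the following. This is a sketch: the array size, random seed, LIMIT value, and the uniformly random data are placeholders, whereas the original prepared its data so that each condition held with equal probability.

public class RangeCheckBench {
    static final int LENGTH = 10_000_000; // placeholder size
    static final int LIMIT = 1 << 16;     // placeholder limit

    public static void main(String[] args) {
        int[] data = new int[LENGTH];
        java.util.Random rnd = new java.util.Random(42);
        for (int i = 0; i < LENGTH; ++i) {
            data[i] = rnd.nextInt(); // uniform over the whole int range
        }
        for (int run = 0; run < 5; run++) { // repeat so the JIT can compile each variant
            long t0 = System.nanoTime(); long a = above(data);
            long t1 = System.nanoTime(); long b = below(data);
            long t2 = System.nanoTime(); long c = inRange(data);
            long t3 = System.nanoTime(); long d = inRange2(data);
            long t4 = System.nanoTime();
            System.out.printf("above %.1f ms, below %.1f ms, inRange %.1f ms, inRange2 %.1f ms (%d %d %d %d)%n",
                    (t1 - t0) / 1e6, (t2 - t1) / 1e6, (t3 - t2) / 1e6, (t4 - t3) / 1e6, a, b, c, d);
        }
    }

    static long above(int[] data) {
        long result = 0;
        for (int i = 0; i < data.length; ++i) { int x = data[i]; if (0 <= x) result += x; }
        return result;
    }

    static long below(int[] data) {
        long result = 0;
        for (int i = 0; i < data.length; ++i) { int x = data[i]; if (x < LIMIT) result += x; }
        return result;
    }

    static long inRange(int[] data) {
        long result = 0;
        for (int i = 0; i < data.length; ++i) { int x = data[i]; if (0 <= x && x < LIMIT) result += x; }
        return result;
    }

    static long inRange2(int[] data) {
        long result = 0;
        for (int i = 0; i < data.length; ++i) { int x = data[i]; if (0 <= x & x < LIMIT) result += x; }
        return result;
    }
}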
The likely variation in the benchmark results has to do with CPU caching at different levels.
Since primitive ints are being used, there is no JVM-specific caching going on, as would happen with auto-boxed Integer values.
Thus all that remains, given the minimal memory consumption of the data[] array, is CPU caching of low-level values/operations. Since, as described, the values are random with given statistical probabilities of the conditions being true across the tests, the likely cause is that, depending on the values, more or less (random) caching is going on for each test, leading to more randomness in the results.
Further, depending on how isolated the computer is (background services/processes), the test cases may not be running in complete isolation. Ensure that everything is shut down for these tests except the core OS functions and the JVM. Set the JVM minimum and maximum memory to the same value, and shut down any networking processes, updates, etc.
Are your test results the average of a number of runs, or did you only test each function once?
One thing I have found is that the first time you run a for loop, the JVM will interpret it; then, the more it runs, the more the JVM optimizes it. Therefore the first few runs may get horrible performance, but after a few runs the code will be near native performance.
I also figured out that a loop will not be optimized while it is running. I have not tested whether this applies to just the loop or to the whole function. If it only applies to the loop, you may get much more performance by splitting the work into an inner and an outer loop and working with your data one block at a time. If it is the whole function, you will have to place the inner loop in its own function.
Also, run the test more than once; if you compare the generated code, you will notice how the JIT optimizes it in stages.
For most code this gives Java optimal performance: it can skip costly optimization of code that runs rarely and make code that runs often a lot faster. However, if you have a code block that runs only once but for a long time, it will stay horribly slow.
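A simple way to watch this warm-up happen (an illustrative sketch of mine; you can also run with -XX:+PrintCompilation to see the JIT's own log of what it compiles):

public class WarmUp {
    public static void main(String[] args) {
        for (int run = 0; run < 10; run++) {
            long start = System.nanoTime();
            long sum = 0;
            for (int i = 0; i < 10_000_000; i++) {
                sum += i;
            }
            long elapsed = System.nanoTime() - start;
            // early runs execute interpreted; later runs use JIT-compiled code
            System.out.println("run " + run + ": " + elapsed / 1_000_000.0 + " ms (sum=" + sum + ")");
        }
    }
}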
I was reading about increment/decrement operators, and I noticed that a loop in decrementing form runs faster than the same loop in incrementing form. I was expecting both to take equal time, since the same number of steps is performed. I searched the web but could not find a convincing answer. Is it because the decrement operator takes less time than the increment operator?
for(int i = 100000; i > 0; i--) {}
for(int i = 1; i < 100001; i++) {}
This is because in bytecode, comparison with 0 is a different operation from comparison with a non-zero number. i < 100001 requires the constant to be loaded onto the stack first and then a two-operand comparison to be executed, while i > 0 is executed as a single operation. Of course, in most cases there will be no speed difference because of JVM optimizations, but we can try to make the difference visible by running the code with the -Xint option (interpreted-mode execution only).
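You can see this by compiling the two loops and disassembling them with javap -c. Roughly (exact constant-pool indices and branch offsets vary by compiler), the two condition checks look like this:

// condition of: for (int i = 100000; i > 0; i--)
iload_1
ifgt <loop body>       // one instruction: branch on comparison with zero

// condition of: for (int i = 1; i < 100001; i++)
iload_1
ldc #2                 // int 100001 -- the constant is pushed first
if_icmplt <loop body>  // then a two-operand integer comparison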
Piyush Bhardwaj
I tested both loops in an online compiler, but my increment loop executed faster than the decrement loop.
Program execution depends upon many factors: when we run the same program on the same machine many times, we sometimes get different execution times.
See the results
for(int i = 1; i < 100001; i++) {
}
Increment loop -- http://ideone.com/irdY0e
for(int i = 100000; i > 0; i--) {
}
Decrement loop -- http://ideone.com/yDO9Jf
Sir Evgeniy Dorofeev has given an excellent explanation, one which only an expert can give.
Finally, you need to consider the performance of your CPU. When considering a benchmark to determine the overall performance of a Java application, bear in mind that bytecode execution, native code execution, and graphics each play a role. Their impact varies depending on the nature of the specific application.
I would like to ask more experienced developers about one simple, but for me not obvious, thing. Assume you have code like this (Java):
for (int i = 0; i < vector.size(); i++) {
    // make some stuff here
}
I come across such statements very often, so maybe there is nothing wrong with them. But to me it seems unnecessary to invoke the size method on each iteration. I would use this approach instead:
int vectorSize = vector.size();
for (int i = 0; i < vectorSize; i++) {
    // make some stuff here
}
The same thing here:
for (int i = 0; i < myTreeNode.getChildren().size(); i++) {
    // make some stuff here
}
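which could presumably be hoisted the same way (a sketch; the list's element type depends on the tree class, so a wildcard is used here):

java.util.List<?> children = myTreeNode.getChildren();
int childCount = children.size();
for (int i = 0; i < childCount; i++) {
    // make some stuff here
}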
I am definitely not an expert in programming yet, so my question is: am I looking for a problem where none exists, or is it important to take care of such details in professional code?
A method invocation requires the JVM to do additional work, so what you're doing seems, at first glance, like an optimization.
However, some JVM implementations are smart enough to inline method calls, and for those, the difference will be nonexistent.
The Android programming guidelines for example always recommend doing what you've pointed out, but again, the JVM implementation manual (if you can get your hands on one) will tell you if it optimizes code for you or not.
Usually size() is a small constant-time operation, so the cost of calling size() is trivial compared to the cost of executing the loop body, and the just-in-time compiler may be taking care of this optimization for you; therefore, there may not be much benefit to it.
That said, this optimization does not adversely affect code readability, so it isn't something to be avoided. Optimizations that improve speed only by a small factor (as opposed to, say, an optimization that turns an O(n) operation into an O(1) operation) while hurting readability are the ones that should be avoided. For example, you can unroll a loop:
int i;
int vectorSizeDivisibleBy4 = vectorSize - vectorSize % 4; // the largest multiple of four <= vectorSize
for (i = 0; i < vectorSizeDivisibleBy4; i += 4) {
    // loop body executed on [i]
    // second copy of loop body executed on [i+1]
    // third copy of loop body executed on [i+2]
    // fourth copy of loop body executed on [i+3]
}
for (; i < vectorSize; i++) { // in case vectorSize wasn't a multiple of four
    // loop body
}
By unrolling the loop four times you reduce the number of times that i < vectorSize is evaluated by a factor of four, at the cost of making your code an unreadable mess (it might also muck up the instruction cache, resulting in a negative performance impact). Don't do this. But, like I said, int vectorSize = vector.size() doesn't fall into this category, so have at it.
At first sight the alternative you are suggesting seems like an optimization, but in terms of speed it is identical to the common approach, because:
the size() call on a Java Vector is O(1): each vector always stores a variable containing its size, so the size doesn't need to be calculated on each iteration; you just read it.
Note: you can see that the size() function in http://www.docjar.com/html/api/java/util/Vector.java.html just returns a protected variable, elementCount.
for (int i=0; i<arr.length; i++) {
}
This will compile to bytecode containing:
getstatic #4;
arraylength
While the following code:
int length = arr.length;
for (int i=0; i<length; i++) {
}
will be compiled as:
iload_3
Is there a difference between the two snippets? Which code runs faster?
As you can see, the array is a static member in my case, static and final to be exact. Taking JIT optimization into account, a basic optimizer can sense that and hard-code the length of the array into the machine code of the method. It is much harder to follow this logic with a local variable (the second case), so one would think there is a greater chance that the first form will be optimized than the second.
As it's static and final, I suspect it could hard-code the length, although I'm not sure it would go that far. But the JIT compiler may well still be able to do better with the first form than the second.
In particular, if it can detect that the array doesn't change within the loop, it can avoid evaluating the length more than once and remove array bounds checks within the loop - it can validate that you're never going to access the array outside the range [0, length).
I would hope that by now, decent JITs would notice the second form too - but I'd still prefer the first form for readability, and I'd want evidence of it not performing as well as the second before changing to that one.
As ever, write the most readable code first, but measure it against performance requirements.
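One pattern that keeps the readable arr.length form while giving the JIT a provably invariant reference is to copy the field into a local first (my illustration, not from the answers above; ARR is a stand-in for the static final field from the question):

int[] arr = ARR;  // copy the field into a local once
long sum = 0;
for (int i = 0; i < arr.length; i++) {
    // with arr provably unchanged inside the loop, the JIT can hoist
    // arr.length and may eliminate the per-access bounds checks
    sum += arr[i];
}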
Is there a difference between the two snippets?
It depends. If the arr variable is updated in the body of the loop, then the two code snippets are semantically different.
Which code runs faster?
It is impossible to say. It depends on the native code generated by the JIT compiler, and that can vary from one patch release to the next. The only way to know for sure is to dump the native code and examine it in detail, or to benchmark the code. Either way, the difference is usually too small to be worth worrying about.
One optimisation the JVM performs is to avoid bounds-checking the array on every access; instead, it can bounds-check just the first and last accesses.
However, it is possible that some micro-optimisation will confuse the JVM and you will end up with slower, less optimised code.
The form I use when micro-optimising is
for (int i = 0, length = methodCall(arr); i < length; i++) {
    // use the array.
}
I prefer to use the simplest and most obvious solution to a problem because the JVM is most likely to optimise this use case best.