I think the question should be self-explanatory, and the language I'm thinking about right now is Java, but it probably applies across all languages.
That being said, basically what I'm talking about is whether this:
// Initialize first
int i = 0;
for (i = 0; i < x; i++) {
// do some stuff
}
for (i = 0; i < x; i++) {
// do some more stuff
}
for (i = 0; i < x; i++) {
// do other stuff
}
Is better than this:
// Initializing i in the for loop
for(int i = 0; i < x; i++) {
// do some stuff
}
for(int i = 0; i < x; i++) {
// do some more stuff
}
for(int i = 0; i < x; i++) {
// do other stuff
}
This is a performance question: I'm asking about initializing the variable once in total versus once per scope.
I performed a performance test with x=10 to evaluate the performance difference between the in-loop declaration method and the out-of-loop declaration method.
Details: I ran the code 300x with in-loop first and then 300x with out-of-loop first. Each run, I recorded the total runtime in nanoseconds to execute each method 10,000 times. So, I recorded a total of 1200 observations (600 per method). To measure steady-state performance (vice startup performance), I removed the 20 observations from each data set that had the longest duration. (The mean runtime for the 20 startup observations was an order of magnitude larger than the mean runtime for all the other observations.)
Results: A single-factor ANOVA indicates that the in-loop declaration is faster than the out-of-loop declaration (p-value=8.12584E-07). The mean runtimes were 158635.4931 nanoseconds for in-loop and 166943.7397 nanoseconds for out-of-loop. From a practical standpoint, we're talking about a difference of ~0.01ms per 10,000 iterations.
Conclusion: Just use the in-loop declaration. @FallAndLearn also points out that the in-loop declaration is easier to maintain, because the local variable i is declared with the smallest possible scope.
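For anyone who wants to rerun something like this, here is a minimal sketch of that kind of timing harness. The loop bodies, the warm-up count, and x are placeholders of my own choosing, not the exact harness used for the numbers above:

```java
public class LoopDeclBenchmark {
    static final int X = 10;
    static final int REPS = 10_000;

    // Declaration inside each for statement.
    static long timeInLoop() {
        long start = System.nanoTime();
        for (int r = 0; r < REPS; r++) {
            int sum = 0;
            for (int i = 0; i < X; i++) { sum += i; }
            for (int i = 0; i < X; i++) { sum += i; }
            if (sum == -1) System.out.print(""); // keep the work observable
        }
        return System.nanoTime() - start;
    }

    // Single declaration outside, reused by both loops.
    static long timeOutOfLoop() {
        long start = System.nanoTime();
        for (int r = 0; r < REPS; r++) {
            int sum = 0;
            int i;
            for (i = 0; i < X; i++) { sum += i; }
            for (i = 0; i < X; i++) { sum += i; }
            if (sum == -1) System.out.print("");
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Warm up so the JIT compiles both methods before the timed runs.
        for (int w = 0; w < 100; w++) { timeInLoop(); timeOutOfLoop(); }
        System.out.println("in-loop:     " + timeInLoop() + " ns");
        System.out.println("out-of-loop: " + timeOutOfLoop() + " ns");
    }
}
```

The absolute numbers will vary per machine and JVM; only the relative comparison is meaningful, and even then a proper harness (JMH) would be more trustworthy.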
Your first piece of code is better than the second one, because int is a value type and its value is stored on the stack once you initialize it; later on you just assign that variable a value again and again.
In the second piece of code, on the other hand, you are initializing i three times, i.e. creating stack entries three times.
So the first piece of code is better than the second one, performance-wise.
The scope of local variables should always be the smallest possible.
Hence, if int i is not used outside the loop, then the second way is always better. It is more readable, too.
Performance-wise they are both the same. From a maintenance perspective, the second option is better.
Also, the answer to this question depends on your requirements: whether other code depends on i outside the loops, or whether you really have only these three for statements.
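To make that trade-off concrete, here is a small hypothetical example of the one situation where the out-of-loop declaration is actually required: the index is needed after the loop ends.

```java
public class FindIndex {
    // Returns the index of target, or values.length if absent. The index
    // variable must be declared outside the loop because it is read afterwards.
    static int indexOf(int[] values, int target) {
        int i;                              // needed after the loop
        for (i = 0; i < values.length; i++) {
            if (values[i] == target) {
                break;
            }
        }
        return i;                           // still in scope here
    }

    public static void main(String[] args) {
        int[] values = {4, 8, 15, 16, 23, 42};
        System.out.println(indexOf(values, 16)); // prints 3
        System.out.println(indexOf(values, 7));  // prints 6 (i.e. not found)
    }
}
```

If the index is not needed afterwards, nothing forces the wider scope, and the in-loop declaration is the cleaner choice.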
Let's check out the disassembled code for the following snippet:
public class Test {
public static void main(String[] args) {
int i = 0;
for(i = 0; i < 3; i++){
//do some stuff
}
}
}
public class Test {
public Test();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: iconst_0
1: istore_1
2: iconst_0
3: istore_1
4: iload_1
5: iconst_3
6: if_icmpge 15
9: iinc 1, 1
12: goto 4
15: return
}
Now let's generate another one for the initialization of the control variable inside the loop:
public class Test {
public Test();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: iconst_0
1: istore_1
2: iload_1
3: iconst_3
4: if_icmpge 13
7: iinc 1, 1
10: goto 2
13: return
}
They're not the same. I'm not a bytecode expert, but I can tell that the second one has less overhead: the first pushes an int constant twice (the two iconst_<i> instructions) and has two istore_<n> instructions, compared to one of each in the second code.
Related
I'm developing code with lots of iterations, and I was wondering which one of these conditions is more efficient.
//1
Boolean.FALSE.equals(x)
//2
x == false
//3
!x
I am using the first one, but I am not sure about it. If someone can give me some information and help, I will appreciate it.
The second and third ones should be the fastest. The first one involves extra overhead, although it could well be that the JIT compiler optimises it away.
The issue is more about readability. The first one is practically unreadable.
At popular request, I've expanded the answer a bit. I wrote this class:
package com.severityone.test;
public class Main {
public static void main(String[] args) {
final boolean x = false;
final boolean a = Boolean.FALSE.equals(x);
final boolean b = x == false;
final boolean c = !x;
}
}
This is the resulting byte code:
Compiled from "Main.java"
public class com.severityone.test.Main {
public com.severityone.test.Main();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: iconst_0
1: istore_1
2: getstatic #2 // Field java/lang/Boolean.FALSE:Ljava/lang/Boolean;
5: iconst_0
6: invokestatic #3 // Method java/lang/Boolean.valueOf:(Z)Ljava/lang/Boolean;
9: invokevirtual #4 // Method java/lang/Boolean.equals:(Ljava/lang/Object;)Z
12: istore_2
13: iconst_1
14: istore_3
15: iconst_1
16: istore 4
18: return
}
What we can see from here is that numbers 2 and 3 take two instructions each, whereas number 1 takes five. For most programs, it won't make any difference, but if you're running in a tight loop, it could make a difference.
As for readability, the adage "less is more" applies. Because my eyes aren't exactly 100%, I have problems with large amounts of text, so I prefer to put plenty of whitespace in my code. If you need to write Boolean.FALSE.equals(x), and it's actually the x that you're interested in, you have to mentally swap the whole thing around.
As for the other two, readability is mostly a matter of personal preference. There's something to be said for all three options:
!x
x == false
false == x
The ! can be easy to overlook if you have a very long expression, such as !((value < 0 || value > 10) && "yes".equals(response)). Some people prefer to write ... == false or false == ..., because you don't easily miss it.
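One non-stylistic point worth adding, which none of the three forms above address for primitive boolean: if x is a boxed Boolean that might be null (say, from a Map lookup), the verbose form is the only null-safe one, because !x auto-unboxes and throws a NullPointerException. A small sketch:

```java
public class NullSafeBoolean {
    // Null-safe falseness check: equals(null) is simply false, never throws.
    static boolean isFalseNullSafe(Boolean x) {
        return Boolean.FALSE.equals(x);
    }

    public static void main(String[] args) {
        Boolean x = null;                        // e.g. a missing Map entry

        System.out.println(isFalseNullSafe(x));  // prints false, no exception

        try {
            System.out.println(!x);              // auto-unboxing null throws here
        } catch (NullPointerException e) {
            System.out.println("NPE on unboxing");
        }
    }
}
```

For a primitive boolean, of course, null is impossible and !x remains the clearest choice.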
So, the pop() method in the java.util.Stack class of the Java framework looks like this:
@SuppressWarnings("unchecked")
public synchronized E pop() {
if (elementCount == 0) {
throw new EmptyStackException();
}
final int index = --elementCount;
final E obj = (E) elementData[index];
elementData[index] = null;
modCount++;
return obj;
}
The part that I have trouble understanding is the local variable index. It seems we don't need it. elementCount is an instance variable of the Vector class, which the Stack class extends.
So my point is,
final int index = --elementCount;
final E obj = (E) elementData[index];
elementData[index] = null;
These 3 lines of code can be written like
final E obj = (E) elementData[--elementCount];
elementData[elementCount] = null;
which should consume less memory, because no memory is used for the index local variable.
Also, I found this pattern elsewhere in the Java framework source code. For example, the add(E object) method in the java.util.ArrayList class looks like this:
@Override public boolean add(E object) {
Object[] a = array;
int s = size;
if (s == a.length) {
Object[] newArray = new Object[s +
(s < (MIN_CAPACITY_INCREMENT / 2) ?
MIN_CAPACITY_INCREMENT : s >> 1)];
System.arraycopy(a, 0, newArray, 0, s);
array = a = newArray;
}
a[s] = object;
size = s + 1;
modCount++;
return true;
}
In this example, array is an instance variable, and as you can see, a new local variable a is assigned to hold it.
Does anybody know why this is done? Big thanks in advance. :)
Though this is a really old question, I want to share some information I picked up along the way.
I found some explanation for my question on the Performance Tips page for Android. First, see the sample code from that page:
static class Foo {
int mSplat;
}
Foo[] mArray = ...
public void zero() {
int sum = 0;
for (int i = 0; i < mArray.length; ++i) {
sum += mArray[i].mSplat;
}
}
public void one() {
int sum = 0;
Foo[] localArray = mArray;
int len = localArray.length;
for (int i = 0; i < len; ++i) {
sum += localArray[i].mSplat;
}
}
public void two() {
int sum = 0;
for (Foo a : mArray) {
sum += a.mSplat;
}
}
According to that page, zero() is slowest and one() is faster, because one() pulls everything out into local variables, avoiding the repeated field lookups.
I think this explanation answers my second question, which was: "a new local variable a is assigned to hold it, but why?"
I hope this helps someone who has the same curiosity.
[EDIT] Let me add some details about those "lookups".
If you compile the above code and disassemble the class file with the javap command and its -c option, it will print out the disassembled code, i.e. the instructions that comprise the Java bytecode.
public void zero();
Code:
0: iconst_0 // Push int constant 0
1: istore_1 // Store into local variable 1 (sum=0)
2: iconst_0 // Push int constant 0
3: istore_2 // Store into local variable 2 (i=0)
4: goto 22 // First time through don't increment
7: iload_1
8: aload_0
9: getfield #14 // Field mArray:[LTest$Foo;
12: iload_2
13: aaload
14: getfield #39 // Field Test$Foo.mSplat:I
17: iadd
18: istore_1
19: iinc 2, 1
22: iload_2 // Push value of local variable 2 (i)
23: aload_0 // Push local variable 0 (this)
24: getfield #14 // Field mArray:[LTest$Foo;
27: arraylength // Get length of array
28: if_icmplt 7 // Compare and loop if less than (i < mArray.length)
31: return
public void one();
Code:
0: iconst_0 // Push int constant 0
1: istore_1 // Store into local variable 1 (sum=0)
2: aload_0 // Push this
3: getfield #14 // Field mArray:[LTest$Foo;
6: astore_2 // Store reference into local variable (localArray)
7: aload_2 // Load reference from local variable
8: arraylength // Get length of array
9: istore_3 // Store into local variable 3 (len = mArray.length)
10: iconst_0 // Push int constant 0
11: istore 4 // Store into local variable 4 (i=0)
13: goto 29 // First time through don't increment
16: iload_1
17: aload_2
18: iload 4
20: aaload
21: getfield #39 // Field Test$Foo.mSplat:I
24: iadd
25: istore_1
26: iinc 4, 1
29: iload 4 // Load i from local variable
31: iload_3 // Load len from local variable
32: if_icmplt 16 // // Compare and loop if less than (i < len)
35: return
These instructions were a bit unfamiliar to me, so I looked them up in the JVM specification. (If you are curious, chapter 3, Compiling for the Java Virtual Machine, and chapter 6, The Java Virtual Machine Instruction Set, are especially helpful.)
I added comments to help you understand, but in a nutshell, method zero() has to execute a getfield instruction on every iteration. According to section 3.8, Working with Class Instances, of the JVM specification, the getfield operation performs several jobs, as described below:
The compiler generates symbolic references to the fields of an
instance, which are stored in the run-time constant pool. Those
run-time constant pool items are resolved at run-time to determine the
location of the field within the referenced object.
These 3 lines of code can be written like
We're in the business of making useful and extendable programs, and to achieve that we should make our lives as developers as easy as we can.
If it takes me five more seconds to read the code and I can simplify it, I will, especially when the cost is a single int of memory; that hardly counts as an optimization.
in this example, array is a instance variable, and as you can see, a new local variable a is assigned to hold it. Does anybody know about this?
This hardly reads as a question; I believe you meant to phrase it like this:
Why did they use another reference to array, called a, when they could have used array directly?
Well, I truly can't see why, because they could have used the E type since it is given to them. It may be a matter of covariance and contravariance, but I'm not sure.
Tip: next time you post pieces of a language's source code, it would be nice to mention which JDK you are viewing, and a link would help a lot.
Keep in mind that --elementCount performs the decrement before producing its value. That means the fragment:
final int index = --elementCount;
final E obj = (E) elementData[index];
elementData[index] = null;
can be translated into
elementCount = elementCount - 1;
final int index = elementCount;
final E obj = (E) elementData[index];
elementData[index] = null;
which means that in your proposed replacement, "elementData[--elementCount]" and "elementData[elementCount]" do reference the same item, so the two versions are behaviorally equivalent. The index local simply makes the intent easier to follow.
Hope this helps.
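Pre-decrement semantics are easy to verify directly. This little sketch mirrors the two fragments (with plain Object arrays standing in for the generic elementData) and shows both end up touching the same slot:

```java
public class PreDecrementCheck {
    public static void main(String[] args) {
        // Variant 1: with the index local, as in java.util.Stack.pop()
        Object[] data1 = {"a", "b", "c"};
        int count1 = 3;
        final int index = --count1;          // count1 becomes 2 first, then index = 2
        final Object obj1 = data1[index];    // reads "c"
        data1[index] = null;

        // Variant 2: the proposed replacement without the local
        Object[] data2 = {"a", "b", "c"};
        int count2 = 3;
        final Object obj2 = data2[--count2]; // count2 becomes 2, reads slot 2
        data2[count2] = null;                // count2 is still 2: same slot

        System.out.println(obj1 + " " + obj2);         // prints "c c"
        System.out.println(data1[2] + " " + data2[2]); // prints "null null"
    }
}
```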
I was wondering which of the following would execute faster, just out of curiosity. The language is Java.
int num = -500;
int num2 = 0;
while( Math.abs(num) > num2 )
num2 ++;
or
int num = -500;
int num2 = 0;
num = Math.abs(num);
while( num > num2 )
num2 ++;
Essentially I am wondering whether 'Math.abs' is called for every iteration of the while loop, or is there some code optimization going on in the background?
Thanks!
Math.abs() is what is called a pure function, so a really good compiler could theoretically optimize it out. There are functional programming languages specifically designed to do just that, but in Java it would be difficult.
Not only is the second one likely to be compiled into faster code, it's generally accepted as better style, as it makes more clear what actually changes in the loop and what doesn't.
Yes, Math.abs(num) is called for each iteration, because the compiler never assumes that a return value depends only on the parameter.
As far as the compiler is concerned, the method might as well be Math.random().
So the first example uses more CPU time.
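You can observe this without reading bytecode by routing the call through a counting wrapper (countedAbs below is an illustrative stand-in of mine, not a library method). Because the wrapper has a side effect, nothing can hoist it out of the condition:

```java
public class AbsCallCount {
    static int calls = 0;

    // Counting wrapper around Math.abs, purely for demonstration.
    static int countedAbs(int n) {
        calls++;
        return Math.abs(n);
    }

    // Mirrors the first example: the abs call sits in the loop condition.
    static int iterate(int num) {
        int num2 = 0;
        while (countedAbs(num) > num2) {  // condition re-evaluated every pass
            num2++;
        }
        return num2;
    }

    public static void main(String[] args) {
        int iterations = iterate(-5);
        // 5 successful checks plus the final failing check = 6 calls
        System.out.println("iterations=" + iterations + " calls=" + calls);
    }
}
```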
Out of curiosity, I executed an unscientific benchmark. The setup, for comparability:
Number of warmups ([1]): 1000
Number of iterations ([1]): 1500
Number of removed outliers ([1]): 300
Error bars showing a CI with 95% probability
[1] for each solution and scale level
Host Information:
Java Version: 1.6.0_21
Java Vendor: Sun Microsystems Inc.
Java VM Arguments: -Xmx1024m; -Dfile.encoding=Cp1252
OS Architecture: amd64
OS Name: Windows 7
OS Version: 6.1
Available cores: 2
Free memory available to JVM (bytes): 122182232
Maximum memory (bytes): 954466304
Total memory in use (bytes): 124125184
The first one has to evaluate the absolute-value function on every pass of the while loop. Of course, this only matters if Java does not optimize away the repeated call, and I believe that is the case: the compiler hasn't had that optimization for a while, and now relies on the JIT.
So to answer your question, the second one is faster.
class AbsTest {
public static void main(String[] args) {
int num = -2000000000;
int num2 = 0;
long then = System.currentTimeMillis();
while( Math.abs(num) > num2 )
num2 ++;
long then2 = System.currentTimeMillis();
num = Math.abs(num);
num2 = 0;
while( num > num2 )
num2 ++;
long now = System.currentTimeMillis();
System.out.println(then2 - then); // first time
System.out.println(now - then2); // second time
}
}
result:
C:\Documents and Settings\glowcoder\My Documents>java AbsTest
2953
1828
C:\Documents and Settings\glowcoder\My Documents>
Hey Inventor, I think it is pointless to compare these two, as the result depends on your compiler implementation:
If your compiler performs the optimization, then you will most probably find that both versions take the same running time.
If it doesn't, then obviously the second one is faster.
So I think you should not worry about this kind of thing.
Java Code
package test;
public class SpeedTest
{
public void first()
{
int num = -500;
int num2 = 0;
while( Math.abs(num) > num2 )
num2 ++;
}
public void second()
{
int num = -500;
int num2 = 0;
num = Math.abs(num);
while( num > num2 )
num2 ++;
}
}
Byte Code
Compiled from "SpeedTest.java"
public class test.SpeedTest extends java.lang.Object{
public test.SpeedTest();
Code:
0: aload_0
1: invokespecial #8; //Method java/lang/Object."<init>":()V
4: return
public void first();
Code:
0: sipush -500
3: istore_1
4: iconst_0
5: istore_2
6: goto 12
9: iinc 2, 1
12: iload_1
13: invokestatic #15; //Method java/lang/Math.abs:(I)I
16: iload_2
17: if_icmpgt 9
20: return
public void second();
Code:
0: sipush -500
3: istore_1
4: iconst_0
5: istore_2
6: iload_1
7: invokestatic #15; //Method java/lang/Math.abs:(I)I
10: istore_1
11: goto 17
14: iinc 2, 1
17: iload_1
18: iload_2
19: if_icmpgt 14
22: return
}
From the above, both of them take ~20 instructions. If you are very picky, the first one is faster by that static count.
The reason for the difference is that in the second approach you calculate and store the result, which you then need to load again for the comparison, while in the first case you compare the value returned by Math.abs directly. Hence the two extra instructions.
Update
As pointed out by @ide and @bestsss:
The number of instructions in the bytecode doesn't really correlate with the number of times they're actually called. Plus there's HotSpot to spice things up further (like dead code optimization).
As in this example Math.abs() is called on a fixed value of -500, it is possible for the HotSpot JVM to optimize it.
See the below comments for more details.
This isn't meant to be subjective, I am looking for reasons based on resource utilisation, compiler performance, GC performance etc. rather than elegance. Oh, and the position of brackets doesn't count, so no stylistic comments please.
Take the following loop;
Integer total = new Integer(0);
Integer i;
for (String str : string_list)
{
i = Integer.parse(str);
total += i;
}
versus...
Integer total = 0;
for (String str : string_list)
{
Integer i = Integer.parse(str);
total += i;
}
In the first one, i is method-scoped, whereas in the second it is scoped to the loop. I have always thought (believed) that the first one would be more efficient, because it just references an existing variable already allocated on the stack, whereas the second one would be pushing and popping i on each iteration of the loop.
There are quite a lot of other cases where I tend to scope variables more broadly than perhaps necessary, so I thought I would ask here to clear up a gap in my knowledge. Also notice the assignment of the variable on initialisation, either involving the new operator or not. Do any of these sorts of semi-stylistic semi-optimisations make any difference at all?
The second one is what I would prefer. There is no functional difference other than the scoping.
Setting the same variable in each iteration makes no difference because Integer is an immutable class. Now, if you were modifying an object instead of creating a new one each time, then there would be a difference.
And as a side note, in this code you should be using int and Integer.parseInt() rather than Integer and Integer.parse(). You're introducing quite a bit of unnecessary boxing and unboxing.
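For illustration, the loop with primitives might look like this (the method name and sample values are made up for the example):

```java
import java.util.Arrays;
import java.util.List;

public class SumParsed {
    // Same loop with primitives: no boxing or unboxing in the loop body at all.
    static int sum(List<String> stringList) {
        int total = 0;
        for (String str : stringList) {
            total += Integer.parseInt(str);  // parseInt returns a primitive int
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList("1", "2", "39")));  // prints 42
    }
}
```

If a boxed result is genuinely needed at the end, accumulating in an int and boxing once afterwards is still cheaper than boxing on every iteration.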
Edit: It's been a while since I mucked around in bytecode, so I thought I'd get my hands dirty again.
Here's the test class I compiled:
class ScopeTest {
public void outside(String[] args) {
Integer total = 0;
Integer i;
for (String str : args)
{
i = Integer.valueOf(str);
total += i;
}
}
public void inside(String[] args) {
Integer total = 0;
for (String str : args)
{
Integer i = Integer.valueOf(str);
total += i;
}
}
}
Bytecode output (retrieved with javap -c ScopeTest after compiling):
Compiled from "ScopeTest.java"
class ScopeTest extends java.lang.Object{
ScopeTest();
Code:
0: aload_0
1: invokespecial #1; //Method java/lang/Object."<init>":()V
4: return
public void outside(java.lang.String[]);
Code:
0: iconst_0
1: invokestatic #2; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
4: astore_2
5: aload_1
6: astore 4
8: aload 4
10: arraylength
11: istore 5
13: iconst_0
14: istore 6
16: iload 6
18: iload 5
20: if_icmpge 55
23: aload 4
25: iload 6
27: aaload
28: astore 7
30: aload 7
32: invokestatic #3; //Method java/lang/Integer.valueOf:(Ljava/lang/String;)Ljava/lang/Integer;
35: astore_3
36: aload_2
37: invokevirtual #4; //Method java/lang/Integer.intValue:()I
40: aload_3
41: invokevirtual #4; //Method java/lang/Integer.intValue:()I
44: iadd
45: invokestatic #2; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
48: astore_2
49: iinc 6, 1
52: goto 16
55: return
public void inside(java.lang.String[]);
Code:
0: iconst_0
1: invokestatic #2; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
4: astore_2
5: aload_1
6: astore_3
7: aload_3
8: arraylength
9: istore 4
11: iconst_0
12: istore 5
14: iload 5
16: iload 4
18: if_icmpge 54
21: aload_3
22: iload 5
24: aaload
25: astore 6
27: aload 6
29: invokestatic #3; //Method java/lang/Integer.valueOf:(Ljava/lang/String;)Ljava/lang/Integer;
32: astore 7
34: aload_2
35: invokevirtual #4; //Method java/lang/Integer.intValue:()I
38: aload 7
40: invokevirtual #4; //Method java/lang/Integer.intValue:()I
43: iadd
44: invokestatic #2; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
47: astore_2
48: iinc 5, 1
51: goto 14
54: return
}
Contrary to my expectations, there was one difference between the two: in outside(), the variable i still took up a local-variable slot even though it was omitted from the actual code (note that all the iload and istore instructions point one slot higher).
The JIT compiler should make short work of this difference, but still you can see that limiting scope is a good practice.
(And with regards to my earlier side note, you can see that to add two Integer objects, Java must unbox both with intValue, add them, and then create a new Integer with valueOf. Don't do this unless absolutely necessary, because it's senseless and slower.)
The second one is far better; the first style should only be used in old C code, where it was mandatory. Java allows inline declarations to minimize the scope of variables, and you should take advantage of that. But your code can be further improved:
int total = 0;
for (String str: stringList) {
try {
total += Integer.valueOf(str);
} catch(NumberFormatException nfe) {
// more code to deal with the error
}
}
That follows the Java code style convention. Read the full guide here:
http://java.sun.com/docs/codeconv/html/CodeConvTOC.doc.html
It makes no significant difference apart from on the last iteration, when the reference is cleared quicker in the second example (and that would be my preference - not so much for that reason, but clarity.)
Keep the scope to the minimum possible. The hotspot VM does escape analysis to determine when references are no longer accessible, and on the basis of this allocates some objects on the stack rather than on the heap. Keeping scope as small as possible aids this process.
I would ask why you're using Integer instead of a simple int...or perhaps it's just by way of example?
The second is far better. Scoping variables as narrowly as possible makes the code far easier to read and maintain, which are much more important overall than the performance differences between these examples, which are trivial and easily optimized away.
Neither. Integer.valueOf(0) will use a reference to a cached 0. :)
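That cache is easy to demonstrate. The JLS only guarantees it for values in -128..127, so the second comparison below is JVM-dependent:

```java
public class IntegerCacheDemo {
    public static void main(String[] args) {
        // The JLS guarantees that boxed values in -128..127 are cached,
        // so valueOf returns the same instance every time.
        System.out.println(Integer.valueOf(0) == Integer.valueOf(0));   // prints true

        // Outside that range the spec only *permits* caching; on a default
        // HotSpot JVM these are usually distinct objects.
        System.out.println(Integer.valueOf(1000) == Integer.valueOf(1000));
    }
}
```

This is also why == comparisons between Integer objects are a classic bug source: always use equals() or unbox to int.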
Well, in this case, you're creating an Integer object every single time you say i = Integer.parseInt(str) (where i is an Integer), so (unless Java knows how to optimize it) both cases are almost equally inefficient. Consider using int instead:
int total = 0;
for (String str : string_list)
{
int i = Integer.parseInt(str);
total += i;
}
Now we're back to the question of whether to put the int declaration on the inside or outside. Assuming the Java compiler has a lick of decent optimization, I'd say it doesn't matter. Efficiency aside, it is considered good practice to declare variables as close as possible to their use.
The second one is the preferable of the two for readability, maintainability, and efficiency.
All three of these goals are achieved because you are succinctly explaining what you are doing and how your variables are being used, clearly, to both developers and the compiler. When the variable i is defined in the for block, everyone knows that it is safe to ignore it outside the block and that its value is only valid for this iteration of the block. This also lets the garbage collector easily mark this memory to be freed.
I would suggest not using Integer for intermediate values. Accumulate the total as an int and after the loop create the Object or depend on auto-boxing.
Assuming you have positive numbers in your list and you're serious with
I am looking for reasons based on
resource utilisation, compiler
performance, GC performance etc.
rather than elegance.
You should implement it by yourself like:
import java.util.ArrayList;
import java.util.List;
public class Int {
public static void main(String[] args) {
List<String> list = new ArrayList<String>();
list.add("10");
list.add("20");
int total = 0;
for (String str : list) {
int val = 0;
for (char c : str.toCharArray()) {
val = val * 10 + (int) c - 48;
}
total += val;
}
System.out.print(total);
}
}
The only GC-relevant thing would be toCharArray(), which could be replaced by another loop using charAt().
The question of what variable scope to use is a readability issue more than anything else. The code is better understood when every variable is restricted to the scope where it is actually used.
Now, if we inspect the technical consequences of using wide/narrow scopes, I believe that there IS a performance/footprint advantage with narrow scopes. Consider the following method, where we have 3 local variables belonging to one common scope:
private static Random rnd = new Random();
public static void test() {
int x = rnd.nextInt();
System.out.println(x);
int y = rnd.nextInt();
System.out.println(y);
int z = rnd.nextInt();
System.out.println(z);
}
If you disassemble this code (using javap -c -verbose {class name}, for example), you will see that the compiler reserves 3 slots for local variables in the stack frame structure of the test() method.
Now, suppose that we add some artificial scopes:
public static void test() {
{
int x = rnd.nextInt();
System.out.println(x);
}
{
int y = rnd.nextInt();
System.out.println(y);
}
{
int z = rnd.nextInt();
System.out.println(z);
}
}
If you disassemble the code now, you will notice that the compiler reserves only 1 slot for local variables. Since the scopes are completely independent, each time x, y or z is used, the same slot #0 is used.
What does it mean?
1) Narrow scopes save stack space
2) If we are dealing with object variables, it means that the objects may become unreachable faster, therefore are eligible for GC sooner than otherwise.
Again, note that these 2 "advantages" are really minor, and the readability concern should be by far the most important one.
The second one, since you want to keep the scope of your variables as narrow as possible. The advantage of a smaller scope is less chance for collision. In your example there are only a few lines, so the advantage might not be so obvious, but in a larger body of code smaller-scoped variables are definitely more beneficial. If someone else later has to look at the code, they would have to scan all the way back to just outside the method definition to know what i is. The argument is not much different from the one for avoiding global variables.
Consider the following two ways of writing a loop in Java to see if a list contains a given value:
Style 1
boolean found = false;
for(int i = 0; i < list.length && !found; i++)
{
if(list[i] == testVal)
found = true;
}
Style 2
boolean found = false;
for(int i = 0; i < list.length && !found; i++)
{
found = (list[i] == testVal);
}
The two are equivalent, but I always use style 1 because 1) I find it more readable, and 2) I assume that reassigning found to false hundreds of times would take more time. I am wondering: is this second assumption true?
Nitpicker's corner
I am well aware that this is a case of premature optimization. That doesn't mean that it isn't something that is useful to know.
I don't care which style you think is more readable. I am only interested in whether one has a performance penalty compared to the other.
I know that style 1 has the advantage of allowing you to also put a break; statement in the if block, but I don't care. Again, this question is about performance, not style.
Well, just write a micro benchmark:
import java.util.*;
public class Test {
private static int[] list = new int[] {1, 2, 3, 4, 5, 6, 7, 8, 9} ;
private static int testVal = 6;
public static boolean version1() {
boolean found = false;
for(int i = 0; i < list.length && !found; i++)
{
if(list[i] == testVal)
found = true;
}
return found;
}
public static boolean version2() {
boolean found = false;
for(int i = 0; i < list.length && !found; i++)
{
found = (list[i] == testVal);
}
return found;
}
public static void main(String[] args) {
// warm up
for (int i=0; i<100000000; i++) {
version1();
version2();
}
long time = System.currentTimeMillis();
for (int i=0; i<100000000; i++) {
version1();
}
System.out.println("Version1:" + (System.currentTimeMillis() - time));
time = System.currentTimeMillis();
for (int i=0; i<100000000; i++) {
version2();
}
System.out.println("Version2:" + (System.currentTimeMillis() - time));
}
}
On my machine version1 seems to be a little bit faster:
Version1:5236
Version2:5477
(But that's 0.2 seconds over 100 million iterations. I wouldn't care about this.)
If you look at the generated bytecode there are two more instructions in version2 which probably cause the longer execution time:
public static boolean version1();
Code:
0: iconst_0
1: istore_0
2: iconst_0
3: istore_1
4: iload_1
5: getstatic #2; //Field list:[I
8: arraylength
9: if_icmpge 35
12: iload_0
13: ifne 35
16: getstatic #2; //Field list:[I
19: iload_1
20: iaload
21: getstatic #3; //Field testVal:I
24: if_icmpne 29
27: iconst_1
28: istore_0
29: iinc 1, 1
32: goto 4
35: iload_0
36: ireturn
public static boolean version2();
Code:
0: iconst_0
1: istore_0
2: iconst_0
3: istore_1
4: iload_1
5: getstatic #2; //Field list:[I
8: arraylength
9: if_icmpge 39
12: iload_0
13: ifne 39
16: getstatic #2; //Field list:[I
19: iload_1
20: iaload
21: getstatic #3; //Field testVal:I
24: if_icmpne 31
27: iconst_1
28: goto 32
31: iconst_0
32: istore_0
33: iinc 1, 1
36: goto 4
39: iload_0
40: ireturn
A comment about the nitpicker's corner:
If you're really concerned with absolute performance, putting a break in and removing the "&& !found" will give you theoretically better performance with #1. Two fewer binary ops to worry about on every iteration.
If you wanted to get really anal about optimization without using breaks then
boolean notFound = true;
for(int i = 0; notFound && i < list.length; i++)
{
if(list[i] == testVal)
notFound = false;
}
will run faster in the average case than the existing option #1.
And of course it's personal preference, but I prefer never to put any extra evaluations inside the head of a for loop. I find it can cause confusion when reading code, because it's easy to miss. If I can't get the desired behavior using break/continue, I will use a while or do/while loop instead.
Actually, the "if" may slow your program down more than the assignment does, because of branching in the pipeline.
It depends on what compiler you use since different compilers might do different optimizations.
I believe style 2 is ever-so-slightly faster - say 1 clock cycle or so.
I'd rewrite it into the following, though, if I were tackling it:
int i;
for(i = 0; i < list.length && list[i] != testVal; i++);
boolean found = (i != list.length);
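Wrapped into a compilable method (names borrowed from the question; the empty loop body is deliberate), that idiom behaves like this:

```java
public class EmptyBodyLoop {
    // All the work happens in the loop condition; the body is intentionally empty.
    static boolean contains(int[] list, int testVal) {
        int i;
        for (i = 0; i < list.length && list[i] != testVal; i++);
        return i != list.length;   // ran off the end means "not found"
    }

    public static void main(String[] args) {
        int[] list = {1, 2, 3, 4, 5};
        System.out.println(contains(list, 3));   // prints true
        System.out.println(contains(list, 9));   // prints false
    }
}
```

Whether the empty-body style is a readability win is debatable; some shops ban the trailing semicolon loop outright because it is easy to misread.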
It seems to me that if you expect your value to be found before the end of the list, you'd be better off with #2, as it short-circuits the check with !found in the loop conditional. Assuming you put a break in the first option (the only sensible thing, IMO), the pseudo-assembly would look something like:
Option 1:
start:
CMP i, list.length
JE end
CMP list[i], testval
JE equal
INC i
JMP start
equal:
MOV true, found
end:
Option 2:
start:
CMP i, list.length
JE end
CMP true, found
JE end
CMP list[i], testval
JE equal
JNE notequal
equal:
MOV true, found
INC i
JMP start
notequal:
MOV false, found
INC i
JMP start
end:
I'd say Option 1 is superior here, as it has about a third fewer instructions. Of course, this is without optimizations, and it would be compiler- and situation-specific (what is found doing after this? Can we just optimize it away altogether?).
Here is another style
for(int i = 0; i < list.length; i++)
{
if(list[i] == testVal)
return true;
}
return false;
I think both alternatives leave something to be desired, from a performance point of view.
Think about how many tests (which are almost always jumps) you do per iteration, and try to minimize the amount.
The solution by Matt, to return as soon as the answer is found, reduces the number of tests from three (loop iterator, found-test in loop, actual comparison) to two. Doing the found-test essentially twice is wasteful.
I'm not sure if the classic, but somewhat obfuscating, trick of looping backwards is a win in Java, and I'm not hot enough at reading JVM code to figure it out right now, either.
I would say that in 98% of systems it does not matter. The difference, if there is any, is hardly noticeable unless that loop is the main portion of the code and runs a mind-numbing number of times.
Edit: That is, of course, assuming it is not already being optimized by the compiler.
Any decent compiler would keep found in a register for the duration of the loop, so the cost is absolutely negligible.
If the second style is done without a branch then it would be preferable, since the CPU's pipeline will not get disrupted as much ... but that depends on how the compiler uses the instruction set.
This will only be measurable in code which is extremely performance-sensitive (simulators, emulators, video encoding software, etc.) in which case you probably want to manually inspect the generated code anyway to make sure that the compiler actually generates sensible code.
To be sure, you should compile both versions (say, with the latest compiler from Sun) and examine the generated bytecode with an appropriate tool. That's the only reliable way to know for sure; everything else is a wild guess.
boolean found = false;
for(int i = 0; i < list.length && !found; i++)
{
if(list[i] == testVal)
found = true;
}
I don't see a break statement in the block.
Other than that, I prefer this style. It improves readability and thereby reduces the chance of a maintainer mis-reading and mis-fixing it.