Is there any difference between:
String x = getString();
doSomething(x);
vs.
doSomething(getString());
Resource- and performance-wise, especially if it's done within a loop tens, hundreds or thousands of times?
It has the same overhead. Local variables are just there to make your life easier. At the VM level they don't necessarily exist and certainly not anymore when machine code is run.
So the only thing to worry about here is whether getString() is potentially expensive. The variable x very likely has no effect at all.
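To make that concrete, here is a minimal sketch that counts how often getString() runs in each form. The bodies of getString() and doSomething() are made up for illustration (the question doesn't show them), and the counter is only there for the demonstration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LocalVsInline {
    // Counts invocations of the (hypothetical) getString().
    static final AtomicInteger calls = new AtomicInteger();
    // Accumulates lengths so that doSomething() has an observable effect.
    static long totalLength = 0;

    static String getString() {
        calls.incrementAndGet();
        return "x";
    }

    static void doSomething(String s) {
        totalLength += s.length();
    }

    // Form 1: the result is stored in a named local first.
    static int withLocal(int iterations) {
        calls.set(0);
        for (int i = 0; i < iterations; i++) {
            String x = getString();
            doSomething(x);
        }
        return calls.get();
    }

    // Form 2: the result is passed straight through.
    static int inline(int iterations) {
        calls.set(0);
        for (int i = 0; i < iterations; i++) {
            doSomething(getString());
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("with local: " + withLocal(1000) + " calls");
        System.out.println("inline:     " + inline(1000) + " calls");
    }
}
```

Both forms call getString() exactly once per iteration, so any real cost lives in getString() itself, not in the named local.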
Let me first begin by saying that your overriding goal should almost always be to maintain code readability. Your compiler is almost always better at trivial optimizations than you are. Trust it!
In response to your specific example: the bytecode generated for each example IS different. It didn't appear to make much difference though, because there wasn't a statistically significant or even consistent difference between the two approaches in a loop over Integer.MAX_VALUE iterations.
I believe both would be the same at compile time, though the first may be more readable in some cases.
Both statements are the same. The only difference is that the first approach uses a local variable x, which the second syntax avoids.
That largely depends on the use-case. Are you going to make repeated calls to doSomething using that exact String? Then using the local variable is a bit more efficient. However, if it's a single call or multiple calls with different Strings, it makes no difference.
Related
If for some reason I have to use as little memory as possible, is the second code below better than the first? (The code is just for illustration and doesn't have any meaning.) (Edit: imagine I want to write the assembly code for this before any optimization is done by the JVM. Do I then use 99999998 extra memory locations in the first procedure compared to the second one? The focus is just on memory usage.)
First:
for(int i=0; i<99999999; i++){
int k=2*i;
}
Second:
int k=0;
for(int i=0; i<99999999; i++){
k=2*i;
}
What I'm sure of:
In every case, the difference will not be visible. If you want to make such small optimizations, Java is surely not the best technology. That is why I recommend the first one, because it makes the code more readable and logical. (I find it strange to declare a variable outside the for loop if you use it only inside it; it's confusing.)
What I think:
In your small example, and since you're looking at memory footprint, the first one is better because it follows the implicit rule mentioned in this comment: the smaller the scope, the better.
In the first case the variable k is used only in a really small loop, so the optimizer will easily understand it and use only a register: no memory usage and fewer instructions.
In the second case, it will be harder for the optimizer to determine that k is not used elsewhere, so it could allocate some memory instead of using a register. It would then use some memory and be less optimized, since it would need instructions to load and store the value.
As mentioned in this comment, it will mostly depend on how you use it. In your example the optimizer will detect that the usage is the same in both cases and will use no memory. But in more complex code it will not always find it, so I recommend keeping the scope as small as possible.
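For reference, a small sketch of the two variants side by side. The sum accumulator is my addition, so that k is actually used and the loops can be compared. As far as I know, javac fixes the number of local variable slots per method frame at compile time, so declaring k inside the loop body does not reserve a new slot on every iteration.

```java
class LoopScope {
    // Variant 1: k is declared inside the loop body (smallest possible scope).
    static long sumInner(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            int k = 2 * i;
            sum += k;
        }
        return sum;
    }

    // Variant 2: k is declared once, outside the loop.
    static long sumOuter(int n) {
        long sum = 0;
        int k = 0;
        for (int i = 0; i < n; i++) {
            k = 2 * i;
            sum += k;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumInner(1000) == sumOuter(1000));
    }
}
```

Running javap -c on the compiled class is a quick way to compare the bytecode of the two methods yourself.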
@pranay-khandelwal's response links to an excellent discussion of this question in a different language. The JVM, however, throws bytecode around at runtime to try to achieve better performance, which can complicate things.
There's actually another discussion here on a similar topic of in/out best practices in general for readability, which resulted in some benchmarks and discussion that one of the participants documented here
As a general rule of thumb the second option will be better for memory and performance under almost all circumstances - where the former may be more self-documenting and maintainable, and avoid accidental use elsewhere in the outer scope.
Since you mention that the code in your post is only a representative example and this could apply to more than just simple types:
Replacing the contents of an already registered memory area is less costly than registering a new one and deregistering the old one (or waiting for it to be garbage collected). Even where a language's compiler or interpreter smartly reuses recently unreferenced things to store new things, that reuse also takes work, and is overhead that can optimistically be avoided with outer declarations - though as others mention, this is usually an unnecessary and potentially bug-spawning micro-optimization.
Judging from the short examples you provided, the second option. However, it always depends on the logic of your code.
Thinking about performance and minimising execution space and time, the second code scales better, even though it seems to run counter to some good coding practices.
k in your code is used only inside the loop block. However, it is "reused" over multiple "block iterations". Take a look at the syntax of your for loop: it declares i (int i) at the beginning of the statement, and this declaration happens just once. Again, declaring the variable multiple times may lead to a waste of time and memory.
The JVM optimiser might do a good job in general, simple cases. However, it might fail to capture the semantics of your (Java) code.
for(int i=0,k=0; i<99999999; i++){
k=2*i;
}
Consider the following method:
private static long maskAndNegate(long l) {
int numberOfLeadingZeros = Long.numberOfLeadingZeros(l);
long mask = CustomBitSet.masks[numberOfLeadingZeros];
long result = (~l) & mask;
return result;
}
The method can be abbreviated to:
private static long maskAndNegate(long l) {
return (~l) & CustomBitSet.masks[Long.numberOfLeadingZeros(l)];
}
Are these two representations equal in actual run time? In other words, does the Java compiler optimize away the unnecessary definition of extra variables, that I've placed for readability and debugging?
The Java compiler itself hardly does any optimization. It's the JIT that does almost everything.
Local variables themselves are somewhat irrelevant to optimization though - a multi-operator expression still needs the various operands logically to go on the stack, just in unnamed "slots". You may well find that the generated bytecode for your two implementations is very similar, just without the names in the second case.
More importantly, any performance benefit that might occur very occasionally from reducing the number of local variables you use is almost certainly going to be insignificant. The readability benefit of the first method is much more likely to be significant. As always, avoid micro-optimizing without first having evidence that the place you're trying to optimize is a bottleneck, and then only allow optimizations which have proved their worth.
(By the time you've proved you need to optimize a particular method, you'll already have the tools to test any potential optimization, so you won't need to guess.)
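If you want to convince yourself that the two versions behave identically before looking at timings, a sketch like the following will do. Note that the MASKS table is a hypothetical stand-in for CustomBitSet.masks, which the question does not show; I've assumed entry n keeps the low 64 - n bits.

```java
class MaskAndNegate {
    // Hypothetical stand-in for CustomBitSet.masks (not shown in the question):
    // entry n keeps the low (64 - n) bits.
    static final long[] MASKS = new long[65];
    static {
        for (int n = 0; n < 64; n++) {
            MASKS[n] = -1L >>> n;
        }
        MASKS[64] = 0L; // Long.numberOfLeadingZeros(0) is 64
    }

    // Version with named intermediate values.
    static long verbose(long l) {
        int numberOfLeadingZeros = Long.numberOfLeadingZeros(l);
        long mask = MASKS[numberOfLeadingZeros];
        long result = (~l) & mask;
        return result;
    }

    // Abbreviated single-expression version.
    static long compact(long l) {
        return (~l) & MASKS[Long.numberOfLeadingZeros(l)];
    }

    public static void main(String[] args) {
        long[] samples = {0L, 1L, 5L, -1L, Long.MAX_VALUE, Long.MIN_VALUE};
        for (long s : samples) {
            System.out.println(s + ": " + (verbose(s) == compact(s)));
        }
    }
}
```

With both methods in one class, javap -c shows how close the generated bytecode is, and a proper harness such as JMH is the place to look if timing ever matters.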
The code is not large enough to be worth optimizing, for starters. And in the second version you are just saving the memory used for storing numberOfLeadingZeros and the other locals.
But when this code is run often enough at runtime, say 10000 times at least, the JIT will identify it as hot code and optimize it with neat tricks such as method inlining and the like.
But in your case the preferable option is the first one, as it is more readable.
You should not compromise readability for small optimizations.
I am curious whether packing multiple and/or nested method calls within the same line of code is better for performance and that is why some developers do it, at the cost of making their code less readable.
E.g.
//like
Set<String> jobParamKeySet = jobParams.keySet();
Iterator<String> jobParamItrtr = jobParamKeySet.iterator();
Could be also written as
//dislike
Iterator<String> jobParamItrtr = jobParams.keySet().iterator();
Personally, I hate the latter because it does multiple evaluations in the same line and is hard for me to read. That is why I try by all means to avoid having more than one evaluation per line of code. I also can't tell that jobParams.keySet() returns a Set, and that bugs me.
Another example would be:
//dislike
Bar.processParameter(Foo.getParameter());
vs
//like
Parameter param = Foo.getParameter();
Bar.processParameter(param);
The former makes me nauseous and dizzy, as I like to consume simple and clean evaluations in every line of code, and I just hate it when I see other people's code written like that.
But are there any (performance) benefits to packing multiple method calls in the same line?
EDIT: Single liners are also more difficult to debug, thanks to @stemm for the reminder.
Micro-optimization is a killer. If the code references you are showing are either instance scope or method scope, I would go with the second approach.
Method-scope variables become eligible for GC as soon as method execution is done, so even if you declare another variable, it's OK because the scope is limited, and the advantage you get is readable and maintainable code.
I tend to disagree with most others on this list. I actually find the first way cleaner and easier to read.
In your example:
//like
Set<String> jobParamKeySet = jobParams.keySet();
Iterator<String> jobParamItrtr = jobParamKeySet.iterator();
Could be also written as
//dislike
Iterator<String> jobParamItrtr = jobParams.keySet().iterator();
the first method (the one you like) has a lot of irrelevant information. The whole point of the iterator interface, for example, is to give you a standard interface that you can use to loop over whatever backing implementation there is. So the fact that it is a keyset has no bearing on the code itself. All you are looking for is the iterator to loop over the implemented object.
Secondly, the second implementation actually gives you more information. It tells you that the code will be ignoring the implementation of jobParams and that it will only be looping through the keys. In the first code, you must first trace back what jobParamKeySet is (as a variable) to figure out what you are iterating over. Additionally, you do not know if/where jobParamKeySet is used elsewhere in the scope.
Finally, as a last comment, the second way makes it easier to switch implementations if necessary; in the first case, you might need to recode two lines (the first variable assignment if it changes from a set to something else), whereas the second case you only need to change out one line.
That being said, there are limits to everything. Chaining 10 calls within a single line can be complicated to read and debug. However 3 or 4 levels is usually clear. Sometimes, especially if an intermediary variable is required several times, it makes more sense to declare it explicitly.
In your second example:
//dislike
Bar.processParameter(Foo.getParameter());
vs
//like
Parameter param = Foo.getParameter();
Bar.processParameter(param);
I find it actually more difficult to understand exactly which parameters are being processed by Bar.processParameter(param). It will take me longer to match param to the variable instantiation to see that it is Foo.getParameter(). Whereas in the first case, the information is very clear and presented very well - you are processing Foo.getParameter() params. Personally, I find the first method is less prone to error as well - it is unlikely that you accidentally use Foo2.getParameter() when it is within the same call as opposed to a separate line.
There is one less variable assignment, but even the compiler can optimize it in some cases.
I would not do it for performance; that would be premature optimization. Write the code that is easier to maintain.
In my case, I find:
Iterator<String> jobParamItrtr = jobParams.keySet().iterator();
easier to read than:
Set<String> jobParamKeySet = jobParams.keySet();
Iterator<String> jobParamItrtr = jobParamKeySet.iterator();
But I guess it is a matter of personal taste.
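Whichever you prefer, the two forms do the same thing. Here is a small sketch that checks this, assuming jobParams is a Map<String, String> (the question doesn't say what type it is) and using made-up sample keys.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ChainedVsExplicit {
    // Explicit form: the key set gets its own named variable.
    static List<String> keysExplicit(Map<String, String> jobParams) {
        Set<String> jobParamKeySet = jobParams.keySet();
        Iterator<String> jobParamItrtr = jobParamKeySet.iterator();
        return drain(jobParamItrtr);
    }

    // Chained form: the key set is never named.
    static List<String> keysChained(Map<String, String> jobParams) {
        Iterator<String> jobParamItrtr = jobParams.keySet().iterator();
        return drain(jobParamItrtr);
    }

    // Collects everything the iterator yields, in order.
    private static List<String> drain(Iterator<String> it) {
        List<String> keys = new ArrayList<>();
        while (it.hasNext()) {
            keys.add(it.next());
        }
        return keys;
    }

    // Made-up sample data; LinkedHashMap keeps insertion order.
    static Map<String, String> sampleParams() {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("input", "a.csv");
        params.put("output", "b.csv");
        return params;
    }

    public static void main(String[] args) {
        System.out.println(keysExplicit(sampleParams()).equals(keysChained(sampleParams())));
    }
}
```

So the choice really is about readability and debuggability, not behaviour.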
Code is never developed by a single person. I would choose the second way. It is also easier to understand and maintain.
This is also beneficial when two different teams are working on the code at different locations.
Many times we take an hour or more to understand what another developer has done if they use the first option. Personally, I have been in this situation many times.
But are there any (performance) benefits to packing multiple method calls in the same line?
I seriously doubt the difference is measurable but even if there were I would consider
is hard for me to read the code.
to be so much more important that it cannot be overstated.
Even if it were half the speed, I would still write the simplest, cleanest and easiest to understand code, and only when you have profiled the application and identified that you have an issue would I consider optimising it.
BTW: I prefer the more dense, chained code, but I would suggest you use what you prefer.
The omission of an extra local variable probably has a negligible performance advantage (although the JIT may be able to optimize this).
Personally I don't mind call chaining when it's pretty clear what's done and the intermediate object is very unlikely to be null (like your first 'dislike' example). When it gets complex (multiple .'s in the expression), I prefer explicit local variables, because it's so much simpler to debug.
So I decide case by case what I prefer :)
I don't see where a().b().c().d is that much harder to read than a.b.c.d which people don't seem to mind too much. (Though I would break it up.)
If you don't like that it's all on one line, you could say
a()
.b()
.c()
.d
(I don't like that either.)
I prefer to break it up, using a couple extra variables.
It makes it easier to debug.
If performance is your concern (as it should be), the first thing to understand is not to sweat the small stuff.
If adding extra local variables costs anything at all, the rest of the code has to be rippin' fat-free before it even begins to matter.
I'm a beginner and I've always read that it's bad to repeat code. However, it seems that in order to not do so, you would have to have extra method calls usually. Let's say I have the following class
public class BinarySearchTree<E extends Comparable<E>>{
private BinaryTree<E> root;
private final BinaryTree<E> EMPTY = new BinaryTree<E>();
private int count;
private Comparator<E> ordering;
public BinarySearchTree(Comparator<E> order){
ordering = order;
clear();
}
public void clear(){
root = EMPTY;
count = 0;
}
}
Would it be more optimal for me to just copy and paste the two lines in my clear() method into the constructor instead of calling the actual method? If so how much of a difference does it make? What if my constructor made 10 method calls with each one simply setting an instance variable to a value? What's the best programming practice?
Would it be more optimal for me to just copy and paste the two lines in my clear() method into the constructor instead of calling the actual method?
The compiler can perform that optimization. And so can the JVM. The terminology used by compiler writers and JVM authors is "inline expansion".
If so how much of a difference does it make?
Measure it. Often, you'll find that it makes no difference. And if you believe that this is a performance hotspot, you're looking in the wrong place; that's why you'll need to measure it.
What if my constructor made 10 method calls with each one simply setting an instance variable to a value?
Again, that depends on the generated bytecode and any runtime optimizations performed by the Java Virtual machine. If the compiler/JVM can inline the method calls, it will perform the optimization to avoid the overhead of creating new stack frames at runtime.
What's the best programming practice?
Avoiding premature optimization. The best practice is to write readable and well-designed code, and then optimize for the performance hotspots in your application.
What everyone else has said about optimization is absolutely true.
There is no reason from a performance point of view to inline the method. If it's a performance issue, the JIT in your JVM will inline it. In Java, method calls are so close to free that it isn't worth thinking about it.
That being said, there's a different issue here. Namely, it is bad programming practice to call an overridable method (i.e., one that is not final, static, or private) from the constructor. (Effective Java, 2nd Ed., p. 89 in the item titled "Design and document for inheritance or else prohibit it")
What happens if someone adds a subclass of BinarySearchTree called LoggingBinarySearchTree that overrides all public methods with code like:
public void clear(){
this.callLog.addCall("clear");
super.clear();
}
Then the LoggingBinarySearchTree will never be constructable! The issue is that this.callLog will be null when the BinarySearchTree constructor is running, but the clear that gets called is the overridden one, and you'll get a NullPointerException.
Note that Java and C++ differ here: in C++, a superclass constructor that calls a virtual method ends up calling the one defined in the superclass, not the overridden one. People switching between the two languages sometimes forget this.
Given that, I think it's probably cleaner in your case to inline the clear method when called from the constructor, but in general in Java you should go ahead and make all the method calls you want.
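Here is a minimal, self-contained sketch of that failure. BaseTree and LoggingTree are simplified stand-ins for BinarySearchTree and the hypothetical LoggingBinarySearchTree, and a plain List<String> replaces the callLog object, whose type was never shown.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for BinarySearchTree: the constructor calls an overridable method.
class BaseTree {
    protected int count;

    BaseTree() {
        clear(); // dispatches to the subclass override, if there is one
    }

    public void clear() {
        count = 0;
    }
}

// Simplified stand-in for the hypothetical LoggingBinarySearchTree.
class LoggingTree extends BaseTree {
    // Field initializers run only after the superclass constructor has returned.
    private final List<String> callLog = new ArrayList<>();

    @Override
    public void clear() {
        callLog.add("clear"); // callLog is still null while BaseTree() is running
        super.clear();
    }
}

class ConstructorOverrideDemo {
    // Returns true if constructing a LoggingTree throws a NullPointerException.
    static boolean constructionFails() {
        try {
            new LoggingTree();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("construction fails: " + constructionFails());
    }
}
```

Making clear() final or private, or having the constructor set the fields directly, avoids the problem.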
I would definitely leave it as is. What if you change the clear() logic? It would be impractical to find all the places where you copied the 2 lines of code.
Generally speaking (and as a beginner this means always!) you should never make micro-optimisations like the one you're considering. Always favour readability of code over things like this.
Why? Because the compiler / hotspot will make these sorts of optimisations for you on the fly, and many, many more. If anything, when you try and make optimisations along these sorts of lines (though not in this case) you'll probably make things slower. Hotspot understands common programming idioms, if you try and do that optimisation yourself it probably won't understand what you're trying to do so it won't be able to optimise it.
There's also a much greater maintenance cost. If you start repeating code then it's going to be much more effort to maintain, which will probably be a lot more hassle than you might think!
As an aside, you may get to some points in your coding life where you do need to make low level optimisations - but if you hit those points, you'll definitely, definitely know when the time comes. And if you don't, you can always go back and optimise later if you need to.
The best practice is to measure twice and cut once.
Once you've wasted time on optimization, you can never get it back again! (So measure it first and ask yourself if it's worth optimising. How much actual time will you save?)
In this case, the Java VM is probably already doing the optimization you are talking about.
The cost of a method call is the creation (and disposal) of a stack frame and some extra byte code expressions if you need to pass values to the method.
The test I apply is whether the method in question would satisfy one of the following:
Would it be helpful to have this method available outside this class?
Would it be helpful to have this method available in other methods?
Would it be frustrating to rewrite this every time I needed it?
Could the versatility of the method be increased with the use of a few parameters?
If any of the above are true, it should be wrapped up in its own method.
Keep the clear() method when it helps readability. Having unmaintainable code is more expensive.
Optimizing compilers usually do a pretty good job of removing the redundancy from these "extra" operations; in many instances, the difference between "optimized" code and code simply written the way you want, and run through an optimizing compiler is none; that is to say, the optimizing compiler usually does just as good a job as you'd do, and it does it without causing any degradation of the source code. In fact, many times, "hand-optimized" code ends up being LESS efficient, because the compiler considers many things when doing the optimization. Leave your code in a readable format, and don't worry about optimization until a later time.
"Premature optimization is the root of all evil." - Donald Knuth
I wouldn't worry about the method call as much as the logic of the method. If it were a critical system that needed to "be fast", then I would look at optimising the code that takes long to execute.
Given the memory of modern computers, this is very inexpensive. It's always better to break your code up into methods so someone can quickly read what's going on. It will also help with narrowing down errors in the code if the error is restricted to a single method with a body of a few lines.
As others have said, the cost of the method call is trivial-to-nada, as the compiler will optimize it for you.
That said, there are dangers in making calls to instance methods from a constructor. You run the risk of later updating the instance method so that it tries to use an instance variable that has not yet been initialized by the constructor. That is, you don't necessarily want to separate out the construction activities from the constructor.
Another question--your clear() method sets the root to EMPTY, which is initialized when the object is created. If you then add nodes to EMPTY, and then call clear(), you won't be resetting the root node. Is this the behavior you want?
Recently I got into a discussion with my team lead about using temp variables vs. calling getter methods. For a long time I was of the opinion that if I knew I was going to have to call a simple getter method quite a number of times, I would put its result into a temp variable and then use that variable instead. I thought that this would be better in terms of both style and performance. However, my lead pointed out that in Java 4 and newer editions this was no longer quite true. He is a believer in using a smaller variable space, so he told me that calling getter methods had a very negligible performance hit compared to using a temp variable, and hence using getters was better. However, I am not totally convinced by his argument. What do you guys think?
Never code for performance, always code for readability. Let the compiler do the work.
They can improve the compiler/runtime to run good code faster and suddenly your "Fast" code is actually slowing the system down.
Java compiler & runtime optimizations seem to address more common/readable code first, so your "Optimized" code is more likely to be de-optimized at a later time than code that was just written cleanly.
Note:
This answer is referring to Java code "Tricks" like the question referenced, not bad programming that might raise the level of loops from an O(N) to an O(N^2). Generally write clean, DRY code and wait for an operation to take noticeably too long before fixing it. You will almost never reach this point unless you are a game designer.
Your lead is correct. In modern versions of the VM, simple getters that return a private field are inlined, meaning the performance overhead of a method call doesn't exist.
Don't forget that by assigning the value of getSomething() to a variable rather than calling it twice, you are assuming that getSomething() would have returned the same thing the second time you called it. Perhaps that's a valid assumption in the scenario you are talking about, but there are times when it isn't.
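A tiny sketch of that pitfall, using a made-up Ticket class whose getter reflects mutable state (nothing here is from the original question):

```java
class StaleCache {
    // Hypothetical class whose getter reflects mutable state.
    static class Ticket {
        private int remaining = 2;

        int getRemaining() {
            return remaining;
        }

        void take() {
            remaining--;
        }
    }

    // Caches the getter result before the state changes; the change goes unseen.
    static int cachedAfterTake() {
        Ticket t = new Ticket();
        int remaining = t.getRemaining();
        t.take();
        return remaining;
    }

    // Calls the getter again after the state changes.
    static int freshAfterTake() {
        Ticket t = new Ticket();
        t.take();
        return t.getRemaining();
    }

    public static void main(String[] args) {
        System.out.println("cached: " + cachedAfterTake() + ", fresh: " + freshAfterTake());
    }
}
```

The cached copy still says 2 after take() has run, while the fresh call says 1, so caching is only safe when nothing can change the value in between.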
It depends. If you would like to make it clear that you use the same value again and again, I'd assign it to a temp variable. I'd do so if the call of the getter is somewhat lengthy, like myCustomObject.getASpecificValue().
You will get far fewer errors in your code if it is readable. So this is the main point.
The performance differences are very small or nonexistent.
If you keep the code evolution in mind, simple getters in v1.0 tend to become not-so-simple getters in v2.0.
The coder who changes a simple getter to not-so-simple getter usually has no clue that there is a function that calls this getter 10 times instead of 1 and never corrects it there, etc.
That's why, from the point of view of the DRY principle, it makes sense to cache the value for repeated use.
I will not sacrifice "Code readability" to some microseconds.
Perhaps it is true that a getter performs better and can save you several microseconds at runtime. But I believe variables can save you several hours or perhaps days when bug-fixing time comes.
Sorry for the non-technical answer.
I think that recent versions of the JVM are often sufficiently clever to cache the result of a function call automatically, if some conditions are met. I think the function must have no side effects and reliably return the same result every time it is called. Note that this may or may not be the case for simple getters, depending on what other code in your class is doing to the field values.
If this is not the case and the called function does significant processing then you would indeed be better off caching its result in a temporary variable. While the overhead of a call may be insignificant, a busy method will eat your lunch if you call it more often than necessary.
I also practice your style; even if not for performance reasons, I find my code more legible when it isn't full of cascades of function calls.
It is not worth it if it is just getFoo(). By caching it in a temp variable you are not making it much faster and may be asking for trouble, because getFoo() may return a different value later. But if it is something like getFoo().getBar().getBaz().getSomething() and you know the value will not change within the block of code, then there may be a reason to use a temp variable for better readability.
A general comment: In any modern system, except for I/O, do not worry about performance issues. Blazing fast CPUs and heaps of memory mean that all other issues are most of the time completely immaterial to the actual performance of your system. [Of course, there are exceptions like caching solutions, but they are few and far between.]
Now coming to this specific problem: yes, the compiler will inline all the getters. Yet even that is not the real consideration; what should really matter is the overall readability and flow of your code. Replacing indirections with a local variable is better if the call is used multiple times; for example, customer.getOrder().getAddress() is better captured in a local variable.
The virtual machine can handle the first four local variables more efficiently than any local variable declared after that (see lload and lload_<n> instructions). So caching the result of the (inlined) getter may actually hurt your performance.
Of course on their own either performance influence is almost negligible so if you want to optimize your code make sure that you are really tackling an actual bottleneck!
Another reason not to use a temporary variable to hold the result of a method call is that by calling the method you always get the most up-to-date value. This might not be a problem with the current code, but it could become a problem when the code is changed.
I am in favour of using a temp variable if you are sure the getter will return the same value throughout the scope, because a getter call with a name of 10 or more characters hurts readability.
I've tested it with some very simple code:
I created a class with a simple getter for an int (I tried both a final and a non-final value for Num and didn't see any difference; mind that this is the case where num never changes as well!):
Num num = new Num(100_000_000);
I compared 2 different for loops:
1: for(int i = 0; i < num.getNumber(); ++i){(...)}
2: int number = num.getNumber();
for(int i = 0; i < number; ++i){(...)}
The results were around 3 millis in the first one and around 2 millis in the second one. So there's a tiny difference: nothing to worry about for small loops, but possibly more problematic with many iterations or if you call getters constantly. For instance, in image processing, if you want to be quick, I would advise against calling getters repeatedly.
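Timings like these vary from run to run, so here is a sketch that counts getter invocations at the source level instead. The Num class below is my guess at what the test class looked like, with a call counter added purely for illustration. Keep in mind that the JIT may inline a trivial getter, so the call count is not a direct measure of run time.

```java
class GetterInLoop {
    // Guess at the Num class from the test, plus a counter for illustration.
    static class Num {
        private final int number;
        int calls = 0;

        Num(int number) {
            this.number = number;
        }

        int getNumber() {
            calls++;
            return number;
        }
    }

    // Loop 1: the getter sits in the loop condition and runs on every check.
    static int callsWithGetterInCondition(int n) {
        Num num = new Num(n);
        long sum = 0;
        for (int i = 0; i < num.getNumber(); ++i) {
            sum += i;
        }
        return num.calls;
    }

    // Loop 2: the getter is called once and its result kept in a local.
    static int callsWithHoistedValue(int n) {
        Num num = new Num(n);
        int number = num.getNumber();
        long sum = 0;
        for (int i = 0; i < number; ++i) {
            sum += i;
        }
        return num.calls;
    }

    public static void main(String[] args) {
        System.out.println("getter in condition: " + callsWithGetterInCondition(1000));
        System.out.println("hoisted into local:  " + callsWithHoistedValue(1000));
    }
}
```

For n iterations the first loop evaluates the getter n + 1 times (once per condition check, including the final failing one), the second exactly once.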
I'm +1 for saving the variable.
1) Readability over performance - your code is not just for you.
2) Performance might be negligible but not all the time. I think it is important to be consistent and set a precedent. So, while it might not matter for one local variable, it could matter in a larger class using the same value multiple times or in the case of looping.
3) Ease of changing implementation / keeping code DRY. For now you get the value from this one place with a getter and theoretically you use the getter 100 times in one class. But in the future, if you want to change where/how you get the value, you have to change it 100 times instead of just once when you save it as an instance variable.