Java Method invocation vs using a variable - java

Recently I got into a discussion with my Team lead about using temp variables vs calling getter methods. I was of the opinion for a long time that, if I know that I was going to have to call a simple getter method quite a number of times, I would put it into a temp variable and then use that variable instead. I thought that this would be a better both in terms of style and performance. However, my lead pointed out that in Java 4 and newer editions, this was not true somewhat. He is a believer of using a smaller variable space, so he told me that calling getter methods had a very negligible performance hit as opposed to using a temp variable, and hence using getters was better. However, I am not totally convinced by his argument. What do you guys think?

Never code for performance, always code for readability. Let the compiler do the work.
They can improve the compiler/runtime to run good code faster and suddenly your "Fast" code is actually slowing the system down.
Java compiler & runtime optimizations seem to address more common/readable code first, so your "Optimized" code is more likely to be de-optimized at a later time than code that was just written cleanly.
Note:
This answer is referring to Java code "Tricks" like the question referenced, not bad programming that might raise the level of loops from an O(N) to an O(N^2). Generally write clean, DRY code and wait for an operation to take noticeably too long before fixing it. You will almost never reach this point unless you are a game designer.

Your lead is correct. In modern versions of the VM, simple getters that return a private field are inlined, meaning the performance overhead of a method call doesn't exist.

Don't forget that by assigning the value of getSomething() to a variable rather than calling it twice, you are assuming that getSomething() would have returned the same thing the second time you called it. Perhaps that's a valid assumption in the scenario you are talking about, but there are times when it isn't.

It depends. If you would like to make it clear that you use the same value again and again, I'd assign it to a temp variable. I'd do so if the call of the getter is somewhat lengthy, like myCustomObject.getASpecificValue().
You will get much fewer errors in your code if it is readable. So this is the main point.
The performance differences are very small or not existent.

If you keep the code evolution in mind, simple getters in v1.0 tend to become not-so-simple getters in v2.0.
The coder who changes a simple getter to not-so-simple getter usually has no clue that there is a function that calls this getter 10 times instead of 1 and never corrects it there, etc.
That's why from the point of view of the DRY principal it makes sense to cache value for repeated use.

I will not sacrifice "Code readability" to some microseconds.
Perhaps it is true that getter performs better and can save you several microseconds in runtime. But i believe, variables can save you several hours or perhaps days when bug fixing time comes.
Sorry for the non-technical answer.

I think that recent versions of the JVM are often sufficiently clever to cache the result of a function call automatically, if some conditions are met. I think the function must have no side effects and reliably return the same result every time it is called. Note that this may or may not be the case for simple getters, depending on what other code in your class is doing to the field values.
If this is not the case and the called function does significant processing then you would indeed be better of caching its result in a temporary variable. While the overhead of a call may be insignificant, a busy method will eat your lunch if you call it more often than necessary.
I also practice your style; even if not for performance reasons, I find my code more legible when it isn't full of cascades of function calls.

It is not worth if it is just getFoo(). By caching it into a temp variable you are not making it much faster and maybe asking for trouble because getFoo() may return different value later. But if it is something like getFoo().getBar().getBaz().getSomething() and you know the value will not be changed within the block of code, then there may be a reason to use temp variable for better readability.

A general comment: In any modern system, except for I/O, do not worry about performance issues. Blazing fast CPUs and heaps of memory mean, all other issues are most of the time completely immaterial to actual performance of your system. [Of course, there are exceptions like caching solutions but they are far and rare.]
Now coming to this specific problem, yes, compiler will inline all the gets. Yet, even that is not the actual consideration, what should really matter is over all readability and flow of your code. Replacing indirections by a local variable is better, if the call used multiple times, like customer.gerOrder().getAddress() is better captured in local variable.

The virtual machine can handle the first four local variables more efficiently than any local variable declared after that (see lload and lload_<n> instructions). So caching the result of the (inlined) getter may actually hurt your performance.
Of course on their own either performance influence is almost negligible so if you want to optimize your code make sure that you are really tackling an actual bottleneck!

Another reason to not use a temporary variable to contain the result of a method call is that using the method you get the most updated value. This could not be a problem with the actual code, but it could become a problem when the code is changed.

I am in favour of using temp variable if you are sure about getter will return same value throughout the scope. Because if you have a variable having name of length 10 or more getter looks bad in readability aspect.

I've tested it in a very simple code :
created a class with a simple getter of an int (I tried both with final and non-final value for Num, didn't see any difference, mind that it's in the case num never change also...!):
Num num = new Num(100_000_000);
compared 2 differents for loops:
1: for(int i = 0; i < num.getNumber(); ++i){(...)}
2: number = num.getNumber();
for(int i = 0; i < number; ++i){(...)}
The result were around 3 millis int the first one and around 2 millis in the second one. So there's a tiny difference, nothing to worry about for small loops, may be more problematic on big iterations or if you always call getter and need them a lot. For instance, in image processing if you want to be quick, don't use repetively getters I would advise...

I'm +1 for saving the variable.
1) Readability over performance - your code is not just for you.
2) Performance might be negligible but not all the time. I think it is important to be consistent and set a precedent. So, while it might not matter for one local variable - it could matter in a larger class using the same value multiples times or in the case of looping.
3) Ease of changing implementation/ avoiding DRY code. For now you get the value from this one place with a getter and theoretically you use the getter 100 times in one class. But in the future - if you want to change where/how you get the value - now you have to change it 100 times instead of just once when you save it as an instance variable.

Related

variable declaration inside/outside loop

If for some reason I have to use as little memory space as possible, then is the second code bellow better that the first? (the code is just for illustration and doesn't have any meaning). (edit: imagine I want to make the assembly code of this before any optimization done by the JVM. Then do i use 99999998 extra memory locations in the first procedure compared to the second one? the focus is just one memory usage)
First:
for(int i=0; i<99999999; i++){
int k=2*i
}
Second:
int k=0;
for(int i=0; i<99999999; i++){
k=2*i
}
What I'm sure of :
In every case, The difference will not be visible. if you want to make such small optimization, Java is surely not the best technology. Which makes me recommend the first one cause it make the code more readable and logical. (Find it strange to declare a variable outside the for if you use it only inside it, it's confusing).
What I think :
In your small example and since your looking for a memory footprint. The first one is better because it follow the implicit rule mentioned in this comment: the smaller the scope is the better it is.
In the first case the variable k is used only in a really small loop. So the optimizer will easily understand it and use only a register, so no memory usage and less instructions.
In the second case, it will be harder for the optimizer to determine that k is not use elsewhere. So it could allow some memory instead of using a register. It will then use some memory and be less optimized since it will need instructions to load and store the memory.
As mentionned in this comment, it will mostly depend on how you use it. In your example the optimizer will detect it's the same usage in both case and will use no memory. But in harder code it will not always find it. So I recommend to have the smaller scope has possible.
#pranay-khandelwal 's response links to an excellent discussion of this question in a different language. The Java JVM, however, throws bytecode around at runtime to try to achieve better performance, which can complicate things.
There's actually another discussion here on a similar topic of in/out best practices in general for readability, which resulted in some benchmarks and discussion that one of the participants documented here
As a general rule of thumb the second option will be better for memory and performance under almost all circumstances - where the former may be more self-documenting and maintainable, and avoid accidental use elsewhere in the outer scope.
Since you mention that the code in your post is only a representative example and this could apply to more than just simple types:
Replacing the contents of an already registered memory area is less costly than registering a new one and deregistering the old one (or waiting for it to be garbage collected). Even where a language's compiler or interpreter smartly uses recently unreferenced things to store new things, that work also takes work, and is overhead that can optimistically be avoided with outer declarations - though as others mention, this is usually an unnecessary and potentially bug-spawning micro-optimization.
As from these short examples you provided, the second option. However, it always depends on the logic of your code.
Thinking about performance and minimising execution space and time, the second code scales better even though it looks countering some good coding practices.
K in your code is used only inside the loop block. However, it is "reused" over multiple "block iterations". Take a look at the syntax of your for loop, it declares i (int i) in the beginning of the statement; this declaration will happen just once. Again, declaring the variable multiple times may lead to waste to time and memory.
The JVM optimiser might do a good job in general, simple cases. However, it might fail in capturing the semantics of your code (Java).
for(int i=0,k=0; i<99999999; i++){
k=2*i
}

Java use getter in for loop or create a local variable? [duplicate]

This question already has answers here:
java how expensive is a method call
(12 answers)
Closed 6 years ago.
I have a for loop which runs 4096 times and it should be as fast as possible. Performance is really important here. Currently I use getter methods inside the loop which just return values or objects from fields which don't change while the loop is in progress.
Example:
for (;;) {
doSomething(example.getValue());
}
Is there any overhead using getters? Is it faster using the following way?
Example:
Object object = example.getValue();
for (;;) {
doSomething(object);
}
If yes, is that also true for accessing public fields like example.value?
Edit: I don't use System.out.println() inside the loop.
Edit: Some fields are not final. No fields are volatile and no method (getter) is synchronized.
As Rogério answered, getting the object reference outside the loop (Object object = example.getValue();) will likely be faster (or will at least never be slower) than calling the getter inside the loop because
in the "worst" case, example.getValue() might actually do some very computationally-expensive stuff in the background despite that getter methods are supposed to be "trivial". By assigning a reference once and re-using it, you do this expensive computation only once.
in the "best" case, example.getValue() does something trivial such as return value; and so assigning it inside the loop would be no more expensive than outside the loop after the JIT compiler inlines the code.
However, more important is the difference in semantics between the two and its possible effects in a multi-threaded environment: If the state of the object example changes in a way which causes example.getValue() to return references to different objects, it is possible that, in each iteration, the method doSomething(Object object) will actually operate on a different instance of Object by directly calling doSomething(example.getValue());. On the other hand, by calling a getter outside the loop and setting a reference to the returned instance (Object object = example.getValue();), doSomething(object); will operate on object n times for n iterations.
This difference in semantics can cause behavior in a multi-threaded environment to be radically different from that in a single-threaded environment. Moreover, this need not be an actual "in-memory" multi-threading issue: If example.getValue() depends on e.g. database/HDD/network resources, it is possible that this data changes during execution of the loop, making it possible that a different object is returned even if the Java application itself is single-threaded. For this reason, it is best to consider what you actually want to accomplish with your loop and to then choose the option which best reflects the intended behavior.
It depends on the getter.
If it's a simple getter, the JIT will in-line it to a direct field access anyway, so there won't be a measurable difference. From a style point of view, use the getter - it's less code.
If the getter is accessing a volatile field, there's an extra memory access hit as the value can't be kept in the register, however the hit is very small.
If the getter is synchronized, then using a local variable will be measurably faster as locks don't need to be obtained and released every call, but the loop code will use the potentially stale value of the field at the time the getter was called.
You should prefer a local variable outside the loop, for the following reasons:
It tends to make the code easier to read/understand, by avoiding nested method calls like doSomething(example.getValue()) in a single line of code, and by allowing the code to give a better, more specific, name to the value returned by the getter method.
Not all getter methods are trivial (ie, they sometimes do some potentially expensive work), but developers often don't notice it, assuming a given method is trivial and inexpensive when it really isn't. In such cases, the code may take a significant performance hit without the developer realizing it. Extraction into a local variable tends to avoid this issue.
It's very easy to worry about performance much more than is necessary. I know the feeling. Some things to consider:
4096 is not much, so unless this has to complete in an extremely short time don't worry about performance so much.
If there is anything else remotely expensive going on in this loop, the getter won't matter.
Premature optimisation is the root of all evil. Focus on making your code correct and clear first. Then measure and profile it and narrow down the most expensive thing, and take care of that. Improve the actual algorithm if possible.
Regarding your question, I don't know exactly what the JIT does, but unless it can prove with certainty that example.getValue() or example.value doesn't change in the loop (which is hard to do unless the field is final and the getter is trivial) then there is logically no way it can avoid calling the getter repeatedly in the former sample since that would risk changing the behaviour of the program. The repeated calls are certainly some nonzero amount of extra work.
Having said all that, create the local variable outside the loop, whether or not it's faster, because it's clearer. Maybe that surprises you, but good code is not always the shortest. Expressing intent and other information is extremely important. In this case the local variable outside the loop makes it obvious to anyone reading the code that the argument to doSomething doesn't change (especially if you make it final) which is useful to know. Otherwise they might have to do some extra digging to make sure they know how the program behaves.
If you need to run it as fast as possible, you should not use System.out.println in critical sections.
Concerning getter: There is slight overhead for using getter, but you should not bother about it. Java does have getter and setter optimization in JIT compiler. So eventually they will be replaced by native code.

Java performance vs. code-style: Making multiple method calls from the same line of code

I am curious whether packing multiple and/or nested method calls within the same line of code is better for performance and that is why some developers do it, at the cost of making their code less readable.
E.g.
//like
Set<String> jobParamKeySet = jobParams.keySet();
Iterator<String> jobParamItrtr = jobParamKeySet.iterator();
Could be also written as
//dislike
Iterator<String> jobParamItrtr = jobParams.keySet().iterator();
Personally, I hate the latter because it does multiple evaluations in the same line and is hard for me to read the code. That is why I try to avoid by all means to have more than one evaluation per line of code. I also don't know that jobParams.keySet() returns a Set and that bugs me.
Another example would be:
//dislike
Bar.processParameter(Foo.getParameter());
vs
//like
Parameter param = Foo.getParameter();
Bar.processParameter(param);
The former makes me noxious and dizzy as I like to consume simple and clean evaluations in every line of code and I just hate it when I see other people's code written like that.
But are there any (performance) benefits to packing multiple method calls in the same line?
EDIT: Single liners are also more difficult to debug, thanks to #stemm for reminding
Micro optimization is killer. If the code references you are showing are either instance scope (or) method scope, I would go with second approach.
Method scope variables will be eligible for GC as soon as method execution done, so even you declare another variable, it's ok because scope is limited and the advantage you get will be readable and main-table code.
I tend to disagree with most others on this list. I actually find the first way cleaner and easier to read.
In your example:
//like
Set<String> jobParamKeySet = jobParams.keySet();
Iterator<String> jobParamItrtr = jobParamKeySet.iterator();
Could be also written as
//dislike
Iterator<String> jobParamItrtr = jobParams.keySet().iterator();
the first method (the one you like) has a lot of irrelevant information. The whole point of the iterator interface, for example, is to give you a standard interface that you can use to loop over whatever backing implementation there is. So the fact that it is a keyset has no bearing on the code itself. All you are looking for is the iterator to loop over the implemented object.
Secondly, the second implementation actually gives you more information. It tells you that the code will be ignoring the implementation of jobParams and that it will only be looping through the keys. In the first code, you must first trace back what jobParamKeySet is (as a variable) to figure out what you are iterating over. Additionally, you do not know if/where jobParamKeySet is used elsewhere in the scope.
Finally, as a last comment, the second way makes it easier to switch implementations if necessary; in the first case, you might need to recode two lines (the first variable assignment if it changes from a set to something else), whereas the second case you only need to change out one line.
That being said, there are limits to everything. Chaining 10 calls within a single line can be complicated to read and debug. However 3 or 4 levels is usually clear. Sometimes, especially if an intermediary variable is required several times, it makes more sense to declare it explicitly.
In your second example:
//dislike
Bar.processParameter(Foo.getParameter());
vs
//like
Parameter param = Foo.getParameter();
Bar.processParameter(param);
I find it actually more difficult to understand exactly which parameters are being processed by Bar.processParameter(param). It will take me longer to match param to the variable instantiation to see that it is Foo.getParameter(). Whereas the first case, the information is very clear and presented very well - you are processing Foo.getParameter() params. Personally, I find the first method is less prone to error as well - it is unlikely that you accidentally use Foo2.getParamter() when it is within the same call as opposed to a separate line.
There is one less variable assignment, but even the compiler can optimize it in some cases.
I would not do it for performance, it is kind of an early optimization. Write the code that is easier to maintain.
In my case, I find:
Iterator<String> jobParamItrtr = jobParams.keySet().iterator();
easier to be read than:
Set<String> jobParamKeySet = jobParams.keySet();
Iterator<String> jobParamItrtr = jobParamKeySet.iterator();
But I guess it is a matter of personal taste.
Code is never developed by same user. I would choose second way. Also it is easier to understand and maintain.
Also This is beneficial when two different teams are working on the code at different locations.
Many times we take an hour or more time to understand what other developer has done, if he uses first option. Personally I had this situation many times.
But are there any (performance) benefits to packing multiple method calls in the same line?
I seriously doubt the difference is measurable but even if there were I would consider
is hard for me to read the code.
to be so much more important it cannot be over stated.
Even if the it were half the speed, I would still write the simplest, cleanest and easiest to understand code and only when you have profiled the application and identified that you have an issue would I consider optimising it.
BTW: I prefer the more dense, chained code, but I would suggest you use what you prefer.
The omission of an extra local variable probably has a neglible performance advantage (although the JIT may be able to optimize this).
Personally I don't mind call chaining when its pretty clear whats done and the intermediate object is very unlikely to be null (like your first 'dislike'-example). When it gets complex (multiple .'s in the expression), I prefer explicit local variables, because its so much simpler to debug.
So I decide case by case what I prefer :)
I don't see where a().b().c().d is that much harder to read than a.b.c.d which people don't seem to mind too much. (Though I would break it up.)
If you don't like that it's all on one line, you could say
a()
.b()
.c()
.d
(I don't like that either.)
I prefer to break it up, using a couple extra variables.
It makes it easier to debug.
If performance is your concern (as it should be), the first thing to understand is not to sweat the small stuff.
If adding extra local variables costs anything at all, the rest of the code has to be rippin' fat-free before it even begins to matter.

Is there any difference between these statements?

Is there any difference between:
String x = getString();
doSomething(x);
vs.
doSomething(getString());
Resources and performance wise, Especially is it's done within a loop for tens, hundreds or thousands of times?
It has the same overhead. Local variables are just there to make your life easier. At the VM level they don't necessarily exist and certainly not anymore when machine code is run.
So what you need to worry about here is getString(), whether it is potentially expensive. x has very likely no effect at all.
Let me first begin by saying that your overriding goal should almost always be to maintain code readability. Your compiler is almost always better at trivial optimizations than you are. Trust it!
In response to your specific example: the bytecode generated for each example IS different. It didn't appear to make much difference though, because there wasn't a statistically significant or even consistent difference between the two approaches in a loop over Integer.MAX_VALUE iterations.
I believe both would be the same at compile time, the first may become more code readable in some cases though.
Both the statements are same. only difference is that in first approach you have used a local variable X which can be avoided using second syntax.
That largely depends on the use-case. Are you going to make repeated calls to doSomething using that exact String? Then using the local variable is a bit more efficient. However, if it's a single call or multiple calls with different Strings, it makes no difference.

java how expensive is a method call

I'm a beginner and I've always read that it's bad to repeat code. However, it seems that in order to not do so, you would have to have extra method calls usually. Let's say I have the following class
public class BinarySearchTree<E extends Comparable<E>>{
private BinaryTree<E> root;
private final BinaryTree<E> EMPTY = new BinaryTree<E>();
private int count;
private Comparator<E> ordering;
public BinarySearchTree(Comparator<E> order){
ordering = order;
clear();
}
public void clear(){
root = EMPTY;
count = 0;
}
}
Would it be more optimal for me to just copy and paste the two lines in my clear() method into the constructor instead of calling the actual method? If so how much of a difference does it make? What if my constructor made 10 method calls with each one simply setting an instance variable to a value? What's the best programming practice?
Would it be more optimal for me to just copy and paste the two lines in my clear() method into the constructor instead of calling the actual method?
The compiler can perform that optimization. And so can the JVM. The terminology used by compiler writer and JVM authors is "inline expansion".
If so how much of a difference does it make?
Measure it. Often, you'll find that it makes no difference. And if you believe that this is a performance hotspot, you're looking in the wrong place; that's why you'll need to measure it.
What if my constructor made 10 method calls with each one simply setting an instance variable to a value?
Again, that depends on the generated bytecode and any runtime optimizations performed by the Java Virtual machine. If the compiler/JVM can inline the method calls, it will perform the optimization to avoid the overhead of creating new stack frames at runtime.
What's the best programming practice?
Avoiding premature optimization. The best practice is to write readable and well-designed code, and then optimize for the performance hotspots in your application.
What everyone else has said about optimization is absolutely true.
There is no reason from a performance point of view to inline the method. If it's a performance issue, the JIT in your JVM will inline it. In java, method calls are so close to free that it isn't worth thinking about it.
That being said, there's a different issue here. Namely, it is bad programming practice to call an overrideable method (i.e., one that is not final, static, or private) from the constructor. (Effective Java, 2nd Ed., p. 89 in the item titled "Design and document for inheritance or else prohibit it")
What happens if someone adds a subclass of BinarySearchTree called LoggingBinarySearchTree that overrides all public methods with code like:
public void clear(){
this.callLog.addCall("clear");
super.clear();
}
Then the LoggingBinarySearchTree will never be constructable! The issue is that this.callLog will be null when the BinarySearchTree constructor is running, but the clear that gets called is the overridden one, and you'll get a NullPointerException.
Note that Java and C++ differ here: in C++, a superclass constructor that calls a virtual method ends up calling the one defined in the superclass, not the overridden one. People switching between the two languages sometimes forget this.
Given that, I think it's probably cleaner in your case to inline the clear method when called from the constructor, but in general in Java you should go ahead and make all the method calls you want.
I would definitely leave it as is. What if you change the clear() logic? It would be impractical to find all the places where you copied the 2 lines of code.
Generally speaking (and as a beginner this means always!) you should never make micro-optimisations like the one you're considering. Always favour readability of code over things like this.
Why? Because the compiler / hotspot will make these sorts of optimisations for you on the fly, and many, many more. If anything, when you try and make optimisations along these sorts of lines (though not in this case) you'll probably make things slower. Hotspot understands common programming idioms, if you try and do that optimisation yourself it probably won't understand what you're trying to do so it won't be able to optimise it.
There's also a much greater maintenance cost. If you start repeating code then it's going to be much more effort to maintain, which will probably be a lot more hassle than you might think!
As an aside, you may get to some points in your coding life where you do need to make low level optimisations - but if you hit those points, you'll definitely, definitely know when the time comes. And if you don't, you can always go back and optimise later if you need to.
The best practice is to measure twice and cut once.
Once you've wasted time optimization, you can never get it back again! (So measure it first and ask yourself if it's worth optimisation. How much actual time will you save?)
In this case, the Java VM is probably already doing the optimization you are talking about.
The cost of a method call is the creation (and disposal) of a stack frame and some extra byte code expressions if you need to pass values to the method.
The pattern that I follow, is whether or not this method in question would satisfy one of the following:
Would it be helpful to have this method available outside this class?
Would it be helpful to have this method available in other methods?
Would it be frustrating to rewrite this every time i needed it?
Could the versatility of the method be increased with the use of a few parameters?
If any of the above are true, it should be wrapped up in it's own method.
Keep the clear() method when it helps readability. Having unmaintainable code is more expensive.
Optimizing compilers usually do a pretty good job of removing the redundancy from these "extra" operations; in many instances, the difference between "optimized" code and code simply written the way you want, and run through an optimizing compiler is none; that is to say, the optimizing compiler usually does just as good a job as you'd do, and it does it without causing any degradation of the source code. In fact, many times, "hand-optimized" code ends up being LESS efficient, because the compiler considers many things when doing the optimization. Leave your code in a readable format, and don't worry about optimization until a later time.
"Premature optimization is the root of
all evil." - Donald Knuth
I wouldn't worry about method call as much but the logic of the method. If it was critical systems, and the system needed to "be fast" then, I would look at optimising codes that takes long to execute.
Given the memory of modern computers this is very inexpensive. Its always better to break your code up into methods so someone can quickly read whats going on. It will also help with narrowing down errors in the code if the error is restricted to a single method with a body of a few lines.
As others have said, the cost of the method call is trivial-to-nada, as the compiler will optimize it for you.
That said, there are dangers in making method calls to instance methods from a constructor. You run the risk of later updating the instance method so that it may try to use an instance variable that has not been initiated yet by the constructor. That is, you don't necessarily want to separate out the construction activities from the constructor.
Another question--your clear() method sets the root to EMPTY, which is initialized when the object is created. If you then add nodes to EMPTY, and then call clear(), you won't be resetting the root node. Is this the behavior you want?

Categories