variable declaration inside/outside loop - java

If for some reason I have to use as little memory as possible, is the second snippet below better than the first? (The code is just for illustration and doesn't have any meaning.) (Edit: imagine I want to produce the assembly code for this before any optimization is done by the JVM. Do I then use 99999998 extra memory locations in the first procedure compared to the second one? The focus is just on memory usage.)
First:
for(int i=0; i<99999999; i++){
    int k=2*i;
}
Second:
int k=0;
for(int i=0; i<99999999; i++){
    k=2*i;
}

What I'm sure of:
In either case, the difference will not be visible. If you want to make such small optimizations, Java is surely not the best technology. That makes me recommend the first one, because it makes the code more readable and logical. (I find it strange to declare a variable outside the for loop if you use it only inside; it's confusing.)
What I think:
In your small example, and since you are looking at the memory footprint, the first one is better because it follows the implicit rule mentioned in this comment: the smaller the scope, the better.
In the first case, the variable k is used only in a really small loop, so the optimizer will easily understand it and use only a register: no memory usage and fewer instructions.
In the second case, it will be harder for the optimizer to determine that k is not used elsewhere, so it could allocate some memory instead of using a register. It would then use some memory and be less optimized, since it needs instructions to load from and store to memory.
As mentioned in this comment, it will mostly depend on how you use it. In your example, the optimizer will detect that the usage is the same in both cases and will use no memory. But in more complex code it will not always find this, so I recommend keeping the scope as small as possible.
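To make the scope point concrete, here is a minimal sketch of the two variants from the question. The class and method names are made up for illustration, and the running sum is only there so the loops have an observable result. As far as I understand the JVM specification, a method's local variables live in a stack frame whose size is fixed when the method is compiled, so declaring k inside the loop body should not reserve a new memory location per iteration.

```java
// Hypothetical demo class; the names are illustrative, not from the question.
class LoopScopeDemo {

    // Variant 1: k is declared inside the loop body.
    static long sumInside(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            int k = 2 * i;   // same local slot on every iteration
            sum += k;
        }
        return sum;
    }

    // Variant 2: k is declared once, outside the loop.
    static long sumOutside(int n) {
        long sum = 0;
        int k = 0;
        for (int i = 0; i < n; i++) {
            k = 2 * i;
            sum += k;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Both variants compute the same value.
        System.out.println(sumInside(1_000) == sumOutside(1_000));
    }
}
```

If you want to check this yourself, running javap -c on the compiled class lets you compare the bytecode of the two methods.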

@pranay-khandelwal's response links to an excellent discussion of this question in a different language. The JVM, however, throws bytecode around at runtime to try to achieve better performance, which can complicate things.
There's actually another discussion here on a similar topic of in/out best practices in general for readability, which resulted in some benchmarks and discussion that one of the participants documented here
As a general rule of thumb the second option will be better for memory and performance under almost all circumstances - where the former may be more self-documenting and maintainable, and avoid accidental use elsewhere in the outer scope.
Since you mention that the code in your post is only a representative example and this could apply to more than just simple types:
Replacing the contents of an already registered memory area is less costly than registering a new one and deregistering the old one (or waiting for it to be garbage collected). Even where a language's compiler or interpreter smartly uses recently unreferenced things to store new things, that work also takes work, and is overhead that can optimistically be avoided with outer declarations - though as others mention, this is usually an unnecessary and potentially bug-spawning micro-optimization.
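For object types, the idea of reusing an already allocated area instead of allocating a new one each time could look like the sketch below. It uses StringBuilder purely as an example of a reusable object; the class and method names are hypothetical, and whether the reuse is actually faster depends on the JVM and the workload.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names are illustrative only.
class BufferReuseDemo {

    // Inner declaration: a fresh StringBuilder is allocated on every iteration.
    static List<String> labelsFresh(int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            StringBuilder sb = new StringBuilder();
            sb.append("item-").append(i);
            out.add(sb.toString());
        }
        return out;
    }

    // Outer declaration: one StringBuilder is reused and cleared before each use.
    static List<String> labelsReused(int n) {
        List<String> out = new ArrayList<>();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.setLength(0);              // discard the previous contents
            sb.append("item-").append(i);
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(labelsFresh(3).equals(labelsReused(3)));
    }
}
```

The reused variant is also where the "bug-spawning" risk comes from: forgetting the setLength(0) call would leak the previous iteration's contents into the next one.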

Judging from the short examples you provided: the second option. However, it always depends on the logic of your code.
In terms of performance and minimising execution space and time, the second snippet scales better, even though it appears to go against some good coding practices.
k in your code is used only inside the loop block. However, it is "reused" across multiple "block iterations". Look at the syntax of your for loop: it declares i (int i) at the beginning of the statement, and this declaration happens just once. Again, declaring the variable multiple times may waste time and memory.
The JVM optimiser might do a good job in general, simple cases. However, it might fail to capture the semantics of your code (Java).
for(int i=0,k=0; i<99999999; i++){
    k=2*i;
}

Related

Should Guava Splitters/Joiners be created each time they are used?

Guava contains utilities for splitting and joining Strings, but it requires the instantiation of a Splitter/Joiner object to do so. These are small objects that typically only contain the character(s) on which to split/join. Is it a good idea to maintain references to these objects in order to reuse them, or is it preferable to just create them whenever you need them and let them be garbage collected?
For example, I could implement this method in the following two ways:
String joinLines(List<String> lines) {
return Joiner.on("\n").join(lines);
}
OR
static final Joiner LINE_JOINER = Joiner.on("\n");
String joinLines(List<String> lines) {
return LINE_JOINER.join(lines);
}
I find the first way more readable, but it seems wasteful to create a new Joiner each time the method is called.
To be honest, this sounds like premature optimization to me. I agree with @Andy Turner: write whatever is easiest to understand and maintain.
If you plan to use Joiner.on("\n") in a few places, make it a well-named constant; go with option two.
If you only plan to use it in your joinLines method, a constant seems overly verbose; go with option one.
It depends greatly on how often you expect the code to be called and what tradeoffs you want to make between CPU time, memory consumption and readability. Since Joiner is such a small thing, it's not going to make a huge difference either way: if you make it a constant, you save the (fairly minimal) costs of allocating it and GCing it for each call, while adding the (fairly minimal) memory consumption overhead to the program.
It also depends in part on what platform you're running the code on: if you're running on the server, typically you'll have plenty of memory so keeping a constant won't be an issue. On the other hand, if you're running on Android you're more memory constrained, but you also want to avoid unnecessary allocations since garbage collection is going to be worse and more impactful to your performance.
Personally, I tend to allocate a constant unless I know it's only going to be used some fixed number of times as opposed to repeatedly throughout the program.
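The trade-off can be sketched without Guava on the classpath. SimpleJoiner below is a hypothetical stand-in for a small immutable helper like Joiner, not Guava's actual class; sharing one instance in a constant is only safe here because the stand-in has no mutable state.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a small immutable helper such as Guava's Joiner.
final class SimpleJoiner {
    private final String separator;

    private SimpleJoiner(String separator) {
        this.separator = separator;
    }

    static SimpleJoiner on(String separator) {
        return new SimpleJoiner(separator);
    }

    String join(List<String> parts) {
        return String.join(separator, parts);
    }
}

class JoinerReuseDemo {
    // Option two: one shared instance, allocated once when the class is loaded.
    private static final SimpleJoiner LINE_JOINER = SimpleJoiner.on("\n");

    // Option one: a short-lived instance is created on each call.
    static String joinLinesFresh(List<String> lines) {
        return SimpleJoiner.on("\n").join(lines);
    }

    static String joinLinesShared(List<String> lines) {
        return LINE_JOINER.join(lines);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("first", "second");
        System.out.println(joinLinesFresh(lines).equals(joinLinesShared(lines)));
    }
}
```

Both methods return the same string; the only difference is when the helper object is allocated and how long it stays reachable.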

Java performance vs. code-style: Making multiple method calls from the same line of code

I am curious whether packing multiple and/or nested method calls within the same line of code is better for performance and that is why some developers do it, at the cost of making their code less readable.
E.g.
//like
Set<String> jobParamKeySet = jobParams.keySet();
Iterator<String> jobParamItrtr = jobParamKeySet.iterator();
Could be also written as
//dislike
Iterator<String> jobParamItrtr = jobParams.keySet().iterator();
Personally, I hate the latter because it does multiple evaluations in the same line and is hard for me to read the code. That is why I try to avoid by all means to have more than one evaluation per line of code. I also don't know that jobParams.keySet() returns a Set and that bugs me.
Another example would be:
//dislike
Bar.processParameter(Foo.getParameter());
vs
//like
Parameter param = Foo.getParameter();
Bar.processParameter(param);
The former makes me nauseous and dizzy, as I like to consume simple and clean evaluations in every line of code, and I just hate it when I see other people's code written like that.
But are there any (performance) benefits to packing multiple method calls in the same line?
EDIT: Single-liners are also more difficult to debug; thanks to @stemm for the reminder.
Micro-optimization is a killer. If the code references you are showing are either instance scope or method scope, I would go with the second approach.
Method-scope variables become eligible for GC as soon as the method finishes executing, so even if you declare another variable, it's fine: the scope is limited, and the advantage you get is readable and maintainable code.
I tend to disagree with most others on this list. I actually find the chained, single-line form cleaner and easier to read.
In your example:
//like
Set<String> jobParamKeySet = jobParams.keySet();
Iterator<String> jobParamItrtr = jobParamKeySet.iterator();
Could be also written as
//dislike
Iterator<String> jobParamItrtr = jobParams.keySet().iterator();
the first method (the one you like) has a lot of irrelevant information. The whole point of the iterator interface, for example, is to give you a standard interface that you can use to loop over whatever backing implementation there is. So the fact that it is a keyset has no bearing on the code itself. All you are looking for is the iterator to loop over the implemented object.
Secondly, the second implementation actually gives you more information. It tells you that the code will be ignoring the implementation of jobParams and that it will only be looping through the keys. In the first code, you must first trace back what jobParamKeySet is (as a variable) to figure out what you are iterating over. Additionally, you do not know if/where jobParamKeySet is used elsewhere in the scope.
Finally, as a last comment, the second way makes it easier to switch implementations if necessary; in the first case, you might need to recode two lines (the first variable assignment if it changes from a set to something else), whereas the second case you only need to change out one line.
That being said, there are limits to everything. Chaining 10 calls within a single line can be complicated to read and debug. However 3 or 4 levels is usually clear. Sometimes, especially if an intermediary variable is required several times, it makes more sense to declare it explicitly.
In your second example:
//dislike
Bar.processParameter(Foo.getParameter());
vs
//like
Parameter param = Foo.getParameter();
Bar.processParameter(param);
I actually find it more difficult to understand exactly which parameters are being processed by Bar.processParameter(param). It takes me longer to match param to the variable declaration to see that it is Foo.getParameter(). In the first case, by contrast, the information is very clear and well presented: you are processing the Foo.getParameter() params. Personally, I find the first method less prone to error as well; you are unlikely to accidentally use Foo2.getParameter() when it is within the same call as opposed to on a separate line.
There is one less variable assignment, but even the compiler can optimize it in some cases.
I would not do it for performance, it is kind of an early optimization. Write the code that is easier to maintain.
In my case, I find:
Iterator<String> jobParamItrtr = jobParams.keySet().iterator();
easier to read than:
Set<String> jobParamKeySet = jobParams.keySet();
Iterator<String> jobParamItrtr = jobParamKeySet.iterator();
But I guess it is a matter of personal taste.
Code is never developed by a single person. I would choose the second way; it is also easier to understand and maintain.
It is also beneficial when two different teams are working on the code at different locations.
We often spend an hour or more working out what another developer has done if they used the first option. Personally, I have been in this situation many times.
"But are there any (performance) benefits to packing multiple method calls in the same line?"
I seriously doubt the difference is measurable, but even if it were, I would consider
"is hard for me to read the code"
to be so much more important that it cannot be overstated.
Even if it were half the speed, I would still write the simplest, cleanest and easiest-to-understand code, and only once you have profiled the application and identified that you have an issue would I consider optimising it.
BTW: I prefer the more dense, chained code, but I would suggest you use what you prefer.
Omitting an extra local variable probably has a negligible performance advantage (although the JIT may be able to optimize this).
Personally, I don't mind call chaining when it's pretty clear what's being done and the intermediate object is very unlikely to be null (like your first 'dislike' example). When it gets complex (multiple .'s in the expression), I prefer explicit local variables, because it's so much simpler to debug.
So I decide case by case what I prefer :)
I don't see where a().b().c().d is that much harder to read than a.b.c.d which people don't seem to mind too much. (Though I would break it up.)
If you don't like that it's all on one line, you could say
a()
.b()
.c()
.d
(I don't like that either.)
I prefer to break it up, using a couple extra variables.
It makes it easier to debug.
If performance is your concern (as it should be), the first thing to understand is not to sweat the small stuff.
If adding extra local variables costs anything at all, the rest of the code has to be rippin' fat-free before it even begins to matter.
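For reference, the two styles from the question can be put side by side as below. This is only a sketch: the type of jobParams is not given in the question, so Map<String, String> is an assumption, and the class and method names are made up.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class; jobParams is assumed to be a Map<String, String>.
class ChainingDemo {

    // One evaluation per line, with an explicitly typed intermediate variable.
    static String firstKeyTwoSteps(Map<String, String> jobParams) {
        Set<String> jobParamKeySet = jobParams.keySet();
        Iterator<String> jobParamItrtr = jobParamKeySet.iterator();
        return jobParamItrtr.hasNext() ? jobParamItrtr.next() : null;
    }

    // Chained form: same calls, no named intermediate.
    static String firstKeyChained(Map<String, String> jobParams) {
        Iterator<String> jobParamItrtr = jobParams.keySet().iterator();
        return jobParamItrtr.hasNext() ? jobParamItrtr.next() : null;
    }

    public static void main(String[] args) {
        Map<String, String> jobParams = new LinkedHashMap<>();
        jobParams.put("input", "a.csv");
        jobParams.put("output", "b.csv");
        System.out.println(firstKeyTwoSteps(jobParams).equals(firstKeyChained(jobParams)));
    }
}
```

Both methods make exactly the same method calls in the same order; the intermediate variable only gives the key set a name, which is what makes it easier to inspect in a debugger.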

Is there any difference between these statements?

Is there any difference between:
String x = getString();
doSomething(x);
vs.
doSomething(getString());
Resource- and performance-wise, especially if it's done within a loop tens, hundreds or thousands of times?
It has the same overhead. Local variables are just there to make your life easier. At the VM level they don't necessarily exist and certainly not anymore when machine code is run.
So what you need to worry about here is getString(), whether it is potentially expensive. x has very likely no effect at all.
Let me first begin by saying that your overriding goal should almost always be to maintain code readability. Your compiler is almost always better at trivial optimizations than you are. Trust it!
In response to your specific example: the bytecode generated for each example IS different. It didn't appear to make much difference though, because there wasn't a statistically significant or even consistent difference between the two approaches in a loop over Integer.MAX_VALUE iterations.
I believe both would be the same at compile time, though the first may be more readable in some cases.
Both statements are the same. The only difference is that the first approach uses a local variable x, which can be avoided with the second syntax.
That largely depends on the use-case. Are you going to make repeated calls to doSomething using that exact String? Then using the local variable is a bit more efficient. However, if it's a single call or multiple calls with different Strings, it makes no difference.
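The repeated-call case can be made visible with a call counter. In this sketch, getString() and doSomething() are hypothetical stand-ins for the methods in the question (doSomething returns a value here only so there is something to check).

```java
// Hypothetical demo class; method bodies are stand-ins for the question's methods.
class RepeatedCallDemo {
    static int calls = 0;

    static String getString() {
        calls++;                 // count how often the method is invoked
        return "value";
    }

    static int doSomething(String s) {
        return s.length();
    }

    // The local variable lets two uses share one call to getString().
    static int withLocal() {
        String x = getString();
        return doSomething(x) + doSomething(x);
    }

    // Without the local, each use calls getString() again.
    static int withoutLocal() {
        return doSomething(getString()) + doSomething(getString());
    }

    public static void main(String[] args) {
        withLocal();
        System.out.println("calls with local: " + calls);
        calls = 0;
        withoutLocal();
        System.out.println("calls without local: " + calls);
    }
}
```

So the cost that matters is the number of getString() calls, not the existence of x. With a single use, both forms call it exactly once.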

does a Java getter incur a performance penalty

If I have the code
int getA(){
return a;
}
and then do something like
int b = obj.getA();
instead of
int b = obj.a;
will that mean that the stack will have to be pushed and popped ultimately slowing down my code?
The JIT compiler will inline the method.
The code should look like
int b = obj.getA();
I have two answers for you:
I don't think that there is a significant performance penalty for using the getter vs accessing the variable directly. I would worry more about how understandable and readable the code is than performance for this sort of decision.
According to OO design principles, which may or may not be important to you, you would normally hide the data and provide the getter method to access it—there is a detailed discussion on the merits of this approach here.
Theoretically there is some runtime penalty, due to a method call being made. In reality, this has very little effect on the overall performance due to two reasons:
Unless the obj.getA() call takes place inside the innermost loop of your program, its effect on the overall performance of your code will be negligible. When performance is an issue, you should look at the bottlenecks of your code. There's no point in optimizing code that is not at these hot spots. To identify these spots, you need to analyze the execution of your code with a profiler.
As @Michael was saying, the JVM uses a "Just In Time" compiler/optimizer that inlines code based on actual execution. It performs just this kind of optimization (see this talk).

Java Method invocation vs using a variable

Recently I got into a discussion with my team lead about using temp variables vs. calling getter methods. For a long time I was of the opinion that, if I knew I was going to have to call a simple getter method quite a number of times, I would put the result into a temp variable and then use that variable instead. I thought this would be better both in terms of style and performance. However, my lead pointed out that in Java 4 and newer versions this was no longer quite true. He is a believer in using a smaller variable space, so he told me that calling getter methods had a negligible performance hit compared to using a temp variable, and hence using getters was better. However, I am not totally convinced by his argument. What do you guys think?
Never code for performance, always code for readability. Let the compiler do the work.
They can improve the compiler/runtime to run good code faster and suddenly your "Fast" code is actually slowing the system down.
Java compiler & runtime optimizations seem to address more common/readable code first, so your "Optimized" code is more likely to be de-optimized at a later time than code that was just written cleanly.
Note:
This answer is referring to Java code "Tricks" like the question referenced, not bad programming that might raise the level of loops from an O(N) to an O(N^2). Generally write clean, DRY code and wait for an operation to take noticeably too long before fixing it. You will almost never reach this point unless you are a game designer.
Your lead is correct. In modern versions of the VM, simple getters that return a private field are inlined, meaning the performance overhead of a method call doesn't exist.
Don't forget that by assigning the value of getSomething() to a variable rather than calling it twice, you are assuming that getSomething() would have returned the same thing the second time you called it. Perhaps that's a valid assumption in the scenario you are talking about, but there are times when it isn't.
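A small sketch of that assumption breaking down: the Counter class below is hypothetical and exists only to show a getter whose result changes between two reads.

```java
// Hypothetical class with a simple getter whose value can change over time.
class Counter {
    private int value;

    int getValue() {
        return value;
    }

    void increment() {
        value++;
    }
}

class StaleValueDemo {
    // Returns { cached value, fresh value } after the counter has changed.
    static int[] readTwice() {
        Counter counter = new Counter();
        int cached = counter.getValue();   // snapshot taken before the change
        counter.increment();
        int fresh = counter.getValue();    // reflects the change
        return new int[] { cached, fresh };
    }

    public static void main(String[] args) {
        int[] result = readTwice();
        System.out.println("cached=" + result[0] + ", fresh=" + result[1]);
    }
}
```

Whether the snapshot or the fresh value is the right one depends on the code; the point is only that the two forms are not always interchangeable.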
It depends. If you would like to make it clear that you use the same value again and again, I'd assign it to a temp variable. I'd do so if the call of the getter is somewhat lengthy, like myCustomObject.getASpecificValue().
You will get much fewer errors in your code if it is readable. So this is the main point.
The performance differences are very small or not existent.
If you keep the code evolution in mind, simple getters in v1.0 tend to become not-so-simple getters in v2.0.
The coder who changes a simple getter to not-so-simple getter usually has no clue that there is a function that calls this getter 10 times instead of 1 and never corrects it there, etc.
That's why, from the point of view of the DRY principle, it makes sense to cache the value for repeated use.
I will not sacrifice "Code readability" to some microseconds.
Perhaps it is true that the getter performs better and can save you several microseconds at runtime. But I believe variables can save you several hours, or perhaps days, when bug-fixing time comes.
Sorry for the non-technical answer.
I think that recent versions of the JVM are often sufficiently clever to cache the result of a function call automatically, if some conditions are met. I think the function must have no side effects and reliably return the same result every time it is called. Note that this may or may not be the case for simple getters, depending on what other code in your class is doing to the field values.
If this is not the case and the called function does significant processing, then you would indeed be better off caching its result in a temporary variable. While the overhead of a call may be insignificant, a busy method will eat your lunch if you call it more often than necessary.
I also practice your style; even if not for performance reasons, I find my code more legible when it isn't full of cascades of function calls.
It is not worth it if it is just getFoo(). By caching it in a temp variable you are not making it much faster, and you may be asking for trouble because getFoo() may return a different value later. But if it is something like getFoo().getBar().getBaz().getSomething() and you know the value will not change within the block of code, then there may be a reason to use a temp variable for better readability.
A general comment: in any modern system, except for I/O, do not worry about performance issues. Blazing-fast CPUs and heaps of memory mean that all other issues are, most of the time, completely immaterial to the actual performance of your system. [Of course, there are exceptions, like caching solutions, but they are few and far between.]
Now, coming to this specific problem: yes, the compiler will inline all the getters. Yet even that is not the actual consideration; what should really matter is the overall readability and flow of your code. Replacing indirections with a local variable is better if the call is used multiple times; for example, customer.getOrder().getAddress() is better captured in a local variable.
The virtual machine can handle the first four local variables more efficiently than any local variable declared after that (see lload and lload_<n> instructions). So caching the result of the (inlined) getter may actually hurt your performance.
Of course on their own either performance influence is almost negligible so if you want to optimize your code make sure that you are really tackling an actual bottleneck!
Another reason not to use a temporary variable to hold the result of a method call is that by calling the method you always get the most up-to-date value. This may not be a problem with the current code, but it could become one when the code is changed.
I am in favour of using a temp variable if you are sure the getter will return the same value throughout the scope, because a getter whose name is 10 or more characters long hurts readability.
I tested it with some very simple code:
I created a class with a simple getter for an int (I tried both a final and a non-final field in Num and didn't see any difference; bear in mind that in this case num never changes either):
Num num = new Num(100_000_000);
and compared two different for loops:
1: for(int i = 0; i < num.getNumber(); ++i){(...)}
2: int number = num.getNumber();
for(int i = 0; i < number; ++i){(...)}
The results were around 3 ms for the first one and around 2 ms for the second one. So there's a tiny difference: nothing to worry about for small loops, but possibly more of a problem for large iteration counts or if you call getters constantly. For instance, in image processing, if you want to be quick, I would advise against calling getters repeatedly.
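A rough reconstruction of that experiment might look like the sketch below. The answer does not show the Num class or the loop bodies, so both are assumptions here, and the naive System.nanoTime() timing is only indicative: JIT warm-up and other effects can easily dominate, which is why a harness such as JMH is usually recommended for measurements like this.

```java
// Assumed shape of the Num class from the answer; the original is not shown.
class Num {
    private final int number;

    Num(int number) {
        this.number = number;
    }

    int getNumber() {
        return number;
    }
}

class GetterLoopDemo {
    // Loop 1: the getter is called in the loop condition on every iteration.
    static long countWithGetter(Num num) {
        long iterations = 0;
        for (int i = 0; i < num.getNumber(); ++i) {
            iterations++;
        }
        return iterations;
    }

    // Loop 2: the getter result is cached in a local variable first.
    static long countWithLocal(Num num) {
        long iterations = 0;
        int number = num.getNumber();
        for (int i = 0; i < number; ++i) {
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        Num num = new Num(100_000_000);
        long t0 = System.nanoTime();
        long a = countWithGetter(num);
        long t1 = System.nanoTime();
        long b = countWithLocal(num);
        long t2 = System.nanoTime();
        System.out.println("getter loop: " + a + " iterations, " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("local loop:  " + b + " iterations, " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Because the first loop also runs while the JIT is still warming up, the order of the two measurements alone can skew a comparison like this.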
I'm +1 for saving the variable.
1) Readability over performance - your code is not just for you.
2) The performance difference might be negligible, but not all the time. I think it is important to be consistent and set a precedent. So, while it might not matter for one local variable, it could matter in a larger class using the same value multiple times, or in the case of looping.
3) Ease of changing the implementation / keeping the code DRY. For now you get the value from this one place with a getter, and theoretically you use the getter 100 times in one class. But in the future, if you want to change where or how you get the value, you have to change it in 100 places instead of just once, as you would if you had saved it as an instance variable.
