Given the following two code options, is there any performance benefit (over a very large scale or a long period of time) of the second over the first?
Option 1
private Map<Long, Animal> animals = ...;
public Map<Long, Animal> getAnimals() {
return animals;
}
public void useAnimals() {
for (int i=0; i < SOME_LARGE_NUMBER; i++) {
Animal animal = getAnimals().get(id);
}
// Many many calls to getAnimals() are made...
}
Option 2 - no getter
private Map<Long, Animal> animals = ...;
public void useAnimals() {
for (int i=0; i < SOME_NUMBER; i++) {
Animal animal = animals.get(id);
}
// No method calls made
}
If it is bad for performance, why, and how should I determine whether it is worth mitigating?
And, would storing the result of getAnimals() as a local provide a benefit...
if SOME_NUMBER is hundreds or thousands?
if SOME_NUMBER is only in the order of magnitude of 10?
Note: I previously said "encapsulation". I changed it to "getter" because the purpose is actually not that the field can't be modified but that it can't be reassigned. The encapsulation is simply to remove responsibility for assignment from subclasses.
Most likely the JVM will inline the getAnimals() invocation in a tight loop, effectively turning Option 1 into Option 2. So don't bother, this is really a micro (nano?) optimization.
Another thing is migrating from field access to local variable. This sounds good since instead of traversing through this reference every time you always have a reference on the stack (two memory accesses vs. one). However I believe (correct me if I'm wrong) that since animals is private and non-volatile again JVM will perform this optimization for you at runtime.
The second snippet is more encapsulated than the first one. The first one gives access to the internal map to anyone, whereas the second keeps it encapsulated in the class.
Both will lead to comparable performance.
EDIT: since you change the question, I'll also change the answer.
If you go through a getter, and the getter is not final, it means that subclasses may return another map than the one you hold in the class. Choose whether you want your method to operate on the subclass's map or on the class's map. Both could be acceptable, depending on the context.
Anyway, suppose your subclass always makes a defensive copy of the map, you'll end up having many copies if you don't cache the result of the getter in a local variable of useAnimals. It might be required to always work on the latest value of the subclass's map, but I doubt it's the case.
If there is no subclass, or the subclass doesn't override the method, or override it by always returning the same map, both will lead to comparable performance and you shouldn't care about it.
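Here is a minimal sketch of the defensive-copy case described above. The class names (Zoo, CopyingZoo) and the copiesMade counter are hypothetical and exist only for illustration; the map holds String values instead of Animal to keep the sketch self-contained. It shows that calling an overridden getter inside the loop can allocate one map per iteration, while caching the result in a local allocates one map in total.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical classes for illustration only.
class GetterCopyDemo {
    static class Zoo {
        protected final Map<Long, String> animals = new HashMap<>();

        // Not final, so a subclass may return a different map.
        public Map<Long, String> getAnimals() {
            return animals;
        }
    }

    static class CopyingZoo extends Zoo {
        int copiesMade = 0; // counts defensive copies, for the demo only

        @Override
        public Map<Long, String> getAnimals() {
            copiesMade++;
            return new HashMap<>(animals); // a new map on every call
        }
    }

    // Calls the getter on every iteration.
    static int copiesWhenCallingGetterInLoop(int iterations) {
        CopyingZoo zoo = new CopyingZoo();
        for (int i = 0; i < iterations; i++) {
            zoo.getAnimals().get(1L);
        }
        return zoo.copiesMade;
    }

    // Calls the getter once and reuses the local reference.
    static int copiesWhenCachingLocally(int iterations) {
        CopyingZoo zoo = new CopyingZoo();
        Map<Long, String> local = zoo.getAnimals();
        for (int i = 0; i < iterations; i++) {
            local.get(1L);
        }
        return zoo.copiesMade;
    }

    public static void main(String[] args) {
        System.out.println(copiesWhenCallingGetterInLoop(1000)); // 1000
        System.out.println(copiesWhenCachingLocally(1000));      // 1
    }
}
```

Note that the cached version works on a snapshot, so it will not see changes made to the subclass's map while the loop runs.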
Have you profiled this to see if it matters? For a modern JIT I would guess it would get optimized away, especially if animals were marked final, but there is nothing stopping you from testing this yourself.
Either way, I am 100% sure this would NEVER be your bottleneck in an application.
Well, I don't think that the JVM will inline the function call, so it may affect performance. The better way is to create a local variable and assign the class field animals to it.
How shall we write get method, so that private fields don't escape their intended scope?
In chapter 3 of Java Concurrency in Practice, the author advises against sharing mutable state. He adds that the code below is not a good way to share state.
class UnsafeStates {
private String[] states = new String[] {
"AK", "AL"
};
public String[] getStates() {
return states;
}
}
From the book:
Publishing states in this way is problematic because any caller can modify its contents. In this case, the states array has escaped its intended scope, because what was supposed to be private state has been effectively made public.
My question here is: we often use getters and setters to access class-level private mutable variables. If that is not the correct way, what is the correct way to share the state? What is the proper way to encapsulate state?
For primitive types, int, float etc, using a simple getter like this does not allow the caller to set its value:
someObj.getSomeInt() = 10; // error!
However, with an array, you could change its contents from the outside, which might be undesirable depending on the situation:
someObj.getSomeArray()[0] = newValue; // perfectly fine
This could lead to problems where a field is unexpectedly changed by other parts of code, causing hard-to-track bugs.
What you can do instead, is to return a copy of the array:
public String[] getStates() {
return Arrays.copyOf(states, states.length);
}
This way, even if the caller changes the contents of the returned array, the array held by the object won't be affected.
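As a rough sketch of the copy-on-read getter, next to a read-only view as an alternative: the class name SafeStatesDemo and the method getStatesView are made up for this example and are not from the book.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical class name; a sketch, not the book's code.
class SafeStatesDemo {
    private final String[] states = new String[] { "AK", "AL" };

    // Returns a copy, so callers cannot reach the internal array.
    public String[] getStates() {
        return Arrays.copyOf(states, states.length);
    }

    // Alternative: a read-only view of the internal array; writes through it throw.
    public List<String> getStatesView() {
        return Collections.unmodifiableList(Arrays.asList(states));
    }

    public static void main(String[] args) {
        SafeStatesDemo demo = new SafeStatesDemo();
        demo.getStates()[0] = "VT"; // modifies only the caller's copy
        System.out.println(Arrays.toString(demo.getStates())); // [AK, AL]
        try {
            demo.getStatesView().set(0, "VT");
        } catch (UnsupportedOperationException e) {
            System.out.println("view is read-only");
        }
    }
}
```

The copy costs an allocation per call; the view does not, but it still reflects later changes made inside the owning object.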
With what you have it is possible for someone to change the content of your private array just through the getter itself:
public static void main(String[] args) {
UnsafeStates us = new UnsafeStates();
us.getStates()[0] = "VT";
System.out.println(Arrays.toString(us.getStates()));
}
Output:
[VT, AL]
If you want to encapsulate your States and make it so they cannot change then it might be better to make an enum:
public enum SafeStates {
AR,
AL
}
Creating an enum gives a couple of advantages. It restricts callers to an exact set of values. The values can't be modified, they are easy to test against, and you can easily do a switch statement on them. The only downside of going with an enum is that the values have to be known ahead of time, i.e. you have to write them into the code; they cannot be created at run time.
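To make those advantages concrete, here is a small sketch using the same two constants. The wrapper class EnumStatesDemo and the describe method are invented for this example.

```java
// Sketch only: the enum constants mirror the example above.
class EnumStatesDemo {
    enum SafeStates { AR, AL }

    // A switch over the enum constants.
    static String describe(SafeStates s) {
        switch (s) {
            case AR: return "Arkansas";
            case AL: return "Alabama";
            default: throw new IllegalArgumentException("unknown state: " + s);
        }
    }

    public static void main(String[] args) {
        SafeStates[] copy = SafeStates.values(); // values() returns a fresh array on each call
        copy[0] = SafeStates.AL;                 // changes only this local array
        System.out.println(SafeStates.values()[0]);             // AR
        System.out.println(describe(SafeStates.valueOf("AL"))); // Alabama
    }
}
```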
This question seems to be asked with respect to concurrency in particular.
Firstly, of course, there is the possibility of modifying non-primitive objects obtained via simple-minded getters; as others have pointed out, this is a risk even with single-threaded programs. The way to avoid this is to return a copy of an array, or an unmodifiable instance of a collection: see for example Collections.unmodifiableList.
However, for programs using concurrency, there is risk of returning the actual object (i.e., not a copy) even if the caller of the getter does not attempt to modify the returned object. Because of concurrent execution, the object could change "while he is looking at it", and in general this lack of synchronization could cause the program to malfunction.
It's difficult to turn the original getStates example into a convincing illustration of my point, but imagine a getter that returns a Map instead. Inside the owning object, correct synchronization may be implemented. However, a getTheMap method that returns just a reference to the Map is an invitation for the caller to call Map methods (even if just map.get) without synchronization.
There are basically two options to avoid the problem: (1) return a deep copy; an unmodifiable wrapper will not suffice in this case, and it should be a deep copy otherwise we just have the same problem one layer down, or (2) do not return unmediated references; instead, extend the method repertoire to provide exactly what is supportable, with correct internal synchronization.
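A minimal sketch of both options, assuming a hypothetical AnimalRegistry class (the name and its methods are mine, not from the question). The map never leaves the object; callers either go through synchronized methods or receive a copy taken under the lock.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class; no reference to the internal map ever escapes.
class AnimalRegistry {
    private final Map<Long, String> animals = new HashMap<>();

    // Option (2): all access goes through synchronized methods on the owner.
    public synchronized void register(long id, String name) {
        animals.put(id, name);
    }

    public synchronized String lookup(long id) {
        return animals.get(id);
    }

    // Option (1): a copy taken under the lock. The values here are immutable
    // Strings, so copying the map is already equivalent to a deep copy.
    public synchronized Map<Long, String> snapshot() {
        return new HashMap<>(animals);
    }

    public static void main(String[] args) {
        AnimalRegistry registry = new AnimalRegistry();
        registry.register(1L, "zebra");
        Map<Long, String> copy = registry.snapshot();
        copy.clear(); // does not touch the registry's own map
        System.out.println(registry.lookup(1L)); // zebra
    }
}
```

With mutable values, snapshot() would also have to copy each value, otherwise the same problem reappears one layer down.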
Consider the class Foo.
public class Foo {
private double size;
public double getSize() {
return this.size; // Always O(1)
}
}
Foo has a property called size, which is frequently accessed, but never modified, by a given method. I've always cached a property in a variable whenever it is accessed more than once in any method, because "someone told me so" without giving it much thought. i.e.
public void test(Foo foo) {
double size = foo.getSize(); // Cache it or not?
// size will be referenced in several places later on.
}
Is this worth it, or is it overkill?
If I don't cache it, are modern compilers smart enough to cache it themselves?
A couple of factors (in no particular order) that I consider when deciding whether or not to store the value returned by a call to a "get() method":
Performance of the get() method - Unless the API specifies, or unless the calling code is tightly coupled with the called method, there are no guarantees of the performance of the get() method. The code may be fine in testing now, but may get worse if the get() method's performance changes in the future or if testing does not reflect real-world conditions (e.g. testing with only a thousand objects in a container when a real-world container might have ten million). Used in a for-loop condition, the get() method will be called before every iteration.
Readability - A variable can be given a specific and descriptive name, providing clarification of its use and/or meaning in a way that may not be clear from inline calls to the get() method. Don't underestimate the value of this to those reviewing and maintaining the code.
Thread safety - Can the value returned by the get() method potentially change if another thread modifies the object while the calling method is doing its thing? Should such a change be reflected in the calling method's behavior?
Regarding the question of whether or not compilers will cache it themselves, I'm going to speculate and say that in most cases the answer has to be 'no'. The only way the compiler could safely do so would be if it could determine that the get() method would return the same value at every invocation. And this could only be guaranteed if the get() method itself was marked final and all it did was return a constant (i.e an object or primitive also marked 'final'). I'm not sure but I think this is probably not a scenario the compiler bothers with. The JIT compiler has more information and thus could have more flexibility but you have no guarantees that some method will get JIT'ed.
In conclusion, don't worry about what the compiler might do. Caching the return value of a get() method is probably the right thing to do most of the time, and will rarely (i.e almost never) be the wrong thing to do. Favor writing code that is readable and correct over code that is fast(est) and flashy.
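To illustrate the for-loop point, here is a small sketch with a hypothetical Container class whose getter counts its own invocations. The names are invented for the example; it only shows how often the getter runs, not how long it takes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical container that counts calls to its getter, for illustration.
class LoopGetterDemo {
    static class Container {
        private final List<String> items = new ArrayList<>();
        int sizeCalls = 0;

        Container(int n) {
            for (int i = 0; i < n; i++) {
                items.add("item" + i);
            }
        }

        int getSize() {
            sizeCalls++;
            return items.size();
        }
    }

    // The loop condition is re-evaluated before every iteration.
    static int callsWithGetterInCondition(int n) {
        Container c = new Container(n);
        for (int i = 0; i < c.getSize(); i++) {
            // loop body omitted
        }
        return c.sizeCalls;
    }

    // The getter is called once and the result is stored in a local.
    static int callsWithCachedSize(int n) {
        Container c = new Container(n);
        int size = c.getSize();
        for (int i = 0; i < size; i++) {
            // loop body omitted
        }
        return c.sizeCalls;
    }

    public static void main(String[] args) {
        System.out.println(callsWithGetterInCondition(10)); // 11: ten passes plus the final failing check
        System.out.println(callsWithCachedSize(10));        // 1
    }
}
```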
I don't know whether there is a "right" answer, but I would keep a local copy.
In your example, I can see that getSize() is trivial, but in real code, I don't always know whether it is trivial or not; and even if it is trivial today, I don't know that somebody won't come along and change the getSize() method to make it non-trivial sometime in the future.
The biggest factor would be performance. If it's a simple operation that doesn't require a whole lot of CPU cycles, I'd say don't cache it. But if you constantly need to execute an expensive operation on data that doesn't change, then definitely cache it. For example, in my app the currently logged in user is serialized on every page in JSON format, the serialization operation is pretty expensive, so in order to improve performance I now serialize the user once when he signs in and then use the serialized version for putting JSON on the page. Here is before and after, made a noticeable improvement in performance:
//Before
public User(Principal principal) {
super(principal.getUsername(), principal.getPassword(), principal.getAuthorities());
uuid = principal.getUuid();
id = principal.getId();
name = principal.getName();
isGymAdmin = hasAnyRole(Role.ROLE_ADMIN);
isCustomBranding= hasAnyRole(Role.ROLE_CUSTOM_BRANDING);
locations.addAll(principal.getLocations());
}
public String toJson() {
return JSONAdapter.getGenericSerializer().serialize(this);
}
// After
public User(Principal principal) {
super(principal.getUsername(), principal.getPassword(), principal.getAuthorities());
uuid = principal.getUuid();
id = principal.getId();
name = principal.getName();
isGymAdmin = hasAnyRole(Role.ROLE_ADMIN);
isCustomBranding= hasAnyRole(Role.ROLE_CUSTOM_BRANDING);
locations.addAll(principal.getLocations());
json = JSONAdapter.getGenericSerializer().serialize(this);
}
public String toJson() {
return json;
}
The User object has no setter methods, there is no way the data would ever change unless the user signs out and then back in, so in this case I'd say it is safe to cache the value.
If the value of size was calculated each time say by looping through an array and thus not O(1), caching the value would have obvious benefits performance-wise. However since size of Foo is not expected to change at any point and it is O(1), caching the value mainly aids in readability. I recommend continuing to cache the value simply because readability is often times more of a concern than performance in modern computing systems.
IMO, if you are really worried about performance, this is a bit overkill, but there are a couple of ways to ensure that the variable is "cached" by your VM.
First, you can create static final variables for the results (as per your example, 1 or 0), so only one copy is stored for the whole class. Your instance variable is then only a boolean, while the getter still returns a double (or maybe you can use int, if it is only ever 0 or 1):
private static final double D_ZERO = 0.0;
private static final double D_ONE = 1.0;
private boolean ZERO = false;
public double getSize(){
return (ZERO ? D_ZERO : D_ONE);
}
Alternatively, if you can set the size when the object is created, make the field final and assign it in the constructor (a static final field would also work, but for a per-instance value the constructor is the natural place):
private final int SIZE;
public Foo(){
SIZE = 0;
}
public double getSize(){
return this.SIZE;
}
this can be accessed via foo.getSize()
In my code, I would cache it if either the getSize() method is time-consuming or, more often, the result is used in more or less complex expressions.
For example if calculating an offset from the size
int offset = fooSize * count1 + fooSize * count2;
is easier to read (for me) than
int offset = foo.getSize() * count1 + foo.getSize() * count2;
A very unimportant question about Java performance, but it made me wonder today.
Say I have simple getter:
public Object getSomething() {
return this.member;
}
Now, say I need the result of getSomething() twice (or more) in some function/algorithm. My question: is there any difference in either calling getSomething() twice (or more) or in declaring a temporary, local variable and use this variable from then on?
That is, either
public void algo() {
Object o = getSomething();
... use o ...
}
or
public void algo() {
... call getSomething() multiple times ...
}
I tend to mix both options, for no specific reason. I know it doesn't matter, but I am just wondering.
Thanks!
Technically, it's faster to not call the method multiple times, however this might not always be the case. The JVM might optimize the method calls to be inline and you won't see the difference at all. In any case, the difference is negligible.
However, it's probably safer to always use a getter. What if the value of the state changes between your calls? If you want to use a consistent version, then you can save the value from the first call. Otherwise, you probably want to always use the getter.
In any case, you shouldn't base this decision on performance because it's so negligible. I would pick one and stick with it consistently. I would recommend always going through your getters/setters.
Getters and setters are about encapsulation and abstraction. When you decide to invoke the getter multiple times, you are making assumptions about the inner workings of that class. For example that it does no expensive calculations, or that the value is not changed by other threads.
I'd argue that it's better to call the getter once and store its result in a temporary variable, thus allowing you to freely refactor the implementing class.
As an anecdote, I was once bitten by a change where a getter returned an array, but the implementing class was changed from an array property to using a list and doing the conversion in the getter.
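A sketch of what that kind of refactoring can look like. The Names class is hypothetical and only reconstructs the general shape of the anecdote: the field became a list, and the getter now converts it to an array on every call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical reconstruction; not the code from the anecdote.
class ArrayGetterDemo {
    static class Names {
        private final List<String> names = new ArrayList<>();

        Names(String... initial) {
            names.addAll(Arrays.asList(initial));
        }

        // After the refactoring: allocates and fills a new array on every call.
        String[] getNames() {
            return names.toArray(new String[0]);
        }
    }

    public static void main(String[] args) {
        Names holder = new Names("ann", "bob");

        // Two calls, two distinct arrays.
        System.out.println(holder.getNames() == holder.getNames()); // false

        // Calling once and keeping the result avoids the repeated conversion.
        String[] names = holder.getNames();
        names[0] = "zed"; // changes only the local array
        System.out.println(holder.getNames()[0]); // ann
    }
}
```

Code that called the getter once and kept the result is unaffected by the change in cost; code that called it in a loop pays for a conversion per iteration.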
The compiler should optimize either one to be basically the same code.
When I find myself calling the same getter method multiple times, should this be considered a problem? Is it better to [always] assign to a local variable and call only once?
I'm sure the answer of course is "it depends".
I'm more concerned about the simpler case where the getter is simply a "pass-along-the-value-of-a-private-variable" type method. i.e. there's no expensive computation involved, no database connections being consumed, etc.
My question of "is it better" pertains to both code readability (style) and also performance. i.e. is it that much of a performance hit to have:
SomeMethod1(a, b, foo.getX(), c);
SomeMethod2(b, foo.getX(), c);
SomeMethod3(foo.getX());
vs:
X x = foo.getX();
SomeMethod1(a, b, x, c);
SomeMethod2(b, x, c);
SomeMethod3(x);
I realize this question is a bit nit-picky and gray. But I just realized, I have no consistent way of evaluating these trade-offs, at all. Am fishing for some criteria that are more than just completely whimsical.
Thanks.
The choice shouldn't really be about performance hit but about code readability.
When you create a variable you can give it the name it deserves in the current context. When you use the same value more than once, it surely has a real meaning, one that a variable name conveys better than a method name (or worse, a chain of methods).
And it's really better to read:
String username = user.getName();
SomeMethod1(a, b, username, c);
SomeMethod2(b, username, c);
SomeMethod3(username);
than
SomeMethod1(a, b, user.getName(), c);
SomeMethod2(b, user.getName(), c);
SomeMethod3(user.getName());
For plain getters (those that just return a value), HotSpot typically inlines them in the calling code, so it will be as fast as it can be.
I, however, have a principle about keeping a statement on a single line, which very often results in expressions like "foo.getBar()" being too long to fit. Then it is more readable - to me - to extract it to a local variable ("Bar bar = foo.getBar()").
They could be 2 different things.
If getX() is non-deterministic, then the 1st one will give different results than the 2nd.
Personally, I'd use the 2nd one. It's more obvious and less unnecessarily verbose.
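A tiny sketch of how the two styles can diverge. The Foo class below is hypothetical: its getter returns a different value on each call, with a simple counter standing in for any non-deterministic or changing source.

```java
// Hypothetical class: a counter stands in for a value that changes between calls.
class ChangingGetterDemo {
    static class Foo {
        private int x = 0;

        int getX() {
            return x++; // each call returns a different value
        }
    }

    // Style 1: call the getter at every use.
    static int sumCallingTwice() {
        Foo foo = new Foo();
        return foo.getX() + foo.getX(); // 0 + 1
    }

    // Style 2: call once, reuse the local.
    static int sumWithLocal() {
        Foo foo = new Foo();
        int x = foo.getX();
        return x + x; // 0 + 0
    }

    public static void main(String[] args) {
        System.out.println(sumCallingTwice()); // 1
        System.out.println(sumWithLocal());    // 0
    }
}
```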
I use the second style if it makes my code more readable or if I have to use the assigned value again. I never consider performance (on trivial things) unless I have to.
That depends on what getX() actually does. Consider this class:
public class Foo {
private X x;
public X getX() { return x; }
}
In this case, when you make a call to foo.getX(), JVM will optimize it all the way down to foo.x (as in direct reference to foo's private field, basically a memory pointer). However, if the class looks like this:
public class Foo {
private X x;
public X getX() { return cleanUpValue(x); }
private X cleanUpValue(X x) {
/* some modifications/sanitization to x such as null safety checks */
}
}
the JVM can't actually inline it as efficiently anymore since by Foo's constructional contract, it has to sanitize x before handing it out.
To summarize, if getX() doesn't really do anything beyond returning a field, then there's no difference after initial optimization runs to the bytecode in whether you call the method just once or multiple times.
Most of the time I would use getX if it was only once, and create a var for it for all other cases. Often just to save typing.
With regards to performance, the compiler would probably be able to optimize away most of the overhead, but the possibility of side-effects could force the compiler into more work when doing multiple method-calls.
I generally store it locally if:
I will use it in a loop and I don't want or expect the value to change during the loop.
I'm about to use it in a long line of code or the function & parameters are very long.
I want to rename the variable to better correspond to the task at hand.
Testing indicates a significant performance boost.
Otherwise I like the ability to get current values and lower level of abstraction of method calls.
Two things have to be considered:
Does the call to getX() have any side effects? Following established coding patterns, a getter should not alter the object on which it is called, so in most cases there is no side effect. Therefore, it is semantically equivalent to call the getter once and store the value locally vs. calling the getter multiple times. (This concept is called idempotency - it does not matter whether you call a method once or multiple times; the effect on the data is exactly the same.)
If the getter has no side effect, the compiler can safely remove subsequent calls to the getter and create the temporary local storage on its own - thus, the code remains ultra-readable and you have all the speed advantage from calling the getter only once. This is all the more important if the getter does not simply return a value but has to fetch/compute the value or runs some validations.
Assuming your getter does not change the object on which it operates it is probably more readable to have multiple calls to getX() - and thanks to the compiler you do not have to trade performance for readability and maintainability.
I'm in my first programming class in high school. We're doing our end of the first semester project.
This project only involves one class, but many methods. My question is about best practice with instance variables and local variables. It seems that it would be much easier for me to code using almost only instance variables. But I'm not sure if this is how I should be doing it or if I should be using local variables more (I would just have to have methods take in the values of local variables a lot more).
My reasoning for this is also because a lot of times I'll want to have a method return two or three values, but this is of course not possible. Thus it just seems easier to simply use instance variables and never having to worry since they are universal in the class.
I haven't seen anyone discuss this so I'll throw in more food for thought. The short answer/advice is don't use instance variables over local variables just because you think they are easier to return values. You are going to make working with your code very very hard if you don't use local variables and instance variables appropriately. You will produce some serious bugs that are really hard to track down. If you want to understand what I mean by serious bugs, and what that might look like read on.
Let's try to use only instance variables, as you suggest, to write two functions. I'll create a very simple class:
import java.util.ArrayList;
import java.util.List;

public class BadIdea {
public enum Color { GREEN, RED, BLUE, PURPLE }
public Color[] map = new Color[] {
Color.GREEN,
Color.GREEN,
Color.RED,
Color.BLUE,
Color.PURPLE,
Color.RED,
Color.PURPLE };
List<Integer> indexes = new ArrayList<Integer>();
public int counter = 0;
public int index = 0;
public void findColor( Color value ) {
indexes.clear();
for( index = 0; index < map.length; index++ ) {
if( map[index] == value ) {
indexes.add( index );
counter++;
}
}
}
public void findOppositeColors( Color value ) {
indexes.clear();
for( index = 0; index < map.length; index++ ) {
if( map[index] != value ) {
indexes.add( index );
counter++;
}
}
}
}
This is a silly program, I know, but we can use it to illustrate the concept that using instance variables for things like this is a tremendously bad idea. The biggest thing you'll find is that those methods use all of the instance variables we have, and they modify indexes, counter, and index every time they are called. The first problem you'll find is that calling those methods one after the other can modify the answers from prior runs. So for example, if you wrote the following code:
BadIdea idea = new BadIdea();
idea.findColor( Color.RED );
idea.findColor( Color.GREEN ); // whoops we just lost the results from finding all Color.RED
Since findColor uses instance variables to track returned values we can only return one result at a time. Let's try and save off a reference to those results before we call it again:
BadIdea idea = new BadIdea();
idea.findColor( Color.RED );
List<Integer> redPositions = idea.indexes;
int redCount = idea.counter;
idea.findColor( Color.GREEN ); // this causes red positions to be lost! (i.e. idea.indexes.clear())
List<Integer> greenPositions = idea.indexes;
int greenCount = idea.counter;
In this second example we saved the red positions on the 3rd line, but the same thing happened! Why did we lose them? Because idea.indexes was cleared instead of reallocated, so there can only be one answer in use at a time. You have to completely finish using that result before calling it again. Once you call a method again the results are cleared and you lose everything. In order to fix this you'll have to allocate a new result each time so red and green answers are separate. So let's clone our answers to create new copies of things:
BadIdea idea = new BadIdea();
idea.findColor( Color.RED );
List<Integer> redPositions = new ArrayList<Integer>( idea.indexes ); // List has no clone(); copy via the constructor
int redCount = idea.counter;
idea.findColor( Color.GREEN );
List<Integer> greenPositions = new ArrayList<Integer>( idea.indexes );
int greenCount = idea.counter;
Ok finally we have two separate results. The results of red and green are now separate. But, we had to know a lot about how BadIdea operated internally before the program worked didn't we? We need to remember to clone the returns every time we called it to safely make sure our results didn't get clobbered. Why is the caller forced to remember these details? Wouldn't it be easier if we didn't have to do that?
Also notice that the caller has to use local variables to remember the results, so while you didn't use local variables in the methods of BadIdea, the caller has to use them to remember results. So what did you really accomplish? You really just moved the problem to the caller, forcing them to do more. And the work you pushed onto the caller is not an easy rule to follow because there are so many exceptions to the rule.
Now let's try doing that with two different methods. Notice how I've been "smart" and I reused those same instance variables to "save memory" and kept the code compact. ;-)
BadIdea idea = new BadIdea();
idea.findColor( Color.RED );
List<Integer> redPositions = idea.indexes;
int redCount = idea.counter;
idea.findOppositeColors( Color.RED ); // this causes red positions to be lost again!!
List<Integer> greenPositions = idea.indexes;
int greenCount = idea.counter;
Same thing happened! Damn, but I was being so "smart" and saving memory, and the code uses fewer resources!!! The real peril of using instance variables like this is that calling methods is now order dependent. If I change the order of the method calls the results are different even though I haven't really changed the underlying state of BadIdea. I didn't change the contents of the map. Why does the program yield different results when I call the methods in different order?
idea.findColor( Color.RED );
idea.findOppositeColors( Color.RED );
Produces a different result than if I swapped those two methods:
idea.findOppositeColors( Color.RED );
idea.findColor( Color.RED );
These types of errors are really hard to track down especially when those lines aren't right next to each other. You can completely break your program by just adding a new call in anywhere between those two lines and get wildly different results. Sure when we're dealing with small number of lines it's easy to spot errors. But, in a larger program you can waste days trying to reproduce them even though the data in the program hasn't changed.
And this only looks at single-threaded problems. If BadIdea was being used in a multi-threaded situation the errors can get really bizarre. What happens if findColor() and findOppositeColors() are called at the same time? Crash, all your hair falls out, death, space and time collapse into a singularity and the universe is swallowed up? Probably at least two of those. Threads are probably above your head now, but hopefully we can steer you away from doing bad things now so when you do get to threads those bad practices don't cause you real heartache.
Did you notice how careful you had to be when calling the methods? They overwrote each other, they shared memory possibly randomly, you had to remember the details of how it worked on the inside to make it work on the outside, changing the order in which things were called produced very big changes in the next lines down, and it could only work in a single-threaded situation. Doing things like this will produce really brittle code that seems to fall apart whenever you touch it. These practices I showed contributed directly to the code being brittle.
While this might look like encapsulation, it is the exact opposite because the technical details of how you wrote it have to be known to the caller. The caller has to write their code in a very particular way to make their code work, and they can't do it without knowing about the technical details of your code. This is often called a Leaky Abstraction because the class is supposed to hide the technical details behind an abstraction/interface, but the technical details leak out, forcing the caller to change their behavior. Every solution has some degree of leakiness, but applying any of the techniques above guarantees that, no matter what problem you are trying to solve, the result will be terribly leaky. So let's look at the GoodIdea now.
Let's rewrite using local variables:
public class GoodIdea {
...
public List<Integer> findColor( Color value ) {
List<Integer> results = new ArrayList<Integer>();
for( int i = 0; i < map.length; i++ ) {
if( map[i] == value ) {
results.add( i );
}
}
return results;
}
public List<Integer> findOppositeColors( Color value ) {
List<Integer> results = new ArrayList<Integer>();
for( int i = 0; i < map.length; i++ ) {
if( map[i] != value ) {
results.add( i );
}
}
return results;
}
}
This fixes every problem we discussed above. I know I'm not keeping track of counter or returning it, but if I did I can create a new class and return that instead of List. Sometimes I use the following object to return multiple results quickly:
public class Pair<K,T> {
public K first;
public T second;
public Pair( K first, T second ) {
this.first = first;
this.second = second;
}
}
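As a sketch of how such a Pair could carry both the positions and the count back to the caller: the wrapper class PairDemo and the choice to return the count as the second element are my own assumptions, and the Pair here has final fields so a returned result cannot be reassigned afterwards.

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained sketch; PairDemo and the (positions, count) result shape are assumptions.
class PairDemo {
    enum Color { GREEN, RED, BLUE, PURPLE }

    // Same shape as the Pair above, nested here so the example compiles on its own.
    static class Pair<K, T> {
        final K first;
        final T second;

        Pair(K first, T second) {
            this.first = first;
            this.second = second;
        }
    }

    static final Color[] MAP = {
        Color.GREEN, Color.GREEN, Color.RED, Color.BLUE,
        Color.PURPLE, Color.RED, Color.PURPLE
    };

    // Returns a new result on every call; nothing is stored on the object.
    static Pair<List<Integer>, Integer> findColor(Color value) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < MAP.length; i++) {
            if (MAP[i] == value) {
                positions.add(i);
            }
        }
        return new Pair<>(positions, positions.size());
    }

    public static void main(String[] args) {
        Pair<List<Integer>, Integer> red = findColor(Color.RED);
        Pair<List<Integer>, Integer> green = findColor(Color.GREEN);
        // The second call did not disturb the first result.
        System.out.println(red.first + " " + red.second);     // [2, 5] 2
        System.out.println(green.first + " " + green.second); // [0, 1] 2
    }
}
```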
Long answer, but a very important topic.
Use instance variables when it's a core concept of your class. If you're iterating, recursing or doing some processing, then use local variables.
When you need to use two (or more) variables in the same places, it's time to create a new class with those attributes (and appropriate means to set them). This will make your code cleaner and help you think about problems (each class is a new term in your vocabulary).
One variable may be made a class when it is a core concept. For example real-world identifiers: these could be represented as Strings, but often, if you encapsulate them into their own object they suddenly start "attracting" functionality (validation, association to other objects, etc.)
Also (not entirely related) is object consistency - an object is able to ensure that its state makes sense. Setting one property may alter another. It also makes it far easier to alter your program to be thread-safe later (if required).
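A short sketch of the identifier idea. CustomerId and its format rule (the letter C followed by six digits) are invented for illustration; the point is only that once the value has its own class, validation and equality have an obvious home.

```java
// Hypothetical identifier class; the format rule is an assumption for the example.
class CustomerIdDemo {
    static final class CustomerId {
        private final String value;

        CustomerId(String value) {
            // Validation lives with the concept instead of being repeated by callers.
            if (value == null || !value.matches("C\\d{6}")) {
                throw new IllegalArgumentException("not a customer id: " + value);
            }
            this.value = value;
        }

        String value() {
            return value;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof CustomerId && ((CustomerId) other).value.equals(value);
        }

        @Override
        public int hashCode() {
            return value.hashCode();
        }
    }

    public static void main(String[] args) {
        CustomerId id = new CustomerId("C123456");
        System.out.println(id.equals(new CustomerId("C123456"))); // true
        try {
            new CustomerId("oops");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```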
Local variables internal to methods are always preferred, since you want to keep each variable's scope as small as possible. But if more than one method needs to access a variable, then it's going to have to be an instance variable.
Local variables are more like intermediate values used to reach a result or compute something on the fly. Instance variables are more like attributes of a class, like your age or name.
The easy way: if the variable must be shared by more than one method, use an instance variable; otherwise use a local variable.
However, the good practice is to use as many local variables as possible. Why? For a simple project with only one class, there is no difference. For a project that includes a lot of classes, there is a big difference. The instance variables represent the state of your class. The more instance variables in your class, the more states the class can have, the more complex the class is, and the harder it is to maintain and the more error-prone your project becomes. So the good practice is to use as many local variables as possible, to keep the state of the class as simple as possible.
Short story: if and only if a variable needs to be accessed by more than one method (or from outside the class), create it as an instance variable. If you need it only locally, in a single method, it has to be a local variable.
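To make the rule concrete, here is a small sketch (the `TemperatureLog` class is hypothetical). The readings are shared by two methods, so they are an instance variable; the running sum and the current maximum matter only inside one method, so they stay local.

```java
// Hypothetical class for illustration only.
final class TemperatureLog {
    // Instance variable: part of the object's state, needed by both average() and max().
    private final double[] readings;

    TemperatureLog(double[] readings) {
        if (readings.length == 0) {
            throw new IllegalArgumentException("at least one reading is required");
        }
        this.readings = readings.clone(); // defensive copy so callers cannot change our state
    }

    double average() {
        double sum = 0.0; // local: an intermediate value nobody needs after the method returns
        for (double reading : readings) {
            sum += reading;
        }
        return sum / readings.length;
    }

    double max() {
        double best = readings[0]; // local for the same reason
        for (double reading : readings) {
            best = Math.max(best, reading);
        }
        return best;
    }
}
```

Had `sum` been a field instead, every call to `average()` would leave stale state behind and the class would have one more thing that can go wrong.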
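Instance variables are more costly than local variables: they are stored in the object on the heap and live as long as the object does, while local variables exist only for the duration of the method call.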
Keep in mind: instance variables are initialized to default values while local variables are not.
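A short sketch of that difference (the class name `DefaultsDemo` is just for illustration). Fields get a default value automatically, while the compiler refuses to read a local that has not been assigned.

```java
// Hypothetical demo class.
final class DefaultsDemo {
    int count;     // instance variable: defaults to 0
    boolean ready; // instance variable: defaults to false
    String name;   // instance variable: defaults to null

    static int localMustBeAssigned() {
        int total; // local variable: no default value
        // return total; // would not compile: variable total might not have been initialized
        total = 42;
        return total;
    }
}
```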
Declare variables to be scoped as narrowly as possible. Declare local variables first. If this isn't sufficient, use instance variables. If this isn't sufficient, use class (static) variables.
If you need to return more than one value, return a composite structure, like an array or an object.
Try to think about your problem in terms of objects. Each class represents a different type of object. Instance variables are the pieces of data that a class needs to remember in order to work, either with itself or with other objects. Local variables should just be used for intermediate calculations, data that you don't need to save once you leave the method.
Try not to return more than one value from your methods in the first place. If you can't avoid it, and in some cases you really can't, then I would recommend encapsulating the values in a class. Only as a last resort would I recommend changing another variable inside your class (an instance variable). The problem with the instance variable approach is that it increases side effects: for example, you call method A in your program and it modifies one or more instance variables. Over time, that leads to increased complexity in your code, and maintenance becomes harder and harder.
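For the "encapsulate it in a class" option, a minimal sketch might look like this. `MinMax` and `ArrayStats` are hypothetical names; the point is only that both results come back through the return value, so no instance variable has to be modified as a side effect.

```java
// Hypothetical result holder: groups the two values the method needs to return.
final class MinMax {
    final int min;
    final int max;

    MinMax(int min, int max) {
        this.min = min;
        this.max = max;
    }
}

// Hypothetical utility class with no state of its own.
final class ArrayStats {
    private ArrayStats() {
    }

    static MinMax minMaxOf(int[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        // Locals hold the intermediate results; nothing outside the method is touched.
        int min = values[0];
        int max = values[0];
        for (int v : values) {
            if (v < min) {
                min = v;
            }
            if (v > max) {
                max = v;
            }
        }
        return new MinMax(min, max);
    }
}
```

A caller writes `MinMax r = ArrayStats.minMaxOf(data);` and reads `r.min` and `r.max`, instead of calling a method and then looking at fields it happened to change.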
When I have to use instance variables, I try to make them final and initialize them in the class constructors, so side effects are minimized. This programming style (minimizing the state changes in your application) should lead to better code that is easier to maintain.
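One way this style could look in practice (the `Money` class is a made-up example): the only field is final and set in the constructor, and an operation that would otherwise change state returns a new object instead.

```java
// Hypothetical immutable class: its state cannot change after construction.
final class Money {
    private final long cents;

    Money(long cents) {
        this.cents = cents;
    }

    long cents() {
        return cents;
    }

    // Returns a new object rather than modifying this one, so there are no side effects.
    Money plus(Money other) {
        return new Money(cents + other.cents);
    }
}
```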
Generally variables should have minimal scope.
Unfortunately, in order to build classes with minimized variable scope, one often needs to do a lot of method parameter passing.
But if you follow that advice all the time, perfectly minimizing variable scope, you may end up with a lot of redundancy and method inflexibility with all the required objects passed in and out of methods.
Picture a code base with thousands of methods like this:
private ClassThatHoldsReturnInfo foo(OneReallyBigClassThatHoldsCertainThings big,
        AnotherClassThatDoesLittle little) {
    LocalClassObjectJustUsedHere here;
    ...
}
private ClassThatHoldsReturnInfo bar(OneMediumSizedClassThatHoldsCertainThings medium,
        AnotherClassThatDoesLittle little) {
    ...
}
And, on the other hand, imagine a code base with lots of instance variables like this:
private OneReallyBigClassThatHoldsCertainThings big;
private OneMediumSizedClassThatHoldsCertainThings medium;
private AnotherClassThatDoesLittle little;
private ClassThatHoldsReturnInfo ret;
private void foo() {
    LocalClassObjectJustUsedHere here;
    ...
}
private void bar() {
    ...
}
As the code grows, the first way may minimize variable scope best, but it can easily lead to a lot of method parameters being passed around. The code will usually be more verbose, and this can lead to complexity as one refactors all these methods.
Using more instance variables can reduce the complexity of lots of method parameters being passed around, and can give methods flexibility when you are frequently reorganizing them for clarity. But it creates more object state that you have to maintain. Generally the advice is to do the former and refrain from the latter.
However, and this may depend on the person, it is very often easier to manage state complexity than the thousands of extra object references of the first case. One may notice this when the business logic within methods grows and the organization needs to change to keep order and clarity.
Not only that: when you reorganize your methods for clarity and change lots of method parameters in the process, you end up with lots of version control diffs, which is not so good for stable, production-quality code. There is a balance. One way causes one kind of complexity; the other way causes another kind.
Use the way that works best for you. You will find that balance over time.
I think this young programmer has some insightful first impressions for low maintenance code.
Use instance variables when
If two functions in the class need the same value, then make it an instance variable
or
If the state is not expected to change, make it an instance variable. For example: immutable objects, DTOs, LinkedList, or anything with final variables
or
If it is the underlying data on which actions are performed. For example: final in arr[] in the PriorityQueue.java source code file
or
Even if the state is expected to change, make it an instance variable if it is used only once, by a function whose parameter list should be empty. For example: in HTTPCookie.java (line 860), the hashCode() function uses the 'path' variable.
Similarly, use a local variable when none of these conditions match, specifically if the role of the variable ends once the method's stack frame is popped. For example: Comparator.compare(o1, o2);