I'm in my first programming class in high school. We're doing our end of the first semester project.
This project only involves one class, but many methods. My question is about best practice with instance variables and local variables. It seems that it would be much easier for me to code using almost only instance variables. But I'm not sure if this is how I should be doing it or if I should be using local variables more (I would just have to have methods take in the values of local variables a lot more).
My reasoning for this is also because a lot of times I'll want to have a method return two or three values, but this is of course not possible. Thus it just seems easier to simply use instance variables and never having to worry since they are universal in the class.
I haven't seen anyone discuss this so I'll throw in more food for thought. The short answer/advice is don't use instance variables over local variables just because you think they are easier to return values. You are going to make working with your code very very hard if you don't use local variables and instance variables appropriately. You will produce some serious bugs that are really hard to track down. If you want to understand what I mean by serious bugs, and what that might look like read on.
Let's try and use only instance variables as you suggest to write to functions. I'll create a very simple class:
public class BadIdea {
public Enum Color { GREEN, RED, BLUE, PURPLE };
public Color[] map = new Colors[] {
Color.GREEN,
Color.GREEN,
Color.RED,
Color.BLUE,
Color.PURPLE,
Color.RED,
Color.PURPLE };
List<Integer> indexes = new ArrayList<Integer>();
public int counter = 0;
public int index = 0;
public void findColor( Color value ) {
indexes.clear();
for( index = 0; index < map.length; index++ ) {
if( map[index] == value ) {
indexes.add( index );
counter++;
}
}
}
public void findOppositeColors( Color value ) {
indexes.clear();
for( index = 0; i < index < map.length; index++ ) {
if( map[index] != value ) {
indexes.add( index );
counter++;
}
}
}
}
This is a silly program I know, but we can use it to illustrate the concept that using instance variables for things like this is a tremendously bad idea. The biggest thing you'll find is that those methods use all of the instance variables we have. And it modifies indexes, counter, and index every time they are called. The first problem you'll find is that calling those methods one after the other can modify the answers from prior runs. So for example, if you wrote the following code:
BadIdea idea = new BadIdea();
idea.findColor( Color.RED );
idea.findColor( Color.GREEN ); // whoops we just lost the results from finding all Color.RED
Since findColor uses instance variables to track returned values we can only return one result at a time. Let's try and save off a reference to those results before we call it again:
BadIdea idea = new BadIdea();
idea.findColor( Color.RED );
List<Integer> redPositions = idea.indexes;
int redCount = idea.counter;
idea.findColor( Color.GREEN ); // this causes red positions to be lost! (i.e. idea.indexes.clear()
List<Integer> greenPositions = idea.indexes;
int greenCount = idea.counter;
In this second example we saved the red positions on the 3rd line, but same thing happened!?Why did we lose them?! Because idea.indexes was cleared instead of allocated so there can only be one answer used at a time. You have to completely finish using that result before calling it again. Once you call a method again the results are cleared and you lose everything. In order to fix this you'll have to allocate a new result each time so red and green answers are separate. So let's clone our answers to create new copies of things:
BadIdea idea = new BadIdea();
idea.findColor( Color.RED );
List<Integer> redPositions = idea.indexes.clone();
int redCount = idea.counter;
idea.findColor( Color.GREEN );
List<Integer> greenPositions = idea.indexes.clone();
int greenCount = idea.counter;
Ok finally we have two separate results. The results of red and green are now separate. But, we had to know a lot about how BadIdea operated internally before the program worked didn't we? We need to remember to clone the returns every time we called it to safely make sure our results didn't get clobbered. Why is the caller forced to remember these details? Wouldn't it be easier if we didn't have to do that?
Also notice that the caller has to use local variables to remember the results so while you didn't use local variables in the methods of BadIdea the caller has to use them to remember results. So what did you really accomplish? You really just moved the problem to the caller forcing them to do more. And the work you pushed onto the caller is not an easy rule to follow because there are some many exceptions to the rule.
Now let's try doing that with two different methods. Notice how I've been "smart" and I reused those same instance variables to "save memory" and kept the code compact. ;-)
BadIdea idea = new BadIdea();
idea.findColor( Color.RED );
List<Integer> redPositions = idea.indexes;
int redCount = idea.counter;
idea.findOppositeColors( Color.RED ); // this causes red positions to be lost again!!
List<Integer> greenPositions = idea.indexes;
int greenCount = idea.counter;
Same thing happened! Damn but I was being so "smart" and saving memory and the code uses less resources!!! This is the real peril of using instance variables like this is calling methods is order dependent now. If I change the order of the method calls the results are different even though I haven't really changed the underlying state of BadIdea. I didn't change the contents of the map. Why does the program yield different results when I call the methods in different order?
idea.findColor( Color.RED )
idea.findOppositeColors( Color.RED )
Produces a different result than if I swapped those two methods:
idea.findOppositeColors( Color.RED )
idea.findColor( Color.RED )
These types of errors are really hard to track down especially when those lines aren't right next to each other. You can completely break your program by just adding a new call in anywhere between those two lines and get wildly different results. Sure when we're dealing with small number of lines it's easy to spot errors. But, in a larger program you can waste days trying to reproduce them even though the data in the program hasn't changed.
And this only looks at single threaded problems. If BadIdea was being used in a multi-threaded situation the errors can get really bizarre. What happens if findColors() and findOppositeColors() is called at the same time? Crash, all your hair falls out, Death, space and time collapse into a singularity and the universe is swallows up? Probably at least two of those. Threads are probably above your head now, but hopefully we can steer you away from doing bad things now so when you do get to threads those bad practices don't cause you real heartache.
Did you notice how careful you had to be when calling the methods? They overwrote each other, they shared memory possibly randomly, you had to remember the details of how it worked on the inside to make it work on the outside, changing the order in which things were called produce very big changes in the next lines down, and it only could only work in a single thread situation. Doing things like this will produce really brittle code that seems to fall apart whenever you touch it. These practices I showed contributed directly to the code being brittle.
While this might look like encapsulation it is the exact opposite because the technical details of how you wrote it have to be known to the caller. The caller has to write their code in a very particular way to make their code work, and they can't do it without knowing about the technical details of your code. This is often called a Leaky Abstraction because the class is suppose to hide the technical details behind an abstraction/interface, but the technical details leak out forcing the caller to change their behavior. Every solution has some degree of leaky-ness, but using any of the above techniques like these guarantees no matter what problem you are trying to solve it will be terribly leaky if you apply them. So let's look at the GoodIdea now.
Let's rewrite using local variables:
public class GoodIdea {
...
public List<Integer> findColor( Color value ) {
List<Integer> results = new ArrayList<Integer>();
for( int i = 0; i < map.length; i++ ) {
if( map[index] == value ) {
results.add( i );
}
}
return results;
}
public List<Integer> findOppositeColors( Color value ) {
List<Integer> results = new ArrayList<Integer>();
for( int i = 0; i < map.length; i++ ) {
if( map[index] != value ) {
results.add( i );
}
}
return results;
}
}
This fixes every problem we discussed above. I know I'm not keeping track of counter or returning it, but if I did I can create a new class and return that instead of List. Sometimes I use the following object to return multiple results quickly:
public class Pair<K,T> {
public K first;
public T second;
public Pair( K first, T second ) {
this.first = first;
this.second = second;
}
}
Long answer, but a very important topic.
Use instance variables when it's a core concept of your class. If you're iterating, recursing or doing some processing, then use local variables.
When you need to use two (or more) variables in the same places, it's time to create a new class with those attributes (and appropriate means to set them). This will make your code cleaner and help you think about problems (each class is a new term in your vocabulary).
One variable may be made a class when it is a core concept. For example real-world identifiers: these could be represented as Strings, but often, if you encapsulate them into their own object they suddenly start "attracting" functionality (validation, association to other objects, etc.)
Also (not entirely related) is object consistency - an object is able to ensure that its state makes sense. Setting one property may alter another. It also makes it far easier to alter your program to be thread-safe later (if required).
Local variables internal to methods are always prefered, since you want to keep each variable's scope as small as possible. But if more than one method needs to access a variable, then it's going to have to be an instance variable.
Local variables are more like intermediate values used to reach a result or compute something on the fly. Instance variables are more like attributes of a class, like your age or name.
The easy way: if the variable must be shared by more than one method, use instance variable, otherwise use local variable.
However, the good practice is to use as more local variables as possible. Why? For your simple project with only one class, there is no difference. For a project that includes a lot of classes, there is big difference. The instance variable indicates the state of your class. The more instance variables in your class, the more states this class can have and then, the more complex this class is, the hard the class is maintained or the more error prone your project might be. So the good practice is to use as more local variable as possible to keep the state of the class as simple as possible.
Short story: if and only if a variable needs to be accessed by more than one method (or outside of the class), create it as an instance variables. If you need it only locally, in a single method, it has to be a local variable.
Instance variables are more costly than local variables.
Keep in mind: instance variables are initialized to default values while local variables are not.
Declare variables to be scoped as narrowly as possible. Declare local variables first. If this isn't sufficient, use instance variables. If this isn't sufficient, use class (static) variables.
I you need to return more than one value return a composite structure, like an array or an object.
Try to think about your problem in terms of objects. Each class represents a different type of object. Instance variables are the pieces of data that a class needs to remember in order to work, either with itself or with other objects. Local variables should just be used intermediate calculations, data that you don't need to save once you leave the method.
Try not to return more than one value from your methods in first place. If you can't, and in some cases you really can't, then I would recommend encapsulating that in a class. Just in last case I would recommend changing another variable inside your class (an instance variable). The problem with the instance variables approach is that it increases side effects - for example, you call method A in your program and it modifies some instance(s) variable(s). Over time, that leads to increased complexity in your code and maintenance becomes harder and harder.
When I have to use instance variables, I try to make then final and initialize then in the class constructors, so side effects are minimized. This programming style (minimizing the state changes in your application) should lead to better code that is easier to maintain.
Generally variables should have minimal scope.
Unfortunately, in order to build classes with minimized variable scope, one often needs to do a lot of method parameter passing.
But if you follow that advice all the time, perfectly minimizing variable scope, you
may end up with a lot of redundancy and method inflexibility with all the required objects passed in and out of methods.
Picture a code base with thousands of methods like this:
private ClassThatHoldsReturnInfo foo(OneReallyBigClassThatHoldsCertainThings big,
AnotherClassThatDoesLittle little) {
LocalClassObjectJustUsedHere here;
...
}
private ClassThatHoldsReturnInfo bar(OneMediumSizedClassThatHoldsCertainThings medium,
AnotherClassThatDoesLittle little) {
...
}
And, on the other hand, imagine a code base with lots of instance variables like this:
private OneReallyBigClassThatHoldsCertainThings big;
private OneMediumSizedClassThatHoldsCertainThings medium;
private AnotherClassThatDoesLittle little;
private ClassThatHoldsReturnInfo ret;
private void foo() {
LocalClassObjectJustUsedHere here;
....
}
private void bar() {
....
}
As code increases, the first way may minimize variable scope best, but can easily lead to a lot of method parameters being passed around. The code will usually be more verbose and this can lead to a complexity as one refactors all these methods.
Using more instance variables can reduce the complexity of lots of method parameters being passed around and can give a flexibility to methods when you are frequently reorganizing methods for clarity. But it creates more object state that you have to maintain. Generally the advice is to do the former and refrain from the latter.
However, very often, and it may depend on the person, one can more easily manage state complexity compared with the thousands of extra object references of the first case. One may notice this when business logic within methods increases and organization needs to change to keep order and clarity.
Not only that. When you reorganize your methods to keep clarity and make lots of method parameter changes in the process, you end up with lots of version control diffs which is not so good for stable production quality code. There is a balance. One way causes one kind of complexity. The other way causes another kind of complexity.
Use the way that works best for you. You will find that balance over time.
I think this young programmer has some insightful first impressions for low maintenance code.
Use instance variables when
If two functions in the class need the same value, then make it an instance variable
or
If the state is not expected to change, make it an instance variable. For example: immutable object, DTO, LinkedList, those with final variables
or
If it is an underlying data on whom actions are performed. For example: final in arr[] in the PriorityQueue.java source code file
or
Even if it is used only once and state is expected to change, make it an instance if it is used only once by a function whose parameter list should be empty. For example: HTTPCookie.java Line: 860 hashcode() function uses 'path variable'.
Similarly, use a local variable when none of these conditions match, specifically if the role of the variable would end after the stack is popped off. For example: Comparator.compare(o1, o2);
Related
This question already has answers here:
How shall we write get method, so that private fields don't escape their intended scope? [duplicate]
(2 answers)
Closed 3 years ago.
In Java Concurrency in Practice chapter # 3 author has suggested not to share the mutable state. Further he has added that below code is not a good way to share the states.
class UnsafeStates {
private String[] states = new String[] {
"AK", "AL"
};
public String[] getStates() {
return states;
}
}
From the book:
Publishing states in this way is problematic because any caller can modify its contents. In this case, the states array has escaped its intended scope, because what was supposed to be private state has been effectively made public.
My question here is: we often use getter and setters to access the class level private mutable variables. if it is not the correct way, what is the correct way to share the state? what is the proper way to encapsulate states ?
For primitive types, int, float etc, using a simple getter like this does not allow the caller to set its value:
someObj.getSomeInt() = 10; // error!
However, with an array, you could change its contents from the outside, which might be undesirable depending on the situation:
someObj.getSomeArray()[0] = newValue; // perfectly fine
This could lead to problems where a field is unexpectedly changed by other parts of code, causing hard-to-track bugs.
What you can do instead, is to return a copy of the array:
public String[] getStates() {
return Arrays.copyOf(states, states.length);
}
This way, even the caller changes the contents of the returned array, the array held by the object won't be affected.
With what you have it is possible for someone to change the content of your private array just through the getter itself:
public static void main(String[] args) {
UnsafeStates us = new UnsafeStates();
us.getStates()[0] = "VT";
System.out.println(Arrays.toString(us.getStates());
}
Output:
[VT, AR]
If you want to encapsulate your States and make it so they cannot change then it might be better to make an enum:
public enum SafeStates {
AR,
AL
}
Creating an enum gives a couple advantages. It allows exact vales that people can use. They can't be modified, its easy to test against and can easily do a switch statement on it. The only downfall for going with an enum is that the values have to be known ahead of time. I.E you code for it. Cannot be created at run time.
This question seems to be asked with respect to concurrency in particular.
Firstly, of course, there is the possibility of modifying non-primitive objects obtained via simple-minded getters; as others have pointed out, this is a risk even with single-threaded programs. The way to avoid this is to return a copy of an array, or an unmodifiable instance of a collection: see for example Collections.unmodifiableList.
However, for programs using concurrency, there is risk of returning the actual object (i.e., not a copy) even if the caller of the getter does not attempt to modify the returned object. Because of concurrent execution, the object could change "while he is looking at it", and in general this lack of synchronization could cause the program to malfunction.
It's difficult to turn the original getStates example into a convincing illustration of my point, but imagine a getter that returns a Map instead. Inside the owning object, correct synchronization may be implemented. However, a getTheMap method that returns just a reference to the Map is an invitation for the caller to call Map methods (even if just map.get) without synchronization.
There are basically two options to avoid the problem: (1) return a deep copy; an unmodifiable wrapper will not suffice in this case, and it should be a deep copy otherwise we just have the same problem one layer down, or (2) do not return unmediated references; instead, extend the method repertoire to provide exactly what is supportable, with correct internal synchronization.
I'm learning how to code in Java. I'm a little confused by "return;" and what it does and when we use it. Please see the following example of code:
public int something() {
return 1;
}
public static void main() {
int returnValue = something();
System.out.println(returnValue);
//Prints 1
}
Why wouldn't we just store 1 into a int variable called something then use System.out.print(something);
When would we use the return method instead of simply storing into a variable?
Thank you
Sure you could store into a variable but then you would lose one of the very important features, namely the ability to call the method inside itself.
This is relevant for algorithms that divide the work into smaller chunks and invoke themselves on the smaller chunks (and then combine the individual result to a big result). This is very common in sorting algorithms. The technical term is recursion.
Usually the compiler actually does exactly this; creates a variable for storing the value from where the calling code can pick it up. This variable is typically put in the same location - the stack - as the parameters passed in to the called method, and is invisible to your code.
(Also it is needed to make it threadsafe which is essential to utilizing more than one core on a modern cpu).
I have class, which contains variables of multiple type, most of those (about 30) are double:
String something;
double x;
double y;
double z;
...
I want to iterate over doubles, but also keep them written in this "classic way", not inside array, because derived classes use most of them. The function I am having problem with now is how to iterate across all the double type variables, find how many of those are non zero and then pick one of all these variables randomly. There will be thousands of instances of this class and as I said, there are classes that expand this one. So I am working on solution, preferably something like pseudo:
nonzeros = 0
foreach doubleVarInClass variable
{
if (variable != 0)
nonzeros++;
}
if (nonzeros < parameter)
{
randomDoubleVarInClass = random.next(...);
}
One solution which I was thinking about was to use HashMap to keep all the variables in, but then I will have to rewrite all classes that uses this one and not sure how it will affect performance, since it will be pretty intensively used all the time. Should I be afraid of performance and try something with classic arrays perhaps? I'd like to atleast keep variable names if nothing. I thought about array with references to these variables, so I can keep them written this way, not sure if its possible due to value passing in Java.
Also maybe there is some structure that keeps info about how many of those are non zero or have efficient function for it?
Thank you for any info that could solve my problem :)
I suggest using reflection for this. Suppose you have instance of your class named o:
int nonzeros = 0;
for (Field f : o.getClass().getDeclaredFields()) {
f.setAccessible(true);
if (f.getType().equals(Double.TYPE) && f.getDouble(o) != 0.0) {
nonzeros++;
}
}
NOTE: Java Reflection will probably be bad idea from the point of performance, and you should test this first, from that point of view. Besides that, this provide easy checks without any changes in your class definition. In Java 6 performance of reflection is little better than on older versions, and you should check this in your personal use case and in your environment.
I'm cleaning up Java code for someone who starts their functions by declaring all variables up top, and initializing them to null/0/whatever, as opposed to declaring them as they're needed later on.
What are the specific guidelines for this? Are there optimization reasons for one way or the other, or is one way just good practice? Are there any cases where it's acceptable to deviate from whatever the proper way of doing it is?
Declare variables as close to the first spot that you use them as possible. It's not really anything to do with efficiency, but makes your code much more readable. The closer a variable is declared to where it is used, the less scrolling/searching you have to do when reading the code later. Declaring variables closer to the first spot they're used will also naturally narrow their scope.
The proper way is to declare variables exactly when they are first used and minimize their scope in order to make the code easier to understand.
Declaring variables at the top of functions is a holdover from C (where it was required), and has absolutely no advantages (variable scope exists only in the source code, in the byte code all local variables exist in sequence on the stack anyway). Just don't do it, ever.
Some people may try to defend the practice by claiming that it is "neater", but any need to "organize" code within a method is usually a strong indication that the method is simply too long.
From the Java Code Conventions, Chapter 6 on Declarations:
6.3 Placement
Put declarations only at the beginning
of blocks. (A block is any code
surrounded by curly braces "{" and
"}".) Don't wait to declare variables
until their first use; it can confuse
the unwary programmer and hamper code
portability within the scope.
void myMethod() {
int int1 = 0; // beginning of method block
if (condition) {
int int2 = 0; // beginning of "if" block
...
}
}
The one exception to the rule is
indexes of for loops, which in Java
can be declared in the for statement:
for (int i = 0; i < maxLoops; i++) { ... }
Avoid local declarations that hide
declarations at higher levels. For
example, do not declare the same
variable name in an inner block:
int count;
...
myMethod() {
if (condition) {
int count = 0; // AVOID!
...
}
...
}
If you have a kabillion variables used in various isolated places down inside the body of a function, your function is too big.
If your function is a comfortably understandable size, there's no difference between "all up front" and "just as needed".
The only not-up-front variable would be in the body of a for statement.
for( Iterator i= someObject.iterator(); i.hasNext(); )
From Google Java Style Guide:
4.8.2.2 Declared when needed
Local variables are not habitually declared at the start of their
containing block or block-like construct. Instead, local variables are
declared close to the point they are first used (within reason), to
minimize their scope. Local variable declarations typically have
initializers, or are initialized immediately after declaration.
Well, I'd follow what Google does, on a superficial level it might seem that declaring all variables at the top of the method/function would be "neater", it's quite apparent that it'd be beneficial to declare variables as necessary. It's subjective though, whatever feels intuitive to you.
I've found that declaring them as-needed results in fewer mistakes than declaring them at the beginning. I've also found that declaring them at the minimum scope possible to also prevent mistakes.
When I looked at the byte-code generated by the location of the declaration few years ago, I found they were more-or-less identical. There were ocassionally differences depending on when they were assigned. Even something like:
for(Object o : list) {
Object temp = ...; //was not "redeclared" every loop iteration
}
vs
Object temp;
for(Object o : list) {
temp = ...; //nearly identical bytecoode, if not exactly identical.
}
Came out more or less identical
I am doing this very same thing at the moment. All of the variables in the code that I am reworking are declared at the top of the function. I've seen as I've been looking through this that several variables are declared but NEVER used or they are declared and operations are being done with them (ie parsing a String and then setting a Calendar object with the date/time values from the string) but then the resulting Calendar object is NEVER used.
I am going through and cleaning these up by taking the declarations from the top and moving them down in the function to a spot closer to where it is used.
Defining variable in a wider scope than needed hinders understandability quite a bit. Limited scope signals that this variable has meaning for only this small block of code and you can not think about when reading further. This is a pretty important issue because of the tiny short-term working memory that the brain has (it said that on average you can keep track of only 7 things). One less thing to keep track of is significant.
Similarly you really should try to avoid variables in the literal sense. Try to assign all things once, and declare them final so this is known to the reader. Not having to keep track whether something changes or not really cuts the cognitive load.
Principle: Place local variable declarations as close to their first use as possible, and NOT simply at the top of a method. Consider this example:
/** Return true iff s is a blah or a blub. */
public boolean checkB(String s) {
// Return true if s is a blah
... code to return true if s is a blah ...
// Return true if s is a blub. */
int helpblub= s.length() + 1;
... rest of code to return true is s is a blah.
return false;
}
Here, local variable helpblub is placed where it is necessary, in the code to test whether s is a blub. It is part of the code that implements "Return true is s is a blub".
It makes absolutely no logical sense to put the declaration of helpblub as the first statement of the method. The poor reader would wonder, why is that variable there? What is it for?
I think it is actually objectively provable that the declare-at-the-top style is more error-prone.
If you mutate-test code in either style by moving lines around at random (to simulate a merge gone bad or someone unthinkingly cut+pasting), then the declare-at-the-top style has a greater chance of compiling while functionally wrong.
I don't think declare-at-the-top has any corresponding advantage that doesn't come down to personal preference.
So assuming you want to write reliable code, learn to prefer doing just-in-time declaration.
Its a matter of readability and personal preference rather than performance. The compiler does not care and will generate the same code anyway.
I've seen people declare at the top and at the bottom of functions. I prefer the top, where I can see them quickly. It's a matter of choice and preference.
I recently started a new project and I'm trying to keep my instance variables always initialized to some value, so none of them is at any time null. Small example below:
public class ItemManager {
ItemMaster itemMaster;
List<ItemComponentManager> components;
ItemManager() {
itemMaster = new ItemMaster();
components = new ArrayList<ItemComponentManager>();
}
...
}
The point is mainly to avoid the tedious checking for null before using an instance variable somewhere in the code. So far, it's working good and you mostly don't need the null-value as you can check also for empty string or empty list, etc. I'm not using this approach for method scoped variables as their scope is very limited and so doesn't affect other parts of the code.
This all is kind of experimental, so I'd like to know if this approach could work or if there are some pitfalls which I'm not seeing yet. Is it generally a good idea to keep instance variables initialized?
I usually treat an empty collection and a null collection as two separate things:
An empty collection implies that I know there are zero items available. A null collection will tell me that I don't know the state of the collection, which is a different thing.
So I really do not think it's an either/or. And I would declare the variable final if I initialize them in the constructor. If you declare it final it becomes very clear to the reader that this collection cannot be null.
First and foremost, all non-final instance variables must be declared private if you want to retain control!
Consider lazy instantiation as well -- this also avoids "bad state" but only initializes upon use:
class Foo {
private List<X> stuff;
public void add(X x) {
if (stuff == null)
stuff = new ArrayList<X>();
stuff.add(x);
}
public List<X> getStuff() {
if (stuff == null)
return Collections.emptyList();
return Collections.unmodifiableList(stuff);
}
}
(Note the use of Collections.unmodifiableList -- unless you really want a caller to be able to add/remove from your list, you should make it immutable)
Think about how many instances of the object in question will be created. If there are many, and you always create the lists (and might end up with many empty lists), you could be creating many more objects than you need.
Other than that, it's really a matter of taste and if you can have meaningful values when you construct.
If you're working with a DI/IOC, you want the framework to do the work for you (though you could do it through constructor injection; I prefer setters)
-- Scott
I would say that is totally fine - just as long as you remember that you have "empty" placeholder values there and not real data.
Keeping them null has the advantage of forcing you to deal with them - otherwise the program crashes. If you create empty objects, but forget them you get undefined results.
And just to comment on the defencive coding - If you are the one creating the objects and are never setting them null, there is no need to check for null every time. If for some reason you get null value, then you know something has gone catastrophically wrong and the program should crash anyway.
I would make them final if possible. Then they have to be initialized in the constructor and cannot become null.
You should also make them private in any case, to prevent other classes from assigning null to them. If you can check that null is never assigned in your class then the approach will work.
I have come across some cases where this causes problems.
During deserialization, some frameworks will not call the constructor, I don't know how or why they choose to do this but it happens. This can result in your values being null. I have also come across the case where the constructor is called but for some reason member variables are not initialized.
In actual fact I'd use the following code instead of yours:
public class ItemManager {
ItemMaster itemMaster = new ItemMaster();
List<ItemComponentManager> components = new ArrayList<ItemComponentManager>();
ItemManager() {
...
}
...
}
The way I deal with any variable I declare is to decide if it will change over the lifetime of the object (or class if it is static). If the answer is "no" then I make it final.
Making it final forces you to give it a value when the object is created... personally I would do the following unless I knew that I would be changing what the point at:
private final ItemMaster itemMaster;
private final List components;
// instance initialization block - happens at construction time
{
itemMaster = new ItemMaster();
components = new ArrayList();
}
The way your code is right now you must check for null all the time because you didn't mark the variables as private (which means that any class in the same package can change the values to null).
Yes, it is very good idea to initialize all class variables in the constructor.
The point is mainly to avoid the
tedious checking for null before using
a class variable somewhere in the
code.
You still have to check for null. Third party libraries and even the Java API will sometimes return null.
Also, instantiating an object that may never be used is wasteful, but that would depend on the design of your class.
An object should be 100% ready for use after it's constructed. Users should not have to be checking for nulls. Defensive programming is the way to go - keep the checks.
In the interest of DRY, you can put the checks in the setters and simply have the constructor call them. That way you don't code the checks twice.
If it's all your code and you want to set that convention, it should be a nice thing to have. I agree with Paul's comment, though, that nothing prevents some errant code from accidentally setting one of your class variables to null. As a general rule, I always check for null. Yeah, it's a PITA, but defensive coding can be a good thing.
From the name of the class "ItemManager", ItemManager sounds like a singleton in some app. If so you should investigate and really, really, know Dependency Injection. Use something like Spring ( http://www.springsource.org/ ) to create and inject the list of ItemComponentManagers into ItemManager.
Without DI, Initialization by hand in serious apps is a nightmare to debug and connecting up various "manager" classes to make narrow tests is hell.
Use DI always (even when constructing tests). For data objects, create a get() method that creates the list if it doesn't exist. However if the object is complex, almost certainly will find your life better using the Factory or Builder pattern and have the F/B set the member variables as needed.
What happens if in one of your methods you set
itemMaster = null;
or you return a reference to the ItemManager to some other class and it sets itemMaster as null.
(You can guard against this easily return a clone of your ItemManager etc)
I would keep the checks as this is possible.