Should I keep instance variables in Java always initialized or not? - java

I recently started a new project and I'm trying to keep my instance variables always initialized to some value, so none of them is at any time null. Small example below:
public class ItemManager {
ItemMaster itemMaster;
List<ItemComponentManager> components;
ItemManager() {
itemMaster = new ItemMaster();
components = new ArrayList<ItemComponentManager>();
}
...
}
The point is mainly to avoid the tedious checking for null before using an instance variable somewhere in the code. So far, it's working well, and you mostly don't need the null value because you can also check for an empty string, an empty list, etc. I'm not using this approach for method-scoped variables, as their scope is very limited and so doesn't affect other parts of the code.
This all is kind of experimental, so I'd like to know if this approach could work or if there are some pitfalls which I'm not seeing yet. Is it generally a good idea to keep instance variables initialized?

I usually treat an empty collection and a null collection as two separate things:
An empty collection implies that I know there are zero items available. A null collection will tell me that I don't know the state of the collection, which is a different thing.
So I really do not think it's an either/or. And I would declare the variables final if I initialize them in the constructor. If you declare a collection final and assign it in the constructor, it becomes very clear to the reader that it cannot be null.

First and foremost, all non-final instance variables must be declared private if you want to retain control!
Consider lazy instantiation as well -- this also avoids "bad state" but only initializes upon use:
class Foo {
private List<X> stuff;
public void add(X x) {
if (stuff == null)
stuff = new ArrayList<X>();
stuff.add(x);
}
public List<X> getStuff() {
if (stuff == null)
return Collections.emptyList();
return Collections.unmodifiableList(stuff);
}
}
(Note the use of Collections.unmodifiableList -- unless you really want a caller to be able to add to or remove from your list, you should return an unmodifiable view.)
Think about how many instances of the object in question will be created. If there are many, and you always create the lists (and might end up with many empty lists), you could be creating many more objects than you need.
Other than that, it's really a matter of taste and if you can have meaningful values when you construct.
If you're working with a DI/IOC, you want the framework to do the work for you (though you could do it through constructor injection; I prefer setters)
-- Scott

I would say that is totally fine - just as long as you remember that you have "empty" placeholder values there and not real data.
Keeping them null has the advantage of forcing you to deal with them - otherwise the program crashes. If you create empty objects, but forget them you get undefined results.
And just to comment on the defensive coding: if you are the one creating the objects and you never set them to null, there is no need to check for null every time. If for some reason you get a null value, then you know something has gone catastrophically wrong and the program should crash anyway.

I would make them final if possible. Then they have to be initialized in the constructor and cannot become null.
You should also make them private in any case, to prevent other classes from assigning null to them. If you can check that null is never assigned in your class then the approach will work.
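A minimal sketch of the private-and-final advice, assuming a simplified stand-in class (Inventory and its String components are hypothetical names, not the question's real types). The compiler forces the final field to be assigned during construction, and the null check rejects a bad argument immediately:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical, simplified stand-in for the question's ItemManager.
final class Inventory {
    // private: no other class can assign null to it.
    // final: must be assigned exactly once during construction.
    private final List<String> components;

    Inventory(List<String> initial) {
        // Fail fast on a null argument; copy so the caller's list is not shared.
        this.components = new ArrayList<>(Objects.requireNonNull(initial, "initial"));
    }

    void add(String component) {
        components.add(Objects.requireNonNull(component, "component"));
    }

    List<String> getComponents() {
        // Read-only view, so callers cannot modify the internal list.
        return Collections.unmodifiableList(components);
    }
}
```

With this shape, code inside the class never needs to test components for null.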

I have come across some cases where this causes problems.
During deserialization, some frameworks do not call the constructor. I don't know how or why they choose to do this, but it happens, and it can leave your values null. I have also come across cases where the constructor is called but, for some reason, member variables are not initialized.
In actual fact I'd use the following code instead of yours:
public class ItemManager {
ItemMaster itemMaster = new ItemMaster();
List<ItemComponentManager> components = new ArrayList<ItemComponentManager>();
ItemManager() {
...
}
...
}
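The deserialization pitfall can be reproduced with standard java.io serialization, which restores a Serializable class without running its constructor or its field initializers. The sketch below is only an illustration: Basket and its transient scratch field are hypothetical names, and other frameworks may behave differently.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class used only to demonstrate the effect.
class Basket implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: not written to the stream, so nothing restores it on read.
    private transient List<String> scratch = new ArrayList<>();

    List<String> getScratch() {
        return scratch;
    }

    // Serializes and deserializes the object in memory.
    static Basket roundTrip(Basket original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Basket) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A freshly constructed Basket has a non-null scratch list, but the copy returned by roundTrip has scratch == null. With java.io serialization, one common remedy is a private readObject method that re-creates such fields.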

The way I deal with any variable I declare is to decide if it will change over the lifetime of the object (or class if it is static). If the answer is "no" then I make it final.
Making it final forces you to give it a value when the object is created. Personally, I would do the following unless I knew that I would be changing what they point at:
private final ItemMaster itemMaster;
private final List components;
// instance initialization block - happens at construction time
{
itemMaster = new ItemMaster();
components = new ArrayList();
}
The way your code is right now you must check for null all the time because you didn't mark the variables as private (which means that any class in the same package can change the values to null).

Yes, it is a very good idea to initialize all class variables in the constructor.

The point is mainly to avoid the tedious checking for null before using a class variable somewhere in the code.
You still have to check for null. Third party libraries and even the Java API will sometimes return null.
Also, instantiating an object that may never be used is wasteful, but that would depend on the design of your class.

An object should be 100% ready for use after it's constructed. Users should not have to be checking for nulls. Defensive programming is the way to go - keep the checks.
In the interest of DRY, you can put the checks in the setters and simply have the constructor call them. That way you don't code the checks twice.
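One way to read the DRY suggestion, as a sketch with a hypothetical Customer class: the null check lives only in the setter, and the constructor delegates to it. The setter is declared final here as an added precaution, so that the constructor never calls a method a subclass could override.

```java
import java.util.Objects;

// Hypothetical class; the names are for illustration only.
class Customer {
    private String name;

    Customer(String name) {
        setName(name); // reuse the setter's validation instead of repeating it
    }

    // final so a subclass cannot override a method that the constructor calls.
    final void setName(String name) {
        this.name = Objects.requireNonNull(name, "name must not be null");
    }

    String getName() {
        return name;
    }
}
```

Both new Customer(null) and setName(null) then fail at the same single check.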

If it's all your code and you want to set that convention, it should be a nice thing to have. I agree with Paul's comment, though, that nothing prevents some errant code from accidentally setting one of your class variables to null. As a general rule, I always check for null. Yeah, it's a PITA, but defensive coding can be a good thing.

From the name of the class "ItemManager", ItemManager sounds like a singleton in some app. If so you should investigate and really, really, know Dependency Injection. Use something like Spring ( http://www.springsource.org/ ) to create and inject the list of ItemComponentManagers into ItemManager.
Without DI, Initialization by hand in serious apps is a nightmare to debug and connecting up various "manager" classes to make narrow tests is hell.
Use DI always (even when constructing tests). For data objects, create a get() method that creates the list if it doesn't exist. However, if the object is complex, you will almost certainly find your life easier using the Factory or Builder pattern and having the factory or builder set the member variables as needed.

What happens if in one of your methods you set
itemMaster = null;
or you return a reference to the ItemManager to some other class and it sets itemMaster as null.
(You can guard against this easily by returning a clone of your ItemManager, etc.)
I would keep the checks as this is possible.


What is wrong in sharing Mutable State? [duplicate]

In Java Concurrency in Practice chapter # 3 author has suggested not to share the mutable state. Further he has added that below code is not a good way to share the states.
class UnsafeStates {
private String[] states = new String[] {
"AK", "AL"
};
public String[] getStates() {
return states;
}
}
From the book:
Publishing states in this way is problematic because any caller can modify its contents. In this case, the states array has escaped its intended scope, because what was supposed to be private state has been effectively made public.
My question here is: we often use getter and setters to access the class level private mutable variables. if it is not the correct way, what is the correct way to share the state? what is the proper way to encapsulate states ?
For primitive types, int, float etc, using a simple getter like this does not allow the caller to set its value:
someObj.getSomeInt() = 10; // error!
However, with an array, you could change its contents from the outside, which might be undesirable depending on the situation:
someObj.getSomeArray()[0] = newValue; // perfectly fine
This could lead to problems where a field is unexpectedly changed by other parts of code, causing hard-to-track bugs.
What you can do instead, is to return a copy of the array:
public String[] getStates() {
return Arrays.copyOf(states, states.length);
}
This way, even if the caller changes the contents of the returned array, the array held by the object won't be affected.
With what you have it is possible for someone to change the content of your private array just through the getter itself:
public static void main(String[] args) {
UnsafeStates us = new UnsafeStates();
us.getStates()[0] = "VT";
System.out.println(Arrays.toString(us.getStates()));
}
Output:
[VT, AL]
If you want to encapsulate your States and make it so they cannot change then it might be better to make an enum:
public enum SafeStates {
AR,
AL
}
Creating an enum gives a couple of advantages. It defines the exact values that people can use. The values can't be modified, are easy to test against, and work directly in a switch statement. The only downside of an enum is that the values have to be known ahead of time, i.e. you code them in; they cannot be created at run time.
This question seems to be asked with respect to concurrency in particular.
Firstly, of course, there is the possibility of modifying non-primitive objects obtained via simple-minded getters; as others have pointed out, this is a risk even with single-threaded programs. The way to avoid this is to return a copy of an array, or an unmodifiable instance of a collection: see for example Collections.unmodifiableList.
However, for programs using concurrency, there is risk of returning the actual object (i.e., not a copy) even if the caller of the getter does not attempt to modify the returned object. Because of concurrent execution, the object could change "while he is looking at it", and in general this lack of synchronization could cause the program to malfunction.
It's difficult to turn the original getStates example into a convincing illustration of my point, but imagine a getter that returns a Map instead. Inside the owning object, correct synchronization may be implemented. However, a getTheMap method that returns just a reference to the Map is an invitation for the caller to call Map methods (even if just map.get) without synchronization.
There are basically two options to avoid the problem: (1) return a deep copy; an unmodifiable wrapper will not suffice in this case, and it should be a deep copy otherwise we just have the same problem one layer down, or (2) do not return unmediated references; instead, extend the method repertoire to provide exactly what is supportable, with correct internal synchronization.
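Option (2) might look roughly like the following sketch. CapitalRegistry and its methods are hypothetical; the point is only that the Map itself never leaves the class, and every access goes through a synchronized method. The snapshot method returns a copy made under the lock, and a shallow copy is enough here only because String keys and values are immutable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class illustrating mediated access instead of a getTheMap() getter.
class CapitalRegistry {
    private final Map<String, String> capitals = new HashMap<>();

    synchronized void put(String state, String capital) {
        capitals.put(state, capital);
    }

    synchronized String get(String state) {
        return capitals.get(state);
    }

    // Copy taken while holding the lock; callers may modify it freely.
    synchronized Map<String, String> snapshot() {
        return new HashMap<>(capitals);
    }
}
```

Callers can read and write entries, but they never hold a reference to the internal map, so they cannot touch it without synchronization.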

Can a class be nullified from within the class itself?

For example, is this code valid?.
class abc{
int x,y;
abc(int x,int y){
this.x=x;
this.y=y;
while(true)
update();
}
public void update(){
x--;
y--;
if(y==0)
this=null;
}
}
If the above is not valid, then please explain why. I am in need of a class that after certain iterations ceases to exist. Please suggest alternatives to the above approach.
No, this code is not valid.
Furthermore, I don't see what meaningful semantics it could have had if it were valid.
Please suggest alternatives to the above approach.
The object exists for as long as there are references to it. To make the object eligible for garbage collection you simply need to ensure that there are no references pointing to it (in your case, this should happen as soon as y reaches zero).
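As a sketch of that alternative, the object can report that it is finished and let its owner drop the reference, rather than trying to null itself. Countdown and CountdownOwner are hypothetical names; once the owner removes the last reference, the object becomes eligible for garbage collection.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical replacement for the question's abc class.
class Countdown {
    private int remaining;

    Countdown(int remaining) {
        this.remaining = remaining;
    }

    // Decrements the counter and reports whether this object is finished.
    boolean tick() {
        remaining--;
        return remaining <= 0;
    }
}

// Hypothetical owner that holds the only references to the countdowns.
class CountdownOwner {
    private final List<Countdown> active = new ArrayList<>();

    void add(Countdown countdown) {
        active.add(countdown);
    }

    void tickAll() {
        Iterator<Countdown> it = active.iterator();
        while (it.hasNext()) {
            if (it.next().tick()) {
                it.remove(); // drop the reference; the GC can reclaim the object later
            }
        }
    }

    int activeCount() {
        return active.size();
    }
}
```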
No. The reason is that you do not make an object null. When you write obj = null; you just store null in a variable that previously held a reference to the object. There are probably many other references to the same object.
I think what you want is to invalidate the object and make it eligible for garbage collection, but to take this decision inside the class. If that is the problem, I'd recommend taking a look at weak references.
Another possible solution is to implement a kind of "smart reference" in Java. You can create your own class SmartReference that holds the real reference to the object. The object should hold a callback to this smart reference and call its invalidate() method, which does something like your syntactically invalid expression this = null. You must take care not to refer to such objects directly, but only via the smart reference.
The only question is "why do you want to do this?". Really, this will make the code more complicated and unstable. Imagine: the object decides to invalidate itself, so the reference that the smart reference is holding becomes null. Now all holders of this smart reference will get an NPE when trying to use the object! This is exactly why such a mechanism does not exist in Java and why the application programmer cannot manage memory directly.
Bottom line: remove all references to the object and let the GC do its hard job. Trust it. It knows how to clean up the garbage.
I think this is a good question.
I've had loads of cases where I'd like Objects to validate themselves after/during construction and if it finds reason to, to just return an empty value or go back up the stack and skip over creating that object.
Mostly in the case of where you are creating a list of objects from a list of other values. If a value is garbage and you want your object to recognise this.
Rather than having to code a function outside the class itself to validate the creation, it would be much neater to allow the object to do it.
It's a shame java doesn't allow for things like this on the assumption the programmer is probably going to mess it up. If you code well it would be a nice feature.
I think you need to rethink why you want to do this, because what you're suggesting doesn't even exist as a concept in Java.
The this variable always refers to the object itself. You can't "nullify" an object, only a reference (since after all, what you're doing is assigning a reference to point to null instead of its previous object). It wouldn't make sense to do that with this, as it's always a pointer to the current object in scope.
Are you trying to force an object to be destroyed/garbage collected? If so, you can't do that while other parts of your code still have references to it (and if they don't have references, it will be garbage collected anyway).
What did you hope/think this would do, anyway?
Your code will produce a compile-time error, because:
The left-hand side of an assignment must be a variable
this is not a variable; it is a keyword:
this=null;

java - is initializing a temporary variable for simple getters better or not?

A very unimportant question about Java performance, but it made me wondering today.
Say I have simple getter:
public Object getSomething() {
return this.member;
}
Now, say I need the result of getSomething() twice (or more) in some function/algorithm. My question: is there any difference in either calling getSomething() twice (or more) or in declaring a temporary, local variable and use this variable from then on?
That is, either
public void algo() {
Object o = getSomething();
... use o ...
}
or
public void algo() {
... call getSomething() multiple times ...
}
I tend to mix both options, for no specific reason. I know it doesn't matter, but I am just wondering.
Thanks!
Technically, it's faster to not call the method multiple times, however this might not always be the case. The JVM might optimize the method calls to be inline and you won't see the difference at all. In any case, the difference is negligible.
However, it's probably safer to always use a getter. What if the value of the state changes between your calls? If you want to use a consistent version, then you can save the value from the first call. Otherwise, you probably want to always use the getter.
In any case, you shouldn't base this decision on performance because it's so negligible. I would pick one and stick with it consistently. I would recommend always going through your getters/setters.
Getters and setters are about encapsulation and abstraction. When you decide to invoke the getter multiple times, you are making assumptions about the inner workings of that class. For example that it does no expensive calculations, or that the value is not changed by other threads.
I'd argue that it's better to call the getter once and store its result in a temporary variable, thus allowing you to freely refactor the implementing class.
As an anecdote, I was once bitten by a change where a getter returned an array, but the implementing class was changed from an array property to using a list and doing the conversion in the getter.
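The anecdote can be illustrated with a hypothetical Playlist class whose getter converts an internal list to an array on every call. The conversions counter exists only for this demonstration. Each call does the work again and returns a different array, which is why calling the getter once and keeping the result in a local variable can matter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class; the storage was "refactored" from an array to a list.
class Playlist {
    private final List<String> tracks = new ArrayList<>();
    private int conversions = 0; // demonstration only: counts getter calls

    void add(String track) {
        tracks.add(track);
    }

    // Builds a fresh array on every call.
    String[] getTracks() {
        conversions++;
        return tracks.toArray(new String[0]);
    }

    int getConversions() {
        return conversions;
    }
}
```

A caller that invokes getTracks() inside a loop pays for a conversion per iteration and never sees the same array twice.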
The compiler should optimize either one to be basically the same code.

Why would one want to use the public constructors on Boolean and similar immutable classes?

(For the purposes of this question, let us assume that one is intentionally not using auto(un)boxing, either because one is writing pre-Java 1.5 code, or because one feels that autounboxing makes it too easy to create NullPointerExceptions.)
Take Boolean, for example. The documentation for the Boolean(boolean) constructor says:
Note: It is rarely appropriate to use this constructor. Unless a new
instance is required, the static factory valueOf(boolean) is generally
a better choice. It is likely to yield significantly better space and time
performance.
My question is, why would you ever want to get a new instance in the first place? It seems like things would be simpler if constructors like that were private. For example, if they were, you could write this with no danger (even if myBoolean were null):
if (myBoolean == Boolean.TRUE)
It'd be safe because all true Booleans would be references to Boolean.TRUE and all false Booleans would be references to Boolean.FALSE. But because the constructors are public, someone may have used them, which means that you have to write this instead:
if (Boolean.TRUE.equals(myBoolean))
But where it really gets bad is when you want to check two Booleans for equality. Something like this:
if (myBooleanA == myBooleanB)
...becomes this:
if (
myBooleanA == myBooleanB ||
(myBooleanA != null && myBooleanA.equals(myBooleanB))
)
UPDATE: With the release of Java 7, java.util.Objects makes this simpler construct possible:
if (Objects.equals(myBooleanA, myBooleanB))
I can't think of any reason to have separate instances of these objects which is more compelling than not having to do the nonsense above. What say you?
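For reference, the two null-safe comparisons from the question can be wrapped as small helpers. BooleanCompare is a hypothetical name; Objects.equals requires Java 7 or later.

```java
import java.util.Objects;

// Hypothetical helper class wrapping the comparisons discussed above.
class BooleanCompare {
    // True when both are null or both hold the same boolean value.
    static boolean same(Boolean a, Boolean b) {
        return Objects.equals(a, b);
    }

    // Null-safe "is true" test: a null argument yields false.
    static boolean isTrue(Boolean value) {
        return Boolean.TRUE.equals(value);
    }
}
```

Values obtained from Boolean.valueOf(boolean) are always the shared Boolean.TRUE or Boolean.FALSE instances, so the == comparison only becomes unreliable when some code has called the public constructor.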
The cached values are never garbage collected, so use the constructors whenever you'd like to use them as soft/weak references, so that it can be garbage collected anyway whenever needed. The same applies on Long#valueOf(), Integer#valueOf() and consorts with values within cacheable ranges.
A reference search in Eclipse tells me that, among others, java.lang.Thread uses new Boolean() in a soft-reference-based cache; it's even explicitly commented (in the isCCLOverridden() method):
/*
* Note: only new Boolean instances (i.e., not Boolean.TRUE or
* Boolean.FALSE) must be used as cache values, otherwise cache
* entry will pin associated class.
*/
The constructors are public because of backwards compatibility... .valueOf() only got added in java 1.4...
Also using a Boolean as a tri-state variable in your example (null/TRUE/FALSE) is probably a bad idea -- better to use an enum (UNKNOWN,TRUE,FALSE), or if null is not a valid value, check for it, and manually unbox for testing equality.
These object types were needed because the Collection class only accepted objects, hence you couldn't use the native types.
This introduced the design flaw you are talking about and hence autoboxing was introduced.
EDIT
And the constructors are public because they were always public. Before the world of autoboxing, some very poor code relied on new Integer(0) != new Integer(0) being true. It was more a flaw of the original design than anything else; however, since it's part of the public interface now, they don't want to break old code.
I bet they could deprecate it now and most people would be ok with it since autoboxing just works.

Should Java method arguments be used to return multiple values?

Since arguments sent to a method in Java point to the original data structures in the caller method, did its designers intend for them to be used for returning multiple values, as is the norm in other languages like C?
Or is this a hazardous misuse of Java's general property that variables are pointers ?
A long time ago I had a conversation with Ken Arnold (one time member of the Java team), this would have been at the first Java One conference probably, so 1996. He said that they were thinking of adding multiple return values so you could write something like:
x, y = foo();
The recommended way of doing it back then, and now, is to make a class that has multiple data members and return that instead.
Based on that, and other comments made by people who worked on Java, I would say the intent is/was that you return an instance of a class rather than modify the arguments that were passed in.
This is common practice (as is the desire by C programmers to modify the arguments; eventually they usually see the Java way of doing it). Just think of it as returning a struct. :-)
(Edit based on the following comment)
I am reading a file and generating two arrays, of type String and int from it, picking one element for both from each line. I want to return both of them to any function which calls it which a file to split this way.
I think, if I am understanding you correctly, that I would probably do something like this:
// could go with the Pair idea from another post, but I personally don't like that way
class Line
{
// would use appropriate names
private final int intVal;
private final String stringVal;
public Line(final int iVal, final String sVal)
{
intVal = iVal;
stringVal = sVal;
}
public int getIntVal()
{
return (intVal);
}
public String getStringVal()
{
return (stringVal);
}
// equals/hashCode/etc... as appropriate
}
and then have your method like this:
public void foo(final File file, final List<Line> lines)
{
// add to the List.
}
and then call it like this:
{
final List<Line> lines;
lines = new ArrayList<Line>();
foo(file, lines);
}
In my opinion, if we're talking about a public method, you should create a separate class representing a return value. When you have a separate class:
it serves as an abstraction (i.e. a Point class instead of array of two longs)
each field has a name
can be made immutable
makes evolution of API much easier (i.e. what about returning 3 instead of 2 values, changing type of some field etc.)
I would always opt for returning a new instance, instead of actually modifying a value passed in. It seems much clearer to me and favors immutability.
On the other hand, if it is an internal method, I guess any of the following might be used:
an array (new Object[] { "str", longValue })
a list (Arrays.asList(...) returns a fixed-size list)
pair/tuple class, such as this
static inner class, with public fields
Still, I would prefer the last option, equipped with a suitable constructor. That is especially true if you find yourself returning the same tuple from more than one place.
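A sketch of the "separate return class" recommendation, applied to the file-splitting case discussed earlier. ParsedLine, LineParser, and the "number,label" input format are all assumptions made for illustration. The method returns a new list of immutable objects instead of filling a list passed in by the caller.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable value class holding the two values from one line.
final class ParsedLine {
    private final int number;
    private final String label;

    ParsedLine(int number, String label) {
        this.number = number;
        this.label = label;
    }

    int getNumber() {
        return number;
    }

    String getLabel() {
        return label;
    }
}

final class LineParser {
    // Assumed input format "number,label", chosen only for this example.
    static List<ParsedLine> parse(List<String> rawLines) {
        List<ParsedLine> result = new ArrayList<>();
        for (String raw : rawLines) {
            String[] parts = raw.split(",", 2);
            result.add(new ParsedLine(Integer.parseInt(parts[0].trim()), parts[1].trim()));
        }
        // Return a read-only view so callers cannot alter the result list.
        return Collections.unmodifiableList(result);
    }
}
```

If a third value is needed later, only ParsedLine changes; the method signature stays the same.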
I do wish there was a Pair<E,F> class in the JDK, mostly for this reason. There is Map.Entry<K,V>, but creating an instance was always a big pain.
Now I use com.google.common.collect.Maps.immutableEntry when I need a Pair
See this RFE launched back in 1999:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4222792
I don't think the intention was to ever allow it in the Java language, if you need to return multiple values you need to encapsulate them in an object.
Using languages like Scala however you can return tuples, see:
http://www.artima.com/scalazine/articles/steps.html
You can also use Generics in Java to return a pair of objects, but that's about it AFAIK.
EDIT: Tuples
Just to add some more on this. I've previously implemented a Pair in projects because of the lack within the JDK. Link to my implementation is here:
http://pbin.oogly.co.uk/listings/viewlistingdetail/5003504425055b47d857490ff73ab9
Note, there isn't a hashcode or equals on this, which should probably be added.
I also came across this whilst doing some research into this question; it provides tuple functionality:
http://javatuple.com/
It allows you to create Pair including other types of tuples.
You cannot truly return multiple values, but you can pass objects into a method and have the method mutate those values. That is perfectly legal. Note that you cannot pass an object in and have the object itself become a different object. That is:
private void myFunc(Object a) {
a = new Object();
}
will result in temporarily and locally changing the value of a, but this will not change the value of the caller, for example, from:
Object test = new Object();
myFunc(test);
After myFunc returns, you will have the old Object and not the new one.
Legal (and often discouraged) is something like this:
private void changeDate(final Date date) {
date.setTime(1234567890L);
}
I picked Date for a reason. This is a class that people widely agree should never have been mutable. The method above will change the internal value of any Date object that you pass to it. This kind of code is legal when it is very clear that the method will mutate or configure or modify what is being passed in.
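The difference between reassigning a parameter and mutating the object it refers to can be checked with a small sketch. ParameterDemo and its method names are hypothetical; StringBuilder stands in for any mutable class.

```java
// Hypothetical demo class contrasting reassignment with mutation.
class ParameterDemo {
    // Only the method's local copy of the reference changes; the caller is unaffected.
    static void reassign(StringBuilder sb) {
        sb = new StringBuilder("replaced");
    }

    // The caller's object itself is modified through the shared reference.
    static void mutate(StringBuilder sb) {
        sb.append("!");
    }
}
```

After calling reassign on a builder holding "hi", the caller still sees "hi"; after calling mutate, the caller sees "hi!".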
NOTE: Generally, it's said that a method should do one of these things:
Return void and mutate its incoming objects (like Collections.sort()), or
Return some computation and don't mutate incoming objects at all (like Collections.min()), or
Return a "view" of the incoming object but do not modify the incoming object (like Collections.checkedList() or Collections.singleton())
Mutate one incoming object and return it (Collections doesn't have an example, but StringBuilder.append() is a good example).
Methods that mutate incoming objects and return a separate return value are often doing too many things.
There are certainly methods that modify an object passed in as a parameter (see java.io.InputStream.read(byte[] buffer) as an example), but I have not seen parameters used as an alternative for a return value, especially with multiple parameters. It may technically work, but it is nonstandard.
It's not generally considered terribly good practice, but there are very occasional cases in the JDK where this is done. Look at the 'biasRet' parameter of View.getNextVisualPositionFrom() and related methods, for example: it's actually a one-dimensional array that gets filled with an "extra return value".
So why do this? Well, just to save you having to create an extra class definition for the "occasional extra return value". It's messy, inelegant, bad design, non-object-oriented, blah blah. And we've all done it from time to time...
Generally what Eddie said, but I'd add one more:
Mutate one of the incoming objects, and return a status code. This should generally only be used for arguments that are explicitly buffers, like Reader.read(char[] cbuf).
I had a Result object that cascades through a series of validating void methods as a method parameter. Each of these validating void methods would mutate the result parameter object to add the result of the validation.
But this is impossible to test because now I cannot stub the void method to return a stub value for the validation in the Result object.
So, from a testing perspective, it appears that one should favor returning an object instead of mutating a method parameter.
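A rough sketch of the return-based style, with hypothetical names (OrderValidator and its rules are invented for illustration). Each validator returns its findings, so a test can call one validator in isolation and assert on the returned value without inspecting a shared Result object.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical validators that return their errors instead of mutating a parameter.
class OrderValidator {
    static List<String> validateName(String name) {
        List<String> errors = new ArrayList<>();
        if (name == null || name.isEmpty()) {
            errors.add("name must not be empty");
        }
        return errors;
    }

    static List<String> validateQuantity(int quantity) {
        List<String> errors = new ArrayList<>();
        if (quantity <= 0) {
            errors.add("quantity must be positive");
        }
        return errors;
    }

    // Combines the individual results; none of the inputs is modified.
    static List<String> validate(String name, int quantity) {
        List<String> all = new ArrayList<>();
        all.addAll(validateName(name));
        all.addAll(validateQuantity(quantity));
        return all;
    }
}
```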
