Mutable or immutable class? - java

I read in a design book that immutable classes improve scalability and that it is good practice to write immutable classes wherever possible. But I think immutable classes increase object proliferation. So is it better to go with immutable classes, or with static classes (classes in which all the methods are static), to improve scalability?

The main benefit of immutable classes however is that you can expose internal data members that are immutable because the caller can't modify them. This is a huge problem with, say, java.util.Date. It's mutable so you can't return it directly from a method. This means you end up doing all sorts of defensive copying. That increases object proliferation.
The other major benefit is that immutable objects do not have synchronization issues, by definition. That's where the scalability issues come in. Writing multithreaded code is hard. Immutable objects are a good way of (mostly) circumventing the problem.
As for "static classes", by your comment I take it to mean classes with factory methods, which is how it's usually described. That's an unrelated pattern. Both mutable and immutable classes can either have public constructors or private constructors with static factory methods. That has no impact on the (im)mutability of the class since a mutable class is one whose state can be changed after creation whereas an immutable class's state cannot be changed after instantiation.
Static factory methods can have other benefits however. The idea is to encapsulate object creation.
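To make the defensive-copying point concrete, here is a minimal sketch. The Appointment class, its fields and its of() factory are hypothetical names made up for illustration; only java.util.Date comes from the answer above. It assumes you want an immutable class that still has to hold a mutable Date internally.

```java
import java.util.Date;

// Hypothetical immutable class that holds a mutable java.util.Date internally.
final class Appointment {
    private final String title;
    private final Date start; // Date is mutable, so it is never shared directly

    // Private constructor: instances are created through the static factory below.
    private Appointment(String title, Date start) {
        this.title = title;
        this.start = new Date(start.getTime()); // defensive copy on the way in
    }

    static Appointment of(String title, Date start) {
        return new Appointment(title, start);
    }

    String getTitle() { return title; } // String is immutable: safe to expose as-is

    Date getStart() {
        return new Date(start.getTime()); // defensive copy on the way out
    }
}

class AppointmentDemo {
    public static void main(String[] args) {
        Date when = new Date(0L);
        Appointment a = Appointment.of("review", when);
        when.setTime(1000L);         // mutating the caller's Date...
        a.getStart().setTime(2000L); // ...or the returned copy...
        System.out.println(a.getStart().getTime()); // ...leaves the internal state at 0
    }
}
```

The String field needs no copy at all, which is the saving the answer describes: every mutable member costs one extra object per call, while an immutable member can simply be handed out.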

Immutable classes do promote object proliferation, but if you want safety, mutable objects will promote more object proliferation because you have to return copies rather than the original to prevent the user from changing the object you return.
As for using classes with all static methods, that's not really an option in most cases where immutability could be used. Take this example from an RPG:
public class Weapon
{
final private int attackBonus;
final private int accuracyBonus;
final private int range;
public Weapon(int attackBonus, int accuracyBonus, int range)
{
this.attackBonus = attackBonus;
this.accuracyBonus = accuracyBonus;
this.range = range;
}
public int getAttackBonus() { return this.attackBonus; }
public int getAccuracyBonus() { return this.accuracyBonus; }
public int getRange() { return this.range; }
}
How exactly would you implement this with a class that contains only static methods?

As cletus said, immutable classes simplify class design and handling in synchronized methods.
They also simplify handling in collections, even in single-threaded applications. An immutable class will never change, so the key and hashcode won't change, so you won't screw up your collections.
But you should keep in mind the lifecycle of the thing you're modeling and the "weight" of the constructor. If you need to change the thing, immutable objects become more complex to deal with. You have to replace them, rather than modify them. Not terrible, but worth considering. And if the constructor takes nontrivial time, that's a factor too.
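"Replace rather than modify" can be sketched roughly as follows. This is a trimmed-down variant of the Weapon class from the earlier answer; the withRange method is an assumption added for illustration and is not part of the original class.

```java
// Trimmed-down variant of the Weapon example; withRange is a hypothetical addition.
final class Weapon {
    private final int attackBonus;
    private final int range;

    Weapon(int attackBonus, int range) {
        this.attackBonus = attackBonus;
        this.range = range;
    }

    int getAttackBonus() { return attackBonus; }
    int getRange() { return range; }

    // Returns a modified copy and leaves this instance untouched.
    Weapon withRange(int newRange) {
        return new Weapon(attackBonus, newRange);
    }
}

class WeaponDemo {
    public static void main(String[] args) {
        Weapon bow = new Weapon(3, 10);
        Weapon longbow = bow.withRange(25); // a new object replaces the old one
        System.out.println(bow.getRange() + " " + longbow.getRange()); // prints "10 25"
    }
}
```

Each change costs one constructor call, which is why the "weight" of the constructor matters for objects that change often.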

One thing to consider: If you intend to use instances of a class as keys in a HashMap, or if you're going to put them in a HashSet, it's safer to make them immutable.
HashMap and HashSet count on the fact that the hash code for an object remains constant as long as the object is in the map or set. If you use an object as a key in a HashMap, or if you put it in a HashSet, and then change the state of the object so that hashCode() would return a different value, then you're confusing the HashMap or HashSet and you'll get strange things; for example, when you iterate the map or set the object is there, but when you try to get it, it's as if it is not there.
This is due to how HashMap and HashSet work internally - they organize objects by hash code.
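The "it's there when you iterate, but not when you look it up" effect can be reproduced with a small sketch. MutableKey is a hypothetical class written for this demonstration; its hash code deliberately depends on a field that can change.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical mutable key whose hash code depends on mutable state.
class MutableKey {
    int id;

    MutableKey(int id) { this.id = id; }

    @Override public boolean equals(Object o) {
        return o instanceof MutableKey && ((MutableKey) o).id == id;
    }

    @Override public int hashCode() { return id; }
}

class MutableKeyDemo {
    public static void main(String[] args) {
        Set<MutableKey> set = new HashSet<>();
        MutableKey key = new MutableKey(1);
        set.add(key);

        key.id = 2; // the hash code changes while the object is in the set

        System.out.println(set.contains(key));            // false: the lookup uses the new hash code
        System.out.println(set.iterator().next() == key); // true: iteration still finds the object
    }
}
```

Making id final would rule this out at compile time.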
This article by Java concurrency guru Brian Goetz gives a good overview of the pros and cons of immutable objects.

Immutability is generally used to achieve scalability, since immutability is one of the enablers when it comes to concurrent programming in java. So while, as you point out, there may be more objects in an "immutable" solution, it may be a necessary step to improve concurrency.
The other, equally important use of immutability is to communicate a design intention: whoever made an immutable class intended you to put mutable state elsewhere. If you start mutating instances of that class, you are probably breaking the original intention of the design, and who knows what the consequences may be.

Consider string objects, as an example. Some languages or class libraries provide mutable strings, some don't.
A system that uses immutable strings can do certain optimizations that one with mutable strings cannot. For example, you can ensure that there is only one copy of any unique string. Since the size of the object "overhead" is generally much smaller than the size of any non-trivial string, this is a potentially massive memory savings. There are other potential space savings, like interning substrings.
Besides the potential memory savings, immutable objects can improve scalability by reducing contention. If you have a large number of threads accessing the same data, then immutable objects don't require elaborate synchronization processes for safe access.
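Java's own String shows the "only one copy of any unique string" optimization. This is a minimal sketch using String.intern() from the standard library; the class and variable names are just for illustration.

```java
class InternDemo {
    public static void main(String[] args) {
        String literal = "abc";
        String copy = new String("abc"); // explicitly allocates a second object

        System.out.println(literal == copy);          // false: two distinct objects
        System.out.println(literal.equals(copy));     // true: same characters
        System.out.println(literal == copy.intern()); // true: intern() returns the pooled instance
    }
}
```

Sharing one pooled instance is only safe because nobody can change its contents.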

Just one more consideration on the subject: using immutable objects allows you to cache them rather than re-create them every time (as with Strings), which helps a lot with your application's performance.

I think, if you want to share the same object among different variables, it needs to be immutable.
For instance:
String A = "abc";
String B = "abc";
String objects in Java are immutable. Now both A and B point to the same "abc" string.
Now
A = A + "123";
System.out.println(B);
it should output:
abc
Because String is immutable, A will simply point to new "abc123" string object instead of modifying the previous string object.


Understanding enumset for enumerators

As far as I understand, it would be much easier and clearer to use EnumSet, and that Set was specifically designed for use with enumerators. My question is whether we should consider using EnumSet every time we need to maintain a collection of enumerators. For instance, I have the following enum:
public enum ReportColumns{
ID,
NAME,
CURRENCY; //It may contain many more enumerators than I mentioned here
public int getMaintenanceValue(){
//impl
}
}
And I need to use a collection of the enum in a method:
public void persist(Collection<ReportColumns> cols){
List<Integer> ints = new LinkedList<>();
for(ReportColumns c: cols){
ints.add(c.getMaintenanceValue());
}
//do some persistence-related operations
}
So, if I don't care whether the collection is ordered or not, should I use EnumSet<E> every time to improve performance?
As a rule of thumb, whenever you create a collection of enums and don't care about their order, you should create an EnumSet. As you mentioned, it would give you a slight increase in performance (in fact, most static code analysis tools I know actually warn about not using it).
For a method declaration though, I wouldn't. As another rule of thumb, methods should use the "highest" type possible that still makes sense. The public contract of this method should be "give me a bunch of enums, and I'll persist them". The method shouldn't care what collection is passed, so it makes no sense forcing the parameter type to be an EnumSet. If the concrete type you pass is indeed an EnumSet you'll get all the performance benefits anyway.
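The two rules of thumb combine roughly like this. The sketch is a simplified stand-in for the question's code: the body of getMaintenanceValue() is a placeholder (the original leaves it unspecified), and ReportPersister and maintenanceValues are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.EnumSet;
import java.util.List;

enum ReportColumn {
    ID, NAME, CURRENCY;

    // Placeholder implementation; the original leaves the body unspecified.
    int getMaintenanceValue() { return ordinal(); }
}

class ReportPersister {
    // Accepts any Collection, so callers may pass an EnumSet, a List, and so on.
    static List<Integer> maintenanceValues(Collection<ReportColumn> cols) {
        List<Integer> ints = new ArrayList<>();
        for (ReportColumn c : cols) {
            ints.add(c.getMaintenanceValue());
        }
        return ints;
    }

    public static void main(String[] args) {
        // The caller chooses EnumSet; the method signature does not need to know.
        EnumSet<ReportColumn> cols = EnumSet.of(ReportColumn.CURRENCY, ReportColumn.ID);
        // EnumSet iterates in declaration order, regardless of insertion order.
        System.out.println(maintenanceValues(cols)); // prints "[0, 2]"
    }
}
```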

Is it bad design to pass reference of collections in constructor?

I often encounter code similar to this:
class AClass{
private Iterable<String> list;
public AClass(Iterable<String> list){ this.list = list; }
...
}
In this code, a reference to an Iterable is passed to AClass directly. The end result is equivalent to exposing the list reference directly to the outside. Even if you make AClass.list final, code outside AClass can still modify the contents of the list, which is bad.
To counter this, we would make a defensive copy in the constructor.
However, this kind of code is very common. Besides performance considerations, what is people's intention in writing this kind of code?
I don't see anything wrong with that pattern. If the class represents objects that operate on a list (or an iterable) then it's natural to provide that list to the constructor. If your class can't handle changes to the underlying collection, then it needs to be fixed or documented. Making a copy of the collection is one way to fix that.
Another option is to change the interface so that only immutable collections are allowed:
public AClass(ImmutableList<MyObject> objects) {
this.objects = objects;
...
You would need some kind of ImmutableList-class or interface of course.
Depending on the use and users of your classes you could also avoid making copies by documenting the known "weakness":
/**
* ...
* @param objects list of objects this AClass-object operates on.
* The list should not be modified during the lifetime
* of this object
*/
public AClass(List<MyObject> objects) ...
Simple answer, if it is your own code/small team, it is often just quicker, easier and less memory and CPU intensive to do things this way. Also, some people just don't know any better!
You might want to take a look at the copy constructor for a familiar idiom.
It's always good practice to make a copy, not only because other people could otherwise modify your values, but also for security reasons.
If the code is being used internally, as pointed out by other answers, it should not be a problem. But if you are exposing it as an API then there are two options:
First is to create a defensive copy and then return it
Second would be to create an unmodifiable collection and then return it, and document the fact that trying to change anything in the collection may result in an exception.
But the first option is more preferable.
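A defensive copy in the constructor might look roughly like this. The sketch reuses the AClass name from the question; the size() accessor and the demo class are hypothetical additions so that the effect can be observed.

```java
import java.util.ArrayList;
import java.util.List;

class AClass {
    private final List<String> items;

    AClass(Iterable<String> source) {
        // Defensive copy: later changes to the caller's collection are not seen here.
        List<String> copy = new ArrayList<>();
        for (String s : source) {
            copy.add(s);
        }
        this.items = copy;
    }

    int size() { return items.size(); } // hypothetical accessor, for the demo only
}

class AClassDemo {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("a");
        AClass a = new AClass(names);
        names.add("b");               // mutate the original after construction
        System.out.println(a.size()); // prints "1"
    }
}
```

The cost is one extra list per instance, which is the performance consideration the question mentions.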

Best Practice for Returning Object References

Consider this code snippet:
class MyClass{
private List myList;
//...
public List getList(){
return myList;
}
}
As Java passes object references by value, my understanding is that any object calling getList() will obtain a reference to myList, allowing it to modify myList despite it being private. Is that correct?
And, if it is correct, should I be using
return new LinkedList(myList);
to create a copy and pass back a reference to the copy, rather than the original, in order to prevent unauthorised access to the list referenced by myList?
I do that. Better yet, sometimes I return an unmodifiable copy using the Collections API.
If you don't, your reference is not private. Anyone that has a reference can alter your private state. Same holds true for any mutable reference (e.g., Date).
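For reference, here is a minimal sketch of both variants using Collections.unmodifiableList from the standard library. The method names getListView and getListSnapshot, and the add mutator, are hypothetical; the class otherwise mirrors MyClass from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class MyClass {
    private final List<String> myList = new ArrayList<>();

    void add(String s) { myList.add(s); } // hypothetical mutator, for the demo only

    // Read-only view: callers cannot modify it, but it reflects later changes to myList.
    List<String> getListView() {
        return Collections.unmodifiableList(myList);
    }

    // Read-only snapshot: a copy that callers cannot modify and that later changes do not affect.
    List<String> getListSnapshot() {
        return Collections.unmodifiableList(new ArrayList<>(myList));
    }
}

class MyClassDemo {
    public static void main(String[] args) {
        MyClass c = new MyClass();
        c.add("x");
        List<String> view = c.getListView();
        List<String> snapshot = c.getListSnapshot();
        c.add("y");
        System.out.println(view.size() + " " + snapshot.size()); // prints "2 1"
        try {
            view.add("z");
        } catch (UnsupportedOperationException e) {
            System.out.println("view is read-only");
        }
    }
}
```

The view costs no copying but lets callers observe later changes; the snapshot costs one copy per call and is fully isolated.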
It depends on what you want.
Do you want to expose the list and make it so people can edit it?
Or do you want to let people look at it, but not modify it?
There is no right or wrong way in this case. It just depends on your design needs.
There can be some cases when one would want to return the "raw" list to the caller. But in general, I think that it is bad practice, as it breaks encapsulation and therefore goes against OO.
If you must return the "raw" list and not a copy then it should be explicitly clear to the users of MyClass.
Yes, and it has a name: "defensive copy". Copying at the receiving end is also recommended. As Tom has noted, the behavior of the program is much easier to predict if the collection is immutable. So unless you have a very good reason, you should use an immutable collection.
When Google Guava becomes part of the Java standard library (I totally think it should), this would probably become the preferred idiom:
return ImmutableList.copyOf(someList);
and
void someMethod(List someList){ // method name is a placeholder; the original omits it
someList = ImmutableList.copyOf(someList);
// ...
}
This has an added bonus of performance, because the copyOf() method checks whether the collection is already an instance of immutable collection (instanceof ImmutableList) and if so, skips the copying.
I think that the pattern of making fields private and providing accessors is simply meant for data encapsulation. If you want something to be truly private, don't give it accessor methods! You can then write other methods that return immutable versions of your private data or copies thereof.

is there a performance hit when using enum.values() vs. String arrays?

I'm using enumerations to replace String constants in my java app (JRE 1.5).
Is there a performance hit when I treat the enum as a static array of names in a method that is called constantly (e.g. when rendering the UI)?
My code looks a bit like this:
public String getValue(int col) {
return ColumnValues.values()[col].toString();
}
Clarifications:
I'm concerned with a hidden cost related to enumerating values() repeatedly (e.g. inside paint() methods).
I can now see that all my scenarios include some int => enum conversion - which is not Java's way.
What is the actual price of extracting the values() array? Is it even an issue?
Android developers
Read Simon Langhoff's answer below, which was pointed out earlier by Geeks On Hugs in the accepted answer's comments. Enum.values() must do a defensive copy.
For enums, in order to maintain immutability, the backing array is cloned every time you call the values() method. This means that it will have a performance impact. How much depends on your specific scenario.
I have been monitoring my own Android app and found out that this simple call used 13.4% of CPU time in my specific case!
In order to avoid cloning the values array, I decided to simply cache the values as a private field and then loop through those values whenever needed:
private final static Protocol[] values = Protocol.values();
After this small optimisation my method call only hogged a negligible 0.0% CPU time
In my use case, this was a welcome optimisation, however, it is important to note that using this approach is a tradeoff of mutability of your enum. Who knows what people might put into your values array once you give them a reference to it!?
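The cloning behaviour and the cache can be checked with a short sketch. The Protocol constants and the fromIndex helper are hypothetical, since the answer does not show them; keeping the cached array private and exposing only an index lookup is one way to avoid handing out a mutable reference.

```java
enum Protocol {
    HTTP, FTP, SSH; // hypothetical constants; the original does not list them

    // Cached once; kept private so that no caller can overwrite its elements.
    private static final Protocol[] VALUES = values();

    // Index lookup without cloning the array on every call.
    static Protocol fromIndex(int i) {
        return VALUES[i];
    }
}

class ProtocolDemo {
    public static void main(String[] args) {
        // Each call to values() returns a fresh copy of the backing array.
        System.out.println(Protocol.values() == Protocol.values()); // false
        System.out.println(Protocol.fromIndex(1));                  // prints "FTP"
    }
}
```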
Enum.values() gives you a reference to an array, and iterating over an array of enums costs the same as iterating over an array of strings. Meanwhile, comparing enum values to other enum values can actually be faster than comparing strings to strings.
Meanwhile, if you're worried about the cost of invoking the values() method versus already having a reference to the array, don't worry. Method invocation in Java is (now) blazingly fast, and any time it actually matters to performance, the method invocation will be inlined by the compiler anyway.
So, seriously, don't worry about it. Concentrate on code readability instead, and use Enum so that the compiler will catch it if you ever try to use a constant value that your code wasn't expecting to handle.
If you're curious about why enum comparisons might be faster than string comparisons, here are the details:
It depends on whether the strings have been interned or not. For Enum objects, there is always only one instance of each enum value in the system, and so each call to Enum.equals() can be done very quickly, just as if you were using the == operator instead of the equals() method. In fact, with Enum objects, it's safe to use == instead of equals(), whereas that's not safe to do with strings.
For strings, if the strings have been interned, then the comparison is just as fast as with an Enum. However, if the strings have not been interned, then the String.equals() method actually needs to walk the list of characters in both strings until either one of the strings ends or it discovers a character that is different between the two strings.
But again, this likely doesn't matter, even in Swing rendering code that must execute quickly. :-)
@Ben Lings points out that Enum.values() must do a defensive copy, since arrays are mutable and it's possible you could replace a value in the array that is returned by Enum.values(). This means that you do have to consider the cost of that defensive copy. However, copying a single contiguous array is generally a fast operation, assuming that it is implemented "under the hood" using some kind of memory-copy call, rather than naively iterating over the elements in the array. So, I don't think that changes the final answer here.
As a rule of thumb: before thinking about optimizing, do you have any evidence that this code could slow down your application?
Now, the facts.
Enums are, for the most part, syntactic sugar handled during compilation. As a consequence, the values method defined for an enum class is backed by a static array (that is to say, one created at class initialization), with performance that can be considered roughly equivalent to that of an array.
If you're concerned about performance, then measure.
From the code, I wouldn't expect any surprises, but 90% of all performance guesswork is wrong. If you want to be safe, consider moving the enums up into the calling code (i.e. public String getValue(ColumnValues value) {return value.toString();}).
use this:
private enum ModelObject { NODE, SCENE, INSTANCE, URL_TO_FILE, URL_TO_MODEL,
ANIMATION_INTERPOLATION, ANIMATION_EVENT, ANIMATION_CLIP, SAMPLER, IMAGE_EMPTY,
BATCH, COMMAND, SHADER, PARAM, SKIN }
private static final ModelObject int2ModelObject[] = ModelObject.values();
If you're iterating through your enum values just to look for a specific value, you can statically map the enum values to integers. This moves the performance cost to class load time, and makes it cheap to get a specific enum value based on a mapped parameter.
import java.util.HashMap;
import java.util.Map;

public enum ExampleEnum {
value1(1),
value2(2),
valueUndefined(Integer.MAX_VALUE);
private final int enumValue;
// Lookup table from int value to enum constant, filled once at class load.
private static final Map<Integer, ExampleEnum> enumMap = new HashMap<Integer, ExampleEnum>();
ExampleEnum(int value){
enumValue = value;
}
static {
for (ExampleEnum exampleEnum: ExampleEnum.values()) {
enumMap.put(exampleEnum.enumValue, exampleEnum);
}
}
public static ExampleEnum getExampleEnum(int value) {
// Fall back to valueUndefined when no constant is mapped to the given value.
return enumMap.containsKey(value) ? enumMap.get(value) : valueUndefined;
}
}
I think yes. And it is more convenient to use Constants.

hashCode uniqueness

Is it possible for two instances of Object to have the same hashCode()?
In theory an object's hashCode is derived from its memory address, so all hashCodes should be unique, but what if objects are moved around during GC?
I think the docs for object's hashCode method state the answer.
"As much as is reasonably practical,
the hashCode method defined by class
Object does return distinct integers
for distinct objects. (This is
typically implemented by converting
the internal address of the object
into an integer, but this
implementation technique is not
required by the JavaTM programming
language.)"
Given a reasonable collection of objects, having two with the same hash code is quite likely. In the best case it becomes the birthday problem, with a clash expected among tens of thousands of objects. In practice objects are created with a relatively small pool of likely hash codes, and clashes can easily happen with merely thousands of objects.
Using memory address is just a way of obtaining a slightly random number. The Sun JDK source has a switch to enable use of a Secure Random Number Generator or a constant. I believe IBM (used to?) use a fast random number generator, but it was not at all secure. The mention in the docs of memory address appears to be of a historical nature (around a decade ago it was not unusual to have object handles with fixed locations).
Here's some code I wrote a few years ago to demonstrate clashes:
class HashClash {
public static void main(String[] args) {
final Object obj = new Object();
final int target = obj.hashCode();
Object clash;
long ct = 0;
do {
clash = new Object();
++ct;
} while (clash.hashCode() != target && ct<10L*1000*1000*1000L);
if (clash.hashCode() == target) {
System.out.println(ct+": "+obj+" - "+clash);
} else {
System.out.println("No clashes found");
}
}
}
RFE to clarify docs, because this comes up way too frequently: CR 6321873
Think about it. There are an infinite number of potential objects, and only 4 billion hash codes. Clearly, an infinity of potential objects share each hash code.
The Sun JVM either bases the Object hash code on a stable handle to the object or caches the initial hash code. Compaction during GC will not alter the hashCode(). Everything would break if it did.
Is it possible?
Yes.
Does it happen with any reasonable degree of frequency?
No.
I assume the original question is only about the hash codes generated by the default Object implementation. The fact is that hash codes must not be relied on for equality testing and are only used in some specific hash mapping operations (such as those implemented by the very useful HashMap implementation).
As such they have no need of being really unique - they only have to be unique enough to not generate a lot of clashes (which will render the HashMap implementation inefficient).
Also, it is expected that when developers implement classes that are meant to be stored in HashMaps, they will implement a hash code algorithm that has a low chance of clashes for objects of the same class (assuming you only store objects of the same class in application HashMaps), and knowing about the data makes it much easier to implement robust hashing.
Also see Ken's answer about equality necessitating identical hash codes.
Are you talking about the actual class Object or objects in general? You use both in the question. (And real-world apps generally don't create a lot of instances of Object)
For objects in general, it is common to write a class for which you want to override equals(); and if you do that, you must also override hashCode() so that two different instances of that class that are "equal" must also have the same hash code. You are likely to get a "duplicate" hash code in that case, among instances of the same class.
Also, when implementing hashCode() in different classes, they are often based on something in the object, so you end up with less "random" values, resulting in "duplicate" hash codes among instances of different classes (whether or not those objects are "equal").
In any real-world app, it is not unusual to find two different objects with the same hash code.
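Both points (equal objects share a hash code, and unequal objects may share one too) can be seen in a small sketch. Point is a hypothetical value class written for illustration; the String example relies only on the documented String.hashCode() formula.

```java
import java.util.Objects;

// Hypothetical value class that overrides equals() and hashCode() together.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override public int hashCode() { return Objects.hash(x, y); }
}

class HashDemo {
    public static void main(String[] args) {
        // Equal objects must report equal hash codes.
        System.out.println(new Point(1, 2).hashCode() == new Point(1, 2).hashCode()); // true
        // Unequal objects may still collide: both strings hash to 2112.
        System.out.println("Aa".hashCode() + " " + "BB".hashCode()); // prints "2112 2112"
        System.out.println("Aa".equals("BB"));                       // false
    }
}
```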
If there were as many hash codes as memory addresses, then it would take the whole memory to store the hashes themselves. :-)
So, yes, the hash codes should sometimes happen to coincide.
