Concept of Immutability of String in Java

Concept of Immutability of String in Java - java

i know the reason, why String is immutable but my question is why the concept of immutability
sticks to only String class in java , why it won't apply to others.
one of the reason i found why it is immutable is HERE
but why not all other classes like Integer and so on..
how java people decided changing their values might not effect ..
please explain .

The point is that you could confuse a String (which is an Object) with a primitive type. In fact, The way strings typically appear in Java programs can lead to confusion.
There are shortcuts available for strings that are not available for other objects in Java. These
shortcuts are highly used and they give strings the appearance of a primitive data type. But,
strings are objects and they possess all the attributes and behaviours of objects in Java

When designing a class, to make it immutable or not is a design choice and usually a tradeoff. The classes String and Integer were made immutable for many reasons including security.
A class like java.util.Random is all about managing some state, so it has to be mutable.
A class like java.awt.Rectangle is mutable, which has the advantage of saving memory management costs when you modify one rather than create a new one, but unfortunately it means that when a Rectangle instance that's shared (e.g. by two different window objects), changing it for one of them will change it out from under the others, causing unhappy surprises. Modern garbage collectors are very efficient [I hear they're more efficient than C's malloc() and free()] so mutability may not actually save memory management overhead, certainly not if you often have to copy objects to protect against accidental shared changes.
You can freely share an immutable object without worrying about it changing out from under you.

Integer is immutable, so each operation on it creates a new instance.
Immutable does not mean that a refernece can never assign another value. For example, String is immutable too, but we can do this:
String ex = "good"; // ex == "good"
ex = ex + "morning"; // now ex == "goodmorning"
So what happened there? Since String is immutable, clearly "ex" was not changed. But it now being assigned something different. This is because "ex" is now a completely newly instantiated object.
So this is same the case of Integer.

Related

adavantages and/or disadvantages oof mutable and immutable classes [duplicate]

I'm trying to get my head around mutable vs immutable objects. Using mutable objects gets a lot of bad press (e.g. returning an array of strings from a method) but I'm having trouble understanding what the negative impacts are of this. What are the best practices around using mutable objects? Should you avoid them whenever possible?

Well, there are a few aspects to this.
Mutable objects without reference-identity can cause bugs at odd times. For example, consider a Person bean with a value-based equals method:
Map<Person, String> map = ...
Person p = new Person();
map.put(p, "Hey, there!");
p.setName("Daniel");
map.get(p); // => null
The Person instance gets "lost" in the map when used as a key because its hashCode and equality were based upon mutable values. Those values changed outside the map and all of the hashing became obsolete. Theorists like to harp on this point, but in practice I haven't found it to be too much of an issue.
Another aspect is the logical "reasonability" of your code. This is a hard term to define, encompassing everything from readability to flow. Generically, you should be able to look at a piece of code and easily understand what it does. But more important than that, you should be able to convince yourself that it does what it does correctly. When objects can change independently across different code "domains", it sometimes becomes difficult to keep track of what is where and why ("spooky action at a distance"). This is a more difficult concept to exemplify, but it's something that is often faced in larger, more complex architectures.
Finally, mutable objects are killer in concurrent situations. Whenever you access a mutable object from separate threads, you have to deal with locking. This reduces throughput and makes your code dramatically more difficult to maintain. A sufficiently complicated system blows this problem so far out of proportion that it becomes nearly impossible to maintain (even for concurrency experts).
Immutable objects (and more particularly, immutable collections) avoid all of these problems. Once you get your mind around how they work, your code will develop into something which is easier to read, easier to maintain and less likely to fail in odd and unpredictable ways. Immutable objects are even easier to test, due not only to their easy mockability, but also the code patterns they tend to enforce. In short, they're good practice all around!
With that said, I'm hardly a zealot in this matter. Some problems just don't model nicely when everything is immutable. But I do think that you should try to push as much of your code in that direction as possible, assuming of course that you're using a language which makes this a tenable opinion (C/C++ makes this very difficult, as does Java). In short: the advantages depend somewhat on your problem, but I would tend to prefer immutability.

Immutable Objects vs. Immutable Collections
One of the finer points in the debate over mutable vs. immutable objects is the possibility of extending the concept of immutability to collections. An immutable object is an object that often represents a single logical structure of data (for example an immutable string). When you have a reference to an immutable object, the contents of the object will not change.
An immutable collection is a collection that never changes.
When I perform an operation on a mutable collection, then I change the collection in place, and all entities that have references to the collection will see the change.
When I perform an operation on an immutable collection, a reference is returned to a new collection reflecting the change. All entities that have references to previous versions of the collection will not see the change.
Clever implementations do not necessarily need to copy (clone) the entire collection in order to provide that immutability. The simplest example is the stack implemented as a singly linked list and the push/pop operations. You can reuse all of the nodes from the previous collection in the new collection, adding only a single node for the push, and cloning no nodes for the pop. The push_tail operation on a singly linked list, on the other hand, is not so simple or efficient.
Immutable vs. Mutable variables/references
Some functional languages take the concept of immutability to object references themselves, allowing only a single reference assignment.
In Erlang this is true for all "variables". I can only assign objects to a reference once. If I were to operate on a collection, I would not be able to reassign the new collection to the old reference (variable name).
Scala also builds this into the language with all references being declared with var or val, vals only being single assignment and promoting a functional style, but vars allowing a more C-like or Java-like program structure.
The var/val declaration is required, while many traditional languages use optional modifiers such as final in java and const in C.
Ease of Development vs. Performance
Almost always the reason to use an immutable object is to promote side effect free programming and simple reasoning about the code (especially in a highly concurrent/parallel environment). You don't have to worry about the underlying data being changed by another entity if the object is immutable.
The main drawback is performance. Here is a write-up on a simple test I did in Java comparing some immutable vs. mutable objects in a toy problem.
The performance issues are moot in many applications, but not all, which is why many large numerical packages, such as the Numpy Array class in Python, allow for In-Place updates of large arrays. This would be important for application areas that make use of large matrix and vector operations. This large data-parallel and computationally intensive problems achieve a great speed-up by operating in place.

Immutable objects are a very powerful concept. They take away a lot of the burden of trying to keep objects/variables consistent for all clients.
You can use them for low level, non-polymorphic objects - like a CPoint class - that are used mostly with value semantics.
Or you can use them for high level, polymorphic interfaces - like an IFunction representing a mathematical function - that is used exclusively with object semantics.
Greatest advantage: immutability + object semantics + smart pointers make object ownership a non-issue, all clients of the object have their own private copy by default. Implicitly this also means deterministic behavior in the presence of concurrency.
Disadvantage: when used with objects containing lots of data, memory consumption can become an issue. A solution to this could be to keep operations on an object symbolic and do a lazy evaluation. However, this can then lead to chains of symbolic calculations, that may negatively influence performance if the interface is not designed to accommodate symbolic operations. Something to definitely avoid in this case is returning huge chunks of memory from a method. In combination with chained symbolic operations, this could lead to massive memory consumption and performance degradation.
So immutable objects are definitely my primary way of thinking about object-oriented design, but they are not a dogma.
They solve a lot of problems for clients of objects, but also create many, especially for the implementers.

Check this blog post: http://www.yegor256.com/2014/06/09/objects-should-be-immutable.html. It explains why immutable objects are better than mutable. In short:
immutable objects are simpler to construct, test, and use
truly immutable objects are always thread-safe
they help to avoid temporal coupling
their usage is side-effect free (no defensive copies)
identity mutability problem is avoided
they always have failure atomicity
they are much easier to cache

You should specify what language you're talking about. For low-level languages like C or C++, I prefer to use mutable objects to conserve space and reduce memory churn. In higher-level languages, immutable objects make it easier to reason about the behavior of the code (especially multi-threaded code) because there's no "spooky action at a distance".

A mutable object is simply an object that can be modified after it's created/instantiated, vs an immutable object that cannot be modified (see the Wikipedia page on the subject). An example of this in a programming language is Pythons lists and tuples. Lists can be modified (e.g., new items can be added after it's created) whereas tuples cannot.
I don't really think there's a clearcut answer as to which one is better for all situations. They both have their places.

Shortly:
Mutable instance is passed by reference.
Immutable instance is passed by value.
Abstract example. Lets suppose that there exists a file named txtfile on my HDD. Now, when you are asking me to give you the txtfile file, I can do it in the following two modes:
I can create a shortcut to the txtfile and pass shortcut to you, or
I can do a full copy of the txtfile file and pass copied file to you.
In the first mode, the returned file represents a mutable file, because any change into the shortcut file will be reflected into the original one as well, and vice versa.
In the second mode, the returned file represents an immutable file, because any change into the copied file will not be reflected into the original one, and vice versa.

If a class type is mutable, a variable of that class type can have a number of different meanings. For example, suppose an object foo has a field int[] arr, and it holds a reference to a int[3] holding the numbers {5, 7, 9}. Even though the type of the field is known, there are at least four different things it can represent:
A potentially-shared reference, all of whose holders care only that it encapsulates the values 5, 7, and 9. If foo wants arr to encapsulate different values, it must replace it with a different array that contains the desired values. If one wants to make a copy of foo, one may give the copy either a reference to arr or a new array holding the values {1,2,3}, whichever is more convenient.
The only reference, anywhere in the universe, to an array which encapsulates the values 5, 7, and 9. set of three storage locations which at the moment hold the values 5, 7, and 9; if foo wants it to encapsulate the values 5, 8, and 9, it may either change the second item in that array or create a new array holding the values 5, 8, and 9 and abandon the old one. Note that if one wanted to make a copy of foo, one must in the copy replace arr with a reference to a new array in order for foo.arr to remain as the only reference to that array anywhere in the universe.
A reference to an array which is owned by some other object that has exposed it to foo for some reason (e.g. perhaps it wants foo to store some data there). In this scenario, arr doesn't encapsulate the contents of the array, but rather its identity. Because replacing arr with a reference to a new array would totally change its meaning, a copy of foo should hold a reference to the same array.
A reference to an array of which foo is the sole owner, but to which references are held by other object for some reason (e.g. it wants to have the other object to store data there--the flipside of the previous case). In this scenario, arr encapsulates both the identity of the array and its contents. Replacing arr with a reference to a new array would totally change its meaning, but having a clone's arr refer to foo.arr would violate the assumption that foo is the sole owner. There is thus no way to copy foo.
In theory, int[] should be a nice simple well-defined type, but it has four very different meanings. By contrast, a reference to an immutable object (e.g. String) generally only has one meaning. Much of the "power" of immutable objects stems from that fact.

Mutable collections are in general faster than their immutable counterparts when used for in-place
operations.
However, mutability comes at a cost: you need to be much more careful sharing them between
different parts of your program.
It is easy to create bugs where a shared mutable collection is updated
unexpectedly, forcing you to hunt down which line in a large codebase is performing the unwanted update.
A common approach is to use mutable collections locally within a function or private to a class where there
is a performance bottleneck, but to use immutable collections elsewhere where speed is less of a concern.
That gives you the high performance of mutable collections where it matters most, while not sacrificing
the safety that immutable collections give you throughout the bulk of your application logic.

If you return references of an array or string, then outside world can modify the content in that object, and hence make it as mutable (modifiable) object.

Immutable means can't be changed, and mutable means you can change.
Objects are different than primitives in Java. Primitives are built in types (boolean, int, etc) and objects (classes) are user created types.
Primitives and objects can be mutable or immutable when defined as member variables within the implementation of a class.
A lot of people people think primitives and object variables having a final modifier infront of them are immutable, however, this isn't exactly true. So final almost doesn't mean immutable for variables. See example here
http://www.siteconsortium.com/h/D0000F.php.

General Mutable vs Immutable
Unmodifiable - is a wrapper around modifiable. It guarantees that it can not be changed directly(but it is possibly using backing object)
Immutable - state of which can not be changed after creation. Object is immutable when all its fields are immutable. It is a next step of Unmodifiable object
Thread safe
The main advantage of Immutable object is that it is a naturally for concurrent environment. The biggest problem in concurrency is shared resource which can be changed any of thread. But if an object is immutable it is read-only which is thread safe operation. Any modification of an original immutable object return a copy
source of truth, side-effects free
As a developer you are completely sure that immutable object's state can not be changed from any place(on purpose or not). For example if a consumer uses immutable object he is able to use an original immutable object
compile optimisation
Improve performance
Disadvantage:
Copying of object is more heavy operation than changing a mutable object, that is why it has some performance footprint
To create an immutable object you should use:
1. Language level
Each language contains tools to help you with it. For example:
Java has final and primitives
Swift has let and struct[About].
Language defines a type of variable. For example:
Java has primitive and reference type,
Swift has value and reference type[About].
For immutable object more convenient is primitives and value type which make a copy by default. As for reference type it is more difficult(because you are able to change object's state out of it) but possible. For example you can use clone pattern on a developer level to make a deep(instead of shallow) copy.
2. Developer level
As a developer you should not provide an interface for changing state
[Swift] and [Java] immutable collection

Is it correct to call java.lang.String immutable?

This Java tutorial
says that an immutable object cannot change its state after creation.
java.lang.String has a field
/** Cache the hash code for the string */
private int hash; // Default to 0
which is initialized on the first call of the hashCode() method, so it changes after creation:
String s = new String(new char[] {' '});
Field hash = s.getClass().getDeclaredField("hash");
hash.setAccessible(true);
System.out.println(hash.get(s));
s.hashCode();
System.out.println(hash.get(s));
output
0
32
Is it correct to call String immutable?

A better definition would be not that the object does not change, but that it cannot be observed to have been changed. It's behavior will never change: .substring(x,y) will always return the same thing for that string ditto for equals and all the other methods.
That variable is calculated the first time you call .hashcode() and is cached for further calls. This is basically what they call "memoization" in functional programming languages.
Reflection isn't really a tool for "programming" but rather for meta-programming (ie programming programs for generating programs) so it doesn't really count. It's the equivalent of changing a constant's value using a memory debugger.

The term "Immutable" is vague enough to not allow for a precise definition.
I suggest reading Kinds of Immutability from Eric Lippert's blog. Although it's technically a C# article, it's quite relevant to the question posed. In particular:
Observational immutability:
Suppose you’ve got an object which has the property that every time
you call a method on it, look at a field, etc, you get the same
result. From the point of view of the caller such an object would be
immutable. However you could imagine that behind the scenes the object
was doing lazy initialization, memoizing results of function calls in
a hash table, etc. The “guts” of the object might be entirely mutable.
What does it matter? Truly deeply immutable objects never change their
internal state at all, and are therefore inherently threadsafe. An
object which is mutable behind the scenes might still need to have
complicated threading code in order to protect its internal mutable
state from corruption should the object be called on two threads “at
the same time”.

Once created, all the methods on a String instance (called with the same parameters) will always provide the same result. You cannot change its behavoiur (with any public method), so it will always represent the same entity. Also it is final and cannot be subclassed, so it is guaranteed that all instances will behave like this.
Therefore from public view the object is considered immutable. The internal state does not really matter in this case.

Yes it is correct to call them immutable.
While it is true that you can reach in and modify private ... and final ... variables of a class, it is an unnecessary and incredibly unwise thing to do on a String object. It is generally assumed that nobody is going to be crazy enough do it.
From a security standpoint, the reflection calls needed to modify the state of a String all perform security checks. Unless you've miss-implement your sandbox, the calls will be blocked for non-trusted code. So you should have to worry about this as a way that untrusted code can break sandbox security.
It is also worth noting that the JLS states that using reflection to change final, may break things (e.g. in multi-threading) or may not have any effect.

From the viewpoint of a developer who is using reflection, it is not correct to call String immutable. There are actual Java developers using reflection to write real software every day. Dismissing reflection as a "hack" is preposterous. However, from the viewpoint of a developer who is not using reflection, it is correct to call String immutable. Whether or not it is valid to assume that String is immutable depends on context.
Immutability is an abstract concept and therefore cannot apply in an absolute sense to anything with a physical form (see the ship of Theseus). Programming language constructs like objects, variables, and methods exist physically as bits in a storage medium. Data degradation is a physical process which happens to all storage media, so no data can ever be said to be truly immutable. In addition, it is almost always possible in practice to subvert the programming language features intended to prevent the mutation of a particular datum. In contrast, the number 3 is 3, has always been 3, and will always be 3.
As applied to program data, immutability should be considered a useful assumption rather than a fundamental property. For example, if one assumes that a String is immutable, one may cache its hash code for reuse and avoid the cost of ever recomputing its hash code again later. Virtually all non-trivial software relies on assumptions that certain data will not mutate for certain durations of time. Software developers generally assume that the code segment of a program will not change while it is executing, unless they are writing self-modifying code. Understanding what assumptions are valid in a particular context is an important aspect of software development.

It can not be modified from outside and it is a final class, so it can not be subclassed and made mutable. Theese are two requirments for immutability. Reflection is considered as a hack, its not a normal way of development.

A class can be immutable while still having mutable fields, as long as it doesn't provide access to its mutable fields.
It's immutable by design. If you use Reflection (getting the declared Field and resetting its accessibility), you are circumventing its design.

Reflection will allow you to change the contents of any private field. Is it therefore correct to call any object in Java immutable?
Immutability refers to changes that are either initiated by or perceivable by the application.
In the case of string, the fact that a particular implementation chooses to lazily calculate the hashcode is not perceptible to the application. I would go a step further, and say that an internal variable that is incremented by the object -- but never exposed and never used in any other way -- would also be acceptable in an "immutable" object.

Yes it is correct. When you modified a String like you do in your example, a new String is created but the older one maintain its value.

non-technical benefits of having string-type immutable

I am wondering about the benefits of having the string-type immutable from the programmers point-of-view.
Technical benefits (on the compiler/language side) can be summarized mostly that it is easier to do optimisations if the type is immutable. Read here for a related question.
Also, in a mutable string type, either you have thread-safety already built-in (then again, optimisations are harder to do) or you have to do it yourself. You will have in any case the choice to use a mutable string type with built-in thread safety, so that is not really an advantage of immutable string-types. (Again, it will be easier to do the handling and optimisations to ensure thread-safety on the immutable type but that is not the point here.)
But what are the benefits of immutable string-types in the usage? What is the point of having some types immutable and others not? That seems very inconsistent to me.
In C++, if I want to have some string to be immutable, I am passing it as const reference to a function (const std::string&). If I want to have a changeable copy of the original string, I am passing it as std::string. Only if I want to have it mutable, I am passing it as reference (std::string&). So I just have the choice about what I want to do. I can just do this with every possible type.
In Python or in Java, some types are immutable (mostly all primitive types and strings), others are not.
In pure functional languages like Haskell, everything is immutable.
Is there a good reason why it make sense to have this inconsistency? Or is it just purely for technical lower level reasons?

What is the point of having some
types immutable and others not?
Without some mutable types, you'd have to go the whole hog to pure functional programming -- a completely different paradigm than the OOP and procedural approaches which are currently most popular, and, while extremely powerful, apparently very challenging to a lot of programmers (what happens when you do need side effects in a language where nothing is mutable, and in real-world programming of course you inevitably do, is part of the challenge -- Haskell's Monads are a very elegant approach, for example, but how many programmers do you know that fully and confidently understand them and can use them as well as typical OOP constructs?-).
If you don't understand the enormous value of having multiple paradigms available (both FP one and ones crucially relying on mutable data), I recommend studying Haridi's and Van Roy's masterpiece, Concepts, Techniques, and Models of Computer Programming -- "a SICP for the 21st Century", as I once described it;-).
Most programmers, whether familiar with Haridi and Van Roy or not, will readily admit that having at least some mutable data types is important to them. Despite the sentence I've quoted above from your Q, which takes a completely different viewpoint, I believe that may also be the root of your perplexity: not "why some of each", but rather "why some immutables at all".
The "thoroughly mutable" approach was once (accidentally) obtained in a Fortran implementation. If you had, say,
SUBROUTINE ZAP(I)
I = 0
RETURN
then a program snippet doing, e.g.,
PRINT 23
ZAP(23)
PRINT 23
would print 23, then 0 -- the number 23 had been mutated, so all references to 23 in the rest of the program would in fact refer to 0. Not a bug in the compiler, technically: Fortran had subtle rules about what your program is and is not allowed to do in passing constants vs variables to procedures that assign to their arguments, and this snippet violates those little-known, non-compiler-enforceable rules, so it's a but in the program, not in the compiler. In practice, of course, the number of bugs caused this way was unacceptably high, so typical compilers soon switched to less destructive behavior in such situations (putting constants in read-only segments to get a runtime error, if the OS supported that; or, passing a fresh copy of the constant rather than the constant itself, despite the overhead; and so forth) even though technically they were program bugs allowing the compiler to display undefined behavior quite "correctly";-).
The alternative enforced in some other languages is to add the complication of multiple ways of parameter passing -- most notably perhaps in C++, what with by-value, by-reference, by constant reference, by pointer, by constant pointer, ... and then of course you see programmers baffled by declarations such as const foo* const bar (where the rightmost const is basically irrelevant if bar is an argument to some function... but crucial instead if bar is a local variable...!-).
Actually Algol-68 probably went farther along this direction (if you can have a value and a reference, why not a reference to a reference? or reference to reference to reference? &c -- Algol 68 put no limitations on this, and the rules to define what was going on are perhaps the subtlest, hardest mix ever found in an "intended for real use" programming language). Early C (which only had by-value and by-explicit-pointer -- no const, no references, no complications) was no doubt in part a reaction to it, as was the original Pascal. But const soon crept in, and complications started mounting again.
Java and Python (among other languages) cut through this thicket with a powerful machete of simplicity: all argument passing, and all assignment, is "by object reference" (never reference to a variable or other reference, never semantically implicit copies, &c). Defining (at least) numbers as semantically immutable preserves programmers' sanity (as well as this precious aspect of language simplicity) by avoiding "oopses" such as that exhibited by the Fortran code above.
Treating strings as primitives just like numbers is quite consistent with the languages' intended high semantic level, because in real life we do need strings that are just as simple to use as numbers; alternatives such as defining strings as lists of characters (Haskell) or as arrays of characters (C) poses challenges to both the compiler (keeping efficient performance under such semantics) and the programmer (effectively ignoring this arbitrary structuring to enable use of strings as simple primitives, as real life programming often requires).
Python went a bit further by adding a simple immutable container (tuple) and tying hashing to "effective immutability" (which avoids certain surprises to the programmer that are found, e.g., in Perl, with its hashes allowing mutable strings as keys) -- and why not? Once you have immutability (a precious concept that saves the programmer from having to learn about N different semantics for assignment and argument passing, with N tending to increase with time;-), you might as well get full mileage out of it;-).

I am not sure if this qualifies as non-technical, nevertheless: if strings are mutable, then most(*) collections need to make private copies of their string keys.
Otherwise a "foo" key changed externally to "bar" would result in "bar" sitting in the internal structures of the collection where "foo" is expected. This way "foo" lookup would find "bar", which is less of a problem (return nothing, reindex the offending key) but "bar" lookup would find nothing, which is a bigger problem.
(*) A dumb collection that does a linear scan of all keys on each lookup would not have to do that, since it would naturally accomodate key changes.

There is no overarching, fundamental reason not to have strings mutable. The best explanation I have found for their immutability is that it promotes a more functional, less side-effectsy way of programming. This ends up being cleaner, more elegant, and more Pythonic.
Semantically, they should be immutable, no? The string "hello" should always represent "hello". You can't change it any more than you can change the number three!

Not sure if you would count this as a 'technical low level' benefit, but the fact that immutable string is implicitly threadsafe saves you a lot of effort of coding for thread safety.
Slightly toy example...
Thread A - Check user with login name FOO has permission to do something, return true
Thread B - Modify user string to login name BAR
Thread A - Perform some operation with login name BAR due to previous permission check passing against FOO.
The fact that the String can't change saves you the effort of guarding against this.

If you want full consistency you can only make everything immutable, because mutable Bools or Ints would simply make no sense at all. Some functional languages do that in fact.
Python's philosophy is "Simple is better than complex." In C you need to be aware that strings can change and think about how that can affect you. Python assumes that the default use case for strings is "put text together" - there is absolutely nothing you need to know about strings to do that. But if you want your strings to change, you just have to use a more appropriate type (ie lists, StringIO, templates, etc).

In a language with reference semantics for user-defined types, having mutable strings would be a desaster, because every time you assign a string variable, you would alias a mutable string object, and you would have to do defensive copies all over the place. That's why strings are immutable in Java and C# -- if the string object is immutable, it does not matter how many variables point to it.
Note that in C++, two string variables never share state (at least conceptionally -- technically, there might be copy-on-write going on, but that is getting out of fashion due to inefficiencies in multi-threading scenarios).

If strings are mutable, then many consumers of a string will have to to make copies of it. If strings are immutable, this is far less important (unless immutability is enforced by hardware interlocks, it might not be a bad idea for some security-conscious consumers of a string to make their own copies in case the strings they're given aren't as immutable as they should be).
The StringBuilder class is pretty good, though I think it would be nicer if it had a "Value" property (read would be equivalent to ToString, but it would show up in object inspectors; write would allow direct setting of the whole content) and a default widening conversion to a string. It would have been nice in theory to have MutableString type descended from a common ancestor with String, so a mutable string could be passed to a function which didn't care whether a string was mutable, though I suspect that optimizations which rely on the fact that Strings have a certain fixed implementation would have been less effective.

The main advantage for the programmer is that with mutable strings, you never need to worry about who might alter your string. Therefore, you never have to consciously decide "Should I copy this string here?".

Downsides to immutable objects in Java? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
The advantages of immutable objects in Java seem clear:
consistent state
automatic thread safety
simplicity
You can favour immutability by using private final fields and constructor injection.
But, what are the downsides to favouring immutable objects in Java?
i.e.
incompatibility with ORM or web presentation tools?
Inflexible design?
Implementation complexities?
Is it possible to design a large-scale system (deep object graph) that predominately uses immutable objects?

But, what are the downsides to
favouring immutable objects in Java?
incompatibility with ORM or web
presentation tools?
Reflection based frameworks are complicated by immutable objects since they requires constructor injection:
there are no default arguments in Java, which forces us to ALWAYS provide all of the necessary dependencies
constructor overriding can be messy
constructor argument names are not usually available through reflection, which forces us to depend on argument order for dependency resolution
Implementation complexities?
Creating immutable objects is still a boring task; the compiler should take care of the implementation details, as in groovy
Is it possible to design a large-scale system (deep object graph) that predominately uses immutable objects?
definitely yes; immutable objects makes great building blocks for other objects (they favor composition) since it's much easier to maintain the invariant of a complex object when you can rely on its immutable components. The only true downside to me is about creating many temporary objects (e.g. String concat was a problem in the past).

With immutability, any time you need to modify data, you need to create a new object. This can be expensive.
Imagine needing to modify one bit in an object that consumes several megabytes of memory: you would need to instantiate a whole new object, allocate memory, etc. If you need to do this many times, mutability becomes very attractive.

If you go for mutability then you will find that whenever you need to call a method that you don't want to have the object change, or you need to return an object that is part of the internal state, you need to make a defensive copy.
If you really look at programs that make use of mutible objects you will find that they are prone to "attack" by modifying:
objects passed to constructors
objects passed to methods
objects returned from methods.
The issue doesn't show up very often because most programs don't change the data (they are in reality immutable by virtue of them never changing).
I personally make every thing I possibly can final. I probably have 90%-95% of all variables (parameters, local, instance, static, exceptions, etc...) marked as final. There are some cases where it has to be mutable, but the vast majority of cases it does not.
I think it might depend on your focus. If you are writing libraries for 3rd parties to use you think about this much more than if you are writing an application that only you (or your team) will maintain.
I find that you can write large scale applications using immutable objects for the majority of the system without too much pain.

Fundamentally, in the real world, the state associated with many particular identities will change. If I ask what is "the present position of Joe's Buick", today it might be a location in Seattle, and tomorrow it might be a location in Los Alamos. It would be possible to define and create a GeographicLocation object whose value will always represent the location where Joe's Buick was at some particular moment in time and would never changes--if today it represents a spot in Seattle, then it will always do so. Such an object, however, would have no continuing identity as "the present location of Joe's Buick".
It may also be possible to define things so that there is a VehicleLocation object which is connected to Joe's Buick such that the object always represents "the present location of Joe's Buick". Such an object could retains its identity as "the present location of Joe's Buick", even as the car moves around, but would not represent a constant geographical location. Defining "identity" may be tricky if one considers the scenario where Joe sells his Buick to Bob and buys a Ford--should the object track "the present location of Joe's Ford" or "the present location of Bob's Buick"--but in many cases such issues may be avoided by using a data model that guarantees that some aspects of object identity will never change.
It isn't possible for everything about an object to be immutable. If an object is immutable, then it cannot have an immutable identity that encapsulates anything beyond its current state. If an object is mutable, however, it can have an immutable identity whose meaning transcends its present state. In many situations, having an immutable identity is more useful than having an immutable state, and in such situations mutable objects are nearly essential. While it is possible in some cases to "simulate" mutable objects by having an immutable object which would search through the most recent version of an immutable objects to find information that may "change" between one version and the next, such an approaches are often extremely inefficient. Even if one could magically receive once per minute a bound book that gave the location of every vehicle everywhere, looking up "Joe's Buick" in the book would take a lot longer than merely asking a "present location of Joe's Buick" object which would always know where the car was.

You pretty much answered your own question. The JavaBean specification, I don't believe, mentions anything about immutability, yet JavaBeans are the bread and butter of many Java frameworks.

The concept of immutable types is somewhat uncommon for people used to imperative programming styles. However, for many situations immutability has serious advantages, you named the most important ones already.
There are good ways to implement immutable balanced trees, queues, stacks, dequeues and other data structures. And in fact many modern programming languages / frameworks only support immutable strings because of their advantages and sometimes also other objects.

With an immutable object, if the value needs to be changed, then it must be replaced with a new instance. Depending on the lifecycle of the object, replacing it with a different instance can potentially increase the tenured (long) garbage collection time. This becomes more critical if the object is kept around in memory long enough to be placed in the tenured generation.

The problem in java is that one has to live with all those objects, where the class looks like:
class Mutable {
State1 f1;
MoreState f2;
void doSomething() { // mutate the state, but don't document it }
void doSomethingElse() /// mutate the state heavily, do not mention in doc
}
(Note the missing Cloneable interface).
The problem with the garbage collector is not such a big one nowadays. The VM's are happy with short living objects.
Advances in Compiler/JIT technology will make it possible, sooner or later, to optimize intermediate temporary object creation away. For example:
BigInteger three =, two =, i1 = ...;
BigInteger i2 = i1.mul(three).div(two);
The JIT could notice that the intermediate object i1.mul(three) can be used for the end result and call a variant of the div method that works on a mutable accumulator.

See Functional Java to attain a comprehensive answer to your question.

Immutability, as every other design pattern, should only be used when you need it. You give the example of thread safety: In a highly threaded application, you could favor immutability over the added expense of making it thread safe yourself.
However, if your design requires objects to be mutable, don't go out of your way to make them immutable, just because "it's a design pattern".
As for your graph, you could choose to make your nodes immutable and let another class take care of the connections between them, or you could make a mutable node that takes care of its own children and has an immutable value class.

Probably the biggest cost of using immutabile objects in Java is that future developers won't be expecting it or used to that style. Expect to either document heavily or watch alot of your objects spawn mutable peers over time.
That being said, the only real technical reason I can think of to avoid immutable objects is GC churn. For most applications, I don't think this is a compelling reason to avoid them.
The biggest thing I've ever done with a ~90% immutable objects was a toy scheme-esque interpreter, so its certainly possible to do complex Java projects.

in immutable data you dont set things twice... see haskell and scala vals (and clojure of cource)...
for example.. for a data structure.. like a tree, when you perform write operation to the tree, in fact you are adding elements outside of the immutable tree.. after you done.. the tree and the branch are recombined in a new tree.. so like this you could perform concurrent reads and writes very safelly..
in tradicional model, you must lock a value cause it could be reseted any time.. so.. you end up with a very heat zone for threads..since they act sequentially there anyway..
with imuttable data, you dont set things more than once.. its a whole new way of programming.. you may end up using a little bit more memory.. but parallelizing is natural and painless..

As with any tool, you have to know when to use it and when not to.
Like Tehblanx points out that if you want to change the state of a variable that holds an immutable object, you have to create a new object, which can be expensive, especially if the object is big and complex. Absolutely true, but that simply means that you have to intelligently decide which objects should be mutable and which should be immutable. If someone is saying that ALL objects should be immutable, well, that's just crazy talk.
I'd tend to say that objects that represent a single logical "fact" should be immutable, while objects that represent multiple facts should be mutable. Like, an Integer or a String should be immutable. A "Customer" object that contains name, address, current amount, date of last purchase, etc should be mutable. Of course I can immediately think of a hundred exceptions to such a general rule. An exception I make all the time is when I have a class that just exists as a wrapper to hold a primitive in some case where a primitive is not legal, like in a collection, but I need to update it constantly.

In Java, a method can't return multiple objects, like return a, b, c. Returning an array of objects makes the code look ugly. In this situation, I have to pass mutable objects to the method and let it change the states of these objects. However, I don't know whether returning multiple objects is a code smell or not.

The answer is none. There are not any good reasons to be mutable.
You do run in to problems with lots of frameworks(or framework versions) that require mutable objects in order to work with them(Spring I am glaring in your direction). As you work with them and fish through the code you will shake your fist in anger that you need to introduce dirty mutability into an otherwise glorious block of code when it could have been easily avoided.
I'm sure there are limited corner cases(probably more hypothetical that anything) where the overhead of object creation and collection is uncceptable. But I urge the people that would make this argument to look at languages like scala where included collections are immutable by default and then look at the bevy of performance critical apps built on top of that concept.
This is of course hyperbole. In reality, you should go with immutability first, see if it causes you any measurable problems, if it does then introduce mutability, but make sure you can prove it solves your problem. Otherwise you've just created liability for no benefit. In doing this I think you'll find objective cases for "Implementation Complexity" and "Inflexibility" very hard to make.

Some implementations of immutable objects have transactional means to update an immutable object. Similar to how databases provide safe commits and rollbacks. But in apparent contrast with many of the answers here. Immutable objects are never changed. A typical operation would be.
B = append(A,C)
B is a new object. Just like A and C. No modification was made to A or C. Internally a red black tree implementation makes such semantics fast enough to be usable.
The downside is that it is not as fast as making the operations in place. But that only compares a single part of the system. When evaluating possible downsides we need to look at the system as a whole. And I personally don't have a clear picture of the entire impact. Although I suspect immutability wins out at the end.
I know some experts contend there is contention at the top level of the red black tree. And that has a negative effect in throught-put.

My biggest worry with immutable data structures is how to save/reconstitute them. That is, if a class has final fields, I can't instantiate it and then set its fields.

Considering object encapsulation, should getters return an immutable property?

When a getter returns a property, such as returning a List of other related objects, should that list and it's objects be immutable to prevent code outside of the class, changing the state of those objects, without the main parent object knowing?
For example if a Contact object, has a getDetails getter, which returns a List of ContactDetails objects, then any code calling that getter:
can remove ContactDetail objects from that list without the Contact object knowing of it.
can change each ContactDetail object without the Contact object knowing of it.
So what should we do here? Should we just trust the calling code and return easily mutable objects, or go the hard way and make a immutable class for each mutable class?

It's a matter of whether you should be "defensive" in your code. If you're the (sole) user of your class and you trust yourself then by all means no need for immutability. However, if this code needs to work no matter what, or you don't trust your user, then make everything that is externalized immutable.
That said, most properties I create are mutable. An occasional user botches this up, but then again it's his/her fault, since it is clearly documented that mutation should not occur via mutable objects received via getters.

It depends on the context. If the list is intended to be mutable, there is no point in cluttering up the API of the main class with methods to mutate it when List has a perfectly good API of its own.
However, if the main class can't cope with mutations, then you'll need to return an immutable list - and the entries in the list may also need to be immutable themselves.
Don't forget, though, that you can return a custom List implementation that knows how to respond safely to mutation requests, whether by firing events or by performing any required actions directly. In fact, this is a classic example of a good time to use an inner class.

If you have control of the calling code then what matters most is that the choice you make is documented well in all the right places.

Joshua Bloch in his excellent "Effective Java" book says that you should ALWAYS make defensive copies when returning something like this. That may be a little extreme, especially if the ContactDetails objects are not Cloneable, but it's always the safe way. If in doubt always favour code safety over performance - unless profiling has shown that the cloneing is a real performance bottleneck.
There are actually several levels of protection you can add. You can simply return the member, which is essentially giving any other class access to the internals of your class. Very unsafe, but in fairness widely done. It will also cause you trouble later if you want to change the internals so that the ContactDetails are stored in a Set. You can return a newly-created list with references to the same objects in the internal list. This is safer - another class can't remove or add to the list, but it can modify the existing objects. Thirdly return a newly created list with copies of the ContactDetails objects. That's the safe way, but can be expensive.
I would do this a better way. Don't return a list at all - instead return an iterator over a list. That way you don't have to create a new list (List has a method to get an iterator) but the external class can't modify the list. It can still modify the items, unless you write your own iterator that clones the elements as needed. If you later switch to using another collection internally it can still return an iterator, so no external changes are needed.

In the particular case of a Collection, List, Set, or Map in Java, it is easy to return an immutable view to the class using return Collections.unmodifiableList(list);
Of course, if it is possible that the backing-data will still be modified then you need to make a full copy of the list.

Depends on the context, really. But generally, yes, one should write as defensive code as possible (returning array copies, returning readonly wrappers around collections etc.). In any case, it should be clearly documented.

I used to return a read-only version of the list, or at least, a copy. But each object contained in the list must be editable, unless they are immutable by design.

I think you'll find that it's very rare for every gettable to be immutable.
What you could do is to fire events when a property is changed within such objects. Not a perfect solution either.
Documentation is probably the most pragmatic solution ;)

Your first imperative should be to follow the Law of Demeter or ‘Tell don't ask’; tell the object instance what to do e.g.
contact.print( printer ) ; // or
contact.show( new Dialog() ) ; // or
contactList.findByName( searchName ).print( printer ) ;
Object-oriented code tells objects to do things. Procedural code gets information then acts on that information. Asking an object to reveal the details of its internals breaks encapsulation, it is procedural code, not sound OO programming and as Will has already said it is a flawed design.
If you follow the Law of Demeter approach any change in the state of an object occurs through its defined interface, therefore side-effects are known and controlled. Your problem goes away.

When I was starting out I was still heavily under the influence of HIDE YOUR DATA OO PRINCIPALS LOL. I would sit and ponder what would happen if somebody changed the state of one of the objects exposed by a property. Should I make them read only for external callers? Should I not expose them at all?
Collections brought out these anxieties to the extreme. I mean, somebody could remove all the objects in the collection while I'm not looking!
I eventually realized that if your objects' hold such tight dependencies on their externally visible properties and their types that, if somebody touches them in a bad place you go boom, your architecture is flawed.
There are valid reasons to make your external properties readonly and their types immutable. But that is the corner case, not the typical one, imho.

First of all, setters and getters are an indication of bad OO. Generally the idea of OO is you ask the object to do something for you. Setting and getting is the opposite. Sun should have figured out some other way to implement Java beans so that people wouldn't pick up this pattern and think it's "Correct".
Secondly, each object you have should be a world in itself--generally, if you are going to use setters and getters they should return fairly safe independent objects. Those objects may or may not be immutable because they are just first-class objects. The other possibility is that they return native types which are always immutable. So saying "Should setters and getters return something immutable" doesn't make too much sense.
As for making immutable objects themselves, you should virtually always make the members inside your object final unless you have a strong reason not to (Final should have been the default, "mutable" should be a keyword that overrides that default). This implies that wherever possible, objects will be immutable.
As for predefined quasi-object things you might pass around, I recommend you wrap stuff like collections and groups of values that go together into their own classes with their own methods. I virtually never pass around an unprotected collection simply because you aren't giving any guidance/help on how it's used where the use of a well-designed object should be obvious. Safety is also a factor since allowing someone access to a collection inside your class makes it virtually impossible to ensure that the class will always be valid.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.