Immutable function arguments in java

I am new to Java and I've come across this question:
In C/C++ we have the const modifier, which makes function parameters immutable, so the caller can be confident that the arguments they pass won't change.
But I could not find the same thing in Java. Sure, the final modifier makes a variable assignable only once, and it works fine to some extent. But what about objects that I need to modify before sending? (I could make a final copy of the object, but I don't find that good enough. Correct me if I'm wrong.) What about the object's fields? How can we pass an object in a way that lets us be confident of its integrity?

In Java, immutability is an intrinsic property of the type.
For instance, String is immutable, so there is no need for const. StringBuilder is mutable, and a CharSequence may be, so those will need a copy. In these cases a quick .toString() will do the job.
C++ defaults to an implicit copy for arguments. Copying is in some ways a better version of immutability. const & is effectively a way of getting a copy without paying for it. (I'm sure many people will strongly disagree with this paragraph.)
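As a rough illustration of the "quick .toString()" idea above, here is a minimal sketch. The Greeter class and its method names are made up for this example; the point is only that taking a snapshot of a possibly mutable CharSequence protects the receiver from later changes by the caller.

```java
// Hypothetical class, used only to illustrate snapshotting a CharSequence.
final class Greeter {
    // String is immutable, so once stored it needs no further copying.
    private final String name;

    // Accepts any CharSequence (possibly a mutable StringBuilder) and
    // takes an immutable snapshot of it right away.
    Greeter(CharSequence name) {
        this.name = name.toString();
    }

    String greeting() {
        return "Hello, " + name;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("Ann");
        Greeter g = new Greeter(sb);
        sb.append("a"); // the caller mutates its builder afterwards
        System.out.println(g.greeting()); // prints "Hello, Ann"
    }
}
```

The same pattern applies in the other direction: a method that hands out internal state can return a String rather than the builder it was assembled in.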

Related

Final fields and Immutable Classes

According to this: A Strategy for Defining Immutable Objects
One of the conditions for a class to be immutable is making all its fields final and private.
Why final? Aren't the other conditions sufficient?

Even without making the fields final, we can make an immutable class/object if the other conditions are met.
But I think final is useful when dealing with concurrency and synchronization.

Per the definition for an immutable object (courtesy of Wikipedia) "In object-oriented and functional programming, an immutable object is an object whose state cannot be modified after it is created."
Once a final field has been assigned, it cannot be reassigned. Without the final keyword, you could still change the field after the object has been created.
See also
final object in java

Counter question "Why not final?".
For primitive types, final means you will not be able to change the value once it is assigned, which is enough to make them immutable,
while for non-primitive types the reference can't be changed once assigned (the first step towards immutability), and you need to do some more work, as mentioned in the link you shared.

The key to the linked document is this quote
Not all classes documented as "immutable" follow these rules....However, such strategies require sophisticated analysis and are not for beginners.
This is a tutorial for beginners. It's easier to tell them "make everything private and final" than to have to explain all the edge cases of how to properly handle mutable references and make sure your references don't escape.
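The "private and final, plus careful handling of mutable references" advice could be sketched as follows. Roster and its members are hypothetical names chosen for this example, and this is only one possible way to follow the strategy.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable class: final so it cannot be subclassed,
// private final fields, no setters.
final class Roster {
    private final String team;
    private final List<String> members;

    Roster(String team, List<String> members) {
        this.team = team;
        // Defensive copy: later changes to the caller's list are not seen here.
        // The unmodifiable wrapper stops callers from changing our copy.
        this.members = Collections.unmodifiableList(new ArrayList<>(members));
    }

    String team() {
        return team;
    }

    List<String> members() {
        return members; // safe to hand out: it is a read-only view of a private copy
    }
}
```

Note that final alone would not be enough here: the members field is final, but without the copy the caller could still change the list it refers to.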

Is it correct to call java.lang.String immutable?

This Java tutorial
says that an immutable object cannot change its state after creation.
java.lang.String has a field
/** Cache the hash code for the string */
private int hash; // Default to 0
which is initialized on the first call of the hashCode() method, so it changes after creation:
String s = new String(new char[] {' '});
Field hash = s.getClass().getDeclaredField("hash");
hash.setAccessible(true);
System.out.println(hash.get(s));
s.hashCode();
System.out.println(hash.get(s));
output
0
32
Is it correct to call String immutable?

A better definition would be not that the object does not change, but that it cannot be observed to have changed. Its behavior will never change: .substring(x,y) will always return the same thing for that string, and the same goes for equals and all the other methods.
That variable is calculated the first time you call .hashCode() and is cached for further calls. This is basically what they call "memoization" in functional programming languages.
Reflection isn't really a tool for "programming" but rather for meta-programming (i.e. programming programs for generating programs), so it doesn't really count. It's the equivalent of changing a constant's value using a memory debugger.
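The memoized hash code described above might look roughly like this in user code. Point is a made-up class, and the formula 31 * x + y is just an assumption for the example, not what String uses.

```java
// Hypothetical value class with a lazily cached hash code.
final class Point {
    private final int x;
    private final int y;
    // Deliberately not final: 0 means "not computed yet".
    private int hash;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Point)) {
            return false;
        }
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            // Computed from final fields only, so every caller gets the same
            // value no matter who fills the cache first.
            h = 31 * x + y;
            hash = h;
        }
        return h;
    }
}
```

The internal field changes once, but no public method can observe the difference, which is the sense in which such a class is still called immutable.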

The term "Immutable" is vague enough to not allow for a precise definition.
I suggest reading Kinds of Immutability from Eric Lippert's blog. Although it's technically a C# article, it's quite relevant to the question posed. In particular:
Observational immutability:
Suppose you’ve got an object which has the property that every time
you call a method on it, look at a field, etc, you get the same
result. From the point of view of the caller such an object would be
immutable. However you could imagine that behind the scenes the object
was doing lazy initialization, memoizing results of function calls in
a hash table, etc. The “guts” of the object might be entirely mutable.
What does it matter? Truly deeply immutable objects never change their
internal state at all, and are therefore inherently threadsafe. An
object which is mutable behind the scenes might still need to have
complicated threading code in order to protect its internal mutable
state from corruption should the object be called on two threads “at
the same time”.

Once created, all the methods on a String instance (called with the same parameters) will always provide the same result. You cannot change its behaviour (with any public method), so it will always represent the same entity. Also, it is final and cannot be subclassed, so it is guaranteed that all instances will behave like this.
Therefore from public view the object is considered immutable. The internal state does not really matter in this case.

Yes it is correct to call them immutable.
While it is true that you can reach in and modify private ... and final ... variables of a class, it is an unnecessary and incredibly unwise thing to do on a String object. It is generally assumed that nobody is going to be crazy enough to do it.
From a security standpoint, the reflection calls needed to modify the state of a String all perform security checks. Unless you've mis-implemented your sandbox, the calls will be blocked for non-trusted code. So you should not have to worry about this as a way that untrusted code can break sandbox security.
It is also worth noting that the JLS states that using reflection to change a final field may break things (e.g. in multi-threading) or may not have any effect.

From the viewpoint of a developer who is using reflection, it is not correct to call String immutable. There are actual Java developers using reflection to write real software every day. Dismissing reflection as a "hack" is preposterous. However, from the viewpoint of a developer who is not using reflection, it is correct to call String immutable. Whether or not it is valid to assume that String is immutable depends on context.
Immutability is an abstract concept and therefore cannot apply in an absolute sense to anything with a physical form (see the ship of Theseus). Programming language constructs like objects, variables, and methods exist physically as bits in a storage medium. Data degradation is a physical process which happens to all storage media, so no data can ever be said to be truly immutable. In addition, it is almost always possible in practice to subvert the programming language features intended to prevent the mutation of a particular datum. In contrast, the number 3 is 3, has always been 3, and will always be 3.
As applied to program data, immutability should be considered a useful assumption rather than a fundamental property. For example, if one assumes that a String is immutable, one may cache its hash code for reuse and avoid the cost of ever recomputing its hash code again later. Virtually all non-trivial software relies on assumptions that certain data will not mutate for certain durations of time. Software developers generally assume that the code segment of a program will not change while it is executing, unless they are writing self-modifying code. Understanding what assumptions are valid in a particular context is an important aspect of software development.

It cannot be modified from outside, and it is a final class, so it cannot be subclassed and made mutable. These are two requirements for immutability. Reflection is considered a hack; it's not a normal way of developing.

A class can be immutable while still having mutable fields, as long as it doesn't provide access to its mutable fields.
It's immutable by design. If you use Reflection (getting the declared Field and resetting its accessibility), you are circumventing its design.

Reflection will allow you to change the contents of any private field. Is it therefore correct to call any object in Java immutable?
Immutability refers to changes that are either initiated by or perceivable by the application.
In the case of string, the fact that a particular implementation chooses to lazily calculate the hashcode is not perceptible to the application. I would go a step further, and say that an internal variable that is incremented by the object -- but never exposed and never used in any other way -- would also be acceptable in an "immutable" object.

Yes, it is correct. When you modify a String as in your example, a new String is created, but the old one keeps its value.

Why is String.length() a method?

If a String object is immutable (and thus obviously cannot change its length), why is length() a method, as opposed to simply being public final int length such as there is in an array?
Is it simply a getter method, or does it make some sort of calculation?
Just trying to see the logic behind this.

Java is a standard, not just an implementation. Different vendors can license and implement Java differently, as long as they adhere to the standard. By making the standard call for a field, that limits the implementation quite severely, for no good reason.
Also a method is much more flexible in terms of the future of a class. It is almost never done, except in some very early Java classes, to expose a final constant as a field that can have a different value with each instance of the class, rather than as a method.
The length() method well predates the CharSequence interface, probably from its first version. Look how well that worked out. Years later, without any loss of backwards compatibility, the CharSequence interface was introduced and fit in nicely. This would not have been possible with a field.
So let's really invert the question (which is what you should do when you design a class intended to remain unchanged for decades): What does a field gain here? Why not simply make it a method?

This is a fundamental tenet of encapsulation.
Part of encapsulation is that the class should hide its implementation from its interface (in the "design by contract" sense of an interface, not in the Java keyword sense).
What you want is the String's length -- you shouldn't care if this is cached, calculated, delegates to some other field, etc. If the JDK people want to change the implementation down the road, they should be able to do so without you having to recompile.

Perhaps a .length() method was considered more consistent with the corresponding method for a StringBuffer, which would obviously need more than a final member variable.
The String class was probably one of the very first classes defined for Java, ever. It's possible (and this is just speculation) that the implementation used a .length() method before final member variables even existed. It wouldn't take very long before the use of the method was well-embedded into the body of Java code existing at the time.

Perhaps because length() comes from the CharSequence interface. A method is a more sensible abstraction than a variable if it's going to have multiple implementations.

You should always use accessor methods in public classes rather than public fields, regardless of whether they are final or not (see Item 14 in Effective Java).
When you allow a field to be accessed directly (i.e. it is public), you lose the benefit of encapsulation, which means you can't change the representation without changing the API (you break people's code if you do) and you can't perform any action when the field is accessed.
Effective Java provides a really good rule of thumb:
If a class is accessible outside its package, provide accessor methods, to preserve the flexibility to change the class's internal representation. If a public class exposes its data fields, all hope of changing its representation is lost, as client code can be distributed far and wide.
Basically, it is done this way because it is good design practice to do so. It leaves room to change the implementation of String at a later stage without breaking code for everyone.

String is using encapsulation to hide its internal details from you. An immutable object is still free to have mutable internal values as long as its externally visible state doesn't change. Length could be lazily computed. I encourage you to take a look at String's source code.

Checking the source code of String in OpenJDK, it's only a getter.
But as @SteveKuo points out, this could differ depending on the implementation.

In most current JVM implementations (at the time of writing), a substring references the char array of the original String for its content, so it needs start and length fields to define its own content, and the length() method is used as a getter. However, this is not the only possible way to implement String.
In a different possible implementation, each String could have its own char array. Since char arrays already have a length field with the correct value, it would be redundant to have one in the String object as well. Because String.length() is a method, we don't have to do that and can just return the internal array.length.
These are two possible implementations of String, each with its own good and bad parts, and they can replace each other because the length() method hides where the length is stored (in the internal array or in its own field).
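The two layouts described above can be sketched side by side. Text, SharedText and OwnedText are invented names for this illustration and are not how any JDK actually declares String; bounds checks are left out to keep it short.

```java
import java.util.Arrays;

// Hypothetical minimal string-like contract: callers only see length().
interface Text {
    int length();
    char charAt(int index);
}

// Variant 1: shares a backing array and keeps its own offset and count.
final class SharedText implements Text {
    private final char[] value; // may be shared with other instances
    private final int offset;
    private final int count;

    SharedText(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    public int length() {
        return count; // a plain getter on a stored field
    }

    public char charAt(int index) {
        return value[offset + index];
    }
}

// Variant 2: owns an exact-size array, so no separate length field is needed.
final class OwnedText implements Text {
    private final char[] value;

    OwnedText(char[] source, int offset, int count) {
        this.value = Arrays.copyOfRange(source, offset, offset + count);
    }

    public int length() {
        return value.length; // delegates to the array's own length
    }

    public char charAt(int index) {
        return value[index];
    }
}
```

Code written against Text keeps working whichever variant is used, which would not be the case if length were a public field.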

If Java is pass-by-value only, can I mandate a final modifier in formal parameters?

If Java is strictly pass-by-value for non primitive, isn't it better to establish coding standards like make all formal parameters of methods and constructors final? - to avoid confusion?

If Java is strictly pass-by-value for non primitive
Java is strictly pass-by-value for ALL arguments and ALL results, irrespective of type.
And if you don't understand, or don't believe me, read Java is Pass-by-Value, Dammit!
... can I mandate a final modifier in formal parameters?
Yes you could, but it would be a bad idea to do it for the reasons that you have given.
Declaring a method or constructor's formal parameters to be final means something different to the meaning that you are trying to place on it. Specifically, it means that the parameter can't be assigned to in the body of the method / constructor.
Doing this has the following consequences:
It will actually change the meaning of the code in a way that could cause compilation errors in some cases. (These are good compilation errors ... because they force the programmer to stop doing something that generally makes his code less clear. But fixing takes effort and requires retesting, etc.)
Seasoned Java programmers are not going to read into this the meaning that you intend. (Not that they should need to be reminded ...).
Novice Java programmers are likely to be more confused ... especially if they pick up the incorrect notion that declaring something to be final alters the argument passing semantics!
So this does not achieve your aim.
The real solution is to educate people that Java ALWAYS uses pass-by-value. (And that includes beating up people who persist in spreading false information and doubt about this ... like your question does!)
(In nearly all cases, it is good practice to not update formal parameters, and declaring them final prevents you doing this by accident. So this is good practice. But the reason it is good practice is not what you stated ... and if you did put this into a coding standard with the reasoning that you gave, you would deserve to be roundly criticised.)

Shahzeb commented that Java does not pass by value for non-primitives; it is pass-by-value for non-primitives in the sense that it passes the value of a reference to the non-primitive. It does not, however, copy objects when you pass them to functions.
The final modifier in formal parameter lists shouldn't affect callers one way or the other because methods receive copies of references and copies of primitives; they cannot affect reference and primitive values in caller code. The final modifier does not prevent mutable objects from being altered through the final reference. final can only be of any help to the code in the method it is in.
(Someone correct me if I'm wrong, but I believe this is right.)
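A small sketch of the two points above: reassigning a parameter never affects the caller, and final does not stop mutation through the reference. The class and method names are made up for this example.

```java
// Hypothetical demo class for argument passing.
final class PassByValueDemo {
    static void reassign(StringBuilder sb) {
        // Only the method's local copy of the reference changes.
        sb = new StringBuilder("replaced");
    }

    static void mutate(final StringBuilder sb) {
        // final forbids "sb = ..." here, but not changes to the object itself.
        sb.append(" world");
    }

    public static void main(String[] args) {
        StringBuilder text = new StringBuilder("hello");
        reassign(text);
        System.out.println(text); // prints "hello"
        mutate(text);
        System.out.println(text); // prints "hello world"
    }
}
```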

One use of final on parameters is to keep a parameter's value from being changed by mistake when you write a method. Consider this:
private void method1(final String name) {
    ...
    ...
    String oldname = this.name;
    name = name; // compile error: a final parameter cannot be assigned
}
You forgot to write this.name = name, and the compiler reports an error. Making sure the input parameter cannot be modified lets you trade a logic error for a compile error, which reduces mistakes and debugging time :)
Hope it helps

Just to cover other bases, because I think this is what you are thinking about:
In C++, it is possible to declare a parameter as "const", which basically means that the data contained in this object will not change as a result of calling this function (I think).
http://pages.cs.wisc.edu/~hasti/cs368/CppTutorial/NOTES/PARAMS.html#constRef
Java does not have such a modifier. (In Java 7 maybe? Anyone know?)

What is the reason for these PMD rules?

DataflowAnomalyAnalysis: Found 'DD'-anomaly for variable 'variable' (lines 'n1'-'n2').
DataflowAnomalyAnalysis: Found 'DU'-anomaly for variable 'variable' (lines 'n1'-'n2').
DD and DU sound familiar...I want to say in things like testing and analysis relating to weakest pre and post conditions, but I don't remember the specifics.
NullAssignment: Assigning an Object to null is a code smell. Consider refactoring.
Wouldn't setting an object to null assist in garbage collection, if the object is a local object (not used outside of the method)? Or is that a myth?
MethodArgumentCouldBeFinal: Parameter 'param' is not assigned and could be declared final
LocalVariableCouldBeFinal: Local variable 'variable' could be declared final
Are there any advantages to using final parameters and variables?
LooseCoupling: Avoid using implementation types like 'LinkedList'; use the interface instead
If I know that I specifically need a LinkedList, why would I not use one to make my intentions explicitly clear to future developers? It's one thing to return the class that's highest up the class path that makes sense, but why would I not declare my variables to be of the strictest sense?
AvoidSynchronizedAtMethodLevel: Use block level rather than method level synchronization
What advantages does block-level synchronization have over method-level synchronization?
AvoidUsingShortType: Do not use the short type
My first languages were C and C++, but in the Java world, why should I not use the type that best describes my data?

DD and DU anomalies (if I remember correctly—I use FindBugs and the messages are a little different) refer to assigning a value to a local variable that is never read, usually because it is reassigned another value before ever being read. A typical case would be initializing some variable with null when it is declared. Don't declare the variable until it's needed.
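The "initialize with null, then assign again" case described above might look like the first method below. Which lines PMD actually flags depends on the rule version, so treat this as an assumption-laden sketch; the class and method names are invented.

```java
// Hypothetical example of a define-then-redefine pattern and a cleaner variant.
final class DdAnomaly {
    static String flagged(boolean vip) {
        String label = null;              // first definition, never read
        label = vip ? "VIP" : "standard"; // redefined before any use
        return label;
    }

    static String preferred(boolean vip) {
        // Declared where it is needed and assigned exactly once.
        final String label = vip ? "VIP" : "standard";
        return label;
    }
}
```

Both methods return the same values; the second simply has no dead assignment for a reader or an analyzer to puzzle over.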
Assigning null to a local variable in order to "assist" the garbage collector is a myth. PMD is letting you know this is just counter-productive clutter.
Specifying final on a local variable should be very useful to an optimizer, but I don't have any concrete examples of current JITs taking advantage of this hint. I have found it useful in reasoning about the correctness of my own code.
Specifying interfaces in terms of… well, interfaces is a great design practice. You can easily change implementations of the collection without impacting the caller at all. That's what interfaces are all about.
I can't think of many cases where a caller would require a LinkedList, since it doesn't expose any API that isn't declared by some interface. If the client relies on that API, it's available through the correct interface.
Block-level synchronization allows the critical section to be smaller, which allows as much work as possible to be done concurrently. Perhaps more importantly, it allows the use of a lock object that is privately controlled by the enclosing object. This way, no outside code can take part in your locking, which makes deadlock much easier to rule out. When the instance itself is the lock, anyone can synchronize on it incorrectly and cause deadlock.
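A privately controlled lock object, as described above, could be sketched like this. Counter is a hypothetical class used only for illustration.

```java
// Hypothetical counter guarded by a private lock object.
final class Counter {
    // Callers cannot reach this object, so they cannot synchronize on it.
    private final Object lock = new Object();
    private int count;

    void increment() {
        // Only the statements that touch shared state are inside the block.
        synchronized (lock) {
            count++;
        }
    }

    int get() {
        synchronized (lock) {
            return count;
        }
    }
}
```

Code elsewhere may still write synchronized (someCounter) { ... }, but doing so no longer interferes with increment() or get(), because they use a different monitor.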
Operands of type short are promoted to int in any operations. This rule is letting you know that this promotion is occurring, and you might as well use an int. However, using the short type can save memory, so if it is an instance member, I'd probably ignore that rule.

DataflowAnomalyAnalysis: Found 'DD'-anomaly for variable 'variable' (lines 'n1'-'n2').
DataflowAnomalyAnalysis: Found 'DU'-anomaly for variable 'variable' (lines 'n1'-'n2').
No idea.
NullAssignment: Assigning an Object to null is a code smell. Consider refactoring.
Wouldn't setting an object to null assist in garbage collection, if the object is a local object (not used outside of the method)? Or is that a myth?
Objects referenced only by local variables become eligible for garbage collection once the method returns. Setting the variables to null won't make any difference.
Since it would make less experienced developers wonder what that null assignment is all about, it may be considered a code smell.
MethodArgumentCouldBeFinal: Parameter 'param' is not assigned and could be declared final
LocalVariableCouldBeFinal: Local variable 'variable' could be declared final
Are there any advantages to using final parameters and variables?
It makes it clearer that the value won't change during the lifetime of the variable.
Also, if by any chance someone tries to assign a value, the compiler will catch this coding error at compile time.
Consider this:
public void businessRule( SomeImportantArgument important ) {
if( important.xyz() ){
doXyz();
}
// some fuzzy logic here
important = new NotSoImportant();
// add for/if's/while etc
if( important.abc() ){ // <-- bug
burnTheHouse();
}
}
Suppose that you're assigned to solve some mysterious bug that from time to time burns the house.
You know which parameter was passed; what you don't understand is WHY the burnTheHouse method is invoked if the conditions are not met (according to your findings).
It may take you a while to find out that at some point in the middle, someone changed the reference, and that you are using another object.
Using final helps to prevent this kind of thing.
LooseCoupling: Avoid using implementation types like 'LinkedList'; use the interface instead
If I know that I specifically need a LinkedList, why would I not use one to make my intentions explicitly clear to future developers? It's one thing to return the class that's highest up the class path that makes sense, but why would I not declare my variables to be of the strictest sense?
There is no difference in this case. I would think that since you are not using LinkedList-specific functionality, the suggestion is fair.
Today, LinkedList could make sense, but by using an interface you help yourself (or others) to change it easily when it no longer does.
For small, personal projects this may not make sense at all, but since you're using an analyzer already, I guess you care about code quality already.
It also helps less experienced developers build good habits. [ I'm not saying you're one, but the analyzer does not know you ;) ]
AvoidSynchronizedAtMethodLevel: Use block level rather than method level synchronization
What advantages does block-level synchronization have over method-level synchronization?
The smaller the synchronized section, the better. That's it.
Also, if you synchronize at the method level you'll lock the whole object. When you synchronize at block level, you just synchronize that specific section, and in some situations that's what you need.
AvoidUsingShortType: Do not use the short type
My first languages were C and C++, but in the Java world, why should I not use the type that best describes my data?
I've never heard of this, and I agree with you :) I've never used short, though.
My guess is that by not using it, you'll be helping yourself to upgrade to int seamlessly.
Code smells are oriented more towards code quality than performance optimization. So the advice is given for less experienced programmers and to avoid pitfalls, rather than to improve program speed.
This way, you could save a lot of time and frustration when trying to change the code to fit a better design.
If the advice doesn't make sense, just ignore it. Remember, you are the developer in charge, and the tool is just that, a tool. If something goes wrong, you can't blame the tool, right?

Just a note on the final question.
Putting "final" on a variable results in it only being assignable once. This does not necessarily mean that the code is easier to write, but it most certainly means that it is easier to read for a future maintainer.
Please consider these points:
any variable with a final can be immediately classified in "will not change value while watching".
by implication it means that if all variables which will not change are marked with final, then the variables NOT marked with final actually WILL change.
This means that you can see already when reading through the definition part which variables to look out for, as they may change value during the code, and the maintainer can spend his/her efforts better as the code is more readable.

Wouldn't setting an object to null assist in garbage collection, if the object is a local object (not used outside of the method)? Or is that a myth?
The only thing it does is make it possible for the object to be GCd before the method's end, which is rarely ever necessary.
Are there any advantages to using final parameters and variables?
It makes the code somewhat clearer since you don't have to worry about the value being changed somewhere when you analyze the code. More often than not you don't need or want to change a variable's value once it's set anyway.
If I know that I specifically need a LinkedList, why would I not use one to make my intentions explicitly clear to future developers?
Can you think of any reason why you would specifically need a LinkedList?
It's one thing to return the class that's highest up the class path that makes sense, but why would I not declare my variables to be of the strictest sense?
I don't care much about local variables or fields, but if you declare a method parameter of type LinkedList, I will hunt you down and hurt you, because it makes it impossible for me to use things like Arrays.asList() and Collections.emptyList().
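To make the parameter-type point concrete, here is a small sketch with a made-up Totals.sum method. Because the parameter is declared as List, all three calls in main compile; had it been declared as LinkedList, only the last one would.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

// Hypothetical utility whose parameter is declared by interface.
final class Totals {
    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3)));                 // prints 6
        System.out.println(sum(Collections.<Integer>emptyList()));       // prints 0
        System.out.println(sum(new LinkedList<>(Arrays.asList(4, 5))));  // prints 9
    }
}
```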
What advantages does block-level synchronization have over method-level synchronization?
The biggest one is that it enables you to use a dedicated monitor object so that only those critical sections are mutually exclusive that need to be, rather than everything using the same monitor.
in the Java world, why should I not use the type that best describes my data?
Because types smaller than int are automatically promoted to int for all calculations and you have to cast down to assign anything to them. This leads to cluttered code and quite a lot of confusion (especially when autoboxing is involved).
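The promotion and the required cast can be seen in a few lines. ShortPromotion and addShorts are names invented for this sketch.

```java
// Hypothetical demo of int promotion in short arithmetic.
final class ShortPromotion {
    static short addShorts(short a, short b) {
        // a + b has type int; without the cast this line does not compile.
        return (short) (a + b);
    }

    public static void main(String[] args) {
        short a = 30000;
        short b = 10000;
        int wide = a + b;               // computed as int
        short narrow = addShorts(a, b); // cast back to 16 bits, which wraps around
        System.out.println(wide);       // prints 40000
        System.out.println(narrow);     // prints -25536
    }
}
```

The cast is where the clutter comes from, and as the second value shows, it can also silently change the result when the sum does not fit in 16 bits.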

AvoidUsingShortType: Do not use the short type
short is 16 bits, two's complement, in Java.
A short mathematical operation with anything in the integer family other than another short will require a runtime sign-extension conversion to the larger size. Operating against a floating point requires sign extension and a non-trivial conversion to IEEE-754.
Can't find proof, but with a 32-bit or 64-bit register, you're no longer saving on 'processor instructions' at the bytecode level. You're parking a compact car in a semi-trailer's parking spot as far as the processor register is concerned.
If you are optimizing your project at the bytecode level, wow. Just wow. ;P
I agree on the design side of ignoring this PMD warning; just weigh accurately describing your object with a 'short' against the incurred conversion costs.
In my opinion, the incurred performance hits are minuscule on most machines. Ignore the error.

What advantages does block-level
synchronization have over method-level
synchronization?
Synchronizing an instance method is like wrapping its whole body in a synchronized(this) block (a static synchronized method locks the Class object instead), so it locks the whole object.
Maybe you don't want that.
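Which monitor a synchronized method holds can be checked with Thread.holdsLock. The LockOwners class below is a made-up sketch for that purpose; each method simply reports whether the current thread holds the monitor named in its comment.

```java
// Hypothetical class showing which monitor each form of synchronization uses.
final class LockOwners {
    // Method-level synchronization on an instance method locks "this".
    synchronized boolean methodLevel() {
        return Thread.holdsLock(this);
    }

    // The equivalent block-level form.
    boolean blockLevel() {
        synchronized (this) {
            return Thread.holdsLock(this);
        }
    }

    // A static synchronized method locks the Class object instead.
    static synchronized boolean staticLevel() {
        return Thread.holdsLock(LockOwners.class);
    }
}
```

All three methods return true, while Thread.holdsLock(instance) called from outside any synchronized code returns false.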
