Publication/Escape and Encapsulation in Java - java

I'm reading the "Java Concurrency in Practice" book and there is a part that I don't quite understand. I know it is an old book, and that is probably why my question isn't addressed there.
In the "Sharing Objects" chapter there is a section called "Publication and escape":
Publishing an object means making it available to code outside of its
current scope, such as by storing a reference to it where other code
can find it, returning it from a nonprivate method, or passing it to a
method in another class ...
An object that is published when it
should not have been is said to have escaped.
And there is an example:
class UnsafeStates
{
private String[] states = new String []
{"AK", "AL" ....};
public String [] getStates(){return states;}
}
Publishing states in this way is problematic because any caller can
modify its contents. In this case, the states array has escaped its
intended scope, because what was supposed to be private state has been
effectively made public.
And the book says never to do it this way.
It suddenly struck me that encapsulation basically tells you to do exactly this: keep the field private and expose it through a getter.
Have I misunderstood the concepts?

The problem is this method:
public String [] getStates(){return states;}
It returns a reference to the internal states array.
Even though states is declared private, anyone who obtains it through getStates() can still modify its contents:
UnsafeStates us = new UnsafeStates();
String[] copyStates = us.getStates();
copyStates[0] = "You don't know"; // This also changes `states`, since both variables refer to the same array.
Normally, for array and object fields, you return a copy (a deep copy where the elements themselves are mutable) so that callers cannot modify internal state.
Ex: For array:
String[] copied = new String[states.length];
System.arraycopy(states, 0, copied, 0, states.length);
return copied;
For List:
return Collections.unmodifiableList(internalList);
For Object:
return internalObject.clone(); // Require deep clone
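The copying options above can be put together in one class. This is only a minimal sketch, not code from the book: the names SafeStates, getStateList and SafeStatesDemo are made up for illustration, and a shallow copy is used because String elements are immutable.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical class for illustration; not from the book.
final class SafeStates {
    private final String[] states = {"AK", "AL"};

    // Returns a copy, so callers cannot reach the internal array.
    // A shallow copy is enough here because String is immutable.
    public String[] getStates() {
        return states.clone();
    }

    // Alternative: a read-only list over a private copy of the array.
    public List<String> getStateList() {
        return Collections.unmodifiableList(Arrays.asList(states.clone()));
    }
}

class SafeStatesDemo {
    public static void main(String[] args) {
        SafeStates safe = new SafeStates();
        String[] copy = safe.getStates();
        copy[0] = "You don't know";              // changes only the caller's copy
        System.out.println(safe.getStates()[0]); // prints AK
    }
}
```

If the elements were mutable objects, each element would need to be copied as well, otherwise the same escape happens one level down.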

Related

What is wrong in sharing Mutable State? [duplicate]

In chapter 3 of Java Concurrency in Practice, the author advises against sharing mutable state. He adds that the code below is not a good way to share the states.
class UnsafeStates {
private String[] states = new String[] {
"AK", "AL"
};
public String[] getStates() {
return states;
}
}
From the book:
Publishing states in this way is problematic because any caller can modify its contents. In this case, the states array has escaped its intended scope, because what was supposed to be private state has been effectively made public.
My question here is: we often use getters and setters to access private mutable fields. If that is not the correct way, what is the correct way to share the state? What is the proper way to encapsulate states?
For primitive types, int, float etc, using a simple getter like this does not allow the caller to set its value:
someObj.getSomeInt() = 10; // error!
However, with an array, you could change its contents from the outside, which might be undesirable depending on the situation:
someObj.getSomeArray()[0] = newValue; // perfectly fine
This could lead to problems where a field is unexpectedly changed by other parts of code, causing hard-to-track bugs.
What you can do instead, is to return a copy of the array:
public String[] getStates() {
return Arrays.copyOf(states, states.length);
}
This way, even if the caller changes the contents of the returned array, the array held by the object won't be affected.
With what you have it is possible for someone to change the content of your private array just through the getter itself:
public static void main(String[] args) {
UnsafeStates us = new UnsafeStates();
us.getStates()[0] = "VT";
System.out.println(Arrays.toString(us.getStates()));
}
Output:
[VT, AL]
If you want to encapsulate your States and make it so they cannot change then it might be better to make an enum:
public enum SafeStates {
AR,
AL
}
Creating an enum has a couple of advantages. It restricts callers to an exact set of values, the values cannot be modified, and enums are easy to test against and work well in switch statements. The only downside of an enum is that the values have to be known ahead of time: they are written in code and cannot be created at run time.
This question seems to be asked with respect to concurrency in particular.
Firstly, of course, there is the possibility of modifying non-primitive objects obtained via simple-minded getters; as others have pointed out, this is a risk even with single-threaded programs. The way to avoid this is to return a copy of an array, or an unmodifiable instance of a collection: see for example Collections.unmodifiableList.
However, for programs using concurrency, there is risk of returning the actual object (i.e., not a copy) even if the caller of the getter does not attempt to modify the returned object. Because of concurrent execution, the object could change "while he is looking at it", and in general this lack of synchronization could cause the program to malfunction.
It's difficult to turn the original getStates example into a convincing illustration of my point, but imagine a getter that returns a Map instead. Inside the owning object, correct synchronization may be implemented. However, a getTheMap method that returns just a reference to the Map is an invitation for the caller to call Map methods (even if just map.get) without synchronization.
There are basically two options to avoid the problem: (1) return a deep copy; an unmodifiable wrapper will not suffice in this case, and it should be a deep copy otherwise we just have the same problem one layer down, or (2) do not return unmediated references; instead, extend the method repertoire to provide exactly what is supportable, with correct internal synchronization.
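As a rough sketch of both options, consider a class that owns a Map. Everything here is hypothetical (ScoreRegistry and its methods are invented for illustration): instead of a getTheMap accessor it exposes narrow synchronized operations, plus a snapshot copy made under the lock.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry; all access to the map goes through synchronized methods.
final class ScoreRegistry {
    private final Map<String, Integer> scores = new HashMap<>();

    public synchronized void put(String name, int score) {
        scores.put(name, score);
    }

    // Option 2: a narrow query instead of handing out the map itself.
    public synchronized Integer get(String name) {
        return scores.get(name);
    }

    // Option 1: hand out a copy made while holding the lock.
    // A shallow copy suffices only because String and Integer are immutable.
    public synchronized Map<String, Integer> snapshot() {
        return new HashMap<>(scores);
    }
}
```

A caller can read or modify the snapshot freely without synchronization, because no other thread ever sees that copy unless the caller shares it.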

Do final fields really prevent mutability in Java?

From the famous book Java Concurrency in Practice chapter 3.4.1 Final fields
Just as it is a good practice to make all fields private unless they
need greater visibility[EJ Item 12] , it is a good practice to make
all fields final unless they need to be mutable.
My understanding of final references in Java: a final reference/field just prevents the field from being reassigned, but if it references a mutable object, we can still change that object's state, so the object remains mutable. So I am having difficulty understanding the above quote. What do you think?
final fields prevent you from changing the field itself (by making it "point" to some other instance), but if the field is a reference to a mutable object, nothing will stop you from doing this:
public void someFunction (final Person p) {
p = new Person("mickey","mouse"); // can't do this: p is final
p.setFirstName("donald");
p.setLastName("duck");
}
The reference p above cannot be reassigned, but the actual Person it points to is mutable.
You can, of course, make Person an immutable class, like so:
public class Person {
private final String firstName;
private final String lastName;
public Person(String firstName, String lastName) {
this.firstName = firstName;
this.lastName = lastName;
}
//getters and other methods here
}
Instances of such a class cannot be modified in any way once created.
This quote says only what it says:
make all fields final unless they need to be mutable
A mutable field is a field that you can later change to point to another object. If the field is final, it can still reference a mutable object (e.g. java.util.Date). Thus the field is immutable (it always points to the same object), but the object itself is mutable.
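A short sketch of that distinction, using a made-up Team class (the names are assumptions for illustration only): the final field always refers to the same list, yet the list's contents change freely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration.
final class Team {
    private final List<String> members = new ArrayList<>(); // the reference is fixed

    public void add(String name) {
        members.add(name);   // allowed: mutates the list the field refers to
        // members = new ArrayList<>();  // would not compile: cannot assign a final field
    }

    public int size() {
        return members.size();
    }
}
```

So final on the field says nothing about whether Team as a whole is immutable; that depends on what the referenced object allows.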
In Java, final is similar to this in C++:
int *const p = &something; // constant pointer: p cannot be made to point elsewhere, but the int it points to can still be changed, through p or otherwise
In Java, a final field cannot be reassigned, but the object it refers to may still change.
For example, 'this' is final: you cannot assign anything to this, but you can assign to fields of the object this refers to.
From your Ques:
it is a good practice to make all fields final unless they need to be mutable.
To avoid any modification (whether deliberate or accidental) of the original, it is good to make them final.
By Java code convention, constants (static final fields) are written in all caps, e.g.
private static final int COUNT = 10; // immutable -> cannot be changed
Making a collection reference variable final means only that the reference cannot be changed; you can still add, remove or change objects inside the collection. For example:
private final List<String> loans = new ArrayList<>();
loans.add("home loan"); // valid
loans.add("personal loan"); // valid
loans = new Vector<>(); // not valid: a final field cannot be reassigned
Read more: http://javarevisited.blogspot.com/2011/12/final-variable-method-class-java.html#ixzz2KP5juxm0
To answer your question: Yes, if you use them correctly.
Disregard the quote. There's little to no benefit in littering your code with finals. The author aimed for a good practice, but in large code bases, seeing everything made final obscures the intent and makes the code harder to read, for little to no benefit.
Two approaches
Generally though, there are two schools. Choose yours:
AGAINST, unless really needed
One is NOT to use final, unless you REALLY want those reading the code you wrote to know that this field is special in some way and is not to be trifled with.
For, there's never a bad place to use final!
The other is enamored with final and wants it everywhere. It's less prevalent. Renauld Waldura makes for an excellent propagator of the idea. Read his linked entry, titled "Final word on final keyword", it's a good and in-depth read.
May depend on Java you use
For my part, I'd just want you to know that final has changed over time and doesn't ALWAYS mean immutable, as you can read in Heinz Kabutz' Java Specialist Newsletter. He takes you through how it worked in different Java versions; feel free to take his code and check the Java you use.

Altering passed parameters in Java (call by ref,value)

I actually thought that I had a good idea of how passing values in Java works, since that was part of the SCJP cert, which I have passed. That was until today, when I discovered a method like this at work:
public void toCommand(StringBuffer buf) {
buf.append("blablabla");
}
Then the caller of that method used the function like this:
StringBuffer buf = new StringBuffer();
toCommand(buf);
String str = buf.toString();
Now I thought that this code would give str the value "", but it actually gives it the value from the method. How is this possible? I thought things didn't work like that in Java?
Either way... it should be considered a bad practice to write code like this in Java, right? Because I can imagine it can bring some confusion with it.
I actually spent some time searching on this, but my interpretation of what these sources are saying is that it shouldn't work. What am I missing?
http://www.yoda.arachsys.com/java/passing.html
http://javadude.com/articles/passbyvalue.htm
Sebastian
Java is pass-by-value. The value of an object reference is an object reference, not the object itself. And so the toCommand method receives a copy of the value, which is a reference to the object — the same object that the caller is referencing.
This is exactly the same as when you're referencing an object from two variables:
StringBuffer buf1;
StringBuffer buf2;
buf1 = new StringBuffer();
buf2 = buf1; // Still ONE object; there are two references to it
buf1.append("Hi there");
System.out.println(buf2.toString()); // "Hi there"
Gratuitous ASCII art:
              +--------------------+
buf1--------->|                    |
              | === The Object === |
              |                    |
buf2--------->| Data:              |
              |   * foo = "bar"    |
              |   * x = 27         |
              |                    |
              +--------------------+
Another way to think of it is that the JVM has a master list of all objects, indexed by an ID. We create an object (buf1 = new StringBuffer();) and the JVM assigns the object the ID 42 and stores that ID in buf1 for us. Whenever we use buf1, the JVM gets the value 42 from it and looks up the object in its master list, and uses the object. When we do buf2 = buf1;, the variable buf2 gets a copy of the value 42, and so when we use buf2, the JVM sees object reference #42 and uses that same object. This is not a literal explanation (though from a stratospheric viewpoint, and if you read "JVM" as "JVM and memory manager and OS", it's not a million miles off), but helpful for thinking about what object references actually are.
With that background, you can see how toCommand gets a reference (42 or whatever), not the actual StringBuffer object data. And so operations on it look it up in the master list and alter its state (since it holds state information and allows us to change it). The caller sees the changes to the object's state because the object holds the state, the reference just points to the object.
Either way... it should be considered a bad practice to write code like this in Java, right?
Not at all, it's normal practice. It would be very hard to use Java (or most other OOP languages) without doing this. Objects are big compared to primitives like int and long, and so they're expensive to move around; object references are the size of primitives, so they're easily passed around. Also, having copies of things makes it difficult for various parts of a system to interact. Having references to shared objects makes it quite easy.
StringBuffer is mutable. The toCommand() method receives a reference to that object, passed by value; that reference allows the method to change the mutable StringBuffer.
If you are wondering why we cannot do this with String, it is because String is immutable: any "modification" creates another String object, and the change is not reflected in the object whose reference was passed.
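To see the difference side by side, here is a small sketch; AppendDemo and its two appendTo overloads are invented for this illustration and are not part of the question's code.

```java
class AppendDemo {
    static void appendTo(StringBuffer buf) {
        buf.append("blablabla");   // mutates the object the caller also references
    }

    static void appendTo(String s) {
        s = s + "blablabla";       // creates a new String; only the local variable changes
    }

    public static void main(String[] args) {
        StringBuffer buf = new StringBuffer();
        String str = "";
        appendTo(buf);
        appendTo(str);
        System.out.println(buf);           // prints blablabla
        System.out.println(str.isEmpty()); // prints true
    }
}
```

In both calls a reference is copied into the parameter; the outcomes differ only because StringBuffer offers mutating methods and String does not.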
And I don't see why it should be bad practice.
This is because when you pass an object to a method, a copy of the reference to the object is passed by value. Consider the example below:
public class Test {
public static void modifyBuff(StringBuffer b) {
b.append("foo");
}
public static void tryToNullifyBuff(StringBuffer b) {
b = null; // this will not affect the original reference since
// the one passed (by value) is a copy
}
public static void main(String[] args) {
StringBuffer buff = new StringBuffer(); // buff is a reference
// to StringBuffer object
modifyBuff(buff);
System.out.println(buff); // will print "foo"
tryToNullifyBuff(buff); // this has no effect on the original reference 'buff'
System.out.println(buff); // will still print "foo" because a copy of
// reference buff is passed to tryToNullifyBuff()
// which is made to reference null
// inside the method leaving the 'buff' reference intact
}
}
This can be done with other mutable objects like Collection classes for example. And this is
not at all a bad practice, in fact certain designs actively use this pattern.
str has the value "blablabla" because a reference to the StringBuffer instance is passed to toCommand().
There is only one StringBuffer instance created here - and you pass a reference to it to the toCommand() method. Therefore any methods invoked on the StringBuffer in the toCommand() method are invoked on the same instance of the StringBuffer in the calling method.
It is not always bad practise. Consider the task to assemble an email, then you could do something like:
StringBuilder emailBuilder = new StringBuilder();
createHeader(emailBuilder);
createBody(emailBuilder);
createFooter(emailBuilder);
sendEmail(emailBuilder.toString());
It surely could be used to create confusion, and for a public API one should add a note or two to the Javadoc if the object behind the passed reference is changed.
Another prominent example from the Java API:
Collections.sort(list);
And as others have already explained, just to complete the answer: the value of the reference to the StringBuffer (<-- start using StringBuilder instead!) is passed to toCommand, so outside and inside the toCommand method you access the same StringBuffer instance.
Since you get a reference to the instance you can call all the methods on it, but you can't assign it to anything else.
public void toCommand(StringBuffer buf) {
buf.append("blablabla"); // okay
buf = new StringBuffer(); // no "effect" outside this method
}
When you pass an object to a method, only the reference to the object gets copied, not the object itself (unlike passing an object by value in C++). So when a mutable object is passed to the method, you can change it and the change is visible in the calling method.
If you come from C/C++, note that this is not C++ pass-by-reference: Java always passes arguments by value, and for objects the value passed is the reference, much like passing a pointer by value.
Just in response to some of the comments about the SCJP (sorry don't have permissions to leave comments - apologies if leaving an answer is not the right way to do this).
In defence of the SCJP exam, passing object references as method parameters AND the differences between immutable String objects and StringBuffer / StringBuilder objects are part of the exam - see sections 3.1 and 7.3 here:
http://www.javadeveloper.co.in/scjp/scjp-exam-objectives.html
Both topics are covered quite extensively in the Kathy Sierra / Bert Bates study guide to the exam (which is the de facto official study guide to the exam).

Should Java method arguments be used to return multiple values?

Since arguments sent to a method in Java point to the original data structures in the caller method, did its designers intend for them to be used for returning multiple values, as is the norm in other languages like C?
Or is this a hazardous misuse of Java's general property that variables are pointers?
A long time ago I had a conversation with Ken Arnold (one time member of the Java team), this would have been at the first Java One conference probably, so 1996. He said that they were thinking of adding multiple return values so you could write something like:
x, y = foo();
The recommended way of doing it back then, and now, is to make a class that has multiple data members and return that instead.
Based on that, and other comments made by people who worked on Java, I would say the intent is/was that you return an instance of a class rather than modify the arguments that were passed in.
This is common practice (as is the desire of C programmers to modify the arguments; eventually they usually come round to the Java way of doing it). Just think of it as returning a struct. :-)
(Edit based on the following comment)
I am reading a file and generating two
arrays, of type String and int from
it, picking one element for both from
each line. I want to return both of
them to any function which calls it
which a file to split this way.
I think, if I am understanding you correctly, that I would probably do something like this:
// could go with the Pair idea from another post, but I personally don't like that way
class Line
{
// would use appropriate names
private final int intVal;
private final String stringVal;
public Line(final int iVal, final String sVal)
{
intVal = iVal;
stringVal = sVal;
}
public int getIntVal()
{
return (intVal);
}
public String getStringVal()
{
return (stringVal);
}
// equals/hashCode/etc... as appropriate
}
and then have your method like this:
public void foo(final File file, final List<Line> lines)
{
// add to the List.
}
and then call it like this:
{
final List<Line> lines;
lines = new ArrayList<Line>();
foo(file, lines);
}
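For comparison, here is a sketch of a variant that returns the list instead of filling one supplied by the caller. ParsedLine, LineParser and the "<int> <text>" line format are all assumptions made up for this example, and it parses strings already in memory rather than a File to stay self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value class, analogous to the Line class above.
final class ParsedLine {
    private final int number;
    private final String text;

    ParsedLine(int number, String text) {
        this.number = number;
        this.text = text;
    }

    int getNumber() { return number; }
    String getText() { return text; }
}

class LineParser {
    // Assumes every input line looks like "<int> <text>"; the format is made up
    // and malformed lines are not handled in this sketch.
    static List<ParsedLine> parse(List<String> rawLines) {
        List<ParsedLine> result = new ArrayList<>();
        for (String raw : rawLines) {
            String[] parts = raw.trim().split("\\s+", 2);
            result.add(new ParsedLine(Integer.parseInt(parts[0]), parts[1]));
        }
        return result;
    }
}
```

Both values of each line travel together in one object, so the caller never has to keep two parallel arrays in step.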
In my opinion, if we're talking about a public method, you should create a separate class representing a return value. When you have a separate class:
it serves as an abstraction (i.e. a Point class instead of array of two longs)
each field has a name
can be made immutable
makes evolution of API much easier (i.e. what about returning 3 instead of 2 values, changing type of some field etc.)
I would always opt for returning a new instance, instead of actually modifying a value passed in. It seems much clearer to me and favors immutability.
On the other hand, if it is an internal method, I guess any of the following might be used:
an array (new Object[] { "str", longValue })
a list (Arrays.asList(...) returns a fixed-size list; its elements can still be replaced with set())
pair/tuple class, such as this
static inner class, with public fields
Still, I would prefer the last option, equipped with a suitable constructor. That is especially true if you find yourself returning the same tuple from more than one place.
I do wish there was a Pair<E,F> class in the JDK, mostly for this reason. There is Map.Entry<K,V>, but creating an instance was always a big pain.
Now I use com.google.common.collect.Maps.immutableEntry when I need a Pair
See this RFE launched back in 1999:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4222792
I don't think the intention was to ever allow it in the Java language, if you need to return multiple values you need to encapsulate them in an object.
Using languages like Scala however you can return tuples, see:
http://www.artima.com/scalazine/articles/steps.html
You can also use Generics in Java to return a pair of objects, but that's about it AFAIK.
EDIT: Tuples
Just to add some more on this. I've previously implemented a Pair in projects because of the lack within the JDK. Link to my implementation is here:
http://pbin.oogly.co.uk/listings/viewlistingdetail/5003504425055b47d857490ff73ab9
Note, there isn't a hashcode or equals on this, which should probably be added.
I also came across this whilst doing some research into this questions which provides tuple functionality:
http://javatuple.com/
It allows you to create Pair including other types of tuples.
You cannot truly return multiple values, but you can pass objects into a method and have the method mutate those values. That is perfectly legal. Note that you cannot pass an object in and have the object itself become a different object. That is:
private void myFunc(Object a) {
a = new Object();
}
will result in temporarily and locally changing the value of a, but this will not change the value of the caller, for example, from:
Object test = new Object();
myFunc(test);
After myFunc returns, you will have the old Object and not the new one.
Legal (and often discouraged) is something like this:
private void changeDate(final Date date) {
date.setTime(1234567890L);
}
I picked Date for a reason. This is a class that people widely agree should never have been mutable. The method above will change the internal value of any Date object that you pass to it. This kind of code is legal when it is very clear that the method will mutate or configure or modify what is being passed in.
NOTE: Generally, it's said that a method should do one of these things:
Return void and mutate its incoming objects (like Collections.sort()), or
Return some computation and don't mutate incoming objects at all (like Collections.min()), or
Return a "view" of the incoming object but do not modify the incoming object (like Collections.checkedList() or Collections.singleton())
Mutate one incoming object and return it (Collections doesn't have an example, but StringBuilder.append() is a good example).
Methods that mutate incoming objects and return a separate return value are often doing too many things.
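The four shapes listed above can be seen in a few lines using the JDK methods the answer names. This is just an illustrative sketch; the class name MethodShapesDemo is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MethodShapesDemo {
    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(3, 1, 2));

        // 1. Returns void and mutates its argument.
        Collections.sort(numbers);
        System.out.println(numbers);                  // prints [1, 2, 3]

        // 2. Returns a result and leaves its argument alone.
        System.out.println(Collections.min(numbers)); // prints 1

        // 3. Returns a view backed by the argument without modifying it.
        List<Integer> checked = Collections.checkedList(numbers, Integer.class);
        System.out.println(checked.get(0));           // prints 1

        // 4. Mutates the object and returns that same object, which allows chaining.
        StringBuilder sb = new StringBuilder();
        StringBuilder same = sb.append("a").append("b");
        System.out.println(same == sb);               // prints true
    }
}
```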
There are certainly methods that modify an object passed in as a parameter (see java.io.InputStream.read(byte[] b) as an example), but I have not seen parameters used as an alternative to a return value, especially with multiple parameters. It may technically work, but it is nonstandard.
It's not generally considered terribly good practice, but there are very occasional cases in the JDK where this is done. Look at the 'biasRet' parameter of View.getNextVisualPositionFrom() and related methods, for example: it's actually a one-dimensional array that gets filled with an "extra return value".
So why do this? Well, just to save you having to create an extra class definition for the "occasional extra return value". It's messy, inelegant, bad design, non-object-oriented, blah blah. And we've all done it from time to time...
Generally what Eddie said, but I'd add one more:
Mutate one of the incoming objects, and return a status code. This should generally only be used for arguments that are explicitly buffers, like Reader.read(char[] cbuf).
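For reference, the buffer-plus-status-code idiom looks roughly like this. ReadAllDemo and readAll are hypothetical names, and the four-character buffer is deliberately tiny just to force several reads.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReadAllDemo {
    // read(char[]) fills the buffer and returns how many chars it stored, or -1 at end of stream.
    static String readAll(Reader reader) throws IOException {
        char[] buffer = new char[4];          // deliberately tiny to force several reads
        StringBuilder result = new StringBuilder();
        int count;
        while ((count = reader.read(buffer)) != -1) {
            result.append(buffer, 0, count);  // only the first 'count' chars are valid
        }
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAll(new StringReader("Hello, I'm Bob"))); // prints Hello, I'm Bob
    }
}
```

The array argument is explicitly a scratch buffer, which is why mutating it and returning a count reads naturally here.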
I had a Result object that cascades through a series of validating void methods as a method parameter. Each of these validating void methods would mutate the result parameter object to add the result of the validation.
But this is impossible to test because now I cannot stub the void method to return a stub value for the validation in the Result object.
So, from a testing perspective it appears that one should favor returning an object instead of mutating a method parameter.
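One possible shape for that, as a sketch only: each validator returns its own findings and the caller combines them. The Validators class, its method names and the rules it checks are all invented here, not taken from the original code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical validators; each returns its findings instead of writing into a shared Result.
class Validators {
    static List<String> validateName(String name) {
        List<String> errors = new ArrayList<>();
        if (name == null || name.isEmpty()) {
            errors.add("name is required");
        }
        return errors;
    }

    static List<String> validateAge(int age) {
        List<String> errors = new ArrayList<>();
        if (age < 0) {
            errors.add("age must not be negative");
        }
        return errors;
    }

    // The caller combines the individual results.
    static List<String> validate(String name, int age) {
        List<String> all = new ArrayList<>(validateName(name));
        all.addAll(validateAge(age));
        return all;
    }
}
```

Because every validator has a return value, a test can stub or assert on each one in isolation.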

How is a StringBuffer passing data through voids with no fields in the Class?

Given: Class has no fields, every variable is local. littleString was created by refactoring bigString in Eclipse:
public String bigString()
{
StringBuffer bob = new StringBuffer();
this.littleString(bob);
return bob.toString();
}
private void littleString(final StringBuffer bob)
{
bob.append("Hello, I'm Bob");
}
The method littleString should not be passing the StringBuffer back, and yet it is. What kind of Black Magic goes on here? This is breaking all rules of encapsulation that I know. I'm in shock, words fail me.
littleString isn't passing the object back -- it's just using the same object. Both the local variable bob in bigString() and the parameter bob in littleString() refer to the same object, so if you change the object through one of those references, the change is immediately visible through the other.
The issue is that StringBuffers are mutable and have internal state associated with them. Some types of objects (such as Strings) are immutable, so you can safely pass them around as method parameters, and you know they won't ever get modified. Note that the addition of the final keyword doesn't help here -- it just makes sure that bob never gets assigned to refer to a different StringBuffer object.
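A small sketch of that last point. The class and method names are made up, and the list half goes slightly beyond the answer: it assumes you control the call site and can wrap a mutable collection in Collections.unmodifiableList before passing it on.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class FinalVersusReadOnlyDemo {
    // 'final' only forbids reassigning the parameter; the object can still be changed.
    static void append(final StringBuffer bob) {
        bob.append("Hello, I'm Bob");
    }

    // Hypothetical callee that tries to modify the list it was given.
    static boolean tryToAdd(List<String> names) {
        try {
            names.add("intruder");
            return true;
        } catch (UnsupportedOperationException e) {
            return false;   // read-only views reject mutation at run time
        }
    }

    public static void main(String[] args) {
        StringBuffer bob = new StringBuffer();
        append(bob);
        System.out.println(bob);   // prints Hello, I'm Bob

        List<String> names = new ArrayList<>();
        names.add("Bob");
        System.out.println(tryToAdd(Collections.unmodifiableList(names))); // prints false
        System.out.println(tryToAdd(names));                               // prints true
    }
}
```

Note that the read-only wrapper is enforced at run time, not by the compiler, so it is weaker than the kind of guarantee an immutable type gives.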
It's not passing anything back. It's modifying the StringBuffer you passed a reference to. Objects in Java are not passed by value; what is passed (by value) is a reference to the object.
If you meant why does the string buffer get modified, it's because you were passing a reference to the string buffer, which allows you to call the public method append which modifies the string buffer object.
The answers above pretty much got it, except for one little thing that hasn't been mentioned: Java lacks "const-ness", that is, a way to declare that an object must not be modified.
"final" is close, but it still doesn't do the job properly. What the code snippet showed is the kind of error that can happen if you have mutable objects passed in as parameters to other methods. This can be fixed either by using immutable objects, or it could be fixed if some kind of new keyword were added for deep const-ness.
