What's the preferred way to assign a collection from a parameter? - java

I have this class:
public class MyClass {
public void initialize(Collection<String> data) {
this.data = data; // <-- Bad!
}
private Collection<String> data;
}
This is obviously bad style, because I'm introducing a shared mutable state. What's the preferred way to handle this?
Ignore it?
Clone the collection?
...?
EDIT: To clarify why this is bad, imagine this:
MyClass myObject = new MyClass();
List<String> data = new ArrayList<String>();
myObject.initialize(data); // myObject.data.size() == 0
data.add("Test"); // myObject.data.size() == 1
Just storing the reference poses a way to inject data into the private field myObject.data, although it should be completely private.
Depending on the nature of MyClass this could have serious impacts.

The best way is to deep clone the parameter. For performance reasons, this is usually not possible. On top of that, not all objects can be cloned, so deep copying might throw exceptions and cause all kinds of headache.
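The next best way would be a "copy-on-write" clone. There is no built-in support for such a lazily copied clone in the Java runtime (java.util.concurrent.CopyOnWriteArrayList copies its backing array on every write, which serves a different purpose).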
If you think that it's possible someone mutates the collection, do a shallow copy using the copy constructor:
this.data = new HashSet<String> (data);
This will solve your problem (since String is immutable) but it will fail when the type in the set is mutable.
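Here is a minimal sketch of the copy-constructor approach. The class name CopyingHolder and its size() accessor are made up for illustration, and it assumes the field should behave like a set:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for MyClass from the question.
class CopyingHolder {
    private Set<String> data = new HashSet<>();

    public void initialize(Collection<String> data) {
        // Shallow copy: the new set is independent of the caller's collection,
        // but the elements are shared (harmless here because String is immutable).
        this.data = new HashSet<>(data);
    }

    public int size() {
        return data.size();
    }
}

class CopyingHolderDemo {
    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        CopyingHolder holder = new CopyingHolder();
        holder.initialize(input);
        input.add("Test");                  // modifies only the caller's list
        System.out.println(holder.size());  // prints 0
    }
}
```

With the original `this.data = data;` assignment, the same sequence would report a size of 1.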
Another solution is to always make the sets immutable as soon as you store them somewhere:
Set<String> set = ...
...build the set...
// Freeze the set
set = Collections.unmodifiableSet(set);
// Now you can safely pass it elsewhere
obj.setData (set);
The idea here is to turn collections into "value objects" as soon as possible. Anyone who wants to change the collection must copy it, change it and then save it back.
Within a class, you can keep the set mutable and wrap it in the getter (which you should do anyway).
Problems with this approach: Performance (but it's probably not as bad as you'd expect) and discipline (breaks if you forget it somewhere).
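As a rough sketch of "mutable inside, wrapped in the getter" (TagHolder and its methods are hypothetical names, not from the question):

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical class: the set stays mutable internally,
// but callers only ever receive a read-only view.
class TagHolder {
    private final Set<String> tags = new HashSet<>();

    public void addTag(String tag) {
        tags.add(tag);
    }

    public Set<String> getTags() {
        // A view, not a copy: callers cannot modify it,
        // but they will see later changes made by the owner.
        return Collections.unmodifiableSet(tags);
    }
}

class TagHolderDemo {
    public static void main(String[] args) {
        TagHolder holder = new TagHolder();
        holder.addTag("a");
        Set<String> view = holder.getTags();
        try {
            view.add("b");
        } catch (UnsupportedOperationException e) {
            System.out.println("view is read-only");
        }
        holder.addTag("c");
        System.out.println(view.size()); // prints 2: the view reflects the owner's changes
    }
}
```

Note that the wrapper only protects against writes through the wrapper itself. Anyone who still holds the original mutable reference can change the contents, which is where the discipline problem comes from.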

Null check (if you want to restrict null)
Either defensive copy (if you don't want shared state)
or as you did (if a live view on data is useful)
Depends heavily on your requirements.
Edited:
Ignoring should be no option. Silent fail is, well... a debugging nightmare.
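A small sketch combining the null check with the defensive copy (CheckedHolder is a made-up name; whether null should be rejected at all depends on your requirements):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Objects;

// Hypothetical variant of MyClass from the question.
class CheckedHolder {
    private List<String> data = new ArrayList<>();

    public void initialize(Collection<String> data) {
        // Fail fast with a clear message instead of failing silently later.
        Objects.requireNonNull(data, "data must not be null");
        // Defensive copy: no state shared with the caller.
        this.data = new ArrayList<>(data);
    }

    public int size() {
        return data.size();
    }
}
```

Calling initialize(null) throws a NullPointerException right at the call site, which is much easier to debug than a null that surfaces somewhere else later.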

public class Foo {
private final Collection collection = new ArrayList();
public void initialise(final Collection collection) {
this.collection.addAll(collection);
}
}

Sorry for not addressing your concern directly, but I would never directly pass a Collection to a setXxx() bean setter method. Instead, I would do:
private final List<MyClass> theList;
public void addXxx(MyClass item) { ... }
public void removeXxx(MyClass item) { ... } // or index.
public Iterator<MyClass> iterateXxx() {
return Collections.unmodifiableList(theList).iterator();
}
I would go for defensive copying / deep cloning only if I am sure there would be no side effects from using it, and as for the speed, I wouldn't concern myself with it, since in business applications reliability has 10 times more priority than speed. ;-)

One idea is to pass the data as a String array and create the Set inside MyClass. Of course, MyClass should check that the input data is valid. I believe this is good practice anyway.
If both the caller of MyClass and MyClass itself actually work with a Set<String>, then you could consider cloning the collection. The Set however needs to be constructed somehow. I would prefer to move this responsibility to MyClass.

Related

What is wrong in sharing Mutable State? [duplicate]

In chapter 3 of Java Concurrency in Practice, the author advises against sharing mutable state. He adds that the code below is not a good way to publish state:
class UnsafeStates {
private String[] states = new String[] {
"AK", "AL"
};
public String[] getStates() {
return states;
}
}
From the book:
Publishing states in this way is problematic because any caller can modify its contents. In this case, the states array has escaped its intended scope, because what was supposed to be private state has been effectively made public.
My question is: we often use getters and setters to access private, mutable, class-level variables. If that is not the correct way, what is the correct way to share state? What is the proper way to encapsulate it?
For primitive types, int, float etc, using a simple getter like this does not allow the caller to set its value:
someObj.getSomeInt() = 10; // error!
However, with an array, you could change its contents from the outside, which might be undesirable depending on the situation:
someObj.getSomeArray()[0] = newValue; // perfectly fine
This could lead to problems where a field is unexpectedly changed by other parts of code, causing hard-to-track bugs.
What you can do instead, is to return a copy of the array:
public String[] getStates() {
return Arrays.copyOf(states, states.length);
}
This way, even if the caller changes the contents of the returned array, the array held by the object won't be affected.
With what you have it is possible for someone to change the content of your private array just through the getter itself:
public static void main(String[] args) {
UnsafeStates us = new UnsafeStates();
us.getStates()[0] = "VT";
System.out.println(Arrays.toString(us.getStates()));
}
Output:
[VT, AL]
If you want to encapsulate your States and make it so they cannot change then it might be better to make an enum:
public enum SafeStates {
AK,
AL
}
Creating an enum gives a couple of advantages. It defines the exact values that people can use. They can't be modified, they are easy to test against, and you can easily switch on them. The only downside of going with an enum is that the values have to be known ahead of time, i.e. you have to code them in; they cannot be created at run time.
This question seems to be asked with respect to concurrency in particular.
Firstly, of course, there is the possibility of modifying non-primitive objects obtained via simple-minded getters; as others have pointed out, this is a risk even with single-threaded programs. The way to avoid this is to return a copy of an array, or an unmodifiable instance of a collection: see for example Collections.unmodifiableList.
However, for programs using concurrency, there is risk of returning the actual object (i.e., not a copy) even if the caller of the getter does not attempt to modify the returned object. Because of concurrent execution, the object could change "while he is looking at it", and in general this lack of synchronization could cause the program to malfunction.
It's difficult to turn the original getStates example into a convincing illustration of my point, but imagine a getter that returns a Map instead. Inside the owning object, correct synchronization may be implemented. However, a getTheMap method that returns just a reference to the Map is an invitation for the caller to call Map methods (even if just map.get) without synchronization.
There are basically two options to avoid the problem: (1) return a deep copy; an unmodifiable wrapper will not suffice in this case, and it should be a deep copy otherwise we just have the same problem one layer down, or (2) do not return unmediated references; instead, extend the method repertoire to provide exactly what is supportable, with correct internal synchronization.
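To make option (2) a bit more concrete, here is a sketch under my own assumptions (ScoreRegistry and its methods are invented for illustration). The map never leaves the class, and every access goes through a synchronized method:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical owner of a Map: no reference to the map itself is ever returned.
class ScoreRegistry {
    private final Map<String, Integer> scores = new HashMap<>();

    public synchronized void put(String name, int score) {
        scores.put(name, score);
    }

    // Mediated read access: callers get a value, never the map.
    public synchronized Integer get(String name) {
        return scores.get(name);
    }

    // Option (1) for callers that need everything at once: a copy taken under the lock.
    // A shallow copy is enough here only because String and Integer are immutable.
    public synchronized Map<String, Integer> snapshot() {
        return new HashMap<>(scores);
    }
}
```

Callers can do whatever they like with the snapshot without touching the registry's own map, and they never get the chance to call Map methods without synchronization.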

Is it possible to redirect a method call to another instance of the same object at runtime?

Situation: I have multiple states of the same object represented by different instances (which are made using a deep-copy). Now I want to make sure that, no matter which of these grouped instances is accessed, all operations that perform modifications are redirected onto the youngest of these instances[1].
Example:[2]
//Let's create an object
MyObject mObj = new MyObject(...);
//Let's create a list of past states
List<MyObject> pastStates = new ArrayList<MyObject>();
//doing some operations on mObj ....
mObj.modify(...);
//done modifying mObj, now let's save its state and then create a copy to begin again
pastStates.add(mObj.copy());
//more of this...
mObj.modify(...);
pastStates.add(mObj.copy());
//let's compare some old states for whatever reason (e.g. part of an algorithm)
void compare(MyObject o1, MyObject o2) {
if(o1.getA() == o2.getA()) {
o2.modify(...); //wait, we modified an old state...
}
}
Now this is a rather obvious example and probably a classic case of programmer's fault. They modified something that is clearly advertised as being a past state whatsoever... But say we still want to be nice and try to help and thus intercept the method call and perform it on the correct instance namely the youngest/master instance.[3]
Question: Is there a way to do this with standard java?
Bonus: Is there a way that doesn't have a horrible impact on performance?
Background: I'm experimenting around with different ways to make a library/engine, I'm writing for fun, harder to misuse by the enduser. As I will need these states internally anyways (snapshots in time for certain background functionalities), I would like to make them available to the enduser as well so they can profit of my statekeeping, e.g. for use in analytical algorithms.
[1] There can be multiple groups of instances of an object that are not related to each other; relation will presumably be kept by a one way link to the youngest instance which simply won't ever change.
[2] This code is meant as an example, it is clear that this mistake could be prevented by the enduser paying more attention when writing code.
[3] Now an easy way to prevent modification is to wrap the object into an immutable version which throws exception when trying to modify it > but we do not write this object ourselves and don't want to force it upon the enduser to write two versions of their own object if we don't have to...
I would probably create two classes: an "inner" one which is immutable and an "outer" one that maintains a list of inners. (Note: I don't mean inner classes in the JLS sense, just an object that is fully controlled by its wrapper.)
Something like this:
public final class Outer {
private final List<Inner> history = new ArrayList<>(); //history is inverted for brevity, 0 is the latest one
public Outer(int x) {
this.history.add(new Inner(x));
}
public void add(int x) {
history.add(0, new Inner(history.get(0).x + x));
}
public Inner current() {
return history.get(0);
}
public static final class Inner {
private final int x;
private Inner(int x) {
this.x = x;
}
public int getX() {
return x;
}
}
}
With this setup clients can only instantiate Outer, can only mutate Outer but have access to a read-only copy of all the past states. There is no way to accidentally modify a past state. There is no need for separate grouping logic either because each instance of Outer naturally only records its own history.
Method interception can be done with AOP by using an around advice. AspectJ is a good tool for solving such problems. The impact on performance should also be no problem.
In an around advice in most cases you call proceed to execute the target method on the target object, but you can also prevent the method execution and instead do a method call on another object.
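If you want to stay with the standard library, the same "intercept and call another object" idea can be approximated with a JDK dynamic proxy. To be clear, this is not AspectJ, it only works when the object is accessed through an interface, and every name below (Versioned, VersionedImpl, Redirect) is made up for the sketch:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

// Hypothetical interface: JDK proxies can only implement interfaces.
interface Versioned {
    String getState();
    void modify(String newState);
}

class VersionedImpl implements Versioned {
    private String state;
    VersionedImpl(String state) { this.state = state; }
    public String getState() { return state; }
    public void modify(String newState) { this.state = newState; }
}

class Redirect {
    // Wraps a past state: modify(...) is forwarded to master, everything else to snapshot.
    static Versioned pastState(Versioned snapshot, Versioned master) {
        InvocationHandler handler = (proxy, method, args) -> {
            Object target = method.getName().equals("modify") ? master : snapshot;
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the real method threw
            }
        };
        return (Versioned) Proxy.newProxyInstance(
                Versioned.class.getClassLoader(),
                new Class<?>[] { Versioned.class },
                handler);
    }
}

class RedirectDemo {
    public static void main(String[] args) {
        Versioned master = new VersionedImpl("state1");
        Versioned old = Redirect.pastState(new VersionedImpl("state1"), master);
        master.modify("state2");
        old.modify("stateNew");                 // lands on master, not on the snapshot
        System.out.println(old.getState());     // prints state1
        System.out.println(master.getState());  // prints stateNew
    }
}
```

The reflective call adds some overhead per invocation, so measure before relying on this in a hot path.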
Yes, it is possible using bytecode modification.
Actually, if it was done by AspectJ or other library, it would be implemented using proxies or byte code modification. But I'm not sure that this specific task is possible with Aspect programming libraries API.
You can find working example for your task in this repo.
This test from repository works fine:
//Let's create an object
MyObject mObj = new MyObject();
MyObjectActiveRepository.INSTANCE.putToGroup(mObj, "group1");
MyObjectActiveRepository.INSTANCE.registerActiveForItsGroup(mObj);
//Let's create a list of past states
List<MyObject> pastStates = new ArrayList<MyObject>();
//doing some operations on mObj ....
mObj.modify("state1");
//done modifying mObj, now let's save its state and then create a copy to begin again
pastStates.add(mObj.copy());
//more of this...
mObj.modify("state2");
pastStates.add(mObj.copy());
mObj.modify("state3");
assertEquals("state1", pastStates.get(0).getState());
assertEquals("state2", pastStates.get(1).getState());
assertEquals("state3", mObj.getState());
pastStates.get(0).modify("stateNew");
assertEquals("state1", pastStates.get(0).getState());
assertEquals("state2", pastStates.get(1).getState());
assertEquals("stateNew", mObj.getState());
In short:
I use ByteBuddy (a bytecode generation and modification tool) to redefine the class's bytecode before it is loaded, in order to:
remove final from the class (if present)
add a field that stores the MyObject's "group", to address your note [1]
intercept calls to copy (which must also copy the "group" field) and to modify (to retarget the call)
replace the class code in the class loader
TypePool typePool = TypePool.Default.ofClassPath();
new ByteBuddy()
.rebase(typePool.describe("MyObject").resolve(), ClassFileLocator.ForClassLoader.ofClassPath())
.modifiers(TypeManifestation.PLAIN) //our class can be final and we have no access to it - so remove final
.defineField("group", String.class, Visibility.PUBLIC)
.method(named("modify")).intercept(MethodDelegation.to(typePool.describe("Interceptors").resolve()))
.method(named("copy")).intercept(MethodDelegation.to(typePool.describe("Interceptors").resolve()))
.make()
.load(InterceptorsInitializer.class.getClassLoader(), ClassLoadingStrategy.Default.INJECTION);
I implemented MyObjectActiveRepository, which tracks the active object for each group and holds the "group" field functionality, and Interceptors, which contains a simple copy redefinition that also sets "group" and a modify redefinition that does the retargeting.
I think the code should be lightweight. The most expensive part is the reflective call to the setter when assigning a group to an object after it has been created. This part can be improved: with ByteBuddy, we can replace the reflection by implementing a new interface with getGroup() and setGroup(String) methods during bytecode generation and delegating them to FieldAccessor.ofField("group"), which gives an efficient, non-reflective call through the interface. modify() should have nearly the same performance as before, because it uses no reflection, only fully generated bytecode. I haven't done any benchmarking.

Java: Should I construct lightweight objects each time or cache instance?

During code review, a colleague of mine looked at this piece of code:
public List<Item> extractItems(List<Object[]> results) {
return Lists.transform(results, new Function<Object[], Item>() {
@Override
public Item apply(Object[] values) {
...
}
});
}
He suggests changing it to this:
public List<Item> extractItems(List<Object[]> results) {
return Lists.transform(results, getTransformer());
}
private Function<Object[], Item> transformer;
private Function<Object[], Item> getTransformer() {
if(transformer == null) {
transformer = new Function<Object[], Item>() {
@Override
public Item apply(Object[] values) {
...
}
};
}
return transformer;
}
So we are looking at taking the new Function() construction, and moving it over to be a member variable and re-used next time.
While I understand his logic and reasoning, I guess I'm not sold that I should do this for every possible object that I create that follows this pattern. It seems like there would be some good reasons not to do this, but I'm not sure.
What are your thoughts? Should we always cache duplicately created objects like this?
UPDATE
The Function is a google guava thing, and holds no state. A couple people have pointed out the non-thread-safe aspect of this change, which is perfectly valid, but isn't actually a concern here. I'm more asking about the practice of constructing vs caching small objects, which is better?
Your colleague's proposal is not thread safe. It also reeks of premature optimization. Is the construction of a Function object a known (tested) CPU bottleneck? If not, there's no reason to do this. It's not a memory problem - you're not keeping a reference, so GC will sweep it away, probably from Eden.
As already said, it's all premature optimization. The gain is probably not measurable and the whole story should be forgotten.
However, with the transformer being stateless, I'd go for it for readability reasons. Anonymous functions as an argument rather pollute the code.
Just drop the lazy initialization - you're gonna use the transformer whenever you use the class, right? (*) So put it in a static final field and maybe you can reuse it somewhere else.
(*) And even if not, creating and holding a cheap object during the whole application lifetime doesn't matter.
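Roughly what I mean, as a self-contained sketch. To avoid the Guava dependency it uses java.util.function.Function and a plain loop instead of Guava's Function and Lists.transform, so unlike Lists.transform it builds the list eagerly. Item and its name field are hypothetical, since the question doesn't show them:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical Item type; the real one is not shown in the question.
class Item {
    final String name;
    Item(String name) { this.name = name; }
}

class ItemExtractor {
    // Stateless, so a single shared instance is safe to reuse across calls and threads.
    // No lazy initialization, hence no thread-safety issue.
    private static final Function<Object[], Item> TO_ITEM =
            values -> new Item(String.valueOf(values[0]));

    public List<Item> extractItems(List<Object[]> results) {
        List<Item> items = new ArrayList<>(results.size());
        for (Object[] row : results) {
            items.add(TO_ITEM.apply(row));
        }
        return items;
    }
}
```

The point is readability: extractItems stays a few lines long and the transformer has a name.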

Is doing something else in setter methods considered having side effects?

Recently I have read some articles saying that methods having side effects is not good. So I just want to ask if my implementation here can be categorized as having side effect.
Suppose I have a SecurityGuard which checks to see if he should allow a customer to go to the club or not.
The SecurityGuard either has only list of validNames or list of invalidNames, not both.
if the SecurityGuard has only validNames, he only allows customer whose name on the list.
if the SecurityGuard has only invalidNames, he only allows customer whose name NOT on the list.
if the SecurityGuard has no lists at all, he allows everyone.
So to enforce the logic, on setter of each list, I reset the other list if the new list has value.
class SecurityGuard {
private List<String> validNames = new ArrayList<>();
private List<String> invalidNames = new ArrayList<>();
public void setValidNames(List<String> newValidNames) {
this.validNames = new ArrayList<>(newValidNames);
// empty the invalidNames if newValidNames has values
if (!this.validNames.isEmpty()) {
this.invalidNames = new ArrayList<>();
}
}
public void setInvalidNames(List<String> newInvalidNames) {
this.invalidNames = new ArrayList<>(newInvalidNames);
// empty the validNames if newInvalidNames has values
if (!this.invalidNames.isEmpty()) {
this.validNames = new ArrayList<>(); //empty the validNames
}
}
public boolean allowCustomerToPass(String customerName) {
if (!validNames.isEmpty()) {
return validNames.contains(customerName);
}
return !invalidNames.contains(customerName);
}
}
So here you can see the setter methods have an implicit action, it resets the other list.
The question is what I'm doing here could be considered having a side effect? Is it bad enough so that we have to change it? And if yes, how can I improve this?
Thanks in advance.
Well, setters themselves have side effects (A value in that instance is left modified after the function ends). So, no, I wouldn't consider it something bad that needs to be changed.
Imagine that the guard had just one setAdmissionPolicy method, which accepted a reference to an AdmissionPolicy defined as:
interface AdmissionPolicy {
boolean isAcceptable(String customerName);
}
and set the guard's admissionPolicy field to the passed-in reference. The guard's own allowCustomerToPass method simply called admissionPolicy.isAcceptable(customerName);.
Given the above definitions, one can imagine three classes that implement AdmissionPolicy: one would accept a list in its constructor, and isAcceptable would return true for everyone on the list, another would also accept a list in its constructor, but its isAcceptable would return true only for people not on the list. A third would simply return true unconditionally. If the club needs to close occasionally, one might also have a fourth implementation that returned false unconditionally.
Viewed in such a way, setInvalidNames and setValidNames could both be implemented as:
public void setAdmissionPolicyAdmitOnly(List<String> newValidNames) {
admissionPolicy = new AdmitOnlyPolicy(newValidNames);
}
public void setAdmissionPolicyAdmitAllBut(List<String> newInvalidNames) {
admissionPolicy = new AdmitAllButPolicy(newInvalidNames);
}
With such an implementation, it would be clear that each method was only "setting" one thing; such an implementation is how I would expect a class such as yours to behave.
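For illustration, the policy classes described above might look something like this. It is only a sketch: AdmitOnlyPolicy and AdmitAllButPolicy take their names from the setters above, while the defensive copies and the admit-everyone default are my own assumptions:

```java
import java.util.ArrayList;
import java.util.List;

interface AdmissionPolicy {
    boolean isAcceptable(String customerName);
}

// Admits only the people on the list.
class AdmitOnlyPolicy implements AdmissionPolicy {
    private final List<String> validNames;
    AdmitOnlyPolicy(List<String> validNames) {
        this.validNames = new ArrayList<>(validNames); // defensive copy
    }
    public boolean isAcceptable(String customerName) {
        return validNames.contains(customerName);
    }
}

// Admits everyone except the people on the list.
class AdmitAllButPolicy implements AdmissionPolicy {
    private final List<String> invalidNames;
    AdmitAllButPolicy(List<String> invalidNames) {
        this.invalidNames = new ArrayList<>(invalidNames); // defensive copy
    }
    public boolean isAcceptable(String customerName) {
        return !invalidNames.contains(customerName);
    }
}

class SecurityGuard {
    // Assumed default: with no policy set, everyone is admitted.
    private AdmissionPolicy admissionPolicy = name -> true;

    public void setAdmissionPolicy(AdmissionPolicy policy) {
        this.admissionPolicy = policy;
    }

    public boolean allowCustomerToPass(String customerName) {
        return admissionPolicy.isAcceptable(customerName);
    }
}
```

With this shape, an empty AdmitOnlyPolicy list admits nobody and an empty AdmitAllButPolicy list admits everybody, regardless of what policy was set before.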
The behavior of your class as described, however, I would regard as dubious at best. The issue isn't so much that adding admitted items clears out the rejected items, but rather that the behavior when a passed-in list is empty depends upon the earlier state in a rather bizarre fashion. It's hardly intuitive that if everyone but Fred is allowed access, calling setValidNames to nothing should have no effect, but if it's set to only allow George access that same call should grant access to everyone. Further, while it would not be unexpected that setValidNames would remove from invalidNames anyone who was included in the valid-names list nor vice versa, given the way the functions are named, the fact that setting one list removes everyone from the other list is somewhat unexpected (the different behavior with empty lists makes it especially so).
It does not have any side effect. However, developers generally assume that getters and setters contain no logic beyond getting and setting the variable. So when another developer maintains the code, he will probably overlook the logic in your bean's setters and repeat the same checks elsewhere, which produces boilerplate code.
I'd not consider it as a side effect. You are maintaining the underlying assumptions of your object. I'm not sure it's the best design, but it's certainly a working one.
In this case I don't think changing the other list is a side effect, since its scope is within this class.
However, based on your description, it may be a better design to have one list (called nameList) and a boolean (isValid) that differentiates between a whitelist and a blacklist. This way it is clear that only one type of list can be filled at any time.
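A quick sketch of that idea. The class name SingleListGuard and the two-argument setNames are made up, and the default of an empty blacklist (so that everyone is admitted) is an assumption chosen to match the "no lists at all" rule from the question:

```java
import java.util.ArrayList;
import java.util.List;

class SingleListGuard {
    private List<String> nameList = new ArrayList<>();
    // true: nameList is a whitelist; false: nameList is a blacklist.
    // Default is an empty blacklist, which admits everyone.
    private boolean isValid = false;

    public void setNames(List<String> names, boolean isValid) {
        this.nameList = new ArrayList<>(names); // defensive copy
        this.isValid = isValid;
    }

    public boolean allowCustomerToPass(String customerName) {
        // Whitelist: admit if on the list. Blacklist: admit if not on the list.
        return nameList.contains(customerName) == isValid;
    }
}
```

Note that an empty whitelist then admits nobody, which differs from the original class but is at least independent of earlier state.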
I think it's OK. E.g. if you want the stored list to be unmodifiable, the best place to enforce that is the setter:
public void setNames(List<String> names) {
this.names = names == null ? Collections.emptyList() : Collections.unmodifiableList(names);
}
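One caveat: Collections.unmodifiableList returns a read-only view of the list it wraps, so the caller can still change the contents through the original reference. A variant that copies first might look like this (NameHolder and getNames are hypothetical names for the sketch):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class NameHolder {
    private List<String> names = Collections.emptyList();

    public void setNames(List<String> names) {
        this.names = names == null
                ? Collections.<String>emptyList()
                // Copy first, then wrap: later changes to the caller's list don't show through.
                : Collections.unmodifiableList(new ArrayList<>(names));
    }

    public List<String> getNames() {
        return names; // already unmodifiable, safe to hand out
    }
}

class NameHolderDemo {
    public static void main(String[] args) {
        List<String> input = new ArrayList<>(Arrays.asList("a"));
        NameHolder holder = new NameHolder();
        holder.setNames(input);
        input.add("b");                                // caller keeps mutating its own list
        System.out.println(holder.getNames().size());  // prints 1
    }
}
```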

can I add to a private list directly through the getter?

I realize I'm going to get flamed for not simply writing a test myself... but I'm curious about people's opinions, not just the functionality, so... here goes...
I have a class that has a private list. I want to add to that private list through the public getMyList() method.
so... will this work?
public class ObA{
private List<String> foo;
public List<String> getFoo(){return foo;}
}
public class ObB{
public void dealWithObAFoo(ObA obA){
obA.getFoo().add("hello");
}
}
Yes, that will absolutely work - which is usually a bad thing. (This is because you're really returning a reference to the collection object, not a copy of the collection itself.)
Very often you want to provide genuinely read-only access to a collection, which usually means returning a read-only wrapper around the collection. Making the return type a read-only interface implemented by the collection and returning the actual collection reference doesn't provide much protection: the caller can easily cast to the "real" collection type and then add without any problems.
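To illustrate the casting problem, here is a small sketch (the class Leaky and its methods are made up). Narrowing the return type to Iterable doesn't help if the real list is returned, while a wrapper does:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Leaky {
    private final List<String> foo = new ArrayList<>();

    // Narrow return type, but it is still the real list underneath.
    public Iterable<String> getFoo() {
        return foo;
    }

    // Read-only wrapper: casting it back to List does not make it writable.
    public Iterable<String> getFooReadOnly() {
        return Collections.unmodifiableList(foo);
    }

    public int size() {
        return foo.size();
    }
}

class LeakyDemo {
    public static void main(String[] args) {
        Leaky leaky = new Leaky();
        List<String> sneaky = (List<String>) leaky.getFoo();
        sneaky.add("hello");               // succeeds: the private list was modified
        System.out.println(leaky.size());  // prints 1

        List<String> wrapped = (List<String>) leaky.getFooReadOnly();
        try {
            wrapped.add("again");
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only"); // the wrapper rejects the write
        }
    }
}
```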
Indeed, not a good idea. Do not publish your mutable members outside, make a copy if you cannot provide a read-only version on the fly...
public class ObA{
private List<String> foo;
public List<String> getFoo(){return Collections.unmodifiableList(foo);}
public void addString(String value) { foo.add(value); }
}
If you want an opinion about doing this, I'd remove the getFoo() call and add add(String msg) and remove(String msg) methods (or whatever other functionality you want to expose) to ObA.
Giving access to collection always seems to be a bad thing in my experience--mostly because they are virtually impossible to control once they get out. I've taken to the habit of NEVER allowing direct access to collections outside the class that contains them.
The main reasoning behind this is that there is almost always some sort of business logic attached to the collection of data--for instance, validation on addition or perhaps some day you'll need to add a second closely-related collection.
If you allow access like you are talking about, it will be very difficult in the future to make a modification like this.
Oh, also, I often find that I eventually have to store a little more data with the object I'm storing--so I create a new object (only known inside the "Container" that houses the collection) and I put the object inside that before putting it in the collection.
If you've kept your collection locked down, this is a trivial refactor. Try to imagine how difficult it would be in some case you've worked on where you didn't keep the collection locked down...
If you want to support add and remove operations on foo, I would suggest the methods addFoo() and removeFoo(). Ideally you could eliminate getFoo() altogether by creating a method for each piece of functionality you need. This makes it clear which operations a caller will perform on the list.
