Recently I have read some articles saying that methods with side effects are not good. So I want to ask whether my implementation here can be categorized as having a side effect.
Suppose I have a SecurityGuard which checks to see if he should allow a customer to go to the club or not.
The SecurityGuard has either a list of validNames or a list of invalidNames, not both.
If the SecurityGuard has only validNames, he only allows customers whose names are on the list.
If the SecurityGuard has only invalidNames, he only allows customers whose names are NOT on the list.
If the SecurityGuard has no lists at all, he allows everyone.
So to enforce this logic, in the setter of each list, I reset the other list if the new list has values.
class SecurityGuard {
private List<String> validNames = new ArrayList<>();
private List<String> invalidNames = new ArrayList<>();
public void setValidNames(List<String> newValidNames) {
this.validNames = new ArrayList<>(newValidNames);
// empty the invalidNames if newValidNames has values
if (!this.validNames.isEmpty()) {
this.invalidNames = new ArrayList<>();
}
}
public void setInvalidNames(List<String> newInvalidNames) {
this.invalidNames = new ArrayList<>(newInvalidNames);
// empty the validNames if newInvalidNames has values
if (!this.invalidNames.isEmpty()) {
this.validNames = new ArrayList<>(); //empty the validNames
}
}
public boolean allowCustomerToPass(String customerName) {
if (!validNames.isEmpty()) {
return validNames.contains(customerName);
}
return !invalidNames.contains(customerName);
}
}
So here you can see that the setter methods have an implicit action: each one resets the other list.
The question is: could what I'm doing here be considered a side effect? Is it bad enough that we have to change it? And if yes, how can I improve it?
Thanks in advance.
Well, setters themselves have side effects (A value in that instance is left modified after the function ends). So, no, I wouldn't consider it something bad that needs to be changed.
Imagine that the guard just had one SetAdmissionPolicy which accepted a reference to an AdmissionPolicy defined:
interface AdmissionPolicy {
boolean isAcceptable(String customerName);
}
and set the guard's admissionPolicy field to the passed-in reference. The guard's own allowCustomerToPass method simply called admissionPolicy.isAcceptable(customerName);.
Given the above definitions, one can imagine three classes that implement AdmissionPolicy: one would accept a list in its constructor, and isAcceptable would return true for everyone on the list, another would also accept a list in its constructor, but its isAcceptable would return true only for people not on the list. A third would simply return true unconditionally. If the club needs to close occasionally, one might also have a fourth implementation that returned false unconditionally.
Viewed in such a way, setInvalidNames and setValidNames could both be implemented as:
public void setAdmissionPolicyAdmitOnly(List<String> newValidNames) {
admissionPolicy = new AdmitOnlyPolicy(newValidNames);
}
public void setAdmissionPolicyAdmitAllBut(List<String> newInvalidNames) {
admissionPolicy = new AdmitAllButPolicy(newInvalidNames);
}
With such an implementation, it would be clear that each method was only "setting" one thing; such an implementation is how I would expect a class such as yours to behave.
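The policy classes described above could be sketched roughly as follows. This is only an illustration: AdmitEveryonePolicy, PolicyGuard and the constructor signatures are assumed names that do not appear in the question, and the defensive copies are a design choice rather than a requirement.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

interface AdmissionPolicy {
    boolean isAcceptable(String customerName);
}

// Admits only customers whose names are on the list.
final class AdmitOnlyPolicy implements AdmissionPolicy {
    private final Set<String> validNames;

    AdmitOnlyPolicy(Collection<String> validNames) {
        this.validNames = new HashSet<>(validNames); // defensive copy
    }

    @Override
    public boolean isAcceptable(String customerName) {
        return validNames.contains(customerName);
    }
}

// Admits everyone except customers whose names are on the list.
final class AdmitAllButPolicy implements AdmissionPolicy {
    private final Set<String> invalidNames;

    AdmitAllButPolicy(Collection<String> invalidNames) {
        this.invalidNames = new HashSet<>(invalidNames); // defensive copy
    }

    @Override
    public boolean isAcceptable(String customerName) {
        return !invalidNames.contains(customerName);
    }
}

// Hypothetical third policy: admits everyone unconditionally.
final class AdmitEveryonePolicy implements AdmissionPolicy {
    @Override
    public boolean isAcceptable(String customerName) {
        return true;
    }
}

// Hypothetical guard that only delegates to its current policy.
class PolicyGuard {
    private AdmissionPolicy admissionPolicy = new AdmitEveryonePolicy();

    void setAdmissionPolicy(AdmissionPolicy admissionPolicy) {
        this.admissionPolicy = admissionPolicy;
    }

    boolean allowCustomerToPass(String customerName) {
        return admissionPolicy.isAcceptable(customerName);
    }
}
```

Note that in this sketch an AdmitOnlyPolicy built from an empty list admits nobody, which differs from the original setters, where an empty list falls back to whatever the other list says.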
The behavior of your class as described, however, I would regard as dubious at best. The issue isn't so much that adding admitted items clears out the rejected items, but rather that the behavior when a passed-in list is empty depends upon the earlier state in a rather bizarre fashion. It's hardly intuitive that if everyone but Fred is allowed access, calling setValidNames to nothing should have no effect, but if it's set to only allow George access that same call should grant access to everyone. Further, while it would not be unexpected that setValidNames would remove from invalidNames anyone who was included in the valid-names list nor vice versa, given the way the functions are named, the fact that setting one list removes everyone from the other list is somewhat unexpected (the different behavior with empty lists makes it especially so).
It does not have a side effect in the strict sense, although developers generally assume that getters and setters contain no code beyond getting and setting the variable. So when another developer maintains the code, they will probably overlook the logic in your bean's setters and repeat the same checks elsewhere, which produces duplicated boilerplate code.
I'd not consider it as a side effect. You are maintaining the underlying assumptions of your object. I'm not sure it's the best design, but it's certainly a working one.
In this case I don't think changing the other list is a side effect, since the scope is within this class.
However, based on your description, maybe it is a better design to have one list (called nameList) and a boolean (isValid) that differentiates between a whitelist and a blacklist. This way it is clear that only one type of list can be filled at any time.
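A minimal sketch of that single-list idea, assuming a hypothetical class name SingleListGuard and a combined setter (neither is in the question). The empty-list check is an assumption made to keep the original rule that a guard with no names admits everyone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical rewrite of the guard using one list plus a mode flag.
class SingleListGuard {
    private List<String> nameList = new ArrayList<>();
    // true: nameList is a whitelist; false: nameList is a blacklist
    private boolean isValid = false;

    // Both pieces of state change together, so they cannot disagree.
    void setNames(List<String> newNames, boolean isValid) {
        this.nameList = new ArrayList<>(newNames); // defensive copy
        this.isValid = isValid;
    }

    boolean allowCustomerToPass(String customerName) {
        if (nameList.isEmpty()) {
            return true; // assumption: no names at all means everyone is admitted
        }
        // Whitelist: admit if listed. Blacklist: admit if not listed.
        return isValid == nameList.contains(customerName);
    }
}
```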
I think it's OK. E.g. if you want the list held by your class to be unmodifiable, the setter is the best place to enforce that:
public void setNames(List<String> names) {
this.names = names == null ? Collections.emptyList() : Collections.unmodifiableList(names);
}
In Java Concurrency in Practice chapter # 3 author has suggested not to share the mutable state. Further he has added that below code is not a good way to share the states.
class UnsafeStates {
private String[] states = new String[] {
"AK", "AL"
};
public String[] getStates() {
return states;
}
}
From the book:
Publishing states in this way is problematic because any caller can modify its contents. In this case, the states array has escaped its intended scope, because what was supposed to be private state has been effectively made public.
My question here is: we often use getters and setters to access class-level private mutable variables. If that is not the correct way, what is the correct way to share state? What is the proper way to encapsulate state?
For primitive types, int, float etc, using a simple getter like this does not allow the caller to set its value:
someObj.getSomeInt() = 10; // error!
However, with an array, you could change its contents from the outside, which might be undesirable depending on the situation:
someObj.getSomeArray()[0] = newValue; // perfectly fine
This could lead to problems where a field is unexpectedly changed by other parts of code, causing hard-to-track bugs.
What you can do instead, is to return a copy of the array:
public String[] getStates() {
return Arrays.copyOf(states, states.length);
}
This way, even if the caller changes the contents of the returned array, the array held by the object won't be affected.
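As a small check of that claim, here is a sketch with a hypothetical CopyingStates class (the name is made up for illustration) whose getter returns a copy:

```java
import java.util.Arrays;

// Hypothetical counterpart to UnsafeStates that hands out copies only.
class CopyingStates {
    private final String[] states = new String[] {"AK", "AL"};

    String[] getStates() {
        return Arrays.copyOf(states, states.length);
    }
}

class CopyingStatesDemo {
    public static void main(String[] args) {
        CopyingStates cs = new CopyingStates();
        cs.getStates()[0] = "VT"; // modifies only the returned copy
        System.out.println(Arrays.toString(cs.getStates())); // prints [AK, AL]
    }
}
```

A shallow copy is enough here because String elements are immutable; an array of mutable objects would need its elements copied as well.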
With what you have it is possible for someone to change the content of your private array just through the getter itself:
public static void main(String[] args) {
UnsafeStates us = new UnsafeStates();
us.getStates()[0] = "VT";
System.out.println(Arrays.toString(us.getStates()));
}
Output:
[VT, AL]
If you want to encapsulate your States and make it so they cannot change then it might be better to make an enum:
public enum SafeStates {
AR,
AL
}
Creating an enum gives a couple of advantages. It restricts callers to the exact values you define. The values can't be modified, they are easy to test against, and you can easily switch on them. The only downside of an enum is that the values have to be known ahead of time, i.e. you code them in; they cannot be created at run time.
This question seems to be asked with respect to concurrency in particular.
Firstly, of course, there is the possibility of modifying non-primitive objects obtained via simple-minded getters; as others have pointed out, this is a risk even with single-threaded programs. The way to avoid this is to return a copy of an array, or an unmodifiable instance of a collection: see for example Collections.unmodifiableList.
However, for programs using concurrency, there is risk of returning the actual object (i.e., not a copy) even if the caller of the getter does not attempt to modify the returned object. Because of concurrent execution, the object could change "while he is looking at it", and in general this lack of synchronization could cause the program to malfunction.
It's difficult to turn the original getStates example into a convincing illustration of my point, but imagine a getter that returns a Map instead. Inside the owning object, correct synchronization may be implemented. However, a getTheMap method that returns just a reference to the Map is an invitation for the caller to call Map methods (even if just map.get) without synchronization.
There are basically two options to avoid the problem: (1) return a deep copy; an unmodifiable wrapper will not suffice in this case, and it should be a deep copy otherwise we just have the same problem one layer down, or (2) do not return unmediated references; instead, extend the method repertoire to provide exactly what is supportable, with correct internal synchronization.
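Option (2) might look something like the following sketch. CapitalRegistry and its methods are hypothetical names; the point is only that no method hands out the internal Map, and every access goes through a synchronized method.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class: the map never escapes, all access is mediated.
class CapitalRegistry {
    private final Map<String, String> capitals = new HashMap<>();

    synchronized void put(String state, String capital) {
        capitals.put(state, capital);
    }

    synchronized String capitalOf(String state) {
        return capitals.get(state);
    }

    // Returns an independent copy taken under the lock. A shallow copy
    // suffices here only because String keys and values are immutable.
    synchronized Map<String, String> snapshot() {
        return new HashMap<>(capitals);
    }
}
```

Callers that need to iterate can work on the snapshot without holding any lock, at the cost of seeing a possibly stale view.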
In Java, assume you have a data object named object with an attribute bar that you need to set to a value returned from a complex operation done in an external source. Assume you have a method sendRequestToExternalSource that sends a request based on 'object' to the external source and gets an object back holding (among other things) the needed value.
Which one of these ways to set the value is the better practice?
void main(MyObject object) {
String bar = sendRequestToExternalSource(object);
object.setBar(bar);
}
String sendRequestToExternalSource(MyObject object) {
// Send request to external source
Object response = postToExternalSource(object);
//Do some validation and logic based on response
...
//Return only the attribute we are interested in
return response.getBar();
}
or
void main(MyObject object) {
sendRequestToExternalSourceAndUpdateObject(object);
}
void sendRequestToExternalSourceAndUpdateObject(MyObject object) {
// Send request to external source
Object response = postToExternalSource(object);
//Do some validation and logic based on response
...
//Set the attribute on the input object
object.setBar(response.getBar());
}
I know they both work, but what is the best practice?
It depends on a specific scenario. Side-effects are not bad practice but there are also scenarios where a user simply won't expect them.
In any case your documentation of such a method should clearly state if you manipulate arguments. The user must be informed about that since it's his object that he passes to your method.
Note that there are various examples where side-effects intuitively are to be expected and that's also totally fine. For example Collections#sort (documentation):
List<Integer> list = ...
Collections.sort(list);
However if you write a method like intersection(Set, Set) then you would expect the result being a new Set, not for example the first one. But you can rephrase the name to intersect and use a structure like Set#intersect(Set). Then the user would expect a method with void as return type where the resulting Set is the Set the method was invoked on.
Another example would be Set#add. You would expect that the method inserts your element and not a copy of it. And that is also what it does. It would be confusing for people if it instead creates copies. They would need to call it differently then, like CloneSet or something like that.
In general I would tend to giving the advice to avoid manipulating arguments. Except if side-effects are to be expected by the user, as seen in the example. Otherwise the risk is too high that you confuse the user and thus create nasty bugs.
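To make the naming point concrete, here is a sketch contrasting the two styles. SetOps.intersection is a hypothetical helper; the in-place counterpart that the JDK actually provides on Set is retainAll.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SetOps {
    // Hypothetical helper: returns a new set, leaves both arguments untouched.
    static <T> Set<T> intersection(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }
}

class SetOpsDemo {
    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> b = new HashSet<>(Arrays.asList(2, 3, 4));

        Set<Integer> common = SetOps.intersection(a, b);
        System.out.println(a.size());      // prints 3: a is unchanged
        System.out.println(common.size()); // prints 2

        a.retainAll(b); // in-place: the receiver itself is modified
        System.out.println(a.size());      // prints 2
    }
}
```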
I would choose the first one if I had only these two choices. The reason is the "S" in the SOLID principles, single responsibility. I think it is not the job of the doComplicatedStuff method to set a new or enriched value of bar on the MyObject instance.
Of course I don't know the use case you are trying to implement, but I suggest looking at the decorator pattern to modify the MyObject instance.
I personally prefer the variant barService.doComplicatedStuff(object); because I avoid making copies
As the title suggests: which approach should we prefer?
The intention is to pass a few method parameters and get something as output. Alternatively, we can pass another parameter that the method updates; the method then need not return anything, since it just updates the output variable and the change is visible to the caller.
I am just trying to frame the question through this example.
List<String> result = new ArrayList<String>();
for (int i = 0; i < SOME_NUMBER_N; i++) {
fun(SOME_COLLECTION.get(i), result);
}
// in some other class
public void fun(String s, List<String> result) {
// populates result
}
versus
List<String> result = new ArrayList<String>();
for (int i = 0; i < SOME_NUMBER_N; i++) {
List<String> subResult = fun(SOME_COLLECTION.get(i));
// merges subResult into result
mergeLists(result, subResult);
}
// in some other class
public List<String> fun(String s) {
List<String> res = new ArrayList<String>();
// some processing to populate res
return res;
}
I understand that one passes the reference and another doesn't.
Which one should we prefer (in different situations) and why?
Update: Consider it only for mutable objects.
Returning a value from the function is generally a cleaner way of writing code. Passing a value and modifying it is more C/C++ style due to the nature of creating and destroying pointers.
Developers generally don't expect that their values will be modified by passing it through a function, unless the function explicitly states it modifies the value (and we often skim documentation anyway).
There are exceptions though.
Consider the example of Collections.sort, which does actually do an in place sort of a list. Imagine a list of 1 million items and you are sorting that. Maybe you don't want to create a second list that has another 1 million entries (even though these entries are pointing back to the original).
It is also good practice to favor having immutable objects. Immutable objects cause far fewer problems in most aspects of development (such as threading). So by returning a new object, you are not forcing the parameter to be mutable.
The important part is to be clear about your intentions in the methods. My recommendation is to avoid modifying the parameter when possible, since it is not the most typical behavior in Java.
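The point about not forcing the parameter to be mutable can be illustrated with a short sketch. Splitter, splitInto and split are hypothetical names, and splitting on commas is just a stand-in for "some processing".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Splitter {
    // Output-parameter style: only works if the caller supplies a mutable list.
    static void splitInto(String s, List<String> out) {
        for (String part : s.split(",")) {
            out.add(part);
        }
    }

    // Return style: the method owns the list it builds.
    static List<String> split(String s) {
        List<String> res = new ArrayList<>();
        for (String part : s.split(",")) {
            res.add(part);
        }
        return res;
    }
}

class SplitterDemo {
    public static void main(String[] args) {
        System.out.println(Splitter.split("a,b")); // prints [a, b]

        List<String> readOnly = Collections.emptyList();
        try {
            Splitter.splitInto("a,b", readOnly);
        } catch (UnsupportedOperationException e) {
            // The immutable empty list rejects add(), so the call fails at run time.
            System.out.println("output parameter must be mutable");
        }
    }
}
```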
You should return it. The second example you provided is the way to go.
First of all, it's clearer. When other people read your code, there's no gotcha where they fail to notice that a parameter is being modified as output. You can try to signal it through variable names, but for code readability, returning is preferable.
The BIG reason why you should return it rather than pass it, is with immutable objects.
Your example, the List, is mutable, so it works okay.
But if you were to try to use a String that way, it would not work.
As strings are immutable, if you pass a string in as a parameter, and then the function were to say:
public void fun(String result){
result = "new string";
}
The value of result that you passed in would not be altered. Instead, the local scope variable 'result' now points to a new string inside of fun, but the result in your calling method still points to the original string.
If you called:
String test = "test";
fun(test);
System.out.println(test);
It will print: "test", not "new string"!
So definitely, it is superior to return. :)
This is more about best practices and your own method to program. I would say if you know this is going to be a one value return type function like:
function IsThisNumberAPrimeNumber{ }
Then you know that this is only going to ever return a boolean. I usually use functions as helper programs and not as large sub procedures. I also apply naming conventions that help dictate what I expect the sub\function will return.
Examples:
GetUserDetailsRecords
GetUsersEmailAddress
IsEmailRegistered
If you look at those 3 names, you can tell the first is going to give you some list or class of multiple user detail records, the second will give you a string value of a email and the third will likely give you a boolean value. If you change the name, you change the meaning, so I would say consider this in addition.
I don't think the question is clear, because those are two totally different kinds of actions. Passing a variable to a function is a means of giving a function data. Returning it from the function is a way of passing data out of a function.
If you mean the difference between these two actions:
public void doStuff(int change) {
change = change * 2;
}
and
public void doStuff() {
int change = changeStorage.acquireChange();
change = change * 2;
}
Then the second is generally cleaner; however, there are several reasons (security, function visibility, etc.) that can prevent you from passing data this way.
It's also preferable because it makes reusing code easier, as well as making it more modular.
According to the recommendations above, the Java code conventions, and the limits of the syntax, this is a bad idea and makes code harder to understand.
BUT you can do it by implementing a reference holder class:
public class ReferenceHolder<T>{
public T value;
}
and pass a ReferenceHolder object as a method parameter to be filled or modified by the method.
On the other side, the method must assign its result to the holder's value field instead of returning it.
Here is code that gets the result of an average method through a ReferenceHolder instead of a return value:
public class ReferenceHolderTest {
public static void main(String[] args) {
ReferenceHolder<Double> out = new ReferenceHolder<>();
average(new int[]{1,2,3,4,5,6,7,8},out);
System.out.println(out.value);
}
public static void average(int[] x, ReferenceHolder<Double> out ) {
int sum=0;
for (int a : x) {
sum+=a;
}
out.value=sum/(double)x.length;
}
}
Returning it will keep your code cleaner and cause less coupling between methods/classes.
It is generally preferable to return it.
Especially from a unit testing standpoint. If you are unit testing, it is easier to assert on a value returned from a method than to verify that your object was modified or interacted with correctly. (Using ArgumentCaptor or ArgumentMatcher to assert interactions isn't as straightforward as a simple return assertion.)
Increased code readability. If I see a method that takes 5 object parameters, I might have no immediate way of knowing you plan on modifying one of those references for future use downstream. If you are instead returning an object, I can easily see that you ultimately care about the result of that method's computation.
I have this class:
public class MyClass {
public void initialize(Collection<String> data) {
this.data = data; // <-- Bad!
}
private Collection<String> data;
}
This is obviously bad style, because I'm introducing a shared mutable state. What's the preferred way to handle this?
Ignore it?
Clone the collection?
...?
EDIT: To clarify why this is bad, imagine this:
MyClass myObject = new MyClass();
List<String> data = new ArrayList<String>();
myObject.initialize(data); // myObject.data.size() == 0
data.add("Test"); // myObject.data.size() == 1
Just storing the reference poses a way to inject data into the private field myObject.data, although it should be completely private.
Depending on the nature of MyClass this could have serious impacts.
The best way is to deep clone the parameter. For performance reasons, this is usually not possible. On top of that, not all objects can be cloned, so deep copying might throw exceptions and cause all kinds of headache.
The next best way would be a "copy-on-write" clone. There is no support for this in the Java runtime.
If you think that it's possible someone mutates the collection, do a shallow copy using the copy constructor:
this.data = new HashSet<String> (data);
This will solve your problem (since String is immutable) but it will fail when the type in the set is mutable.
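A short sketch of where the shallow copy stops helping, using StringBuilder as a stand-in for any mutable element type (the class name ShallowCopyDemo is made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;

class ShallowCopyDemo {
    public static void main(String[] args) {
        List<StringBuilder> original = new ArrayList<>();
        original.add(new StringBuilder("Test"));

        // Shallow copy: a new list, but the same element objects.
        List<StringBuilder> copy = new ArrayList<>(original);

        original.add(new StringBuilder("Other")); // not visible through copy
        original.get(0).append("!");              // visible: shared element

        System.out.println(copy.size()); // prints 1
        System.out.println(copy.get(0)); // prints Test!
    }
}
```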
Another solution is to always make the sets immutable as soon as you store them somewhere:
Set<String> set = ...
...build the set...
// Freeze the set
set = Collections.unmodifiableSet(set);
// Now you can safely pass it elsewhere
obj.setData (set);
The idea here is turn collections into "value objects" as soon as possible. Anyone who wants to change the collection must copy it, change it and then save it back.
Within a class, you can keep the set mutable and wrap it in the getter (which you should do anyway).
Problems with this approach: Performance (but it's probably not as bad as you'd expect) and discipline (breaks if you forget it somewhere).
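One way to combine the two ideas (copy on the way in, wrap on the way out) is sketched below. DataHolder and its method names are hypothetical, not taken from the question.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical holder: mutable inside, read-only view outside.
class DataHolder {
    private Set<String> data = new HashSet<>();

    void setData(Collection<String> newData) {
        // Copy so that later changes to the caller's collection do not leak in.
        this.data = new HashSet<>(newData);
    }

    Set<String> getData() {
        // Read-only view: callers cannot add or remove through it.
        return Collections.unmodifiableSet(data);
    }
}
```

The view returned by getData still reflects later changes made inside the class, so callers that need a stable snapshot would have to copy it themselves.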
Null check (if you want to restrict null)
Either defensive copy (if you don't want shared state)
or as you did (if a live view on data is useful)
Depends heavily on your requirements.
Edited:
Ignoring should be no option. Silent fail is, well... a debugging nightmare.
public class Foo {
private final Collection collection = new ArrayList();
public void initialise(final Collection collection) {
this.collection.addAll(collection);
}
}
Sorry for not addressing your concern directly, but I would never directly pass a Collection to a setXxx() bean setter method. Instead, I would do:
private final List<MyClass> theList;
public void addXxx(MyClass item) { ... }
public void removeXxx(MyClass item) { ... } // or index.
public Iterator<MyClass> iterateXxx() {
return Collections.unmodifiableList(theList).iterator();
}
I would go for defensive copying / deep cloning only if I am sure there would be no side effects from using it, and as for the speed, I wouldn't concern myself with it, since in business applications reliability has 10 times more priority than speed. ;-)
An idea will be to pass the data as a String array and create the Set inside MyClass. Of course MyClass should test that the input data is valid. I believe that this is a good practice anyway.
If both the caller of MyClass and MyClass itself actually work with a Set<String>, then you could consider cloning the collection. The Set however needs to be constructed somehow. I would prefer to move this responsibility to MyClass.
Since arguments sent to a method in Java point to the original data structures in the caller method, did its designers intend for them to be used for returning multiple values, as is the norm in other languages like C?
Or is this a hazardous misuse of Java's general property that variables are pointers ?
A long time ago I had a conversation with Ken Arnold (one time member of the Java team), this would have been at the first Java One conference probably, so 1996. He said that they were thinking of adding multiple return values so you could write something like:
x, y = foo();
The recommended way of doing it back then, and now, is to make a class that has multiple data members and return that instead.
Based on that, and other comments made by people who worked on Java, I would say the intent is/was that you return an instance of a class rather than modify the arguments that were passed in.
This is common practice (as is the desire by C programmers to modify the arguments; eventually they usually come to see the Java way of doing it). Just think of it as returning a struct. :-)
(Edit based on the following comment)
I am reading a file and generating two arrays, of type String and int from it, picking one element for both from each line. I want to return both of them to any function which calls it with a file to split this way.
I think, if I am understanding you correctly, that I would probably do something like this:
// could go with the Pair idea from another post, but I personally don't like that way
class Line
{
// would use appropriate names
private final int intVal;
private final String stringVal;
public Line(final int iVal, final String sVal)
{
intVal = iVal;
stringVal = sVal;
}
public int getIntVal()
{
return (intVal);
}
public String getStringVal()
{
return (stringVal);
}
// equals/hashCode/etc... as appropriate
}
and then have your method like this:
public void foo(final File file, final List<Line> lines)
{
// add to the List.
}
and then call it like this:
{
final List<Line> lines;
lines = new ArrayList<Line>();
foo(file, lines);
}
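For comparison, the same data could also be handed back as a return value, in line with the "return an instance of a class" advice earlier in this answer. The sketch below is an assumption-laden variant: ParsedLine is a compact stand-in for the Line class, LineParser.parse is a hypothetical method, and the input is taken as already-read lines of the form "<int> <text>" rather than a File.

```java
import java.util.ArrayList;
import java.util.List;

// Compact stand-in for the Line class: one int and one String per line.
final class ParsedLine {
    final int intVal;
    final String stringVal;

    ParsedLine(int intVal, String stringVal) {
        this.intVal = intVal;
        this.stringVal = stringVal;
    }
}

class LineParser {
    // Assumed format: an integer, whitespace, then the rest of the line as text.
    // Lines that do not match this format will throw an exception.
    static List<ParsedLine> parse(List<String> rawLines) {
        List<ParsedLine> result = new ArrayList<>();
        for (String raw : rawLines) {
            String[] parts = raw.trim().split("\\s+", 2);
            result.add(new ParsedLine(Integer.parseInt(parts[0]), parts[1]));
        }
        return result;
    }
}
```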
In my opinion, if we're talking about a public method, you should create a separate class representing a return value. When you have a separate class:
it serves as an abstraction (i.e. a Point class instead of array of two longs)
each field has a name
can be made immutable
makes evolution of API much easier (i.e. what about returning 3 instead of 2 values, changing type of some field etc.)
I would always opt for returning a new instance, instead of actually modifying a value passed in. It seems much clearer to me and favors immutability.
On the other hand, if it is an internal method, I guess any of the following might be used:
an array (new Object[] { "str", longValue })
a list (note that Arrays.asList(...) returns a fixed-size list backed by the array, not an immutable one: set() still works)
pair/tuple class, such as this
static inner class, with public fields
Still, I would prefer the last option, equipped with a suitable constructor. That is especially true if you find yourself returning the same tuple from more than one place.
I do wish there was a Pair<E,F> class in the JDK, mostly for this reason. There is Map.Entry<K,V>, but creating an instance was always a big pain.
Now I use com.google.common.collect.Maps.immutableEntry when I need a Pair
See this RFE launched back in 1999:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4222792
I don't think the intention was to ever allow it in the Java language, if you need to return multiple values you need to encapsulate them in an object.
Using languages like Scala however you can return tuples, see:
http://www.artima.com/scalazine/articles/steps.html
You can also use Generics in Java to return a pair of objects, but that's about it AFAIK.
EDIT: Tuples
Just to add some more on this. I've previously implemented a Pair in projects because of the lack within the JDK. Link to my implementation is here:
http://pbin.oogly.co.uk/listings/viewlistingdetail/5003504425055b47d857490ff73ab9
Note, there isn't a hashcode or equals on this, which should probably be added.
I also came across this whilst doing some research into this questions which provides tuple functionality:
http://javatuple.com/
It allows you to create Pair including other types of tuples.
You cannot truly return multiple values, but you can pass objects into a method and have the method mutate those values. That is perfectly legal. Note that you cannot pass an object in and have the object itself become a different object. That is:
private void myFunc(Object a) {
a = new Object();
}
will result in temporarily and locally changing the value of a, but this will not change the value of the caller, for example, from:
Object test = new Object();
myFunc(test);
After myFunc returns, you will have the old Object and not the new one.
Legal (and often discouraged) is something like this:
private void changeDate(final Date date) {
date.setTime(1234567890L);
}
I picked Date for a reason. This is a class that people widely agree should never have been mutable. The method above will change the internal value of any Date object that you pass to it. This kind of code is legal when it is very clear that the method will mutate or configure or modify what is being passed in.
NOTE: Generally, it's said that a method should do one of these things:
Return void and mutate its incoming objects (like Collections.sort()), or
Return some computation and don't mutate incoming objects at all (like Collections.min()), or
Return a "view" of the incoming object but do not modify the incoming object (like Collections.checkedList() or Collections.singleton())
Mutate one incoming object and return it (Collections doesn't have an example, but StringBuilder.append() is a good example).
Methods that mutate incoming objects and return a separate return value are often doing too many things.
There are certainly methods that modify an object passed in as a parameter (see java.io.InputStream.read(byte[] b) as an example), but I have not seen parameters used as an alternative for a return value, especially with multiple parameters. It may technically work, but it is nonstandard.
It's not generally considered terribly good practice, but there are very occasional cases in the JDK where this is done. Look at the 'biasRet' parameter of View.getNextVisualPositionFrom() and related methods, for example: it's actually a one-dimensional array that gets filled with an "extra return value".
So why do this? Well, just to save you having to create an extra class definition for the "occasional extra return value". It's messy, inelegant, bad design, non-object-oriented, blah blah. And we've all done it from time to time...
Generally what Eddie said, but I'd add one more:
Mutate one of the incoming objects, and return a status code. This should generally only be used for arguments that are explicitly buffers, like Reader.read(char[] cbuf).
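The buffer pattern looks like this in practice. BufferReadDemo and readAll are hypothetical names, and the tiny buffer size is chosen only to force several reads; Reader.read(char[]) itself fills the buffer and returns the number of characters read, or -1 at end of stream.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class BufferReadDemo {
    // Drains a reader using a reusable buffer.
    static String readAll(Reader reader) throws IOException {
        char[] cbuf = new char[4]; // deliberately small for illustration
        StringBuilder sb = new StringBuilder();
        int n;
        // read() mutates cbuf and returns how many chars it wrote, or -1 at the end.
        while ((n = reader.read(cbuf)) != -1) {
            sb.append(cbuf, 0, n); // only the first n chars are valid
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAll(new StringReader("hello world"))); // prints hello world
    }
}
```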
I had a Result object that cascades through a series of validating void methods as a method parameter. Each of these validating void methods would mutate the result parameter object to add the result of the validation.
But this is impossible to test because now I cannot stub the void method to return a stub value for the validation in the Result object.
So, from a testing perspective, it appears that one should favor returning an object instead of mutating a method parameter.
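A rough sketch of what the return-based version might look like. All names here (Validator, NotBlankValidator, ValidationRunner) are hypothetical, and a list of error messages stands in for the Result object.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical contract: each check returns its findings
// instead of mutating a shared result parameter.
interface Validator {
    List<String> validate(String input);
}

class NotBlankValidator implements Validator {
    @Override
    public List<String> validate(String input) {
        return input.trim().isEmpty()
                ? Collections.singletonList("must not be blank")
                : Collections.<String>emptyList();
    }
}

class ValidationRunner {
    // Collects the returned findings of every validator, in order.
    static List<String> runAll(String input, List<Validator> validators) {
        List<String> errors = new ArrayList<>();
        for (Validator v : validators) {
            errors.addAll(v.validate(input));
        }
        return errors;
    }
}
```

Because each validator returns a value, a test can stub one with a one-line lambda such as `in -> Collections.singletonList("stubbed")` and assert directly on what runAll returns.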