When programming in Java I practically always, just out of habit, write something like this:
public List<String> foo() {
    return new ArrayList<String>();
}
Most of the time without even thinking about it. Now, the question is: should I always specify the interface as the return type? Or is it advisable to use the actual implementation of the interface, and if so, under what circumstances?
It is obvious that using the interface has a lot of advantages (that's why it's there). In most cases it doesn't really matter what concrete implementation is used by a library function. But maybe there are cases where it does matter. For instance, if I know that I will primarily access the data in the list randomly, a LinkedList would be bad. But if my library function only returns the interface, I simply don't know. To be on the safe side I might even need to copy the list explicitly over to an ArrayList:
List<String> bar = foo();
List<String> myList = bar instanceof LinkedList ? new ArrayList<String>(bar) : bar;
but that just seems horrible and my coworkers would probably lynch me in the cafeteria. And rightfully so.
What do you guys think? What are your guidelines, when do you tend towards the abstract solution, and when do you reveal details of your implementation for potential performance gains?
Return the appropriate interface to hide implementation details. Your clients should only care about what your object offers, not how you implemented it. If you start with a private ArrayList and decide later on that something else (e.g., LinkedList, skip list, etc.) is more appropriate, you can change the implementation without affecting clients if you return the interface. The moment you return a concrete type, the opportunity is lost.
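A minimal sketch of that point, assuming a hypothetical NameRegistry class (the class and its methods are made up for illustration and are not from the question): since callers only see List, the backing collection can be swapped in one place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical library class; the name and methods are assumptions for illustration.
class NameRegistry {
    // Swapping this for "new LinkedList<String>()" would not change any caller,
    // because callers only ever see the List interface.
    private final List<String> names = new ArrayList<String>();

    void register(String name) {
        names.add(name);
    }

    // The declared return type is the interface, so the concrete class stays private.
    List<String> names() {
        return new ArrayList<String>(names); // defensive copy of the internal list
    }

    public static void main(String[] args) {
        NameRegistry registry = new NameRegistry();
        registry.register("alice");
        registry.register("bob");
        List<String> result = registry.names();
        System.out.println(result);
    }
}
```

Had names() been declared to return ArrayList<String>, any caller that stored the result in an ArrayList variable would have to change along with the library.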
For instance, if I know that I will primarily access the data in the list randomly, a LinkedList would be bad. But if my library function only returns the interface, I simply don't know. To be on the safe side I might even need to copy the list explicitly over to an ArrayList.
As everybody else has mentioned, you shouldn't care how the library has implemented the functionality; that reduces coupling and increases the maintainability of the library.
If you, as a library client, can demonstrate that the implementation is performing badly for your use case, you can then contact the person in charge and discuss the best path to follow (a new method for this case or just changing the implementation).
That said, your example reeks of premature optimization.
If the method is, or could become, performance-critical, its documentation could mention the implementation details.
Without being able to justify it with reams of CS quotes (I'm self taught), I've always gone by the mantra of "Accept the least derived, return the most derived," when designing classes and it has stood me well over the years.
I guess what that means in terms of interface versus concrete return types is that if you are trying to reduce dependencies and/or decouple, returning the interface is generally more useful. However, if the concrete class implements more than that interface, it is usually more useful to the callers of your method to get the concrete class back (i.e. the "most derived") rather than arbitrarily restricting them to a subset of that returned object's functionality - unless you actually need to restrict them. Then again, you could also just increase the coverage of the interface. I compare needless restrictions like this to thoughtless sealing of classes; you never know. Just to talk a bit about the former part of that mantra (for other readers), accepting the least derived also gives maximum flexibility for callers of your method.
-Oisin
Sorry to disagree, but I think the basic rule is as follows:
For input arguments use the most generic.
For output values, the most specific.
So, in this case you want to declare the implementation as:
public ArrayList<String> foo() {
    return new ArrayList<String>();
}
Rationale:
The input case is already known and explained by everyone: use the interface, period. However, the output case can look counter-intuitive.
You want to return the implementation because you want the client to have the most information about what it is receiving. In this case, more knowledge is more power.
Example 1: the client wants to get the 5th element:
return Collection: must iterate until the 5th element
return List: list.get(4)
Example 2: the client wants to remove the 5th element:
return List: must create a new list without the specified element (List.remove(int) is an optional operation).
return ArrayList: arrayList.remove(4)
So it's a big truth that using interfaces is great because it promotes reusability, reduces coupling, improves maintainability and makes people happy ... but only when used as input.
So, again, the rule can be stated as:
Be flexible for what you offer.
Be informative with what you deliver.
So, next time, please return the implementation.
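The two examples above can be sketched as follows. This is only an illustration: ReturnTypeDemo and its helper methods are made-up names, and Arrays.asList is used as one well-known case of a List whose remove is not supported.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

// Illustrative only; the class and method names are assumptions.
class ReturnTypeDemo {
    // Example 1 with a Collection: no positional access, so walk the iterator.
    static String fifthViaIterator(Collection<String> items) {
        Iterator<String> it = items.iterator();
        for (int i = 0; i < 4; i++) {
            it.next(); // skip the first four elements
        }
        return it.next();
    }

    // Example 1 with a List: positional access is part of the interface.
    static String fifthViaIndex(List<String> items) {
        return items.get(4);
    }

    // Example 2: remove(int) is optional on List, so an implementation may reject it.
    static boolean tryRemoveFifth(List<String> items) {
        try {
            items.remove(4);
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> fixedSize = Arrays.asList("a", "b", "c", "d", "e");
        ArrayList<String> growable = new ArrayList<String>(fixedSize);
        System.out.println(fifthViaIterator(fixedSize));
        System.out.println(fifthViaIndex(fixedSize));
        System.out.println(tryRemoveFifth(fixedSize)); // the fixed-size list rejects removal
        System.out.println(tryRemoveFifth(growable));  // ArrayList supports removal
    }
}
```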
In OO programming, we want to encapsulate the data as much as possible: hide as much of the actual implementation as possible, abstracting the types as high as possible.
In this context, I would answer: only return what is meaningful. Does it make sense at all for the return value to be the concrete class? In your example, ask yourself: will anyone use a LinkedList-specific method on the return value of foo?
If no, just use the higher-level interface. It's much more flexible, and allows you to change the backend.
If yes, ask yourself: can't I refactor my code to return the higher-level interface? :)
The more abstract your code is, the fewer changes you are required to make when changing a backend. It's as simple as that.
If, on the other hand, you end up casting the return values to the concrete class, well, that's a strong sign that you should probably return the concrete class instead. Your users/teammates should not have to know about more or less implicit contracts: if you need to use the concrete methods, just return the concrete class, for clarity.
In a nutshell: code abstract, but explicitly :)
In general, for a public facing interface such as APIs, returning the interface (such as List) over the concrete implementation (such as ArrayList) would be better.
The use of an ArrayList or LinkedList is an implementation detail of the library that should be chosen for the most common use case of that library. And of course, internally, having private methods handing off LinkedLists wouldn't necessarily be a bad thing, if it provides facilities that would make the processing easier.
There is no reason that a concrete class shouldn't be used in the implementation, unless there is a good reason to believe that some other List class would be used later on. But then again, changing the implementation details shouldn't be as painful as long as the public facing portion is well-designed.
The library itself should be a black box to its consumers, so they don't really have to worry about what's going on internally. That also means that the library should be designed so that it can be used in the way it is intended.
It doesn't matter all that much whether an API method returns an interface or a concrete class; despite what everyone here says, you almost never change the implementation class once the code is written.
What's far more important: always use minimum-scope interfaces for your method parameters! That way, clients have maximal freedom and can use classes your code doesn't even know about.
When an API method returns ArrayList, I have absolutely no qualms with that, but when it demands an ArrayList (or, all too common, Vector) parameter, I consider hunting down the programmer and hurting him, because it means that I can't use Arrays.asList(), Collections.singletonList() or Collections.EMPTY_LIST.
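A small sketch of why the parameter type matters, assuming a made-up totalLength method: because it accepts List rather than ArrayList, all of the convenience factories mentioned above can be passed straight in. Collections.emptyList() is used here as the typed counterpart of Collections.EMPTY_LIST.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only; ParameterDemo and totalLength are assumed names.
class ParameterDemo {
    // Accepting the interface lets callers pass any List implementation.
    static int totalLength(List<String> words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }

    public static void main(String[] args) {
        // None of these three arguments is an ArrayList (or a Vector).
        System.out.println(totalLength(Arrays.asList("ab", "cde")));
        System.out.println(totalLength(Collections.singletonList("abcd")));
        System.out.println(totalLength(Collections.<String>emptyList()));
    }
}
```

If the parameter were declared as ArrayList<String>, each of those calls would need an extra copy such as new ArrayList<String>(...) at the call site.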
As a rule, I only pass back internal implementations if I am in some private, inner workings of a library, and even so only sparingly. For everything that is public and likely to be called from the outside of my module I use interfaces, and also the Factory pattern.
Using interfaces in such a way has proven to be a very reliable way to write reusable code.
The main question has been answered already and you should always use the interface. However, I would just like to comment on this:
It is obvious that using the interface has a lot of advantages (that's why it's there). In most cases it doesn't really matter what concrete implementation is used by a library function. But maybe there are cases where it does matter. For instance, if I know that I will primarily access the data in the list randomly, a LinkedList would be bad. But if my library function only returns the interface, I simply don't know. To be on the safe side I might even need to copy the list explicitly over to an ArrayList.
If you are returning a data structure that you know has poor random access performance -- O(n) and typically a LOT of data -- there are other interfaces you should be specifying instead of List, like Iterable, so that anyone using the library will be fully aware that only sequential access is available.
Picking the right type to return isn't just about interface versus concrete implementation; it is also about selecting the right interface.
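As a rough sketch of that idea, assume a hypothetical EventLog class backed by a LinkedList. Declaring the return type as Iterable tells callers that traversal is the only supported operation, so nobody is tempted to call get(i) in a loop.

```java
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

// Hypothetical class; the name and methods are assumptions for illustration.
class EventLog {
    // Linked structure: cheap appends, but O(n) positional access.
    private final List<String> entries = new LinkedList<String>();

    void append(String entry) {
        entries.add(entry);
    }

    // Iterable advertises sequential traversal and nothing more.
    Iterable<String> entries() {
        return Collections.unmodifiableList(entries);
    }

    public static void main(String[] args) {
        EventLog log = new EventLog();
        log.append("start");
        log.append("stop");
        for (String entry : log.entries()) {
            System.out.println(entry);
        }
    }
}
```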
You use an interface to abstract away from the actual implementation. The interface is basically just a blueprint for what your implementation can do.
Interfaces are good design because they allow you to change implementation details without having to fear that any of their consumers are directly affected, as long as your implementation still does what your interface says it does.
To work with interfaces you would instantiate them like this:
IParser parser = new Parser();
Now IParser would be your interface, and Parser would be your implementation. Now when you work with the parser object from above, you will work against the interface (IParser), which in turn will work against your implementation (Parser).
That means that you can change the inner workings of Parser as much as you want; it will not affect code that works against your IParser parser interface.
In general, use the interface in all cases if you have no need of the functionality of the concrete class. Note that for lists, Java has added the RandomAccess marker interface, primarily to distinguish a common case where an algorithm may need to know whether get(i) is constant time or not.
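For what it's worth, the check from the question could be written against java.util.RandomAccess instead of a specific class. This is a minimal sketch; RandomAccessDemo and forIndexedAccess are made-up names.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

// Illustrative only; the class and method names are assumptions.
class RandomAccessDemo {
    // Returns the list itself if it advertises fast indexed access,
    // otherwise copies it into an ArrayList (which implements RandomAccess).
    static <T> List<T> forIndexedAccess(List<T> list) {
        if (list instanceof RandomAccess) {
            return list;
        }
        return new ArrayList<T>(list);
    }

    public static void main(String[] args) {
        List<String> linked = new LinkedList<String>();
        linked.add("a");
        linked.add("b");
        List<String> indexed = forIndexedAccess(linked);
        System.out.println(indexed instanceof RandomAccess);
    }
}
```

Unlike testing for LinkedList, this keeps working for list implementations the caller has never heard of, as long as they declare the marker correctly.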
For callers of your code, Michael above is right that being as generic as possible in the method parameters is often even more important. This is especially true when testing such a method.
You'll find (or have found) that as you return interfaces, they permeate through your code. For example, you return an interface from method A and then have to pass an interface to method B.
What you're doing is programming by contract, albeit in a limited fashion.
This gives you enormous scope to change implementations under the covers (provided these new objects fulfill the existing contracts/expected behaviours).
Given all of this, you have benefits in terms of choosing your implementation, and how you can substitute behaviours (including testing - using mocking, for example). In case you hadn't guessed, I'm all in favour of this and try to reduce to (or introduce) interfaces wherever possible.
I was told at some stage at university (and have subsequently read in umpteen places) that instanceof should only be used as a 'last resort'. With this in mind, is anyone able to tell me whether the following code is such a last resort? I have had a look around on Stack Overflow but cannot quite find a similar scenario - perhaps I have missed it?
private void allocateUITweenManager() {
    for (GameObject go : mGameObjects) {
        if (go instanceof GameGroup) ((GameGroup) go).setUITweenManager(mUITweenManager);
    }
}
where
mGameObjects is an array, only some of which are GameGroup type
GameGroup is a subclass of abstract class GameObject.
GameGroup implements interface UITweenable, which has method setUITweenManager()
GameObject does not implement interface UITweenable
I suppose I could equally (and probably should) replace GameGroup in my code above with UITweenable - I would be asking the same question.
Is there another way of doing this that avoids the instanceof? This code cannot fail, as such (I think, right?), but given the bad press instanceof seems to get, have I committed some cardinal sin of OOP somewhere along the line that has me using instanceof here?
Thanks in advance!
I learned about the Visitor pattern in a compilers class at university, and I think it might apply to your scenario. Consider the code below:
public class GameObjectVisitor {
    public boolean visit(GameObject1 obj1) { return true; }
    // ...
    // one method for each game object
    public boolean visit(GameGroup obj1) { return true; }
}
And then you can put a method in the GameObject interface like this:
public interface GameObject {
    // ...
    public boolean visit(GameObjectVisitor visitor);
}
And then each GameObject implements this method:
public class GameGroup implements GameObject {
    // ...
    public boolean visit(GameObjectVisitor visitor) {
        // "this" is statically a GameGroup here, so the visit(GameGroup) overload is chosen.
        return visitor.visit(this);
    }
}
This is especially useful when you have a complex inheritance hierarchy of GameObject types. For your case your method will look like this:
private void allocateUITweenManager() {
    // Override only the overload we care about; the others keep their defaults.
    GameObjectVisitor gameGroupVisitor = new GameObjectVisitor() {
        @Override
        public boolean visit(GameGroup obj1) {
            obj1.setUITweenManager(mUITweenManager);
            return true;
        }
    };
    for (GameObject go : mGameObjects) {
        go.visit(gameGroupVisitor);
    }
}
EDIT
There are two primary things you can do here to relieve yourself of this specific instance of instanceof. (pun?)
Do as my initial answer suggested and move the method you are targeting up to the class you are iterating. This isn't ideal in this case, because the method doesn't make sense to the parent object, and would be polluting as Ted has put it.
Shrink the scope of the objects you are iterating to just the objects that are familiar with the target method. I think this is the more ideal approach, but may not be workable in the current form of your code.
Personally, I avoid instanceof like the plague, because it makes me feel like I completely missed something, but there are times where it is necessary. If your code is laid out this way, and you have no way to shrink the scope of the objects you are iterating, then instanceof will probably work just fine. But this looks like a good opportunity to see how polymorphism can make your code easier to read and maintain in the future.
I am leaving the original answer below to maintain the integrity of the comments.
/EDIT
Personally, I don't think this is a good reason to use instanceof. It seems to me that you could utilize some polymorphism to accomplish your goal.
Have you considered making setUITweenManager(...) a method of GameObject? Does it make sense to do this?
If it does make sense, you could have your default implementation do nothing, and have your GameGroup override the method to do what you want it to do. At this point, your code could just look like this then:
private void allocateUITweenManager() {
    for (GameObject go : mGameObjects) {
        go.setUITweenManager(mUITweenManager);
    }
}
This is polymorphism in action, but I am not sure it would be the best approach for your current situation. It would make more sense to iterate the Collection of UITweenable objects instead if possible.
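To make the "default implementation does nothing" idea concrete, here is a minimal sketch. The class bodies are assumptions: UITweenManager is an empty stand-in, Sprite is an invented second subclass, and getUITweenManager exists only so the effect can be observed.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins for the question's types; all bodies here are assumptions.
class UITweenManager { }

abstract class GameObject {
    // Default: most game objects have no use for a tween manager.
    void setUITweenManager(UITweenManager manager) {
        // intentionally empty
    }
}

// Hypothetical subclass that simply inherits the no-op default.
class Sprite extends GameObject { }

class GameGroup extends GameObject {
    private UITweenManager manager;

    @Override
    void setUITweenManager(UITweenManager manager) {
        this.manager = manager;
    }

    UITweenManager getUITweenManager() {
        return manager;
    }
}

class NoOpDefaultDemo {
    static void allocate(List<GameObject> objects, UITweenManager manager) {
        for (GameObject go : objects) {
            go.setUITweenManager(manager); // no instanceof, no cast
        }
    }

    public static void main(String[] args) {
        GameGroup group = new GameGroup();
        List<GameObject> objects = new ArrayList<GameObject>();
        objects.add(new Sprite());
        objects.add(group);
        UITweenManager manager = new UITweenManager();
        allocate(objects, manager);
        System.out.println(group.getUITweenManager() == manager);
    }
}
```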
The reason instanceof is discouraged is that in OOP we should not examine objects' types from the outside. Instead, the idiomatic way is to let the objects themselves act using overridden methods. In your case, one possible solution could be to define boolean setUITweenManager(...) on GameObject and let it return true if setting the manager was possible for a particular object. However, if this pattern occurs in many places, the top-level classes can get quite polluted. Therefore instanceof is sometimes the "lesser evil".
The problem with this OOP approach is that each object must "know" all its possible use cases. If you need a new feature that works on your class hierarchy, you have to add it to the classes themselves; you can't have it somewhere separate, like in a different module. This can be solved in a general way using the visitor pattern, as others suggested. The visitor pattern describes the most general way to examine objects, and becomes even more useful when combined with polymorphism.
Note that other languages (in particular functional languages) use a different principle. Instead of letting objects "know" how they perform every possible action, they declare data types that have no methods on their own. Instead, code that uses them examines how they were constructed using pattern matching on algebraic data types. As far as I know, the closest language to Java that has pattern matching is Scala. There is an interesting paper about how Scala implements pattern matching, which compares several possible approaches: Matching Objects With Patterns. Burak Emir, Martin Odersky, and John Williams.
Data in object-oriented programming is organized in a hierarchy of classes. The problem of object-oriented pattern matching is how to explore this hierarchy from the outside. This usually involves classifying objects by their run-time type, accessing their members, or determining some other characteristic of a group of objects. In this paper we compare six different pattern matching techniques: object-oriented decomposition, visitors, type-tests/typecasts, typecase, case classes, and extractors. The techniques are compared on nine criteria related to conciseness, maintainability and performance. The paper introduces case classes and extractors as two new pattern-matching methods and shows that their combination works well for all of the established criteria.
In summary: In OOP you can easily modify data types (like add subclasses), but adding new functions (methods) requires making changes to many classes. With ADT it's easy to add new functions, but modifying data types requires modifying many functions.
The problem with instanceof is that you can suffer from future changes to the object hierarchy. A better approach is to use the Strategy pattern in the cases where you would otherwise reach for instanceof. By building a solution on instanceof you fall into the very problem Strategy is trying to solve: too many ifs. Some people have even founded a community around this; the Anti-IF Campaign could be a joke, but the antipattern is serious. In long-term projects, maintaining 10-20 levels of if-else-if can be a pain. In your case you'd do better to define a common interface for all objects in your array and implement setUITweenManager for all of them through that interface.
interface TweenManagerAware {
    void setUITweenManager(UITweenManager manager);
}
It is always a bit "fishy" to me to mix objects of different classes in the same Collection. Would it be possible / make sense to split the single Collection of GameObjects into multiple Collections, one of mere GameObjects, another of UITweenables? (e.g. use a MultiMap keyed by a Class). Then you could go something like:
for (UITweenable uit : myMap.get(UITweenable.class)) {
    uit.setUITweenManager(mUITweenManager);
}
Now, you still need an instanceof when you insert into the map, but it's better encapsulated - hidden from the client code that doesn't need to know those details.
p.s. I'm not a fanatic about all the SW "rules", but Google "Liskov Substitution Principle".
You could declare setUITweenManager in GameObject with an implementation that does nothing.
You could create a method that returns an iterator for all UITweenable instances in the array of GameObject instances.
And there are other approaches that effectively hide the dispatching within some abstraction; e.g. the Visitor or Adapter patterns.
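One possible shape for the second option is a small generic helper that hides the type test. This is a sketch under assumptions: TypeFilter and instancesOf are invented names, and it returns a List (which is iterable) rather than a bare iterator. Class.isInstance and Class.cast are the standard reflective counterparts of instanceof and a cast.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only; the class and method names are assumptions.
class TypeFilter {
    // Collects every element that is an instance of the requested type.
    static <T> List<T> instancesOf(Class<T> type, Iterable<?> items) {
        List<T> matches = new ArrayList<T>();
        for (Object item : items) {
            if (type.isInstance(item)) {
                matches.add(type.cast(item));
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Object> mixed = Arrays.<Object>asList("a", 1, "b", 2.0);
        for (String s : instancesOf(String.class, mixed)) {
            System.out.println(s);
        }
    }
}
```

With the question's types, the loop would presumably become something like for (UITweenable t : TypeFilter.instancesOf(UITweenable.class, Arrays.asList(mGameObjects))), which keeps the type test in one place.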
... have I committed some cardinal sin of OOP somewhere along the line that has me using instanceof here?
Not really (IMO).
The worst problem with instanceof is when you start using it to test for implementation classes. And the reason that is particularly bad is that it makes it hard to add extra classes, etcetera. Here the instanceof UITweenable stuff doesn't seem to introduce that problem, because UITweenable seems to be more fundamental to the design.
When you make these sorts of judgements, it is best to understand the reasons why the (supposedly) bad construct or usage is claimed to be bad. Then you look at your specific use-case and decide whether these reasons apply, and whether the alternative you are looking at is really better in your use-case.
You could use the mGameObjects container for when you need to do something on all game objects and keep a separate container only for GameGroup objects.
This will use some more memory, and when you add/remove objects you have to update both containers, but it shouldn't be a noticeable overhead, and it lets you loop very efficiently through all the objects.
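A rough sketch of the two-container idea, assuming a hypothetical GameObjectRegistry wrapper (the registry, Sprite, and the class bodies are all invented for illustration). Because the concrete type is known where the object is created, the group list can be filled without any type test.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-ins for the question's types; all bodies here are assumptions.
class UITweenManager { }
abstract class GameObject { }
class Sprite extends GameObject { }
class GameGroup extends GameObject {
    UITweenManager manager;
    void setUITweenManager(UITweenManager manager) { this.manager = manager; }
}

// Hypothetical wrapper that keeps both containers in sync.
class GameObjectRegistry {
    private final List<GameObject> all = new ArrayList<GameObject>();
    private final List<GameGroup> groups = new ArrayList<GameGroup>();

    void add(GameObject object) {
        all.add(object);
    }

    // Called where the GameGroup is created, so no instanceof is needed.
    void addGroup(GameGroup group) {
        all.add(group);
        groups.add(group);
    }

    boolean remove(GameObject object) {
        groups.remove(object); // no effect if the object is not a registered group
        return all.remove(object);
    }

    List<GameObject> all() { return Collections.unmodifiableList(all); }
    List<GameGroup> groups() { return Collections.unmodifiableList(groups); }

    void allocateUITweenManager(UITweenManager manager) {
        for (GameGroup group : groups) {
            group.setUITweenManager(manager);
        }
    }
}
```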
The problem with this approach is that it usually doesn't appear in only one place in your code, which makes it more or less painful to add another implementation of the interface in the future. Whether to avoid it is a judgement call. Sometimes YAGNI applies and this is the most straightforward way.
Alternatives have been suggested by others, for example the Visitor pattern.
I have another suggestion of a way to avoid instanceof.
Unless you are using a generic factory, at the moment when you create a GameObject you know what concrete type it is. So what you can do is pass any GameGroups you create an observable object, and allow them to add listeners to it. It would work like this:
public class Game {
private void makeAGameGroup() {
mGameObjects.add(new GameGroup(mUITweenManagerInformer));
}
private void allocateUITweenManager() {
mUITweenManagerInformer.fire(mUITweenManager);
}
// UITweenManagerInformer is an interface, so it is implemented, not extended.
private class OurUITweenManagerInformer implements UITweenManagerInformer {
// Initialized up front so that adding a listener cannot hit a null list.
private final ArrayList<UITweenManagerListener> listeners = new ArrayList<UITweenManagerListener>();
public void addUITweenManagerListener(UITweenManagerListener l) {
listeners.add(l);
}
public void fire(UITweenManager next) {
for (UITweenManagerListener l : listeners)
l.changed(next);
}
}
private OurUITweenManagerInformer mUITweenManagerInformer = new OurUITweenManagerInformer();
}
public interface UITweenManagerInformer {
public void addUITweenManagerListener(UITweenManagerListener l);
}
public interface UITweenManagerListener {
public void changed(UITweenManager next);
}
What draws me to this solution is:
Because a UITweenManagerInformer is a constructor parameter to GameGroup, you cannot forget to pass it one, whereas with an instance method you might forget to call it.
It makes intuitive sense to me that information that an object needs (like the way a GameGroup needs knowledge of the current UITweenManager) should be passed as a constructor parameter -- I like to think of these as prerequisites for an object existing. If you don't have knowledge of the current UITweenManager, you shouldn't create a GameGroup, and this solution enforces that.
instanceof is never used.
when programming in Java I practically always, just out of habit, write something like this:
public List<String> foo() {
return new ArrayList<String>();
}
Most of the time without even thinking about it. Now, the question is: should I always specify the interface as the return type? Or is it advisable to use the actual implementation of the interface, and if so, under what circumstances?
It is obvious that using the interface has a lot of advantages (that's why it's there). In most cases it doesn't really matter what concrete implementation is used by a library function. But maybe there are cases where it does matter. For instance, if I know that I will primarily access the data in the list randomly, a LinkedList would be bad. But if my library function only returns the interface, I simply don't know. To be on the safe side I might even need to copy the list explicitly over to an ArrayList:
List bar = foo();
List myList = bar instanceof LinkedList ? new ArrayList(bar) : bar;
but that just seems horrible and my coworkers would probably lynch me in the cafeteria. And rightfully so.
What do you guys think? What are your guidelines, when do you tend towards the abstract solution, and when do you reveal details of your implementation for potential performance gains?
Return the appropriate interface to hide implementation details. Your clients should only care about what your object offers, not how you implemented it. If you start with a private ArrayList and decide later on that something else (e.g., LinkedList, skip list, etc.) is more appropriate, you can change the implementation without affecting clients if you return the interface. The moment you return a concrete type, the opportunity is lost.
For instance, if I know that I will
primarily access the data in the list
randomly, a LinkedList would be bad.
But if my library function only
returns the interface, I simply don't
know. To be on the safe side I might
even need to copy the list explicitly
over to an ArrayList.
As everybody else has mentioned, you shouldn't care how the library has implemented the functionality; that reduces coupling and increases the maintainability of the library.
If you, as a library client, can demonstrate that the implementation is performing badly for your use case, you can then contact the person in charge and discuss the best path to follow (a new method for this case or just changing the implementation).
That said, your example reeks of premature optimization.
If the method is or can be critical, it might mention the implementation details in the documentation.
Without being able to justify it with reams of CS quotes (I'm self taught), I've always gone by the mantra of "Accept the least derived, return the most derived," when designing classes and it has stood me well over the years.
I guess what that means in terms of interface versus concrete return types is that if you are trying to reduce dependencies and/or decouple, returning the interface is generally more useful. However, if the concrete class implements more than that interface, it is usually more useful to the callers of your method to get the concrete class back (i.e. the "most derived") rather than arbitrarily restricting them to a subset of that returned object's functionality, unless you actually need to restrict them. Then again, you could also just increase the coverage of the interface. Needless restrictions like this I compare to thoughtless sealing of classes; you never know. Just to talk a bit about the former part of that mantra (for other readers), accepting the least derived also gives maximum flexibility for callers of your method.
-Oisin
Sorry to disagree, but I think the basic rule is as follows:
For input arguments use the most generic.
For output values, the most specific.
So, in this case you want to declare the implementation as:
public ArrayList<String> foo() {
return new ArrayList<String>();
}
Rationale:
The input case is already known and explained by everyone: use the interface, period. However, the output case can look counter-intuitive.
You want to return the implementation because you want the client to have the most information about what it is receiving. In this case, more knowledge is more power.
Example 1: the client wants to get the 5th element:
return Collection: must iterate until the 5th element.
return List: list.get(4)
Example 2: the client wants to remove the 5th element:
return List: must create a new list without the specified element (list.remove() is optional).
return ArrayList: arrayList.remove(4)
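The two examples above can be sketched roughly as follows. The class and method names are made up for illustration; the only library fact relied on is that List.remove(int) is an optional operation (for instance, the fixed-size list from Arrays.asList throws UnsupportedOperationException):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

class ReturnTypeExamples {
    // Example 1 with only a Collection: walk the iterator to the 5th element.
    static String fifthViaIterator(Collection<String> items) {
        Iterator<String> it = items.iterator();
        for (int i = 0; i < 4; i++) {
            it.next(); // skip elements 1 to 4
        }
        return it.next();
    }

    // Example 1 with a List: positional access is part of the interface.
    static String fifthViaGet(List<String> items) {
        return items.get(4);
    }

    // Example 2 with a List: remove(int) may be unsupported, so the
    // caller copies into a new list and removes from the copy.
    static List<String> withoutFifth(List<String> items) {
        List<String> copy = new ArrayList<>(items);
        copy.remove(4);
        return copy;
    }

    // Example 2 with an ArrayList: in-place removal is known to be supported.
    static void removeFifth(ArrayList<String> items) {
        items.remove(4);
    }
}
```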
So it's a big truth that using interfaces is great because it promotes reusability, reduces coupling, improves maintainability and makes people happy ... but only when used as input.
So, again, the rule can be stated as:
Be flexible for what you offer.
Be informative with what you deliver.
So, next time, please return the implementation.
In OO programming, we want to encapsulate as much as possible the data. Hide as much as possible the actual implementation, abstracting the types as high as possible.
In this context, I would answer: only return what is meaningful. Does it make sense at all for the return value to be the concrete class? In your example, ask yourself: will anyone use a LinkedList-specific method on the return value of foo?
If no, just use the higher-level interface. It's much more flexible, and allows you to change the backend.
If yes, ask yourself: can't I refactor my code to return the higher-level interface? :)
The more abstract your code is, the fewer changes you are required to make when changing a backend. It's as simple as that.
If, on the other hand, you end up casting the return values to the concrete class, well that's a strong sign that you should probably return instead the concrete class. Your users/teammates should not have to know about more or less implicit contracts: if you need to use the concrete methods, just return the concrete class, for clarity.
In a nutshell: code abstract, but explicitly :)
In general, for a public facing interface such as APIs, returning the interface (such as List) over the concrete implementation (such as ArrayList) would be better.
The use of an ArrayList or LinkedList is an implementation detail of the library that should be considered for the most common use case of that library. And of course, internally, having private methods handing off LinkedLists wouldn't necessarily be a bad thing, if it provides facilities that would make the processing easier.
There is no reason that a concrete class shouldn't be used in the implementation, unless there is a good reason to believe that some other List class would be used later on. But then again, changing the implementation details shouldn't be as painful as long as the public facing portion is well-designed.
The library itself should be a black box to its consumers, so they don't really have to worry about what's going on internally. That also means that the library should be designed to be used in the way it is intended.
It doesn't matter all that much whether an API method returns an interface or a concrete class; despite what everyone here says, you almost never change the implementation class once the code is written.
What's far more important: always use minimum-scope interfaces for your method parameters! That way, clients have maximal freedom and can use classes your code doesn't even know about.
When an API method returns ArrayList, I have absolutely no qualms with that, but when it demands an ArrayList (or, all too common, Vector) parameter, I consider hunting down the programmer and hurting him, because it means that I can't use Arrays.asList(), Collections.singletonList() or Collections.EMPTY_LIST.
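To make the parameter point concrete, here is a small sketch with a hypothetical Greeter.joinNames method. Because it asks only for a List, callers can hand it lists the method's author never had in mind:

```java
import java.util.List;

class Greeter {
    // Accepting List (not ArrayList or Vector) lets callers pass any List implementation.
    static String joinNames(List<String> names) {
        return String.join(", ", names);
    }
}
```

Calls such as Greeter.joinNames(Arrays.asList("Ann", "Bob")), Greeter.joinNames(Collections.singletonList("Ann")) and Greeter.joinNames(Collections.<String>emptyList()) all compile; with an ArrayList parameter none of them would.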
As a rule, I only pass back internal implementations if I am in some private, inner workings of a library, and even so only sparingly. For everything that is public and likely to be called from the outside of my module I use interfaces, and also the Factory pattern.
Using interfaces in such a way has proven to be a very reliable way to write reusable code.
The main question has been answered already and you should always use the interface. I however would just like to comment on
It is obvious that using the interface has a lot of advantages (that's why it's there). In most cases it doesn't really matter what concrete implementation is used by a library function. But maybe there are cases where it does matter. For instance, if I know that I will primarily access the data in the list randomly, a LinkedList would be bad. But if my library function only returns the interface, I simply don't know. To be on the safe side I might even need to copy the list explicitly over to an ArrayList.
If you are returning a data structure that you know has poor random access performance -- O(n) and typically a LOT of data -- there are other interfaces you should be specifying instead of List, like Iterable so that anyone using the library will be fully aware that only sequential access is available.
Picking the right type to return isn't just about interface versus concrete implementation, it is also about selecting the right interface.
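A small sketch of that idea, using a hypothetical EventLog class backed by a LinkedList. Returning Iterable instead of List tells callers that only sequential traversal is on offer:

```java
import java.util.LinkedList;

class EventLog {
    // A LinkedList is cheap to append to but slow for get(i).
    private final LinkedList<String> events = new LinkedList<>();

    void record(String event) {
        events.add(event);
    }

    // Iterable offers iteration only; there is no get(int) to misuse.
    Iterable<String> events() {
        return events;
    }
}
```

Callers simply write for (String e : log.events()) { ... }. Note that this sketch hands out the live list; a real class might wrap it in an unmodifiable view first.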
You use interface to abstract away from the actual implementation. The interface is basically just a blueprint for what your implementation can do.
Interfaces are good design because they allow you to change implementation details without having to fear that any of its consumers are directly affected, as long as your implementation still does what your interface says it does.
To work with interfaces you would instantiate them like this:
IParser parser = new Parser();
Now IParser would be your interface, and Parser would be your implementation. Now when you work with the parser object from above, you will work against the interface (IParser), which in turn will work against your implementation (Parser).
That means that you can change the inner workings of Parser as much as you want, it will never affect code that works against your IParser parser interface.
In general, use the interface in all cases if you have no need of the functionality of the concrete class. Note that for lists, Java has added the RandomAccess marker interface, primarily to distinguish a common case where an algorithm may need to know whether get(i) is constant time or not.
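For illustration, an algorithm can check for java.util.RandomAccess and pick its traversal strategy accordingly. The ListSums class below is a made-up example, not a library API:

```java
import java.util.List;
import java.util.RandomAccess;

class ListSums {
    static long sum(List<Integer> values) {
        long total = 0;
        if (values instanceof RandomAccess) {
            // Indexed access is fast (e.g. ArrayList), so a plain index loop is fine.
            for (int i = 0; i < values.size(); i++) {
                total += values.get(i);
            }
        } else {
            // For lists such as LinkedList, get(i) is O(n); use the iterator instead.
            for (int v : values) {
                total += v;
            }
        }
        return total;
    }
}
```

ArrayList implements RandomAccess and LinkedList does not, so the caller never has to copy one into the other just to be safe.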
For uses of code, Michael above is right that being as generic as possible in the method parameters is often even more important. This is especially true when testing such a method.
You'll find (or have found) that as you return interfaces, they permeate through your code. e.g. you return an interface from method A and you have to then pass an interface to method B.
What you're doing is programming by contract, albeit in a limited fashion.
This gives you enormous scope to change implementations under the covers (provided these new objects fulfill the existing contracts/expected behaviours).
Given all of this, you have benefits in terms of choosing your implementation, and how you can substitute behaviours (including testing - using mocking, for example). In case you hadn't guessed, I'm all in favour of this and try to reduce to (or introduce) interfaces wherever possible.
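As a sketch of the testing benefit, consider a hypothetical Clock interface (invented for this example). Production code would use SystemClock, while a test substitutes a FixedClock it can control:

```java
// Hypothetical contract: something that can tell the current time.
interface Clock {
    long nowMillis();
}

class SystemClock implements Clock {
    @Override
    public long nowMillis() {
        return System.currentTimeMillis();
    }
}

// Hand-written test double; a mocking library could play the same role.
class FixedClock implements Clock {
    private long now;

    FixedClock(long now) {
        this.now = now;
    }

    void advance(long millis) {
        now += millis;
    }

    @Override
    public long nowMillis() {
        return now;
    }
}

// Depends only on the Clock contract, not on any particular implementation.
class SessionTimer {
    private final Clock clock;
    private final long startedAt;

    SessionTimer(Clock clock) {
        this.clock = clock;
        this.startedAt = clock.nowMillis();
    }

    long elapsedMillis() {
        return clock.nowMillis() - startedAt;
    }
}
```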
I keep hearing the statement on most programming related sites:
Program to an interface and not to an Implementation
However, I don't understand the implications.
Examples would help.
EDIT: I have received a lot of good answers; even so, could you supplement them with some snippets of code for a better understanding of the subject? Thanks!
You are probably looking for something like this:
public static void main(String... args) {
// do this - declare the variable to be of type Set, which is an interface
Set buddies = new HashSet();
// don't do this - you declare the variable to have a fixed type
HashSet buddies2 = new HashSet();
}
Why is it considered good to do it the first way? Let's say later on you decide you need to use a different data structure, say a LinkedHashSet, in order to take advantage of the LinkedHashSet's functionality. The code has to be changed like so:
public static void main(String... args) {
// do this - declare the variable to be of type Set, which is an interface
Set buddies = new LinkedHashSet(); // <- change the constructor call
// don't do this - you declare the variable to have a fixed type
// here you have to change both the variable type and the constructor call
// HashSet buddies2 = new HashSet(); // old version
LinkedHashSet buddies2 = new LinkedHashSet();
}
This doesn't seem so bad, right? But what if you wrote getters the same way?
public HashSet getBuddies() {
return buddies;
}
This would have to be changed, too!
public LinkedHashSet getBuddies() {
return buddies;
}
Hopefully you see, even with a small program like this you have far-reaching implications on what you declare the type of the variable to be. With objects going back and forth so much it definitely helps make the program easier to code and maintain if you just rely on a variable being declared as an interface, not as a specific implementation of that interface (in this case, declare it to be a Set, not a LinkedHashSet or whatever). It can be just this:
public Set getBuddies() {
return buddies;
}
There's another benefit too, in that (well at least for me) the difference helps me design a program better. But hopefully my examples give you some idea... hope it helps.
One day, a junior programmer was instructed by his boss to write an application to analyze business data and condense it all in pretty reports with metrics, graphs and all that stuff. The boss gave him an XML file with the remark "here's some example business data".
The programmer started coding. A few weeks later he felt that the metrics and graphs and stuff were pretty enough to satisfy the boss, and he presented his work. "That's great" said the boss, "but can it also show business data from this SQL database we have?".
The programmer went back to coding. There was code for reading business data from XML sprinkled throughout his application. He rewrote all those snippets, wrapping them with an "if" condition:
if (dataType == "XML")
{
... read a piece of XML data ...
}
else
{
.. query something from the SQL database ...
}
When presented with the new iteration of the software, the boss replied: "That's great, but can it also report on business data from this web service?" Remembering all those tedious if statements he would have to rewrite AGAIN, the programmer became enraged. "First xml, then SQL, now web services! What is the REAL source of business data?"
The boss replied: "Anything that can provide it"
At that moment, the programmer was enlightened.
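What the programmer presumably realized could look something like the sketch below. The interface name, the method and the hard-coded numbers are all invented for illustration; the real readers would parse XML or query SQL:

```java
import java.util.Arrays;
import java.util.List;

// "Anything that can provide it": the one contract the reports depend on.
interface BusinessDataSource {
    List<Double> readSales();
}

class XmlDataSource implements BusinessDataSource {
    @Override
    public List<Double> readSales() {
        // Placeholder data; a real version would parse the XML file.
        return Arrays.asList(10.0, 20.0);
    }
}

class SqlDataSource implements BusinessDataSource {
    @Override
    public List<Double> readSales() {
        // Placeholder data; a real version would query the database.
        return Arrays.asList(7.5, 2.5);
    }
}

class ReportGenerator {
    // No "if XML ... else SQL" here; any source that fulfils the contract works.
    static double totalSales(BusinessDataSource source) {
        double total = 0;
        for (double amount : source.readSales()) {
            total += amount;
        }
        return total;
    }
}
```

Supporting the web service then means adding one more implementation, with no changes to the reporting code.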
An interface defines the methods an object is committed to respond to.
When you code to the interface, you can change the underlying object and your code will still work (because your code is agnostic of WHO performs the job or HOW the job is performed). You gain flexibility this way.
When you code to a particular implementation, if you need to change the underlying object your code will most likely break, because the new object may not respond to the same methods.
So to put a clear example:
If you need to hold a number of objects you might have decided to use a Vector.
If you need to access the first object of the Vector you could write:
Vector items = new Vector();
// fill it
Object first = items.firstElement();
So far so good.
Later you decide that, for "some" reason, you need to change the implementation (let's say the Vector creates a bottleneck due to excessive synchronization).
You realize you need to use an ArrayList instead.
Well, your code will break ...
ArrayList items = new ArrayList();
// fill it
Object first = items.firstElement(); // compile time error.
You can't. This line and all the other lines that use the firstElement() method would break.
If you need specific behavior and you definitely need this method, it might be OK (although you won't be able to change the implementation). But if what you need is simply to retrieve the first element (that is, there is nothing special about the Vector other than that it has the firstElement() method), then using the interface rather than the implementation would give you the flexibility to change.
List items = new Vector();
// fill it
Object first = items.get( 0 );
In this form you are not coding to the get method of Vector, but to the get method of List.
It does not matter how the underlying object performs the method, as long as it honors the contract of "get the 0th element of the collection".
This way you may later change it to any other implementation:
List items = new ArrayList(); // Or LinkedList or any other class that implements List
// fill it
Object first = items.get( 0 ); // Doesn't break
This sample might look naive, but it is the foundation on which OO technology is based (even in languages that are not statically typed, like Python, Ruby, Smalltalk, Objective-C, etc.).
A more complex example is the way JDBC works. You can change the driver, but most of your calls will work the same way. For instance, you could use the standard driver for Oracle databases, or a more sophisticated one like those WebLogic or WebSphere provide. Of course it isn't magic; you still have to test your product first, but at least you don't have stuff like:
statement.executeOracle9iSomething();
vs
statement.executeOracle11gSomething();
Something similar happens with Java Swing.
Additional reading:
Design Principles from Design Patterns
Effective Java Item: Refer to objects by their interfaces
(Buying this book is one of the best things you could do in life, and reading it, of course.)
My initial read of that statement is very different than any answer I've read yet. I agree with all the people that say using interface types for your method params, etc are very important, but that's not what this statement means to me.
My take is that it's telling you to write code that only depends on what the interface (in this case, I'm using "interface" to mean exposed methods of either a class or interface type) you're using says it does in the documentation. This is the opposite of writing code that depends on the implementation details of the functions you're calling. You should treat all function calls as black boxes (you can make exceptions to this if both functions are methods of the same class, but ideally it is maintained at all times).
Example: suppose there is a Screen class that has Draw(image) and Clear() methods on it. The documentation says something like "the draw method draws the specified image on the screen" and "the clear method clears the screen". If you wanted to display images sequentially, the correct way to do so would be to repeatedly call Clear() followed by Draw(). That would be coding to the interface. If you're coding to the implementation, you might do something like only calling the Draw() method because you know from looking at the implementation of Draw() that it internally calls Clear() before doing any drawing. This is bad because you're now dependent on implementation details that you can't know from looking at the exposed interface.
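A rough Java rendering of that Screen example follows. The Screen interface, the recording implementation and the SlideShow class are hypothetical; method names are lower-cased to follow Java conventions:

```java
import java.util.ArrayList;
import java.util.List;

interface Screen {
    /** Draws the specified image on the screen. */
    void draw(String image);

    /** Clears the screen. */
    void clear();
}

// A stand-in implementation that just records the calls it receives.
class RecordingScreen implements Screen {
    final List<String> calls = new ArrayList<>();

    @Override
    public void draw(String image) {
        calls.add("draw:" + image);
    }

    @Override
    public void clear() {
        calls.add("clear");
    }
}

class SlideShow {
    // Coded to the documented contract: clear explicitly before each draw,
    // whether or not some implementation's draw() happens to clear first.
    static void show(Screen screen, List<String> images) {
        for (String image : images) {
            screen.clear();
            screen.draw(image);
        }
    }
}
```

If a later Screen implementation stops clearing inside draw(), this SlideShow keeps working, whereas code that relied on the hidden clear would silently break.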
I look forward to seeing if anyone else shares this interpretation of the phrase in the OP's question, or if I'm entirely off base...
It's a way to separate responsibilities / dependencies between modules.
By defining a particular Interface (an API), you ensure that the modules on either side of the interface won't "bother" one another.
For example, say module 1 will take care of displaying bank account info for a particular user, and module2 will fetch bank account info from "whatever" back-end is used.
By defining a few types and functions, along with the associated parameters (for example, a structure defining a bank transaction, and a few methods like GetLastTransactions(AccountNumber, NbTransactionsWanted, ArrayToReturnTheseRec) and GetBalance(AccountNumber)), Module1 will be able to get the needed info without worrying about how this info is stored or calculated. Conversely, Module2 will just respond to the method calls by providing the info as per the defined interface, but won't worry about where this info is to be displayed, printed or whatever...
When a module is changed, the implementation of the interface may vary, but as long as the interface remains the same, the modules using the API may at worst need to be recompiled/rebuilt, but they do not need to have their logic modified in anyway.
That's the idea of an API.
At its core, this statement is really about dependencies. If I code my class Foo to an implementation (Bar instead of IBar) then Foo is now dependent on Bar. But if I code my class Foo to an interface (IBar instead of Bar) then the implementation can vary and Foo is no longer dependent on a specific implementation. This approach gives a flexible, loosely-coupled code base that is more easily reused, refactored and unit tested.
Take a red 2x4 Lego block and attach it to a blue 2x4 Lego block so one sits atop the other. Now remove the blue block and replace it with a yellow 2x4 Lego block. Notice that the red block did not have to change even though the "implementation" of the attached block varied.
Now go get some other kind of block that does not share the Lego "interface". Try to attach it to the red 2x4 Lego. To make this happen, you will need to change either the Lego or the other block, perhaps by cutting away some plastic or adding new plastic or glue. Notice that by varying the "implementation" you are forced to change it or the client.
Being able to let implementations vary without changing the client or the server - that is what it means to program to interfaces.
An interface is like a contract between you and the person who made the interface that your code will carry out what they request. Furthermore, you want to code things in such a way that your solution can solve the problem many times over. Think code re-use. When you are coding to an implementation, you are thinking purely of the instance of a problem that you are trying to solve. So when under this influence, your solutions will be less generic and more focused. That will make writing a general solution that abides by an interface much more challenging.
Look, I didn't realize this was for Java, and my code is based on C#, but I believe it provides the point.
Every car has doors.
But not every door acts the same; in the UK, for example, the taxi doors are backwards. One universal fact is that they "Open" and "Close".
interface IDoor
{
void Open();
void Close();
}
class BackwardDoor : IDoor
{
public void Open()
{
// code to make the door open the "wrong way".
}
public void Close()
{
// code to make the door close properly.
}
}
class RegularDoor : IDoor
{
public void Open()
{
// code to make the door open the "proper way"
}
public void Close()
{
// code to make the door close properly.
}
}
class RedUkTaxiDoor : BackwardDoor
{
public Color Color
{
get
{
return Color.Red;
}
}
}
If you are a car door repairer, you don't care how the door looks, or whether it opens one way or the other. Your only requirement is that the door acts like a door, i.e. that it is an IDoor.
class DoorRepairer
{
public void Repair(IDoor door)
{
door.Open();
// Do stuff inside the car.
door.Close();
}
}
The Repairer can handle RedUkTaxiDoor, RegularDoor and BackwardDoor. And any other type of doors, such as truck doors, limousine doors.
DoorRepairer repairer = new DoorRepairer();
repairer.Repair( new RegularDoor() );
repairer.Repair( new BackwardDoor() );
repairer.Repair( new RedUkTaxiDoor() );
Apply this to lists: you have ArrayList, List<T> and, if you want your own, MyList. They all implement the IList interface, which requires them to implement Add and Remove. (Stack and Queue do not implement IList, so they could not be passed to the method below.) So if your class adds or removes items in any given list...
class ListAdder
{
public void PopulateWithSomething(IList list)
{
list.Add("one");
list.Add("two");
}
}
// Both of these implement the non-generic IList.
ArrayList arrayList = new ArrayList();
List<string> typedList = new List<string>();
ListAdder la = new ListAdder();
la.PopulateWithSomething(arrayList);
la.PopulateWithSomething(typedList);
Allen Holub wrote a great article for JavaWorld in 2003 on this topic called Why extends is evil. His take on the "program to the interface" statement, as you can gather from his title, is that you should happily implement interfaces, but very rarely use the extends keyword to subclass. He points to, among other things, what is known as the fragile base-class problem. From Wikipedia:
a fundamental architectural problem of object-oriented programming systems where base classes (superclasses) are considered "fragile" because seemingly safe modifications to a base class, when inherited by the derived classes, may cause the derived classes to malfunction. The programmer cannot determine whether a base class change is safe simply by examining in isolation the methods of the base class.
In addition to the other answers, I add more:
You program to an interface because it's easier to handle. The interface encapsulates the behavior of the underlying class. This way, the class is a blackbox. Your whole real life is programming to an interface. When you use a tv, a car, a stereo, you are acting on its interface, not on its implementation details, and you assume that if implementation changes (e.g. diesel engine or gas) the interface remains the same. Programming to an interface allows you to preserve your behavior when non-disruptive details are changed, optimized, or fixed. This simplifies also the task of documenting, learning, and using.
Also, programming to an interface allows you to delineate what is the behavior of your code before even writing it. You expect a class to do something. You can test this something even before you write the actual code that does it. When your interface is clean and done, and you like interacting with it, you can write the actual code that does things.
"Program to an interface" can be more flexible.
For example, say we are writing a class Printer which provides a print service. Currently there are 2 classes (Cat and Dog) that need to be printed, so we write code like this:
class Printer
{
public void PrintCat(Cat cat)
{
...
}
public void PrintDog(Dog dog)
{
...
}
...
}
What if a new class Bird also needs this print service? We have to change the Printer class to add a new method PrintBird(). In a real project, when we develop the Printer class, we may have no idea who will use it. So how should we write Printer? Programming to an interface can help; see the code below:
class Printer
{
public void Print(Printable p)
{
Bitmap bitmap = p.GetBitmap();
// print bitmap ...
}
}
With this new Printer, everything can be printed as long as it implements the interface Printable. Here the method GetBitmap() is just an example. The key thing is to expose an interface, not an implementation.
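The same idea in Java might look like the sketch below. To keep it self-contained, a hypothetical render() method returning text stands in for GetBitmap(), and the class is named PrintService rather than Printer; neither name is from the code above:

```java
// The only thing the print service needs to know about its clients.
interface Printable {
    String render();
}

class Cat implements Printable {
    @Override
    public String render() {
        return "cat";
    }
}

class Dog implements Printable {
    @Override
    public String render() {
        return "dog";
    }
}

// Added later, without touching PrintService at all.
class Bird implements Printable {
    @Override
    public String render() {
        return "bird";
    }
}

class PrintService {
    // One method serves every Printable, present and future.
    String print(Printable p) {
        return "Printing: " + p.render();
    }
}
```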
Hope it's helpful.
Essentially, interfaces are the slightly more concrete representation of general concepts of interoperation - they provide the specification for what all the various options you might care to "plug in" for a particular function should do similarly so that code which uses them won't be dependent on one particular option.
For instance, many DB libraries act as interfaces in that they can operate with many different actual DBs (MSSQL, MySQL, PostgreSQL, SQLite, etc.) without the code that uses the DB library having to change at all.
Overall, it allows you to create code that's more flexible - giving your clients more options on how they use it, and also potentially allowing you to more easily reuse code in multiple places instead of having to write new specialized code.
By programming to an interface, you are more likely to apply the low coupling / high cohesion principle.
By programming to an interface, you can easily switch the implementation of that interface (the specific class).
It means that your variables, properties, parameters and return types should have an interface type instead of a concrete implementation.
Which means you use IEnumerable<T> Foo(IList mylist) instead of ArrayList Foo(ArrayList myList) for example.
Use the implementation only when constructing the object:
IList list = new ArrayList();
If you have done this, you can later change the object type. Maybe you want to use LinkedList instead of ArrayList later on; this is no problem, since everywhere else you refer to it as just "IList".
It's basically where you make a method/interface like this: create( 'apple' ) where the method create(param) comes from an abstract class/interface fruit that is later implemented by concrete classes. This is different from subclassing. You are creating a contract that classes must fulfill. This also reduces coupling and makes things more flexible, since each concrete class implements it differently.
The client code remains unaware of the specific types of objects used and remains unaware of the classes that implement these objects. Client code only knows about the interface create(param) and it uses it to make fruit objects. It's like saying, "I don't care how you get it or make it, I just want you to give it to me."
An analogy to this is a set of on and off buttons. That is an interface on() and off(). You can use these buttons on several devices, a TV, radio, light. They all handle them differently but we don't care about that, all we care about is to turn it on or turn it off.
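The button analogy translates almost directly into code. In this sketch the Switchable interface and the device classes are invented for illustration:

```java
import java.util.List;

// The "buttons": every device exposes the same two operations.
interface Switchable {
    void on();

    void off();
}

class Television implements Switchable {
    String state = "standby";

    @Override
    public void on() {
        state = "showing picture";
    }

    @Override
    public void off() {
        state = "standby";
    }
}

class Lamp implements Switchable {
    boolean lit = false;

    @Override
    public void on() {
        lit = true;
    }

    @Override
    public void off() {
        lit = false;
    }
}

class RemoteControl {
    // Knows nothing about TVs or lamps, only about on() and off().
    static void allOn(List<? extends Switchable> devices) {
        for (Switchable device : devices) {
            device.on();
        }
    }

    static void allOff(List<? extends Switchable> devices) {
        for (Switchable device : devices) {
            device.off();
        }
    }
}
```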
Coding to an interface is a philosophy, rather than specific language constructs or design patterns - it instructs you what is the correct order of steps to follow in order to create better software systems (e.g. more resilient, more testable, more scalable, more extendible, and other nice traits).
What it actually means is:
===
Before jumping to implementations and coding (the HOW) - think of the WHAT:
What black boxes should make up your system,
What is each box's responsibility,
What are the ways each "client" (that is, one of those other boxes, 3rd party "boxes", or even humans) should communicate with it (the API of each box).
After you figure the above, go ahead and implement those boxes (the HOW).
Thinking first of what a box is and what its API is leads the developer to distil the box's responsibility, and to mark for himself and future developers the difference between its exposed details ("API") and its hidden details ("implementation details"), which is a very important distinction to have.
One immediate and easily noticeable gain is the team can then change and improve implementations without affecting the general architecture. It also makes the system MUCH more testable (it goes well with the TDD approach).
===
Beyond the traits I've mentioned above, you also save A LOT OF TIME going this direction.
Micro Services and DDD, when done right, are great examples of "Coding to an interface", however the concept wins in every pattern from monoliths to "serverless", from BE to FE, from OOP to functional, etc....
I strongly recommend this approach for Software Engineering (and I basically believe it makes total sense in other fields as well).