Immutability and General vs. Specific Types - java

This is in the context of creating an interface/API.
Best practices suggest using general rather than specific types in interfaces - e.g. Map rather than HashMap.
Best practices also suggest preferring immutable types over mutable ones.
So considering both of these suggestions (and leaving aside concerns about performance/memory-footprint, 3rd-party-libraries/dependencies and convenience/functionality) should a method in a public interface look like this
public List<SomeClass> someMethod(...)
or rather this
public ImmutableList<SomeClass> someMethod(...)

When this has been discussed among the Guava folks, the following has been said:
The basic advice for types exchanged by APIs is this: choose the most general type that still conveys the relevant semantic information.
I consider the trio of semantic guarantees made by ImmutableCollection to be extremely relevant for return values in almost any circumstance (those three being immutability, lack of null elements and guaranteed iteration order). So I would virtually always return ImmutableSet, not Set.
We would really like people to view ImmutableSet etc. as being interfaces in every important sense of the word. There are only two reasons they are not: reliability of the immutability guarantee, and the fact that Java won't allow static methods on interfaces until JDK 8, and we wanted them there for convenience.
Most people think ImmutableList is an implementation for this reason, but there are actually several to dozens of different implementations of some of these types; you just don't see them.

If the method's contract guarantees that the List is immutable then the return type should be the ImmutableList rather than the List. This is much more explicit than simply mentioning that the List is immutable in the method's JavaDoc.
However, if the immutability of the list is an implementation detail, rather than a contract then the return type should be the List.

Writing APIs is about having contract with the users.
Best practices are mostly for the context where you need to write APIs for third parties and need to define the interfaces.
We can have different view in different contexts. If you are writing library which is going to be used by third party, you need to consider that they should or should not change the object state.
If API is going to be consumed internally (with in same code base) & purpose is to achieve loose coupling then you need to think about ease of writing, extensibility and maintainability.
Immutable APIs would avoid the data inconsistency specially when you made lot of assumptions on state of the object. On the other hand, mutable object would allow to save development efforts.

Related

Java - Is it good practice to return interface or abstract types from methods? [duplicate]

when programming in Java I practically always, just out of habit, write something like this:
public List<String> foo() {
return new ArrayList<String>();
}
Most of the time without even thinking about it. Now, the question is: should I always specify the interface as the return type? Or is it advisable to use the actual implementation of the interface, and if so, under what circumstances?
It is obvious that using the interface has a lot of advantages (that's why it's there). In most cases it doesn't really matter what concrete implementation is used by a library function. But maybe there are cases where it does matter. For instance, if I know that I will primarily access the data in the list randomly, a LinkedList would be bad. But if my library function only returns the interface, I simply don't know. To be on the safe side I might even need to copy the list explicitly over to an ArrayList:
List bar = foo();
List myList = bar instanceof LinkedList ? new ArrayList(bar) : bar;
but that just seems horrible and my coworkers would probably lynch me in the cafeteria. And rightfully so.
What do you guys think? What are your guidelines, when do you tend towards the abstract solution, and when do you reveal details of your implementation for potential performance gains?
Return the appropriate interface to hide implementation details. Your clients should only care about what your object offers, not how you implemented it. If you start with a private ArrayList, and decide later on that something else (e.g., LinkedLisk, skip list, etc.) is more appropriate you can change the implementation without affecting clients if you return the interface. The moment you return a concrete type the opportunity is lost.
For instance, if I know that I will
primarily access the data in the list
randomly, a LinkedList would be bad.
But if my library function only
returns the interface, I simply don't
know. To be on the safe side I might
even need to copy the list explicitly
over to an ArrayList.
As everybody else has mentioned, you just mustn't care about how the library has implemented the functionality, to reduce coupling and increasing maintainability of the library.
If you, as a library client, can demonstrate that the implementation is performing badly for your use case, you can then contact the person in charge and discuss about the best path to follow (a new method for this case or just changing the implementation).
That said, your example reeks of premature optimization.
If the method is or can be critical, it might mention the implementation details in the documentation.
Without being able to justify it with reams of CS quotes (I'm self taught), I've always gone by the mantra of "Accept the least derived, return the most derived," when designing classes and it has stood me well over the years.
I guess that means in terms of interface versus concrete return is that if you are trying to reduce dependencies and/or decouple, returning the interface is generally more useful. However, if the concrete class implements more than that interface, it is usually more useful to the callers of your method to get the concrete class back (i.e. the "most derived") rather than aribtrarily restrict them to a subset of that returned object's functionality - unless you actually need to restrict them. Then again, you could also just increase the coverage of the interface. Needless restrictions like this I compare to thoughtless sealing of classes; you never know. Just to talk a bit about the former part of that mantra (for other readers), accepting the least derived also gives maximum flexibility for callers of your method.
-Oisin
Sorry to disagree, but I think the basic rule is as follows:
For input arguments use the most generic.
For output values, the most specific.
So, in this case you want to declare the implementation as:
public ArrayList<String> foo() {
return new ArrayList<String>();
}
Rationale:
The input case is already known and explained by everyone: use the interface, period. However, the output case can look counter-intuitive.
You want to return the implementation because you want the client to have the most information about what is receiving. In this case, more knowledge is more power.
Example 1: the client wants to get the 5th element:
return Collection: must iterate until 5th element vs return List:
return List: list.get(4)
Example 2: the client wants to remove the 5th element:
return List: must create a new list without the specified element (list.remove() is optional).
return ArrayList: arrayList.remove(4)
So it's a big truth that using interfaces is great because it promotes reusability, reduces coupling, improves maintainability and makes people happy ... but only when used as input.
So, again, the rule can be stated as:
Be flexible for what you offer.
Be informative with what you deliver.
So, next time, please return the implementation.
In OO programming, we want to encapsulate as much as possible the data. Hide as much as possible the actual implementation, abstracting the types as high as possible.
In this context, I would answer only return what is meaningful. Does it makes sense at all for the return value to be the concrete class? Aka in your example, ask yourself: will anyone use a LinkedList-specific method on the return value of foo?
If no, just use the higher-level Interface. It's much more flexible, and allows you to change the backend
If yes, ask yourself: can't I refactor my code to return the higher-level interface? :)
The more abstract is your code, the less changes your are required to do when changing a backend. It's as simple as that.
If, on the other hand, you end up casting the return values to the concrete class, well that's a strong sign that you should probably return instead the concrete class. Your users/teammates should not have to know about more or less implicit contracts: if you need to use the concrete methods, just return the concrete class, for clarity.
In a nutshell: code abstract, but explicitly :)
In general, for a public facing interface such as APIs, returning the interface (such as List) over the concrete implementation (such as ArrayList) would be better.
The use of a ArrayList or LinkedList is an implementation detail of the library that should be considered for the most common use case of that library. And of course, internally, having private methods handing off LinkedLists wouldn't necessarily be a bad thing, if it provides facilities that would make the processing easier.
There is no reason that a concrete class shouldn't be used in the implementation, unless there is a good reason to believe that some other List class would be used later on. But then again, changing the implementation details shouldn't be as painful as long as the public facing portion is well-designed.
The library itself should be a black box to its consumers, so they don't really have to worry about what's going on internally. That also means that the library should be designed so that it is designed to be used in the way it is intended.
It doesn't matter all that much whether an API method returns an interface or a concrete class; despite what everyone here says, you almost never change the implementiation class once the code is written.
What's far more important: always use minimum-scope interfaces for your method parameters! That way, clients have maximal freedom and can use classes your code doesn't even know about.
When an API method returns ArrayList, I have absolutely no qualms with that, but when it demands an ArrayList (or, all to common, Vector) parameter, I consider hunting down the programmer and hurting him, because it means that I can't use Arrays.asList(), Collections.singletonList() or Collections.EMPTY_LIST.
As a rule, I only pass back internal implementations if I am in some private, inner workings of a library, and even so only sparingly. For everything that is public and likely to be called from the outside of my module I use interfaces, and also the Factory pattern.
Using interfaces in such a way has proven to be a very reliable way to write reusable code.
The main question has been answered already and you should always use the interface. I however would just like to comment on
It is obvious that using the interface has a lot of advantages (that's why it's there). In most cases it doesn't really matter what concrete implementation is used by a library function. But maybe there are cases where it does matter. For instance, if I know that I will primarily access the data in the list randomly, a LinkedList would be bad. But if my library function only returns the interface, I simply don't know. To be on the safe side I might even need to copy the list explicitly over to an ArrayList.
If you are returning a data structure that you know has poor random access performance -- O(n) and typically a LOT of data -- there are other interfaces you should be specifying instead of List, like Iterable so that anyone using the library will be fully aware that only sequential access is available.
Picking the right type to return isn't just about interface versus concrete implementation, it is also about selecting the right interface.
You use interface to abstract away from the actual implementation. The interface is basically just a blueprint for what your implementation can do.
Interfaces are good design because they allow you to change implementation details without having to fear that any of its consumers are directly affected, as long as you implementation still does what your interface says it does.
To work with interfaces you would instantiate them like this:
IParser parser = new Parser();
Now IParser would be your interface, and Parser would be your implementation. Now when you work with the parser object from above, you will work against the interface (IParser), which in turn will work against your implementation (Parser).
That means that you can change the inner workings of Parser as much as you want, it will never affect code that works against your IParser parser interface.
In general use the interface in all cases if you have no need of the functionality of the concrete class. Note that for lists, Java has added a RandomAccess marker class primarily to distinguish a common case where an algorithm may need to know if get(i) is constant time or not.
For uses of code, Michael above is right that being as generic as possible in the method parameters is often even more important. This is especially true when testing such a method.
You'll find (or have found) that as you return interfaces, they permeate through your code. e.g. you return an interface from method A and you have to then pass an interface to method B.
What you're doing is programming by contract, albeit in a limited fashion.
This gives you enormous scope to change implementations under the covers (provided these new objects fulfill the existing contracts/expected behaviours).
Given all of this, you have benefits in terms of choosing your implementation, and how you can substitute behaviours (including testing - using mocking, for example). In case you hadn't guessed, I'm all in favour of this and try to reduce to (or introduce) interfaces wherever possible.

ImmutableList vs List- what should I cast it as?

I know its commonly accepted to cast all List implementations down to List. Whether it is a variable, method return, or a method parameter using an ArrayList, CopyOnWriteArrayList, etc.
List<Market> mkts = new ArrayList<>();
When I'm using a Guava ImmutableList, I have the sense it can arguably be an exception to this rule (especially if I'm building in-house, complicated business applications and not a public API). Because if I cast it down to list, the deprecated mutator methods will no longer be flagged as deprecated. Also, it no longer is identified as an immutable object which is a very important part of its functionality and identity.
List<Market> mkts = ImmutableList.of(mkt1,mkt2,mkt3);
Therefore it makes sense to pass it around as an ImmutableList right? I could even argue that its a good policy for an internal API to only accept ImmutableList, so mutability and multithreading on the client side won't wreck anything inside the libary.
ImmutableList<Market> mkts = ImmutableList.of(mkt1,mkt2,mkt3);
I know there is a risk of ImmutableList itself becoming deprecated, and the day Oracle decides to create its own ImmutableList will require a lot of refactoring. But is it arguable the pros of maintaining an ImmutableList cast can outweigh the cons?
I agree with your rationale. If you are using the Guava collection library and your lists are immutable then passing them as ImmutableList is a good idea.
However:
I know there is a risk of ImmutableList itself becoming deprecated, and the day Oracle decides to create its own ImmutableList will require a lot of refactoring.
The first scenario seems unlikely, but it is a risk you take whenever you use any 3rd-party library. But the flipside is that you could chose to not upgrade your application's Guava version if they (Google) gratuitously deprecated a class or method that you relied on.
UPDATE
Louis Wasserman (who works for Google) said in a comment:
"Guava provides pretty strong compatibility guarantees for non-#Beta APIs."
So we can discount the possibility of gratuitous API changes.
The second scenario is even more unlikely (IMO). And you can be sure that if Oracle did add an immutable list class or interface, that would not require you to refactor. Oracle tries really hard to avoid breaking existing code when they enhance the standard APIs.
But having said that, it is really up to you to weigh up the pros and cons ... and how you would deal with the cons should the eventuate.
Unfortunately, there's no corresponding interface in Java (and most probably never will be). So my take is to pretend that ImmutableList is an interface. :D But seriously, it add important information which shouldn't get lost.
The ancient rule it all comes from actually states something like "program against interfaces". IIRC at the time the rules was created, there was no Java around and "interface" means programming interface, i.e., the contract, not java interface.
A method like
void strange(ArrayList list) {...}
is obviously strange, as there's no reason not to use List. A signature containing ImmutableList has a good reason.
I know there is a risk of ImmutableList itself becoming deprecated, and the day Oracle decides to create its own ImmutableList will require a lot of refactoring.
You mean Java 18? Let's see, but Guava's ImmutableList is pretty good and there's not much point in designing such a class differently. So you can hope that most changes will be in your imports only. And by 2050 there'll be worse problems than this.
Keep using List rather than ImmutableList! There is no problem with that and no reason for your API to start using ImmutableLists explicitly for several reasons:
ImmutableList is Guava only and unlikely to become standard Java at any point. Don't tie your code and coding habits to a third party library (even if it is a cool one like Guava).
Using immutable objects is good practice in Java and of particular importance when developing an API (see Effective Java Item 15 - minimize mutability). It is a general concept that can be taken for granted and does not need to be conveyed in the name of interfaces. Equally, you would not consider calling a User class that is designed for inheritance UserThatCanBeSubclassed.
In the name of stability your API should NEVER start modifying a List that was passed into it and ALWAYS make a defensive copy when passing a List to a client. Introducing ImmutableList here would lure you and the clients of your API into a false sense of security and entice them to violate that rule.
I understand your dilemma.
Personnaly, I would advise to keep using List as the reference type (to be future-proof and benefit from polymorphism), and use an #Immutable annotation to convey the information that it is immutable.
Annotations are more visible than plain javadoc comments, and you can even use the one from JSR-305 (ex-JCIP).
Some static analysis tools can even detect it and verify that your object is not mutated.
I would rather stay with just List for method parameter. There is no much benefit to enforce the caller to pass ImmutableList - it's your own method and you won't mutate list anyway, but you'd have method more reusable and generic.
As a return type, I would go with ImmutableList to let method users know that this list cannot be modified.

Why did they decide to make interfaces have "Optional Operations"

ImmutableSet implements the Set interface. The functions that don't make sense to an ImmutableSet are now called "Optional Operations" for Set. I assume for situations like this. So ImmutableSet now throws an UnsupportedOperationException for many Optional Operations.
This seems backwards to me. I was taught that an Interface was a contract so that you could use impose functionality across different implementations. The approach of Optional Operations seem to fundamentally change(contradict?) what Interfaces are meant to do. Implementing this today I would have the Set Interface broken into two interfaces: one for one for immutable operations and a second extending those operations for mutators. (Very quick, off the cuff solution)
I understand that technology changes. I'm not saying It should be done one way or another. My question is, does this change reflect a change in some underlying philosophy for Java? Is it just more of a bandaid to make things backwards compatible? Did I have an incomplete understanding of Interfaces?
The Java Collections API Design FAQ answers this question in detail:
Q: Why don't you support immutability directly in the core collection interfaces so that you can do away with optional operations (and UnsupportedOperationException)?
A: This is the most controversial design decision in the whole API. Clearly, static (compile time) type checking is highly desirable, and is the norm in Java. We would have supported it if we believed it were feasible. Unfortunately, attempts to achieve this goal cause an explosion in the size of the interface hierarchy, and do not succeed in eliminating the need for runtime exceptions (though they reduce it substantially).
Doug Lea, who wrote a popular Java collections package that did reflect mutability distinctions in its interface hierarchy, no longer believes it is a viable approach, based on user experience with his collections package. In his words (from personal correspondence) "Much as it pains me to say it, strong static typing does not work for collection interfaces in Java."
To illustrate the problem in gory detail, suppose you want to add the notion of modifiability to the Hierarchy. You need four new interfaces: ModifiableCollection, ModifiableSet, ModifiableList, and ModifiableMap. What was previously a simple hierarchy is now a messy heterarchy. Also, you need a new Iterator interface for use with unmodifiable Collections, that does not contain the remove operation. Now can you do away with UnsupportedOperationException? Unfortunately not.
Consider arrays. They implement most of the List operations, but not remove and add. They are "fixed-size" Lists. If you want to capture this notion in the hierarchy, you have to add two new interfaces: VariableSizeList and VariableSizeMap. You don't have to add VariableSizeCollection and VariableSizeSet, because they'd be identical to ModifiableCollection and ModifiableSet, but you might choose to add them anyway for consistency's sake. Also, you need a new variety of ListIterator that doesn't support the add and remove operations, to go along with unmodifiable List. Now we're up to ten or twelve interfaces, plus two new Iterator interfaces, instead of our original four. Are we done? No.
Consider logs (such as error logs, audit logs and journals for recoverable data objects). They are natural append-only sequences, that support all of the List operations except for remove and set (replace). They require a new core interface, and a new iterator.
And what about immutable Collections, as opposed to unmodifiable ones? (i.e., Collections that cannot be changed by the client AND will never change for any other reason). Many argue that this is the most important distinction of all, because it allows multiple threads to access a collection concurrently without the need for synchronization. Adding this support to the type hierarchy requires four more interfaces.
Now we're up to twenty or so interfaces and five iterators, and it's almost certain that there are still collections arising in practice that don't fit cleanly into any of the interfaces. For example, the collection-views returned by Map are natural delete-only collections. Also, there are collections that will reject certain elements on the basis of their value, so we still haven't done away with runtime exceptions.
When all was said and done, we felt that it was a sound engineering compromise to sidestep the whole issue by providing a very small set of core interfaces that can throw a runtime exception.
In short, having interfaces like Set with optional operations was done to prevent an exponential explosion in the number of different interfaces needed. It is not as simple as just "immutable" and "mutable". Guava's ImmutableSet then had to implement Set to be interoperable with all other code which uses Sets. It's not ideal but there is really no better way to do it.

Why does the Collections class contain standalone (static) methods, instead of them being added to the List interface?

For all the methods in Collections that take a List as their first argument, why aren't those methods simply part of the List interface?
My intuition is: given a List object, that object itself should "know" how to perform on itself operations such as rotate(), shuffle(), or reverse(). But instead, as a Java programmer, I have to review both the methods in the List interface, as well as the static methods "over there" in the Collections class, to ensure I'm using a canonical solution.
Why were some methods placed as static standalone methods in the Collections class, instead of being added to the List interface (and presumably thus implemented by some existing or would-be base class)?
I'm trying to better understand the design decisions behind the Java collections framework.
Is there some compelling OO design principle here that I'm overlooking? Or was this distinction done simply for some practical, performance reason?
The point is that given suitable primitive operations (remove, set etc) a bunch of more high level operations (sort, shuffle, binary search) can be implemented once rather than being implemented by every single list implementation.
Effectively, java.util.Collections is like .NET's Enumerable class - full of general purpose methods which can work on any collection, so that they can share a single implementation and avoid duplication.
Rational Behind the List Interface's Methods
The List interface is a very core part of the Java runtime and is already a little onerous to fully implement all of the members when rolling out your own List implementations. So, adding extra methods that aren't directly related to the definition of a list is a bit extraneous. If you need those methods on a List implementation, why not subclass the interface and then require them?
If you where going to come along say in version 1.3 and add functionality to the List interface by adding new utility methods, you will break all past implementors of the interface.
From a Domain-Driven Design perspective, the utility methods in Collections are not part of the normal domain of a list.
Regarding OO design principals, I think it would be important to make the distinction between application OO design and language runtime OO design.
The authors of Java may do things very differently now that they have hindsight and perspective of many years of usage of the API. That said the C# IList interface is quite similar to Java's and C#'s authors did have the perspective.
It's certainly a judgement call at some level. I think the main trade-off to consider is this: When you add a method to an interface, every implementer of that interface must write code to implement it.
If the semantics of that method are such that different implementations of the interface will best implement those semantics in very different ways, then it's better to put it in the interface. (Of course, if the semantics simply can't be defined in terms of other methods in the interface, then it must be its own method in the interface.)
On the other hand, if the semantics are such that they can be defined in terms of other methods in the interface, and implementers of the interface will just tend to write the same code over and over again, then it's better to make a utility method that takes an instance of the interface as an argument.
They are utility methods and not core List functionality. The List interface would just get bloated if you added every possible operation you could do on a List. And the operations in Collections do not need to know about the internals of a List, they operate on the public interface so can happily live in an external class.
There are two explanations here:
Historical: Collections class was created after List interface. Designers chose to preserve backward compatibility of already existing interface. Otherwise a lot of developers would have to change their code.
Logical: The methods you are talking about do not require internal knowledge on List implementation and can be implemented over ANY collection implementing it.

When should I return the Interface and when the concrete class?

when programming in Java I practically always, just out of habit, write something like this:
public List<String> foo() {
return new ArrayList<String>();
}
Most of the time without even thinking about it. Now, the question is: should I always specify the interface as the return type? Or is it advisable to use the actual implementation of the interface, and if so, under what circumstances?
It is obvious that using the interface has a lot of advantages (that's why it's there). In most cases it doesn't really matter what concrete implementation is used by a library function. But maybe there are cases where it does matter. For instance, if I know that I will primarily access the data in the list randomly, a LinkedList would be bad. But if my library function only returns the interface, I simply don't know. To be on the safe side I might even need to copy the list explicitly over to an ArrayList:
List bar = foo();
List myList = bar instanceof LinkedList ? new ArrayList(bar) : bar;
but that just seems horrible and my coworkers would probably lynch me in the cafeteria. And rightfully so.
What do you guys think? What are your guidelines, when do you tend towards the abstract solution, and when do you reveal details of your implementation for potential performance gains?
Return the appropriate interface to hide implementation details. Your clients should only care about what your object offers, not how you implemented it. If you start with a private ArrayList, and decide later on that something else (e.g., LinkedLisk, skip list, etc.) is more appropriate you can change the implementation without affecting clients if you return the interface. The moment you return a concrete type the opportunity is lost.
For instance, if I know that I will
primarily access the data in the list
randomly, a LinkedList would be bad.
But if my library function only
returns the interface, I simply don't
know. To be on the safe side I might
even need to copy the list explicitly
over to an ArrayList.
As everybody else has mentioned, you just mustn't care about how the library has implemented the functionality, to reduce coupling and increasing maintainability of the library.
If you, as a library client, can demonstrate that the implementation is performing badly for your use case, you can then contact the person in charge and discuss about the best path to follow (a new method for this case or just changing the implementation).
That said, your example reeks of premature optimization.
If the method is or can be critical, it might mention the implementation details in the documentation.
Without being able to justify it with reams of CS quotes (I'm self taught), I've always gone by the mantra of "Accept the least derived, return the most derived," when designing classes and it has stood me well over the years.
I guess that means in terms of interface versus concrete return is that if you are trying to reduce dependencies and/or decouple, returning the interface is generally more useful. However, if the concrete class implements more than that interface, it is usually more useful to the callers of your method to get the concrete class back (i.e. the "most derived") rather than aribtrarily restrict them to a subset of that returned object's functionality - unless you actually need to restrict them. Then again, you could also just increase the coverage of the interface. Needless restrictions like this I compare to thoughtless sealing of classes; you never know. Just to talk a bit about the former part of that mantra (for other readers), accepting the least derived also gives maximum flexibility for callers of your method.
-Oisin
Sorry to disagree, but I think the basic rule is as follows:
For input arguments use the most generic.
For output values, the most specific.
So, in this case you want to declare the implementation as:
public ArrayList<String> foo() {
return new ArrayList<String>();
}
Rationale:
The input case is already known and explained by everyone: use the interface, period. However, the output case can look counter-intuitive.
You want to return the implementation because you want the client to have the most information about what is receiving. In this case, more knowledge is more power.
Example 1: the client wants to get the 5th element:
return Collection: must iterate until 5th element vs return List:
return List: list.get(4)
Example 2: the client wants to remove the 5th element:
return List: must create a new list without the specified element (list.remove() is optional).
return ArrayList: arrayList.remove(4)
So it's a big truth that using interfaces is great because it promotes reusability, reduces coupling, improves maintainability and makes people happy ... but only when used as input.
So, again, the rule can be stated as:
Be flexible for what you offer.
Be informative with what you deliver.
So, next time, please return the implementation.
In OO programming, we want to encapsulate as much as possible the data. Hide as much as possible the actual implementation, abstracting the types as high as possible.
In this context, I would answer only return what is meaningful. Does it makes sense at all for the return value to be the concrete class? Aka in your example, ask yourself: will anyone use a LinkedList-specific method on the return value of foo?
If no, just use the higher-level Interface. It's much more flexible, and allows you to change the backend
If yes, ask yourself: can't I refactor my code to return the higher-level interface? :)
The more abstract is your code, the less changes your are required to do when changing a backend. It's as simple as that.
If, on the other hand, you end up casting the return values to the concrete class, well that's a strong sign that you should probably return instead the concrete class. Your users/teammates should not have to know about more or less implicit contracts: if you need to use the concrete methods, just return the concrete class, for clarity.
In a nutshell: code abstract, but explicitly :)
In general, for a public facing interface such as APIs, returning the interface (such as List) over the concrete implementation (such as ArrayList) would be better.
The use of a ArrayList or LinkedList is an implementation detail of the library that should be considered for the most common use case of that library. And of course, internally, having private methods handing off LinkedLists wouldn't necessarily be a bad thing, if it provides facilities that would make the processing easier.
There is no reason that a concrete class shouldn't be used in the implementation, unless there is a good reason to believe that some other List class would be used later on. But then again, changing the implementation details shouldn't be as painful as long as the public facing portion is well-designed.
The library itself should be a black box to its consumers, so they don't really have to worry about what's going on internally. That also means that the library should be designed so that it is designed to be used in the way it is intended.
It doesn't matter all that much whether an API method returns an interface or a concrete class; despite what everyone here says, you almost never change the implementiation class once the code is written.
What's far more important: always use minimum-scope interfaces for your method parameters! That way, clients have maximal freedom and can use classes your code doesn't even know about.
When an API method returns ArrayList, I have absolutely no qualms with that, but when it demands an ArrayList (or, all to common, Vector) parameter, I consider hunting down the programmer and hurting him, because it means that I can't use Arrays.asList(), Collections.singletonList() or Collections.EMPTY_LIST.
As a rule, I only pass back internal implementations if I am in some private, inner workings of a library, and even so only sparingly. For everything that is public and likely to be called from the outside of my module I use interfaces, and also the Factory pattern.
Using interfaces in such a way has proven to be a very reliable way to write reusable code.
The main question has been answered already and you should always use the interface. I however would just like to comment on
It is obvious that using the interface has a lot of advantages (that's why it's there). In most cases it doesn't really matter what concrete implementation is used by a library function. But maybe there are cases where it does matter. For instance, if I know that I will primarily access the data in the list randomly, a LinkedList would be bad. But if my library function only returns the interface, I simply don't know. To be on the safe side I might even need to copy the list explicitly over to an ArrayList.
If you are returning a data structure that you know has poor random access performance -- O(n) and typically a LOT of data -- there are other interfaces you should be specifying instead of List, like Iterable so that anyone using the library will be fully aware that only sequential access is available.
Picking the right type to return isn't just about interface versus concrete implementation, it is also about selecting the right interface.
You use interface to abstract away from the actual implementation. The interface is basically just a blueprint for what your implementation can do.
Interfaces are good design because they allow you to change implementation details without having to fear that any of its consumers are directly affected, as long as you implementation still does what your interface says it does.
To work with interfaces you would instantiate them like this:
IParser parser = new Parser();
Now IParser would be your interface, and Parser would be your implementation. Now when you work with the parser object from above, you will work against the interface (IParser), which in turn will work against your implementation (Parser).
That means that you can change the inner workings of Parser as much as you want, it will never affect code that works against your IParser parser interface.
In general use the interface in all cases if you have no need of the functionality of the concrete class. Note that for lists, Java has added a RandomAccess marker class primarily to distinguish a common case where an algorithm may need to know if get(i) is constant time or not.
For uses of code, Michael above is right that being as generic as possible in the method parameters is often even more important. This is especially true when testing such a method.
You'll find (or have found) that as you return interfaces, they permeate through your code. e.g. you return an interface from method A and you have to then pass an interface to method B.
What you're doing is programming by contract, albeit in a limited fashion.
This gives you enormous scope to change implementations under the covers (provided these new objects fulfill the existing contracts/expected behaviours).
Given all of this, you have benefits in terms of choosing your implementation, and how you can substitute behaviours (including testing - using mocking, for example). In case you hadn't guessed, I'm all in favour of this and try to reduce to (or introduce) interfaces wherever possible.

Categories