What is the difference when creating these two objects?
Queue<String> test = new LinkedList<String>();
and
List<String> test2 = new LinkedList<String>();
What are the actual differences between test and test2? Are both of them LinkedLists? Are there performance differences or reasons to use one over the other?
The two statements you've written each construct a LinkedList<String> object to hold a list of strings, then assign it to a variable. The difference is in the type of the variable.
By assigning the LinkedList<String> to a variable of type Queue<String>, you can only access the methods in the LinkedList that are available in the Queue<String> interface, which includes support for enqueuing and dequeuing elements. This would be useful if you needed to write a program that used a queue for various operations and wanted to implement that queue by using a linked list.
By assigning the LinkedList<String> to a variable of type List<String>, you can only access the methods in the LinkedList that are available in the List<String> interface, which are normal operations for maintaining a sequence of elements. This would be useful, for example, if you needed to process a list of elements that could grow and shrink anywhere.
In short, the two lines create the same object but intend to use them in different ways. One says that it needs a queue backed by a linked list, while the other says that it needs a general sequence of elements backed by a linked list.
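To make the point about visible methods concrete, here is a minimal sketch. The class and method names (QueueVsListDemo, pollFromQueue, getFromList) are made up for illustration; only the two declarations come from the question.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

class QueueVsListDemo {
    // The variable type is Queue, so only queue operations are visible.
    static String pollFromQueue() {
        Queue<String> test = new LinkedList<String>();
        test.offer("first");
        test.offer("second");
        // test.get(1);   // would not compile: get(int) is not declared on Queue
        return test.poll(); // removes and returns the head of the queue
    }

    // The variable type is List, so positional operations are visible.
    static String getFromList() {
        List<String> test2 = new LinkedList<String>();
        test2.add("first");
        test2.add("second");
        // test2.poll();  // would not compile: poll() is not declared on List
        return test2.get(1); // positional access
    }

    public static void main(String[] args) {
        System.out.println(pollFromQueue()); // first
        System.out.println(getFromList());   // second
    }
}
```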
Hope this helps!
In both cases, you are instantiating a LinkedList.
The difference is the types of the variables you use to refer to those instances.
test is of type Queue and test2 is of type List. Depending on the type of the variable, you can only invoke the methods that are specified on that particular type. I think this is what matters for your situation.
Performance-wise, it's going to be the same, because the actual implementation that you are using in both the cases is same (LinkedList).
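A quick way to convince yourself that the object is the same kind of thing in both cases is to ask it for its runtime class. This is only a sketch; SameRuntimeClassDemo and runtimeClassOf are hypothetical names.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

class SameRuntimeClassDemo {
    // Reports the class of the object itself, regardless of the variable's declared type.
    static String runtimeClassOf(Object o) {
        return o.getClass().getName();
    }

    public static void main(String[] args) {
        Queue<String> test = new LinkedList<String>();
        List<String> test2 = new LinkedList<String>();
        System.out.println(runtimeClassOf(test));  // java.util.LinkedList
        System.out.println(runtimeClassOf(test2)); // java.util.LinkedList
    }
}
```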
I feel both of them are pretty much the same, except for the set of methods you expose. LinkedList implements both interfaces, so choosing one of them gives you access to the methods of that interface type.
Please take a look at these links for the interface method declarations:
http://docs.oracle.com/javase/6/docs/api/java/util/Queue.html
http://docs.oracle.com/javase/6/docs/api/java/util/List.html
I am not sure about the performance, though I guess it shouldn't be different, as the object implementation is the same.
I have often seen declarations like List<String> list = new ArrayList<>(); or Set<String> set = new HashSet<>(); for fields in classes. For me it makes perfect sense to use the interfaces for the variable types to provide flexibility in the implementation. The examples above do still define which kind of Collections have to be used, respectively which operations are allowed and how it should behave in some cases (due to docs).
Now consider the case where actually only the functionality of the Collection (or even the Iterable) interface is required to use the field in the class and the kind of Collection doesn't actually matter or I don't want to overspecify it. So I choose for example HashSet as implementation and declare the field as Collection<String> collection = new HashSet<>();.
Should the field then actually be of type Set in this case? Is this kind of declaration bad practice, and if so, why? Or is it good practice to specify the actual type as little as possible (while still providing all required methods)? The reason I ask is that I have hardly ever seen such a declaration, and lately I find myself more and more often in a situation where I only need the functionality of the Collection interface.
Example:
// Only need Collection features, but decided to use a LinkedList
private final Collection<Listener> registeredListeners = new LinkedList<>();
public void init() {
ExampleListener listener = new ExampleListener();
registerListenerSomewhere(listener);
registeredListeners.add(listener);
listener = new ExampleListener();
registerListenerSomewhere(listener);
registeredListeners.add(listener);
}
public void reset() {
for (Listener listener : registeredListeners) {
unregisterListenerSomewhere(listener);
}
registeredListeners.clear();
}
Since your example uses a private field it doesn't matter all that much about hiding the implementation type. You (or whoever is maintaining this class) can always just go look at the field's initializer to see what it is.
Depending on how it's used, though, it might be worth declaring a more specific interface for the field. Declaring it to be a List indicates that duplicates are allowed and that ordering is significant. Declaring it to be a Set indicates that duplicates aren't allowed and that ordering is not significant. You might even declare the field to have a particular implementation class if there's something about it that's significant. For example, declaring it to be LinkedHashSet indicates that duplicates aren't allowed but that ordering is significant.
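As a rough illustration of what each declared type signals, the sketch below feeds the same input into the three kinds of collection mentioned above. The class name, the method names and the sample data are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class FieldTypeSemanticsDemo {
    static final List<String> INPUT = Arrays.asList("b", "a", "b");

    // List: duplicates are kept and insertion order is preserved.
    static List<String> keepAll() {
        return new ArrayList<>(INPUT);
    }

    // Set: duplicates are dropped; iteration order is unspecified.
    static Set<String> keepUnique() {
        return new HashSet<>(INPUT);
    }

    // LinkedHashSet: duplicates are dropped and insertion order is preserved.
    static LinkedHashSet<String> keepUniqueInOrder() {
        return new LinkedHashSet<>(INPUT);
    }

    public static void main(String[] args) {
        System.out.println(keepAll());           // [b, a, b]
        System.out.println(keepUnique().size()); // 2
        System.out.println(keepUniqueInOrder()); // [b, a]
    }
}
```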
The choice of whether to use an interface, and which interface to use, becomes much more significant if the type appears in the public API of the class, and it depends on what the compatibility constraints on this class are. For example, suppose there were a method
public ??? getRegisteredListeners() {
return ...
}
Now the choice of return type affects other classes. If you can change all the callers, maybe it's no big deal; you just have to edit other files. But suppose the caller is an application that you have no control over. Now the choice of interface is critical, as you can't change it without potentially breaking those applications. The rule here is usually to choose the most abstract interface that supports the operations you expect callers to want to perform.
Most of the Java SE APIs return Collection. This provides a fair degree of abstraction from the underlying implementation, but it also provides the caller a reasonable set of operations. The caller can iterate, get the size, do a contains check, or copy all the elements to another collection.
Some code bases use Iterable as the most-abstract interface to return. All it does is allow the caller to iterate. Sometimes this is all that's necessary, but it might be somewhat limiting compared to Collection.
Another alternative is to return a Stream. This is helpful if you think the caller might want to use stream operations (such as filter, map, findFirst, etc.) instead of iterating or using collection operations.
Note that if you choose to return Collection or Iterable, you need to make sure that you return an unmodifiable view or make a defensive copy. Otherwise, callers could modify your class's internal data, which would probably lead to bugs. (Yes, even an Iterable can permit modification! Consider getting an Iterator and then calling the remove() method.) If you return a Stream, you don't need to worry about that, since you can't use a Stream to modify the underlying source.
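Here is a small sketch of those options side by side. ListenerRegistry and its methods are hypothetical, and plain strings stand in for the Listener type from the question; the point is only to show the leak through Iterator.remove() and the ways to guard against it.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

class ListenerRegistry {
    // Strings stand in for listeners to keep the sketch self-contained.
    private final List<String> registeredListeners = new ArrayList<>();

    void register(String name) { registeredListeners.add(name); }

    int size() { return registeredListeners.size(); }

    // Leaky: hands out the internal list, so Iterator.remove() changes our state.
    Iterable<String> leakyListeners() { return registeredListeners; }

    // Read-only view: mutating calls throw UnsupportedOperationException.
    Collection<String> getRegisteredListeners() {
        return Collections.unmodifiableCollection(registeredListeners);
    }

    // Defensive copy: the caller may change the copy without affecting us.
    Collection<String> copyOfListeners() { return new ArrayList<>(registeredListeners); }

    // Stream: offers no operation that modifies the source.
    Stream<String> listeners() { return registeredListeners.stream(); }

    public static void main(String[] args) {
        ListenerRegistry registry = new ListenerRegistry();
        registry.register("a");
        registry.register("b");

        Iterator<String> it = registry.leakyListeners().iterator();
        it.next();
        it.remove(); // removes "a" from the registry's internal list
        System.out.println(registry.size()); // 1

        try {
            registry.getRegisteredListeners().clear();
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only view rejected clear()");
        }
        System.out.println(registry.listeners().count()); // 1
    }
}
```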
Note that I turned your question about the declaration of a field into a question about the declaration of method return types. There is this idea of "program to the interface" that's quite prevalent in Java. In my opinion it doesn't matter very much for local variables (which is why it's usually fine to use var), and it matters little for private fields, since those (almost) by definition affect only the class in which they're declared. However, the "program to the interface" principle is very important for API signatures, so those cases are where you really need to think about interface types. Private fields, not so much.
(One final note: there is a case where you need to be concerned about the types of private fields, and that's when you're using a reflective framework that manipulates private fields directly. In that case, you need to think of those fields as being public -- just like method return types -- even though they're not declared public.)
As with all things, it's a question of tradeoffs. There are two opposing forces.
The more generic the type, the more freedom the implementation has. If you use Collection you're free to use an ArrayList, HashSet, or LinkedList without affecting the user/caller.
The more generic the return type, the fewer features there are available to the user/caller. A List provides index-based lookup. A SortedSet makes it easy to get contiguous subsets via headSet, tailSet, and subSet. A NavigableSet provides efficient O(log n) binary search lookup methods. If you return Collection, none of these are available. Only the most generic access functions can be used.
Furthermore, the sub-types guarantee special properties that Collection does not: Sets hold unique items. SortedSets are sorted. Lists have an order; they're not unordered bags of items. If you use Collection then the user/caller can't necessarily assume that these properties hold. They may be forced to code defensively and, for instance, handle duplicate items even if you know there won't be duplicates.
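A short sketch of what the caller gains from a more specific return type. ReturnTypeTradeoffDemo, scores() and the sample numbers are invented for the example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.NavigableSet;
import java.util.SortedSet;
import java.util.TreeSet;

class ReturnTypeTradeoffDemo {
    // Returning NavigableSet promises sorted, unique elements and range queries.
    static NavigableSet<Integer> scores() {
        return new TreeSet<>(Arrays.asList(40, 10, 30, 20));
    }

    public static void main(String[] args) {
        NavigableSet<Integer> s = scores();
        SortedSet<Integer> below30 = s.headSet(30); // elements strictly less than 30
        System.out.println(below30);    // [10, 20]
        System.out.println(s.floor(25)); // 20 (greatest element <= 25)

        // The same object seen as a Collection loses those operations.
        Collection<Integer> generic = s;
        // generic.headSet(30);  // would not compile
        // generic.floor(25);    // would not compile
        System.out.println(generic.size()); // 4
    }
}
```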
A reasonable decision process might be:
If O(1) indexed access is guaranteed, use List.
If elements are sorted and unique, use SortedSet or NavigableSet.
If element uniqueness is guaranteed and order is not, use Set.
Otherwise, use Collection.
It really depends on what you want to do with the collection object.
Collection<String> cSet = new HashSet<>();
Collection<String> cList = new ArrayList<>();
In this case you can, if you want, do:
cSet = cList;
But if you declare cSet as:
Set<String> cSet = new HashSet<>();
then that assignment is not permitted (a Collection<String> cannot be assigned to a Set<String> variable), though you can still construct a new list from a set using the copy constructor:
Set<String> set = new HashSet<>();
List<String> list = new ArrayList<>();
list = new ArrayList<>(set);
So basically depending on the usage you can use Collection or Set interface.
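Put together as something you can compile, this might look like the sketch below. CollectionAssignmentDemo and its method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CollectionAssignmentDemo {
    // Both variables are Collection<String>, so one can be assigned to the other.
    static Collection<String> reassign() {
        Collection<String> cSet = new HashSet<>();
        Collection<String> cList = new ArrayList<>();
        cList.add("x");
        cSet = cList; // allowed; cSet now refers to the ArrayList
        // Set<String> s = cList;  // would not compile
        return cSet;
    }

    // Converting between kinds of collection means copying into a new object.
    static List<String> copyToList(Set<String> set) {
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(reassign()); // [x]
        Set<String> set = new HashSet<>();
        set.add("y");
        System.out.println(copyToList(set)); // [y]
    }
}
```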
I know that an instance of ArrayList can be declared in the two following ways:
ArrayList<String> list = new ArrayList<String>();
and
List<String> list = new ArrayList<String>();
I know that using the latter declaration provides the flexibility of changing the implementation from one List implementation to another (e.g., from ArrayList to LinkedList).
But, what is the difference in the time and space complexity in the two? Someone told me the former declaration will ultimately make the heap memory run out. Why does this happen?
Edit: While performing basic operations like add, remove and contains does the performance differ in the two implementations?
The space complexity of your data structure and the time complexity of different operations on it are all implementation specific. For both of the options you listed above, you're using the same data structure implementation i.e. ArrayList<String>. What type you declare them as on the left side of the equal sign doesn't affect complexity. As you said, being more general with type on the left of the equal sign just allows for swapping out of implementations.
What determines the behaviour of an object is its actual class/implementation. In your example, the list is still an ArrayList, so the behaviour won't change.
Using a List<> declaration instead of an ArrayList<> means that you will only use the methods made visible by the List interface and that if later you need another type of list, it will be easy to change it (you just change the call to new). This is why we often prefer it.
Example: you first use an ArrayList but then find out that you often need to delete elements in the middle of the list. You would thus consider switching to a LinkedList. If you used the List interface everywhere (in getter/setter, etc.), then the only change in your code will be:
List<String> list = new LinkedList<>();
but if you used ArrayList, then you will need to refactor your getter/setter signatures, the potential public methods, etc.
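A minimal sketch of that situation, with hypothetical names (SwapImplementationDemo, newList, removeMiddle): the code that uses the list is written against List, so only the factory line knows the concrete class.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class SwapImplementationDemo {
    // The only line that names the implementation.
    static List<String> newList() {
        return new LinkedList<>(); // previously: new ArrayList<>()
    }

    // Written against List, so it is unaffected by the switch.
    static String removeMiddle(List<String> list) {
        return list.remove(list.size() / 2);
    }

    public static void main(String[] args) {
        List<String> list = newList();
        list.add("a");
        list.add("b");
        list.add("c");
        System.out.println(removeMiddle(list)); // b

        // The same method also accepts an ArrayList without any change.
        List<String> other = new ArrayList<>(list);
        System.out.println(removeMiddle(other)); // c
    }
}
```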
I don't understand difference between:
ArrayList<Integer> list = new ArrayList<Integer>();
Collection<Integer> list1 = new ArrayList<Integer>();
ArrayList extends a class that implements the Collection interface, so ArrayList implements Collection too. Maybe list1 allows us to use static methods from the Collection interface?
An interface has no static methods [in Java 7]. list1 lets you access only the methods in Collection, whereas list lets you access all the methods in ArrayList.
It is preferable to declare a variable with its least specific possible type. So, for example, if you change ArrayList into LinkedList or HashSet for any reason, you don't have to refactor large portions of the code (for example, client classes).
Imagine you have something like this (just for illustrative purposes, not compilable):
class CustomerProvider {
public LinkedList<Customer> getAllCustomersInCity(City city) {
// retrieve and return all customers for that city
}
}
and you later decide to implement it returning a HashSet. Maybe there is some client class that relies on the fact that you return a LinkedList, and calls methods that HashSet doesn't have (e.g. LinkedList.getFirst()).
That's why it's better to do it like this:
class CustomerProvider {
public Collection<Customer> getAllCustomersInCity(City city) {
// retrieve and return all customers for that city
}
}
What we're dealing with here is the difference between interface and implementation.
An interface is a set of methods without any regard to how those methods are implemented. When we declare a variable with a type that is actually an interface, what we're saying is that it refers to an object that implements all of the methods in that interface... but the variable doesn't provide us with access to any of the other methods in the class that actually provides those implementations.
When you declare a variable with the type of an implementing class, you have access to all of the relevant methods of that class. Since that class implements an interface, you have access to the methods specified in the interface, plus any extras provided by the implementing class.
Why would you want to do this? Well, by restricting the type of your object to the interface, you can switch in new implementations without worrying about changing the rest of your code. This makes it a whole lot more flexible.
The difference, as others have said, is that you are limited to the methods defined by the Collection interface when you specify that as your variable type. But that doesn't answer the question of why you would want to do this.
The reason is that the choice of data type provides information to the people using the code. Especially when used as the parameter or return type from a function (where outside programmers may have no access to the internals).
In order of specificity, here is what different type choices might tell you:
Collection - a group of objects, with no further guarantees. The consumer of this object can iterate over the collection (with no guarantees as to iteration order), and can learn its size or check whether it contains an element, but cannot rely on ordering or indexed access.
List - a group of objects that have a specific order. When you iterate over these objects, you will always get them in the same order. You can also retrieve specific items from the collection by index, but you cannot make any assumptions about the performance of such retrieval.
ArrayList - a group of objects that have a specific order, and may be accessed by index in constant time.
And although you didn't ask about them, here are some other collection classes:
Set - a group of objects that is guaranteed to contain no duplicates per the equals() method. There are no guarantees regarding the iteration order of these objects.
SortedSet - a group of objects that contains no duplicates, and will always iterate in a specific order (although that specific order is not guaranteed by the collection).
TreeSet - a group of ordered objects with no duplicates, that exhibits O(logN) insert and retrieval times.
HashSet - a group of objects with no duplicates, that does not have an inherent order, but provides (amortized) constant-time access.
The only difference is that you're providing access to list1 through the Collection interface, whereas you provide access to list through the ArrayList class. Sometimes, providing access through a restricted interface is useful, in that it promotes encapsulation and reduces dependence on implementation details.
When you perform operations on "list1", you'll only be able to access methods from the Collection interface (add, size, etc.). By declaring "list" as an ArrayList, you gain access to additional methods only defined in the ArrayList class (ensureCapacity and trimToSize, for example).
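For instance, fetching the first element looks different depending on the declared type. This is only a sketch; CollectionVsArrayListDemo and its two methods are hypothetical names, and both methods assume a non-empty argument.

```java
import java.util.ArrayList;
import java.util.Collection;

class CollectionVsArrayListDemo {
    // Collection has no get(int), so we go through the iterator.
    // Throws NoSuchElementException if the collection is empty.
    static Integer firstViaCollection(Collection<Integer> list1) {
        return list1.iterator().next();
    }

    // ArrayList exposes positional access plus ArrayList-only methods.
    // Throws IndexOutOfBoundsException if the list is empty.
    static Integer firstViaArrayList(ArrayList<Integer> list) {
        list.trimToSize(); // defined on ArrayList, not on Collection or List
        return list.get(0);
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<Integer>();
        list.add(7);
        list.add(8);
        Collection<Integer> list1 = list; // same object, narrower view
        System.out.println(firstViaCollection(list1)); // 7
        System.out.println(firstViaArrayList(list));   // 7
    }
}
```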
It's typically best practice to declare the variable as the least specific class you need. So, if you only need the methods from Collection, use it. Typically in this case, that would mean using List, which lets you know it's ordered and can handle duplicates.
Using the least specific class/interface allows you to freely change the implementation later. For example, if you later learn that a LinkedList would be a better implementation to use, you could change it without breaking all your code if you define the variable to be a List.
I have a scenario where I have to work with multiple lists of data in a java app...
Now each list can have any number of elements in it... Also, the number of such lists is also not known initially...
Which approach will suit my scenario best? I can think of arraylist of list, or list of list or list of arraylist etc(ie combinations of arraylist + list/ arraylist+arraylist/list+list)... what I would like to know is--
(1) Which of the above (or your own solution) will be easiest to manage- viz to store/fetch data
(2) Which of the above will use the least amount of memory?
I would declare my variable as:
List<List<DataType>> lists = new ArrayList<List<DataType>>();
There may be a slight time penalty in accessing list methods through a variable of an interface type, but this, I think, is more than balanced by the flexibility you have of changing the type as you see fit. (For instance, if you decided to make lists immutable, you could do that through one of the methods in java.util.Collections, but not if you had declared it to be an ArrayList<List<DataType>>.)
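A sketch of how that could look, using Integer as a stand-in for DataType. ListOfListsDemo, build and freeze are made-up names. Note that Collections.unmodifiableList only protects the outer list; the inner lists stay modifiable unless you wrap them too.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ListOfListsDemo {
    // Builds a list of lists whose sizes are not known in advance.
    static List<List<Integer>> build() {
        List<List<Integer>> lists = new ArrayList<List<Integer>>();
        for (int i = 0; i < 3; i++) {
            List<Integer> inner = new ArrayList<Integer>();
            for (int j = 0; j <= i; j++) {
                inner.add(j);
            }
            lists.add(inner);
        }
        return lists;
    }

    // Possible because the variable type is List, not ArrayList:
    // the wrapper returned here is a List but not an ArrayList.
    static List<List<Integer>> freeze(List<List<Integer>> lists) {
        return Collections.unmodifiableList(lists);
    }

    public static void main(String[] args) {
        List<List<Integer>> lists = build();
        System.out.println(lists); // [[0], [0, 1], [0, 1, 2]]
        lists = freeze(lists);
        try {
            lists.add(new ArrayList<Integer>());
        } catch (UnsupportedOperationException e) {
            System.out.println("outer list is read-only");
        }
    }
}
```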
Note that lists will have to hold instances of some concrete class that implements List<DataType>, since (as others have noted) List is an interface, not a class.
List is an interface. ArrayList is one implementation of List.
When you construct a List you must choose a specific concrete type (e.g. ArrayList). When you use the list it is better to program against the interface if possible. This prevents tight coupling between your code and the specific List implementation, allowing you to more easily change to another List implementation later if you wish.
If you know a way to identify which list you will be dealing with, use a map of lists.
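Map<String, List<?>> listsByKey = new HashMap<String, List<?>>(); // variable name added for illustration
This way you avoid having to loop through the outer elements to reach the actual list. Looking a list up by key in a HashMap takes constant time on average, whereas finding it by iterating over an outer list takes time proportional to the number of lists.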
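A minimal sketch of the map-of-lists idea, assuming string keys and string elements. MapOfListsDemo, add and get are hypothetical names; computeIfAbsent and getOrDefault require Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapOfListsDemo {
    private final Map<String, List<String>> listsByKey = new HashMap<>();

    // Creates the list for a key on first use, then appends to it.
    void add(String key, String value) {
        listsByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Direct lookup by key; returns an empty list for unknown keys.
    List<String> get(String key) {
        return listsByKey.getOrDefault(key, Collections.emptyList());
    }

    public static void main(String[] args) {
        MapOfListsDemo data = new MapOfListsDemo();
        data.add("fruit", "apple");
        data.add("fruit", "pear");
        data.add("veg", "kale");
        System.out.println(data.get("fruit"));   // [apple, pear]
        System.out.println(data.get("missing")); // []
    }
}
```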
For example:
List<String> list = new ArrayList<String>();
vs
ArrayList<String> list = new ArrayList<String>();
What is the exact difference between these two?
When should we use the first one and when should we use the second?
Use the first form whenever possible (I would even say: use Collection if sufficient). This is especially important when accepting input from client code (method arguments). Sometimes, for the convenience of the client code/library user it is better to accept the most generic input you can (like Collection) and deal with it rather than forcing the user to convert arguments all the time (user has LinkedList but the API requires ArrayList - terrible).
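As a sketch of the "accept the most generic input" advice: the hypothetical totalLength method below only needs to iterate, so it takes a Collection and callers can pass whatever they already have.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedList;

class GenericParameterDemo {
    // Only iteration is needed, so Collection is a sufficient parameter type.
    static int totalLength(Collection<String> words) {
        int total = 0;
        for (String w : words) {
            total += w.length();
        }
        return total;
    }

    public static void main(String[] args) {
        // No conversion needed on the caller's side.
        System.out.println(totalLength(new LinkedList<String>(Arrays.asList("ab", "cde")))); // 5
        System.out.println(totalLength(new HashSet<String>(Arrays.asList("ab", "cde"))));    // 5
    }
}
```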
Use the second form only when you need to invoke methods on the list variable that are defined in ArrayList but not in List (like ArrayList.trimToSize()). Also, when returning data to the user, consider returning more specific types (though this is not a hard rule). E.g. consider List over Collection so the client code can deal with the result more easily. However! Returning types that are too specific (e.g. ArrayList) will lock in your implementation for the future, so try to find a compromise.
This is a general rule - use the most general type you can. Even more general: use common sense.
List is not a superclass, it is an interface.
By using List rather than ArrayList, you make sure that users of your list will only use methods that are defined on List. Meaning that you can change the implementation to (for example) Vector, without breaking the existing code.
So, use the first form.
The first form is the most desirable one because you hide the implementation (ArrayList) from the rest of your code and ensure your code only works with the abstraction (List). The advantage of this is that your code will be more generic and therefore easier to adapt, for example when you change from using an ArrayList to a LinkedList, a Vector or your own List implementation. It also means local changes are less likely to cause changes in other parts of your code ('ripple effect'), increasing your code's maintainability.
You need the second form when you want to do things with your variable that are not offered by the List interface, for example ensureCapacity or trimToSize.
EDIT: extra explanation of changing the implementation
Here is an example of declaring a variable as a Collection (an even more generic interface in java.util):
public class Example {
private Collection<String> greetings = new ArrayList<String>();
public void addGreeting(String greeting) {
greetings.add(greeting);
}
}
Now suppose you want to change the implementation in order to store unique greetings, and therefore switch from ArrayList to HashSet. Both are implementations of the Collection interface. This would be easy in this case because all the existing code treats the greetings field as a Collection:
public class Example {
private Collection<String> greetings = new HashSet<String>();
public void addGreeting(String greeting) {
greetings.add(greeting);
}
}
There is an exception. If there is code which casts the greetings field back to its implementation, this makes that code 'implementation-aware', violating the information-hiding you tried to achieve, for example:
ArrayList<String> greetingList = (ArrayList<String>) greetings;
greetingList.ensureCapacity(42);
Such code would cause a runtime error 'java.lang.ClassCastException: java.util.HashSet incompatible with java.util.ArrayList' if you change the implementation to HashSet, so this practice should be avoided if possible.
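The failure is easy to reproduce. In the sketch below, CastBackDemo and castToArrayListFails are made-up names; the cast and the ensureCapacity call are the ones from the snippet above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;

class CastBackDemo {
    // Returns true if casting the collection back to ArrayList blows up.
    static boolean castToArrayListFails(Collection<String> greetings) {
        try {
            ArrayList<String> greetingList = (ArrayList<String>) greetings;
            greetingList.ensureCapacity(42);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(castToArrayListFails(new ArrayList<String>())); // false
        System.out.println(castToArrayListFails(new HashSet<String>()));   // true
    }
}
```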
There are some advantages of using interfaces against concrete classes:
You are not tied to a concrete implementation (you can easily change it without modifying the rest of the code).
Your code is clearer, as no methods of the concrete class are available.
You need the concrete implementation only if you USE some features of it.
E.g. we have a Matrix interface and two concrete implementations, SparseMatrix and FullMatrix. If you want to multiply them efficiently, you CAN use some implementation details of SparseMatrix; otherwise performance MAY be too slow.