In this answer, I tried to explain why the Collection method add has the signature add(E) while remove is remove(Object). I came up with the idea that the correct signature should be
public boolean remove(? super E element)
And since this is invalid syntax in Java, they had to stick to Object, which just happens to be super E (supertype of E) for any E. The following code explains why this makes sense:
List<String> strings = new ArrayList<>();
strings.add("abc");
Object o = "abc"; // runtime type is String
strings.remove(o);
Since the runtime type is String, this succeeds. If the signature were remove(E), this would cause an error at compile time even though it would work at runtime, which makes no sense. However, the following should raise an error at compile time, because the types known at compile time already guarantee that the operation cannot succeed:
strings.remove(Integer.valueOf(1));
This call passes an Integer as the argument, which is not a supertype of String, so it could never actually remove anything from the collection. (With a plain int literal, strings.remove(1) would instead select the List.remove(int index) overload.)
If the remove method was defined with the parameter type ? super E, situations like the above could be detected by the compiler.
Question:
Am I correct with my theory that remove should have a contravariant ? super E parameter instead of Object, so that type mismatches as shown in the above example can be filtered out by the compiler? And is it correct that the creators of the Java Collections Framework chose to use Object instead of ? super E because ? super E would cause a syntax error, and instead of complicating the generic system they simply agreed to use Object instead of super?
Also, should the signature for removeAll be
public boolean removeAll(Collection<? super E> collection)
Note that I do not want to know why the signature is not remove(E), which is asked and explained in this question. I want to know if remove should be contravariant (remove(? super E)), while remove(E) represents covariance.
One example where this does not work would be the following:
List<Number> nums = new ArrayList<>();
nums.add(1);
nums.remove(Integer.valueOf(1)); // would fail to compile here - Integer is not a supertype of Number
Rethinking my signature, it should actually allow sub- and supertypes of E.
This is a faulty assumption:
because the operation is bound to fail because of its types, which are known at compile-time
It's the same reason that .equals accepts an Object: objects don't necessarily need to have the same class in order to be equal. Consider this example with different subtypes of List, as pointed out in the question @Joe linked:
List<ArrayList<?>> arrayLists = new ArrayList<>();
arrayLists.add(new ArrayList<>());
LinkedList<?> emptyLinkedList = new LinkedList<>();
arrayLists.remove(emptyLinkedList); // removes the empty ArrayList and returns true
This would not be possible with the signature you proposed.
remove(? super E) is entirely equivalent to remove(Object), because Object is itself a supertype of E, and all objects extend Object.
I think that designers of the collections framework made a decision to keep remove untyped, because it is a valid solution that lets you keep a post-condition without introducing a pre-condition or compromising type safety.
The post-condition of c.remove(x) is that after the call x is not present in c. The signature remove(Object) lets you pass any object or null, with no further checks. The signature remove(? super E), on the other hand, introduces a pre-condition on the type of x, requiring it to be related to E.
Each pre-condition that you introduce in an API makes your API harder to use. If removing a pre-condition lets you keep all your post-conditions, it is a good idea to remove the pre-condition.
Note that removing an object of a wrong type is not necessarily an error. Here is a small example:
class Segregator {
private final Set<Integer> ints = ...
private final Set<String> strings = ...
public void addAll(List<Object> data) {
for (Object o : data) {
if (o instanceof Integer) {
ints.add((Integer)o);
}
if (o instanceof String) {
strings.add((String)o);
}
}
}
// Here is the method that becomes easier to write:
public void removeAll(List<Object> data) {
for (Object o : data) {
ints.remove(o);
strings.remove(o);
}
}
}
Note how removeAll method's code is simpler than the code of addAll, because remove does not care about the type of the object that you pass to it.
In your question you already explained why it can't (or shouldn't) be remove(E).
But there is also a reason why it shouldn't be remove(? super E). Imagine some piece of code where you have an object of unknown type. You still might want to try to remove that object from that list. Consider this code:
public void removeFromList(Object o, Collection<String> col) {
col.remove(o);
}
Now your argument was that remove(? super E) is the more type-safe approach. But I say it doesn't have to be. Look at the Javadoc of remove(). It says:
More formally, removes an element e such that (o==null ? e==null : o.equals(e)), if this collection contains one or more such elements.
So the only precondition on the parameter is that you can use == and equals() on it, which is the case for any Object. This still lets you try to remove an Integer from a Collection<String>; it just wouldn't do anything.
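To make that concrete, here is a minimal sketch built around the removeFromList idea above. The class name RemoveAnyObjectDemo and the boolean return value are my own additions for illustration, not part of the original snippet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;

class RemoveAnyObjectDemo {
    // Same idea as removeFromList above, but it also reports what Collection.remove returned.
    static boolean removeFromList(Object o, Collection<String> col) {
        return col.remove(o);
    }

    public static void main(String[] args) {
        Collection<String> col = new ArrayList<>(Arrays.asList("abc", "def"));

        Object unknown = "abc"; // static type Object, runtime type String
        System.out.println(removeFromList(unknown, col));            // an equal element exists
        System.out.println(removeFromList(Integer.valueOf(1), col)); // no String equals an Integer
        System.out.println(col);
    }
}
```

The first call removes "abc" and returns true; the second call compiles just as well, returns false, and leaves the collection untouched.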
Why isn't Collection.remove(Object o) generic?
Seems like Collection<E> could have boolean remove(E o);
Then, when you accidentally try to remove (for example) Set<String> instead of each individual String from a Collection<String>, it would be a compile time error instead of a debugging problem later.
remove() (in Map as well as in Collection) is not generic because you should be able to pass in any type of object to remove(). The object removed does not have to be the same type as the object that you pass in to remove(); it only requires that they be equal. From the specification of remove(), remove(o) removes the object e such that (o==null ? e==null : o.equals(e)) is true. Note that there is nothing requiring o and e to be the same type. This follows from the fact that the equals() method takes in an Object as parameter, not just the same type as the object.
Although, it may be commonly true that many classes have equals() defined so that its objects can only be equal to objects of its own class, that is certainly not always the case. For example, the specification for List.equals() says that two List objects are equal if they are both Lists and have the same contents, even if they are different implementations of List. So coming back to the example in this question, it is possible to have a Map<ArrayList, Something> and for me to call remove() with a LinkedList as argument, and it should remove the key which is a list with the same contents. This would not be possible if remove() were generic and restricted its argument type.
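A small sketch of that Map scenario, assuming a HashMap with String elements and values (the class and method names are made up for the example):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

class ListKeyRemovalDemo {
    // Builds a map keyed by an ArrayList and removes the entry using an equal LinkedList.
    static String removeWithLinkedListKey() {
        Map<ArrayList<String>, String> map = new HashMap<>();
        map.put(new ArrayList<>(Arrays.asList("a", "b")), "value");

        List<String> probe = new LinkedList<>(Arrays.asList("a", "b"));
        // List.equals and List.hashCode are specified in terms of the contents,
        // so the LinkedList probe matches the ArrayList key.
        return map.remove(probe);
    }

    public static void main(String[] args) {
        System.out.println(removeWithLinkedListKey());
    }
}
```

Because remove takes an Object, the LinkedList can be passed directly and the call returns the mapped value. With a parameter of type K (here ArrayList<String>) the same call would be rejected by the compiler.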
Josh Bloch and Bill Pugh refer to this issue in Java Puzzlers IV: The
Phantom Reference Menace, Attack of the Clone, and Revenge of The
Shift.
Josh Bloch says (6:41) that they attempted to generify the get method
of Map, the remove method, and some others, but "it simply didn't work".
There are too many reasonable programs that could not be generified if
you only allow the generic type of the collection as parameter type.
The example given by him is an intersection of a List of Numbers and a
List of Longs.
Because if your type parameter is a wildcard, you can't use a generic remove method.
I seem to recall running into this question with Map's get(Object) method. The get method in this case isn't generic, though it should reasonably expect to be passed an object of the same type as the first type parameter. I realized that if you're passing around Maps with a wildcard as the first type parameter, then there's no way to get an element out of the Map with that method, if that argument was generic. Wildcard arguments can't really be satisfied, because the compiler can't guarantee that the type is correct. I speculate that the reason add is generic is that you're expected to guarantee that the type is correct before adding it to the collection. However, when removing an object, if the type is incorrect then it won't match anything anyway. If the argument were a wildcard the method would simply be unusable, even though you may have an object which you can GUARANTEE belongs to that collection, because you just got a reference to it in the previous line....
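Here is roughly what I mean, as a sketch. typedLookup is a hypothetical stand-in for what a "fully generic" get would look like; it is not a real Map method.

```java
import java.util.HashMap;
import java.util.Map;

class WildcardLookupDemo {
    // The key type is unknown here, yet get(Object) can still be called.
    static Object lookup(Map<?, ?> map, Object key) {
        return map.get(key);
    }

    // Hypothetical generic lookup: the key must have exactly the map's key type K.
    static <K, V> V typedLookup(Map<K, V> map, K key) {
        return map.get(key);
    }

    public static void main(String[] args) {
        Map<String, Integer> ages = new HashMap<>();
        ages.put("alice", 30);

        Map<?, ?> unknown = ages;
        System.out.println(lookup(unknown, "alice"));
        // typedLookup(unknown, "alice"); // does not compile: String is not the captured key type
        System.out.println(typedLookup(ages, "alice"));
    }
}
```

With the wildcard-typed reference, only the Object-based lookup is callable, even though the key clearly belongs to the map.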
I probably didn't explain it very well, but it seems logical enough to me.
In addition to the other answers, there is another reason why the method should accept an Object, which is predicates. Consider the following sample:
class Person {
public String name;
// override equals()
}
class Employee extends Person {
public String company;
// override equals()
}
class Developer extends Employee {
public int yearsOfExperience;
// override equals()
}
class Test {
public static void main(String[] args) {
Collection<? extends Person> people = new ArrayList<Employee>();
// ...
// to remove the first employee with a specific name:
people.remove(new Person(someName1));
// to remove the first developer that matches some criteria:
people.remove(new Developer(someName2, someCompany, 10));
// to remove the first employee who is either
// a developer or an employee of someCompany:
people.remove(new Object() {
public boolean equals(Object employee) {
return employee instanceof Developer
|| ((Employee) employee).company.equals(someCompany);
}});
}
}
The point is that the object being passed to the remove method is responsible for defining the equals method. Building predicates becomes very simple this way.
Assume one has a collection of Cat, and some object references of types Animal, Cat, SiameseCat, and Dog. Asking the collection whether it contains the object referred to by the Cat or SiameseCat reference seems reasonable. Asking whether it contains the object referred to by the Animal reference may seem dodgy, but it's still perfectly reasonable. The object in question might, after all, be a Cat, and might appear in the collection.
Further, even if the object happens to be something other than a Cat, there's no problem saying whether it appears in the collection--simply answer "no, it doesn't". A "lookup-style" collection of some type should be able to meaningfully accept reference of any supertype and determine whether the object exists within the collection. If the passed-in object reference is of an unrelated type, there's no way the collection could possibly contain it, so the query is in some sense not meaningful (it will always answer "no"). Nonetheless, since there isn't any way to restrict parameters to being subtypes or supertypes, it's most practical to simply accept any type and answer "no" for any objects whose type is unrelated to that of the collection.
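A quick sketch of that scenario. The class hierarchy is just the one named above, with empty bodies, and isInCollection is a helper name I made up.

```java
import java.util.ArrayList;
import java.util.Collection;

class Animal {}
class Cat extends Animal {}
class SiameseCat extends Cat {}
class Dog extends Animal {}

class CatLookupDemo {
    // contains(Object) accepts a reference of any static type.
    static boolean isInCollection(Collection<Cat> cats, Object candidate) {
        return cats.contains(candidate);
    }

    public static void main(String[] args) {
        Cat tom = new SiameseCat();
        Collection<Cat> cats = new ArrayList<>();
        cats.add(tom);

        Animal animalRef = tom; // static type Animal, but the object is a Cat
        Animal dog = new Dog(); // unrelated to Cat, so it can never be in the collection

        System.out.println(isInCollection(cats, animalRef));
        System.out.println(isInCollection(cats, dog));
    }
}
```

The Animal reference that happens to point at a Cat is found; the Dog simply gets a "no".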
I always figured this was because remove() has no reason to care what type of object you give it. It's easy enough, regardless, to check if that object is one of the ones the Collection contains, since it can call equals() on anything. It's necessary to check type on add() to ensure that it only contains objects of that type.
It was a compromise. Both approaches have their advantage:
remove(Object o)
is more flexible. For example, it allows you to iterate through a list of numbers and remove them from a list of longs.
code that uses this flexibility can be more easily generified
remove(E e) brings more type safety to what most programs want to do by detecting subtle bugs at compile time, like mistakenly trying to remove an integer from a list of shorts.
Backwards compatibility was always a major goal when evolving the Java API, therefore remove(Object o) was chosen because it made generifying existing code easier. If backwards compatibility had NOT been an issue, I'm guessing the designers would have chosen remove(E e).
Remove is not a generic method so that existing code using a non-generic collection will still compile and still have the same behavior.
See http://www.ibm.com/developerworks/java/library/j-jtp01255.html for details.
Edit: A commenter asks why the add method is generic. [...removed my explanation...] Second commenter answered the question from firebird84 much better than me.
Another reason is interfaces. Here is an example:
public interface A {}
public interface B {}
public class MyClass implements A, B {}
public static void main(String[] args) {
Collection<A> collection = new ArrayList<>();
MyClass item = new MyClass();
collection.add(item); // works fine
B b = item; // valid
collection.remove(b); /* It works because the remove method accepts an Object. If it was generic, this would not work */
}
Because it would break existing (pre-Java5) code. e.g.,
Set stringSet = new HashSet();
// do some stuff...
Object o = "foobar";
stringSet.remove(o);
Now you might say the above code is wrong, but suppose that o came from a heterogeneous set of objects (i.e., it contained strings, number, objects, etc.). You want to remove all the matches, which was legal because remove would just ignore the non-strings because they were non-equal. But if you make it remove(String o), that no longer works.
What is the difference between Collections.sort(list) and Collections.sort(list,null)
I supposed both of them compared elements in the list in their natural order.
So I tried these two codes:
CODE 1:
List<Object> list=Arrays.asList("hello",123);
Collections.sort(list);
CODE 2:
List<Object> list=Arrays.asList("hello",123);
Collections.sort(list,null);
The latter compiles, but the former doesn't; it gives the expected compiler error that instances of class Object are not comparable.
Why doesn't the latter give a compile-time error?
EDIT: Based on the comment given below, I understand why the latter doesn't give a compile-time error, but on running it throws ClassCastException: String cannot be cast to Integer. How did it deduce that the runtime objects are String and Integer? What I had in mind was:
public static void sort(List<Object> list) // since the list was of type Object
{
// For instances of Object in the list, call the compareTo method
}
Java's generic types are checked at compile time. If you violate the constraints, the code won't even compile. The first method is defined as:
<T extends Comparable<? super T>> void sort(List<T> list)
That requires the element type T of your List to extend Comparable, specifically some Comparable<X> where X may be T or any supertype of T. It sounds complicated but doesn't even matter here (try understanding http://yzaslavs.blogspot.de/2010/07/generics-pecs-principle-step-by-step.html if you're interested in that part). List<Object> already fails the first part: Object doesn't implement Comparable. => Compiler says no.
The second one is defined as
<T> void sort(List<T> list, Comparator<? super T> c)
that no longer requires that the type of the List has any special property. Any T will work. The only requirement is that you can provide an implementation of Comparator that is able to sort T or a super type. null is like a joker and fits anything. The compiler will not complain even if using null is probably wrong. You do see the problem at runtime.
The reason for
Exception in thread "main" java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer
at java.lang.Integer.compareTo(Integer.java:52)
at java.util.ComparableTimSort.countRunAndMakeAscending(ComparableTimSort.java:290)
at java.util.ComparableTimSort.sort(ComparableTimSort.java:157)
at java.util.Arrays.sort(Arrays.java:537)
at java.util.TimSort.sort(TimSort.java:178)
at java.util.TimSort.sort(TimSort.java:173)
at java.util.Arrays.sort(Arrays.java:659)
at java.util.Collections.sort(Collections.java:217)
at Main.main(Main.java:9)
is that at "TimSort.java:178" it does
static <T> void sort(T[] a, int lo, int hi, Comparator<? super T> c) {
if (c == null) {
Arrays.sort(a, lo, hi);
return;
}
which falls back to natural sorting like your first call would do. However, it's just an Object[] array at that point and nothing can guarantee that the elements are actually comparable. It simply casts them, and that fails, depending on your luck and the content of the list, either in Integer.compareTo( ) or String.compareTo( ), because those methods require their own type.
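A minimal sketch of that behaviour. The helper name sortsNaturally is made up; it just wraps the call in a try/catch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class NullComparatorDemo {
    // Returns true if sorting with a null comparator succeeds, false on ClassCastException.
    static boolean sortsNaturally(List<Object> list) {
        try {
            Collections.sort(list, null); // null means "use the natural ordering"
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<Object> sameType = Arrays.<Object>asList("pear", "apple");
        List<Object> mixed = Arrays.<Object>asList("hello", 123);

        System.out.println(sortsNaturally(sameType) + " " + sameType);
        System.out.println(sortsNaturally(mixed));
    }
}
```

Both calls compile because null fits Comparator<? super Object>. Only the mixed list fails, and only at runtime, when one element's compareTo receives an argument of the other type.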
These are two different methods in Collections
//here the elements in list should impl. Comparable
Collections.sort(list)
//here we need a Comparator object, (null is Comparator obj too)
Collections.sort(list, null)
Now to the question of the runtime ClassCastException.
Java converts your list into an array to do the sort in the background. If your Comparator is null, Java casts the elements to Comparable in order to sort them. Fortunately, both elements in your list (String and Integer) implement Comparable, so up to this point there is no exception.
You have only two elements in your list (2 < 7, where 7 is the insertion-sort threshold), so Java simply does an insertion sort. It takes the Integer and calls its compareTo() method with your String as the parameter. Here Java casts the parameter to Integer so that it can compare. As you've seen, String cannot be cast to Integer, so you get that exception.
There is no difference between these two at runtime; this is clearly stated in the API doc.
Refer to the Collections documentation.
> what is null?
According to the JLS:
There is also a special null type, the type of the expression null,
which has no name. Because the null type has no name, it is impossible
to declare a variable of the null type or to cast to the null type.
The null reference is the only possible value of an expression of null
type. The null reference can always be cast to any reference type. In
practice, the programmer can ignore the null type and just pretend
that null is merely a special literal that can be of any reference
type.
As null can be a reference of any type, it can be a reference to a Comparator as well. That is why the compiler accepts Collections.sort(list,null);.
Whereas Collections.sort(list,new Object()); gives a compile-time error.
At comparison time the compareTo method is called, and it is Integer.compareTo that throws the ClassCastException.
I am reading the java tutorial about Wildcards in Generics. In the following code:
void printCollection(Collection<Object> c) {
for (Object e : c) {
System.out.println(e);
}
}
Does this mean the collection c takes type Object as its elements, and we cannot call c.add("apple"),
because "apple" is a String and the for loop takes any Object elements from collection c?
But I do not understand the following code,
void printCollection(Collection<?> c) {
for (Object e : c) {
System.out.println(e);
}
}
This code uses wildcards, meaning "a collection whose element type matches anything." Does this mean we can add any type of object to it, such as c.add("string");,
c.add(1);, and c.add(new apple()); ?
And the for loop takes any object e from collection c. If c's elements are not of type Object, say they are Integer, does this code work? Does this mean the elements should be cast?
You got it almost exactly backwards.
A Collection<Object> can contain Object and subclasses of it, and since everything (including String) is a subclass of Object, you can add anything to such a collection. However, you cannot make any assumptions about its contents except that they're Objects.
On the other hand, a Collection<?> contains only instances of a specific unknown type (and its subclasses), but since you don't know which type it is, you cannot add anything (except null) to such a collection, nor make any assumptions about its contents (except that they're Objects, because everything is).
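As a rough sketch of the difference (the class and method names are invented for the example; the commented-out lines are the ones the compiler rejects):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class WildcardVsObjectDemo {
    // Collection<?>: accepts a collection of any element type, but elements are only readable as Object.
    static int countElements(Collection<?> c) {
        int count = 0;
        for (Object e : c) {
            count++;
        }
        // c.add("apple"); // does not compile: the element type of c is unknown
        return count;
    }

    // Collection<Object>: anything can be added, but only a Collection<Object> is accepted as argument.
    static void addMixed(Collection<Object> c) {
        c.add("apple");
        c.add(1);
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>(Arrays.asList("a", "b"));
        System.out.println(countElements(strings));
        // addMixed(strings); // does not compile: List<String> is not a Collection<Object>

        List<Object> objects = new ArrayList<>();
        addMixed(objects);
        System.out.println(countElements(objects));
    }
}
```

So the wildcard version is the flexible one for callers and the restrictive one inside the method, while Collection<Object> is the other way around.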
In Angelika Langer's Java Generics FAQ, question "What is the difference between the unbounded wildcard parameterized type and the raw type?", you'll see that Collection<?> and the raw type Collection are almost equivalent.
In the second snippet, the "?" wildcard means that the type argument is unknown. Without an explicit bound, its upper bound is Object, so the elements can always be read as Object.
In fact, even Integer is a subclass of Object. If you mean int, you are right: that's a primitive and not derived from Object, but you can't put it into a Collection directly, since a Collection only holds objects (an int is autoboxed to Integer).
As for whether the elements put into the collection should be cast: no, that's not necessary, since they are all subclasses of Object. The compiler does not need any explicit cast there; it resolves the right type automatically.
Is it to maintain backwards compatibility with older (un-genericized) versions of Collection? Or is there a more subtle detail that I am missing? I see this pattern repeated in remove also (remove(Object o)), but add is genericized as add(E e).
contains() takes an Object because the object it matches does not have to be the same type as the object that you pass in to contains(); it only requires that they be equal. From the specification of contains(), contains(o) returns true if there is an object e such that (o==null ? e==null : o.equals(e)) is true. Note that there is nothing requiring o and e to be the same type. This follows from the fact that the equals() method takes in an Object as parameter, not just the same type as the object.
Although it may be commonly true that many classes have equals() defined so that its objects can only be equal to objects of its own class, that is certainly not always the case. For example, the specification for List.equals() says that two List objects are equal if they are both Lists and have the same contents, even if they are different implementations of List. So coming back to the example in this question, it is possible to have a Collection<ArrayList> and for me to call contains() with a LinkedList as argument, and it might return true if there is a list with the same contents. This would not be possible if contains() were generic and restricted its argument type to E.
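For illustration, a minimal sketch of that Collection<ArrayList> case, using Integer elements and made-up names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedList;
import java.util.List;

class ContainsAcrossListTypesDemo {
    // The probe may be any List implementation; contains(Object) does not care.
    static boolean containsEqualList(Collection<ArrayList<Integer>> lists, List<Integer> probe) {
        return lists.contains(probe);
    }

    public static void main(String[] args) {
        Collection<ArrayList<Integer>> lists = new ArrayList<>();
        lists.add(new ArrayList<>(Arrays.asList(1, 2, 3)));

        System.out.println(containsEqualList(lists, new LinkedList<>(Arrays.asList(1, 2, 3))));
        System.out.println(containsEqualList(lists, new LinkedList<>(Arrays.asList(4, 5))));
    }
}
```

The LinkedList with the same contents is reported as contained, even though no LinkedList was ever added.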
In fact, the fact that contains() takes any object as an argument allows an interesting use where you can use it to test for the existence of an object in the collection that satisfies a certain property:
Collection<Integer> integers;
boolean oddNumberExists = integers.contains(new Object() {
public boolean equals(Object e) {
Integer i = (Integer)e;
if (i % 2 != 0) return true;
else return false;
}
});
Answered here.
Why aren't Java Collections remove methods generic?
In short, they wanted to maximize backwards compatibility, because collections were introduced long before generics.
And to add from me: the video he's referring to is worth watching.
http://www.youtube.com/watch?v=wDN_EYUvUq0
update
To clarify, the man who said that (in the video) was one of the people who updated Java maps and collections to use generics. If he doesn't know, then who does?
It is because the contains function utilizes the equals function, and the equals function is defined in the base Object class with a signature of equals(Object o) rather than equals(E e) (since not all classes are generic). Same case with the remove function - it traverses the collection using the equals function which takes an Object argument.
This doesn't directly explain the decision, however, as they could still have used type E and allowed it to be implicitly converted to Object on the call to equals; but I imagine they wanted to allow the function to be called on other Object types. There's nothing wrong with having a Collection<Foo> c; and then calling c.contains(somethingOfTypeBar) - it will always return false, and so it eliminates the need for a cast to type Foo (which can throw an exception) or, to protect from the exception, an instanceof check. So you can imagine that if you're iterating over something with mixed types and calling contains on each of the elements, you can simply use the contains function on all of them rather than needing guards.
It's actually reminiscent of the "newer" loosely-typed languages, when you look at it that way...
Because otherwise the argument could only have been of exactly the collection's parameter type; specifically, wildcarded collections would have stopped working, e.g.
class Base
{
}
class Derived
extends Base
{
}
Collection< ? extends Base > c = ...;
Derived d = ...;
Base base_ref = d;
c.contains( d ); // Would have produced compile error
c.contains( base_ref ); // Would have produced compile error
EDIT
For doubters who think that's not one of the reasons, here is a modified array list with a would be generified contains method
class MyCollection< E > extends ArrayList< E >
{
public boolean myContains( E e )
{
return false;
}
}
MyCollection< ? extends Base > c2 = ...;
c2.myContains( d ); // does not compile
c2.myContains( base_ref ); // does not compile
Basically contains( Object o ) is a hack to make this very common use case to work with Java Generics.
"does that basket of apples contain this orange?"
clearly a TRUE answer cannot be given. but that still leaves two possibilities:
the answer is FALSE.
the question is not well formed, it should not pass compile.
the collection api chose the 1st one. but the 2nd choice would also make perfect sense. a question like that is a bullshit question 99.99% of times, so don't even ask!
public void wahey(List<Object> list) {}
wahey(new LinkedList<Number>());
The call to the method will not type-check. I can't even cast the parameter as follows:
wahey((List<Object>) new LinkedList<Number>());
From my research, I have gathered that the reason for not allowing this is type-safety. If we were allowed to do the above, then we could have the following:
List<Double> ld;
wahey(ld);
Inside the method wahey, we could add some Strings to the input list (as the parameter maintains a List<Object> reference). Now, after the method call, ld refers to a list with a type List<Double>, but the actual list contains some String objects!
This seems different to the normal way Java works without generics. For instance:
Object o;
Double d;
String s;
o = s;
d = (Double) o;
What we are doing here is essentially the same thing, except this will pass compile-time checks and only fail at run-time. The version with Lists won't compile.
This leads me to believe this is purely a design decision with regards to the type restrictions on generics. I was hoping to get some comments on this decision?
What you are doing in the "without generics" example is a cast, which makes it clear that you are doing something type-unsafe. The equivalent with generics would be:
Object o;
List<Double> d;
String s;
o = s;
d.add((Double) o);
Which behaves the same way (compiles, but fails at runtime). The reason for not allowing the behavior you're asking about is because it would allow implicit type-unsafe actions, which are much harder to notice in code. For example:
public void Foo(List<Object> list, Object obj) {
list.add(obj);
}
This looks perfectly fine and type-safe until you call it like this:
List<Double> list_d;
String s;
Foo(list_d, s);
Which also looks type-safe, because you as the caller don't necessarily know what Foo is going to do with its parameters.
So in that case you have two seemingly type-safe bits of code, which together end up being type-unsafe. That's bad, because it's hidden and therefore hard to avoid and harder to debug.
Consider if it was...
List<Integer> nums = new ArrayList<Integer>();
List<Object> objs = nums;
objs.add("Oh no!");
int x = nums.get(0); //throws ClassCastException
You would be able to add anything of the parent type to the list, which may not be what it was formerly declared as, which as the above example demonstrates, causes all sorts of problems. Thus, it is not allowed.
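You can actually reproduce this failure today by forcing the conversion through a raw type, which the compiler only warns about. The sketch below does exactly that. asObjectList and readFailsAfterPollution are hypothetical helper names, and the raw cast is there purely to demonstrate the problem, not as something to do in real code.

```java
import java.util.ArrayList;
import java.util.List;

class UnsafeViewDemo {
    // Forces the forbidden conversion through a raw type (unchecked warning suppressed).
    @SuppressWarnings({"unchecked", "rawtypes"})
    static List<Object> asObjectList(List<Integer> nums) {
        return (List<Object>) (List) nums;
    }

    // Returns true if reading from the polluted list throws ClassCastException.
    static boolean readFailsAfterPollution() {
        List<Integer> nums = new ArrayList<>();
        List<Object> objs = asObjectList(nums);
        objs.add("Oh no!"); // succeeds at runtime because of type erasure
        try {
            int x = nums.get(0); // the compiler-inserted cast to Integer fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readFailsAfterPollution());
    }
}
```

The add itself goes through without complaint; the exception only shows up later, at the read, which is far away from the code that caused it.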
They aren't subtypes of each other due to how generics work. What you want is to declare your function like this:
public void wahey(List<?> list) {}
Then it will accept a List of anything that extends Object. You can also do:
public void wahey(List<? extends Number> list) {}
This will let you take in Lists of something that's a subclass of Number.
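A short sketch of both declarations in use. The method names size and sum are placeholders I chose instead of wahey, so that each one has something to do.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class WildcardParameterDemo {
    // List<?>: any list is accepted; elements are only known to be Objects.
    static int size(List<?> list) {
        return list.size();
    }

    // List<? extends Number>: any list of Number or a subclass; elements can be read as Number.
    static double sum(List<? extends Number> list) {
        double total = 0;
        for (Number n : list) {
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> ints = new LinkedList<>(Arrays.asList(1, 2, 3));
        List<Double> doubles = Arrays.asList(1.5, 2.5);
        List<String> strings = Arrays.asList("a", "b");

        System.out.println(size(strings));
        System.out.println(sum(ints));
        System.out.println(sum(doubles));
        // sum(strings); // does not compile: String is not a subclass of Number
    }
}
```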
I'd recommend you pick up a copy of "Java Generics and Collections" by Maurice Naftalin & Philip Wadler.
There are essentially two dimensions of abstraction here, the list abstraction and the abstraction of its contents. It's perfectly fine to vary along the list abstraction - to say, for instance, that it's a LinkedList or an ArrayList - but it's not fine to further restrict the contents, to say: This (list which holds objects) is a (linked list which holds only numbers). Because any reference that knows it as a (list which holds objects) understands, by the contract of its type, that it can hold any object.
This is quite different from what you have done in the non-generics example code, where you've said: treat this String as if it were a Double. You are instead trying to say: treat this (list which holds only numbers) as a (list which holds anything). And it doesn't, and the compiler can detect it, so it doesn't let you get away with it.
"What we are doing here is essentially
the same thing, except this will pass
compile-time checks and only fail at
run-time. The version with Lists won't
compile."
What you're observing makes perfect sense when you consider that the main purpose of Java generics is to get type incompatibilities to fail at compile time instead of run time.
From java.sun.com
Generics provides a way for you to
communicate the type of a collection
to the compiler, so that it can be
checked. Once the compiler knows the
element type of the collection, the
compiler can check that you have used
the collection consistently and can
insert the correct casts on values
being taken out of the collection.
In Java, List<S> is not a subtype of List<T> when S is a subtype of T. This rule provides type safety.
Let's say we allow a List<String> to be a subtype of List<Object>. Consider the following example:
public void foo(List<Object> objects) {
objects.add(Integer.valueOf(42));
}
List<String> strings = new ArrayList<String>();
strings.add("my string");
foo(strings); // this is not allowed in Java
// now strings has a string and an integer!
// what would happen if we do the following...??
String myString = strings.get(1);
So, forcing this provides type safety but it also has a drawback, it's less flexible. Consider the following example:
class MyCollection<T> {
public void addAll(Collection<T> otherCollection) {
...
}
}
Here you have a collection of T's, and you want to add all items from another collection. You can't call this method with a Collection<S> for an S that is a subtype of T. Ideally, this would be OK, because you are only adding elements into your collection; you are not modifying the parameter collection.
To fix this, Java provides what they call "wildcards". Wildcards are a way of providing covariance/contravariance. Now consider the following using wildcards:
class MyCollection<T> {
// Now we allow all types S that are a subtype of T
public void addAll(Collection<? extends T> otherCollection) {
...
otherCollection.add(new S()); // ERROR! not allowed (Here S is a subtype of T)
}
}
Now, with wildcards we allow covariance in the type T and we block operations that are not type safe (for example adding an item into the collection). This way we get flexibility and type safety.
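For completeness, here is a runnable sketch of such a MyCollection. The backing ArrayList, the size() method and the copyTo method (which shows the contravariant ? super T direction) are my own additions and only one possible way to fill in the "..." above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class MyCollection<T> {
    private final List<T> items = new ArrayList<>();

    // Covariant use: read T values out of a collection of T or any subtype of T.
    public void addAll(Collection<? extends T> otherCollection) {
        for (T item : otherCollection) {
            items.add(item);
        }
    }

    // Contravariant use: write T values into a collection of T or any supertype of T.
    public void copyTo(Collection<? super T> target) {
        target.addAll(items);
    }

    public int size() {
        return items.size();
    }
}

class MyCollectionDemo {
    public static void main(String[] args) {
        MyCollection<Number> numbers = new MyCollection<>();
        numbers.addAll(Arrays.asList(1, 2, 3));  // a list of Integers
        numbers.addAll(Arrays.asList(1.5, 2.5)); // a list of Doubles

        List<Object> sink = new ArrayList<>();
        numbers.copyTo(sink);
        System.out.println(numbers.size() + " " + sink);
    }
}
```

Reading from the ? extends T parameter and writing into the ? super T parameter are both type safe, which is why the compiler allows them while still rejecting otherCollection.add(...).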