Java Collection methods

I'm starting to learn Java and I have a question about generics.
In these methods from the Collection<E> interface:
boolean containsAll( Collection <?> c);
boolean removeAll(Collection<?> c);
boolean retainAll ( Collection <?> c);
Why is the parameter Collection <?> c instead of Collection <E> c?
Thanks a lot

The JDK designers wanted code like the following to be possible:
Collection<String> strings = new ArrayList<>(Arrays.asList("foo", "bar", "baz"));
Collection<Object> objects = Arrays.<Object>asList("foo", 123);
strings.removeAll(objects);
// strings now contains only "bar" and "baz"
(The list is copied into an ArrayList because Arrays.asList() returns a fixed-size list, on which removeAll() throws UnsupportedOperationException.)
That is, because you can call .equals() on any pair of objects and get a meaningful result, you don't really need to restrict those methods to a specific item type.
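As a minimal sketch of the point above (the class name RemoveAllDemo and the helper without are made up for illustration), the following shows a Collection<String> being filtered against a collection with a different element type, which only compiles because removeAll takes a Collection<?>:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class RemoveAllDemo {
    // Hypothetical helper: returns a copy of `source` without the elements
    // that are equal to some element of `unwanted`.
    static List<String> without(Collection<String> source, Collection<?> unwanted) {
        List<String> copy = new ArrayList<>(source); // mutable copy of the input
        copy.removeAll(unwanted);                    // accepts any element type
        return copy;
    }

    public static void main(String[] args) {
        Collection<String> strings = Arrays.asList("foo", "bar", "baz");
        Collection<Object> objects = Arrays.<Object>asList("foo", 123);
        System.out.println(without(strings, objects)); // [bar, baz]
    }
}
```

If the parameter were Collection<E>, the call would be rejected at compile time even though the result is perfectly well defined.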

Because an E type parameter has to be specified, while a wildcard ? works for every type. The subtle difference is that
E means any specified type
? means any unknown type
Since these methods are supposed to work on a collection of any unknown type, they don't specify a type parameter at all. E is a type variable. ? is not a variable; it is a placeholder that cannot be specified.

Related

Why are some Collection methods not generic? [duplicate]

Possible Duplicate:
What are the reasons why Map.get(Object key) is not (fully) generic
Why do we have contains(Object o) instead of contains(E e)?
As you can all see here, a generic java.util.List of type E has a contains method that is not generic: it takes an Object instead. Does anyone know why?
In what case would a List<String> return true for myList.contains(new OtherNonString())? If I'm not mistaken, never, unless the object it is compared to has type E as an ancestor (which in my String example is impossible because String is final).
Is it only to maintain backwards compatibility with pre-generics versions? Am I missing a use case where it makes sense? If it's just for backwards compatibility, why not deprecate contains(Object) and create a contains(E)?
Edit:
Some of my sub-questions had been answered before. For reference, also check this question

if it's just for backwards compatibility, why not deprecate
contains(Object) and create a contains(E)?
Because contains(Object) and contains(E) have the same type erasure (as you can see in this code sample) and hence would cause compilation errors. Also, deprecating methods was not an option; the top priority back then was to keep legacy code working.
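The erasure argument can be checked with reflection. This is only a sketch (ErasureClashDemo and erasedParameterType are illustrative names): List.add is declared as add(E), yet at runtime it is found under the signature add(Object), which is the same erased signature a hypothetical contains(E) would share with contains(Object).

```java
import java.lang.reflect.Method;
import java.util.List;

public class ErasureClashDemo {
    // Looks up a one-argument List method by its erased signature and
    // returns the runtime type of its parameter.
    static Class<?> erasedParameterType(String methodName) {
        try {
            Method m = List.class.getMethod(methodName, Object.class);
            return m.getParameterTypes()[0];
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // add is declared as add(E), contains as contains(Object)
        System.out.println(erasedParameterType("add"));      // class java.lang.Object
        System.out.println(erasedParameterType("contains")); // class java.lang.Object
    }
}
```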

Because there is no need for a type parameter here: it would only rule out some checks, and if an object isn't of the required class the method answers false in any case.
It's much simpler to check the boolean the method returns than to combine a test with a try/catch. The few cases where a compile-time type check would catch a bug aren't worth the overhead.

It's because the method can return true, even if the parameter is of a different type than the list type. More precisely, contains(Object o) will return true if the list contains an element e, so that e.equals(o) is true.
For example, the following code will print true, even if the type of l2 is not allowed in list:
List<ArrayList<String>> list =
new ArrayList<ArrayList<String>>();
ArrayList<String> l1 = new ArrayList<String>();
l1.add("foo");
list.add(l1);
LinkedList<String> l2 = new LinkedList<String>();
l2.add("foo");
System.out.println(list.contains(l2));
The reason for this is that the distinct classes ArrayList and LinkedList both inherit the equals implementation from AbstractList, which does not distinguish between different subclasses. Even if two objects don't have a common superclass, it is possible for their equals implementations to mutually recognize each other.
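To illustrate the last sentence, here is a sketch with two made-up value classes (Meters and Centimeters are hypothetical and share no superclass other than Object) whose equals implementations recognize each other, so contains can return true for an argument that could never be stored in the list:

```java
import java.util.ArrayList;
import java.util.List;

public class CrossTypeEqualsDemo {
    // Hypothetical value type; equal to a Centimeters of the same length.
    static final class Meters {
        final long value;
        Meters(long value) { this.value = value; }
        long toCentimeters() { return value * 100; }
        @Override public boolean equals(Object o) {
            if (o instanceof Meters) return ((Meters) o).value == value;
            if (o instanceof Centimeters) return ((Centimeters) o).value == toCentimeters();
            return false;
        }
        @Override public int hashCode() { return Long.hashCode(toCentimeters()); }
    }

    // Hypothetical value type; equal to a Meters of the same length.
    static final class Centimeters {
        final long value;
        Centimeters(long value) { this.value = value; }
        @Override public boolean equals(Object o) {
            if (o instanceof Centimeters) return ((Centimeters) o).value == value;
            if (o instanceof Meters) return ((Meters) o).toCentimeters() == value;
            return false;
        }
        @Override public int hashCode() { return Long.hashCode(value); }
    }

    // contains(Object) lets us ask about an object of any type.
    static boolean containsLength(List<Meters> lengths, Object candidate) {
        return lengths.contains(candidate);
    }

    public static void main(String[] args) {
        List<Meters> lengths = new ArrayList<>();
        lengths.add(new Meters(2));
        System.out.println(containsLength(lengths, new Centimeters(200))); // true
        System.out.println(containsLength(lengths, new Centimeters(150))); // false
    }
}
```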

One reason could be that contains() doesn't alter the list, so there is no need to enforce the type.
From the link you have:
Returns true if this list contains the specified element. More
formally, returns true if and only if this list contains at least one
element e such that (o==null ? e==null : o.equals(e))

Is it only to maintain backwards compatibility with pre-generics versions?
No, that is handled by the type erasure.
It's like that because that method is not required to be type-safe, and doesn't need to return the actual type.

A counter-example:
List<String> strings = Arrays.asList("hello", "world");
Object o = "hello";
System.out.println(strings.contains(o)); // true
If the contains method didn't allow an Object reference as a parameter, it wouldn't be possible to compile the code above. However, the o variable references an instance of a String, which actually is contained in the given list.
The result of contains is determined by the result of Object.equals(Object o) method, which also defines the type of its argument as a general Object, for the very same reason:
String hello = "hello";
Object o = "hello";
System.out.println(hello.equals(o)); // true

Generics in Java are implemented with a technique called erasure.
Each type parameter is replaced with its bound, or with Object if it is unbounded.
Where necessary, the Java compiler inserts type casts so that the erased code stays type-safe.
The compiler also generates bridge methods to preserve polymorphism in extended generic types.
This is why objects carry no generic type arguments at runtime in the compiled bytecode.
For example:
public static <T> void printArray ( T [] inputArray ) {
for ( T element : inputArray )
System.out.printf("%s ", element) ;
System.out.println();
}
After the compiler performs erasure, this becomes:
public static void printArray ( Object [] inputArray ) {
for ( Object element : inputArray )
System.out.printf("%s ", element) ;
System.out.println();
}
There is exactly one copy of this code in memory, and it is called for every printArray call in this example.
The reason this is done is backwards compatibility. Generics were first introduced in Java 1.5.
In Java versions before 1.5 you defined a list like this:
List myList = new ArrayList();
and not like this
List<Integer> myList = new ArrayList<Integer>();
To make sure that code written before generics would not break, the compiled code cannot depend on generic type information.
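A small sketch of what erasure means at runtime (RuntimeErasureDemo and sameRuntimeClass are illustrative names): two lists with different type arguments are instances of exactly the same class.

```java
import java.util.ArrayList;
import java.util.List;

public class RuntimeErasureDemo {
    // Compares the runtime classes of two lists, whatever their element types.
    static boolean sameRuntimeClass(List<?> a, List<?> b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        System.out.println(sameRuntimeClass(strings, ints)); // true
        System.out.println(strings.getClass().getName());    // java.util.ArrayList
    }
}
```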

Java generics, objects and wildcards differences & clarifications

I wish to understand this concept:
T object - generic, will be erased into actual type.
? object - will be erased into what?
Object object;
What are the differences between T, ? and Object?
I can easily understand #1, but what about:
Object var;
? var;
What is the difference between the two? I have read that I can't use ? explicitly, like T or any other variable, and that ? is related to objects, not types.
But what is the practical reason? Why can't I just write a List of objects (List<Object>) instead of a List of wildcards (List<?>)? After all, I don't know the types of the objects in either case.
In addition, I would like to know what is the erasure for ? ?

I will list the main differences between T and ?:
Basic: T is a type parameter and ? is a wildcard.
Meaning: T is used as a type parameter when defining a generic class. T will be replaced by a concrete type when you instantiate the generic class.
On the other hand, we use ? when we want to refer to an unknown type argument.
Place of definition: You need to declare T on the top of the class, or method if you define a generic method. You can use ? everywhere.
Mapping: Every use of T is mapped to the same type (in the same class). Every use of ? can be mapped to a different type.
Object instantiation: You can create objects with the generic parameter T, like new ArrayList<T>(). You cannot instantiate with ? (new ArrayList<?>() is illegal); ? can only appear in the declared type of a reference.
Collections updating: You can add objects of type T to a collection of T. You cannot add objects to a collection of ? (since you don't know its element type), except null.
Type erasures: With generics, type erasure applies to the use of generics. when generics are used, they are converted into compile-time checks and execution-time casts. So if you have this code for example: List<String> myList = new ArrayList<String>(); and then you wish to add to your list so you do myList.add("Hello World"); and then you want to get the item you just added by performing String myString = myList.get(0); then the compiler will compile your code to List myList = new ArrayList(); and String myString = (String) myList.get(0); (the add stays the same for obvious reasons).
So basically, at execution time there is no way of finding out that T is essentially String for the list object (that information is gone).
Now for wildcards the story is different. Erasure drops the type argument entirely, so List<?> becomes the raw List; for type checking, an unbounded ? is treated as Object, its implicit upper bound. This is not very useful: at build time the compiler checks that you only call Object's methods on such values. If you have something like ? extends Foo, then the ? is treated as its bound Foo (at build time the compiler checks that you only pass Foo or one of its subtypes, i.e. types that inherit from Foo, as the type argument).
For differences between ? and Object & T and Object you may read here and here respectively.
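The difference between T and ? from the list above can be sketched as follows (TypeParamVsWildcardDemo, addTwice and countNonNull are names invented for this example). T ties two parts of a signature together, while ? only states that the element type does not matter:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TypeParamVsWildcardDemo {
    // T links the two parameters: the item must match the list's element type.
    static <T> void addTwice(List<T> list, T item) {
        list.add(item);
        list.add(item);
    }

    // ? means the element type is irrelevant: elements are only read as Object.
    static int countNonNull(List<?> list) {
        int count = 0;
        for (Object o : list) {
            if (o != null) count++;
        }
        // list.add("x"); // would not compile: the element type is unknown
        return count;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        addTwice(names, "Ada");
        System.out.println(names);                                   // [Ada, Ada]
        System.out.println(countNonNull(names));                     // 2
        System.out.println(countNonNull(Arrays.asList(1, null, 3))); // 2
    }
}
```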

Should remove(Object) be remove(? super E)

In this answer, I tried to explain why the Collection method add has the signature add(E) while remove is remove(Object). I came up with the idea that the correct signature should be
public boolean remove(? super E element)
And since this is invalid syntax in Java, they had to stick to Object, which just happens to be super E (supertype of E) for any E. The following code explains why this makes sense:
List<String> strings = new ArrayList<>();
strings.add("abc");
Object o = "abc"; // runtime type is String
strings.remove(o);
Since the runtime type is String, this succeeds. If the signature were remove(E), this would cause an error at compile-time but not at runtime, which makes no sense. However, the following should raise an error at compile time, because the operation is bound to fail because of its types, which are known at compile-time:
strings.remove(Integer.valueOf(1)); // a plain remove(1) would select the remove(int index) overload of List
The remove takes an Integer as an argument, which is not super String, which means it could never actually remove anything from the collection.
If the remove method was defined with the parameter type ? super E, situations like the above could be detected by the compiler.
Question:
Am I correct with my theory that remove should have a contravariant ? super E parameter instead of Object, so that type mismatches as shown in the above example can be filtered out by the compiler? And is it correct that the creators of the Java Collections Framework chose to use Object instead of ? super E because ? super E would cause a syntax error, and instead of complicating the generic system they simply agreed to use Object instead of super?
Also, should the signature for removeAll be
public boolean removeAll(Collection<? super E> collection)
Note that I do not want to know why the signature is not remove(E), which is asked and explained in this question. I want to know if remove should be contravariant (remove(? super E)), while remove(E) represents covariance.
One example where this does not work would be the following:
List<Number> nums = new ArrayList<>();
nums.add(1);
nums.remove(Integer.valueOf(1)); // fails here - Integer is not super Number
Rethinking my signature, it should actually allow sub- and supertypes of E.

This is a faulty assumption:
because the operation is bound to fail because of its types, which are known at compile-time
It's the same reasoning by which .equals accepts an Object: objects don't necessarily need to have the same class in order to be equal. Consider this example with different subtypes of List, as pointed out in the question @Joe linked:
List<ArrayList<?>> arrayLists = new ArrayList<>();
arrayLists.add(new ArrayList<>());
LinkedList<?> emptyLinkedList = new LinkedList<>();
arrayLists.remove(emptyLinkedList); // removes the empty ArrayList and returns true
This would not be possible with the signature you proposed.

remove(? super E) is entirely equivalent to remove(Object), because Object is itself a supertype of E, and all objects extend Object.

I think that designers of the collections framework made a decision to keep remove untyped, because it is a valid solution that lets you keep a post-condition without introducing a pre-condition or compromising type safety.
The post-condition of c.remove(x) is that after the call x is not present in c. The signature remove(Object) lets you pass any object or null, with no further checks. The signature remove(? super E), on the other hand, introduces a pre-condition on the type of x, requiring it to be related to E.
Each pre-condition that you introduce in an API makes your API harder to use. If removing a pre-condition lets you keep all your post-conditions, it is a good idea to remove the pre-condition.
Note that removing an object of a wrong type is not necessarily an error. Here is a small example:
class Segregator {
private final Set<Integer> ints = ...
private final Set<String> strings = ...
public void addAll(List<Object> data) {
for (Object o : data) {
if (o instanceof Integer) {
ints.add((Integer)o);
}
if (o instanceof String) {
strings.add((String)o);
}
}
}
// Here is the method that becomes easier to write:
public void removeAll(List<Object> data) {
for (Object o : data) {
ints.remove(o);
strings.remove(o);
}
}
}
Note how removeAll method's code is simpler than the code of addAll, because remove does not care about the type of the object that you pass to it.

In your question you already explained why it can't (or shouldn't) be remove(E).
But there is also a reason why it shouldn't be remove(? super E). Imagine some piece of code where you have an object of unknown type. You still might want to try to remove that object from that list. Consider this code:
public void removeFromList(Object o, Collection<String> col) {
col.remove(o);
}
Now your argument was that remove(? super E) is a more type-safe way. But I say it doesn't have to be. Look at the Javadoc of remove(). It says:
More formally, removes an element e such that (o==null ? e==null : o.equals(e)), if this collection contains one or more such elements.
So the only precondition the parameter has to meet is that you can use == and equals() on it, which is the case for any Object. This still enables you to try to remove an Integer from a Collection<String>. It just wouldn't do anything.
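A runnable sketch of that behaviour, reusing the removeFromList method from above inside a made-up class RemoveObjectDemo (changed to return the boolean result so it can be printed):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;

public class RemoveObjectDemo {
    // Tries to remove an object of unknown type; returns whether anything was removed.
    static boolean removeFromList(Object o, Collection<String> col) {
        return col.remove(o);
    }

    public static void main(String[] args) {
        Collection<String> col = new ArrayList<>(Arrays.asList("abc", "def"));

        // An Integer is never equal to a String, so nothing happens.
        System.out.println(removeFromList(Integer.valueOf(1), col)); // false

        Object o = "abc"; // static type Object, runtime type String
        System.out.println(removeFromList(o, col));                  // true
        System.out.println(col);                                     // [def]
    }
}
```

The collection is declared as Collection<String> on purpose: on a List, a call such as remove(1) with an int literal selects the remove(int index) overload instead of remove(Object).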

What's the difference between raw types, unbounded wild cards and using Object in generics

I am reading the chapter on Generics in Effective Java.
Help me understand the difference between Set, Set<?> and Set<Object>.
The following paragraph is taken from the book.
As a quick review, Set<Object> is a parameterized type representing a
set that can contain objects of any type, Set<?> is a wildcard type
representing a set that can contain only objects of some unknown
type, and Set is a raw type, which opts out of the generic type
system.
What is meant by "some unknown type"? Are all unknown types of type Object? In that case what is the specific difference between Set<?> and Set<Object>?

A raw type (Set) treats the type as if it had no generic type information at all. Note the subtle effect that not only will the type argument T be ignored, but also all other type arguments that methods of that type might have. You can add any value to it and it will always return Object.
Set<Object> is a Set that accepts all Object objects (i.e. all objects) and will return objects of type Object.
Set<?> is a Set that accepts all objects of some specific, but unknown type and will return objects of that type. Since nothing is known about this type, you can't add anything to that set (except for null) and the only thing that you know about the values it returns is that they are some sub-type of Object.

At runtime, the JVM will just see Set because of type erasure.
At compile-time, there's a difference:
Set<Object> substitutes Object for the type parameter E; thus, Set.add(E element) becomes Set.add(Object element).
Set<?>, on the other hand, puts a wildcard in place of E, so Set.add(E element) would become Set.add(? element). Since that is not a valid declaration, the compiler treats the parameter as an unknown type that no expression other than null is known to match. It means that you cannot add anything to that set (except null). The reason is that the wildcard refers to an unknown type.

what is meant by "some unknown type"
Exactly what it means - the Set has some generic parameter, but we don't know what it is.
So the set assigned to a Set<?> variable might be a Set<String>, or a Set<Integer>, or a Set<Map<Integer, Employee>> or a set containing any other specific type.
So what does that mean for how you can use it? Well, anything you get out of it will be an instance of the ?, whatever that is. Since we don't know what the type parameter is, you can't say anything more specific than that elements of the set will be assignable to Object (only because all classes extend from it).
And if you're thinking of adding something to the set - well, the add method takes a ? (which makes sense, since this is the type of objects within the set). But if you try to add any specific object, how can you be sure this is type-safe? You can't - if you're inserting a String, you might be putting it into a Set<Integer> for example, which would break the type-safety you get from generics. So while you don't know the type of the generic parameter, you can't supply any arguments of this type (with the single exception of null, as that's an "instance" of any type).
As with most generics-related answers, this has focused on collections because they're easier to comprehend instinctively. However the arguments apply to any class that takes generic parameters - if it's declared with the unbounded wildcard parameter ?, you can't supply any arguments to it, and any values you receive of that type will only be assignable to Object.
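The points above can be tried out with a short sketch (SetFlavorsDemo and its helpers are illustrative names, not part of any library). Lines that would not compile are left as comments:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SetFlavorsDemo {
    // Works for a Set of any element type; null is the only value that can be added.
    static boolean addNull(Set<?> set) {
        // set.add("x"); // would not compile: the element type is unknown
        return set.add(null);
    }

    // remove(Object) does not mention the element type, so it is still callable.
    static boolean removeAny(Set<?> set, Object o) {
        return set.remove(o);
    }

    public static void main(String[] args) {
        Set<String> names = new HashSet<>(Arrays.asList("a", "b"));
        Set<?> unknown = names;          // any Set<X> can be assigned to Set<?>
        // Set<Object> objects = names;  // would not compile: Set<String> is not a Set<Object>

        System.out.println(addNull(unknown));        // true (HashSet permits a null element)
        System.out.println(removeAny(unknown, "a")); // true
        System.out.println(names.size());            // 2 ("b" and null remain)

        Set<Object> objects = new HashSet<>();
        objects.add("text");             // a Set<Object> accepts any object
        objects.add(42);
        System.out.println(objects.size());          // 2
    }
}
```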

Set: No generics here, unsafe. Add what you want.
Set<?>: A set of a certain type we don't know from our scope. The same as Set<? extends Object>. It can reference Sets of any type, but that type must be defined at the point where the set is actually instantiated. Through the wildcarded reference we can't add anything to the set except null (reading and remove(Object) are still allowed). It is like a view.
Set<Object>: A set whose type argument is exactly Object (not a subclass). I mean you can instantiate the set using collections of type Object, like HashSet<Object>, but not HashSet<String>. You can of course add elements of any type to the set, but only because everything is an Object or a subclass of Object. If the set were defined as Set<Number>, you could only add Numbers and subclasses of Number, and nothing more.

The difference between Set<Object> and Set<?> is that a variable of type Set<?> can have a more specific generic type assigned to it, as in:
Set<?> set = new HashSet<Integer>();
while Set<Object> can only be assigned Set<Object>:
Set<Object> set = new HashSet<Integer>(); // won't compile
The Set<Object> is still useful, since any object can be put into it. It is much like the raw Set in that sense, but works better with the generic type system.

I was explaining this item to my friend, who specifically asked for a "safeAdd" method as a counter to the unsafeAdd example. So here it is.
public static void main(String[] args) {
List<String> strings = new ArrayList<String>();
unsafeAdd(strings, Integer.valueOf(42)); // Compiles (unsafeAdd itself only gets an unchecked warning)
// New
safeAdd(strings, Integer.valueOf(42)); // Compile-time error: Integer is not a String
String s = strings.get(0); // Compiler-generated cast; throws ClassCastException after unsafeAdd
}
private static void unsafeAdd(List list, Object o) {
list.add(o);
}
private static <E> void safeAdd(List<E> list, E o) {
list.add(o);
}

Set<?> set = new HashSet<String>();
set.add(new Object()); // compile time error
Since we don’t know what the element type of set stands for, we cannot add objects
to it. The add() method takes arguments of type E, the element type of the Set.
When the actual type parameter is ?, it stands for some unknown type. Any
parameter we pass to add would have to be a subtype of this unknown type. Since we
don’t know what type that is, we cannot pass anything in. The sole exception is
null, which is a member of every type.
Given a Set<?>, we can iterate over it and make use of the elements. Each element
has an unknown type, but we always know that it is an object. It is therefore safe to
assign an element to a variable of type Object or pass it as a parameter
where the type Object is expected.

Let's say you are writing a common method to print the elements appearing in a List. Now, this method could be used for printing Lists of type Integer, Double, Object or any other type. Which one do you choose?
List<Object>: If we use this one, the method accepts only a List<Object>. It won't be useful for printing lists of other element types, such as List<Double>. This is because generic types are invariant: List<Double> is not a subtype of List<Object>, and subtyping has to be requested explicitly with a wildcard such as ? extends.
// Accepts only a List<Object>
public static void printList(List<Object> list) {
for (Object elem : list)
System.out.print(elem + " ");
System.out.println();
}
List<?> : This could help us have a common method for printing any datatype. We could use this method for printing instances of any type.
// The type would really depend on what is being passed
public static void printList(List<?> list) {
for (Object elem: list)
System.out.print(elem + " ");
System.out.println();
}

How the wildcards works in Java

I am reading the java tutorial about Wildcards in Generics. In the following code:
void printCollection(Collection<Object> c) {
for (Object e : c) {
System.out.println(e);
}
}
Does this mean the collection c takes Object as its element type, so we cannot call c.add("apple")
because "apple" is a String, while the for loop takes any Object element from collection c?
But I do not understand the following code,
void printCollection(Collection<?> c) {
for (Object e : c) {
System.out.println(e);
}
}
This code uses wildcards, meaning "a collection whose element type matches anything." Does this mean we can add any type of object to it, such as c.add("string");,
c.add(1);, and c.add(new apple());?
And the for loop takes each element e from collection c as an Object: if the elements are not of type Object, say they are Integers, does this code work? Does this mean they should be cast?

You got it almost exactly backwards.
A Collection<Object> can contain Object and subclasses of it, and since everything (including String) is a subclass of Object, you can add anything to such a collection. However, you cannot make any assumptions about its contents except that they're Objects.
On the other hand, a Collection<?> contains only instances of a specific unknown type (and its subclasses), but since you don't know which type it is, you cannot add anything (except null) to such a collection, nor make any assumptions about its contents (except that they're Objects, because everything is).
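A short sketch of both cases (WildcardPrintDemo and describe are names chosen for this example; describe returns the text instead of printing it so the result is easy to inspect):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class WildcardPrintDemo {
    // Accepts a collection of any element type; elements are read as Object, no cast needed.
    static String describe(Collection<?> c) {
        StringBuilder sb = new StringBuilder();
        for (Object e : c) {
            sb.append(e).append(' ');
        }
        // c.add("apple"); // would not compile: the element type is unknown
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        Collection<Object> anything = new ArrayList<>();
        anything.add("apple"); // a Collection<Object> accepts any object
        anything.add(1);
        System.out.println(describe(anything)); // apple 1

        List<Integer> ints = Arrays.asList(1, 2, 3);
        System.out.println(describe(ints));     // 1 2 3
        // A method declared as printCollection(Collection<Object> c) would reject `ints`.
    }
}
```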

In Angelika Langer's Java Generics FAQ, under the question "What is the difference between the unbounded wildcard parameterized type and the raw type?", you'll see that Collection<?> and the raw type Collection are almost equivalent.

In the case of the second statement, the "?" wildcard means that the element type is not specified. As a result, the type is bounded by Object, because that is the default bound when none is declared.
In fact, even Integer is a subclass of Object. If you mean int, you are right: that is a primitive and not a subclass of Object, but you can't put it into a Collection directly, since a Collection only holds subclasses of Object.
As for the question of whether the elements read from the collection need to be cast: no, that is not necessary, since they are all subclasses of Object. The compiler does not need an explicit cast there; it resolves the right type automatically.
