Varargs and Generics in java

Consider the following scenario:
<T> void function(T...args){
...code...
}
And then I call it using an Integer[]. How does the compiler decide that T is Integer, and not Integer[]? (Note: I'm glad that this is the case, but I still find the ambiguity odd.)
Furthermore, if I wanted T to be Integer[], is there any way for me to do that (assuming boxing/unboxing doesn't exist)?

The Java compiler is smart enough to know that, since you gave it an Integer[], you probably meant for T to be Integer, not Integer[]. I'd assume this is part of the Java language specification that defines ... as varargs.
If you want to specify what T is, you can do that with the following syntax:
Integer[] ary = { 1, 2, 3 };
myObj.function(ary); // T is Integer
myObj.<Integer>function(ary); // T is Integer
myObj.<Integer[]>function(ary); // T is Integer[]
<Integer>function(ary); // this is invalid; instead you could do...
this.<Integer>function(ary); // this if it's an instance method
MyClass.<Integer>function(ary); // or this if it's static
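To make the difference between the two choices of T visible, here is a minimal sketch that counts the elements the method actually receives. The class name VarargsInference and the method count are made up for this illustration; only the call syntax comes from the answer above.

```java
class VarargsInference {
    // Hypothetical helper: returns how many elements ended up in the varargs array.
    @SafeVarargs
    static <T> int count(T... args) {
        return args.length;
    }

    public static void main(String[] unused) {
        Integer[] ary = { 1, 2, 3 };
        // T is inferred as Integer: the array itself is used as args.
        System.out.println(count(ary)); // prints 3
        // T is Integer[]: the array becomes the single element of an Integer[][].
        System.out.println(VarargsInference.<Integer[]>count(ary)); // prints 1
    }
}
```

With T left to inference the method sees three elements; with T forced to Integer[] it sees one element, which is the whole array.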

Generics work on object references, so <T> can only stand for a reference type. int[] is a reference type (an array of int), while int is a primitive. Integer[] is also a reference type: an array of Integer, where Integer is itself a class.
The varargs parameter T... args expects an array of object references, so an int[] becomes a single element of that array, while an Integer[] is itself an array of object references and is used as args directly.
If you want to send an Integer[] as each element of your varargs, you can send an Integer[][]. I wrote an example:
public class SomeMain {
static <T> void foo(T...ts) {
for(T t : ts) {
System.out.println(t);
}
System.out.println();
}
public static void main(String[] args) {
int[] ints = { 1, 2, 3 };
Integer[] integers = { 1, 2, 3 };
foo(ints);
foo(integers);
//note: here each element of the varargs will be an Integer[]
foo(new Integer[][] { integers });
}
}
Output (the hash code of the array will change on every run):
[I@8dc8569
1
2
3
[Ljava.lang.Integer;@45bab50a

There are 3 phases in finding the applicable methods. In the 1st phase, javac tries to match argument types against method parameter types directly, with no boxing and no varargs expansion. In this phase the parameter type of the method is T[] and the argument type is Integer[]; the two match once T is inferred to be Integer, so the method is chosen as the applicable method (there are no other overloads to consider). No further phases are carried out.
If the 1st phase does not yield an applicable method, javac continues to the other phases. For example, if T is explicitly specified as Integer[], the method will not match in the 1st phase (because T[] would then be Integer[][], which does not match Integer[]).
In the 3rd phase, varargs are considered; javac matches T, not T[], against the trailing argument types.
This is indeed quite confusing, and it appears ambiguous to our intuition.

Note that generics aren't really relevant to the question. The exact same question would apply if the function signature were void function(Object... args): if you pass an expression of type Integer[], it could be interpreted either as the array args itself or as one of the elements of args.
The answer is that, basically, the compiler will prefer to use the argument as args if possible. Since the expression you are passing has an "array of reference type" type, it is compatible with args, and therefore that interpretation prevails.
Furthermore, if I wanted T to be Integer[], is there any way for me to do that (assuming boxing/unboxing doesn't exist)?
Since it is a generic method, you can explicitly specify the type argument when calling: this.<Integer[]>function(...).
But back to the more general question where the function signature is void function(Object... args). You could explicitly create the array of arguments yourself:
function(new Integer[][]{ myIntegerArray });
or (simpler) you can cast the expression to a type that is no longer an array of reference type:
function((Object)myIntegerArray);
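The three call forms discussed above can be compared side by side. This is only a sketch: the class ObjectVarargs and the helper count are hypothetical names, and javac may emit an "inexact argument type" warning for the calls that pass an array without a cast.

```java
class ObjectVarargs {
    // Hypothetical helper: reports how many elements the varargs array holds.
    static int count(Object... args) {
        return args.length;
    }

    public static void main(String[] unused) {
        Integer[] myIntegerArray = { 1, 2, 3 };
        // The array is used as args itself.
        System.out.println(count(myIntegerArray)); // prints 3
        // Cast to Object: the array is no longer an array type, so it becomes one element.
        System.out.println(count((Object) myIntegerArray)); // prints 1
        // Explicitly built argument array with the original array as its only element.
        System.out.println(count(new Integer[][] { myIntegerArray })); // prints 1
    }
}
```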

Related

How type inference work for method calls?

Consider the following example:
public class Learn {
public static <T> T test (T a, T b) {
System.out.println(a.getClass().getSimpleName());
System.out.println(b.getClass().getSimpleName());
b = a;
return a;
}
public static void main (String[] args) {
test("", new ArrayList<Integer>());
}
}
In the main method, I am calling test with a String and an ArrayList <Integer> object. Both are different things and assigning an ArrayList to String (generally) gives a compile error.
String aString = new ArrayList <Integer> (); // won't compile
But I am doing exactly that in the 3rd line of test, and the program compiles and runs fine. At first I thought that the type parameter T is replaced by a type that's compatible with both String and ArrayList (like Serializable). But the two println statements inside test print "String" and "ArrayList" as the types of a and b respectively. My question is: if a is a String and b is an ArrayList at runtime, how can we assign a to b?

For a generic method, the Java compiler will infer the most specific common type for both parameters a and b.
The inference algorithm determines the types of the arguments and, if available, the type that the result is being assigned, or returned. Finally, the inference algorithm tries to find the most specific type that works with all of the arguments.
You aren't assigning the result of the call to test to anything, so there is no target to influence the inference.
In this case, even String and ArrayList<Integer> have a common supertype, Serializable, so T is inferred as Serializable, and you can always assign one variable of the same type to another. For other examples, you may even find Object as the common supertype.
But just because you have variables of type T that are inferred as Serializable, the objects themselves are still a String and an ArrayList, so getting their classes and printing their names still prints String and ArrayList. You're not printing the type of the variables; you're printing the type of the objects.

test("", new ArrayList<Integer>());
This is equivalent to the following:
Learn.test("", new ArrayList<Integer>());
Which is also equivalent to the following:
Learn.<Serializable>test("", new ArrayList<Integer>());
This will not compile if you explicitly specify a generic type other than Serializable (or Object), such as String:
Learn.<String>test("", new ArrayList<Integer>()); // DOES NOT COMPILE
So essentially both parameters are treated as Serializable in your case.
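As a small illustration of the point that the variable's type and the object's type are different things, the sketch below assigns the result to a Serializable variable. The names InferenceDemo and first are invented for this example; the method has the same shape as test in the question.

```java
import java.io.Serializable;
import java.util.ArrayList;

class InferenceDemo {
    // Same shape as test(T, T) in the question: both parameters share one T.
    static <T> T first(T a, T b) {
        return a;
    }

    public static void main(String[] unused) {
        // Compiles: Serializable is a common supertype of String and ArrayList<Integer>.
        Serializable s = first("text", new ArrayList<Integer>());
        // The variable's static type is Serializable, but the object is still a String.
        System.out.println(s.getClass().getSimpleName()); // prints String
    }
}
```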

Understanding Principle of Truth In Advertising Java Generics

I have been trying to understand Java generics properly. In this quest I have come across one principle, the "Principle of Truth in Advertising", and I am trying to understand it in simple language.
The Principle of Truth in Advertising: the reified type of an array must be a subtype of the erasure of its static type.
I have written sample code as follows. Please go through the code and explain which part of the code corresponds to which part of the above statement.
I have written comments in the code, so I think I should not describe the code here.
public class ClassA {
//when used this method throws exception
//java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Ljava.lang.String;
public static <T> T[] toArray(Collection<T> collection) {
//Array created here is of Object type
T[] array = (T[]) new Object[collection.size()];
int i = 0;
for (T item : collection) {
array[i++] = item;
}
return array;
}
//Working fine , no exception
public static <T> T[] toArray(Collection<T> collection, T[] array) {
if (array.length < collection.size()) {
//Array created here is of the correct intended type, not Object[]
//So I think it indicates "the reified type of an array": the type here, let's say String[],
// is a subtype of Object[] (the erasure), so there is actually no problem
array = (T[]) Array.newInstance(array.getClass().getComponentType(), collection.size());
}
int i = 0;
for (T item : collection) {
array[i++] = item;
}
return array;
}
public static void main(String[] args) {
List<String> list = Arrays.asList("A", "B");
String[] strings = toArray(list);
// String[] strings = toArray(list,new String[]{});
System.out.println(strings);
}
}
Please try to explain in simple language.Please point out where I am wrong. Corrected code with more comments is appreciated.
Thank you all

I refer to Java Generics and Collections as the Book and the Book's authors as the Authors.
I would upvote this question more than once, as the Book does a poor job of explaining the principle, IMO.
Statement
Principle of Truth in Advertising:
the reified type of an array must be a subtype of the erasure of its static type.
Further referred to as the Principle.
How does the principle help?
Follow it and the code will compile and run without exceptions
Do not follow it and the code will compile, but throw an exception at runtime.
Vocabulary
What is a static type?
Should be called the reference type.
Provided A and B are types, in the following code
A ref = new B();
A is the static type of ref (B is the dynamic type of ref). Academia parlance term.
What is the reified type of an array?
Reification means type information available at runtime. Arrays are said to be reifiable because the VM knows their component type (at runtime).
In arr2 = new Number[30], the reified type of arr2 is Number[] an array type with component type Number.
What is the erasure of a type?
Should be called the runtime type.
The virtual machine's view (the runtime view) of a type parameter.
Provided T is a type parameter, the runtime view of the following code
<T extends Comparable<T>> void stupidMethod(T[] elems) {
T first = elems[0];
}
will be
void stupidMethod(Comparable[] elems) {
Comparable first = elems[0];
}
That makes Comparable the runtime type of T. Why Comparable? Because that's the leftmost bound of T.
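One way to observe the erased signature is through reflection. The following is a sketch, not part of the Book: ErasureDemo and erasedParameterType are hypothetical names, and the lookup succeeds only because the erased parameter type really is Comparable[].

```java
import java.lang.reflect.Method;

class ErasureDemo {
    // Same signature as stupidMethod above.
    static <T extends Comparable<T>> void stupidMethod(T[] elems) {
        T first = elems[0];
    }

    // Looks the method up by its erased parameter type and returns that type.
    static Class<?> erasedParameterType() {
        try {
            Method m = ErasureDemo.class.getDeclaredMethod("stupidMethod", Comparable[].class);
            return m.getParameterTypes()[0];
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] unused) {
        System.out.println(erasedParameterType().getSimpleName()); // prints Comparable[]
    }
}
```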
What kind of code do I look at so that the Principle is relevant?
The code should imply assignment to a reference of array type. Either the lvalue or the rvalue should involve a type parameter.
e.g. provided T is a type parameter
T[] a = (T[])new Object[0]; // type parameter T involved in lvalue
or
String[] a = toArray(s); // type parameter involved in rvalue
// where the signature of toArray is
<T> T[] toArray(Collection<T> c);
The principle is not relevant where there are no type parameters involved in either lvalue or rvalue.
Example 1 (Principle followed)
<T extends Number> void stupidMethod(List<T>elems) {
T[] ts = (T[]) new Number[0];
}
Q1: What is the reified type of the array ts is referencing?
A1: Array creation provides the answer: an array with component type Number is created using new. Number[].
Q2: What is the static type of ts?
A2: T[]
Q3: What is the erasure of the static type of ts?
A3: For that we need the erasure of T. Given that T extends Number is bounded, T's erasure type is its leftmost boundary - Number. Now that we know the erasure type for T, the erasure type for ts is Number[]
Q4: Is the Principle followed?
A4: restating the question. Is A1 a subtype of A3? i.e. is Number[] a subtype of Number[]? Yes => That means the Principle is followed.
Example 2 (Principle not followed)
<T extends Number> void stupidMethod(List<T>elems) {
T[] ts = (T[]) new Object[0];
}
Q1: What is the reified type of the array ts is referencing?
A1: Array creation using new, component type is Object, therefore Object[].
Q2: What is the static type of ts?
A2: T[]
Q3: What is the erasure of the static type of ts?
A3: For that we need the erasure of T. Given that T extends Number is bounded, T's erasure type is its leftmost boundary - Number. Now that we know the erasure type for T, the erasure type for ts is Number[]
Q4: Is the Principle followed?
A4: restating the question. Is A1 a subtype of A3? i.e. is Object[] a subtype of Number[]? No => That means the Principle is not followed.
Expect an exception to be thrown at runtime.
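Examples 1 and 2 can be run directly. In this sketch the class PrincipleExamples and the method names followed and notFollowed are made up; the bodies are the two assignments from the examples, with a return value added so the result can be checked.

```java
class PrincipleExamples {
    // Example 1: reified type Number[] is a subtype of the erasure Number[].
    @SuppressWarnings("unchecked")
    static <T extends Number> int followed() {
        T[] ts = (T[]) new Number[0];
        return ts.length;
    }

    // Example 2: reified type Object[] is not a subtype of the erasure Number[].
    @SuppressWarnings("unchecked")
    static <T extends Number> int notFollowed() {
        T[] ts = (T[]) new Object[0]; // erased to a cast to Number[], which fails at runtime
        return ts.length;
    }

    public static void main(String[] unused) {
        System.out.println(followed()); // prints 0
        try {
            notFollowed();
        } catch (ClassCastException e) {
            System.out.println("ClassCastException");
        }
    }
}
```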
Example 3 (Principle not followed)
Given the method providing an array
<T> T[] toArray(Collection<T> c){
return (T[]) new Object[0];
}
client code
List<String> s = ...;
String[] arr = toArray(s);
Q1: What is the reified type of the array returned by the providing method?
A1: for that you need to look in the providing method to see how it's initialized - new Object[...]. That means the reified type of the array returned by the method is Object[].
Q2: What is the static type of arr?
A2: String[]
Q3: What is the erasure of the static type of arr?
A3: No type parameters involved. The type after erasure is the same as the static type String[].
Q4: Is the Principle followed?
A4: restating the question. Is A1 a subtype of A3? i.e. is Object[] a subtype of String[]? No => That means the Principle is not followed.
Expect an exception to be thrown at runtime.
Example 4 (Principle followed)
Given the method providing an array
<T> T[] toArray(Collection<T> initialContent, Class<T> clazz){
T[] result = (T[]) Array.newInstance(clazz, initialContent.size());
// Copy contents to array. (Don't use this method in production, use Collection.toArray() instead)
return result;
}
client code
List<Number> s = ...;
Number[] arr = toArray(s, Number.class);
Q1: What is the reified type of the array returned by the providing method?
A1: array created using reflection with component type as received from the client. The answer is Number[].
Q2: What is the static type of arr?
A2: Number[]
Q3: What is the erasure of the static type of arr?
A3: No type parameters involved. The type after erasure is the same as the static type Number[].
Q4: Is the Principle followed?
A4: restating the question. Is A1 a subtype of A3? i.e. is Number[] a subtype of Number[]? Yes => That means the Principle is followed.
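Examples 3 and 4 can be put next to each other in one runnable sketch. The class name ToArrayExamples and the method names untruthful and truthful are invented here; the element-copying loop in truthful is an assumption filled in for the "Copy contents to array" comment.

```java
import java.lang.reflect.Array;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class ToArrayExamples {
    // Example 3: always creates an Object[], whatever T is at the call site.
    @SuppressWarnings("unchecked")
    static <T> T[] untruthful(Collection<T> c) {
        return (T[]) new Object[c.size()];
    }

    // Example 4: creates an array whose component type is the class passed in.
    @SuppressWarnings("unchecked")
    static <T> T[] truthful(Collection<T> c, Class<T> clazz) {
        T[] result = (T[]) Array.newInstance(clazz, c.size());
        int i = 0;
        for (T item : c) {
            result[i++] = item;
        }
        return result;
    }

    public static void main(String[] unused) {
        List<String> s = Arrays.asList("A", "B");
        String[] ok = truthful(s, String.class);
        System.out.println(ok.getClass().getSimpleName()); // prints String[]
        try {
            String[] bad = untruthful(s); // compiler-inserted cast to String[] fails here
        } catch (ClassCastException e) {
            System.out.println("ClassCastException");
        }
    }
}
```

Note that the exception surfaces in the client code, at the assignment to String[], not inside the providing method.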
What's in a funny name?
Ranting here. Truth in advertising may mean selling what you state you are selling.
In
lvalue = rvalue we have rvalue as the provider and lvalue as the receiver.
It might be that the Authors thought of the provider as the Advertiser.
Referring to the providing method in Example 3 above,
<T> T[] toArray(Collection<T> c){
return (T[]) new Object[0];
}
the method signature
<T> T[] toArray(Collection<T> c);
may be read as an advertisement: Give me a List of Longs and I will give you an array of Longs.
However looking in the method body, the implementation shows that the method is not being truthful, as the array it creates and returns is an array of Objects.
So toArray method in Example 3 lies in its marketing campaigns.
In Example 4, the providing method is being truthful as the statement in the signature (Give me a collection and its type parameter as a class literal and I will give you an array with that component type) matches with what happens in the body.
Examples 3 and 4 have method signatures to act as advertisement.
Examples 1 and 2 do not have an explicit advertisement (method signature). The advertisement and the provision are intertwined.
Nevertheless, I could think of no better name for the Principle. That is a hell of a name.
Closing remarks
I consider the statement of the principle unnecessarily cryptic due to the use of terms like static type and erasure type. Using reference type and runtime type/type after erasure, respectively, would make it considerably easier to grasp for the Java layman (like yours truly).
The Authors state the Book is the best on Java Generics [0]. I think that means the audience they address is a broad one and therefore more examples for the principles they introduce would be very helpful.
[0] https://youtu.be/GOMovkQCYD4?t=53

Think of it this way:
T[] array = (T[]) new Object[collection.size()]; A new array is created. Due to the language design, the type of T is unknown at runtime. In your example you know for a fact that T is String, but from the viewpoint of the VM T is Object. All casting operations happen in the calling method.
So in toArray an Object[] is created. The type parameter is more or less syntactic sugar which has no consequence for the bytecode created.
So why can't an array of objects be cast to an array of strings?
Let's have an example:
void methodA(){
Object[] array = new Object[10];
array[0]=Integer.valueOf(10);
array[1]=Object.class;
array[2]=new Object();
array[3]="Hello World";
methodB((String[])array);
}
void methodB(String[] stringArray){
String aString=stringArray[1]; //This is not a String, but Object.class!
}
If you could cast an array, you'd say "all elements I've added before are of a valid subtype". But since your array is of type Object, the vm can't guarantee the array will always under all circumstances contain valid subtypes.
methodB thinks it deals with an array of Strings, but in reality the array does contain very different types.
The other way around does not work either:
void methodA(){
String[] array = new String[10];
array[0]="Hello World";
methodB((Object[])array);
//methodB had control over the array and could have added any object, especially a non-string!
System.out.println(array[1]);
}
void methodB(Object[] oArray){
oArray[1]=Long.valueOf(2);
}
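What the JVM does with the store in methodB can be checked with a short sketch. ArrayStoreDemo, store, and storeIsRejected are hypothetical names; the store itself is the one from methodB above.

```java
class ArrayStoreDemo {
    // Mirrors methodB above: writes a Long through an Object[] reference.
    static void store(Object[] oArray) {
        oArray[1] = Long.valueOf(2);
    }

    // Returns true if the JVM rejected the store into the String[].
    static boolean storeIsRejected() {
        String[] array = new String[10];
        try {
            store(array); // String[] is accepted where Object[] is expected
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] unused) {
        System.out.println(storeIsRejected()); // prints true
    }
}
```

Because the array knows its component type at runtime, the bad write is caught at the store with an ArrayStoreException rather than later when the element is read.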
I hope this helps a little bit.
Edit: After reading your question again, I think you are mixing two things:
Arrays can't be cast (as I explained above)
The cited sentence says in plain English: "If you create an array of type A, all elements in this array must be of type A or of a subtype of A". So if you create an array of Object you can put any Java object into the array, but if you create an array of Number the values have to be of type Number (Long, Double, ...). All in all the sentence is rather trivial. Or I didn't understand it either ;)
Edit 2: As a matter of fact you can cast an array to any type you want. That is, the cast compiles, just as a cast of any Object reference to String compiles (Object o = Object.class; String s = (String) o;).
In particular you can cast a String[] to an Object[] and the other way around. As I pointed out in the examples, this operation introduces potential bugs in great numbers, since reading from or writing to the array will likely fail. I can think of no situation where it is a good decision to cast an array. There might be situations (like generalized utility classes) where it seems to be a good solution, but I would still suggest rethinking the design if you find yourself in a situation where you want to cast an array.
Thanks to newacct for pointing out the cast operation itself is valid.

Why can't we just use arrays instead of varargs?

I just came across varargs while learning Android (doInBackground(Type... params)); SO posts clarified how to use it.
My question is: why can't we just use arrays instead of varargs?
public void foo(String...strings) { }
I can replace this type of call by packing my variable number of arguments into an array and passing it to a method such as this:
public void foo(String[] alternativeWay){ }
Also, does main(String[] args) in Java use varargs? If not, how are we able to pass runtime parameters to it?
Please suggest the benefits or uses of varargs, and is there anything else important to know about varargs?

The only difference between
foo(String... strings)
and
foo(String[] strings)
is for the calling code. Consider this call:
foo("a", "b");
That's valid with the first declaration of foo, and the compiler will emit code to create an array containing references to "a" and "b" at execution time. It's not valid with the second declaration of foo though, because that doesn't use varargs.
In either case, it's fine for the caller to explicitly create the array:
foo(new String[] { "a", "b" }); // Valid for either declaration
Also does main(String[] args) in java use varargs , if not how are we able to pass runtime parameters to it
When it's written as main(String[] args) it doesn't; if you write main(String... args) then it does. It's irrelevant to how the JVM treats it though, because the JVM initialization creates an array with the command line arguments. It would only make a difference if you were writing your own code to invoke main explicitly.
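The points above can be tried out with a small sketch. The class VarargsVersusArray and its methods are made up for the illustration; Method.isVarArgs() is used only to show that the two declarations differ in a flag, not in parameter type.

```java
import java.lang.reflect.Method;

class VarargsVersusArray {
    static int viaVarargs(String... strings) { return strings.length; }
    static int viaArray(String[] strings) { return strings.length; }

    // Both methods have a String[] parameter; only the varargs flag differs.
    static boolean isVarArgs(String name) {
        try {
            Method m = VarargsVersusArray.class.getDeclaredMethod(name, String[].class);
            return m.isVarArgs();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // main may also be declared with varargs; the JVM passes an array either way.
    public static void main(String... args) {
        System.out.println(viaVarargs("a", "b"));                  // compiler builds the array
        System.out.println(viaVarargs(new String[] { "a", "b" })); // caller builds the array
        System.out.println(viaArray(new String[] { "a", "b" }));
        System.out.println(isVarArgs("viaVarargs") + " " + isVarArgs("viaArray")); // prints true false
    }
}
```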

We could use arrays instead of varargs. Varargs are syntactic sugar for using arrays. But they make your code more compact and more readable. Compare
private void foo(String... ss) { ... }
private void bar() {
...
foo("One", "Two", "Three");
...
}
with
private void foo(String[] ss) { ... }
private void bar() {
...
foo(new String[] { "One", "Two", "Three" });
...
}
Similarly, we don't need the diamond operator (<>, Java 7) or lambdas (Java 8) either. But they do make code more readable and therefore more maintainable.

One advantage of varargs is for methods requiring at least one parameter, such as max. With varargs you can do it like this
static int max(int first, int... remaining) {
int max = first;
for (int number : remaining)
max = Math.max(max, number);
return max;
}
This is great, because it is impossible to pass no parameters to the max method, and the calling code for max is really clean: max(2, 4, 1, 8, 9). Without varargs the only way to have enforced the condition that at least one number should be passed would have been to have thrown an exception at runtime if the array had length 0 (always best avoided) or to force the caller to write max(2, new int[] {4, 1, 8, 9}) which is really ugly.

Because your function call looks more like a function call, e.g.:
new MyAsyncTask().execute("str1", "str2");
looks better than:
new MyAsyncTask().execute(new String[]{"str1", "str2"});
There is no magic behind AsyncTask. Very often you don't really need to pass any parameters; sometimes you pass parameters to the constructor instead of execute. There are also implementations of AsyncTask:
https://github.com/roboguice/roboguice/blob/master/roboguice/src/main/java/roboguice/util/SafeAsyncTask.java
that don't use varargs at all.

Using a different method signature in Java

The following two methods compile fine and do what they are meant to do.
public int returnArray()[]
{
int a[]={1 ,2};
return a;
}
public String[] returnArray(String[] array[])[]
{
return array;
}
According to this method signature, can't we somehow have a method signature like the following?
public <T>List rerurnList(List<T> list)<T>
{
return new ArrayList<T>();
}
This method is intended to return a java.util.List of generic type. It does not compile. It must be modified as follows for its successful compilation.
public <T>List<T> rerurnList(List<T> list)
{
return new ArrayList<T>();
}
Can't we have a method signature like the first cases, in this case?

For some reason Java lets you define arrays like in C, adding the [] modifier after the variable or method name. That, however, is not possible with generics.
Generic type arguments have to be declared right with the type, because they are part of the type descriptor. Arrays should also be declared that way, as they are also part of the type descriptor.
In order to understand why the compiler does not let you write things that way (and why it shouldn't let you write things like in the first examples), we need to break it down to pieces.
public int returnArray()[] { ... }
public: Visibility declaration
int: Return type, integer
returnArray: Method name
(): Argument list (empty)
[]: Whoops! the return type is actually an array of what we said before
This is even better:
public String[] returnArray(String[] array[])[]
public: Visibility declaration
String[]: Return type, an array of strings
returnArray: Method name
(String[] array[]): Argument list...
String[]: Type of the argument, array of strings
array: Name of the argument
[]: Whoops! argument type is actually an array of what we said before
[]: Whoops again! return type is actually an array of what we said before
Foot note: Don't do this, specify the types only in the types. Instead of String[] array[], use String[][] array.
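To confirm that the trailing [] really is just another way of writing the return type, the two placements can be compared by reflection. ArraySyntaxDemo, legacy, preferred, and returnTypeOf are hypothetical names used only in this sketch.

```java
class ArraySyntaxDemo {
    // Legacy C-style placement: the trailing [] makes the return type int[].
    static int legacy()[] {
        return new int[] { 1, 2 };
    }

    // Preferred: the whole type is written in one place.
    static int[] preferred() {
        return new int[] { 1, 2 };
    }

    static Class<?> returnTypeOf(String name) {
        try {
            return ArraySyntaxDemo.class.getDeclaredMethod(name).getReturnType();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] unused) {
        System.out.println(returnTypeOf("legacy") == returnTypeOf("preferred")); // prints true
    }
}
```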
Now that the array syntax is clear, and I hope you understand why it should be considered wrong, let's begin with the generic part:
public <T> List<T> rerurnList(List<T> list) { ... }
public: Visibility declaration
<T>: This method uses generic type T
List<T>: Return type, a generic List of T
rerurnList: Method name
(List<T> list): Argument list
List<T>: Argument type, generic List of T
list: Argument name

Just to answer your question: the syntax you're trying for the method that fails compilation is definitely wrong because of the <T> placed between the parameter list and the method body:
(List<T> list)<T>{
This is simply not valid Java syntax. That's not how you mark a generic method. You already marked the method as generic by putting the type parameter - <T> - between the method access modifier and its return type.

Java SafeVarargs annotation, does a standard or best practice exist?

I've recently come across the Java @SafeVarargs annotation. Googling for what makes a variadic function in Java unsafe left me rather confused (heap poisoning? erased types?), so I'd like to know a few things:
What makes a variadic Java function unsafe in the @SafeVarargs sense (preferably explained in the form of an in-depth example)?
Why is this annotation left to the discretion of the programmer? Isn't this something the compiler should be able to check?
Is there some standard one must adhere to in order to ensure a function is indeed varargs-safe? If not, what are the best practices to ensure it?

1) There are many examples on the Internet and on StackOverflow about the particular issue with generics and varargs. Basically, it's when you have a variable number of arguments of a type-parameter type:
<T> void foo(T... args);
In Java, varargs are syntactic sugar that undergoes a simple "re-writing" at compile-time: a varargs parameter of type X... is converted into a parameter of type X[]; and every time a call is made to this varargs method, the compiler collects all of the "variable arguments" that go in the varargs parameter, and creates an array just like new X[] { ...(arguments go here)... }.
This works well when the varargs type is concrete like String.... When it's a type variable like T..., it also works when T is known to be a concrete type for that call. e.g. if the method above were part of a class Foo<T>, and you have a Foo<String> reference, then calling foo on it would be okay because we know T is String at that point in the code.
However, it does not work when the "value" of T is another type parameter. In Java, it is impossible to create an array of a type-parameter component type (new T[] { ... }). So Java instead uses new Object[] { ... } (here Object is the upper bound of T; if the upper bound were something different, it would be that instead of Object), and then gives you a compiler warning.
So what is wrong with creating new Object[] instead of new T[] or whatever? Well, arrays in Java know their component type at runtime. Thus, the passed array object will have the wrong component type at runtime.
For probably the most common use of varargs, simply to iterate over the elements, this is no problem (you don't care about the runtime type of the array), so this is safe:
@SafeVarargs
final <T> void foo(T... args) {
for (T x : args) {
// do stuff with x
}
}
However, for anything that depends on the runtime component type of the passed array, it will not be safe. Here is a simple example of something that is unsafe and crashes:
class UnSafeVarargs
{
static <T> T[] asArray(T... args) {
return args;
}
static <T> T[] arrayOfTwo(T a, T b) {
return asArray(a, b);
}
public static void main(String[] args) {
String[] bar = arrayOfTwo("hi", "mom");
}
}
The problem here is that we depend on args being a T[] in order to return it as T[]. But the array actually passed at runtime is not an instance of T[]: arrayOfTwo creates an Object[], so the assignment to String[] in main fails with a ClassCastException.
3) If your method has an argument of type T... (where T is any type parameter), then:
Safe: If your method only depends on the fact that the elements of the array are instances of T
Unsafe: If it depends on the fact that the array is an instance of T[]
Things that depend on the runtime type of the array include: returning it as type T[], passing it as an argument to a parameter of type T[], getting the array type using .getClass(), passing it to methods that depend on the runtime type of the array, like List.toArray() and Arrays.copyOf(), etc.
2) The distinction I mentioned above is too complicated to be easily distinguished automatically.
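The runtime component type of the varargs array can be inspected directly. In this sketch, RuntimeArrayType, arrayClass, and viaTypeVariable are invented names; arrayClass is deliberately left without @SafeVarargs because it depends on the runtime type of the array, which is exactly the unsafe kind of use described above.

```java
class RuntimeArrayType {
    // Reports the runtime class of the array the compiler created for the call.
    static <T> Class<?> arrayClass(T... args) {
        return args.getClass();
    }

    // T is still a type variable here, so the compiler has to create an Object[].
    static <T> Class<?> viaTypeVariable(T a, T b) {
        return arrayClass(a, b);
    }

    public static void main(String[] unused) {
        // T is known to be String at this call site.
        System.out.println(arrayClass("hi", "mom").getSimpleName()); // prints String[]
        // Same values, but routed through a generic method.
        System.out.println(viaTypeVariable("hi", "mom").getSimpleName()); // prints Object[]
    }
}
```

The second call is the one the compiler warns about: the array it creates is an Object[], even though its static type is T[].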

For best practices, consider this.
If you have this:
public <T> void doSomething(A a, B b, T... manyTs) {
// Your code here
}
Change it to this:
public <T> void doSomething(A a, B b, T... manyTs) {
doSomething(a, b, Arrays.asList(manyTs));
}
private <T> void doSomething(A a, B b, List<T> manyTs) {
// Your code here
}
I've found I usually only add varargs to make it more convenient for my callers. It would almost always be more convenient for my internal implementation to use a List<>. So to piggy-back on Arrays.asList() and ensure there's no way I can introduce Heap Pollution, this is what I do.
I know this only answers your #3. newacct has given a great answer for #1 and #2 above, and I don't have enough reputation to just leave this as a comment. :P
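A concrete version of the delegation pattern might look like the sketch below. Joiner and join are hypothetical names standing in for doSomething, and the String separator stands in for the A and B parameters.

```java
import java.util.Arrays;
import java.util.List;

class Joiner {
    // Convenience overload: accepts varargs and immediately delegates to the List version.
    static <T> String join(String separator, T... items) {
        return join(separator, Arrays.asList(items));
    }

    // The real work only ever sees a List<T>, never the array.
    static <T> String join(String separator, List<T> items) {
        StringBuilder sb = new StringBuilder();
        for (T item : items) {
            if (sb.length() > 0) {
                sb.append(separator);
            }
            sb.append(item);
        }
        return sb.toString();
    }

    public static void main(String[] unused) {
        System.out.println(join(", ", 1, 2, 3)); // prints 1, 2, 3
    }
}
```

Inside the varargs overload, the call with a List argument resolves to the List overload, so there is no accidental recursion.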

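@SafeVarargs is used to indicate that a method will not cause heap pollution.
Heap pollution occurs when a variable of a parameterized type refers to an object that is not of that parameterized type; with a generic varargs array, elements of different types can end up mixed in the same array.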
For example:
public static <T> T[] unsafe(T... elements) {
return elements;
}
Object [] listOfItems = unsafe("some value", 34, new ArrayList<>());
String stringValue = (String) listOfItems[0]; // some value
String intValue = (String) listOfItems[1]; // ClassCastException
As you can see, such an implementation can easily cause a ClassCastException if we guess the element type wrong.
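A varargs parameter of a parameterized type shows heap pollution more directly. The sketch below follows a well-known textbook pattern; HeapPollutionDemo, faulty, and pollutionDetected are names chosen for this illustration.

```java
import java.util.Arrays;
import java.util.List;

class HeapPollutionDemo {
    // Not annotated with @SafeVarargs on purpose: this method pollutes the heap.
    static String faulty(List<String>... lists) {
        Object[] objects = lists;       // legal: arrays are covariant
        objects[0] = Arrays.asList(42); // a List<Integer> now sits in a List<String>[] slot
        return lists[0].get(0);         // compiler-inserted cast to String fails here
    }

    static boolean pollutionDetected() {
        try {
            faulty(Arrays.asList("a"), Arrays.asList("b"));
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] unused) {
        System.out.println(pollutionDetected()); // prints true
    }
}
```

The store into objects[0] succeeds because the array's runtime component type is the raw List; the failure only shows up later, when an element is read as a String.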
