Why doesn't Java allow for the creation of generic arrays?

There are plenty of questions on Stack Overflow from people who have attempted to create an array of generics like so:
ArrayList<Foo>[] poo = new ArrayList<Foo>[5];
And the answer of course is that the Java specification doesn't allow you to create an array of a parameterized type.
My question, however, is why? What is the technical reason underlying this restriction in the Java language or the Java VM? It's a technical curiosity I've always wondered about.

Arrays are reified - they retain type information at runtime.
Generics are a compile-time construct - the type arguments are erased by the compiler and are not available at runtime. This was a deliberate decision to allow backward compatibility with pre-generics Java bytecode. The consequence is that you cannot create an array of a generic type: an array must check every store against its element type at runtime, and after erasure the VM only knows the raw type (ArrayList), not the parameterized one (ArrayList<Foo>).
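To make "arrays are reified" concrete, here is a minimal sketch (the class and method names are made up for illustration). The array remembers that it is a String[] and the VM rejects a store of the wrong type at runtime, even though the compiler accepted it:

```java
final class ReifiedArrayDemo {
    // Returns true if the JVM rejected storing an Integer into a String[].
    static boolean storeIsRejected() {
        Object[] objects = new String[1];     // legal: arrays are covariant
        try {
            objects[0] = Integer.valueOf(42); // compiles, but fails the runtime store check
            return false;
        } catch (ArrayStoreException e) {
            return true;                      // the array knew its element type was String
        }
    }

    public static void main(String[] args) {
        System.out.println(storeIsRejected()); // prints "true"
    }
}
```

A generic container has no equivalent check, because its type argument is gone by the time the code runs.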

See Effective Java, Item 25.

Here is an old blog post I wrote where I explain the problem: Java generics quirks
See How do I generically create objects and arrays? from Angelika Langer's Java Generics FAQ for a workaround (you can do it using reflection). That FAQ contains everything you ever want to know about Java generics.
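For reference, the reflection workaround usually looks something like the sketch below. This is not copied from the FAQ; the helper name newArray is hypothetical. It relies on java.lang.reflect.Array.newInstance and requires the caller to pass a Class token, since the type argument itself is not available at runtime:

```java
import java.lang.reflect.Array;

final class GenericArrayFactory {
    // Creates a T[] whose runtime component type is the class passed in.
    // The cast is unchecked; it is safe for reference types, but would fail
    // for a primitive token such as int.class (that yields an int[]).
    @SuppressWarnings("unchecked")
    static <T> T[] newArray(Class<T> componentType, int length) {
        return (T[]) Array.newInstance(componentType, length);
    }

    public static void main(String[] args) {
        String[] names = newArray(String.class, 3);
        names[0] = "a";
        System.out.println(names.getClass().getSimpleName()); // prints "String[]"
        System.out.println(names.length);                     // prints "3"
    }
}
```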

You are forced to use
ArrayList<Foo>[] poo = new ArrayList[5];
which will give you an unchecked warning. The reason is that there is a potential type safety issue between generics and the runtime type-checking behavior of Java arrays, and the language designers want to make sure you are aware of it when you program. When you write new ArrayList[...], you create an array that checks at runtime that everything put into it is an instance of ArrayList. By the same scheme, new ArrayList<Foo>[...] would be expected to create an array that checks at runtime that everything put into it is an instance of ArrayList<Foo>. But that check is impossible to do at runtime, because there is no generics info at runtime.
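Here is a small sketch of the type safety issue being described, using String and Integer in place of Foo so that it is self-contained (the class name is made up). The array store check only sees "ArrayList", so a list of the wrong element type gets in, and the failure shows up later, at a cast the compiler inserted:

```java
import java.util.ArrayList;
import java.util.List;

final class HeapPollutionDemo {
    // Returns true if reading from the polluted array blew up with a ClassCastException.
    static boolean pollute() {
        @SuppressWarnings("unchecked")
        List<String>[] lists = new ArrayList[1]; // the unchecked assignment from the answer
        Object[] alias = lists;                  // legal: arrays are covariant

        List<Integer> ints = new ArrayList<>();
        ints.add(42);
        alias[0] = ints;                         // passes the store check: it is an ArrayList

        try {
            String s = lists[0].get(0);          // hidden cast to String fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(pollute()); // prints "true"
    }
}
```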

Related

Why does the Java compiler know what you want to do with arrays but not what you want to do with generics?

I was reading Thinking in Java, 4th edition, and in the Generics chapter I found the following passage, which explains why arrays support covariance but generics don't:
The real issue is that we are talking about the type of the container,
rather than the type that the container is holding. Unlike arrays,
generics do not have built-in covariance.
This is because arrays are completely defined in the language and can
thus have both compile-time and run-time checks built in, but with
generics, the compiler and runtime system cannot know what you want to
do with your types and what the rules should be.
But I cannot really understand what this paragraph means. Why does the compiler know what you want to do with arrays but not what you want to do with generics? Can anybody give me an example?
First of all, the quote is very unclear. Both arrays and generics are "completely defined in the language", albeit in a different way. And neither the compiler nor the run-time system can read your mind and thus do not know "what you want to do with your types".
The quote seems to refer to the fact that arrays are reified, while generics are not: at runtime, the element type of a List is not known, but the element type of an array is. That is: at runtime, both a List<String> and a List<Integer> have the same type (a List), whereas a String[] and an Integer[] have different types. At compile time, however, List<String> and List<Integer> are different types.
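You can observe this directly with getClass(). A minimal sketch (class and method names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;

final class RuntimeTypeDemo {
    // Two lists with different type arguments share one runtime class.
    static boolean listsShareClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        return strings.getClass() == ints.getClass();
    }

    // Two arrays with different element types have different runtime classes.
    static boolean arraysShareClass() {
        Object strings = new String[0];
        Object ints = new Integer[0];
        return strings.getClass() == ints.getClass();
    }

    public static void main(String[] args) {
        System.out.println(listsShareClass());  // prints "true"
        System.out.println(arraysShareClass()); // prints "false"
    }
}
```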
The reason for this is mostly a historical one. Arrays were introduced in the very first version of Java, and there was no reason not to make the element type of an array known at runtime. When generics were introduced in Java 5, however, the goal was to make new Java code compatible with old Java code (and vice-versa), and therefore generic type information had to be erased by the compiler so that it never reaches the runtime.
To support legacy code, newer versions of Java allow generic code to be used together with older, non-generic code. While creating the bytecode, the Java compiler replaces every type parameter with its bound (Object if the parameter is unbounded) and inserts casts where values are read back out. The produced bytecode, therefore, contains only ordinary classes, interfaces, and methods. This is called type erasure. (All of this hassle JUST to support legacy code).
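As a rough illustration of what erasure amounts to, the two methods below behave the same way. The second one is written by hand in pre-Java 5 style and is only an approximation of what the compiler produces for the first; the names are made up:

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    // Generic version: no cast in the source, the compiler inserts one.
    static String firstGeneric() {
        List<String> names = new ArrayList<>();
        names.add("Ada");
        return names.get(0);
    }

    // Hand-written raw version with the cast spelled out.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String firstRaw() {
        List names = new ArrayList();
        names.add("Ada");
        return (String) names.get(0);
    }

    public static void main(String[] args) {
        System.out.println(firstGeneric().equals(firstRaw())); // prints "true"
    }
}
```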
You can find a detailed explanation at my blog here.

How is backward compatibility maintained by not allowing primitives in generics in Java?

I know that Java does not allow primitive data types to be used in generics, i.e.
List<int> l = new List<int>();
is not allowed.
I have read a related post which states that this is for the purpose of backward compatibility. Can anyone explain how not allowing primitives to be used in generics maintains backward compatibility? I would greatly appreciate a small explanation with an example.
One sub-question: what are the major and minor drawbacks of how generics are implemented in Java?
Your response will be greatly appreciated.
As explained here (with examples), generics in Java only exist at compile time. Under the hood, a generic collection is really a non-generic collection, and all of its contents are stored as Object references. In pre-generics times (pre Java 5), collections could not contain primitives because a primitive is not an object and cannot be stored as an Object reference; it has to be wrapped (boxed) first, and that is still the case.
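A quick sketch of what "stored as Object" means in practice (names are made up for illustration): when you add an int to a List<Integer>, what ends up in the collection is a java.lang.Integer object, not a primitive.

```java
import java.util.ArrayList;
import java.util.List;

final class BoxedStorageDemo {
    // Returns the runtime class of the element that was actually stored.
    static Class<?> storedElementClass() {
        List<Integer> numbers = new ArrayList<>();
        numbers.add(7);                  // the int 7 is autoboxed to an Integer
        Object element = numbers.get(0); // the list hands back an object reference
        return element.getClass();
    }

    public static void main(String[] args) {
        System.out.println(storedElementClass().getName()); // prints "java.lang.Integer"
    }
}
```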
To answer your question: Allowing generic collections to store primitives would not be a breaking change because generic collections did not exist previously, and thus the primitive decision has nothing to do with backward compatibility. If they decided to implement generics like in C#, primitive generic collection types would exist, and it would not change the behavior of previous programs.
The reasons as to why they implemented generics this way, though, might be related to the fact that this is a low-effort approach: they barely had to add any code to collections to support this feature, and they did not have to change the JVM. This improves maintainability. That does not sound like much of a reason, but it might actually be very important if you want every future JVM to be able to still execute Java 1.0 code until the end of time.

Why do Java generics have to erase the type information?

Java's generics erase the type information when the source code is compiled. I guess the erasure is necessary because Java keeps only one copy of a class no matter what the generic type argument is, so List<String> and List<Number> are simply one List. I wonder whether, while still keeping only one copy of each class, an instance could store its generic type information at the time it is created.
For instance:
when we write:
List<String> list = new List<String>.
the compiler would create a List object along with String's Class object (that is, String.class) associated with the list, so that the generic object list could check the type information at runtime using that Class object. Is this possible or practicable?
I'm not entirely sure what you're asking specifically, but the big reason Java has to use erasure for generics is backwards compatibility. If the behaviour of
List list = new ArrayList(); //You can't do new list(), it's an interface
...was altered between versions, when you upgraded from say Java 1.4 to Java 5, you'd have all sorts of weird things going on potentially causing bugs where the code didn't behave in the same way as previously. That's definitely a bad thing if that happens!
If they didn't have to preserve backwards compatibility then yes, they could've done what they liked - we could've had nice reified generics and done a whole bunch of other stuff we couldn't do now. There was a proposal (by Gafter I think) that would've allowed reified generics in Java in a backwards compatible way, but it would've involved creating new versions of all the classes that should have been generic. That would've caused a load of mess with the API, so (for better or worse) they chose not to go down that route.
Since Java 5 we write List<String> l = new ArrayList<String>(); but before that it was not like this.
It was List l = new ArrayList(); and the user could add anything to it: an Integer, a String, or an object of any user-defined type. The Java designers did not want to break the old code, so they left the check to the compiler, which performs it at compile time.
To preserve binary backward compatibility with pre-Java 5 code.
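As a side note on the idea in the question (keeping a Class object next to the list so that it can check elements at runtime): the standard library offers something close to this as an opt-in wrapper, java.util.Collections.checkedList. A minimal sketch, with a made-up class name, showing that the wrapper rejects a wrong element even when the compile-time check is bypassed through a raw reference:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class CheckedListDemo {
    // Returns true if the checked list refused an Integer at runtime.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rejectsWrongType() {
        // The wrapper keeps String.class and checks every inserted element against it.
        List<String> checked = Collections.checkedList(new ArrayList<String>(), String.class);
        List raw = checked;               // sidestep the compile-time check
        try {
            raw.add(Integer.valueOf(1));  // rejected by the wrapper
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsWrongType()); // prints "true"
    }
}
```

Note that this only checks the top-level element class; it still cannot tell a List<String> element from a List<Integer> element.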

Making generic calls with Java JNI and C++

I am working with JNI and I have to pass some generic types to C++. I am stuck on how to approach this on the C++ side.
HashMap<String, Double[]> data1 ;
ArrayList<ArrayList<String>> disc ;
I am new to JNI and looked around but could not find much help. Can someone please help me write the JNI code for this? Any reference to material on the net would be very helpful too.
Short answer: You cannot.
Long answer: Type Erasure : http://download.oracle.com/javase/tutorial/java/generics/erasure.html
Consider a parametrized instance of ArrayList<Integer>. At compile time, the compiler checks that you are not putting anything but things compatible to Integer in the array list instance.
However, also at compile time (and after type checking), the compiler strips the type argument, turning ArrayList<Integer> into the raw type ArrayList (as in pre-JDK 5 times), whose elements are handled as Object.
The raw form is what JNI sees (for historical reasons as well as due to the way generics are implemented in Java... again, type erasure).
Remember, an ArrayList<Integer> is-a ArrayList. So you can pass an ArrayList<Integer> to JNI wherever it expects an ArrayList. The opposite is not necessarily true as you might get something out of JNI that is not upwards compatible with your nicely parametrized generics.
At this point, you are crossing a barrier between a typed, parametrized domain (your generics) and an untyped one (JNI). You have to encapsulate that barrier pretty nicely, and you have to add glue code and error checking/error handling code to detect when/if things don't convert well.
The runtime signature is just plain HashMap and ArrayList - Generics are a compile-time thing.
You can use javah (or javac -h on JDK 8 and later; javah was removed in JDK 10) to generate a C header file with correct signatures for native functions.
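To see what the native side will actually receive, you can inspect the Java declaration with reflection. The sketch below assumes a hypothetical native method named process in a hypothetical class NativeBridge; it is never called, so no native library is needed to run this. The erased parameter types are plain HashMap and ArrayList, which is why the C++ function just gets jobject arguments:

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.HashMap;

final class NativeBridge {
    // Hypothetical native method taking the types from the question.
    // Its JNI descriptor is (Ljava/util/HashMap;Ljava/util/ArrayList;)V
    static native void process(HashMap<String, Double[]> data1,
                               ArrayList<ArrayList<String>> disc);

    // Looks the method up by its erased parameter types and returns them.
    static Class<?>[] erasedParameterTypes() {
        try {
            Method m = NativeBridge.class.getDeclaredMethod(
                    "process", HashMap.class, ArrayList.class);
            return m.getParameterTypes();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (Class<?> type : erasedParameterTypes()) {
            System.out.println(type.getName()); // java.util.HashMap, then java.util.ArrayList
        }
    }
}
```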
It depends on what you're trying to map to and if they are yours to change.
Here are a few directions I'd try (if I were you, that is :) ):
Using SWIG templates (related SO question) or TypeMaps.
Doing some reflection magic to be used against your-own-custom-generic-data-passing native API (haven't figured the details out, but if you want to follow on it, tell what you've got on the C++ side).
This has been asked before and you might want to resort to Luis' arrays solution.

Java Generics - <int> to <Integer>

In the way of learning Java Generics, I got stuck at a point.
It was written "Java Generics works only with Objects and not the primitive types".
e.g
Gen<Integer> gen=new Gen<Integer>(88); // Works Fine ..
But, with the primitive types like int,char etc ...
Gen<int> gen=new Gen<int>(88) ; // Why this results in compile time error
I mean to say, since Java has auto-boxing and unboxing, why can't this feature be applied when we declare a specific type for our class?
I mean, why doesn't Gen<int> automatically get converted to Gen<Integer>?
Please help me clearing this doubt.
Thanks.
Autoboxing doesn't mean that you can use int in place of Integer. Autoboxing automates the process of boxing and unboxing values. E.g. if I need to store a primitive int in a collection, I don't need to create the wrapper object manually; the Java compiler takes care of it. In the above example you are instantiating a generic object whose type argument is Integer. This generic object will still work fine with int values, but declaring int as the type argument is wrong. Generics allow only reference types, not primitives.
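A minimal sketch of that point. The Gen class from the question isn't shown, so the one below is a hypothetical stand-in with a single field. The type argument has to be Integer, but the values going in and out can be plain ints:

```java
// Hypothetical stand-in for the Gen class mentioned in the question.
final class Gen<T> {
    private final T value;

    Gen(T value) {
        this.value = value;
    }

    T get() {
        return value;
    }
}

final class GenBoxingDemo {
    static int roundTrip() {
        Gen<Integer> gen = new Gen<>(88); // the int 88 is autoboxed to an Integer
        int back = gen.get();             // the Integer is unboxed back to an int
        return back;
    }

    public static void main(String[] args) {
        System.out.println(roundTrip()); // prints "88"
    }
}
```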
As you have discovered, you can't use a primitive type as a type argument in Java generics. Why is this the case? It is discussed at length in many places, including Java bug 4487555.
The simple explanation: Generics are defined that way.
A good reason from the Java perspective: It simplifies type erasure and translation to byte code for the compiler. All the compiler needs to do is some casting.
With primitives the compiler would have to decide whether to cast or to box/unbox, it would need additional validation rules (extends and & wouldn't make sense with primitives; should a ? include primitives, yes or no? and so on), and it would have to handle type conversions (assume you parameterize a collection with long and add an int...?)
A good reason from a programmer's perspective: operations with bad performance are kept visible! Allowing primitives as type arguments would require hidden autoboxing (boxing for store operations, unboxing for read operations). Boxing may create new objects, which is expensive. People would expect fast operations if they parameterize a generic class with primitives, but the opposite would be true.
That's a very good question.
As you suspected, the abstraction could surely be extended to type arguments and made transparent to the programmer. In fact, that is what most modern JVM languages do (statically typed ones, of course). Examples include Scala, Ceylon, Kotlin etc.
This is what your example would look like in Scala:
val gen: Gen[Int] = new Gen[Int](80)
Int is just a regular class, just like other classes. There is no primitive-object distinction whatsoever.
As to why Java people did not do it... I don't actually know the reason, but I imagine such an abstraction would not fit with the existing Java specification without overcomplicating the semantics (or without sacrificing the backward compatibility, which is certainly not a viable option).
