I wonder why the Java compiler doesn't trust this line of code:
List<Car> l = new ArrayList();
and instead expects a typed ArrayList:
List<Car> l = new ArrayList<Car>();
Indeed, the compiler reports an unchecked assignment in the first case.
Why doesn't the compiler see that this ArrayList() has just been created, and so cannot already contain any objects other than 'Car'?
This warning would make sense if the untyped ArrayList had been created earlier, but not in this case...
Indeed, since the List is typed as 'Car', every future "l.add('object')" will be allowed only if 'object' is a 'Car'. So, as far as I can tell, no surprise could happen.
Am I wrong?
Thanks
Why doesn't the compiler see that this ArrayList() has just been created, and so cannot already contain any objects other than 'Car'?
The simple answer is "Because it is not allowed to."
The compiler has to implement the Java Language Specification. If some compiler writer goes and adds a bunch of smarts to the compiler to allow things that "everyone" knows are safe, then what he's actually done is to introduce a portability problem. Code compiled and tested with this compiler will give compilation errors when compiled with a dumb (or more accurately, strictly conformant) Java compiler.
So why doesn't the JLS allow this? I can think of a couple of possible explanations:
The JLS writers didn't have time to add this to the specification; e.g. to consider all of the ramifications on the rest of the specification.
The JLS writers couldn't figure out a sound way to express this in the specification.
The JLS writers didn't want to put something into the specification that they weren't sure was implementable in a real world compiler ... without being too burdensome on the compiler writer.
And then there is the associated question of whether a compiler could implement such a check. I'm not qualified to answer that ... but I know enough about the problem to realize that it is not necessarily as simple to solve as one might imagine.
To add to Stephen C's really good answer (mostly because writing this as a comment would be really cumbersome, sorry), this problem is actually explicitly mentioned in the JLS:
Discussion
Variables of a raw type can be assigned from values of any
of the type's parametric instances.
For instance, it is possible to assign a Vector<String> to a Vector,
based on the subtyping rules (§4.10.2).
The reverse assignment from Vector to Vector<String> is unsafe (since
the raw vector might have had a different element type), but is still
permitted using unchecked conversion (§5.1.9) in order to enable
interfacing with legacy code. In this case, a compiler will issue an
unchecked warning.
Why didn't they special-case this particular case? Because adding special cases to an implementation isn't something you do if you don't have to, and the "correct" solution doesn't cause any particular problems (apart from some writing effort, which they presumably didn't think was too important).
Also every special case means that the compiler gets more complicated (well generally speaking at least and in this case certainly) - considering how simple javac is on the whole, I'd think it's not unlikely that having a simple, fast compiler was also one of the design goals.
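To make the quoted JLS discussion concrete, here is a minimal sketch of why assigning a raw list to a parameterized one is unsafe in general. The names RawAssignmentDemo, legacyList and readFails are made up for illustration; the only point is that the compiler cannot tell, from the assignment alone, what a raw list already contains.

```java
import java.util.ArrayList;
import java.util.List;

class RawAssignmentDemo {
    // Hypothetical "legacy" method: returns a raw list that already holds an Integer.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List legacyList() {
        List raw = new ArrayList();
        raw.add(Integer.valueOf(42));
        return raw;
    }

    // Returns true if reading through the parameterized view fails at run time.
    @SuppressWarnings("unchecked")
    static boolean readFails() {
        List<String> strings = legacyList(); // unchecked conversion: compiles, with a warning
        try {
            int length = strings.get(0).length(); // compiler-inserted cast to String fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readFails()); // prints true
    }
}
```

In the original question the raw list is freshly created, so nothing can go wrong, but the rule that triggers the warning looks only at the types involved, not at where the value came from.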
Upon reading the following:
A lot of people define static typing and dynamic typing with respect
to the point at which the variable types are checked. Using this
analogy, static typed languages are those in which type checking is
done at compile-time, whereas dynamic typed languages are those in
which type checking is done at run-time.
This analogy leads to the analogy we used above to define static and
dynamic typing. I believe it is simpler to understand static and
dynamic typing in terms of the need for the explicit declaration of
variables, rather than as compile-time and run-time type checking.
Source
I was thinking that the two ways we define static and dynamic typing, compile-time checking and explicit type declaration, are a bit like apples and oranges. A characteristic of all statically typed languages (to my knowledge) is that reference variables have a defined type. Can there be a language that has the benefits of compile-time checking (like Java) but also the ability to have variables unbounded to a specific type (like Python)?
Note: Not exactly type inference in a language like Java, because the variables are still assigned a type, just implicitly. This theoretical language wouldn't have reference types, so there would be no casting. I'm trying to avoid the use of "static typing" vs "dynamic typing" because of the confusion.
There could be, but should there be?
Imagine in hypothetical-pseudo-C++:
class Object
{
public:
virtual Object invoke(const char *name, std::list<Object> args);
virtual Object get_attr(const char *name);
virtual const Object &set_attr(const char *name, const Object &src);
};
And that you have a language that arranges:
to make Object class the root base class of all classes
syntactic sugar to turn blah.frabjugate() into blah.invoke("frabjugate") and
blah.x = 10 into blah.set_attr("x", 10)
Add to this something combining attributes of boost::variant and boost::any and you have a pretty good start. All the dynamism (both good and runtime bugs bad) of Python with the eloquence and rigidity (yay!) of C++ or Java. With added run-time bloat and efficiency of hash-table lookups vs. call/jmp machine instructions.
In languages like Python, when you call blah.do_it() it has to do potentially multiple hash table lookups of the string "do_it" to find out if your instance blah or its class has a callable thing called "do_it" every time it is called. This is the most extreme late binding that could be imagined:
flarg.do_it() # replaces flarg.do_it()
flarg.do_it() # calls a different flarg.do_it()
You could have your hypothetical language give some control over when the binding occurs. C++-like standard methods are crudely static bound to the apparent reference type, not the real instance type. C++ virtual methods are late-bound to the object instance type. Python-like attributes and methods are extremely late bound to the current version of the object instance.
I think you could definitely program in a strong static typed language in a dynamic style, just as you could build an interpreter in a language like C++ or Java. Some syntax hooks could make it look a little more seamless. But maybe you could do the same in reverse: maybe a Python decorator that automatically checks argument types, or a MetaClass that does it at compile time? [no, I don't think this is possible...]
I think you should view it as a union of features, but you'd get both the best and the worst of both worlds...
Can there be a language that has the benefits of compile-time checking (like Java) but also the ability to have variables unbounded to a specific type (like Python)?
Actually, most languages support both, so yes. The difference is which form is preferred/easier and generally used. Java prefers static types but also supports dynamic casts and reflection.
This theoretical language wouldn't have reference types, so there would be no casting.
You have to consider that languages also need to perform reasonably well, so you have to consider how they will be implemented. You could have a super type, but this makes optimisation very hard, and your code will most likely either run slowly or use much more resources.
The more popular languages tend to make pragmatic implementation choices. They are not purely one type or another and are willing to borrow styles even if they don't handle them as cleanly as a "pure" language.
what exactly do they allow the compiler or programmer to do that dynamic types can't?
It is generally accepted that the quicker you find a bug, the cheaper it is to fix. When you first start programming, the cost of maintenance isn't high in your mind, but once you have much more experience you will realise that a successful project costs far more to maintain than it did to develop and fixing long standing bugs can be really costly.
Static languages have two advantages:
You pick up bugs sooner rather than later. The sooner the better. With dynamic languages you might never discover a bug if the code is never run.
Maintenance is cheaper. Static languages make clearer the assumptions made when the code was first written and are more likely to detect issues if you don't have enough test coverage (btw, you never have enough test coverage).
No, you cannot. The difference here boils down to early binding versus late binding. Early binding means matching everything up on the binary level upfront, fixing it in code. The result is rigid, type-safe and fast code. Late binding means there is some kind of runtime interpretation involved. This results in flexibility (potentially unsafe) at the cost of performance.
The two approaches are different on a technical level (compilation versus interpretation) and the programmer would have to choose which is desired when, which would defeat the benefit of having both in the first place.
In languages that use a (common) language runtime however you do get some of what you are asking for through reflection. But it is organized differently and still type-safe. It is not the implicit kind of binding you refer to but requires a bit of work and awareness from the programmer.
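As a rough illustration of the reflection route mentioned above, here is a minimal Java sketch that binds a method by name at run time. LateBindingDemo and call are hypothetical names chosen for this example; Class.getMethod and Method.invoke are the standard reflection calls.

```java
import java.lang.reflect.Method;

class LateBindingDemo {
    // Looks up a public no-argument method by name at run time and invokes it.
    // A misspelled name is only detected when this line actually runs.
    static Object call(Object target, String methodName) {
        try {
            Method method = target.getClass().getMethod(methodName);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot call " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(call("hello", "length")); // prints 5
        try {
            call("hello", "lenght"); // typo: compiles fine, fails at run time
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints cannot call lenght
        }
    }
}
```

The programmer has to opt in explicitly, and the result comes back as a plain Object, which is the "bit of work and awareness" the answer refers to.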
As far as what is possible with static types that is impossible with dynamic types: nothing. They are both Turing complete.
The value of static types is finding bugs early. In Python, something as simple as a misspelled name isn't caught until you run the program, and even then only if the line of code with the misspelling is run.
class NuclearReactor():
    def turn_power_off(self):
        ...
    def shut_down_cleanly(self):
        self.turn_power_of()
Saw a couple of similar questions today- got me thinking:
What are the rules for when to use generics?
When a collection is involved?
When there are getter methods which return collection elements?
Whether the object changes type during its lifetime?
Whether the relationship is composition/aggregation to the class?
There doesn't seem to be a consensus on the questions you should ask yourself in order to determine whether you should use generics. Is it purely an opinionated decision?
Is it easier to ask when you shouldn't use generics??
Let me start with some general points about generics and type information before I get back to the first point on your bullet list.
Generics prevent unnecessary type casts.
Do you remember Java before generics were introduced? Type casts were used everywhere.
This is what type casts essentially are: You are telling the compiler about an object's type, because the compiler doesn't know or cannot infer it.
The problem with type casts is that you sometimes make mistakes. You can suggest to the compiler that an instance of class Fiddle is a Frobble ((Frobble)fiddle), and the compiler will happily believe you and compile your source code. But if it turns out that you were wrong, you'll much later get a nice run-time error.
Generics are a different, and usually safer, way of letting the compiler retain type information. Basically, the compiler is less likely to make typing mistakes than a human programmer... the fewer type casts required, the fewer potential error sources! Once you've established that a list can only contain Fiddle objects (List<Fiddle>), the compiler will keep this information and save you from having to type-cast each item in that list to some type. (You still could cast a list item to Frobble, but why should you, now that the compiler lets you know that the item is a Fiddle!?)
I have found that generics greatly reduce the need for type casting, so the presence of lots of type casts — especially when you always cast to the same type — might be an indicator that generics should be used instead.
Letting the compiler keep as much type information as possible is a good thing because typing errors can be discovered earlier (at compile-time instead of at run-time).
Generics as a replacement for the "generic" java.lang.Object type:
Before generics, if you wanted to write a method that worked on any type, you employed the java.lang.Object supertype, because every class derives from it.
Generics allow you to also write methods that work for any type, but without forcing you or the compiler to throw away known type information — that's exactly what happens when you cast an object to the Object type. So, frequent use of the Object type might be another indicator that generics might be appropriate.
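A small sketch of the difference described above, with made-up names (FirstElementDemo, firstAsObject, first): both methods work for any element type, but only the generic one lets the compiler keep what it already knows.

```java
import java.util.Arrays;
import java.util.List;

class FirstElementDemo {
    // Object-based style: the element type is thrown away on return.
    static Object firstAsObject(List<?> items) {
        return items.get(0);
    }

    // Generic style: the element type T flows from the argument to the result.
    static <T> T first(List<T> items) {
        return items.get(0);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ada", "Grace");
        String viaCast = (String) firstAsObject(names); // caller must cast, and could cast wrongly
        String viaGeneric = first(names);               // no cast needed
        System.out.println(viaCast + " " + viaGeneric); // prints Ada Ada
    }
}
```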
When a collection is involved?
Why do generics seem an especially good fit for collections? Because, according to the above reasoning, collections are rarely allowed to contain just any kind of object. If that were so, then the Object type would be appropriate because it doesn't put any restrictions whatsoever on the collection. Usually however, you expect all items in a collection to be (at least) a Frobble (or some other type), and it helps if you let the compiler know. Generics are the way to do just that.
Whether the relationship is composition/aggregation to the class?
You've linked to another question that asks whether a class Person having a car property should be made generic as class Person<T extends ICar>.
In that case, it depends whether your program needs to distinguish between Honda people and Opel people. By making such a Person class generic, you essentially introduce the possibility of different kinds of people. If this actually solves a problem in your code, then go for it. If, however, it only introduces hurdles and difficulties, then resist the urge and stay with your non-generic Person class.
Side note: Keep in mind that you don't have to make a whole class generic; you can make only a few specific methods generic. At least in the .NET ecosystem, it is recommended to keep generics as "local" as possible, i.e. don't turn a class into a generic one when it's sufficient to make only a method generic.
I find myself using generics when the following three criteria are met:
I note that I am repeating code, and start thinking of how to refactor it into a new method/class.
The class/method I am rewriting doesn't really care about what the concrete type of one of the arguments is, only that it follows a certain contract (eg <T extends Bar>).
The return type of the method/one of the methods is related to said parameter or
two or more parameters are related and need to have the same type, although I don't really care what that type is.
Usually when these criteria are met, there is a Collection of some kind involved, but not necessarily.
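As a sketch of these criteria with no collection involved, consider a method that only cares that its arguments are comparable to each other, and whose return type is tied to its parameter type. GenericMaxDemo and larger are hypothetical names used only for this illustration.

```java
class GenericMaxDemo {
    // Contract instead of a concrete type: T only has to be comparable to itself.
    // Both parameters and the return value share the same type T.
    static <T extends Comparable<T>> T larger(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        int biggerNumber = larger(3, 7);             // T is inferred as Integer
        String laterWord = larger("pear", "apple");  // T is inferred as String
        System.out.println(biggerNumber + " " + laterWord); // prints 7 pear
    }
}
```

Without generics, this would need either one copy per type or an Object-based version with casts at every call site.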
In my opinion the second statement (when not to use them) is correct.
When not to use generics: when the strong typing is too restrictive (typically generics in generics). In some cases you want to ensure loose coupling among your components and the point is "send me what you want, the API will somehow handle it"; then you will employ some kind of visitor, rather than specifying a complete concrete API using some generic type.
When you should: if you did not, you would have to cast the variable to some type (you might even have to guess or use instanceof)...
Just one sidenote: every structured type is some kind of collection...
Is it easier to ask when you shouldn't use generics??
To answer this question: one of the major problems with generics is their treatment of checked exceptions. Here is a write-up from Goetz about this.
Reason why you should consider generics, again there is a cache of information shared.
You can use e.g. JUnit to test the functionality of your library, but how do you test its type-safetiness with regards to generics and wildcards?
Testing only against code that compiles is "happy path" testing; shouldn't you also test your API against non-type-safe usage and confirm that such code does NOT compile?
// how do you write and verify these kinds of "tests"?
List<Number> numbers = new ArrayList<Number>();
List<Object> objects = new ArrayList<Object>();
objects.addAll(numbers); // expect: this compiles
numbers.addAll(objects); // expect: this does not compile
So how do you verify that your genericized API raises the proper errors at compile time? Do you just build a suite of non-compiling code to test your library against, and consider a compilation error as a test success and vice versa? (Of course you have to confirm that the errors are generics-related).
Are there frameworks that facilitate such testing?
Since this is not testing in the traditional sense (that is - you can't "run" the test), and I don't think such a tool exists, here's what I can suggest:
Make a regular unit-test
Generate code in it - both the right code and the wrong code
Use the Java compiler API to try to compile it and inspect the result
You can make an easy-to-use wrapper for that functionality and contribute it for anyone with your requirements.
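The steps above could be sketched roughly as follows, using the standard javax.tools compiler API. This is only one possible shape for such a wrapper: CompileCheck, StringSource, compiles and wrap are hypothetical names, it needs a JDK (not just a JRE) at run time, and it writes class files to a temporary directory that it does not clean up.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.Collections;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

class CompileCheck {
    // Hypothetical helper: a source file held in memory rather than on disk.
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // Returns true if the given source compiles without errors.
    static boolean compiles(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system compiler; run on a JDK, not a JRE");
        }
        try {
            String outDir = Files.createTempDirectory("compile-check").toString();
            // Collecting diagnostics keeps compiler errors off the console.
            DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
            JavaCompiler.CompilationTask task = compiler.getTask(
                    null, null, diagnostics,
                    Arrays.asList("-d", outDir),
                    null,
                    Collections.singletonList(new StringSource(className, code)));
            return task.call();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Wraps one statement in a class that declares the two lists from the question.
    static String wrap(String statement) {
        return "import java.util.*; class Probe { void m() {"
                + " List<Number> numbers = new ArrayList<Number>();"
                + " List<Object> objects = new ArrayList<Object>(); "
                + statement + " } }";
    }

    public static void main(String[] args) {
        System.out.println(compiles("Probe", wrap("objects.addAll(numbers);"))); // prints true
        System.out.println(compiles("Probe", wrap("numbers.addAll(objects);"))); // prints false
    }
}
```

A real test would also inspect the collected diagnostics to confirm that the failure is the expected generics error and not, say, a typo in the generated source.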
It sounds like you are trying to test the Java compiler to make sure it would raise the right compilation errors if you assign the wrong types (as opposed to testing your own api).
If that is the case, why aren't you also concerned about the compiler not failing when you assign Integers to String fields, and when you call methods on objects that have not been initialized, and the million other things compilers are supposed to check when they compile code?!
I guess your question isn't limited to generics. We can raise the same question to non-generic codes. If the tool you described exists, I'll be terrified. There are lots of people very happy to test their getters and setters(and try to enforce that on others). Now they are happier to write new tests to make sure that accesses to their private fields don't compile! Oh the humanity!
But then I guess generics are way more complicated, so your question isn't moot. Most programmers will be happy if they can get their damn generics code to finally compile. If a piece of generics code doesn't compile, which is the norm during dev, they aren't really sure who to blame.
"How do you test the type-safetiness of your genericized API?" IMHO, the short answer to your question should be:
Don't use any @SuppressWarnings
Make sure you compile without warnings (or errors)
The longer answer is that "type safety" is not a property of an API, it is a property of the programming language and its type system. Java 5 generics is type safe in the sense that it gives you the guarantee that you will not have a type error (ClassCastException) at runtime unless it originates from a user-level cast operation (and if you program with generics, you rarely need such casts anymore). The only backdoor is the use of raw types for interoperability with pre-Java 5 code, but for these cases the compiler will issue warnings such as the infamous "unchecked cast" warning to indicate that type-safety may be compromised. However, short of such warnings, Java will guarantee your type safety.
So unless you are a compiler writer (or you do not trust the compiler), it seems strange to want to test "type safety". In the code example that you give, if you are the implementor of ArrayList<T>, you should only care to give addAll the most flexible type signature that allows you to write a functionally correct implementation. For example, you could type the argument as Collection<T>, but also as Collection<? extends T>, where the latter is preferred because it is more flexible. While you can over-constrain your types, the programming language and the compiler will make sure that you cannot write something that is not type-safe: for example, you simply cannot write a correct implementation for addAll where the argument has type Collection<?> or Collection<? super T>.
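To illustrate the point about choosing the most flexible signature, here is a minimal sketch of a container whose addAll takes Collection<? extends T>. Bag is a made-up class for this example, not the real ArrayList implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class Bag<T> {
    private final List<T> items = new ArrayList<>();

    // Accepts any collection whose elements are T or a subtype of T.
    void addAll(Collection<? extends T> source) {
        items.addAll(source);
    }

    int size() {
        return items.size();
    }

    public static void main(String[] args) {
        Bag<Number> numbers = new Bag<>();
        List<Integer> ints = Arrays.asList(1, 2, 3);
        numbers.addAll(ints); // would be rejected if the parameter were Collection<T>
        System.out.println(numbers.size()); // prints 3
    }
}
```

Declaring the parameter as Collection<T> would still be type-safe, just needlessly strict: a List<Integer> could no longer be added to a Bag<Number>.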
The only exception I can think of is where you are writing a facade for some unsafe part of the system, and want to use generics to enforce some kind of guarantees on the use of this part through the facade. For example, although Java's reflection is not controlled as such by the type system, Java uses generics in things such as Class<T> to allow some reflective operations, such as clazz.newInstance(), to integrate with the type system.
Maybe you can use Collections.checkedList() in your unit test. The following example will compile but will throw a ClassCastException. The example below is copied from @Simon G.
List<String> stringList = new ArrayList<String>();
List<Number> numberList = Collections.checkedList(new ArrayList<Number>(), Number.class);
stringList.add("a string");
List list = stringList;
numberList.addAll(list);
System.out.println("Number list is " + numberList);
Testing for compilation failures sounds like barking up the wrong tree, then using a screwdriver to strip the bark off again. Use the right tool for the right job.
I would think you want one or more of:
code reviews (maybe supported by a code review tool like JRT).
static analysis tools (FindBugs/CheckStyle)
switch language to C++, with an implementation that supports concepts (may require also switching universe to one in which such an implementation exists).
If you really needed to do this as a 'test', you could use reflection to enforce any desired rule, say 'any function starting with add must have an argument that is a generic'. That's not very different from a custom Checkstyle rule, just clumsier and less reusable.
Well, in C++ they tried to do this with concepts but that got booted from the standard.
Using Eclipse I get pretty fast turn around time when something in Java doesn't compile, and the error messages are pretty straight forward. For example if you expect a type to have a certain method call and it doesn't then your compiler tells you what you need to know. Same with type mismatches.
Good luck building compile time concepts into java :P
There are plenty of questions on stackoverflow from people who have attempted to create an array of generics like so:
ArrayList<Foo>[] poo = new ArrayList<Foo>[5];
And the answer of course is that the Java specification doesn't allow you to declare an array of generics.
My question however is why ? What is the technical reason underlying this restriction in the java language or java vm? It's a technical curiosity I've always wondered about.
Arrays are reified - they retain type information at runtime.
Generics are a compile-time construct - the type information is lost at runtime. This was a deliberate decision to allow backward compatibility with pre-generics Java bytecode. The consequence is that you cannot create an array of generic type, because by the time the VM wants to create the array, it won't know what type to use.
See Effective Java, Item 25.
Here is an old blog post I wrote where I explain the problem: Java generics quirks
See How do I generically create objects and arrays? from Angelika Langer's Java Generics FAQ for a workaround (you can do it using reflection). That FAQ contains everything you ever want to know about Java generics.
You are forced to use
ArrayList<Foo>[] poo = new ArrayList[5];
which will give you an unchecked warning. The reason is that there is a potential type safety issue with generics and the runtime type-checking behavior of Java arrays, and they want to make sure you are aware of this when you program. When you write new ArrayList[...], you are creating something which will check, at runtime, everything that gets put into it to make sure that it is an instance of ArrayList. Following this scheme, when you do new ArrayList<Foo>[...], then you expect to create something that checks at runtime everything that gets put into it to make sure it is an instance of ArrayList<Foo>. But this is impossible to do at runtime, because there is no generics info at runtime.
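The run-time check described above, and the way it breaks down for generic element types, can be sketched as follows. GenericArrayDemo and its method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class GenericArrayDemo {
    // A plain array knows its element type at run time and rejects a wrong store.
    static boolean plainArrayRejectsWrongElement() {
        Object[] objects = new String[1];
        try {
            objects[0] = Integer.valueOf(42); // checked against String at run time
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // An array of List<String> can only check "is it a List?", so a wrong list slips in.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean wrongListSlipsThrough() {
        List<String>[] lists = new List[1];   // unchecked warning without the annotation
        Object[] objects = lists;             // legal: arrays are covariant
        objects[0] = Arrays.asList(42);       // no ArrayStoreException: it is a List
        try {
            String first = lists[0].get(0);   // the error only shows up here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(plainArrayRejectsWrongElement()); // prints true
        System.out.println(wrongListSlipsThrough());         // prints true
    }
}
```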
private ArrayList<String> colors = new ArrayList<String>();
Looking at the example above, it seems the main point of generics is to enforce type on a collection. So, instead of having an array of "Objects", which need to be cast to a String at the programmer's discretion, I enforce the type "String" on the collection in the ArrayList. This is new to me but I just want to check that I'm understanding it correctly. Is this interpretation correct?
That's by far not the only use of generics, but it's definitely the most visible one.
Generics can be (and are) used in many different places to ensure static type safety, not just with collections.
I'd just like to mention that, because you'll come across places where generics could be useful, but if you're stuck with the generics/collections association, then you might overlook that fact.
Yes, your understanding is correct. The collection is strongly-typed to whatever type is specified, which has various advantages - including no more run-time casting.
Yeah, that's basically it. Before generics, one had to create an ArrayList of Objects. This meant that one could add any type of Object to the list - even if you only meant for the ArrayList to contain Strings.
All generics do is add type safety. That is, it is now guaranteed that any object in the list is a String, and you are prevented from adding a non-String object to the list. Even better: this check is done by the compiler, at compile time.
Yes. To maintain type safety and remove runtime casts is the correct answer.
You may want to check out the tutorial on the Java site. It gives a good explanation in the introduction.
Without Generics:
List myIntList = new LinkedList(); // 1
myIntList.add(new Integer(0)); // 2
Integer x = (Integer) myIntList.iterator().next(); // 3
With Generics
List<Integer> myIntList = new LinkedList<Integer>(); // 1'
myIntList.add(new Integer(0)); // 2'
Integer x = myIntList.iterator().next(); // 3'
I think of it as type safety and also saving the casting. Read more about autoboxing.
You can add runtime checks with the Collections utility class.
http://java.sun.com/javase/6/docs/api/java/util/Collections.html#checkedCollection(java.util.Collection,%20java.lang.Class)
Also see checkedSet, checkedList, checkedSortedSet, checkedMap, checkedSortedMap
Yes, you are correct. Generics add compile-time type safety to your program, which means the compiler can detect if you are putting the wrong type of objects into, e.g., your ArrayList.
One thing I would like to point out is that although it removes the visible run-time casting and un-clutters the source code, the JVM still does the casting in the background.
The way Generics is implemented in Java it just hides the casting and still produces non-generic bytecode. An ArrayList<String> still is an ArrayList of Objects in the byte-code. The good thing about this is that it keeps the bytecode compatible with earlier versions. The bad thing is that this misses a huge optimization opportunity.
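A quick way to see this erasure for yourself is to compare the run-time classes of two differently parameterized lists. ErasureDemo is a made-up name; the sketch relies only on Object.getClass().

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Both lists share one run-time class; the type arguments exist only at compile time.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());                           // prints true
        System.out.println(new ArrayList<String>().getClass().getName()); // prints java.util.ArrayList
    }
}
```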
You can use generics anywhere you need a type parameter, i.e. a type that should be the same across some code, but is left more or less unspecified.
For example, one of my toy projects is to write algorithms for computer algebra in a generic way in Java. This is interesting for the sake of the mathematical algorithms, but also to put Java generics through a stress test.
In this project I've got various interfaces for algebraic structures such as rings and fields and their respective elements, and concrete classes, e.g. for integers or for polynomials over a ring, where the ring is a type parameter. It works, but it becomes somewhat tedious in places. The record so far is a type in front of a variable that spans two complete lines of 80 characters, in an algorithm for testing irreducibility of polynomials. The main culprit is that you can't give a complicated type its own name.