Saw a couple of similar questions today, which got me thinking:
What are the rules for when to use generics?
When a collection is involved?
When there are getter methods which return collection elements?
Whether the object changes type during its lifetime?
Whether the relationship is composition/aggregation to the class?
There doesn't seem to be a consensus on the questions you should ask yourself in order to determine whether you should use generics. Is it purely an opinionated decision?
Is it easier to ask when you shouldn't use generics??
Let me start with some general points about generics and type information before I get back to the first point on your bullet list.
Generics prevent unnecessary type casts.
Do you remember Java before generics were introduced? Type casts were used everywhere.
This is what type casts essentially are: You are telling the compiler about an object's type, because the compiler doesn't know or cannot infer it.
The problem with type casts is that you sometimes make mistakes. You can suggest to the compiler that an instance of class Fiddle is a Frobble ((Frobble)fiddle), and the compiler will happily believe you and compile your source code. But if it turns out that you were wrong, you'll much later get a nice run-time error.
Generics are a different, and usually safer, way of letting the compiler retain type information. Basically, the compiler is less likely to make typing mistakes than a human programmer... the fewer type casts required, the fewer potential error sources! Once you've established that a list can only contain Fiddle objects (List<Fiddle>), the compiler will keep this information and spare you from having to type-cast each item in that list to some type. (You still could cast a list item to Frobble, but why should you, now that the compiler lets you know that the item is a Fiddle!?)
I have found that generics greatly reduce the need for type casting, so the presence of lots of type casts — especially when you always cast to the same type — might be an indicator that generics should be used instead.
Letting the compiler keep as much type information as possible is a good thing because typing errors can be discovered earlier (at compile-time instead of at run-time).
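To make the difference concrete, here is a minimal sketch (the class and method names are made up for illustration). The raw list compiles even though an Integer has slipped in, and the mistake only shows up as a ClassCastException at run time; the generic version needs no cast at all, and the bad add would not compile.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not taken from any library.
class CastVersusGenericsDemo {

    // Pre-generics style: the raw list accepts anything, so the cast can fail at run time.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsAtRuntime() {
        List items = new ArrayList();
        items.add("a fiddle");
        items.add(Integer.valueOf(42));      // compiles, although it is clearly a mistake
        try {
            for (Object item : items) {
                String s = (String) item;    // the second element throws ClassCastException
            }
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Generic style: the element type is stated once and no casts are needed.
    static int totalLength(List<String> items) {
        int total = 0;
        for (String s : items) {
            total += s.length();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(rawListFailsAtRuntime());

        List<String> names = new ArrayList<>();
        names.add("fiddle");
        names.add("frobble");
        // names.add(42);                    // would be rejected at compile time
        System.out.println(totalLength(names));
    }
}
```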
Generics as a replacement for the "generic" java.lang.Object type:
Before generics, if you wanted to write a method that worked on any type, you employed the java.lang.Object supertype, because every class derives from it.
Generics allow you to also write methods that work for any type, but without forcing you or the compiler to throw away known type information — that's exactly what happens when you cast an object to the Object type. So, frequent use of the Object type might be another indicator that generics might be appropriate.
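As a rough illustration (both helper methods are invented for this example), compare an Object-based helper with its generic counterpart. Both work for any type, but only the generic one hands the caller back the type that was passed in.

```java
// Hypothetical demo class, not taken from any library.
class ObjectVersusGenericDemo {

    // Object-based version: accepts anything, but the result is only known to be an Object.
    static Object firstNonNullObject(Object a, Object b) {
        return a != null ? a : b;
    }

    // Generic version: accepts anything, and the result keeps the argument type.
    static <T> T firstNonNull(T a, T b) {
        return a != null ? a : b;
    }

    public static void main(String[] args) {
        // The caller has to cast, and a wrong cast would only fail at run time.
        String viaObject = (String) firstNonNullObject(null, "fallback");

        // No cast: the compiler infers T = String.
        String viaGeneric = firstNonNull(null, "fallback");

        System.out.println(viaObject + " " + viaGeneric);
    }
}
```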
When a collection is involved?
Why do generics seem an especially good fit for collections? Because, according to the above reasoning, collections are rarely allowed to contain just any kind of object. If that were so, then the Object type would be appropriate because it doesn't put any restrictions whatsoever on the collection. Usually, however, you expect all items in a collection to be (at least) a Frobble (or some other type), and it helps if you let the compiler know. Generics are the way to do just that.
Whether the relationship is composition/aggregation to the class?
You've linked to another question that asks whether a class Person having a car property should be made generic as class Person<T extends ICar>.
In that case, it depends whether your program needs to distinguish between Honda people and Opel people. By making such a Person class generic, you essentially introduce the possibility of different kinds of people. If this actually solves a problem in your code, then go for it. If, however, it only introduces hurdles and difficulties, then resist the urge and stay with your non-generic Person class.
Side note: Keep in mind that you don't have to make a whole class generic; you can make only a few specific methods generic. At least in the .NET ecosystem, it is recommended to keep generics as "local" as possible, i.e. don't turn a class into a generic one when it's sufficient to make only a method generic.
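A sketch of the trade-off, in Java. ICar, Honda and Opel echo the names used above; everything else (TypedPerson, Garage, hondaOnlyFeature, firstOf) is hypothetical and only meant to show the three options side by side: a non-generic class, a generic class, and a generic method inside a non-generic class.

```java
import java.util.Arrays;
import java.util.List;

interface ICar {
    String make();
}

class Honda implements ICar {
    public String make() { return "Honda"; }
    String hondaOnlyFeature() { return "VTEC"; }   // invented, exists only on Honda
}

class Opel implements ICar {
    public String make() { return "Opel"; }
}

// Non-generic: every Person simply has some ICar.
class Person {
    private final ICar car;
    Person(ICar car) { this.car = car; }
    ICar getCar() { return car; }
}

// Generic: "Honda people" and "Opel people" become different types.
class TypedPerson<T extends ICar> {
    private final T car;
    TypedPerson(T car) { this.car = car; }
    T getCar() { return car; }
}

// Keeping generics local: only the method is generic, not the class.
class Garage {
    static <T extends ICar> T firstOf(List<T> cars) {
        return cars.get(0);
    }
}

class PersonDemo {
    public static void main(String[] args) {
        TypedPerson<Honda> hondaOwner = new TypedPerson<>(new Honda());
        // No cast needed to reach the Honda-specific method:
        System.out.println(hondaOwner.getCar().hondaOnlyFeature());

        Person anyone = new Person(new Opel());
        System.out.println(anyone.getCar().make());

        Opel first = Garage.firstOf(Arrays.asList(new Opel(), new Opel()));
        System.out.println(first.make());
    }
}
```

If nothing in the program ever needs the Honda-specific method through a Person, the non-generic Person is the simpler choice.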
I find myself using generics when the following three criteria are met:
I note that I am repeating code, and start thinking of how to refactor it into a new method/class.
The class/method I am rewriting doesn't really care about what the concrete type of one of the arguments is, only that it follows a certain contract (e.g. <T extends Bar>).
The return type of the method/one of the methods is related to said parameter or
two or more parameters are related and need to have the same type, although I don't really care what that type is.
Usually when these criteria are met, there is a Collection of some kind involved, but not necessarily.
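The criteria above can be sketched with two small methods (the names largest and clamp are invented for this example). In both, T only has to honour a contract (Comparable<T>), the return type is tied to the argument type, and in clamp three parameters must share one type without the method caring which.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not taken from any library.
class GenericCriteriaDemo {

    // The return type is related to the element type of the argument.
    static <T extends Comparable<T>> T largest(List<T> items) {
        if (items.isEmpty()) {
            throw new IllegalArgumentException("empty list");
        }
        T best = items.get(0);
        for (T item : items) {
            if (item.compareTo(best) > 0) {
                best = item;
            }
        }
        return best;
    }

    // All three parameters must have the same type, whatever it is.
    static <T extends Comparable<T>> T clamp(T value, T low, T high) {
        if (value.compareTo(low) < 0) {
            return low;
        }
        if (value.compareTo(high) > 0) {
            return high;
        }
        return value;
    }

    public static void main(String[] args) {
        Integer biggest = largest(Arrays.asList(3, 9, 4));
        String clamped = clamp("m", "a", "f");
        System.out.println(biggest + " " + clamped);
    }
}
```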
In my opinion the second statement (when not to use them) is correct.
When not to use generics: when the strong typing is too restrictive (typically generics within generics). In some cases you want to ensure loose coupling among your components, and the point is "send me what you want, the API will somehow handle it"; then you will employ some kind of visitor rather than specifying a complete, concrete API using some generic type.
When you should: if you did not, you would have to cast the variable to some type (you might even have to guess or use instanceof)...
Just one sidenote: every structured type is some kind of collection...
Is it easier to ask when you shouldn't use generics??
To answer this question, one of the major problems with generics is their treatment of checked exceptions. Here is a write-up from Goetz about this.
Reason why you should consider generics, again there is a cache of information shared.
Related
I am just making an effort to understand the power of interfaces and how to use them to the best advantage.
So far, I understood that interfaces:
enable us to have another layer of abstraction, separate the what (defined by the interface) and the how (any valid implementation).
Given just one single implementation, I would just build a house (in one particular way) and say "here, it's done" instead of coming up with a building plan (the interface) and asking you, the other developers, to build it as I expect.
So far, so good.
What still puzzles me is why to favor interface types over class types when it comes to method parameters and return values. Why is that so? What are the benefits (drawbacks of the class approach)?
What interests me the most is how this actually translates into code.
Say we have a sort of pseudo mathInterface
public interface pseudoMathInterface {
double getValue();
double getSquareRoot();
List<Double> getFirstHundredPrimes();
}
//...
public class mathImp implements pseudoMathInterface { }
//.. actual implementation
So in the case of the getPrimes() method I would bind it to List, meaning any concrete implementation of the List interface rather than a concrete implementation such as ArrayList!?
And in terms of the method parameter, would I once again broaden my opportunities whilst ensuring that I can do with the type whatever I would like to do, given it is part of the interface's contract which the type finally implements!?
Say you are the creator of a Maven dependency, a JAR with a well-known, well-specified API.
If your method requests an ArrayList<Thing>, treating it as a collection of Things, but all I have got is a HashSet<Thing>, your method will twist my arm into copying everything into an ArrayList for no benefit;
if your method declares to return an ArrayList<Thing>, which (semantically) contains just a collection of Things and the index of an element within it carries no meaning, then you are forever binding yourself to returning an actual ArrayList, even though e.g. the future course of the project makes it obvious that a custom collection implementation, specifically tailored to the optimization of the typical use case of this method, is desperately needed to improve a key performance bottleneck.
You are forced to make an API breaking change, again for no benefit to your client, but just to fix an internal issue. In the meantime you've got people writing code which assumes an ArrayList, such as iterating through it by index (there is an extremely slight performance gain to do so, but there are early optimizers out there to whom that's plenty).
I propose you judiciously generalize from the above two statements into general principles which capture the "why" of your question.
An important reason to prefer interfaces for formal argument types is that it does not bind you to a particular class hierarchy. Java supports only single inheritance of implementation (class inheritance), but it supports unlimited inheritance of interface (implements).
Return types are a different question. A good rule of thumb is to prefer the most general possible argument types, and the most specific possible return types. The "most general possible" is pretty easy, and it clearly lines up with preferring interface types for formal arguments. The "most specific possible" return types is trickier, however, because it depends on just what you mean by "possible".
One reason for using interface types as your methods' declared return types is to allow you to return instances of non-public classes. Another is to preserve the flexibility to change what specific type you return without breaking dependent code. Yet another is to allow different implementations to return different types. That's just off the top of my head.
So in the case of the getPrimes() method I would bind it to List, meaning any concrete implementation of the List interface rather than a concrete implementation such as ArrayList!?
Yes, this allows the method to later change what List type it returns without breaking client code that uses the method.
Besides having the ability to change what object is really passed to/returned from a method without breaking code, sometimes it may be better to use an interface type as a parameter/return type to lower the visibility of fields and methods available. This would reduce overall complexity of the code that then uses that interface type object.
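Here is how that might translate into code; a minimal sketch with invented names (InterfaceTypesDemo, totalLength, firstPrimes). The parameter is declared as Collection so that callers can pass a HashSet or an ArrayList without copying, and the return type is declared as List so that the ArrayList used today stays an internal detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not taken from any library.
class InterfaceTypesDemo {

    // Accepts any Collection: the method only iterates, so it asks for nothing more.
    static int totalLength(Collection<String> things) {
        int total = 0;
        for (String thing : things) {
            total += thing.length();
        }
        return total;
    }

    // Declares List; the concrete class can change later without breaking callers.
    static List<Integer> firstPrimes(int count) {
        List<Integer> primes = new ArrayList<>();
        for (int candidate = 2; primes.size() < count; candidate++) {
            boolean isPrime = true;
            for (int p : primes) {
                if (p * p > candidate) {
                    break;                 // no divisor up to sqrt(candidate)
                }
                if (candidate % p == 0) {
                    isPrime = false;
                    break;
                }
            }
            if (isPrime) {
                primes.add(candidate);
            }
        }
        return primes;
    }

    public static void main(String[] args) {
        Set<String> asSet = new HashSet<>(Arrays.asList("ab", "c"));
        List<String> asList = Arrays.asList("ab", "c");
        System.out.println(totalLength(asSet) + " " + totalLength(asList));
        System.out.println(firstPrimes(5));
    }
}
```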
I have an ArrayList<Container>, However, Container can be Container<String>, Container<Integer> etc. while iterating the arraylist, I need to find out what type of container it is and respond accordingly. I know that java has type erasure, but is there a way to pre-store the type and retrieve it later? something like
public T type;
and to use it later such as
container.type A = container.dosomething();
The type of a variable is a compile-time concept. Its purpose is just for allowing the compiler to determine what operations are allowed with that variable or expression.
Therefore, a "type which is not known at compile time" is completely useless, because the compiler doesn't know anything about what can be done with it. So the variable might as well be typed Object.
At runtime Java does not keep type information on generics. Type erasure occurs and all containers are considered to be of the same runtime type. Using reflection it is possible to get some information on generics, but it is rarely worth the effort. You should rather think about a redesign of your code, or use getClass() on an element in the container to determine the type of the actual elements in the container.
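One common way to "pre-store" the type, as the question puts it, is to keep a Class<T> token next to the value. The sketch below assumes you control the container class; TypedContainer and describe are hypothetical names, not the asker's actual code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical container that remembers its element type explicitly.
class TypedContainer<T> {
    private final Class<T> type;
    private final T value;

    TypedContainer(Class<T> type, T value) {
        this.type = type;
        this.value = value;
    }

    Class<T> getType() { return type; }
    T getValue() { return value; }
}

class TypeTokenDemo {

    // Branches on the stored token, because the type argument itself is erased.
    static String describe(TypedContainer<?> container) {
        if (String.class.equals(container.getType())) {
            String s = String.class.cast(container.getValue());
            return "string of length " + s.length();
        }
        if (Integer.class.equals(container.getType())) {
            Integer i = Integer.class.cast(container.getValue());
            return "integer " + i;
        }
        return "unknown: " + container.getType().getSimpleName();
    }

    public static void main(String[] args) {
        List<TypedContainer<?>> containers = Arrays.asList(
                new TypedContainer<String>(String.class, "abc"),
                new TypedContainer<Integer>(Integer.class, 7));
        for (TypedContainer<?> container : containers) {
            System.out.println(describe(container));
        }
    }
}
```

Note that the variable inside each branch still has to be declared with a type known at compile time, which is the point made above.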
I am trying to build a Java to C++ trans-compiler (i.e. Java code goes in, semantically "equivalent" (more or less) C++ code comes out).
Not considering garbage collection, the languages are quite similar, so the overall process works quite well already. One issue, however, is generics, which do not exist in C++. Of course, the easiest way would be to perform erasure as done by the Java compiler. However, the resulting C++ code should be nice to handle, so it would be good if I did not lose generic type information, i.e., it would be good if the C++ code still worked with List<X> instead of List. Otherwise, the C++ code would need explicit casting everywhere such generics are used. This is error-prone and inconvenient.
So, I am trying to find a way to somehow get a better representation for generics. Of course, templates seem to be a good candidate. Although they are something completely different (metaprogramming vs. compile-time only type enhancement), they could still be useful. As long as no wildcards are used, just compiling a generic class to a template works reasonably well. However, as soon as wildcards come into play, things get really messy.
For example, consider the following java constructor of a list:
class List<T>{
List(Collection<? extends T> c){
this.addAll(c);
}
}
//Usage
Collection<String> c = ...;
List<Object> l = new List<Object>(c);
How should this be compiled? I had the idea of using a chainsaw reinterpret cast between templates. Then, the example above could be compiled like this:
template<class T>
class List{
List(Collection<T*> c){
this.addAll(c);
}
}
//Usage
Collection<String*> c = ...;
List<Object*> l = new List<Object*>(reinterpret_cast<Collection<Object*>>(c));
However, the question is whether this reinterpret cast produces the expected behaviour. Of course, it is dirty. But will it work? Usually, List<Object*> and List<String*> should have the same memory layout, as their template parameter is only a pointer. But is this guaranteed?
Another solution I thought of would be replacing methods using wildcards by template methods which instantiate each wildcard parameter, i.e., compile the constructor to
template<class T>
class List{
template<class S>
List(Collection<S*> c){
this.addAll(c);
}
}
Of course, all other methods involving wildcards, like addAll, would then also need template parameters. Another problem with this approach would be handling wildcards in class fields, for example. I cannot use templates here.
A third approach would be a hybrid one: A generic class is compiled to a template class (call it T<X>) and an erased class (call it E). The template class T<X> inherits from the erased class E so it is always possible to drop genericity by upcasting to E. Then, all methods containing wildcards would be compiled using the erased type while others could retain the full template type.
What do you think about these methods? Where do you see the dis-/advantages of them?
Do you have any other thoughts on how wildcards could be implemented as cleanly as possible while keeping as much generic information in the code as possible?
Not considering garbage collection, the languages are quite similar, so the overall process works quite well already.
No. While the two languages actually look rather similar, they are significantly different as to "how things are done". Such 1:1 trans-compilations as you are attempting will result in terrible, underperforming, and most likely faulty C++ code, especially if you are looking not at a stand-alone application, but at something that might interface with "normal", manually-written C++.
C++ requires a completely different programming style from Java. This begins with not having all types derive from Object, touches on avoiding new unless absolutely necessary (and then restricting it to constructors as much as possible, with the corresponding delete in the destructor - or better yet, follow Potatoswatter's advice below), and doesn't end at "patterns" like making your containers STL-compliant and passing begin- and end-iterators to another container's constructor instead of the whole container. I also didn't see const-correctness or pass-by-reference semantics in your code.
Note how many of the early Java "benchmarks" claimed that Java was faster than C++, because Java evangelists took Java code and translated it to C++ 1:1, just like you are planning to do. There is nothing to be won by such transcompilation.
An approach you haven't discussed is to handle generic wildcards with a wrapper class template. So, when you see Collection<? extends T>, you replace it with an instantiation of your template that exposes a read-only[*] interface like Collection<T> but wraps an instance of Collection<?>. Then you do your type erasure in this wrapper (and others like it), which means the resulting C++ is reasonably nice to handle.
Your chainsaw reinterpret_cast is not guaranteed to work. For instance if there's multiple inheritance in String, then it's not even possible in general to type-pun a String* as an Object*, because the conversion from String* to Object* might involve applying an offset to the address (more than that, with virtual base classes)[**]. I expect you'll use multiple inheritance in your C++-from-Java code, for interfaces. OK, so they'll have no data members, but they will have virtual functions, and C++ makes no special allowance for what you want. I think with standard-layout classes you could probably reinterpret the pointers themselves, but (a) that's too strong a condition for you, and (b) it still doesn't mean you can reinterpret the collection.
[*] Or whatever. I forget the details of how the wildcards work in Java, but whatever's supposed to happen when you try to add a T to a List<? extends T>, and the T turns out not to be an instance of ?, do that :-) The tricky part is auto-generating the wrapper for any given generic class or interface.
[**] And because strict aliasing forbids it.
If the goal is to represent Java semantics in C++, then do so in the most direct way. Do not use reinterpret_cast as its purpose is to defeat the native semantics of C++. (And doing so between high-level types almost always results in a program that is allowed to crash.)
You should be using reference counting, or a similar mechanism such as a custom garbage collector (although that sounds unlikely under the circumstances). So these objects will all go to the heap anyway.
Put the generic List object on the heap, and use a separate class to access that as a List<String> or whatever. This way, the persistent object has the generic type that can handle any ill-formed means of accessing it that Java can express. The accessor class contains just a pointer, which you already have for reference counting (i.e. it subclasses the "native" reference, not an Object for the heap), and exposes the appropriately downcasted interface. You might even be able to generate the template for the accessor using the generics source code. If you really want to try.
I wonder why the Java compiler doesn't trust this line of code :
List<Car> l = new ArrayList();
and expects to have a typed ArrayList :
List<Car> l = new ArrayList<Car>();
Indeed, compiler indicates an unchecked assignment with the first case.
Why doesn't compiler see that this ArrayList() has just been created and so it's impossible to find already in it some objects other than 'Car'?
This warning would make sense if the untyped ArrayList was created before but doesn't in this case ...
Indeed, since the List is typed as 'Car', all future "l.add('object')" calls will be allowed only if 'object' is a 'Car'. => So, as far as I can see, no surprise could happen.
Am I wrong ?
Thanks
Why doesn't compiler see that this ArrayList() has just been created and so it's impossible to find already in it some objects other than 'Car'?
The simple answer is "Because it is not allowed to."
The compiler has to implement the Java Language Specification. If some compiler writer goes and adds a bunch of smarts to the compiler to allow things that "everyone" knows are safe, then what he's actually done is to introduce a portability problem. Code compiled and tested with this compiler will give compilation errors when compiled with a dumb (or more accurately, strictly conformant) Java compiler.
So why doesn't the JLS allow this? I can think of a couple of possible explanations:
The JLS writers didn't have time to add this to the specification; e.g. to consider all of the ramifications on the rest of the specification.
The JLS writers couldn't figure out a sound way to express this in the specification.
The JLS writers didn't want to put something into the specification that they weren't sure was implementable in a real world compiler ... without being too burdensome on the compiler writer.
And then there is the associated question of whether a compiler could implement such a check. I'm not qualified to answer that ... but I know enough about the problem to realize that it is not necessarily as simple to solve as one might imagine.
To add to Stephen C's really good answer (mostly because writing this as a comment would be really cumbersome, sorry), this problem is actually explicitly mentioned in the JLS:
Discussion
Variables of a raw type can be assigned from values of any
of the type's parametric instances.
For instance, it is possible to assign a Vector<String> to a Vector,
based on the subtyping rules (§4.10.2).
The reverse assignment from Vector to Vector<String> is unsafe (since
the raw vector might have had a different element type), but is still
permitted using unchecked conversion (§5.1.9) in order to enable
interfacing with legacy code. In this case, a compiler will issue an
unchecked warning.
Why didn't they special-case this particular case? Because adding special cases to an implementation isn't something you do if you don't have to and the "correct" solution doesn't add any particular problems (apart from some writing effort which they presumably didn't think was too important).
Also every special case means that the compiler gets more complicated (well generally speaking at least and in this case certainly) - considering how simple javac is on the whole, I'd think it's not unlikely that having a simple, fast compiler was also one of the design goals.
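The general case the JLS worries about can be reproduced in a few lines. This is only an illustrative sketch (Car is a stand-in class and the method names are invented): a raw list that already holds a String is assigned to a List<Car>, which compiles with an unchecked warning and fails later, at the point where an element is read as a Car. Since Java 7 the diamond form new ArrayList<>() gives the brevity of the raw constructor without the warning, because the compiler infers the type argument.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not taken from any library.
class UncheckedAssignmentDemo {

    static class Car { }

    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawAssignmentCanPolluteList() {
        ArrayList raw = new ArrayList();
        raw.add("not a car");              // legal on a raw list
        List<Car> cars = raw;              // unchecked conversion: warning, not an error
        try {
            Car first = cars.get(0);       // compiler-inserted cast to Car fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Diamond operator (Java 7 and later): type argument inferred, no warning.
    static List<Car> freshList() {
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        System.out.println(rawAssignmentCanPolluteList());
        System.out.println(freshList().isEmpty());
    }
}
```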
private ArrayList<String> colors = new ArrayList<String>();
Looking at the example above, it seems the main point of generics is to enforce type on a collection. So, instead of having an array of "Objects", which need to be cast to a String at the programmer's discretion, I enforce the type "String" on the collection in the ArrayList. This is new to me but I just want to check that I'm understanding it correctly. Is this interpretation correct?
That's by far not the only use of generics, but it's definitely the most visible one.
Generics can be (and are) used in many different places to ensure static type safety, not just with collections.
I'd just like to mention that, because you'll come across places where generics could be useful, but if you're stuck with the generics/collections association, then you might overlook that fact.
Yes, your understanding is correct. The collection is strongly-typed to whatever type is specified, which has various advantages - including no more run-time casting.
Yeah, that's basically it. Before generics, one had to create an ArrayList of Objects. This meant that one could add any type of Object to the list - even if you only meant for the ArrayList to contain Strings.
All generics do is add type safety. That is, Java will now make sure that any object in the list is a String, and prevent you from adding a non-String object to the list. Even better: this check is done by the compiler, at compile time, rather than by the JVM at run time.
Yes. To maintain type safety and remove runtime casts is the correct answer.
You may want to check out the tutorial on the Java site. It gives a good explanation in the introduction.
Without Generics:
List myIntList = new LinkedList(); // 1
myIntList.add(new Integer(0)); // 2
Integer x = (Integer) myIntList.iterator().next(); // 3
With Generics
List<Integer> myIntList = new LinkedList<Integer>(); // 1'
myIntList.add(new Integer(0)); // 2'
Integer x = myIntList.iterator().next(); // 3'
I think of it as type safety, and it also saves the casting. Read more about autoboxing.
You can add runtime checks with the Collections utility class.
http://java.sun.com/javase/6/docs/api/java/util/Collections.html#checkedCollection(java.util.Collection,%20java.lang.Class)
Also see checkedSet, checkedList, checkedSortedSet, checkedMap, checkedSortedMap
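A small sketch of what the checked wrappers buy you (the demo class and method names are invented; Collections.checkedList is the real JDK method). Raw "legacy" code can sneak an Integer into a plain List<String>, but the checked view rejects it immediately with a ClassCastException.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; only Collections.checkedList is a real API here.
class CheckedCollectionDemo {

    // A plain list silently accepts the wrong element through a raw reference.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean plainListAcceptsWrongType() {
        List<String> plain = new ArrayList<String>();
        List raw = plain;                  // simulates legacy, pre-generics code
        raw.add(Integer.valueOf(1));       // no error at this point
        return plain.size() == 1;
    }

    // The checked view verifies every insertion against String.class at run time.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean checkedListRejectsWrongType() {
        List<String> checked = Collections.checkedList(new ArrayList<String>(), String.class);
        List raw = checked;
        try {
            raw.add(Integer.valueOf(1));   // throws ClassCastException immediately
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(plainListAcceptsWrongType());
        System.out.println(checkedListRejectsWrongType());
    }
}
```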
Yes, you are correct. Generics add compile-time type safety to your program, which means the compiler can detect if you are putting the wrong type of objects into, e.g., your ArrayList.
One thing I would like to point out is that although it removes the visible run-time casting and un-clutters the source code, the JVM still does the casting in the background.
The way Generics is implemented in Java it just hides the casting and still produces non-generic bytecode. An ArrayList<String> still is an ArrayList of Objects in the byte-code. The good thing about this is that it keeps the bytecode compatible with earlier versions. The bad thing is that this misses a huge optimization opportunity.
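The effect of erasure is easy to observe; a minimal sketch (the class name is made up). Both lists report the same runtime class, because the type argument exists only at compile time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not taken from any library.
class ErasureDemo {

    // An ArrayList<String> and an ArrayList<Integer> share one runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    // The runtime class name carries no trace of the type argument.
    static String runtimeClassName() {
        List<String> strings = new ArrayList<>();
        return strings.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(runtimeClassName());
    }
}
```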
You can use generics anywhere you need a type parameter, i.e. a type that should be the same across some code, but is left more or less unspecified.
For example, one of my toy projects is to write algorithms for computer algebra in a generic way in Java. This is interesting for the sake of the mathematical algorithms, but also to put Java generics through a stress test.
In this project I've got various interfaces for algebraic structures such as rings and fields and their respective elements, and concrete classes, e.g. for integers or for polynomials over a ring, where the ring is a type parameter. It works, but it becomes somewhat tedious in places. The record so far is a type in front of a variable that spans two complete lines of 80 characters, in an algorithm for testing irreducibility of polynomials. The main culprit is that you can't give a complicated type its own name.
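To give a flavour of what such code can look like, here is a heavily simplified sketch. It is not the project described above; Ring, IntegerRing and RingAlgorithms are invented for illustration. The algorithm is written once against the Ring<E> contract and works for any element type that has a ring implementation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical minimal ring contract; a real library would need much more.
interface Ring<E> {
    E zero();
    E add(E a, E b);
    E multiply(E a, E b);
}

class IntegerRing implements Ring<Integer> {
    public Integer zero() { return 0; }
    public Integer add(Integer a, Integer b) { return a + b; }
    public Integer multiply(Integer a, Integer b) { return a * b; }
}

class RingAlgorithms {

    // Evaluates c0 + c1*x + c2*x^2 + ... using Horner's scheme, for any ring.
    static <E> E evaluate(Ring<E> ring, List<E> coefficients, E x) {
        E result = ring.zero();
        for (int i = coefficients.size() - 1; i >= 0; i--) {
            result = ring.add(ring.multiply(result, x), coefficients.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // 1 + 2x + 3x^2 evaluated at x = 2
        Integer value = evaluate(new IntegerRing(), Arrays.asList(1, 2, 3), 2);
        System.out.println(value);
    }
}
```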