Keeping track of what's in a Collection in pre-generics Java? - java

For a bunch of reasons that (believe it or not) are not as unsound as you may think, we are still (sigh) using Java 1.4 to build and run our code (though we plan to finally move to Java 7 by the end of the year).
Our existing code that uses Collection classes doesn't do a very good job of making it clear what is expected to be in the Collection. Obviously, you can read the code, see what downcasts end up being done, and infer from those, but you can't just look at a method declaration and know what the Collection object that is a method argument or return value actually holds.
In new code that I'm writing and when I am in older code that uses Collections, I've been adding in-line comments to Collections declarations to show what would have been declared if generics were being used. For example:
Map/*<String, Set<Integer>>*/ theMap = new HashMap/*<String, Set<Integer>>*/();
or
List/*<Actions>*/ someMethod(List/*<Job>*/ jobs);
In keeping with the frowning at subjectivity here at SO, rather than asking what you think of this (though admittedly I'd like to know -- I do find it a bit ugly but still like having the type info there) I'd instead just ask what, if anything, you do to make it clear what is being held by pre-generics Collection objects.

What we recommended back in the old days -- and I was a Java Architect at Sun when Java 1.1 was the New Thing -- was to write a class around the structure (I don't think 1.1 even had Collection as a base class) so that the typecasts happened in code you control instead of in user code. So, for example, something like
public class ArrayOfFoo {
    Object[] ary; // ctor left as exercise
    public void set(int index, Foo value) {
        ary[index] = (Object) value; // cast strictly not needed, any Foo is an Object
    }
    public Foo get(int index) {
        return (Foo) ary[index]; // cast needed, not every Object is a Foo
    }
}
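With such a wrapper in place, user code never needs a cast at all. A brief usage sketch, assuming a hypothetical ArrayOfFoo(int size) constructor that allocates the backing array:
ArrayOfFoo foos = new ArrayOfFoo(10);  // hypothetical sizing constructor
foos.set(0, new Foo());
Foo first = foos.get(0);               // no cast in user code
// foos.set(1, "not a Foo");           // would not compile: the wrapper enforces the element type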
Sounds like the code base you have isn't built to this convention; if you're writing new code, there's no reason you can't start. Failing that, your convention isn't bad, but it's easy to forget the cast and then have to search to find out why you're getting a bad cast exception. It's mildly better to resort to some variant of Hungarian notation, or the Smalltalk 'aVariable' convention, encoding the type in the names, so that you write
Object[] fooAry = new Object[aZillion];
fooAry[42] = new Foo();
Foo aFoo = (Foo) fooAry[42];

Use clear variable identifiers such as jobList, actionList, or dictionaryMap. If you're concerned with the type of objects they contain, you could even make it a convention to always let the identifier of a Collection hint about which type of objects it holds.
The inlined comments aren't a bad idea, actually. When I ported a 1.5 project back to 1.4 I did just that (instead of removing the type parameters). It worked out quite well.

I'd recommend writing tests. For various reasons:
You should be writing tests anyway!
You can assert the type of a collection member very easily, to ensure that all your code paths are adding the right types to the collection (see the sketch after this list)
You can use the test to write code that serves as an "example" of how to use the collection correctly
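For instance, a minimal JUnit 3.x style sketch (the JobQueue class, its getPendingJobs() method, and the Job type are hypothetical) that checks that every element of a returned collection really is a Job:
import java.util.Iterator;
import java.util.List;
import junit.framework.TestCase;

public class JobQueueTest extends TestCase {
    public void testPendingJobsContainsOnlyJobs() {
        JobQueue queue = new JobQueue(); // hypothetical class under test
        queue.add(new Job("nightly build"));

        List/*<Job>*/ jobs = queue.getPendingJobs();
        for (Iterator it = jobs.iterator(); it.hasNext();) {
            Object element = it.next();
            // Fails fast if some code path put the wrong type into the collection.
            assertTrue("unexpected element type: " + element.getClass().getName(),
                    element instanceof Job);
        }
    }
}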

If you just need binary compatibility with 1.4, you could consider using a tool to downgrade the class files back to 1.4 and thus start developing in 1.6 or 1.7 right now. You would of course need to avoid any API that wasn't there in 1.4 (unfortunately you can't compile code with generics directly against the 1.4 jars, as they don't declare any generic types). The bytecode itself is still the same (at least with 1.6; I don't know for sure about 1.7). One free tool that can do the trick is ProGuard. It can do much more sophisticated things and can also remove all traces of generics from the class files; just turn off the obfuscation and optimization if you don't need them. It will also warn you if the processed code uses some API that is missing, provided you feed it the 1.4 libraries.
I'm aware that this is considered a hack by many, but we had a similar requirement where we needed some code to still run on a Personal Java VM (essentially Java 1.1) and several other exotic VMs, and this approach worked quite well. We started with ProGuard and then wrote our own tool for the task so we could implement a few workarounds for bugs in the diverse VMs.

Related

Java vs Scala Types Hierarchy

Currently, I am learning Scala and I noticed that the type hierarchy in Scala is much more consistent. There is an Any type which is really the super type of all types, unlike Java's Object, which is only a super type of all reference types.
Java Examples
The Java approach led to the introduction of wrapper classes for primitives and to auto-boxing. It also led to having types which cannot, for example, be used as keys in HashMaps. All of those things add to the complexity of the language.
Integer i = new Integer(1); // Is it really needed? 1 is already an int.
HashMap<int, String> // Not valid, as int is not a subtype of Object.
Question
It seems like a great idea to have all types in one hierarchy. This leads to the question: why is there no single hierarchy of all types in Java? There is a division between primitive types and reference types. Does it have some advantages, or was it a bad design decision?
That's a rather broad question.
Many different avenues exist to explain this. I'll try to name some of them.
Java cares about older code; scala mostly does not.
Programming languages are in the end defined by their community; a language that nobody but you uses is rather handicapped, as you're forced to write everything yourself. That means the way the community tends to do things reflects rather strongly on whether a language is 'good' or not. The java community strongly prefers reasonable backwards compatibility (reasonable as in: if there is a really good reason not to be entirely backwards compatible, for example because the feature you're breaking is very rarely used, or almost always used in ways that are buggy anyway, that's okay). The scala community tends to flock from one hip new way of doing things to the next, and any library that isn't under very active development either no longer works at all, or integrating it into more modern scala libraries is a very frustrating exercise.
Java doesn't work like that. This can be observed, for example, in generics: generics (the T in List<T>) weren't in java 1.0; they were introduced in java 1.5, and they were introduced in a way that let all existing code just continue to work fine. All libraries, even without updates, would work all right with newer code, and adapting existing code to use generics did not require picking new libraries or updating much beyond adding the generics in the right places in the source file.
But that came at a cost: erasure. And because the pre-1.5 List class worked with Objects, generics had to work with Object as an implicit bound. Java could introduce an Any type but it would be mostly useless; you couldn't use it in very many places.
Erasure means that, in java, generics are mostly a figment of the compiler's imagination. That's why, given an instance of a list, you cannot ask it what its component type is; it simply does not know. You can write List<String> x = ...; String y = x.get(0); and that works fine, but only because the compiler injects an invisible cast for you, and it knows this cast is fine because the generics give the compiler a framework to judge that the cast will never cause a ClassCastException (barring explicit attempts to mess with it, which always come with a warning from the compiler). But you can't cast an Object to an int, and for good reason.
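A small sketch of what erasure means in practice (plain java, nothing assumed beyond the standard library):
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    public static void main(String[] args) {
        List<String> strings = new ArrayList<String>();
        List<Integer> ints = new ArrayList<Integer>();

        // At runtime both lists are plain ArrayLists; the element type is erased.
        System.out.println(strings.getClass() == ints.getClass()); // prints true

        strings.add("hello");
        // The compiler inserts an invisible (String) cast on the next line.
        String s = strings.get(0);
        System.out.println(s);
    }
}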
The Scala community appears to be more accepting of a new code paradigm that just doesn't interact with the old; they'll refactor all their code and leave some older library by the wayside more readily.
Java is more explicit than scala is.
Scalac will infer tons of stuff; that's more or less how the language is designed (see: implicit). For some language features you're forced to just straight up make a call: you're trading clarity against verbosity. Where you are forced to choose, java tends to err on the side of clarity. This shows up specifically when we're talking about silent heap allocation and wrapper hoisting: Java prefers not to do it. Yes, there's auto-boxing (which is silent wrapping), but silently treating an int, which, if handled properly, is orders of magnitude faster than its wrapped variant, as the wrapped variant for an entire collection, just so you can write List<int>, is a bridge too far for java: presumably it would become too easy to forget that you're eating all the performance downsides.
That's why java doesn't 'just' go: Eh, whatever, we'll introduce an Any type and tape it all together at runtime by wrapping stuff silently.
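To make the cost concrete, here is a small sketch of what the absence of List<int> forces on you today (standard library only, no assumptions beyond it):
import java.util.ArrayList;
import java.util.List;

public class BoxingDemo {
    public static void main(String[] args) {
        // A primitive array stores the values directly, with no per-element object.
        int[] primitives = {1, 2, 3};

        // List<int> is not allowed; every value must be boxed into an Integer object,
        // which costs an allocation and a pointer indirection per element.
        List<Integer> boxed = new ArrayList<Integer>();
        for (int value : primitives) {
            boxed.add(value); // auto-boxing: the int is silently wrapped
        }

        int sum = 0;
        for (Integer value : boxed) {
            sum += value; // auto-unboxing on every read
        }
        System.out.println(sum);
    }
}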
primitives are performant.
In java (and as scala runs on the JVM, scala too), there are really only 9 types: int, long, double, short, float, boolean, char, byte, and reference. As in, when you have an int variable, in memory, it is literally just that value, but if you have a String variable, the string lives in the heap someplace and the value you're passing around everywhere is a pointer to it. Given that you can't directly print the pointer or do arithmetic on it, in java we like to avoid the term and call it a 'reference' instead, but make no mistake: That's just a pointer with another name.
Pointers inherently waste memory and are less performant. There are excellent reasons for this tradeoff, but it is what it is. However, writing code that can deal with a direct value just as well as with a reference is not easy. Moving this complexity into your face, by making it relatively difficult to write code that is agnostic about the difference (which is what the Any type is trying to accomplish), is one way to make sure programmers don't ever get confused about it.
The future
Add up the 3 things above and hopefully it is now clear that an Any type either causes a lot of downsides, or, that it would be mostly useless (you couldn't use it anywhere).
However, there is good news on the horizon. Google for 'Project Valhalla' and 'java value types'. This is a difficult endeavour that will attempt to allow a lot of what an Any type would bring you, including, effectively, primitives in generics, in a way that integrates with existing java code, just like how java's approach to closures meant that java did not need scala's infamous Function8<A,B,C,D,E,F,G,H,R> and friends. Doing it right tends to be harder, so it has taken quite a while, and Project Valhalla isn't finished yet. But when it is, you WILL be able to write List<int> list = new ArrayList<int>(), AND use Any-style types, and it will all be as performant as can be, and integrate with existing java code as well as possible. Project Valhalla is not part of JDK 14 and probably won't make 15 either.

Vector is an obsolete Collection

The inspection reports any uses of java.util.Vector or java.util.Hashtable. While still supported, these classes were made obsolete by the JDK 1.2 Collection classes and should probably not be used in new development.
I have a project in Java which uses Vector everywhere, and I'm using JDK 8, which is the latest one. I want to know if I can run that application on the latest Java.
Also, can I use some other class such as ArrayList in place of Vector in new Java code?
First of all, although Vector is mostly obsoleted by ArrayList, it is still perfectly legal to use, and your project should run just fine.
As noted, however, it is not recommended to use. The main reason for this is that all of its methods are synchronized, which is usually useless and could considerably slow down your application. Any local variable that's not shared outside the scope of the method can safely be replaced with an ArrayList. Method arguments, return values and data members should be inspected closely before being replaced with ArrayList, lest you unwittingly change the synchronization semantics and introduce a hard-to-discover bug.
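For example, a rough sketch of the usual replacements:
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Vector;

public class VectorMigration {
    // Old style: every call on the Vector is synchronized, whether needed or not.
    List<String> legacy = new Vector<String>();

    // Method-local or single-threaded use: ArrayList is a drop-in replacement.
    List<String> local = new ArrayList<String>();

    // If the list really is shared between threads, make the locking explicit
    // instead of relying on Vector's implicit synchronization.
    List<String> shared = Collections.synchronizedList(new ArrayList<String>());
}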

How unsafe is the use of sun.misc.Unsafe actually?

I am wondering how unsafe the use of sun.misc.Unsafe actually is. I want to create a proxy of an object where I intercept every method call (except the one to Object.finalize, for performance considerations). For this purpose, I googled a little bit and came up with the following code snippet:
class MyClass {
    private final String value;

    MyClass() {
        this.value = "called";
    }

    // Returning the value lets the test below assert that the constructor was bypassed.
    public String print() {
        System.out.println(value);
        return value;
    }
}
@org.junit.Test
public void testConstructorTrespassing() throws Exception {
    @SuppressWarnings("unchecked")
    Constructor<MyClass> constructor = (Constructor<MyClass>) ReflectionFactory.getReflectionFactory()
            .newConstructorForSerialization(MyClass.class, Object.class.getConstructor());
    constructor.setAccessible(true);
    // value is null because MyClass's own constructor was never invoked.
    assertNull(constructor.newInstance().print());
}
My consideration is:
Even though Java is advertised as Write once, run everywhere, my reality as a developer looks rather like Write once, run once in a controllable customer's runtime environment
sun.misc.Unsafe is expected to become part of the public API in Java 9
Many non-Oracle VMs also offer sun.misc.Unsafe since - I guess - quite a few libraries already use it. This also makes the class unlikely to disappear
I am never going to run the application on Android, so this does not matter for me.
How many people are actually using non-Oracle VMs anyways?
I am still wondering: are there other reasons why I should not use sun.misc.Unsafe that I did not think of? If you google this question, people tend to answer with an unspecified "because it's not safe", but I do not really see the problem besides the (very unlikely) possibility that the method will one day disappear from the Oracle VM.
I actually need to create an object without calling a constructor to overcome Java's type system. I am not considering sun.misc.Unsafe for performance reasons.
Additional information: I am using ReflectionFactory in the example for convenience; it eventually delegates to Unsafe. I know about libraries like Objenesis, but looking at their code I found that they basically do something similar, plus checks for other ways on Java versions that would not work for me anyway, so I guess writing four lines is worth saving a dependency.
There are three significant (IMO) issues:
The methods in the Unsafe class have the ability to violate runtime type safety, and do other things that can lead to your JVM "hard crashing".
Virtually anything that you do using Unsafe could in theory be dependent on internal details of the JVM; i.e. details of how the JVM does things and represents things. These may be platform dependent, and may change from one version of Java to the next.
The methods you are using ... or even the class name itself ... may not be the same across different releases, platforms and vendors.
IMO, these amount to strong reasons not to do it ... but that is a matter of opinion.
Now if Unsafe becomes standardised / part of the standard Java API (e.g. in Java 9), then some of the above issues would be moot. But I think the risk of hard crashes if you make a mistake will always remain.
During one JavaOne 2013 session, Mark Reinhold (the JDK architect) got the question: "How safe is it to use the Unsafe class?". He replied with a somewhat surprising answer: "I believe it should become a stable API. Of course properly guarded with security checks, etc."
So it looks like there may be something like java.util.Unsafe for JDK 9. Meanwhile, using the existing class is relatively safe (as safe as doing something unsafe can be).
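For comparison, here is a rough sketch of the same "instantiate without a constructor" trick done directly against sun.misc.Unsafe, using the MyClass from the question above (the private theUnsafe field is an internal detail of the Oracle/OpenJDK implementation and exactly the kind of thing that may differ on other VMs or in future releases):
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class UnsafeAllocation {
    public static void main(String[] args) throws Exception {
        // The Unsafe singleton hides in a private static field; reaching in like
        // this is itself an implementation-specific hack.
        Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
        theUnsafe.setAccessible(true);
        Unsafe unsafe = (Unsafe) theUnsafe.get(null);

        // Allocates the object without running any constructor, so the final
        // 'value' field keeps its default value (null) instead of "called".
        MyClass instance = (MyClass) unsafe.allocateInstance(MyClass.class);
        instance.print(); // prints "null"
    }
}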

Java inheritance not recognised in reflection

I generally oppose extension since it creates a very strong connection between classes, which is easy to accidentally break.
However, I finally thought I'd found a reasonable case for it - I want to optionally use a compressed version of a file type in an existing system. The compressed version would be almost as quick as the uncompressed one and would have exactly the same methods available (i.e. read and write); the only difference would be the representation on disk. Therefore, I had the compressed version extend the uncompressed version so that either kind of file could be used, just by optionally instantiating the other type.
public class CompressedSpecialFile extends SpecialFile { ... }

SpecialFile sf;
if (useCompression) {
    sf = new CompressedSpecialFile();
} else {
    sf = new SpecialFile();
}
However, at a later point in the program, we use reflection:
Object[] values = new Object[]{ sf, param1, param2, ... };
Class myclass = Class.forName(algorithmName);
Class[] classes = ...; // created by calling getClass() on each object in values
Constructor constructor = myclass.getConstructor(classes);
Algorithm algorithm = (Algorithm) constructor.newInstance(values);
Which all worked fine, but now the myclass.getConstructor call throws a NoSuchMethodException, since the run-time type of the SpecialFile is CompressedSpecialFile.
However, I thought that was how extension is supposed to work: since CompressedSpecialFile extends SpecialFile, any parameter accepting a SpecialFile should accept a CompressedSpecialFile. Is this an error in Java's reflection, or a failure of my understanding?
Hmm, the response to this bug report seems to indicate that this is intentional.
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4301875
We cannot make this change for compatibility reasons. Furthermore, we
would expect that getConstructor should behave analogously to getDeclaredMethod,
which also requires an exact match, thus it does not make sense to change one
without changing the other. It would be possible to add an additional suite of
methods that differed only in the way in which the argument types were matched,
however.
There are certainly cases where we might want to apply at runtime during
reflection the same overload-resolution algorithm used statically by the
compiler, i.e., in a debugger. It is not difficult to implement this
functionality with the existing API, however, so the case for adding this
functionality to core reflection is weak.
That bug report was closed as a duplicate of the following one, which provides a bit more implementation detail:
http://bugs.sun.com/bugdatabase/view_bug.do;jsessionid=1b08c721077da9fffffffff1e9a6465911b4e?bug_id=4287725
Work Around
Users of getMethod must be precise identifying the Class passed to the argument.
Evaluation
The essence of this request is that the user would like for Class.getMethod
to apply the same overloading rules as the compiler does. I think this is
a reasonable request, as I see a need for this arising frequently in certain
kinds of reflective programs, such as debuggers and scripting interpreters,
and it would be helpful to have a standard implementation so that everybody
gets it right. For compatibility, however, the behavior of the existing
Class.getMethod should be left alone, and a new method defined. There is
a case for leaving this functionality out on the basis of footprint, as it
can be implemented using existing APIs, albeit somewhat inefficiently.
See also 4401287.
Consensus appears to be that we should provide overload resolution in
reflection. Exactly when such functionality is provided would depend largely
on interest and potential uses.
For compatibility reasons, the Class.get(Declared)+{Method,Constructor}
implementation should not change; new method should be introduced. The
specification for these methods does need to be modified to define "match". See
bug 4651775.
You can keep digging into those referenced bugs and the actual links I provided (where there's discussion as well as possible workarounds), but I think that gets at the reasoning (though why a new method that respects java's inheritance rules during reflection has still not been added, I don't know).
In terms of workarounds, I suppose that for the one-level-deep version of inheritance, you could just call getSuperclass() on each class that turns out to be one of your extending classes, but that's extremely inelegant and tied to using it only on classes implemented in the prescribed manner. Very kludgy. I'll try to look for another option, though.
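One more general workaround is to search the constructors yourself and match parameters with Class.isAssignableFrom instead of requiring an exact match. A rough sketch (it does not do full overload resolution; it simply returns the first compatible public constructor it finds):
import java.lang.reflect.Constructor;

public class ConstructorFinder {
    public static Constructor<?> findCompatibleConstructor(Class<?> target, Class<?>[] argTypes)
            throws NoSuchMethodException {
        for (Constructor<?> candidate : target.getConstructors()) {
            Class<?>[] paramTypes = candidate.getParameterTypes();
            if (paramTypes.length != argTypes.length) {
                continue;
            }
            boolean matches = true;
            for (int i = 0; i < paramTypes.length; i++) {
                // e.g. SpecialFile.class.isAssignableFrom(CompressedSpecialFile.class) is true
                if (!paramTypes[i].isAssignableFrom(argTypes[i])) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                return candidate;
            }
        }
        throw new NoSuchMethodException("No compatible constructor on " + target.getName());
    }
}
In the question's code you would then call findCompatibleConstructor(myclass, classes) instead of myclass.getConstructor(classes).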

Explicit typing in Groovy: sometimes or never?

[Later: Still can't figure out if Groovy has static typing (seems that it does not) or if the bytecode generated using explicit typing is different (seems that it is). Anyway, on to the question]
One of the main differences between Groovy and other dynamic languages -- or at least Ruby -- is that you can statically explicitly type variables when you want to.
That said, when should you use static typing in Groovy? Here are some possible answers I can think of:
Only when there's a performance problem. Statically typed variables are faster in Groovy. (or are they? some questions about this link)
On public interfaces (methods, fields) for classes, so you get autocomplete. Is this possible/true/totally wrong?
Never, it just clutters up code and defeats the purpose of using Groovy.
Yes when your classes will be inherited or used
I'm not just interested in what YOU do but more importantly what you've seen around in projects coded in Groovy. What's the norm?
Note: If this question is somehow wrong or misses some categories of static-dynamic, let me know and I'll fix it.
In my experience, there is no norm. Some use types a lot, some never use them. Personally, I always try to use types in my method signatures (for params and return values). For example I always write a method like this
Boolean doLogin(User user) {
// implementation omitted
}
Even though I could write it like this
def doLogin(user) {
// implementation omitted
}
I do this for these reasons:
Documentation: other developers (and myself) know what types will be provided and returned by the method without reading the implementation
Type Safety: although there is no compile-time checking in Groovy, if I call the statically typed version of doLogin with a non-User parameter it will fail immediately, so the problem is likely to be easy to fix. If I call the dynamically typed version, it will fail some time after the method is invoked, and the cause of the failure may not be immediately obvious.
Code Completion: this is particularly useful when using a good IDE (i.e. IntelliJ) as it can even provide completion for dynamically added methods such as domain class' dynamic finders
I also use types quite a bit within the implementation of my methods for the same reasons. In fact the only times I don't use types are:
I really want to support a wide range of types. For example, a method that converts a string to a number could also convert a collection or array of strings to numbers
Laziness! If the scope of a variable is very short, I already know which methods I want to call, and I don't already have the class imported, then declaring the type seems like more trouble than it's worth.
BTW, I wouldn't put too much faith in that blog post you've linked to claiming that typed Groovy is much faster than untyped Groovy. I've never heard that before, and I didn't find the evidence very convincing.
I worked on several Groovy projects and we stuck to these conventions:
All types in public methods must be specified.
public int getAgeOfUser(String userName){
...
}
All private variables are declared using the def keyword.
These conventions allow you to achieve many things.
First of all, if you use joint compilation your java code will be able to interact with your groovy code easily. Secondly, such explicit declarations make code in large projects more readable and maintainable. And of course auto-completion is an important benefit too.
On the other hand, the scope of a method is usually small enough that you don't need to declare types explicitly. By the way, modern IDEs can auto-complete your local variables even if you use def.
I have seen type information used primarily in service classes for public methods. Depending on how complex the parameter list is, even here I usually see just the return type typed. For example:
class WorkflowService {
....
WorkItem getWorkItem(processNbr) throws WorkflowException {
...
...
}
}
I think this is useful because it explicitly tells the user of the service what type they will be dealing with and does help with code assist in IDE's.
Groovy does not support static typing. See it for yourself:
class Foo {}
class Bar {}

public Foo func(Bar bar) {
    return bar  // a static type checker would reject this; dynamic Groovy compiles it anyway
}

println("no static typing")
Save that to a file, compile it, and run it: it prints "no static typing" even though the declared return type (Foo) and the returned value (a Bar) don't match.
