I generally oppose extension since it creates a very strong connection between classes, which is easy to accidentally break.
However, I finally thought I'd found a reasonable case for it - I want to optionally use a compressed version of a file type in an existing system. The compressed version would be almost as quick as the uncompressed one and would have exactly the same methods available (i.e. read and write) - the only difference would be in the representation on disk. Therefore, I had the compressed version extend the uncompressed version, so that either kind of file could be used simply by instantiating one type or the other.
public class CompressedSpecialFile extends SpecialFile { ... }

SpecialFile sf;
if (useCompression) {
    sf = new CompressedSpecialFile();
} else {
    sf = new SpecialFile();
}
However, at a later point in the program, we use reflection:
Object[] values = new Object[]{ sf, param1, param2, ... };
Class<?> myclass = Class.forName(algorithmName);
Class<?>[] classes = ...; // built by calling getClass() on each object in values
Constructor<?> constructor = myclass.getConstructor(classes);
Algorithm algorithm = (Algorithm) constructor.newInstance(values);
This all worked fine, but now the myclass.getConstructor call throws a NoSuchMethodException, since the run-time type of the SpecialFile is CompressedSpecialFile.
However, I thought that was how extension is supposed to work - since CompressedSpecialFile extends SpecialFile, any parameter accepting a SpecialFile should accept a CompressedSpecialFile. Is this an error in Java's reflection, or a failure of my understanding?
Hmm, the response to this bug report seems to indicate that this is intentional.
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4301875
We cannot make this change for compatibility reasons. Furthermore, we would expect that getConstructor should behave analogously to getDeclaredMethod, which also requires an exact match, thus it does not make sense to change one without changing the other. It would be possible to add an additional suite of methods that differed only in the way in which the argument types were matched, however.

There are certainly cases where we might want to apply at runtime during reflection the same overload-resolution algorithm used statically by the compiler, i.e., in a debugger. It is not difficult to implement this functionality with the existing API, however, so the case for adding this functionality to core reflection is weak.
That bug report was closed as a duplicate of the following one, which provides a bit more implementation detail:
http://bugs.sun.com/bugdatabase/view_bug.do;jsessionid=1b08c721077da9fffffffff1e9a6465911b4e?bug_id=4287725
Work Around
Users of getMethod must be precise in identifying the Class passed to the argument.
Evaluation
The essence of this request is that the user would like for Class.getMethod to apply the same overloading rules as the compiler does. I think this is a reasonable request, as I see a need for this arising frequently in certain kinds of reflective programs, such as debuggers and scripting interpreters, and it would be helpful to have a standard implementation so that everybody gets it right. For compatibility, however, the behavior of the existing Class.getMethod should be left alone, and a new method defined. There is a case for leaving this functionality out on the basis of footprint, as it can be implemented using existing APIs, albeit somewhat inefficiently. See also 4401287.

Consensus appears to be that we should provide overload resolution in reflection. Exactly when such functionality is provided would depend largely on interest and potential uses.

For compatibility reasons, the Class.get(Declared)+{Method,Constructor} implementation should not change; new method should be introduced. The specification for these methods does need to be modified to define "match". See bug 4651775.
You can keep digging into those referenced bugs and the actual links I provided (where there's discussion as well as possible workarounds), but I think that gets at the reasoning (though why a new method that mirrors Java's overload resolution in reflection has still not been implemented, I don't know).
In terms of workarounds, I suppose that for the one-level-deep version of inheritance you could just call getSuperclass() on each class whose runtime type is the extending class, but that's extremely inelegant and tied to your own classes following the prescribed pattern. Very kludgy. I'll try to look for another option, though.
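A more general workaround is to search the constructors yourself and apply the subtype rule that getConstructor skips. Here is a minimal sketch, assuming all arguments are non-null reference types (it deliberately ignores primitives, autoboxing, and varargs, and findCompatibleConstructor is a made-up helper name):

import java.lang.reflect.Constructor;

public class ConstructorLookup {
    // Return the first public constructor whose declared parameter types
    // are assignable from the runtime types of the given arguments.
    static Constructor<?> findCompatibleConstructor(Class<?> target, Object[] args)
            throws NoSuchMethodException {
        outer:
        for (Constructor<?> c : target.getConstructors()) {
            Class<?>[] params = c.getParameterTypes();
            if (params.length != args.length) {
                continue;
            }
            for (int i = 0; i < params.length; i++) {
                // isAssignableFrom accepts subtypes, so a CompressedSpecialFile
                // argument matches a SpecialFile parameter.
                if (!params[i].isAssignableFrom(args[i].getClass())) {
                    continue outer;
                }
            }
            return c;
        }
        throw new NoSuchMethodException(target.getName());
    }
}

Unlike the compiler's overload resolution, this just picks the first compatible constructor rather than the most specific one, which is usually fine when there is only one candidate.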
Related
I have the bytecode of some abstract project (let's call it The Project) - of every one of its classes - inside some Kotlin code, with each class's bytecode stored as a ByteArray; the task is to tell which specific methods in each class have been modified from build to build of The Project. In other words, there are two ByteArrays of the same class of The Project, but they belong to different versions of it, and I need to compare them accurately. A simple example. Let's assume we have a trivial class:
class Rst {
fun getjson(): String {
abc("""ss""");
return "jsonValid"
}
public fun abc(s: String) {
println(s)
}
}
Its bytecode is stored in oldByteCode. Now some changes happen to the class:
class Rst {
fun getjson(): String {
abc("""ss""");
return "someOtherValue"
}
public fun newMethod(s: String) {
println("it's not abc anymore!")
}
}
Its bytecode is stored in newByteCode.
That's the main goal: compare oldByteCode to newByteCode.
Here we have the following changes:
the getjson() method has been changed;
the abc() method has been removed;
newMethod() has been created.
So, a method counts as "changed" if its signature remains the same. If the signature differs, it's already some different method.
Now back to the actual problem. I have to know every method's exact status from its bytecode. What I have at the moment is the JaCoCo analyzer, which parses class bytecode into "bundles". In these bundles I have a hierarchy of packages, classes and methods, but only with their signatures, so I can't tell whether a method's body has any changes. I can only track signature differences.
Are there any tools or libraries to split class bytecode into its methods' bytecode? With those I could, for example, calculate hashes and compare them. Maybe the ASM library has something for that?
Any ideas are welcome.
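For the splitting part of the question, ASM's tree API can do it: a sketch along these lines (assuming the asm and asm-tree artifacts are on the classpath; methodsOf is a made-up helper name) indexes each method's parsed body by name and descriptor. As the answer below explains, though, comparing those bodies byte-for-byte is still unreliable:

import org.objectweb.asm.ClassReader;
import org.objectweb.asm.tree.ClassNode;
import org.objectweb.asm.tree.MethodNode;
import java.util.HashMap;
import java.util.Map;

public class MethodSplitter {
    // Parse a class file and index its methods by name + descriptor,
    // e.g. "getjson()Ljava/lang/String;".
    static Map<String, MethodNode> methodsOf(byte[] classBytes) {
        ClassNode cn = new ClassNode();
        new ClassReader(classBytes).accept(cn, 0);
        Map<String, MethodNode> bySignature = new HashMap<>();
        for (MethodNode mn : cn.methods) {
            bySignature.put(mn.name + mn.desc, mn);
        }
        return bySignature;
    }
}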
TL;DR: your approach of just comparing bytecode or even hashes won’t lead to a reliable solution; in fact, there is no solution with a reasonable effort to this kind of problem at all.
I don’t know how much of it applies to the Kotlin compiler, but as elaborated in Is the creation of Java class files deterministic?, Java compilers are not required to produce identical bytecode even if the same version is used to compile exactly the same source code. While they may have an implementation that tries to be as deterministic as possible, things change when looking at different versions or alternative implementations, as explained in Do different Java Compilers (where the vendor is different) produce different bytecode?
Even when we assume that the Kotlin compiler is outstandingly deterministic, even across versions, it can’t ignore the JVM evolution. E.g. the removal of the jsr/ret instructions could not be ignored by any compiler, even when trying to be conservative. But it’s rather likely that it will incorporate other improvements as well, even when not being forced¹.
So in short, even when the entire source code did not change, it’s not a safe bet to assume that the compiled form has to stay the same. Even with an explicitly deterministic compiler we would have to be prepared for changes when recompiling with newer versions.
Even worse, if one method changes, it may have an impact on the compiled form of others, as instructions refer to items of a constant pool whenever constants or linkage information are needed, and these indices may change depending on how the other methods use the constant pool. There’s also an optimized form for certain instructions when accessing one of the first 255 pool indices, so changes in the numbering may require changing the form of an instruction. This in turn may have an impact on other instructions, e.g. switch instructions have padding bytes depending on their bytecode position.
On the other hand, a simple change of a constant value used in only one method may have no impact on the method’s bytecode at all, if the new constant happens to end up at the same place in the pool as the old constant.
So, to determine whether the code of two methods actually does the same thing, there is no way around parsing the instructions and understanding their meaning to some degree. Comparing just bytes or hashes won’t work.
¹ to name some non-mandatory changes, the compilation of class literals changed, likewise string concatenation changed from using StringBuffer to use StringBuilder and changed again to use StringConcatFactory, the use of getClass() for intrinsic null checks changed to requireNonNull(…), etc. A compiler for a different language doesn’t have to follow, but no-one wants to be left behind…
There are also bugs to fix, like obsolete instructions, which no compiler would keep just to stay deterministic.
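One partial mitigation for the constant-pool issue specifically (not for compiler-version drift) might be to compare a symbolic rendering of the instructions rather than raw bytes; ASM's util package can produce one. A minimal sketch, assuming asm-util is available and using MethodNode from the tree API:

import java.io.PrintWriter;
import java.io.StringWriter;
import org.objectweb.asm.tree.MethodNode;
import org.objectweb.asm.util.Textifier;
import org.objectweb.asm.util.TraceMethodVisitor;

public class MethodText {
    // Render a method's instructions in textual form; constant-pool indices
    // are resolved to symbolic names, so renumbering the pool alone does
    // not change the output.
    static String disassemble(MethodNode mn) {
        Textifier textifier = new Textifier();
        mn.accept(new TraceMethodVisitor(textifier));
        StringWriter out = new StringWriter();
        textifier.print(new PrintWriter(out, true));
        return out.toString();
    }
}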
Java (1.8+) has a @FunctionalInterface annotation which (basically) indicates that you can pass a method reference in place of an interface implementation to another method call. The useful one I was playing with today is:
DateTimeFormatter.parse(CharSequence, TemporalQuery<T>)
It's nice as it lets you tell the formatter what kind of result to hand back to you. The javadoc even gives you a nice example:
The query is typically a method reference to a from(TemporalAccessor) method. For example:
LocalDateTime dt = parser.parse(str, LocalDateTime::from);
Once I got my head around what @FunctionalInterface is and means, I started to wonder how a consumer of an API is to figure out what they can actually use in its place. The example above tells you what you can use, and if you trace through the java.time package, you can find other method references you can use. However, any contributor to the API needs to read through the entire javadoc to make sure they don't break any implicit contracts mentioned in other places (of course they should, especially for the JDK, but that's not the purpose of javadoc!)
So, if a contributor to this API were to change the signature of LocalDateTime::from, then there's no compile-time check to say that this method no longer conforms to the FunctionalInterface TemporalQuery. This would obviously break any consumers of the API, who would have to change their code to use an explicit lambda instead. I do understand that such an annotation is not necessary, but if one similar to the optional @Override annotation were available, it would provide some compile-time checks as well as the possibility of introspecting/reflecting to discover available method references.
e.g.
@ConformsTo(TemporalQuery.class)
public static LocalDateTime from(TemporalAccessor temporal)
It would also then be possible to find, through introspection, any other method references that can be used for a FunctionalInterface.
So, to be clear, I understand that this is not necessary, but do think it seems to be an oversight not to include it as an optional Annotation. Is there any particular reason this could/should not exist?
The problems that arise from changing the signature or return type of a method such as LocalDateTime::from aren't limited to functional interfaces. Even before Java 8, changing those things risked breaking existing code that relied on them. That's why designing an API is always a challenge: changes to existing code can mean a lot of work.
Additionally, assuming the functional interface and the matching methods are part of different libraries, would you really want them to be closely coupled, i.e. both needing to change when one changes? What if they are maintained by different organizations (say, different open source projects or companies) - how should they coordinate?
As an example, take Comparator.comparing(Function<? super T, ? extends U> keyExtractor). That basically accepts a reference to any instance method that takes no parameters and returns something comparable. There are so many libraries that already provide such methods - would you want them all to have to add @ConformsTo?
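To illustrate the breadth of that point, none of the following classes declare any conformance to Function, yet the method references match structurally:

import java.util.Comparator;

public class ComparingExamples {
    // String and Thread predate java.util.function entirely, but their
    // no-argument instance methods still work as key extractors.
    static final Comparator<String> BY_LENGTH = Comparator.comparing(String::length);
    static final Comparator<Thread> BY_NAME = Comparator.comparing(Thread::getName);
}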
That said, a @ConformsTo would at best be incomplete and might even be misleading/outdated.
Edit:
Let's tackle both annotations from the view of the compiler.
@FunctionalInterface tells the compiler that it should complain when you define more than one abstract method or use the annotation on something other than an interface.
That means that the contract definition ("this interface is a functional interface") and the implementation (the interface itself) are contained in the same file and thus have to be changed together anyway.
@ConformsTo could tell the compiler to check the requirements of the functional interface (or even interfaces) and see if that method satisfies them.
So far so good, but the problem arises when the interface changes: it would couple the method and the interface, which could be part of different and otherwise totally unrelated libraries. And even if they were part of the same library, you could run into problems when the method itself isn't recompiled: the compiler might miss the incompatibility and thus defeat the purpose of the annotation (if it were only meant for humans, then a simple comment would be sufficient as well).
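If the goal is just compile-time checking within your own codebase, one lightweight convention (a sketch of an idea, not an established pattern) is a static assignment that stops compiling as soon as the method reference no longer matches the functional interface:

import java.time.LocalDateTime;
import java.time.temporal.TemporalQuery;

final class ConformanceChecks {
    // Fails to compile if LocalDateTime::from stops matching
    // TemporalQuery<LocalDateTime> - a poor man's @ConformsTo.
    static final TemporalQuery<LocalDateTime> LOCAL_DATE_TIME_FROM = LocalDateTime::from;

    private ConformanceChecks() {}
}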
From Wikipedia:
reflection is the ability of a computer program to examine and modify the structure and behavior (specifically the values, meta-data, properties and functions) of an object at runtime.
Can anyone give me a concrete example of modifying the structure of an object? I'm aware of the following example.
Object foo = Class.forName("complete.classpath.and.Foo").newInstance();
Method m = foo.getClass().getDeclaredMethod("hello", new Class<?>[0]);
m.invoke(foo);
There are other ways to get the class and examine its structure. But the question is: how is modification done?
Just an additional hint, since the previous answers and comments answer the question concerning reflection.
To really change the structure of a class, and therefore its behaviour, at runtime, look at bytecode instrumentation, in this case the javassist and asm libraries. In any case this is not a trivial task.
Additionally, you might have a look at aspect-oriented programming, which enables you to enhance methods with additional functionality. It is often used to introduce logging without your class needing a dependency on the logging classes, and without invocations of logging methods sprinkled between the problem-related code.
In English reflection means "mirror image".
So I'd disagree with the Wikipedia definition. For me, reflection is about runtime inspection of code, not manipulation.
In Java, you can modify the bytecode at runtime using bytecode manipulation. One well-known and widely used library is CGLIB.
In Java, reflection is not fully supported as defined by Wikipedia.
Only Field.setAccessible(true) or Method.setAccessible(true) really modifies a class, and even then it only changes access checks, not behaviour.
Frameworks like Hibernate use this to add behaviour to a class, e.g. by generating a subclass in bytecode that accesses private fields in the parent class.
Java is still a statically typed language, unlike JavaScript, where you can change any behaviour at runtime.
The only means in reflection (java.lang.reflect) of modifying an object's class behaviour is to change the accessibility flag of Constructor, Method and Field - setAccessible - whatever wiki says. There are, though, libraries like BCEL (http://ru.wikipedia.org/wiki/Byte_Code_Engineering_Library) for decomposing, modifying, and recomposing binary Java classes.
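To make the setAccessible point concrete, here is a minimal sketch (the Greeter class and its field are made up for illustration) that modifies a private field's value at runtime - state, not structure:

import java.lang.reflect.Field;

public class ReflectionDemo {
    static class Greeter {
        private String greeting = "hello";
        String greet() { return greeting; }
    }

    public static void main(String[] args) throws Exception {
        Greeter g = new Greeter();
        Field f = Greeter.class.getDeclaredField("greeting");
        f.setAccessible(true);          // lift the access check
        f.set(g, "goodbye");            // overwrite the private field
        System.out.println(g.greet());  // prints "goodbye"
    }
}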
C++ has multiple inheritance. The implementation of multiple inheritance at the assembly level can be quite complicated, but there are good descriptions online on how this is normally done (vtables, pointer fixups, thunks, etc).
Java doesn't have multiple implementation inheritance, but it does have multiple interface inheritance, so I don't think a straightforward implementation with a single vtable per class can implement that. How does Java implement interfaces internally?
I realize that, contrary to C++, Java is JIT-compiled, so different pieces of code might be optimized differently, and different JVMs might do things differently. So, is there some general strategy that many JVMs follow on this, or does anyone know the implementation in a specific JVM?
Also JVMs often devirtualize and inline method calls in which case there are no vtables or equivalent involved at all, so it might not make sense to ask about actual assembly sequences that implement virtual/interface method calls, but I assume that most JVMs still keep some kind of general representation of classes around to use if they haven't been able to devirtualize everything. Is this assumption wrong? Does this representation look in any way like a C++ vtable? If so do interfaces have separate vtables and how are these linked with class vtables? If so can object instances have multiple vtable pointers (to class/interface vtables) like object instances in C++ can? Do references of a class type and an interface type to the same object always have the same binary value or can these differ like in C++ where they require pointer fixups?
(for reference: this question asks something similar about the CLR, and there appears to be a good explanation in this msdn article though that may be outdated by now. I haven't been able to find anything similar for Java.)
Edit:
I mean 'implements' in the sense of "How does the GCC compiler implement integer addition / function calls / etc", not in the sense of "Java class ArrayList implements the List interface".
I am aware of how this works at the JVM bytecode level, what I want to know is what kind of code and datastructures are generated by the JVM after it is done loading the class files and compiling the bytecode.
The key feature of the HotSpot JVM is inline caching. This doesn't actually mean that the target method is inlined, but means that an assumption is put into the JIT code that every future call to the virtual or interface method will target the very same implementation (i.e. that the call site is monomorphic). In this case, a check is compiled into the machine code that verifies whether the assumption actually holds (i.e. whether the type of the target object is the same as it was last time), and control then transfers directly to the target method - with no virtual tables involved at all. If the check fails, an attempt may be made to convert this to a megamorphic call site (i.e. one with multiple possible types); if this also fails (or if it is the first call), a regular long-winded lookup is performed, using vtables (for virtual methods) and itables (for interfaces).
Edit: The Hotspot Wiki has more details on the vtable and itable stubs. In the polymorphic case, it still puts an inline cache version into the call site. However, the code actually is a stub that performs a lookup in a vtable, or an itable. There is one vtable stub for each vtable offset (0, 1, 2, ...). Interface calls add a linear search over an array of itables before looking into the itable (if found) at the given offset.
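As a rough illustration of what "monomorphic" versus "polymorphic" means at a call site (no special API involved - HotSpot makes these decisions on its own at runtime; the classes here are made up), consider:

interface Shape { double area(); }

final class Circle implements Shape {
    private final double r;
    Circle(double r) { this.r = r; }
    public double area() { return Math.PI * r * r; }
}

final class Square implements Shape {
    private final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

public class CallSites {
    static double sum(Shape[] shapes) {
        double total = 0;
        for (Shape s : shapes) {
            // If only Circles ever reach this call site, it stays monomorphic
            // and HotSpot can call (or inline) Circle.area() behind a single
            // type check. Once Squares show up too, it degrades toward the
            // itable lookup described above.
            total += s.area();
        }
        return total;
    }
}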
[Later: Still can't figure out if Groovy has static typing (seems that it does not) or if the bytecode generated using explicit typing is different (seems that it is). Anyway, on to the question]
One of the main differences between Groovy and other dynamic languages -- or at least Ruby -- is that you can statically explicitly type variables when you want to.
That said, when should you use static typing in Groovy? Here are some possible answers I can think of:
Only when there's a performance problem. Statically typed variables are faster in Groovy. (or are they? some questions about this link)
On public interfaces (methods, fields) for classes, so you get autocomplete. Is this possible/true/totally wrong?
Never, it just clutters up code and defeats the purpose of using Groovy.
Yes when your classes will be inherited or used
I'm not just interested in what YOU do but more importantly what you've seen around in projects coded in Groovy. What's the norm?
Note: If this question is somehow wrong or misses some categories of static-dynamic, let me know and I'll fix it.
In my experience, there is no norm. Some use types a lot, some never use them. Personally, I always try to use types in my method signatures (for params and return values). For example, I always write a method like this:
Boolean doLogin(User user) {
// implementation omitted
}
Even though I could write it like this
def doLogin(user) {
// implementation omitted
}
I do this for these reasons:
Documentation: other developers (and myself) know what types will be provided and returned by the method without reading the implementation
Type Safety: although there is no compile-time checking in Groovy, if I call the statically typed version of doLogin with a non-User parameter it will fail immediately, so the problem is likely to be easy to fix. If I call the dynamically typed version, it will fail some time after the method is invoked, and the cause of the failure may not be immediately obvious.
Code Completion: this is particularly useful when using a good IDE (e.g. IntelliJ), as it can even provide completion for dynamically added methods such as a domain class's dynamic finders
I also use types quite a bit within the implementation of my methods for the same reasons. In fact the only times I don't use types are:
I really want to support a wide range of types. For example, a method that converts a string to a number could also convert a collection or array of strings to numbers
Laziness! If the scope of a variable is very short, I already know which methods I want to call, and I don't already have the class imported, then declaring the type seems like more trouble than it's worth.
BTW, I wouldn't put too much faith in that blog post you've linked to claiming that typed Groovy is much faster than untyped Groovy. I've never heard that before, and I didn't find the evidence very convincing.
I worked on several Groovy projects and we stuck to these conventions:
All types in public methods must be specified.
public int getAgeOfUser(String userName){
...
}
All private variables are declared using the def keyword.
These conventions allow you to achieve many things.
First of all, if you use joint compilation, your Java code will be able to interact with your Groovy code easily. Secondly, such explicit declarations make code in large projects more readable and maintainable. And of course, auto-completion is an important benefit too.
On the other hand, the scope of a method is usually small enough that you don't need to declare types explicitly. By the way, modern IDEs can auto-complete your local variables even if you use def.
I have seen type information used primarily in service classes for public methods. Depending on how complex the parameter list is, even here I usually see just the return type typed. For example:
class WorkflowService {
....
WorkItem getWorkItem(processNbr) throws WorkflowException {
...
...
}
}
I think this is useful because it explicitly tells the user of the service what type they will be dealing with, and it does help with code assist in IDEs.
Groovy does not support static typing. See it for yourself:
class Foo {}
class Bar {}
public Foo func(Bar bar) {
    return bar // a Bar where Foo is declared - yet this compiles
}
println("no static typing")
Save and compile that file and run it.