What's invokedynamic and how do I use it? - java

I keep hearing about all the cool new features being added to the JVM, and one of them is invokedynamic. I would like to know what it is and how it makes reflective programming in Java easier or better.

It is a new JVM instruction which allows a compiler to generate code which calls methods with a looser specification than was previously possible -- if you know what "duck typing" is, invokedynamic basically allows for duck typing. There's not too much you as a Java programmer can do with it; if you're a tool creator, though, you can use it to build more flexible, more efficient JVM-based languages. Here is a really sweet blog post that gives a lot of detail.

As part of my Java Records article, I wrote about the motivation behind invokedynamic. Let's start with a rough definition of Indy.
Introducing Indy
Invoke Dynamic (also known as Indy) was part of JSR 292, which aimed to enhance JVM support for dynamically typed languages. Since its first release in Java 7, the invokedynamic opcode and its java.lang.invoke baggage have been used quite extensively by dynamic JVM-based languages like JRuby.
Although indy was specifically designed to enhance dynamic language support, it offers much more than that. As a matter of fact, it's suitable wherever a language designer needs any form of dynamicity, from dynamic type acrobatics to dynamic strategies!
For instance, the Java 8 Lambda Expressions are actually implemented using invokedynamic, even though Java is a statically typed language!
User-Definable Bytecode
For quite some time the JVM supported four method invocation types: invokestatic for static methods, invokeinterface for interface methods, invokespecial for constructors, super() calls, and private methods, and invokevirtual for instance methods.
Despite their differences, these invocation types share one common trait: we can't enrich them with our own logic. In contrast, invokedynamic enables us to bootstrap the invocation process in any way we want. The JVM then takes care of calling the bootstrapped method directly.
How Does Indy Work?
The first time the JVM sees an invokedynamic instruction, it calls a special static method called the bootstrap method. The bootstrap method is a piece of Java code that we've written to prepare the actual to-be-invoked logic.
The bootstrap method then returns an instance of java.lang.invoke.CallSite. This CallSite holds a reference to the actual method, i.e. a MethodHandle.
From then on, every time the JVM sees this invokedynamic instruction again, it skips the slow path and directly calls the underlying executable. The JVM continues to skip the slow path unless something changes.
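Since we can't emit an invokedynamic instruction from Java source, the linkage step can only be simulated; the sketch below (with a made-up greet target) shows the shape of a bootstrap method and how the resulting CallSite is invoked:

```java
import java.lang.invoke.*;

public class BootstrapDemo {
    // Shape of a bootstrap method: the JVM would call this once, the first
    // time it hits the invokedynamic instruction, to link the call site.
    public static CallSite bootstrap(MethodHandles.Lookup lookup,
                                     String name,
                                     MethodType type) throws Exception {
        MethodHandle target = lookup.findStatic(BootstrapDemo.class, "greet",
                MethodType.methodType(String.class, String.class));
        return new ConstantCallSite(target.asType(type));
    }

    public static String greet(String who) {
        return "Hello, " + who;
    }

    public static void main(String[] args) throws Throwable {
        // Simulate the linkage by calling the bootstrap method by hand.
        MethodType type = MethodType.methodType(String.class, String.class);
        CallSite site = bootstrap(MethodHandles.lookup(), "greet", type);
        System.out.println(site.dynamicInvoker().invoke("Indy")); // Hello, Indy
    }
}
```

A real invokedynamic instruction caches the returned CallSite at the call site itself, which is exactly how the JVM skips the slow path on subsequent calls.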
Example: Java 14 Records
Java 14 Records provide a nice compact syntax for declaring classes that are supposed to be dumb data holders.
Considering this simple record:
public record Range(int min, int max) {}
The bytecode for this example would be something like:
Compiled from "Range.java"
public java.lang.String toString();
descriptor: ()Ljava/lang/String;
flags: (0x0001) ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokedynamic #18, 0 // InvokeDynamic #0:toString:(LRange;)Ljava/lang/String;
6: areturn
In its Bootstrap Method Table:
BootstrapMethods:
0: #41 REF_invokeStatic java/lang/runtime/ObjectMethods.bootstrap:
(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;
Ljava/lang/invoke/TypeDescriptor;Ljava/lang/Class;
Ljava/lang/String;[Ljava/lang/invoke/MethodHandle;)Ljava/lang/Object;
Method arguments:
#8 Range
#48 min;max
#50 REF_getField Range.min:I
#51 REF_getField Range.max:I
So the bootstrap method for Records is called bootstrap, and it resides in the java.lang.runtime.ObjectMethods class. As you can see, this bootstrap method expects the following parameters:
An instance of MethodHandles.Lookup representing the lookup context
(The Ljava/lang/invoke/MethodHandles$Lookup part).
The method name (i.e. toString, equals, hashCode, etc.) the bootstrap
is going to link. For example, when the value is toString, bootstrap
will return a ConstantCallSite (a CallSite that never changes) that
points to the actual toString implementation for this particular
Record.
The TypeDescriptor for the method (Ljava/lang/invoke/TypeDescriptor
part).
A type token, i.e. Class<?>, representing the Record class type. It’s
Class<Range> in this case.
A semi-colon separated list of all component names, i.e. min;max.
One MethodHandle per component. This way the bootstrap method can
create a MethodHandle based on the components for this particular
method implementation.
The invokedynamic instruction passes all those arguments to the bootstrap method, which in turn returns an instance of ConstantCallSite. This ConstantCallSite holds a reference to the requested method implementation, e.g. toString.
Why Indy?
As opposed to the Reflection API, the java.lang.invoke API is quite efficient since the JVM can completely see through all invocations. Therefore, the JVM may apply all sorts of optimizations, as long as we avoid the slow path as much as possible!
In addition to the efficiency argument, the invokedynamic approach is more reliable and less brittle because of its simplicity.
Moreover, the generated bytecode for Java Records is independent of the number of properties. So, less bytecode and faster startup time.
Finally, let's suppose a new version of Java ships a new, more efficient bootstrap method implementation. With invokedynamic, our app can take advantage of this improvement without recompilation. This gives us a sort of forward binary compatibility. Also, that's the dynamic strategy we were talking about!
Other Examples
In addition to Java Records, invokedynamic has been used to implement features like:
Lambda Expressions in Java 8+: LambdaMetafactory
String Concatenation in Java 9+: StringConcatFactory

Some time ago, C# added a cool feature, dynamic syntax within C#
Object obj = ...; // no static type available
dynamic duck = obj;
duck.quack(); // or any method. no compiler checking.
Think of it as syntactic sugar for reflective method calls. It can have very interesting applications. See http://www.infoq.com/presentations/Statically-Dynamic-Typing-Neal-Gafter
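In Java terms, the reflective call that dynamic sugars over looks roughly like this sketch (the Duck class and the call helper are made up for illustration):

```java
import java.lang.reflect.Method;

public class DuckTyping {
    public static class Duck {
        public String quack() { return "quack!"; }
    }

    // Roughly what C#'s dynamic desugars to: look the method up by name
    // at runtime, with no compile-time checking at all.
    public static Object call(Object receiver, String name) throws Exception {
        Method m = receiver.getClass().getMethod(name);
        return m.invoke(receiver);
    }

    public static void main(String[] args) throws Exception {
        Object obj = new Duck();                  // no static type available
        System.out.println(call(obj, "quack"));   // quack!
    }
}
```

If the receiver has no such method, getMethod throws NoSuchMethodException at runtime, which is exactly the failure mode a dynamic call moves from compile time to run time.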
Neal Gafter, who's responsible for C#'s dynamic type, had just moved from Sun to Microsoft. So it's not unreasonable to think that the same ideas had been discussed inside Sun.
I remember soon after that, some Java dude announced something similar
InvokeDynamic duck = obj;
duck.quack();
Unfortunately, the feature is nowhere to be found in Java 7. Very disappointing. Java programmers have no easy way to take advantage of invokedynamic in their programs.

There are two concepts to understand before continuing to invokedynamic.
1. Static vs. Dynamic Typing
Static - performs type checking at compile time (e.g. Java)
Dynamic - performs type checking at runtime (e.g. JavaScript)
Type checking is the process of verifying that a program is type safe, that is, checking the type information for class and instance variables, method parameters, return values, and other variables.
E.g. Java knows about int, String, ... at compile time, while the type of an object in JavaScript can only be determined at runtime.
2. Strong vs. Weak typing
Strong - specifies restrictions on the types of values supplied to its operations (e.g. Java)
Weak - converts (casts) arguments of an operation if those arguments have incompatible types (e.g. Visual Basic)
Knowing that Java is statically typed, how do you implement dynamically and strongly typed languages on the JVM?
The invokedynamic implements a runtime system that can choose the most appropriate implementation of a method or function — after the program has been compiled.
Example:
Having (a + b) and not knowing anything about the variables a,b at compile time, invokedynamic maps this operation to the most appropriate method in Java at runtime. E.g., if it turns out a,b are Strings, then call method(String a, String b). If it turns out a,b are ints, then call method(int a, int b).
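A bootstrap method performing this kind of runtime selection can be sketched with the java.lang.invoke API; the add overloads and the link helper below are invented for illustration:

```java
import java.lang.invoke.*;

public class DynamicAdd {
    public static int add(int a, int b) { return a + b; }
    public static String add(String a, String b) { return a + b; }

    // Pick the most appropriate overload based on the runtime types,
    // the way a bootstrap method for a dynamic language might.
    public static MethodHandle link(Object a, Object b) throws Exception {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        if (a instanceof Integer && b instanceof Integer) {
            return lookup.findStatic(DynamicAdd.class, "add",
                    MethodType.methodType(int.class, int.class, int.class));
        }
        return lookup.findStatic(DynamicAdd.class, "add",
                MethodType.methodType(String.class, String.class, String.class));
    }

    public static void main(String[] args) throws Throwable {
        Object a = 1, b = 2;
        System.out.println(link(a, b).invoke(a, b));         // 3
        Object s = "foo", t = "bar";
        System.out.println(link(s, t).invoke(s, t));         // foobar
    }
}
```

An actual invokedynamic call site would do this selection once in its bootstrap method and cache the resulting handle, instead of re-linking on every call as this sketch does.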
invokedynamic was introduced with Java 7.

The short answer is that invokedynamic is a new opcode in the JVM that didn't exist prior to Java 7.
As far as reflection goes, take this definition: Java Reflection is a process of examining or modifying the runtime behavior of a class at run time. However, I believe more explanation is needed.
From the article below:
For example, reflection predates both collections and generics. As a
result, method signatures are represented by Class[] in the Reflection
API. This can be cumbersome and error-prone, and it is hampered by the
verbose nature of Java’s array syntax. It is further complicated by
the need to manually box and unbox primitive types and to work around
the possibility of void methods.
Method handles to the rescue
Instead of forcing the programmer to deal
with these issues, Java 7 introduced a new API, called MethodHandles,
to represent the necessary abstractions. The core of this API is the
package java.lang.invoke and especially the class MethodHandle.
Instances of this type provide the ability to call a method, and they
are directly executable. They are dynamically typed according to their
parameter and return types, which provides as much type safety as
possible, given the dynamic way in which they are used. The API is
needed for invokedynamic, but it can also be used alone, in which case
it can be considered a modern, safe alternative to reflection.
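A minimal example of the MethodHandle API described above, using String.length as the target (the class name is made up):

```java
import java.lang.invoke.*;

public class HandleVsReflection {
    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();

        // A strongly typed description of String.length():
        // a handle of type (String)int -- receiver first, then parameters.
        MethodHandle length = lookup.findVirtual(String.class, "length",
                MethodType.methodType(int.class));

        // invokeExact requires the exact static types:
        // no boxing, no Object[] argument array, unlike Method.invoke.
        int n = (int) length.invokeExact("hello");
        System.out.println(n); // 5
    }
}
```

Contrast this with reflection, where invoke takes an Object[] and boxes the int return value; the handle's type is checked once at lookup time rather than on every call.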
Quoting from Understanding Java method invocation with invokedynamic
These four are the bytecode representations of the standard forms of
method invocation used in Java 8 and Java 9, and they are
invokevirtual, invokespecial, invokeinterface, and invokestatic.
This raises the question of how the fifth opcode, invokedynamic,
enters the picture. The short answer is that, as of Java 9, there was
no direct support for invokedynamic in the Java language.
In fact, when invokedynamic was added to the runtime in Java 7, the
javac compiler would not emit the new bytecode under any circumstances
whatsoever.
As of Java 8, invokedynamic is used as a primary implementation
mechanism to provide advanced platform features. One of the clearest
and simplest examples of this use of the opcode is in the
implementation of lambda expressions.
So again, invokedynamic is a new opcode that allows for a new object reference type in Java: a lambda.

Related

In Java what is the point of invokevirtual, if anyway the index of a method in a method table is known at compile-time?

I am reading about dynamic dispatch, as I have an exam tomorrow.
In C++ we have conforming subclasses, so through the static type of the identifier we know what index to access in the virtual method table of the runtime object.
From what I am reading, Java has conformance for subclasses as well, but instead of including the known index of a method in the virtual method table in the compiled code, it only includes a symbolic reference to the method, that needs to be resolved.
What is the point of this if the static type does not refer to an interface? It could be much faster to do it the C++ way.
The Java platform defines linkage as a step taken at runtime. Virtual method tables aren't even involved in the JVM specification; they are just a typical way to implement linkage.
Note, however, that after the symbolic reference is resolved into a direct reference, there is nothing stopping the runtime from using very fast code paths for method invocation sites. That includes special-case optimizations such as monomorphic call sites, which have a hardwired direct pointer to the method code and are thus faster than vtable lookups. Monomorphic sites then become an easy target for method inlining, which opens a whole new field of applicable optimizations. Another option is an n-polymorphic site, accommodating up to n different target types in an inline cache.
As opposed to C++, all these optimizing decisions happen at runtime, subject to the specific conditions at work: the exact set of loaded classes, profiling data for each individual call site, etc. This gives managed-runtime platforms such as Java advantages of their own.

What scala statements or code can produce a byte-code which can not be translated to java?

I have read an answer to a question about converting Scala code to Java code. It says:
I don't think it's possible to convert from scala back to standard java since Scala does some pretty low-level byte-code manipulation. I'm 90% sure they do some things that can't exactly be translated back into normal Java code.
So what Scala statements or code can produce bytecode which can not be translated to java?
P.S. I generally agree with that answer, but want a concrete example for learning purposes.
The answer really depends on how hard you want to try to convert the code.
Since Java and Scala are both Turing complete, any program in one can trivially be converted to the other, but this isn't really interesting or useful.
What you really want is to convert the results to readable, idiomatic code. From this perspective, even Java code can't automatically be converted to Java because compilation loses information (though relatively little compared to C) and machines aren't as good as humans at writing human readable code anyway.
If you got a Java and Scala expert, they could probably rewrite your Scala codebase in Java and end up with reasonably idiomatic Java code. But it wouldn't be as readable as Scala due to the simple fact that Scala is a language designed to improve on Java. Scala tries to remove the warts from Java and provide powerful high level programming features, removing the need for all the classic Java boilerplate. So the Java equivalent codebase will not be as readable.
From this perspective, the answer is "any feature in Scala that is not in Java".
Scala's nested blocks do not have a Java equivalent.
Nested block in Scala (taken from this question):
def apply(x: Boolean) = new Tuple2(null, {
while (x) { }
null
})
Produces the bytecode
0: new #12 // class scala/Tuple2
3: dup
4: aconst_null
5: iload_1
6: ifne 5
9: aconst_null
10: invokespecial #16 // Method scala/Tuple2."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V
13: areturn
At instruction 0 an uninitialised object is pushed onto the stack, and then initialised at instruction 10. Between these two points there is a backwards jump from 6 to 5. This actually reveals a bug in the OpenJDK bytecode verifier as it rejects this code despite the fact that it is acceptable by the JVM specifications. This probably got through testing as this bytecode can't be generated from Java.
As nested blocks in Java are not expressions that evaluate to a value, the closest Java equivalent would be
public Tuple2 apply(boolean x){
while(x){}
return new Tuple2(null,null);
}
Which would compile to something akin to
0: iload_1
1: ifne 0
3: new #12 // class scala/Tuple2
6: dup
7: aconst_null
8: dup
9: invokespecial #16 // Method scala/Tuple2."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V
12: areturn
Note that this doesn't have the uninitialised object on the stack at the time of the backwards jump. (N.B. bytecode was written by hand, do not execute!)
This paper from Li, White, and Singer shows differences in JVM languages including the bytecode that they compile to. It finds that in an N-gram analysis of bytecodes that 58.5% of 4-grams executed by Scala are not found in bytecode executed by Java. This is not to say that Java can't produce these bytecodes, but that they weren't present in the Java corpus.
As you noted, Scala eventually compiles to JVM bytecode. An obvious instruction from the JVM instruction set, that has no equivalent in the Java language, is goto.
A Scala compiler might use goto for instance to optimize loops or tail-recursive methods. In this case, in Java you would have to emulate the behavior of a goto.
As Antimony hinted, a Turing complete language can at least emulate another Turing complete language. However the resulting program may be heavyweight and suboptimal.
As a final note, decompilers may help. I'm not familiar with the intrinsics of decompilers, but I assume that they rely a lot on patterns. I mean, for example, Java source pattern f(x) compiles to Bytecode pattern f'(x), so with a lot of hard work and experience, some manage to decompile Bytecode f'(y) to Java source f(y).
However, I've not heard of Scala decompilers yet (maybe someone's working on that).
[EDIT] About what I originally meant by emulating the behavior of a goto:
I had in mind switch/case statements inside a loop, and cdshines showed another way by using labeled break/continue in a loop (though I believe that using "disregarded and condemned" features is not standard).
In either of these cases, in order to jump back to an earlier instruction, an idiomatic Java loop (for/while/do-while) is required (any other suggestion?). An endless loop makes it easy to implement, a conditional loop would require more work, but I assume this is doable.
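A small sketch of the labeled break/continue technique mentioned above; the three-state machine is a made-up example:

```java
public class GotoEmulation {
    // A labeled while/switch pair emulating goto: 'continue dispatch'
    // is the backward jump, 'break dispatch' the forward jump out.
    public static String run() {
        StringBuilder out = new StringBuilder();
        int state = 0;
        dispatch:
        while (true) {
            switch (state) {
                case 0: out.append("start;");  state = 1; continue dispatch;
                case 1: out.append("middle;"); state = 2; continue dispatch;
                default: break dispatch; // forward jump past the loop
            }
        }
        out.append("done");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // start;middle;done
    }
}
```

This covers jumps within one dispatch loop; arbitrary goto targets scattered through a method would need the whole body rewritten into this state-machine form, which is the complexity the answer alludes to.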
Also, goto isn't limited to loops. In order to jump forward, Java would require other constructs.
A counterexample: in C, there are limitations but you don't have to go through such great lengths, because there's a goto instruction.
As a related topic, if you're interested in non-idiomatic jumps in Scala, c.f. this old Q&A of mine. My point being, not only a Scala compiler might emit goto in a way that's not natural in Java, but a developer can have a tight control on that with the help of Scala macros.
LabelDef: A labelled expression. Not expressible in language syntax, but generated by the compiler to simulate while/do-while loops, and also by the pattern matcher. In my past tests, it could be used for forward jumps as well. In Scala Internals, developers wrote about removing LabelDef, but I don't know if and when they will.
Therefore, yes you can reproduce the behavior of goto in Java, but because of the complexity involved in so doing, that is not what I would call standard Java, IMHO. Maybe my wording is incorrect, but in my mind the reproduction of an elementary behavior by complex means is an "emulation" of that behavior.
Cheers.
It really depends on how you define So what Scala statements or code can produce bytecode which can not be translated to java?.
Ultimately, some Scala features are backed by the so-called ScalaSignature (scala signature), which stores meta information. As of 2.10, it may be deemed a secret API that is abstracted by the Scala reflection mechanisms (which are radically different from Java reflection). The documentation is scarce, but you can check out this pdf to get the details (there could have been major changes since then). There is no way to produce identical structures in native Java, unless you fall back to bytecode manipulation tools.
In a more relaxed sense, there are macros and implicits, which interact solely with scalac and have no direct analog in Java. Yes, you can write Java code identical to the result produced by scalac, but you can't write these dynamic instructions that direct the compiler.
I happen to work with a lot of byte code and I once wrote a summary of byte code features that are not reproducible by writing Java code. However, all these non-existing features are rather conventions for composing byte code instructions. By Java 8, every existing opcode was used by the Java class file format. This is not too surprising, as the Java language sort of drives the evolution of the Java byte code format. An exception might be the INVOKEDYNAMIC instruction, which was introduced to better support dynamic languages on the JVM, but even this instruction is used in Java 8 for implementing lambda expressions. Thus, there might be combinations or orders of byte code instructions that are not produced by the javac compiler, but there is no specific instruction that is only used by other JVM languages.
Of the byte code features that I named in the summary, I would most notably say that throwing undeclared checked exceptions without catching them is a feature that is supported by Scala but not by Java. Otherwise, I would say that there is no low-level byte code manipulation by scalac that is unknown to javac. In my experience, most Scala classes can also be written explicitly in Java.
I think there is no such code. AFAIK there is only one JVM instruction that Java cannot generate: invokedynamic. This instruction is for dynamic languages, and Scala is a statically typed language, which means it cannot generate it either. So it is possible to translate Scala code to Java code, though probably unreadable Java code.

Why most of the java.lang.reflect.Array class methods are 'native'

I have gone through What code and how does java.lang.reflect.Array create a new array at runtime?. I understand that they are implemented in native code ('C'), but my question is why almost all methods of the java.lang.reflect.Array class are native.
My guess and understanding is that it is
to improve performance, or to let the JVM allocate contiguous memory for arrays?
Is my understanding of the native methods in the Array class correct, or am I missing anything?
The reflect.Array.newInstance method uses native code because it must use native code. This has nothing inherently to do with performance but is a result of the fact that the Java language cannot express this operation.
To show that it's a language limitation and not strictly related to performance, here is some valid code which creates a new array without directly invoking any native method.
Object x = new String[0];
However, newInstance takes an arbitrary value of Class<?> and then creates the corresponding array of the represented type. This construct is not possible in plain Java and cannot be expressed by the type system or the corresponding normal "new array" syntax.
// This production is NOT VALID in Java, as T is not a type
// (T is variable that evaluates to an object representing a type)
Class<?> T = String.class;
Object x = new T[0];
// -> error: cannot find symbol T
Because such a production is not allowed, a native method (which has access to the JVM internals) is used to create the new array instance of the corresponding type.
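For illustration, here is the runtime equivalent that Array.newInstance provides for the invalid new T[0] production above:

```java
import java.lang.reflect.Array;

public class NewInstanceDemo {
    public static void main(String[] args) {
        // The native method does what 'new T[3]' cannot express in Java:
        // create an array whose element type is only known at runtime.
        Class<?> t = String.class;
        Object x = Array.newInstance(t, 3);        // a String[3], made at runtime

        System.out.println(x.getClass());          // class [Ljava.lang.String;
        Array.set(x, 0, "hi");                     // reflective element access
        System.out.println(Array.get(x, 0));       // hi
        System.out.println(((String[]) x).length); // 3 -- it really is a String[]
    }
}
```

Note the cast at the end: the returned Object is a genuine String[], so the only thing the native method adds is the ability to choose the element type dynamically.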
While the above argues for the case of newInstance needing to be native, I believe many of the other reflect.Array methods (which are get/set methods) could be handled in plain Java with the use of specialized casting; in these cases the argument for performance holds sway.
However, most code does not use the Array reflection (this includes "multi-valued data structures" such as ArrayList), but simply uses normal Java array access which is directly translated to the appropriate Java bytecode without going through reflect.Array or the native methods it uses.
Conclusion:
Java already provides fast array access through the JVM's execution of the bytecode. HotSpot, the "official" JVM, is written in C++ which is "native" code - but this execution of array-related bytecode is independent of reflect.Array and the use native methods.
newInstance uses a native method because it must use a native method or otherwise dynamically generate and execute bytecode.
Other reflect.Array methods that could be expressed in Java are native methods for a combination of performance, dispatch simplicity, and "why not" - it's just as easy to add a second or third native method.
Arrays are at the heart of all multi-valued data structures. Arrays require using segments of memory on the host machine, which means accessing memory in a safe, and machine specific manner - that requires calls to the underlying operating system.
Such calls are native because to perform them you must move out of java and into the host environment to complete them. At some point every operation must be handed over to the host machine to actually implement it using the local OS and hardware.

MethodHandle - What is it all about?

I am studying the new features of JDK 1.7 and I just can't figure out what MethodHandle is designed for. I understand (direct) invocation of a static method (and the use of the Core Reflection API, which is straightforward in this case). I also understand (direct) invocation of a virtual method (non-static, non-final) (and the use of the Core Reflection API, which requires walking the class hierarchy via obj.getClass().getSuperclass()). Invocation of a non-virtual method can be treated as a special case of the former.
Yes, I am aware that there is an issue with overloading. If you want to invoke a method, you have to supply the exact signature, and there is no easy way to check for overloaded methods.
But what is MethodHandle about? The Reflection API allows you to "look at" an object's internals without any pre-assumptions (like an implemented interface). You can inspect the object for some purpose. But what is MethodHandle designed for? Why and when should I use it?
UPDATE: I am now reading this article: http://blog.headius.com/2008/09/first-taste-of-invokedynamic.html. According to it, the main goal is to simplify life for scripting languages that run atop the JVM, not for the Java language itself.
UPDATE-2: I finished reading the link above; some quotations from it:
The JVM is going to be the best VM for building dynamic languages, because it already is a dynamic language VM. And InvokeDynamic, by promoting dynamic languages to first-class JVM citizens, will prove it.
Using reflection to invoke methods works great...except for a few problems. Method objects must be retrieved from a specific type, and can't be created in a general way.<...>
...reflected invocation is a lot slower than direct invocation. Over the years, the JVM has gotten really good at making reflected invocation fast. Modern JVMs actually generate a bunch of code behind the scenes to avoid much of the overhead old JVMs dealt with. But the simple truth is that reflected access through any number of layers will always be slower than a direct call, partially because the completely generified "invoke" method must check and re-check receiver type, argument types, visibility, and other details, but also because arguments must all be objects (so primitives get object-boxed) and must be provided as an array to cover all possible arities (so arguments get array-boxed).
The performance difference may not matter for a library doing a few reflected calls, especially if those calls are mostly to dynamically set up a static structure in memory against which it can make normal calls. But in a dynamic language, where every call must use these mechanisms, it's a severe performance hit.
http://blog.headius.com/2008/09/first-taste-of-invokedynamic.html
So, for a Java programmer it is essentially useless. Am I right? From this point of view, it can only be considered an alternative to the Core Reflection API.
UPDATE-2020: Indeed, MethodHandle can be thought of as a more powerful alternative to the Core Reflection API. Starting with JDK 8 there are also Java language features that use it.
With MethodHandles you can curry methods, change the types of parameters, and change their order.
Method Handles can handle both methods and fields.
Another trick MethodHandles can do is use primitives directly (rather than via wrappers).
MethodHandles can be faster than reflection as there is more direct support in the JVM, e.g. they can be inlined. They work together with the new invokedynamic instruction.
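A short sketch of the currying and reordering mentioned above, using MethodHandles.insertArguments and MethodHandles.permuteArguments (the describe method is made up):

```java
import java.lang.invoke.*;

public class HandleTransforms {
    public static String describe(String name, int age) {
        return name + " is " + age;
    }

    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodHandle describe = lookup.findStatic(HandleTransforms.class, "describe",
                MethodType.methodType(String.class, String.class, int.class));

        // "Currying": pre-bind the first argument, leaving an (int)String handle.
        MethodHandle describeAda = MethodHandles.insertArguments(describe, 0, "Ada");
        System.out.println((String) describeAda.invokeExact(36)); // Ada is 36

        // Reordering: a new (int, String) handle delegating to (String, int).
        MethodHandle swapped = MethodHandles.permuteArguments(describe,
                MethodType.methodType(String.class, int.class, String.class),
                1, 0); // target argument i is taken from new position reorder[i]
        System.out.println((String) swapped.invokeExact(36, "Ada")); // Ada is 36
    }
}
```

Each transform returns a new handle; the original describe handle is untouched, so these adapters compose freely.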
Think of MethodHandle as a modern, more flexible, more typesafe way of doing reflection.
It's currently in the early stages of its lifecycle, but over time it has the potential to be optimized to become much faster than reflection, to the point that it can become as fast as a regular method call.
java.lang.reflect.Method is relatively slow and expensive in terms of memory. Method handles are supposed to be a "lightweight" way of passing around pointers to functions that the JVM has a chance of optimising. As of JDK8 method handles aren't that well optimised, and lambdas are likely to be initially implemented in terms of classes (as inner classes are).
Almost 9 years have passed since I asked this question.
JDK 14 is the latest stable version that makes massive use of MethodHandle...
I've created a mini-series of articles about invokedynamic: https://alex-ber.medium.com/explaining-invokedynamic-introduction-part-i-1079de618512. Below, I'm quoting the relevant parts from them.
MethodHandle can be thought of as a more powerful alternative to the Core Reflection API. A MethodHandle is an object that stores metadata about a method (constructor, field, or similar low-level operation), such as the method's name, its signature, and so on. One way to look at it is as the destination of a pointer to a method (a de-referenced method, constructor, field, or similar low-level operation).
Java code can create a method handle that directly accesses any method, constructor, or field that is accessible to that code. This is done via a reflective, capability-based API called MethodHandles.Lookup. For example, a static method handle can be obtained from Lookup.findStatic. There are also conversion methods from Core Reflection API objects, such as Lookup.unreflect.
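A small example of Lookup.unreflect bridging from Core Reflection to a method handle; note that the access check happens at unreflect time, not on every call:

```java
import java.lang.invoke.*;
import java.lang.reflect.Method;

public class UnreflectDemo {
    public static void main(String[] args) throws Throwable {
        // Start from Core Reflection...
        Method m = String.class.getMethod("toUpperCase");

        // ...and convert it to a MethodHandle. The access check
        // is performed here, once, against this lookup's context.
        MethodHandle h = MethodHandles.lookup().unreflect(m);

        // Later calls skip the per-invocation checks of Method.invoke.
        System.out.println((String) h.invokeExact("indy")); // INDY
    }
}
```

The resulting handle has type (String)String: the receiver becomes the leading parameter, which is why invokeExact is called with the string as its first argument.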
It is important to understand 2 key differences between the Core Reflection API and MethodHandle.
With MethodHandle the access check is done only once, at construction time; with the Core Reflection API it is done on every call to the invoke method (and the Security Manager is invoked each time, slowing down performance).
The Core Reflection API's invoke method is a regular method. In MethodHandle, all invoke* variants are signature-polymorphic methods.
Basically, the access check determines whether you can access a method (constructor, field, or similar low-level operation). For example, if the method is private, you can't normally invoke it (or get the value of the field).
As opposed to the Reflection API, the JVM can completely see through MethodHandles and will try to optimize them, hence the better performance.
Note: With MethodHandle you can also generate implementation logic. See Dynamical hashCode implementation. Part V https://alex-ber.medium.com/explaining-invokedynamic-dynamical-hashcode-implementation-part-v-16eb318fcd47 for details.

How are java interfaces implemented internally? (vtables?)

C++ has multiple inheritance. The implementation of multiple inheritance at the assembly level can be quite complicated, but there are good descriptions online on how this is normally done (vtables, pointer fixups, thunks, etc).
Java doesn't have multiple implementation inheritance, but it does have multiple interface inheritance, so I don't think a straightforward implementation with a single vtable per class can implement that. How does Java implement interfaces internally?
I realize that contrary to C++, Java is Jit compiled, so different pieces of code might be optimized differently, and different JVMs might do things differently. So, is there some general strategy that many JVMs follow on this, or does anyone know the implementation in a specific JVM?
Also JVMs often devirtualize and inline method calls in which case there are no vtables or equivalent involved at all, so it might not make sense to ask about actual assembly sequences that implement virtual/interface method calls, but I assume that most JVMs still keep some kind of general representation of classes around to use if they haven't been able to devirtualize everything. Is this assumption wrong? Does this representation look in any way like a C++ vtable? If so do interfaces have separate vtables and how are these linked with class vtables? If so can object instances have multiple vtable pointers (to class/interface vtables) like object instances in C++ can? Do references of a class type and an interface type to the same object always have the same binary value or can these differ like in C++ where they require pointer fixups?
(for reference: this question asks something similar about the CLR, and there appears to be a good explanation in this msdn article though that may be outdated by now. I haven't been able to find anything similar for Java.)
Edit:
I mean 'implements' in the sense of "How does the GCC compiler implement integer addition / function calls / etc", not in the sense of "Java class ArrayList implements the List interface".
I am aware of how this works at the JVM bytecode level, what I want to know is what kind of code and datastructures are generated by the JVM after it is done loading the class files and compiling the bytecode.
The key feature of the HotSpot JVM is inline caching.
This doesn't actually mean that the target method is inlined, but means that an assumption
is put into the JIT code that every future call to the virtual or interface method will target
the very same implementation (i.e. that the call site is monomorphic). In this case, a
check is compiled into the machine code whether the assumption actually holds (i.e. whether
the type of the target object is the same as it was last time), and then transfer control
directly to the target method - with no virtual tables involved at all. If the assertion fails, an attempt may be made to convert this to a megamorphic call site (i.e. with multiple possible types); if this also fails (or if it is the first call), a regular long-winded lookup is performed, using vtables (for virtual methods) and itables (for interfaces).
Edit: The Hotspot Wiki has more details on the vtable and itable stubs. In the polymorphic case, it still puts an inline cache version into the call site. However, the code actually is a stub that performs a lookup in a vtable, or an itable. There is one vtable stub for each vtable offset (0, 1, 2, ...). Interface calls add a linear search over an array of itables before looking into the itable (if found) at the given offset.
