Efficiency: generic array vs object array - java

Suppose you very frequently have to call an operation T get(int) that returns an object from an underlying array. Basically, this can be implemented in two ways:
class GenericArray<T> {
final T[] underlying;
GenericArray(Class<T> clazz, int length) {
underlying = (T[]) Array.newInstance(clazz, length);
}
T get(int i) { return underlying[i]; }
}
and
class ObjectArray<T> {
final Object[] underlying;
ObjectArray(int length) {
underlying = new Object[length];
}
T get(int i) { return (T) underlying[i]; }
}
The first one uses reflection, so it would be slower at creation time. The second one uses downcasting, which introduces some overhead. Because of generic type erasure at runtime, there has to be some implicit casting mechanism.
So is it true that these two are equal when it comes to get(i)?

Let's check the bytecode:
Compiled from "ObjectArray.java"
class lines.ObjectArray<T> {
final java.lang.Object[] underlying;
lines.ObjectArray(int);
Code:
0: aload_0
1: invokespecial #10 // Method java/lang/Object."<init>":()V
4: aload_0
5: iload_1
6: anewarray #3 // class java/lang/Object
9: putfield #13 // Field underlying:[Ljava/lang/Object;
12: return
T get(int);
Code:
0: aload_0
1: getfield #13 // Field underlying:[Ljava/lang/Object;
4: iload_1
5: aaload
6: areturn
}
Compiled from "GenericArray.java"
class lines.GenericArray<T> {
final T[] underlying;
lines.GenericArray(java.lang.Class<T>, int);
Code:
0: aload_0
1: invokespecial #13 // Method java/lang/Object."<init>":()V
4: aload_0
5: aload_1
6: iload_2
7: invokestatic #16 // Method java/lang/reflect/Array.newInstance:(Ljava/lang/Class;I)Ljava/lang/Object;
10: checkcast #22 // class "[Ljava/lang/Object;"
13: putfield #23 // Field underlying:[Ljava/lang/Object;
16: return
T get(int);
Code:
0: aload_0
1: getfield #23 // Field underlying:[Ljava/lang/Object;
4: iload_1
5: aaload
6: areturn
}
As you can see, the bytecode for get is identical.

The bytecode answer shows that the two get methods are equivalent. Here is a bytecode-free answer explaining how I understand the topic. Remember that generics in Java are implemented using type erasure. Loosely speaking, this means that T is replaced by Object and casts are inserted where necessary (actually it's not always Object: if you write class Foo<T extends Number> { ... }, then T within the body of the class is replaced by Number).
This means that the code for your ObjectArray class is transformed into something like this
class ObjectArray {
final Object[] underlying;
ObjectArray(int length) {
underlying = new Object[length];
}
Object get(int i) {
return underlying[i];
}
}
Notice that there is no cast in the get method. The (T) is only needed to make your code compile. It has no impact at runtime, and can never throw a ClassCastException.
The code for the other class is transformed into something like this:
class GenericArray {
final Object[] underlying;
GenericArray(Class clazz, int length) {
underlying = (Object[]) Array.newInstance(clazz, length);
}
Object get(int i) {
return underlying[i];
}
}
So the get methods are equivalent. The only difference between the two classes is that reflection is used to generate the array, so it would result in an ArrayStoreException if you tried to store an object of the wrong type. Since this could only happen if you abused generics by using raw types anyway, it is probably not worth using reflection for this in most situations.
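To make the ArrayStoreException point concrete, here is a minimal sketch. The class ArrayStoreDemo and its helper tryStore are hypothetical names introduced only for this illustration; the two arrays are created the same way the two constructors in the question create theirs, assuming T is String.

```java
import java.lang.reflect.Array;

// Hypothetical demo class, not part of the original question.
class ArrayStoreDemo {

    // Stores a value through an Object[] reference and reports what happened.
    static String tryStore(Object[] array, Object value) {
        try {
            array[0] = value; // the JVM checks the array's runtime component type here
            return "stored";
        } catch (ArrayStoreException e) {
            return "ArrayStoreException";
        }
    }

    public static void main(String[] args) {
        // Created the way GenericArray<String> does it: the runtime type is String[].
        Object[] reflective = (Object[]) Array.newInstance(String.class, 1);
        // Created the way ObjectArray<String> does it: the runtime type is Object[].
        Object[] plain = new Object[1];

        System.out.println(tryStore(reflective, 42)); // ArrayStoreException
        System.out.println(tryStore(plain, 42));      // stored
    }
}
```

The store into the reflectively created array is rejected because its component type is String at runtime, while the plain Object[] accepts anything. Reads through get behave the same for both, which matches the identical bytecode.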

Related

Ambiguous behaviour in casting

I was teaching students old-school generics and came across a behavior I had not seen before, right in the middle of presenting.
I have a simple class
public class ObjectUtility {
public static void main(String[] args) {
System.out.println(castToType(10,new HashMap<Integer,Integer>()));
}
private static <V,T> T castToType(V value, T type){
return (T) value;
}
}
This prints 10 without any error. I was expecting a ClassCastException, with a message like "Integer cannot be cast to HashMap".
Out of curiosity, I tried calling getClass() on the return value, like this:
System.out.println(castToType(10,new HashMap<Integer,Integer>()).getClass());
which is throwing a ClassCastException as I expected.
Also, when I break the same statement into two, something like
Object o = castToType(10,new HashMap<Integer,Integer>());
System.out.println(o.getClass());
It is not throwing any error and prints class java.lang.Integer
All are executed with
openjdk version "1.7.0_181"
OpenJDK Runtime Environment (Zulu 7.23.0.1-macosx) (build 1.7.0_181-b01)
OpenJDK 64-Bit Server VM (Zulu 7.23.0.1-macosx) (build 24.181-b01, mixed mode)
Can someone point me in the right direction on why this behaviour is happening?
T doesn't exist at runtime. It is erased to its bound (the leftmost bound in its declaration). In this case there is none, so it is erased to Object. Everything can be cast to Object, so there is no ClassCastException.
If you were to change the constraint to this
private static <V,T extends Map<?,?>> T castToType(V value, T type){
return (T) value;
}
then the cast to T becomes a cast to the bound Map, which Integer obviously is not, and you get the ClassCastException you're expecting.
Also, when I break the same statement into two, something like
Object o = castToType(10,new HashMap<Integer,Integer>());
System.out.println(o.getClass());
It is not throwing any error
castToType(10,new HashMap<Integer,Integer>()).getClass()
This throws a ClassCastException because the static type of the expression castToType(...) is HashMap<Integer,Integer>: the signature says to expect T, here inferred as HashMap, as the return value. The compiler therefore inserts an implicit cast to HashMap before calling getClass() on the result, and that cast fails because castToType returns an Integer at runtime.
When you use this first
Object o = castToType(10,new HashMap<Integer,Integer>());
the static type of the result is now Object, so no cast is inserted, and calling getClass() is fine regardless of what's actually returned.
The "unsplit" version is equivalent to this
final HashMap<Integer, Integer> map = castToType(10, new HashMap<>());
System.out.println(map.getClass());
which hopefully demonstrates the difference
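The three cases discussed above can be put side by side in a small sketch. The class ErasureCastDemo and its method names are hypothetical, chosen only for this illustration; castUnbounded mirrors castToType from the question and castBounded mirrors the bounded variant from this answer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the original question.
class ErasureCastDemo {

    // T erases to Object, so the (T) cast does nothing at runtime.
    @SuppressWarnings("unchecked")
    static <V, T> T castUnbounded(V value, T type) {
        return (T) value;
    }

    // T erases to Map, so the (T) cast becomes a real cast to Map.
    @SuppressWarnings("unchecked")
    static <V, T extends Map<?, ?>> T castBounded(V value, T type) {
        return (T) value;
    }

    // Result is used as Object: no cast is needed at the call site.
    static String unboundedToObject() {
        try {
            Object o = castUnbounded(10, new HashMap<Integer, Integer>());
            return "ok: " + o.getClass().getName();
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    // Result is used as HashMap: the compiler casts at the call site.
    static String unboundedToHashMap() {
        try {
            HashMap<Integer, Integer> map = castUnbounded(10, new HashMap<Integer, Integer>());
            return "ok: " + map.getClass().getName();
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    // The cast inside the method itself fails, whatever the caller does.
    static String boundedToObject() {
        try {
            Object o = castBounded(10, new HashMap<Integer, Integer>());
            return "ok: " + o.getClass().getName();
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(unboundedToObject());  // ok: java.lang.Integer
        System.out.println(unboundedToHashMap()); // ClassCastException
        System.out.println(boundedToObject());    // ClassCastException
    }
}
```

The only thing that changes between the first two cases is the static type the caller expects, which is what decides whether a cast is emitted at the call site.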
You can see the differences using the javap tool.
The compiler erases generic types: type parameters are replaced by their bounds (here Object), and casts are inserted only where the calling code needs a more specific type.
First code:
public class ObjectUtility {
public static void main(String[] args) {
System.out.println(castToType(10,new java.util.HashMap<Integer,Integer>()));
}
private static <V,T> T castToType(V value, T type){
return (T) value;
}
}
Resulting bytecode (javap output):
Compiled from "ObjectUtility.java"
public class ObjectUtility {
public ObjectUtility();
descriptor: ()V
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 1: 0
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
Code:
0: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
3: bipush 10
5: invokestatic #3 // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
8: new #4 // class java/util/HashMap
11: dup
12: invokespecial #5 // Method java/util/HashMap."<init>":()V
15: invokestatic #6 // Method castToType:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
18: invokevirtual #7 // Method java/io/PrintStream.println:(Ljava/lang/Object;)V
21: return
LineNumberTable:
line 4: 0
line 5: 21
private static <V, T> T castToType(V, T);
descriptor: (Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
Code:
0: aload_0
1: areturn
LineNumberTable:
line 8: 0
}
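The generic types in the call are replaced by Object, and Integer.valueOf is inserted to box the literal 10 before it is passed to castToType. The result goes straight to println(Object), so no cast is needed.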
Second code:
public class ObjectUtility {
public static void main(String[] args) {
System.out.println(castToType(10,new java.util.HashMap<Integer,Integer>()).getClass());
}
private static <V,T> T castToType(V value, T type){
return (T) value;
}
}
Resulting bytecode (javap output):
Compiled from "ObjectUtility.java"
public class ObjectUtility {
public ObjectUtility();
descriptor: ()V
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 1: 0
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
Code:
0: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
3: bipush 10
5: invokestatic #3 // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
8: new #4 // class java/util/HashMap
11: dup
12: invokespecial #5 // Method java/util/HashMap."<init>":()V
15: invokestatic #6 // Method castToType:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
18: checkcast #4 // class java/util/HashMap
21: invokevirtual #7 // Method java/lang/Object.getClass:()Ljava/lang/Class;
24: invokevirtual #8 // Method java/io/PrintStream.println:(Ljava/lang/Object;)V
27: return
LineNumberTable:
line 4: 0
line 5: 27
private static <V, T> T castToType(V, T);
descriptor: (Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
Code:
0: aload_0
1: areturn
LineNumberTable:
line 8: 0
}
A checkcast to HashMap is inserted at the call site, but the erased signature of castToType returns Object, and the method returns its first argument (the boxed Integer) without any cast of its own. The Integer cannot be cast to HashMap, so the checkcast fails with a ClassCastException.
Third Code:
public class ObjectUtility {
public static void main(String[] args) {
Object o = castToType(10,new java.util.HashMap<Integer,Integer>());
System.out.println(o.getClass());
}
private static <V,T> T castToType(V value, T type){
return (T) value;
}
}
Resulting bytecode (javap output):
Compiled from "ObjectUtility.java"
public class ObjectUtility {
public ObjectUtility();
descriptor: ()V
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 1: 0
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
Code:
0: bipush 10
2: invokestatic #2 // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
5: new #3 // class java/util/HashMap
8: dup
9: invokespecial #4 // Method java/util/HashMap."<init>":()V
12: invokestatic #5 // Method castToType:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
15: astore_1
16: getstatic #6 // Field java/lang/System.out:Ljava/io/PrintStream;
19: aload_1
20: invokevirtual #7 // Method java/lang/Object.getClass:()Ljava/lang/Class;
23: invokevirtual #8 // Method java/io/PrintStream.println:(Ljava/lang/Object;)V
26: return
LineNumberTable:
line 4: 0
line 5: 16
line 6: 26
private static <V, T> T castToType(V, T);
descriptor: (Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
Code:
0: aload_0
1: areturn
LineNumberTable:
line 9: 0
}
In this case the bytecode is similar to the first one: castToType returns its first parameter unchanged, and since the result is assigned to an Object variable, no checkcast is emitted.
As you can see, the Java compiler makes changes during compilation that can affect behavior in some cases. Generics exist only in the source code; in the bytecode they are replaced by the erased types, with casts added where required.

Understanding dynamic polymorphism byte code

I am a novice to Java byte code and would like to understand the following byte code of Dispatch.class relative to Dispatch.java source code below :
Compiled from "Dispatch.java"
class Dispatch {
Dispatch();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: new #2 // class B
3: dup
4: invokespecial #3 // Method B."<init>":()V
7: astore_1
8: aload_1
9: invokevirtual #4 // Method A.run:()V
12: return
}
//=====================Dispatch.java==============================
class Dispatch{
public static void main(String args[]){
A var = new B();
var.run(); // prints : This is B
}
}
//======================A.java===========================
public class A {
public void run(){
System.out.println("This is A");
}
}
//======================B.java===========================
public class B extends A {
public void run(){
System.out.println("This is B");
}
}
After doing some reading on the internet I have a first grasp of how the JVM stack and opcodes work. However, I still do not get what these instructions are for:
3: dup //what are we duplicating here exactly?
4: invokespecial #3 //what does the #3 in operand stand for?
invokevirtual VS invokespecial //what difference there is between these opcodes?
It really sounds like you need to read the docs some more, but to answer your updated questions,
dup duplicates the top value on the operand stack. In this case, it would be the uninitialized B object that was pushed by the previous new instruction.
The #3 means that invokespecial is operating on the 3rd slot in the classfile's constant pool. This is where the method to be invoked is specified. You can see the constant pool by passing -c -verbose to javap.
invokevirtual is used for ordinary (non interface) virtual method calls. (Ignoring default interface methods for the moment) invokespecial is used for a variety of special cases - private method calls, constructor invocations, and superclass method calls.
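As a rough illustration of where each opcode shows up, here is a small sketch. DispatchDemo, Base, Derived, and runSuper are hypothetical names introduced for this example; the comments describe the instructions javac would typically emit for each line, which you can confirm with javap -c.

```java
// Hypothetical demo class, not part of the original question.
class DispatchDemo {

    static class Base {
        String run() { return "This is Base"; }
    }

    static class Derived extends Base {
        @Override
        String run() { return "This is Derived"; }

        // super.run() compiles to invokespecial: it targets Base.run directly,
        // with no dynamic dispatch on the receiver's runtime class.
        String runSuper() { return super.run(); }
    }

    public static void main(String[] args) {
        // new pushes an uninitialized Derived; dup copies that reference so that
        // one copy is consumed by invokespecial <init> and one is left to store.
        Base var = new Derived();

        // invokevirtual names Base.run in the constant pool, but the JVM picks
        // the implementation from the object's runtime class.
        System.out.println(var.run());                  // This is Derived
        System.out.println(((Derived) var).runSuper()); // This is Base
    }
}
```

This is why the listing in the question shows invokevirtual on A.run even though B.run is what actually executes.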

Scala - How is val immutability guaranteed at run time

When we declare a field final in Java, it is guaranteed that it cannot be changed even at run time, because the JVM guarantees it.
Java class:
public class JustATest {
public final int x = 10;
}
Javap decompiled:
Compiled from "JustATest.java"
public class JustATest {
public final int x;
public JustATest();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: aload_0
5: bipush 10
7: putfield #2 // Field x:I
10: return
}
But in Scala, if we declare a val, it compiles into a normal integer, and there is no difference between var and val in terms of decompilation output.
Original Scala class:
class AnTest {
val x = 1
var y = 2
}
Decompiled output:
Compiled from "AnTest.scala"
public class AnTest {
public int x();
Code:
0: aload_0
1: getfield #14 // Field x:I
4: ireturn
public int y();
Code:
0: aload_0
1: getfield #18 // Field y:I
4: ireturn
public void y_$eq(int);
Code:
0: aload_0
1: iload_1
2: putfield #18 // Field y:I
5: return
public AnTest();
Code:
0: aload_0
1: invokespecial #25 // Method java/lang/Object."<init>":()V
4: aload_0
5: iconst_1
6: putfield #14 // Field x:I
9: aload_0
10: iconst_2
11: putfield #18 // Field y:I
14: return
}
Given that, is the immutability of a val enforced only at compile time by the Scala compiler? How is it guaranteed at run time?
In Scala, conveying immutability via val is a compile-time enforcement which has nothing to do with the emitted bytecode. In Java, declaring a field final means it cannot be reassigned, whereas in Scala, declaring a member with val only means it can't be reassigned, but it can still be overridden. If you want the member to be final, you'll need to specify it as you do in Java:
class AnTest {
final val x = 10
}
Which yields:
public class testing.ReadingFile$AnTest$1 {
private final int x;
public final int x();
Code:
0: bipush 10
2: ireturn
public testing.ReadingFile$AnTest$1();
Code:
0: aload_0
1: invokespecial #19 // Method java/lang/Object."<init>":()V
4: return
}
Which is equivalent to the byte code you see in Java, except the compiler has emitted a getter for x.
The really simple answer is: there are some Scala features which can be encoded in JVM bytecode, and some which can't.
In particular, there are some constraints which cannot be encoded in JVM bytecode, e.g. sealed or private[this], or val. Which means that if you get your hands on the compiled JVM bytecode of a Scala source file, then you can do stuff that you can't do from Scala by interacting with the code through a language that is not Scala.
This is not specific to the JVM backend, you have similar, and even more pronounced problems with Scala.js, since the compilation target here (ECMAScript) offers even less ways of expressing constraints than JVM bytecode does.
But really, this is just a general problem: I can take a language as safe and pure as Haskell, compile it to native code, and if I get my hands on the compiled binary, all safety will be lost. In fact, most Haskell compilers perform (almost) complete type erasure, so there are literally no types, and no type constraints left after compilation.
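The general point, that a constraint enforced only by the source compiler can be bypassed by code written against the compiled class, can be sketched in Java. ValLike is a hypothetical hand-written stand-in, not actual scalac output: it assumes a class whose field has no final flag and only a getter, which is how the question reads its javap listing (that listing does not show private fields, so the real flags are not confirmed here).

```java
import java.lang.reflect.Field;

// Hypothetical stand-in: a non-final field exposed through a getter only.
class ValLike {
    private int x = 1;
    public int x() { return x; }
}

// Hypothetical demo class, not part of the original question.
class ValBypassDemo {

    // Overwrites the field through reflection and returns what the getter sees.
    static int overwrite(ValLike target, int newValue) {
        try {
            Field field = ValLike.class.getDeclaredField("x");
            field.setAccessible(true); // bypass the private modifier
            field.setInt(target, newValue);
            return target.x();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(overwrite(new ValLike(), 42)); // 42
    }
}
```

Nothing in the source of ValLike allows reassigning x, yet code outside the class changes it, because the restriction lives in the language rules rather than in the class file.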

Default variables' values vs initialization with default

We all know that, according to JLS7 section 4.12.5, every instance variable is initialized with a default value. E.g. (1):
public class Test {
private Integer a; // == null
private int b; // == 0
private boolean c; // == false
}
But I always thought that a class implemented like this (2):
public class Test {
private Integer a = null;
private int b = 0;
private boolean c = false;
}
is exactly equivalent to example (1). I expected that a sophisticated Java compiler would see that all the initialization values in (2) are redundant and omit them.
But surprisingly, these two classes produce different bytecode.
For example (1):
0: aload_0
1: invokespecial #1; //Method java/lang/Object."<init>":()V
4: return
For example (2):
0: aload_0
1: invokespecial #1; //Method java/lang/Object."<init>":()V
4: aload_0
5: aconst_null
6: putfield #2; //Field a:Ljava/lang/Integer;
9: aload_0
10: iconst_0
11: putfield #3; //Field b:I
14: aload_0
15: iconst_0
16: putfield #4; //Field c:Z
19: return
The question is: why? This seems like such an obvious thing to optimize. What's the reason?
UPD: I use Java 7 1.7.0.11 x64, no special javac options
No, they're not equivalent. Default values are assigned immediately, on object instantiation. The assignments in field initializers happen after the superclass constructor has been called, which means you can see a difference in some cases. Sample code:
class Superclass {
public Superclass() {
someMethod();
}
void someMethod() {}
}
class Subclass extends Superclass {
private int explicit = 0;
private int implicit;
public Subclass() {
System.out.println("explicit: " + explicit);
System.out.println("implicit: " + implicit);
}
@Override void someMethod() {
explicit = 5;
implicit = 5;
}
}
public class Test {
public static void main(String[] args) {
new Subclass();
}
}
Output:
explicit: 0
implicit: 5
Here you can see that the explicit field initialization "reset" the value of explicit back to 0 after the Superclass constructor finished but before the subclass constructor body executed. The value of implicit still has the value assigned within the polymorphic call to someMethod from the Superclass constructor.

Issue with constructors of nested class

This question is about an interesting behavior of Java: in some situations it produces an additional (non-default) constructor for nested classes.
It is also about the strange anonymous class that Java produces along with that constructor.
Consider the following code:
package a;
import java.lang.reflect.Constructor;
public class TestNested {
class A {
A() {
}
A(int a) {
}
}
public static void main(String[] args) {
Class<A> aClass = A.class;
for (Constructor c : aClass.getDeclaredConstructors()) {
System.out.println(c);
}
}
}
This prints:
a.TestNested$A(a.TestNested)
a.TestNested$A(a.TestNested,int)
OK. Next, let's make the constructor A(int a) private:
private A(int a) {
}
Run the program again. The output is:
a.TestNested$A(a.TestNested)
private a.TestNested$A(a.TestNested,int)
That is also OK. But now, let's modify the main() method as follows (adding the creation of a new instance of class A):
public static void main(String[] args) {
Class<A> aClass = A.class;
for (Constructor c : aClass.getDeclaredConstructors()) {
System.out.println(c);
}
A a = new TestNested().new A(123); // new line of code
}
Then the output becomes:
a.TestNested$A(a.TestNested)
private a.TestNested$A(a.TestNested,int)
a.TestNested$A(a.TestNested,int,a.TestNested$1)
What is this: a.TestNested$A(a.TestNested,int,a.TestNested$1) <<<---??
OK, let's make the constructor A(int a) package-private again:
A(int a) {
}
Rerun the program (without removing the line that creates an instance of A); the output is the same as the first time:
a.TestNested$A(a.TestNested)
a.TestNested$A(a.TestNested,int)
Questions:
1) How can this be explained?
2) What is this strange third constructor?
UPDATE: Investigation showed the following.
1) Let's try to call this strange constructor using reflection from another class.
We will not be able to do this, because there isn't any way to create an instance of that strange TestNested$1 class.
2) OK, let's do a trick. Let's add this static field to the class TestNested:
public static Object object = new Object() {
public void print() {
System.out.println("sss");
}
};
Now we can call this strange third constructor from another class:
TestNested tn = new TestNested();
TestNested.A a = (TestNested.A)TestNested.A.class.getDeclaredConstructors()[2].newInstance(tn, 123, TestNested.object);
Sorry, but I absolutely don't understand it.
UPDATE-2: Further questions are:
3) Why does Java use a special anonymous inner class as the argument type for this third synthetic constructor? Why not just the Object type, or a constructor with a special name?
4) Why can Java use an already-defined anonymous inner class for this purpose? Isn't this some kind of security violation?
The third constructor is a synthetic constructor generated by the compiler, in order to allow access to the private constructor from the outer class. This is because inner classes (and their enclosing classes' access to their private members) only exist for the Java language and not the JVM, so the compiler has to bridge the gap behind the scenes.
Reflection will tell you if a member is synthetic:
for (Constructor c : aClass.getDeclaredConstructors()) {
System.out.println(c + " " + c.isSynthetic());
}
This prints:
a.TestNested$A(a.TestNested) false
private a.TestNested$A(a.TestNested,int) false
a.TestNested$A(a.TestNested,int,a.TestNested$1) true
See this post for further discussion: Eclipse warning about synthetic accessor for private static nested classes in Java?
EDIT: interestingly, the Eclipse compiler does it differently than javac. When using Eclipse, it adds an argument of the type of the inner class itself:
a.TestNested$A(a.TestNested) false
private a.TestNested$A(a.TestNested,int) false
a.TestNested$A(a.TestNested,int,a.TestNested$A) true
I tried to trip it up by exposing that constructor ahead of time:
class A {
A() {
}
private A(int a) {
}
A(int a, A another) { }
}
It dealt with this by simply adding another argument to the synthetic constructor:
a.TestNested$A(a.TestNested) false
private a.TestNested$A(a.TestNested,int) false
a.TestNested$A(a.TestNested,int,a.TestNested$A) false
a.TestNested$A(a.TestNested,int,a.TestNested$A,a.TestNested$A) true
First of all, thank you for this interesting question. I was so intrigued that I could not resist taking a look at the bytecode. This is the bytecode of TestNested:
Compiled from "TestNested.java"
public class a.TestNested {
public a.TestNested();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: ldc_w #2 // class a/TestNested$A
3: astore_1
4: aload_1
5: invokevirtual #3 // Method java/lang/Class.getDeclaredConstructors:()[Ljava/lang/reflect/Constructor;
8: astore_2
9: aload_2
10: arraylength
11: istore_3
12: iconst_0
13: istore 4
15: iload 4
17: iload_3
18: if_icmpge 41
21: aload_2
22: iload 4
24: aaload
25: astore 5
27: getstatic #4 // Field java/lang/System.out:Ljava/io/PrintStream;
30: aload 5
32: invokevirtual #5 // Method java/io/PrintStream.println:(Ljava/lang/Object;)V
35: iinc 4, 1
38: goto 15
41: new #2 // class a/TestNested$A
44: dup
45: new #6 // class a/TestNested
48: dup
49: invokespecial #7 // Method "<init>":()V
52: dup
53: invokevirtual #8 // Method java/lang/Object.getClass:()Ljava/lang/Class;
56: pop
57: bipush 123
59: aconst_null
60: invokespecial #9 // Method a/TestNested$A."<init>":(La/TestNested;ILa/TestNested$1;)V
63: astore_2
64: return
}
As you can see, the constructor a.TestNested$A(a.TestNested,int,a.TestNested$1) is invoked from your main method. Furthermore, null is passed as the value of the a.TestNested$1 parameter.
So let's take a look at the mysterious anonymous class a.TestNested$1:
Compiled from "TestNested.java"
class a.TestNested$1 {
}
Strange - I would have expected this class to actually do something. To understand it, let's take a look at the constructors in a.TestNested$A:
class a.TestNested$A {
final a.TestNested this$0;
a.TestNested$A(a.TestNested);
Code:
0: aload_0
1: aload_1
2: putfield #2 // Field this$0:La/TestNested;
5: aload_0
6: invokespecial #3 // Method java/lang/Object."<init>":()V
9: return
private a.TestNested$A(a.TestNested, int);
Code:
0: aload_0
1: aload_1
2: putfield #2 // Field this$0:La/TestNested;
5: aload_0
6: invokespecial #3 // Method java/lang/Object."<init>":()V
9: return
a.TestNested$A(a.TestNested, int, a.TestNested$1);
Code:
0: aload_0
1: aload_1
2: iload_2
3: invokespecial #1 // Method "<init>":(La/TestNested;I)V
6: return
}
Looking at the package-visible constructor a.TestNested$A(a.TestNested, int, a.TestNested$1), we can see that the third argument is ignored.
Now we can explain the constructor and the anonymous inner class. The additional constructor is required in order to circumvent the visibility restriction on the private constructor. This additional constructor simply delegates to the private constructor. However, it cannot have the exact same signature as the private constructor. Because of this, the anonymous inner class is added to provide a unique signature without colliding with other possible overloaded constructors, such as a constructor with signature (int,int) or (int,Object). Since this anonymous inner class is only needed to create a unique signature, it does not need to be instantiated and does not need to have content.
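For anyone who wants to reproduce this, here is a self-contained sketch that counts declared and compiler-generated constructors. SyntheticCtorDemo and countConstructors are hypothetical names for this illustration. Note that the result depends on the compiler: the outputs in this thread come from older javac and Eclipse versions, and to my knowledge javac targeting Java 11 or later uses nest-based access control (JEP 181) and no longer needs the extra constructor, so the synthetic count may be 0 there.

```java
import java.lang.reflect.Constructor;

// Hypothetical demo class, modeled on TestNested from the question.
class SyntheticCtorDemo {

    class A {
        A() { }
        private A(int a) { }
    }

    // Returns { constructors written in source, constructors generated by the compiler }.
    static int[] countConstructors() {
        int declared = 0;
        int synthetic = 0;
        for (Constructor<?> c : A.class.getDeclaredConstructors()) {
            if (c.isSynthetic()) {
                synthetic++;
            } else {
                declared++;
            }
        }
        return new int[] { declared, synthetic };
    }

    public static void main(String[] args) {
        // The outer class calls the private constructor; on older compilers
        // this call is what triggers generation of the synthetic constructor.
        A a = new SyntheticCtorDemo().new A(123);

        int[] counts = countConstructors();
        System.out.println("declared=" + counts[0] + " synthetic=" + counts[1]);
    }
}
```

The declared count should be 2 in every case; only the synthetic count varies with the compiler and target version.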
