Is this really widening vs autoboxing? - java

I saw this in an answer to another question, in reference to shortcomings of the Java spec:
There are more shortcomings and this is a subtle topic. Check this out:
public class methodOverloading{
public static void hello(Integer x){
System.out.println("Integer");
}
public static void hello(long x){
System.out.println("long");
}
public static void main(String[] args){
int i = 5;
hello(i);
}
}
Here "long" would be printed (haven't checked it myself), because the compiler chooses widening over auto-boxing. Be careful when using auto-boxing or don't use it at all!
Are we sure that this is actually an example of widening instead of autoboxing, or is it something else entirely?
On my initial scanning, I would agree with the statement that the output would be "long" on the basis of i being declared as a primitive and not an object. However, if you changed
hello(long x)
to
hello(Long x)
the output would print "Integer"
What's really going on here? I know nothing about the compilers/bytecode interpreters for java...

In the first case, a widening conversion is happening. You can see this by running the javap utility (included with the JDK) with the -c option on the compiled class:
public static void main(java.lang.String[]);
Code:
0: iconst_5
1: istore_1
2: iload_1
3: i2l
4: invokestatic #6; //Method hello:(J)V
7: return
}
Here you can see i2l, the mnemonic for the bytecode instruction that widens an int to a long. See reference here.
And in the other case, replacing the "long x" with the object "Long x" signature, you'll have this code in the main method:
public static void main(java.lang.String[]);
Code:
0: iconst_5
1: istore_1
2: iload_1
3: invokestatic #6; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
6: invokestatic #7; //Method hello:(Ljava/lang/Integer;)V
9: return
}
So you see that the compiler has emitted a call to Integer.valueOf(int) to box the primitive inside its wrapper.

Yes, it is. Try it out in a test and you will see "long" printed. It is widening because Java prefers widening the int to a long over autoboxing it to an Integer, so hello(long) is the method chosen.
Edit: the original post being referenced.
Further Edit: The second version prints "Integer" because there is no longer an overload that can be reached by widening to a larger primitive, so the int must be boxed, and hello(Integer) is the only applicable method. Furthermore, Java only autoboxes an int to Integer (never to Long), so you would get a compile error if you kept hello(Long) and removed hello(Integer).
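The preference order described in this answer can be tried out with a small program. This is a minimal sketch rather than code from the original post: the class name OverloadPhasesDemo and the varargs overloads are hypothetical additions, included only to show that the compiler tries widening first, then boxing, and varargs last.

```java
class OverloadPhasesDemo {
    // All three overloads could accept an int argument in some way.
    static String hello(long x)    { return "long"; }
    static String hello(Integer x) { return "Integer"; }
    static String hello(int... x)  { return "varargs"; }

    // No primitive overload here, so widening cannot apply.
    static String boxed(Integer x) { return "Integer"; }
    static String boxed(int... x)  { return "varargs"; }

    // An int is never boxed to Long, so only varargs is left.
    static String last(Long x)    { return "Long"; }
    static String last(int... x)  { return "varargs"; }

    public static void main(String[] args) {
        int i = 5;
        System.out.println(hello(i)); // long
        System.out.println(boxed(i)); // Integer
        System.out.println(last(i));  // varargs
    }
}
```

Each pair of overloads removes the option that won in the previous one, which makes the order of preference visible.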

Another interesting thing about this example is the method overloading. Type widening and overloading interact here only because the compiler has to decide, at compile time, which method to call. Consider the following example:
public static void hello(Collection x){
System.out.println("Collection");
}
public static void hello(List x){
System.out.println("List");
}
public static void main(String[] args){
Collection col = new ArrayList();
hello(col);
}
Overload resolution doesn't use the run-time type, which is ArrayList (a List); it uses the compile-time type, which is Collection, and thus prints "Collection".
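A slightly extended version of the same idea is sketched here. The class name StaticDispatchDemo and the extra calls are assumptions added for illustration; the point is that only the declared (or cast) type of the argument expression influences which overload is picked.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

class StaticDispatchDemo {
    static String hello(Collection<?> x) { return "Collection"; }
    static String hello(List<?> x)       { return "List"; }

    public static void main(String[] args) {
        // Same run-time class (ArrayList), different compile-time types.
        Collection<String> col = new ArrayList<>();
        List<String> list = new ArrayList<>();

        System.out.println(hello(col));                // Collection
        System.out.println(hello(list));               // List
        // A cast changes the compile-time type, and with it the overload.
        System.out.println(hello((List<String>) col)); // List
    }
}
```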
I encourage you to read Effective Java, which opened my eyes to some corner cases of the JLS.

Related

Default Instance Value Initialization in java

I am trying to check for default value of instance variables (i.e. 0 here) in the generated bytecode.
I can see <init>() being called, and if I print the myvar instance variable inside the constructor I see getfield executed for myvar, but where was the default value set in the first place?
Please answer the following:
When is the default value set in myvar (at compile time or at object creation time)?
Who (the compiler or the JVM) sets the default value of the instance variable?
public class FieldInit {
int myvar;
public static void main(String[] args) {
new FieldInit(); // and what would happen if I comment out this
}
}
I am trying to disassemble the bytecode using javap but cannot see the <clinit>() method, where I guess this may be happening. Please let me know whether it is possible to see the <clinit>() method and, if so, how.
In JVM, an object instantiation is split into two bytecode instructions:
new allocates a new uninitialized object;
invokespecial calls a constructor that initializes the object.
The JVM Specification for the new bytecode says:
Memory for a new instance of that class is allocated from the
garbage-collected heap, and the instance variables of the new object
are initialized to their default initial values
The JVM sets all instance fields to zeroes when executing the new instruction. So, by the time a constructor is invoked, all fields are already set to their default values. You will not find this "zeroing" in the bytecode: it is done implicitly by the JVM during object allocation.
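One way to observe the default value from plain Java is to read a field before its own initializer has run. The following is a minimal sketch under that assumption; the classes Base and Derived and the method peek() are hypothetical names, not part of the question.

```java
class DefaultValuesDemo {
    static class Base {
        final int seenDuringSuperConstructor;

        Base() {
            // Runs before any field initializer of a subclass has executed.
            seenDuringSuperConstructor = peek();
        }

        int peek() { return -1; }
    }

    static class Derived extends Base {
        int myvar = 42; // assigned only after Base() has returned

        @Override
        int peek() { return myvar; }
    }

    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println(d.seenDuringSuperConstructor); // 0, the default set at allocation
        System.out.println(d.myvar);                      // 42
    }
}
```

The 0 seen by the superclass constructor was never written by any bytecode in either class, which is consistent with the JVM zeroing the object when it is allocated.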
Your question is not answerable because it's more complicated than you think it is.
Fields in Java are encoded in a field data structure in the class file, and this data structure includes room for the initial value. However, the only possible initial values in this data structure are numbers and strings.
Let's see it in action! (and note that L is just a java syntax thing to tell java: This number is a long, not an int. 5 is a constant, so is 5L).
class Test {
public static final long mark = 5L;
}
javac Test.java
javap -v -c Test
.....
public static final long mark;
descriptor: J
flags: (0x0019) ACC_PUBLIC, ACC_STATIC, ACC_FINAL
ConstantValue: long 5l
Hey look there it is, constant value 5L.
But what if it isn't constant?
Ah, that's a problem. You can't encode that here.
So, instead, it's syntax sugar time!
You can write a special method in any class that is run during a class's static initialization. You can also write a special method that is run as a new instance is made. That one is almost entirely the same as a constructor, with only exotic differences. It looks like this:
public class Test {
static {
System.out.println("What voodoo magic is this?");
}
public static void main(String[] args) {
System.out.println("In main");
}
}
Let's see it in action!
javac Test.java
java Test
What voodoo magic is this?
In main
javap -c -v Test
...
static {};
descriptor: ()V
flags: (0x0008) ACC_STATIC
Code:
stack=2, locals=0, args_size=0
0: getstatic #7 // Field java/lang/System.out:Ljava/io/PrintStream;
3: ldc #21 // String Voodoo...
5: invokevirtual #15 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
8: return
LineNumberTable:
line 3: 0
line 4: 8
}
As you can see, that weird static{} thing was compiled to something that looks exactly like a method, bytecode and all, but the name of the method is bizarre, it's just static{}.
Here comes the clue
But what happens if we make that a little more complicated? Let's initialize this field to the current time instead!
class Test {
public static final long mark = System.currentTimeMillis();
}
This is just syntax sugar. That explains how this works, because as I told you, at the class file level, you cannot initialize fields with non-constants. Thus, that compiles to the same thing as:
class Test {
public static final long mark;
static {
mark = System.currentTimeMillis();
}
}
and you can javap to confirm this.
One is called a 'compile time constant'. The other isn't. This shows up in various ways. For example, you can pass a CTC as annotation parameter. Try that: try using static final long MARK = 5L; (you will be able to), then static final long MARK = System.currentTimeMillis(); - that won't be allowed.
So, where is that initial value? If it's a constant value, javap -c -v will show it. If it isn't, it's stuck in the static block.
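The annotation test mentioned above can be written out as follows. This is a sketch under assumptions: the annotation Tag, the class ConstantKindsDemo and the helper readTag() are hypothetical names introduced only for illustration.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

class ConstantKindsDemo {
    // Hypothetical annotation, kept at run time so it can be read back.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Tag {
        long value();
    }

    static final long MARK = 5L;                            // compile-time constant
    static final long STARTED = System.currentTimeMillis(); // set in the static initializer

    @Tag(MARK) // allowed: MARK is a compile-time constant
    static void tagged() {}

    // @Tag(STARTED)  // rejected by the compiler: not a constant expression
    // static void rejected() {}

    static long readTag() {
        try {
            return ConstantKindsDemo.class
                    .getDeclaredMethod("tagged")
                    .getAnnotation(Tag.class)
                    .value();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readTag()); // 5
    }
}
```

Uncommenting the second annotation should produce a compile error, which is the practical difference between the two kinds of static final field.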

JAR replacing constants with values [duplicate]

In java, say I have the following
==fileA.java==
class A
{
public static final int SIZE = 100;
}
Then in another file I use this value
==fileB.java==
// no import needed: A is in the same (default) package
class b
{
Object[] temp = new Object[A.SIZE];
}
When this gets compiled does SIZE get replaced with the value 100, so that if I were to replace the FileA.jar but not FileB.jar, would the object array get the new value or would it have been hardcoded to 100 because that's the value when it was originally built?
Yes, the Java compiler does replace static constant values like SIZE in your example with their literal values.
So, if you would later change SIZE in class A but you don't recompile class b, you will still see the old value in class b. You can easily test this out:
file A.java
public class A {
public static final int VALUE = 200;
}
file B.java
public class B {
public static void main(String[] args) {
System.out.println(A.VALUE);
}
}
Compile A.java and B.java. Now run: java B
Change the value in A.java. Recompile A.java, but not B.java. Run again, and you'll see the old value being printed.
You can keep the constant from being compiled into B, by doing
class A
{
public static final int SIZE;
static
{
SIZE = 100;
}
}
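The difference between the two forms can also be observed inside a single program, without recompiling anything. The sketch below relies on the rule that reading a compile-time constant does not trigger class initialization, because the value has already been copied into the caller. The class names Inlined and NotInlined and the events list are hypothetical, added only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class InliningDemo {
    static final List<String> events = new ArrayList<>();

    static class Inlined {
        static { events.add("Inlined initialized"); }
        static final int SIZE = 100; // compile-time constant, copied into callers
    }

    static class NotInlined {
        static final int SIZE;       // blank final, assigned in the static block
        static {
            SIZE = 100;
            events.add("NotInlined initialized");
        }
    }

    static int readBoth() {
        // Only the second read has to touch the declaring class at run time.
        return Inlined.SIZE + NotInlined.SIZE;
    }

    public static void main(String[] args) {
        System.out.println(readBoth()); // 200
        System.out.println(events);     // [NotInlined initialized]
    }
}
```

Inlined is never initialized here because nothing in the program needs it at run time, which is the same reason an old copy of the constant survives in a class that was not recompiled.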
Another way to confirm the behavior is to look at the generated bytecode. When the constant is "small" (bipush takes a signed byte, so -128 to 127):
public B();
Code:
0: aload_0
1: invokespecial #10; //Method java/lang/Object."<init>":()V
4: aload_0
5: bipush 42
7: anewarray #3; //class java/lang/Object
10: putfield #12; //Field temp:[Ljava/lang/Object;
13: return
}
(I used 42 instead of 100 so it stands out more). In this case, it is clearly substituted in the byte code. But, say the constant is "bigger." Then you get byte code that looks like this:
public B();
Code:
0: aload_0
1: invokespecial #10; //Method java/lang/Object."<init>":()V
4: aload_0
5: ldc #12; //int 86753098
7: anewarray #3; //class java/lang/Object
10: putfield #13; //Field temp:[Ljava/lang/Object;
13: return
When it is bigger, the opcode "ldc" is used (values that still fit in a short use sipush instead). According to the JVM documentation, the operand of ldc is "an unsigned byte that must be a valid index into the runtime constant pool of the current class".
In either case, the constant is embedded into B. Since an opcode can only access the run-time constant pool of the current class, I imagine that the decision to write the constant into the class file is independent of the implementation (but I don't know that for a fact).
Woo - you learn something new everyday!
Taken from the Java Tutorials (the wording is from the tutorial rather than the language specification itself)...
Note: If a primitive type or a string is defined as a constant and the value is known at compile time, the compiler replaces the constant name everywhere in the code with its value. This is called a compile-time constant. If the value of the constant in the outside world changes (for example, if it is legislated that pi actually should be 3.975), you will need to recompile any classes that use this constant to get the current value.
The important concept here is that the static final field is initialised with a compile-time constant, as defined in the JLS. Use a non-constant initialiser (or non-static or non-final) and it won't be copied:
public static final int SIZE = null != null ? 0 : 100;
(null is not a compile-time constant.)
Actually I ran into this bizarreness a while ago.
This will compile "100" into class b directly. If you just recompile class A, this will not update the value in class B.
On top of that, the compiler may not notice that class b needs recompiling (at the time I was compiling single directories, class B was in a separate directory, and compiling A's directory did not trigger a compile of B).
As an optimization the compiler will inline that final variable.
So after compilation it will effectively look like this:
class b
{
Object[] temp = new Object[100];
}
One thing to note: the value of the static final field must be known at compile time.
If the value is not known at compile time, the compiler won't replace the constant name with its value.
public class TestA {
// public static final int value = 200;
public static final int value = getValue();
public static int getValue() {
return 100;
}
}
public class TestB {
public static void main(String[] args) {
System.out.println(TestA.value);
}
}
First compile TestA and TestB, then run TestB.
Then change TestA.getValue() to return 200, recompile only TestA, and run TestB again: TestB will print the new value.
There is an exception to this:
If a static final field is initialised to null, references to it are not replaced with null at compile time (even though null is its value).
A.java
class A{
public static final String constantString = null;
}
B.java
class B{
public static void main(String... aa){
System.out.println(A.constantString);
}
}
Compile both A.java and B.java and run java B
Output will be null
Now Update A.java with following code and compile only this class.
class A{
public static final String constantString = "Omg! picking updated value without re-compilation";
}
Now run java B
Output will be Omg! picking updated value without re-compilation
Java does optimise these sorts of values but only if they are in the same class. In this case the JVM looks in A.SIZE rather than optimizing it because of the usage case you are considering.

Why will the compiler accept an invalid syntax call (<?>method) when statically called?

Is there a reason that a Java Generic Method cannot be called without the Static/Instance reference before the method? Like "case 2" and "case 5" on the example code.
In other words, why can we call a normal method without the static/instance reference (like in "case 3") and in Generic Methods we can't?
public class MyClass {
public static void main(String[] args) {
MyClass.<String>doWhatEver("Test Me!"); // case 1
<String>doWhatEver("Test Me2!"); // case 2 COMPILE ERROR HERE
doSomething("Test Me 3!"); // case 3 (just for compare)
new MyClass().<String>doMoreStuff("Test me 4"); // case 4
}
public void doX(){
<String>doMoreStuff("test me 5"); // case 5 COMPILE ERROR HERE
}
public static <T> void doWhatEver(T x){
System.out.println(x);
}
public static void doSomething(String x){
System.out.println(x);
}
public <T> void doMoreStuff(T x){
System.out.println(x);
}
}
You don't need to specify <String> for case 1 and 4, the compiler will handle this for you.
Now let's try to run your example and see what happens.
Exception in thread "main" java.lang.RuntimeException: Uncompilable
source code - illegal start of expression
It's that simple: the answer to your question is that the syntax is invalid. The Java grammar only allows explicit type arguments after a qualifier followed by a dot (a type name, an expression, this or super), never on an unqualified method call.
However, this has nothing to do with being static or not. Try it in a constructor, removing the static keyword from the doWhatEver method:
public MyClass()
{
<String>doWhatEver("Test Me2!"); //does not compile
doWhatEver("Test Me2!"); // compiles
}
public <T> void doWhatEver(T x){
System.out.println(x);
}
Now if you are wondering why MyClass.<String>doWhat.. compiled while <String>doWhat.. did not compile, even if we remove the static keyword, let's have a look at the generated bytecode.
Your line will be compiled to this :
6: invokestatic #5 // Method doWhatEver:(Ljava/lang/Object;)V
The explicit type argument is gone from the compiled call. But why?
Try compiling for example these two lines
MyClass.<String>doWhatEver("Test Me2!");
MyClass.doWhatEver("Test Me3!");
then run javap -v on the .class file and you will notice that both calls were compiled to the same bytecode.
4: ldc #4 // String Test Me2!
6: invokestatic #5 // Method doWhatEver:(Ljava/lang/Object;)V
9: ldc #6 // String Test Me3!
11: invokestatic #5 // Method doWhatEver:(Ljava/lang/Object;)V
In the case where you call the non-static method, the generated bytecode will be invokevirtual instead:
17: invokevirtual #8 // Method doWhatEver2:(Ljava/lang/Object;)V
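My guess is that invokestatic searches directly in the constant pool (where static methods are stored) for the method matching the call and omits the type declaration, while invokevirtual searches in the actual class. Either way, generic type arguments are erased at compile time, so neither instruction carries them; that is why the descriptor shows Ljava/lang/Object; for both calls.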
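For reference, the forms that the grammar does accept can be collected in one place. This is a minimal sketch; the class ExplicitTypeArgsDemo and its method names are hypothetical and only mirror the shape of the code in the question.

```java
import java.util.Collections;
import java.util.List;

class ExplicitTypeArgsDemo {
    static <T> String show(T x)  { return "static:" + x; }
    <T> String showInstance(T x) { return "instance:" + x; }

    String fromInstanceMethod() {
        // <String>showInstance("a");          // does not compile: no qualifier
        return this.<String>showInstance("a"); // qualified with this
    }

    static String fromStaticMethod() {
        // <String>show("b");                         // does not compile: no qualifier
        return ExplicitTypeArgsDemo.<String>show("b"); // qualified with the class name
    }

    static List<String> empty() {
        // Explicit type arguments on a library method, same qualified form.
        return Collections.<String>emptyList();
    }

    public static void main(String[] args) {
        System.out.println(new ExplicitTypeArgsDemo().fromInstanceMethod()); // instance:a
        System.out.println(fromStaticMethod());                              // static:b
        System.out.println(show("c"));                                       // static:c (inferred)
        System.out.println(empty().size());                                  // 0
    }
}
```

In most calls the type argument can simply be left out and inferred, as in show("c"), so the qualified form is only needed when inference does not give the type you want.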

Use Scala constants in Java

I am currently evaluating Scala for future projects and came across something strange. I created the following constant for use in a JSP:
val FORMATED_TIME = "formatedTime";
And it did not work. After some experimenting I decided to decompile to get to the bottom of it:
private final java.lang.String FORMATED_TIME;
public java.lang.String FORMATED_TIME();
Code:
0: aload_0
1: getfield #25; //Field FORMATED_TIME:Ljava/lang/String;
4: areturn
Now that is interesting! Personally I have been wondering for quite a while why an inspector needs the prefix get and a mutator the prefix set in Java as they live in different name-spaces.
However it might still be awkward to explain that to the rest of the team. So is it possible to have a public constant without the inspector?
This is because of the Uniform Access Principle, i.e.: Methods and Fields Are Indistinguishable
See this answer:
(In Scala 2.8.0 this means if you have a companion object, you lose your static forwarders)
If you have this in Scala:
//Scala
object CommonControler {
val FORMATED_TIME = "formatedTime";
}
You may use it like this from Java
//Java
// Variables become methods
CommonControler$.MODULE$.FORMATED_TIME();
// A static forwarder is available
CommonControler.FORMATED_TIME();
Also see the book Scala in Depth
Also note the @scala.reflect.BeanProperty annotation for classes.
I had a further look into the decompiled code and noted something else. The variables are not actually static. So my next idea was to use an object instead:
object KommenControler
{
val FORMATED_TIME = "formatedTime";
} // KommenControler
But now things turn really ugly:
public final class ….KommenControler$ extends java.lang.Object implements scala.ScalaObject{
public static final ….KommenControler$ MODULE$;
private final java.lang.String FORMATED_TIME;
public static {};
Code:
0: new #9; //class …/KommenControler$
3: invokespecial #12; //Method "<init>":()V
6: return
public java.lang.String FORMATED_TIME();
Code:
0: aload_0
1: getfield #26; //Field FORMATED_TIME:Ljava/lang/String;
4: areturn
So I get an additional class ending in $ which has a singleton instance called MODULE$. And there is still the inspector. So the access to the variable inside a JSP becomes:
final String formatedTime = (String) request.getAttribute (….KommenControler$.MODULE$.FORMATED_TIME ());
This works as expected and I personally can live with it but how am I going to explain that to the team?
Of course, if there is a simpler way I would like to hear of it.

What code does the compiler generate for autoboxing?

When the Java compiler autoboxes a primitive to the wrapper class, what code does it generate behind the scenes? I imagine it calls:
The valueOf() method on the wrapper
The wrapper's constructor
Some other magic?
You can use the javap tool to see for yourself. Compile the following code:
public class AutoboxingTest
{
public static void main(String []args)
{
Integer a = 3;
int b = a;
}
}
To compile and disassemble:
javac AutoboxingTest.java
javap -c AutoboxingTest
The output is:
Compiled from "AutoboxingTest.java"
public class AutoboxingTest extends java.lang.Object{
public AutoboxingTest();
Code:
0: aload_0
1: invokespecial #1; //Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: iconst_3
1: invokestatic #2; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
4: astore_1
5: aload_1
6: invokevirtual #3; //Method java/lang/Integer.intValue:()I
9: istore_2
10: return
}
Thus, as you can see, autoboxing invokes the static method Integer.valueOf(), and autounboxing invokes intValue() on the given Integer object. There's nothing else, really - it's just syntactic sugar.
I came up with a unit test that proves that Integer.valueOf() is called instead of the wrapper's constructor.
import static org.junit.Assert.assertNotSame;
import static org.junit.Assert.assertSame;
import org.junit.Test;
public class Boxing {
@Test
public void boxing() {
assertSame(5, 5);
assertNotSame(1000, 1000);
}
}
If you look up the API doc for Integer#valueOf(int), you'll see it was added in JDK 1.5. All the wrapper types (that didn't already have them) had similar methods added to support autoboxing. For certain types there is an additional requirement, as described in the JLS:
If the value p being boxed is true, false, a byte, a char in the range \u0000 to \u007f, or an int or short number between -128 and 127, then let r1 and r2 be the results of any two boxing conversions of p. It is always the case that r1 == r2. §5.1.7
It's interesting to note that longs aren't subject to the same requirement, although Long values in the -128..127 range are cached in Sun's implementation, just like the other integral types.
I also just discovered that in my copy of The Java Programming Language, it says char values from \u0000 to \u00ff are cached, but of course the upper limit per the spec is \u007f (and the Sun JDK conforms to the spec in this case).
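The caching requirement quoted above can be checked directly. The following is a minimal sketch; the class BoxingCacheDemo and its helper methods are hypothetical names. Only the results for values in the -128 to 127 range are guaranteed by the specification, so no output is claimed for the identity comparison of larger values, which depends on the JVM and its settings.

```java
class BoxingCacheDemo {
    // Boxes the same int twice and compares the references.
    static boolean sameBox(int v) {
        Integer a = v; // compiled to Integer.valueOf(v)
        Integer b = v;
        return a == b;
    }

    // Boxes the same int twice and compares the values.
    static boolean equalBox(int v) {
        Integer a = v;
        Integer b = v;
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(sameBox(127));   // true, required by JLS 5.1.7
        System.out.println(sameBox(-128));  // true, required by JLS 5.1.7
        System.out.println(sameBox(1000));  // implementation dependent
        System.out.println(equalBox(1000)); // true
    }
}
```

This is the usual argument for comparing boxed values with equals() rather than ==.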
I'd recommend getting something like jad and decompiling code a lot. You can learn quite a bit about what java's actually doing.
