Scala - How is val immutability guaranteed at run time - java

When we declare a final field in Java, it is guaranteed that it cannot be changed even at run time, because the JVM enforces it.
Java class:
public class JustATest {
public final int x = 10;
}
Javap decompiled:
Compiled from "JustATest.java"
public class JustATest {
public final int x;
public JustATest();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: aload_0
5: bipush 10
7: putfield #2 // Field x:I
10: return
}
But in Scala, if we declare a val, it compiles into a plain int field, and the decompiled output shows no difference between the val and var fields.
Original Scala class:
class AnTest {
val x = 1
var y = 2
}
Decompiled output:
Compiled from "AnTest.scala"
public class AnTest {
public int x();
Code:
0: aload_0
1: getfield #14 // Field x:I
4: ireturn
public int y();
Code:
0: aload_0
1: getfield #18 // Field y:I
4: ireturn
public void y_$eq(int);
Code:
0: aload_0
1: iload_1
2: putfield #18 // Field y:I
5: return
public AnTest();
Code:
0: aload_0
1: invokespecial #25 // Method java/lang/Object."<init>":()V
4: aload_0
5: iconst_1
6: putfield #14 // Field x:I
9: aload_0
10: iconst_2
11: putfield #18 // Field y:I
14: return
}
With that information, is the immutability of a val enforced only at compile time by the Scala compiler? How is it guaranteed at run time?

In Scala, immutability via val is enforced at compile time and has nothing to do with the emitted bytecode. In Java, marking a field final is what prevents it from being reassigned. In Scala, declaring a member with val only means it can't be reassigned; it can still be overridden. If you want the member to be final, you need to say so, as you do in Java:
class AnTest {
final val x = 10
}
Which yields:
public class testing.ReadingFile$AnTest$1 {
private final int x;
public final int x();
Code:
0: bipush 10
2: ireturn
public testing.ReadingFile$AnTest$1();
Code:
0: aload_0
1: invokespecial #19 // Method java/lang/Object."<init>":()V
4: return
}
Which is equivalent to the byte code you see in Java, except the compiler has emitted a getter for x.
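If you want to check whether the final flag actually made it into a class file, one option is to read the field's modifiers through reflection. The following is only a sketch in plain Java: Holder and isFinalField are made-up names for illustration, and the same check could be pointed at a compiled Scala class, which is not done here.

```java
import java.lang.reflect.Modifier;

// Hypothetical demo class, not taken from the question.
final class FinalFlagDemo {
    static class Holder {
        final int fixed = 10; // compiled with the final flag set
        int open = 2;         // compiled without the final flag
    }

    // Reports whether the named declared field carries the final flag.
    static boolean isFinalField(Class<?> type, String name) {
        try {
            return Modifier.isFinal(type.getDeclaredField(name).getModifiers());
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("no such field: " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("fixed final? " + isFinalField(Holder.class, "fixed"));
        System.out.println("open final?  " + isFinalField(Holder.class, "open"));
    }
}
```

This reports what the class file says, which is a separate matter from what the source language lets you write.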

The really simple answer is: there are some Scala features which can be encoded in JVM bytecode, and some which can't.
In particular, there are some constraints which cannot be encoded in JVM bytecode, e.g. sealed, private[this], or val. This means that if you get your hands on the compiled JVM bytecode of a Scala source file, you can do things that Scala itself would not allow, by interacting with the code through a language that is not Scala.
This is not specific to the JVM backend. You have similar, and even more pronounced, problems with Scala.js, since the compilation target there (ECMAScript) offers even fewer ways of expressing constraints than JVM bytecode does.
But really, this is just a general problem: I can take a language as safe and pure as Haskell, compile it to native code, and if I get my hands on the compiled binary, all safety will be lost. In fact, most Haskell compilers perform (almost) complete type erasure, so there are literally no types, and no type constraints left after compilation.
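The same effect can be shown without leaving the JVM. The sketch below assumes a made-up Box class whose source offers no way to reassign value, and then overwrites the field through reflection anyway. It uses a private non-final field on purpose; how final fields react to reflective writes depends on the JDK and is not covered here.

```java
import java.lang.reflect.Field;

// Hypothetical demo class, not taken from the answer above.
final class ReadOnlyBypassDemo {
    // The source exposes a getter only; nothing in it ever reassigns 'value'.
    static class Box {
        private int value = 1;
        int value() { return value; }
    }

    // Overwrites the private field from outside, ignoring the source-level rule.
    static int overwrite(Box box, int newValue) {
        try {
            Field f = Box.class.getDeclaredField("value");
            f.setAccessible(true);   // switch off the language access check
            f.setInt(box, newValue); // write the field directly
            return box.value();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Box box = new Box();
        System.out.println("before: " + box.value());
        System.out.println("after:  " + overwrite(box, 42));
    }
}
```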


Java returning assignment vs assigning then returning. (E.g. in Singletons)

Basically, you can have code like this:
From: Baeldung.com
public static synchronized ClassSingleton getInstance() {
if(instance == null) {
instance = new ClassSingleton();
}
return instance;
}
and you can have code like this:
My Version:
public static synchronized ClassSingleton getInstance() {
return instance = (instance == null) ? new ClassSingleton() : instance;
}
My version looks cleaner to me in every way, but SonarLint's rule java:S1121 flags it as non-compliant (major, code smell).
So is there more behind this than the readability concern SonarLint is talking about?
I have a strange feeling that my version always performs an assignment before returning. Could this hurt performance?
Happy to hear what you guys have to say.
Yes it may have [probably negligible] performance implications. This can be tested by just trying to compile it and see what comes out. Here is the bytecode from Javac 16.0.1:
Compiled from "ClassSingleton.java"
public class ClassSingleton {
public ClassSingleton();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static synchronized ClassSingleton getInstance();
Code:
0: getstatic #7 // Field instance:LClassSingleton;
3: ifnonnull 16
6: new #8 // class ClassSingleton
9: dup
10: invokespecial #13 // Method "<init>":()V
13: putstatic #7 // Field instance:LClassSingleton;
16: getstatic #7 // Field instance:LClassSingleton;
19: areturn
public static synchronized ClassSingleton getInstanceShort();
Code:
0: getstatic #7 // Field instance:LClassSingleton;
3: ifnonnull 16
6: new #8 // class ClassSingleton
9: dup
10: invokespecial #13 // Method "<init>":()V
13: goto 19
16: getstatic #7 // Field instance:LClassSingleton;
19: dup
20: putstatic #7 // Field instance:LClassSingleton;
23: areturn
}
By the age-old metric of long=bad, the second version is clearly worse. In all seriousness though, it only executes two extra instructions once instance is already set (and one extra on the very first call). The dup duplicates the top item on the stack so that putstatic can consume the copy and assign it to instance, while the original stays on the stack for areturn. We can speculate that if another thread accesses instance without synchronizing on ClassSingleton, then in theory we might see some weird behavior where instance appears not to be set correctly. However, that seems highly unlikely, considering that synchronized handles most of that for us.
In the end though, the JIT compiler will probably remove the need for dup by using registers, and it has a decent chance of figuring out that it can get rid of that extra putstatic as well. However, I am not experienced enough with the JIT compiler to do more than speculate on how it might act.
That being said, just use the first version. It is way easier to read and generates shorter bytecode.
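For completeness, here is a small sketch that puts both forms in one class and checks that they hand back the same object. The constructed counter and constructedCount() are additions for the demo only; they are not part of either original snippet.

```java
// Hypothetical demo class combining both getter forms from the question.
final class SingletonForms {
    private static SingletonForms instance;
    private static int constructed = 0; // counts constructor runs, demo only

    private SingletonForms() {
        constructed++;
    }

    // Form 1: assign only while the field is still null.
    static synchronized SingletonForms getInstance() {
        if (instance == null) {
            instance = new SingletonForms();
        }
        return instance;
    }

    // Form 2: assign on every call and return the assigned value.
    static synchronized SingletonForms getInstanceShort() {
        return instance = (instance == null) ? new SingletonForms() : instance;
    }

    static synchronized int constructedCount() {
        return constructed;
    }

    public static void main(String[] args) {
        System.out.println("same object: " + (getInstance() == getInstanceShort()));
        System.out.println("constructor runs: " + constructedCount());
    }
}
```

Both getters are functionally interchangeable here; the difference is only the redundant write in the short form.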

Understanding dynamic polymorphism byte code

I am a novice to Java byte code and would like to understand the following byte code of Dispatch.class relative to Dispatch.java source code below :
Compiled from "Dispatch.java"
class Dispatch {
Dispatch();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: new #2 // class B
3: dup
4: invokespecial #3 // Method B."<init>":()V
7: astore_1
8: aload_1
9: invokevirtual #4 // Method A.run:()V
12: return
}
//=====================Dispatch.java==============================
class Dispatch{
public static void main(String args[]){
A var = new B();
var.run(); // prints : This is B
}
}
//======================A.java===========================
public class A {
public void run(){
System.out.println("This is A");
}
}
//======================B.java===========================
public class B extends A {
public void run(){
System.out.println("This is B");
}
}
After doing some reading on the internet I got a first grasp of how the JVM stack and opcodes work. However, I still do not get what these instructions are for:
3: dup //what are we duplicating here exactly?
4: invokespecial #3 //what does the #3 in operand stand for?
invokevirtual VS invokespecial //what difference there is between these opcodes?
It really sounds like you need to read the docs some more, but to answer your updated questions,
dup duplicates the top value on the operand stack. In this case, it would be the uninitialized B object that was pushed by the previous new instruction.
The #3 means that invokespecial is operating on the 3rd slot in the classfile's constant pool. This is where the method to be invoked is specified. You can see the constant pool by passing -c -verbose to javap.
invokevirtual is used for ordinary (non-interface) virtual method calls, ignoring default interface methods for the moment. invokespecial is used for a variety of special cases: private method calls, constructor invocations, and superclass method calls.
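To see the two opcodes side by side, a sketch along these lines can be compiled and run through javap -c. Base, Derived, call and runParent are made-up names; the methods return strings instead of printing so the behavior is easy to check.

```java
// Hypothetical demo class, loosely modeled on the A/B classes in the question.
final class DispatchDemo {
    static class Base {
        String run() { return "This is Base"; }
    }

    static class Derived extends Base {
        @Override
        String run() { return "This is Derived"; }

        // super.run() compiles to invokespecial: it always targets Base.run.
        String runParent() { return super.run(); }
    }

    // ref.run() compiles to invokevirtual: the target depends on the runtime class.
    static String call(Base ref) {
        return ref.run();
    }

    public static void main(String[] args) {
        Base ref = new Derived(); // new, dup, invokespecial <init>, astore
        System.out.println(call(ref));
        System.out.println(new Derived().runParent());
    }
}
```

The static type of ref is Base, which is why the constant pool entry names the base class method, yet the overriding method is the one that runs.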

How is the java "this" keyword implemented?

How does the this pointer point to the object itself? Is it a Java implementation or is it a compiler implementation?
In the JVM bytecode, local variable 0 (basically register 0) points to the current object when a method is invoked. The compiler simply uses this as an alias for local variable 0.
So I guess the answer is that the compiler implements this.
Sounds like a philosophical question. I am not sure what a Java implementation is.
this is defined in the JLS and is a keyword in Java, and the compiler has to comply with that standard. When you have a method like
object.method(args)
what is actually called in byte code is a method which looks like
method(object, args);
where this is the first argument.
At the JVM level, the parameters don't have names, and the JIT could optimise the argument away if it's not actually used.
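The "receiver is just the first argument" view is visible from Java itself. In this sketch (Greeter, greet and greetStatic are made-up names), an instance method, a static method taking the receiver explicitly, and an unbound method reference all do the same thing.

```java
import java.util.function.Function;

// Hypothetical demo class, not taken from the answers above.
final class ReceiverDemo {
    static class Greeter {
        private final String name;

        Greeter(String name) { this.name = name; }

        // Instance method: 'this' is supplied implicitly by the caller.
        String greet() { return "Hello, " + this.name; }

        // Static counterpart: the receiver is an ordinary first parameter.
        static String greetStatic(Greeter self) { return "Hello, " + self.name; }
    }

    public static void main(String[] args) {
        Greeter g = new Greeter("world");
        // An unbound method reference takes the receiver as its first argument.
        Function<Greeter, String> unbound = Greeter::greet;

        System.out.println(g.greet());
        System.out.println(Greeter.greetStatic(g));
        System.out.println(unbound.apply(g));
    }
}
```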
Well if you are interested why not look at the byte code generated by the compiler
class HelloWorld
{
private String hello = "Hello world!";
private void printHello(){
System.out.println (this.hello);
}
public static void main (String args[]){
HelloWorld hello = new HelloWorld();
hello.printHello();
}
}
Compile using
%JAVA_HOME%/bin/javac HelloWorld.java
Get bytecode using
javap -c HelloWorld
Edit: added the output
HelloWorld();
Code:
0: aload_0
1: invokespecial #1; //Method java/lang/Object."<init>":()
4: aload_0
5: ldc #2; //String Hello world!
7: putfield #3; //Field hello:Ljava/lang/String;
10: return
public static void main(java.lang.String[]);
Code:
0: new #6; //class HelloWorld
3: dup
4: invokespecial #7; //Method "<init>":()V
7: astore_1
8: aload_1
9: invokespecial #8; //Method printHello:()V
12: return
}

jvm differences between synchronized and non-synchronized methods

I have the following class:
public class SeqGenerator {
int last = 0;
volatile int lastVolatile = 0;
public int getNext() {
return last++;
}
public synchronized int getNextSync() {
return last++;
}
public int getNextVolatile() {
return lastVolatile++;
}
public void caller() {
int i1 = getNext();
int i2 = getNextSync();
int i3 = getNextVolatile();
}
}
When I look at the disassembled code I don't see the difference between the representation of three methods getNext(), getNextSync() and getNextVolatile() .
public int getNext();
Code:
0: aload_0
1: dup
2: getfield #2; //Field last:I
5: dup_x1
6: iconst_1
7: iadd
8: putfield #2; //Field last:I
11: ireturn
public synchronized int getNextSync();
Code:
0: aload_0
1: dup
2: getfield #2; //Field last:I
5: dup_x1
6: iconst_1
7: iadd
8: putfield #2; //Field last:I
11: ireturn
public int getNextVolatile();
Code:
0: aload_0
1: dup
2: getfield #3; //Field lastVolatile:I
5: dup_x1
6: iconst_1
7: iadd
8: putfield #3; //Field lastVolatile:I
11: ireturn
public void caller();
Code:
0: aload_0
1: invokevirtual #4; //Method getNext:()I
4: istore_1
5: aload_0
6: invokevirtual #5; //Method getNextSync:()I
9: istore_2
10: aload_0
11: invokevirtual #6; //Method getNextVolatile:()I
14: istore_3
15: return
How can the JVM distinguish between these methods?
The generated code is the same for these methods and also for their callers. How does the JVM perform the synchronization?
The synchronized keyword applied to a method just sets the ACC_SYNCHRONIZED flag on that method definition, as defined in the JVM specification § 4.6 Methods. It won't be visible in the actual bytecode of the method.
The JLS § 8.4.3.6 synchronized Methods discusses the similarity of defining a synchronized method and declaring a synchronized block that spans the whole method body (and using the same object to synchronize on): the effect is exactly the same, but they are represented differently in the .class file.
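A quick way to convince yourself that the two forms lock the same object is Thread.holdsLock. This is only a sketch; the class and method names are invented for the demo.

```java
// Hypothetical demo class, not taken from the question.
final class SyncFormsDemo {
    // Synchronized method: marked by a flag, no monitor instructions in the body.
    synchronized boolean viaMethod() {
        return Thread.holdsLock(this);
    }

    // Synchronized block: explicit monitorenter/monitorexit in the bytecode.
    boolean viaBlock() {
        synchronized (this) {
            return Thread.holdsLock(this);
        }
    }

    // No locking at all, for comparison.
    boolean unlocked() {
        return Thread.holdsLock(this);
    }

    public static void main(String[] args) {
        SyncFormsDemo demo = new SyncFormsDemo();
        System.out.println("method holds lock: " + demo.viaMethod());
        System.out.println("block holds lock:  " + demo.viaBlock());
        System.out.println("plain holds lock:  " + demo.unlocked());
    }
}
```

Running javap -c on this class shows the monitor instructions only in viaBlock.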
A similar effect happens with volatile fields: it simply sets the ACC_VOLATILE flag on the field (JVM § 4.5 Fields). The code that accesses the field uses the same bytecode, but behaves slightly differently.
Also please note that using only a volatile field here is not threadsafe, because x++ on a volatile field x is not atomic!
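If an atomic increment is what you are after, java.util.concurrent.atomic.AtomicInteger is one common option. The sketch below is not the asker's class; AtomicSeqDemo and hammer are made-up names, and hammer exists only to exercise the counter from several threads.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not taken from the question.
final class AtomicSeqDemo {
    private final AtomicInteger last = new AtomicInteger(0);

    // getAndIncrement is one atomic read-modify-write, unlike x++ on a volatile.
    int getNext() {
        return last.getAndIncrement();
    }

    // Runs 'threads' workers calling getNext() 'perThread' times each,
    // then returns the next value, which equals the number of earlier calls.
    static int hammer(int threads, int perThread) {
        AtomicSeqDemo gen = new AtomicSeqDemo();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    gen.getNext();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return gen.getNext();
    }

    public static void main(String[] args) {
        System.out.println("next value after 4 x 10000 calls: " + hammer(4, 10_000));
    }
}
```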
The difference between the first two is right here:
public int getNext();
bytecodes follow...
public synchronized int getNextSync();
bytecodes follow...
As to the last one, volatile is a property of the variable, not of the method or the JVM bytecodes that access that variable. If you look at the top of the javap output, you'll see the following:
int last;
volatile int lastVolatile;
If/when the bytecodes are compiled into machine code by the JIT compiler, I am sure the resulting machine code will differ for the last method.
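Since both pieces of information live in flags rather than in the method bodies, they can also be read back through reflection. A sketch, with Seq as a cut-down stand-in for the asker's class and the two helper names invented here:

```java
import java.lang.reflect.Modifier;

// Hypothetical demo class; Seq is a reduced copy of the class in the question.
final class FlagsDemo {
    static class Seq {
        int last = 0;
        volatile int lastVolatile = 0;

        int getNext() { return last++; }
        synchronized int getNextSync() { return last++; }
    }

    // True if the named no-argument method of Seq is declared synchronized.
    static boolean isSynchronizedMethod(String name) {
        try {
            return Modifier.isSynchronized(Seq.class.getDeclaredMethod(name).getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // True if the named field of Seq is declared volatile.
    static boolean isVolatileField(String name) {
        try {
            return Modifier.isVolatile(Seq.class.getDeclaredField(name).getModifiers());
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("getNext synchronized?     " + isSynchronizedMethod("getNext"));
        System.out.println("getNextSync synchronized? " + isSynchronizedMethod("getNextSync"));
        System.out.println("last volatile?            " + isVolatileField("last"));
        System.out.println("lastVolatile volatile?    " + isVolatileField("lastVolatile"));
    }
}
```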

Which of those two pieces of code is better/faster/uses less memory?

Which one is more optimal or is there any difference at all?
String s = methodThatReturnsString();
int i = methodThatReturnsInt();
thirdMethod(s, i);
or
thirdMethod(methodThatReturnsString(), methodThatReturnsInt());
By optimal I mean optimal in the terms of memory usage etc.
It has nothing to do with optimization here, but it's more a question of readability of your code...
Which one is more optimal?
The one which is easier to read :-)
I would think that any difference is optimized away when compiled (provided that the declared variables are not used afterwards - i.e. the solutions are otherwise identical).
I highly suspect that both forms are identical, but don't take my word for it. Let's find out ourselves! :D
public class Tests {
public void test1() {
String s = methodThatReturnsString();
int i = methodThatReturnsInt();
thirdMethod(s, i);
}
public void test2() {
thirdMethod(methodThatReturnsString(), methodThatReturnsInt());
}
public String methodThatReturnsString() {
return "";
}
public int methodThatReturnsInt() {
return 0;
}
public void thirdMethod(String s, int i) {
}
}
Let's compile it:
> javac -version
javac 1.6.0_17
> javac Tests.java
Now, let's print out the bytecode instructions!
> javap -c Tests
Compiled from "Tests.java"
public class Tests extends java.lang.Object{
public Tests();
Code:
0: aload_0
1: invokespecial #1; //Method java/lang/Object."<init>":()V
4: return
public void test1();
Code:
0: aload_0
1: invokevirtual #2; //Method methodThatReturnsString:()Ljava/lang/String;
4: astore_1
5: aload_0
6: invokevirtual #3; //Method methodThatReturnsInt:()I
9: istore_2
10: aload_0
11: aload_1
12: iload_2
13: invokevirtual #4; //Method thirdMethod:(Ljava/lang/String;I)V
16: return
public void test2();
Code:
0: aload_0
1: aload_0
2: invokevirtual #2; //Method methodThatReturnsString:()Ljava/lang/String;
5: aload_0
6: invokevirtual #3; //Method methodThatReturnsInt:()I
9: invokevirtual #4; //Method thirdMethod:(Ljava/lang/String;I)V
12: return
public java.lang.String methodThatReturnsString();
Code:
0: ldc #5; //String
2: areturn
public int methodThatReturnsInt();
Code:
0: iconst_0
1: ireturn
public void thirdMethod(java.lang.String, int);
Code:
0: return
}
I thought this looked a bit strange - test1() and test2() are different. It looks like the compiler is adding the debugging symbols. Perhaps this is forcing it to explicitly assign return values to the local variables, introducing extra instructions.
Let's try recompiling it with no debugging:
> javac -g:none Tests.java
> javap -c Tests
public class Tests extends java.lang.Object{
public Tests();
Code:
0: aload_0
1: invokespecial #1; //Method java/lang/Object."<init>":()V
4: return
public void test1();
Code:
0: aload_0
1: invokevirtual #2; //Method methodThatReturnsString:()Ljava/lang/String;
4: astore_1
5: aload_0
6: invokevirtual #3; //Method methodThatReturnsInt:()I
9: istore_2
10: aload_0
11: aload_1
12: iload_2
13: invokevirtual #4; //Method thirdMethod:(Ljava/lang/String;I)V
16: return
public void test2();
Code:
0: aload_0
1: aload_0
2: invokevirtual #2; //Method methodThatReturnsString:()Ljava/lang/String;
5: aload_0
6: invokevirtual #3; //Method methodThatReturnsInt:()I
9: invokevirtual #4; //Method thirdMethod:(Ljava/lang/String;I)V
12: return
public java.lang.String methodThatReturnsString();
Code:
0: ldc #5; //String
2: areturn
public int methodThatReturnsInt();
Code:
0: iconst_0
1: ireturn
public void thirdMethod(java.lang.String, int);
Code:
0: return
}
Inconceivable!
So, according to my compiler (Sun JDK), the bytecode is shorter for the second version. However, the virtual machine will probably optimize any differences away. :)
Edit: Some extra clarification courtesy of Joachim Sauer's comment:
It's important to note that the byte code tells only half the story: How it is actually executed depends a lot on the JVM (that's quite different to C/C++, where you can see the assembler code and it's exactly how it's executed). I think you realize that, but I think it should be made clearer in the post.
I would prefer the first option. However this has nothing to do with speed, but with debuggability. In the second option I can not easily check what the values of s and i are. Performance-wise this will not make any difference at all.
There shouldn't be any difference. Both the temporarily used String and int have to reside somewhere, and Java is, internally, a stack machine. So regardless of whether you give the return values of those method calls names or not, they have to be stored on the stack prior to execution of thirdMethod(String, int).
Implications of that for the resulting JITted code can be hard to find. That's on a completely different level of abstraction.
If in doubt, profile. But I wouldn't expect any difference here.
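A very crude way to look at this is a timing loop like the one below. Treat it strictly as a sketch: hand-rolled loops like this are easily distorted by JIT warm-up and dead-code elimination, and a proper harness such as JMH is the usual recommendation for numbers you want to trust. The method bodies, time, withLocals and inlined are made up for the demo.

```java
import java.util.function.IntSupplier;

// Hypothetical demo class; method bodies are invented so there is something to call.
final class CallFormsDemo {
    static String methodThatReturnsString() { return "abc"; }
    static int methodThatReturnsInt() { return 7; }
    static int thirdMethod(String s, int i) { return s.length() + i; }

    // Form 1: results stored in named locals first.
    static int withLocals() {
        String s = methodThatReturnsString();
        int i = methodThatReturnsInt();
        return thirdMethod(s, i);
    }

    // Form 2: calls nested directly in the argument list.
    static int inlined() {
        return thirdMethod(methodThatReturnsString(), methodThatReturnsInt());
    }

    // Crude timing loop: elapsed nanoseconds for 'iterations' calls of 'body'.
    static long time(IntSupplier body, int iterations) {
        long sink = 0;
        long start = System.nanoTime();
        for (int n = 0; n < iterations; n++) {
            sink += body.getAsInt();
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) System.out.print(""); // keep 'sink' observably used
        return elapsed;
    }

    public static void main(String[] args) {
        int iterations = 5_000_000;
        time(CallFormsDemo::withLocals, iterations); // warm-up runs, results ignored
        time(CallFormsDemo::inlined, iterations);
        System.out.println("with locals: " + time(CallFormsDemo::withLocals, iterations) + " ns");
        System.out.println("inlined:     " + time(CallFormsDemo::inlined, iterations) + " ns");
    }
}
```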
It is the same thing. In both cases the same functions will be called and variables (automatic or explicitly defined) will be allocated. The only difference is that in the second case the values will be ready to be garbage collected right away, whereas in the first one you need to wait for the variables to go out of scope.
Of course however the first one is much more readable.
There is no difference at all. In this case, you might want to consider readability and clearness.
Experiment and measure. If speed is what matters, measure speed. If memory usage matters, measure memory usage. If number of bytecode instructions is what matters, count bytecode instructions. If code readability is what matters, measure code readability. Figuring out how to measure code readability is your homework.
If you don't experiment and measure all you get will be opinion and argument.
Or, if you are very lucky, someone on SO will run your experiments for you.
PS This post is, of course, my opinion and argument
thirdMethod(methodThatReturnsString(), methodThatReturnsInt());
is more optimal...
