Im learning about GraalVM and AOT and I was reading the spec for AOT and then I got confused, if AOT compiles my code to machine code (native) why do I need a JVM?
why I need this:
java -XX:AOTLibrary=./libHelloWorld.so HelloWorld
You still need the JVM, because you just compiled a small portion i.e. "your HelloWorld" to native code. You still require a lot of the JVM to run your program. e.g. the Java standard library (which you haven't compiled to native code), class loading, detection of the program entry point (finding your main method), and garbage collection. All of this is provided by the JVM.
In short, you just compiled a library, a small part of your program, to native code. You don't compile the complete program to native code.
This is also stated by the summary of JEP295:
Summary
Compile Java classes to native code prior to launching the virtual machine.
I think what you actually want to do, is to compile a native image of your program. This would include all the implications like garbage collection from the JVM into the executable.
Related
This question already has answers here:
Why a virtual machine is needed execute a java program.? [duplicate]
(3 answers)
Closed 2 years ago.
I am trying to wrap my head around the Java Virtual Machine and why it uses bytecode. I know it has been asked so many times, but somehow I couldn't finally make the correct assumption, so I researched many things and decided to explain how I think it works and if it's correct.
I understand that in C++, compiler compiles the source code on the specific (architecture + operating system). So, compiled version of C++ for (x86 + Windows) won't run on any other architecture or operating system.
My assumptions
When Java compiler compiles the source code into bytecode, It doesn't do the compilation depending on architecture or operating system. The source code will always be compiled to the same bytecode if it's compiled on Windows or Mac. Let's say we compiled and now, send the bytecode to another computer (x86 + windows). In order for that computer to run this bytecode, It needs JVM. Now, JVM knows what architecture + operating system it's running on. (x86 + windows). So, JVM will compile bytecode to x86 + Windows and it will produce machine code which can be run by the actual computer now.
So, even though we use Java Virtual Machine, we still run the actual machine code on our operating system and not on the virtual machine. Virtual Machine just helps us to transform bytecode into machine code.
This means that when using Java, the only thing we have to worry about is installing JVM and that's it.
I just always thought that the Virtual Machine is just a computer itself where it would run the code in its own isolated place, but in case of JVM, i don't think that's correct, because I think machine code JVM produces still has to be run on the actual operating system we have.
Do you think my assumptions are correct?
When Java compiler compiles the source code into bytecode, It doesn't do the compilation depending on architecture or operating system. The source code will always be compiled to the same bytecode if it's compiled on Windows or Mac.
All correct.
Let's say we compiled and now, send the bytecode to another computer (x86 + windows). In order for that computer to run this bytecode, It needs JVM. Now, JVM knows what architecture + operating system it's running on. (x86 + windows). So, JVM will compile bytecode to x86 + windows and it will produce machine code which can be run by the actual computer now.
This is mostly correct, but there are a couple things that are a bit "off"
First of all, to execute the bytecodes on any computer you need a JVM. That includes the computer on which you compiled the bytecodes.
(It is theoretically possible that a computer could be designed and implemented to execute the JVM bytecode instruction set as its native instruction set. But I don't know if anyone has ever seriously contemplated doing this. It would be pointless. Performance would not be comparable with hardware that you can by for a couple of hundred dollars. The JVM bytecode instruction set is designed to be compact and simple, and it is relatively easy to JIT compile. Not to be executed efficiently.)
Secondly a typical JVM actually operates in two modes:
It starts out executing the bytecodes in software using an interpreter.
After a bit, it selectively compiles bytecodes of heavily used methods to the platform's native instruction set and executes the native code. The compilation is done using a JIT compiler.
Note that the JIT compiler is platform specific.
So, even though, we use Java Virtual Machine, we still run the actual machine code on our operating system and not on the virtual machine.
That is correct.
[The Java] Virtual Machine just helps us to transform bytecode into machine code.
The JVM actually does a lot more. Things like:
Garbage collection
Bytecode loading and verification
Implementing reflection
Providing native code methods for bridging between Java classes and operating system functionality
Implementing infrastructure for monitoring, profiling, debugging and so on.
This means that when using Java, the only thing we have to worry about is installing JVM and that's it.
Yes. But in a modern JDK there are other alternatives; e.g. jlink will generate an executable that has a cut-down JRE embedded in it so that you don't need to install a JRE. And GraalVM supports ahead of time (AOT) compilation.
I just always thought that the Virtual Machine is just a computer itself where it would run the code in its own isolated place, but in case of JVM, i don't think that's correct, because I think machine code JVM produces still has to be run on the actual operating system we have.
Ah yes.
The term "virtual machine" has multiple meanings:
A Java Virtual Machine "executes" Java bytecode ... in the sense above.
A Linux or Windows virtual machine is where the user's application and the guest operating system are running under the control of a "hypervisor" operating system. The applications and guest OS use the native hardware to execute instructions, but they don't have full control of the hardware.
And there are potentially other shades of meaning.
If you conflate JVMs with other kinds of virtual machine, you can get yourself in knots. Don't. They are different enough that conflating the concepts in not going to help you understand.
When compiling java code, java-bytecode is created.
This bytecode is platform independent and can be run by the JVM. This means that other programming languages like Kotlin can also generate this java-bytecode and target the JVM.
So you are correct about the java-bytecode being the same for every platform.
Where you aren't entirely correct however is, that the JVM converts the java-bytecode to native bytecode.
The JVM executes the java-bytecode instead and gives the instructions to the operating system. While executing, the JVM compiles bunches of java-bytecode to native code, which the operating system can then execute. This happens while the code is executed so a class that is never loaded will also never be executed and will never end up as machine code.
Due to this we can use features like Reflection where some of the java-bytecode of a class files can be modified at runtime before being compiled to native machine code.
The JVM basically sits between the operating system and the java-bytecode. In some sense just like a normal VM.
The article below has a nice visualization of the JVM.
https://www.guru99.com/java-virtual-machine-jvm.html
The java byte code is a form of intermediate code.
That code can be interpreted, a JVM is such an interpreter emulating (simulating) every single byte code instruction. This is slow but easily portable.
That code can be compiled to native code, normally a JVM contains a just-in-time compiler for code that is indeed executed. This native code can be optimized for the current machine it is running on, and the native code does not need to be loaded from file, and code inlining can be done.
So the JVM principly is a turbo-charged interpreter, of java byte code, portable to many platforms.
Context:
[Other languages] Microsoft's C# language is more of a compiler.
[Other intermediate codes] LLVM IR is an intermediate code from the C side, compilable.
[Other JVMs] There are more than one JVM, with different techniques.
I had this question in software course:
True/False: The Java interpreter converts files from a byte-code format to executable files.
I think the statement is false. In class, they said the interpreter "executes" the byte-code files, on the system using the JVM (I didn't listen too much but I think I got it fairly correctly), but as I understood, it doesn't actually convert it to executable files (which presumably are .exe files), just runs it on the system directly.
"True/False: The Java interpreter converts files from a byte-code format to executable files".
The answer is false1.
The Java interpreter is one of the two components of the JVM that is responsible for executing Java code. It does it by "emulating" the execution of the Java Virtual Machine instructions (bytecodes); i.e. by pretending to be a "real" instance of the virtual machine.
The other JVM component that is involved is the Just In Time (JIT) compiler. This identifies Java methods that have been interpreted for a significant amount of time, and does an on-the-fly compilation to native code. This native code is then executed instead of interpreting the bytecodes.
But the JIT compiler does not write the compiled native code to the file system. Instead it writes it directly into a memory segment ready to be executed.
Java's interpret / JIT compile is more complicated, but it has a couple of advantages:
It means that it is not necessary to compile bytecodes to native code before the application can be run, which removes a significant impediment to portability.
It allows the JVM to gather runtime statistics on how the application is functioning, which can give hints as to the best way to optimize the native code. The result is faster execution for long-running applications.
The downside is that JIT compilation is one of the factors that tends to make Java applications slow to start (compared with C / C++ for example).
1 - ... for mainstream Java (tm) compilers. Android isn't Java (tm)2. Note that the first version of Java was interpreter only. I have also seen Java (not tm) implementations where the native code compilers were either ahead-of-time or eager ... or a combination of both.
2 - You are only permitted by Oracle to describe your "java-like" implementation as Java(tm) if it passes the Java compliance tests. Android wouldn't.
The Java compiler converts the source code to bytecode. This bytecode is then interpreted (or just-in-time-compiled and then executed) by the JVM. This bytecode is a kind of intermediate language that has not platform dependence. The virtual machine then is the layer that provides system specific functionality.
It is also possible to compile Java code to native code, a project aiming this is for example the GCJ.
To answer your question: no, a normal Java compiler does not emit an executable binary, but a set of classes that can be executed using a JVM. You can read more about this on Wikipedia.
False for regular JVMs. No executable files are created. The conversion from bytecode to native code for that platform takes place on the fly during execution. If the program is stopped, the compiled code is gone (was in memory only).
The new Android JVM ART does compile the bytecode into executables before to have better startup and runtime behavior. So ART creates files.
ART straddles an interesting mid-ground between compiled and interpreted code, called ahead-of-time (AOT) compilation. Currently with Android apps, they are interpreted at runtime (using the JIT), every time you open them up. This is slow. (iOS apps, by comparison, are compiled native code, which is much faster.) With ART enabled, each Android app is compiled to native code when you install it. Then, when it’s time to run the app, it performs with all the alacrity of a native app. http://www.extremetech.com/computing/170677-android-art-google-finally-moves-to-replace-dalvik-to-boost-performance-and-battery-life
The answer is false
reason:
JIT-just in time compiler and java interpreter does a same thing in different way but as per performance JIT wins. The main task is to convert the given bytecode into machine dependent Assembly language as of abstract information.Assembly level language is a low level language which understood by machine's assembler and after that assembler converts it to 01010111.....
I am trying to understand how .class files work in java and what's their purpose. I found some information online, but I get unsatisfying explanations.
As soon as we run the compiler we get the .class file, which is bytecode. Is this machine readable or not? And if not, this is why we need the interpreter for the program to run successfully?
Also, since the .class file is the equivalent of our .java programs, why can't somebody run a java program straight away by just running the .class file using VM and they would need to have the .java file as well?
The JVM is by definition a virtual machine, that is a software machine that simulates what a real machine does. Like real machines it has an instruction set (the bytecodes), a virtual computer architecture and an execution model. It is capable of running code written with this virtual instruction set, pretty much like a real machine can run machine code.
So, the class files contain the instructions in the virtual instruction set, and it is capable of running them. For that matter, a virtual machine can either interpret the code itself or compile it for the hardware architecture it is currently running. Some do both, some do just one of them (e.g. .net runtime compiles once the first time the method is called).
For instance, the Java HotSpot initially interprets bytecodes, and progressively compiles the code into machine code. This is called adaptive optimization. Some virtual machines always compile to machine code directly.
So, you can see there are two different "compiling concepts". One consists in the transformation of Java code to JVM bytecodes (From .java to .class). And a second compilation phase happens when the program runs, where the bytecodes may either be interpreted or compiled to actual machine code. This is done by the just-in-time compiler, within the JVM.
So, as you can see, a computer cannot run a Java program directly because the program is not written in a language that the computer understands. It is written in lingua-franca that all JVM implementations can understand. And there are implementations of that JVM for many operating systems and hardware architectures. These JVMs translate the programs in this lingua-franca (bytecodes) for any particular hardware (machine code). That's the beauty of the virtual machine.
The .class file is machine-readable. The machine that reads it is the Java Virtual Machine, which interprets it and compiles it to native code (executable by your computer).
You don't need the .java files to run Java code. The .class files are all you need.
It's machine readable, but does not execute on the bare hardware. It's run through the Java Virtual Machine which is an interpreter with a very high performance just-in time compiler. There are good reasons to have the interpreter only use the class file's bytecode. Briefly they are:
Easier to build the interpeter since the bytecode is much closer to instructions that can be turned into native machine code by the JIT.
Easier to resolve dependencies since the Java compiler does some syntactic sugar on them through the import command.
Java bytecode (.class file) is not directly executable.
It's an intermediate language that is interpreted by the underlying Java Virtual Machine. Of course some optimizations can happen (i.e. Just-in-time compilation).
To run a Java program you only need the bytecode files, .java files contains the source code.
Compiler Vs Interpreter:
Compiler Takes an entire program as
input
Interpreter Takes Single instructions as
input.
Intermediate Object Code is
Generated
No Intermediate Object Code is
Generated
Conditional Control Statements are
Executed faster
Conditional Control Statements are
Executed slower
Memory Requirement: More
(Since Object Code is Generated)
Memory Requirement: Less
Program need not be compiled every
time
Every time higher level program is
converted into lower level program
Errorsare displayed after entire
program is checked
Errors are displayed for every
instruction interpreted (if any)
Example: C Compiler
Example: BASIC
I have a very basic question about JVM: is it a compiler or an interpreter?
If it is an interpreter, then what about JIT compiler that exist inside the JVM?
If neither, then what exactly is the JVM? (I dont want the basic definition of jVM of converting byte code to machine specific code etc.)
First, let's have a clear idea of the following terms:
Javac is Java Compiler -- Compiles your Java code into Bytecode
JVM is Java Virtual Machine -- Runs/ Interprets/ translates Bytecode into Native Machine Code
JIT is Just In Time Compiler -- Compiles the given bytecode instruction sequence to machine code at runtime before executing it natively. Its main purpose is to do heavy optimizations in performance.
So now, Let's find answers to your questions:
JVM: is it a compiler or an interpreter?
An interpreter
What about JIT compiler that exist inside the JVM?
If you read this reply completely, you probably know it now.
What exactly is the JVM?
JVM is a virtual platform that resides on your RAM
Its component, Class loader loads the .class file into the RAM
The Byte code Verifier component in JVM checks if there are any access restriction violations in your code. (This is one of the principal reasons why java is secure)
Next, the Execution Engine component converts the Bytecode into executable machine code
It is a little of both, but neither in the traditional sense.
Modern JVMs take bytecode and compile it into native code when first needed. "JIT" in this context stands for "just in time." It acts as an interpreter from the outside, but really behind the scenes it is compiling into machine code.
The JVM should not be confused with the Java compiler, which compiles source code into bytecode. So it is not useful to consider it "a compiler" but rather to know that in the background it does do some compilation.
Like #delnan already stated in the comment section, it's neither.
JVM is an abstract machine running Java bytecode.
JVM has several implementations:
HotSpot (interpreter + JIT compiler)
Dalvik (interpreter + JIT compiler)
ART (AOT compiler + JIT compiler)
GCJ (AOT compiler)
JamVM (interpreter)
...and many others.
Most of the others answers when talking about JVM refer either to HotSpot or
some mixture of the above approaches to implementing the JVM.
It is both. It starts by interpreting bytecode and can (should it decide it is worth it) then compile that bytecode to native machine code.
It's both. It can interpret bytecode, and compile it to native code.
Javac is a compiler but not a traditional compiler.
A compiler typically converts source code to Machine level language for execution and that is done in a single shot i.e. entire code is taken and converted to machine level language at ONCE. (more on this below).
While, JavaC converts it to Bytecode instead of machine level language.
JIT is a Java compiler but also acts as an interpreter. A typical compiler will convert all the code at once from source code to machine level language. Instead, JIT goes line by line (line by line execution is a feature of Interpreters) and converts bytecode generated by JavaC into machine level language and executes it. JVM which has JIT in it has multiple implementations. Hotspot being one of the major ones for Java programming. Hotspot implementation makes JIT optimize the execution by converting chunks of code which are repetitive into Machine level language at once (like a compiler as mentioned above) so that they can be executed faster instead of converting each line of code 1 by 1.
So, the answer is not Black and White with respect to the typical definitions of Compiler and Interpreter.
This is my understanding after reading several online answers, blogs, etc. If somebody has suggestions to improve this understanding, please feel free to suggest.
JVM have both compiler and interpreter. Because the compiler compiles the code and generates bytecode. After that the interpreter converts bytecode to machine understandable code.
Example: Write and compile a program and it runs on Windows. Take the .class file to another OS (Unix) and it will run because of interpreter that converts the bytecode to machine understandable code.
In the past I have used C++ as a programming language. I know that the code written in C++ goes through a compilation process until it becomes object code "machine code".
I would like to know how Java works in that respect. How is the user written Java code run by the computer?
Java implementations typically use a two-step compilation process. Java source code is compiled down to bytecode by the Java compiler. The bytecode is executed by a Java Virtual Machine (JVM). Modern JVMs use a technique called Just-in-Time (JIT) compilation to compile the bytecode to native instructions understood by hardware CPU on the fly at runtime.
Some implementations of JVM may choose to interpret the bytecode instead of JIT compiling it to machine code, and running it directly. While this is still considered an "interpreter," It's quite different from interpreters that read and execute the high level source code (i.e. in this case, Java source code is not interpreted directly, the bytecode, output of Java compiler, is.)
It is technically possible to compile Java down to native code ahead-of-time and run the resulting binary. It is also possible to interpret the Java code directly.
To summarize, depending on the execution environment, bytecode can be:
compiled ahead of time and executed as native code (similar to most C++ compilers)
compiled just-in-time and executed
interpreted
directly executed by a supported processor (bytecode is the native instruction set of some CPUs)
Code written in Java is:
First compiled to bytecode by a program called javac as shown in the left section of the image above;
Then, as shown in the right section of the above image, another program called java starts the Java runtime environment and it may compile and/or interpret the bytecode by using the Java Interpreter/JIT Compiler.
When does java interpret the bytecode and when does it compile it? The application code is initially interpreted, but the JVM monitors which sequences of bytecode are frequently executed and translates them to machine code for direct execution on the hardware. For bytecode which is executed only a few times, this saves the compilation time and reduces the initial latency; for frequently executed bytecode, JIT compilation is used to run at high speed, after an initial phase of slow interpretation. Additionally, since a program spends most time executing a minority of its code, the reduced compilation time is significant. Finally, during the initial code interpretation, execution statistics can be collected before compilation, which helps to perform better optimization.
The terms "interpreted language" or "compiled language" don't make sense, because any programming language can be interpreted and/or compiled.
As for the existing implementations of Java, most involve a compilation step to bytecode, so they involve compilation. The runtime also can load bytecode dynamically, so some form of a bytecode interpreter is always needed.
That interpreter may or may not in turn use compilation to native code internally.
These days partial just-in-time compilation is used for many languages which were once considered "interpreted", for example JavaScript.
Java is compiled to bytecode, which then goes into the Java VM, which interprets it.
Java is a compiled programming language, but rather than compile straight to executable machine code, it compiles to an intermediate binary form called JVM byte code. The byte code is then compiled and/or interpreted to run the program.
Kind of both. Firstly java compiled(some would prefer to say "translated") to bytecode, which then either compiled, or interpreted depending on mood of JIT.
Java does both compilation and interpretation,
In Java, programs are not compiled into executable files; they are compiled into bytecode (as discussed earlier), which the JVM (Java Virtual Machine) then interprets / executes at runtime. Java source code is compiled into bytecode when we use the javac compiler. The bytecode gets saved on the disk with the file extension .class.
When the program is to be run, the bytecode is converted the bytecode may be converted, using the just-in-time (JIT) compiler. The result is machine code which is then fed to the memory and is executed.
Javac is the Java Compiler which Compiles Java code into Bytecode. JVM is Java Virtual Machine which Runs/ Interprets/ translates Bytecode into Native Machine Code. In Java though it is considered as an interpreted language, It may use JIT (Just-in-Time) compilation when the bytecode is in the JVM. The JIT compiler reads the bytecodes in many sections (or in full, rarely) and compiles them dynamically into machine code so the program can run faster, and then cached and reused later without needing to be recompiled. So JIT compilation combines the speed of compiled code with the flexibility of interpretation.
An interpreted language is a type of programming language for which most of its implementations execute instructions directly and freely, without previously compiling a program into machine-language instructions. The interpreter executes the program directly, translating each statement into a sequence of one or more subroutines already compiled into machine code.
A compiled language is a programming language whose implementations are typically compilers (translators that generate machine code from source code), and not interpreters (step-by-step executors of source code, where no pre-runtime translation takes place)
In modern programming language implementations like in Java, it is increasingly popular for a platform to provide both options.
Java is a byte-compiled language targeting a platform called the Java Virtual Machine which is stack-based and has some very fast implementations on many platforms.
Quotation from: https://blogs.oracle.com/ask-arun/entry/run_your_java_applications_faster
Application developers can develop the application code on any of the various OS that are available in the market today. Java language is agnostic at this stage to the OS. The brilliant source code written by the Java Application developer now gets compiled to Java Byte code which in the Java terminology is referred to as Client Side compilation. This compilation to Java Byte code is what enables Java developers to ‘write once’. Java Byte code can run on any compatible OS and server, hence making the source code agnostic of OS/Server. Post Java Byte code creation, the interaction between the Java application and the underlying OS/Server is more intimate. The journey continues - The enterprise applications framework executes these Java Byte codes in a run time environment which is known as Java Virtual Machine (JVM) or Java Runtime Environment (JRE). The JVM has close ties to the underlying OS and Hardware because it leverages resources offered by the OS and the Server. Java Byte code is now compiled to a machine language executable code which is platform specific. This is referred to as Server side compilation.
So I would say Java is definitely a compiled language.