The compilation and execution of a java program? - java

I am a beginner in java programming course and so far this is what I have understood about the whole java program being compiled and executed. Stating in brief:-
1) Source code (.java) file is converted into bytecode(.class) (which is an intermediate code) by the java compiler.
2) This bytecode(.class) file is platform independent so wooosh....I can copy it and take it to a different platform machine which has JVM.
3) When I run the bytecode, the JVM, which is a part of the JRE, first verifies the bytecode, then calls out to the JIT, which makes optimizations at runtime since it has access to dynamic runtime information.
4) And finally the JVM interprets the intermediate code into a series of machine instructions for the processor to execute. (A processor can't execute the bytecode directly since it is not in native code.)
Is my understanding correct? Anything that needs to be added or corrected?

Taking each of your points in turn:
1) This is correct. Java source is compiled by javac (although other tools could do the same thing) and class files are generated.
2) Again, correct. Class files contain platform-neutral bytecodes. These are loosely an instruction set for a 'virtual' machine (i.e. the JVM). This is how Java implements the "write once, run anywhere" idea it's had since it was launched.
3) Partially correct. When the JVM needs to load a class it runs a four-phase verification on the bytecodes of that class to ensure that the format of the bytecodes is legal in terms of the JVM. This is to prevent bytecode sequences being generated that could potentially subvert the JVM (i.e. virus-like behaviour). The JVM does not, however, run the JIT at this point. When bytecodes are executed they start in interpreted mode. Each bytecode is converted on the fly to the required native instructions and OS system calls.
4) This is sort of wrong when combined with point 3.
Here's the process explained briefly:
As the JVM interprets the bytecodes of the application it also profiles which groups of bytecodes are being run frequently. If you have a loop that repeatedly calls a method the JVM will notice this and identify that this is a hotspot in your code (hence the name of the Oracle JVM). Once a method has been called enough times (which is tunable), the JVM will call the Just In Time (JIT) compiler to generate native instructions for that method. When the method is called again the native code is used, eliminating the need for interpreting and thus improving the speed of the application. This profiling phase is what leads to the 'warm-up' behaviour of a Java application where relevant sections of the code are gradually compiled into native instructions.
For OpenJDK based JVMs there are two JIT compilers, C1 and C2 (sometimes called client and server). The C1 JIT will warm-up more quickly but have a lower optimum level of performance. C2 warms-up more slowly but applies a greater level of optimisation to the code, giving a higher overall performance level.
The JVM can also throw away compiled code, either because it hasn't been used for a long time (like in a cache) or an assumption that the JIT made (called a speculative optimisation) turns out to be wrong. This is called a deopt and results in the JVM going back to interpreted mode, reprofiling the code and potentially recompiling it with the JIT.
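The warm-up behaviour described above can be observed with a rough sketch like the following. The class and method names (WarmUpDemo, sumOfSquares) and the loop counts are made up for illustration, and this is not a rigorous benchmark; on a JIT-enabled JVM the later rounds will usually report shorter times than the first ones, but the exact numbers depend on the machine and JVM.

```java
class WarmUpDemo {
    // A small, frequently called method: a likely candidate for JIT compilation.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        long sink = 0; // accumulate results so the work cannot be discarded
        for (int round = 1; round <= 10; round++) {
            long start = System.nanoTime();
            for (int call = 0; call < 20_000; call++) {
                sink += sumOfSquares(1_000);
            }
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println("round " + round + ": " + micros + " us");
        }
        System.out.println("checksum: " + sink);
    }
}
```

Running the same class with -Xint (interpret-only) should keep every round roughly equally slow, which makes the effect of the JIT easier to see.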

First and foremost, Java is only a programming language. That means you could (theoretically) run a compiler to generate a native binary instead of this bytecode. (See: Compiling a java program into an executable )
The other thing I should mention is Java processors, which are able to execute Java bytecode directly because it is their native instruction set (See: https://en.wikipedia.org/wiki/Java_processor )

Related

Steps of program execution

After hours of research I haven't found a concrete answer for my question and I'm going maddd!:
The steps from editing to execution:
1. (Compilation step) After writing the source code, I compile the program. In this step it is converted into bytecode. A java.class file (the bytecode) is generated.
2. (Execution step) Now I execute the program.
(Interpretation step) When I do this, the JVM interprets the bytecode into machine code. So I understand that the machine code is only generated after execution!??
Now the steps are: code-->bytecode-->execution-->machinecode
All these steps are hardware- and software-independent.
Am i right?
This is called JIT (just in time compilation), so that when I execute the program the bytecode is compiled into machinecode, and only then.
So why is this step called interpretation?
I'm thanking you in advance for your answers!
In short because JVM doesn't have to have JIT. It can interpret the bytecode instead of compiling it. Of course an interpret-only JVM would be slow, but the JIT part is just an extra feature to improve performance, not a required property of a Java Virtual Machine. The -Xint command line parameter can be used to run a java program in interpret-only mode.
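As a small sketch of how a running program could check whether it was started in interpret-only mode: the class name ExecutionMode and the helper isInterpretOnly are invented for this example, and the java.vm.info property is a HotSpot convention (it may be absent on other JVMs), so treat the output as informational only.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class ExecutionMode {
    // True when the JVM arguments contain -Xint (interpret-only mode).
    static boolean isInterpretOnly(List<String> jvmArgs) {
        return jvmArgs.contains("-Xint");
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the application.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("VM name: " + System.getProperty("java.vm.name"));
        // Assumption: HotSpot reports its execution mode in this property.
        System.out.println("VM info: " + System.getProperty("java.vm.info"));
        System.out.println("Interpret-only (-Xint): " + isInterpretOnly(jvmArgs));
    }
}
```

Starting it once as `java ExecutionMode` and once as `java -Xint ExecutionMode` should show the difference.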
The reason it's compiled to bytecode and not machine code is to get the platform independence. Bytecode is platform independent, so the same code can run on any platform (as long as there's the JVM to interpret it). If it were compiled into machine code, it would be operating system and processor architecture dependent.
(Interpretating step) When i do this, the JVM interprets the bytecode into machine code. So i understand that the machine code is only generated after execution!??
Not exactly, and no. A JVM operating strictly as a bytecode interpreter does not transform bytecode into machine code and then execute that. The machine code executed by such a JVM is (comprised by) the pre-existing machine code of the JVM itself. The byte code is used to provide some of the data on which to operate and to direct which of the JVM's machine code is executed.
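To make that concrete, here is a toy stack-machine interpreter. The opcodes (PUSH, ADD, MUL, HALT) are invented for this sketch and are not real JVM opcodes. Note that no machine code is generated anywhere: the "bytecode" is just an int array, and it only selects which branch of the interpreter's own, already compiled, code runs next.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyInterpreter {
    // Hypothetical opcodes for this sketch only.
    static final int PUSH = 0; // push the following program word
    static final int ADD = 1;  // pop two values, push their sum
    static final int MUL = 2;  // pop two values, push their product
    static final int HALT = 3; // stop and return the top of the stack

    static int run(int[] program) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // program counter: index of the next word to read
        while (true) {
            int opcode = program[pc++];
            switch (opcode) {
                case PUSH:
                    stack.push(program[pc++]);
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                case HALT:
                    return stack.pop();
                default:
                    throw new IllegalStateException("Unknown opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // Encodes (2 + 3) * 4
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(program)); // prints 20
    }
}
```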
Now the steps are: code-->bytecode-->execution-->machinecode
All these steps are hardware- and software-independent. Am i right?
No, not at all. The particulars of the Java code --> bytecode transformation are somewhat dependent on which Java compiler (software) you use. The Java virtual machine you use must be specific to the hardware on which it runs, and it is itself a piece of software. Moreover, the operating environment is influenced by a lot of other software.
Java hardware independence, such as it is, means that a Java program (bytecode) will behave consistently on any hardware, but the details of how that consistent behavior is provided on any given machine are all kinds of hardware- and software-dependent.
This is called JIT(just in time compilation), so that when I execute the program the bytecode is compiled into machinecode, and only then. But why is this step called interpretating?
JIT is something else, and a JVM that performs JIT (as in fact most do) is not strictly an interpreter. Most such JVMs run some bytecode in an interpretative manner as described above, but compile some bytecode to native (machine) code, and run that machine code directly when subsequently needed. The latter manner of execution generally isn't called "interpreting".

Does the Java interpreter convert the byte-code files to an executable file?

I had this question in software course:
True/False: The Java interpreter converts files from a byte-code format to executable files.
I think the statement is false. In class, they said the interpreter "executes" the byte-code files, on the system using the JVM (I didn't listen too much but I think I got it fairly correctly), but as I understood, it doesn't actually convert it to executable files (which presumably are .exe files), just runs it on the system directly.
"True/False: The Java interpreter converts files from a byte-code format to executable files".
The answer is false [1].
The Java interpreter is one of the two components of the JVM that is responsible for executing Java code. It does it by "emulating" the execution of the Java Virtual Machine instructions (bytecodes); i.e. by pretending to be a "real" instance of the virtual machine.
The other JVM component that is involved is the Just In Time (JIT) compiler. This identifies Java methods that have been interpreted for a significant amount of time, and does an on-the-fly compilation to native code. This native code is then executed instead of interpreting the bytecodes.
But the JIT compiler does not write the compiled native code to the file system. Instead it writes it directly into a memory segment ready to be executed.
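Because the compiled code lives only in memory, the closest you can get to "seeing" it from inside a program is to ask the JVM's management interface about its JIT compiler. The following is a minimal sketch using the standard java.lang.management API; the class name JitInfo and the wording of the messages are made up for this example.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

class JitInfo {
    static String describeJit() {
        // Returns null when the JVM has no compilation system (e.g. interpreter only).
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            return "no JIT compiler reported";
        }
        StringBuilder text = new StringBuilder("JIT compiler: " + jit.getName());
        if (jit.isCompilationTimeMonitoringSupported()) {
            text.append(", total compilation time so far: ")
                .append(jit.getTotalCompilationTime())
                .append(" ms");
        }
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.println(describeJit());
    }
}
```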
Java's interpret / JIT compile is more complicated, but it has a couple of advantages:
It means that it is not necessary to compile bytecodes to native code before the application can be run, which removes a significant impediment to portability.
It allows the JVM to gather runtime statistics on how the application is functioning, which can give hints as to the best way to optimize the native code. The result is faster execution for long-running applications.
The downside is that JIT compilation is one of the factors that tends to make Java applications slow to start (compared with C / C++ for example).
1 - ... for mainstream Java (tm) compilers. Android isn't Java (tm) [2]. Note that the first version of Java was interpreter only. I have also seen Java (not tm) implementations where the native code compilers were either ahead-of-time or eager ... or a combination of both.
2 - You are only permitted by Oracle to describe your "java-like" implementation as Java(tm) if it passes the Java compliance tests. Android wouldn't.
The Java compiler converts the source code to bytecode. This bytecode is then interpreted (or just-in-time-compiled and then executed) by the JVM. This bytecode is a kind of intermediate language that has no platform dependence. The virtual machine is then the layer that provides system-specific functionality.
It is also possible to compile Java code to native code; one project aiming at this is, for example, GCJ.
To answer your question: no, a normal Java compiler does not emit an executable binary, but a set of classes that can be executed using a JVM. You can read more about this on Wikipedia.
False for regular JVMs. No executable files are created. The conversion from bytecode to native code for that platform takes place on the fly during execution. If the program is stopped, the compiled code is gone (was in memory only).
The new Android JVM, ART, does compile the bytecode into executables beforehand to get better startup and runtime behavior. So ART creates files.
ART straddles an interesting mid-ground between compiled and interpreted code, called ahead-of-time (AOT) compilation. Currently with Android apps, they are interpreted at runtime (using the JIT), every time you open them up. This is slow. (iOS apps, by comparison, are compiled native code, which is much faster.) With ART enabled, each Android app is compiled to native code when you install it. Then, when it’s time to run the app, it performs with all the alacrity of a native app. http://www.extremetech.com/computing/170677-android-art-google-finally-moves-to-replace-dalvik-to-boost-performance-and-battery-life
The answer is false
reason:
The JIT (just-in-time) compiler and the Java interpreter do the same job in different ways, but in terms of performance the JIT wins. Their main task is to turn the given bytecode into machine-dependent instructions. Assembly language is a low-level language that is understood by the machine's assembler, which then converts it to binary (01010111.....).

If JVM generates machine code, then where are the code files?

I read some materials about JVM and bytecode. I think it would be more efficient if JVM can translate bytecode into platform dependent machine code in the first time run, instead of interpreting them all the time.
However, I could not find such files in my project folders. There are only bin and src folders, which contain *.class bytecodes and *.java source codes.
So my questions are:
If Java interprets bytecode all the time, why not translate bytecode to machine code after the first run?
If they do generate machine code, where are the files?
Not an option since the environment can change between runs (e.g. upgrade of JVM)
In memory (or serialized to disk when needed)
If Java interprets bytecode all the time, why not translate bytecode to machine code after the first run?
There are pros and cons to both ahead of time (AOT) and just in time (JIT) compilation.
The main advantage of AOT is that the compiler is generally allowed to take longer, so it can perform more sophisticated analysis and optimization. Another advantage is that the compiler doesn't have to be present at runtime on the target machine. The disadvantages are everything else.
The main advantage of JIT is that the compiler is able to make optimizations based on information known only at runtime. In fact, it is even possible to unoptimize and reoptimize code when conditions change. Furthermore, the JIT doesn't have to waste time optimizing code that is never or rarely run, unlike the AOT compiler.
Some languages are designed to favor one approach over the other. For example, C/C++ are designed for AOT, while Java is designed for JIT (though it can be compiled AOT with some restrictions). For example, Java has a heavy emphasis on virtual getters and setters, possibly for classes not loaded until runtime. But the JIT can see and inline these functions at runtime. By contrast, if you used virtual methods for every field access in C++, you'd pay a huge performance penalty.
It doesn't interpret code all the time. Interpreted code is compiled into native code after some time. You can tweak this "time" using -XX:CompileThreshold= (default is 10000) or you can turn off compilation completely.
In memory. There's a special area in memory called "Code cache". You can see how methods are compiled into the cache and how they are evicted from the cache using -XX:+PrintCompilation. The size of the cache is also configurable, see -XX:ReservedCodeCacheSize=.
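One way to peek at the code cache from inside a program is to list the JVM's non-heap memory pools, as in this rough sketch. The class name CodeCachePools is invented here, and the pool names are an assumption about HotSpot: older versions report a single "Code Cache" pool, newer ones several "CodeHeap ..." pools, and other JVMs may name them differently. The list also includes unrelated non-heap pools such as Metaspace.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class CodeCachePools {
    // Describes every non-heap memory pool; the code cache is among them on HotSpot.
    static List<String> nonHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.NON_HEAP) {
                continue;
            }
            MemoryUsage usage = pool.getUsage(); // may be null for an invalid pool
            if (usage != null) {
                lines.add(pool.getName() + ": " + usage.getUsed() + " bytes used");
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : nonHeapPools()) {
            System.out.println(line);
        }
    }
}
```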
Well, the JVM has preprocessed data but only for its own classes. Given the size of the JRE library and the fact that it usually doesn’t change, it’s a big win (you might look for files called classes.jsa).
However, even these files are not containing native code but only easier-to-process byte code.
The big point about code generation in Hot Spot JVMs is that they don’t compile code on a class or method basis as you seem to think. These JVMs compile code fragments spanning multiple interacting methods as the interaction is discovered during the self-profiling. These code blocks may span methods from the JRE, the extension libraries, 3rd party libraries in your class path and your application classes and hence are only valid for this specific combination.
During the compilation the information gathered about your program’s behavior will be used, e.g. code paths not taken might be elided and conditionals might be asserted to evaluate to a certain result as they did in previous evaluations. This yields high performance, but it might happen that the JVM has to drop the code even during the same execution when one of the assertions does not hold anymore, e.g. the program might take a code path it didn’t before or a new class has been loaded into the JVM which extends a class whose code has been optimized as if it had no subclasses, etc.
So if optimized and compiled code might be rendered obsolete even within the same environment, it is even much likelier to be obsolete in the next execution. In the end, the JVM would have to check whether the old code is still appropriate which might turn out to be even costlier than simply gathering the new environment’s data and program behavior.

Is the JVM a compiler or an interpreter?

I have a very basic question about JVM: is it a compiler or an interpreter?
If it is an interpreter, then what about JIT compiler that exist inside the JVM?
If neither, then what exactly is the JVM? (I don't want the basic definition of the JVM as converting byte code to machine-specific code etc.)
First, let's have a clear idea of the following terms:
Javac is Java Compiler -- Compiles your Java code into Bytecode
JVM is Java Virtual Machine -- Runs/ Interprets/ translates Bytecode into Native Machine Code
JIT is Just In Time Compiler -- Compiles the given bytecode instruction sequence to machine code at runtime before executing it natively. Its main purpose is to do heavy optimizations in performance.
So now, Let's find answers to your questions:
JVM: is it a compiler or an interpreter?
An interpreter
What about JIT compiler that exist inside the JVM?
If you read this reply completely, you probably know it now.
What exactly is the JVM?
JVM is a virtual platform that resides on your RAM
Its component, Class loader loads the .class file into the RAM
The Byte code Verifier component in JVM checks if there are any access restriction violations in your code. (This is one of the principal reasons why java is secure)
Next, the Execution Engine component converts the Bytecode into executable machine code
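The class loader step in the list above can be made visible with a small sketch. The class name LoaderDemo and the label "bootstrap" are chosen for this example; the only API used is Class.getClassLoader(), which returns null for classes loaded by the bootstrap loader (such as java.lang.String).

```java
class LoaderDemo {
    // Names the loader that brought a class into the JVM.
    static String loaderName(Class<?> c) {
        ClassLoader loader = c.getClassLoader();
        // Core classes come from the bootstrap loader, which is represented as null.
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("String     loaded by: " + loaderName(String.class));
        System.out.println("LoaderDemo loaded by: " + loaderName(LoaderDemo.class));
    }
}
```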
It is a little of both, but neither in the traditional sense.
Modern JVMs take bytecode and compile it into native code when first needed. "JIT" in this context stands for "just in time." It acts as an interpreter from the outside, but really behind the scenes it is compiling into machine code.
The JVM should not be confused with the Java compiler, which compiles source code into bytecode. So it is not useful to consider it "a compiler" but rather to know that in the background it does do some compilation.
Like @delnan already stated in the comment section, it's neither.
JVM is an abstract machine running Java bytecode.
JVM has several implementations:
HotSpot (interpreter + JIT compiler)
Dalvik (interpreter + JIT compiler)
ART (AOT compiler + JIT compiler)
GCJ (AOT compiler)
JamVM (interpreter)
...and many others.
Most of the other answers, when talking about the JVM, refer either to HotSpot or to some mixture of the above approaches to implementing the JVM.
It is both. It starts by interpreting bytecode and can (should it decide it is worth it) then compile that bytecode to native machine code.
It's both. It can interpret bytecode, and compile it to native code.
Javac is a compiler but not a traditional compiler.
A compiler typically converts source code to Machine level language for execution and that is done in a single shot i.e. entire code is taken and converted to machine level language at ONCE. (more on this below).
While, JavaC converts it to Bytecode instead of machine level language.
JIT is a Java compiler but also acts as an interpreter. A typical compiler will convert all the code at once from source code to machine level language. Instead, JIT goes line by line (line by line execution is a feature of Interpreters) and converts bytecode generated by JavaC  into machine level language and executes it. JVM which has JIT in it has multiple implementations. Hotspot being one of the major ones for Java programming. Hotspot implementation makes JIT optimize the execution by converting chunks of code which are repetitive into Machine level language at once (like a compiler as mentioned above) so that they can be executed faster instead of converting each line of code 1 by 1.
So, the answer is not Black and White with respect to the typical definitions of Compiler and Interpreter.
This is my understanding after reading several online answers, blogs, etc. If somebody has suggestions to improve this understanding, please feel free to suggest.
The JVM has both a compiler and an interpreter: the compiler compiles the code and generates bytecode, and after that the interpreter converts the bytecode to machine-understandable code.
Example: Write and compile a program and it runs on Windows. Take the .class file to another OS (Unix) and it will run because of interpreter that converts the bytecode to machine understandable code.

Is Java a Compiled or an Interpreted programming language ?

In the past I have used C++ as a programming language. I know that the code written in C++ goes through a compilation process until it becomes object code "machine code".
I would like to know how Java works in that respect. How is the user written Java code run by the computer?
Java implementations typically use a two-step compilation process. Java source code is compiled down to bytecode by the Java compiler. The bytecode is executed by a Java Virtual Machine (JVM). Modern JVMs use a technique called Just-in-Time (JIT) compilation to compile the bytecode to native instructions understood by hardware CPU on the fly at runtime.
Some JVM implementations may choose to interpret the bytecode instead of JIT-compiling it to machine code and running it directly. While this is still considered an "interpreter," it is quite different from interpreters that read and execute high-level source code (i.e. in this case, Java source code is not interpreted directly; the bytecode, the output of the Java compiler, is).
It is technically possible to compile Java down to native code ahead-of-time and run the resulting binary. It is also possible to interpret the Java code directly.
To summarize, depending on the execution environment, bytecode can be:
compiled ahead of time and executed as native code (similar to most C++ compilers)
compiled just-in-time and executed
interpreted
directly executed by a supported processor (bytecode is the native instruction set of some CPUs)
Code written in Java is:
First compiled to bytecode by a program called javac as shown in the left section of the image above;
Then, as shown in the right section of the above image, another program called java starts the Java runtime environment and it may compile and/or interpret the bytecode by using the Java Interpreter/JIT Compiler.
When does java interpret the bytecode and when does it compile it? The application code is initially interpreted, but the JVM monitors which sequences of bytecode are frequently executed and translates them to machine code for direct execution on the hardware. For bytecode which is executed only a few times, this saves the compilation time and reduces the initial latency; for frequently executed bytecode, JIT compilation is used to run at high speed, after an initial phase of slow interpretation. Additionally, since a program spends most time executing a minority of its code, the reduced compilation time is significant. Finally, during the initial code interpretation, execution statistics can be collected before compilation, which helps to perform better optimization.
The terms "interpreted language" or "compiled language" don't make sense, because any programming language can be interpreted and/or compiled.
As for the existing implementations of Java, most involve a compilation step to bytecode, so they involve compilation. The runtime also can load bytecode dynamically, so some form of a bytecode interpreter is always needed.
That interpreter may or may not in turn use compilation to native code internally.
These days partial just-in-time compilation is used for many languages which were once considered "interpreted", for example JavaScript.
Java is compiled to bytecode, which then goes into the Java VM, which interprets it.
Java is a compiled programming language, but rather than compile straight to executable machine code, it compiles to an intermediate binary form called JVM byte code. The byte code is then compiled and/or interpreted to run the program.
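The intermediate binary form mentioned above can be inspected directly. Every class file starts with the magic number 0xCAFEBABE followed by the minor and major format version. The sketch below reads that header from the class file of java.lang.Object as shipped with the running JDK; the class name ClassFileHeader and the method readHeader are made up for this example.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileHeader {
    // Returns {magic, minorVersion, majorVersion} read from the class file of c.
    static int[] readHeader(Class<?> c) {
        String resource = "/" + c.getName().replace('.', '/') + ".class";
        try (InputStream in = c.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();           // 4 bytes: 0xCAFEBABE
            int minor = data.readUnsignedShort(); // 2 bytes: minor version
            int major = data.readUnsignedShort(); // 2 bytes: major version
            return new int[] {magic, minor, major};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader(Object.class);
        System.out.printf("magic=0x%08X major=%d minor=%d%n", header[0], header[2], header[1]);
    }
}
```

The same header is found in a class file regardless of the platform it was compiled on, which is the practical meaning of "platform independent" here.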
Kind of both. First, Java is compiled (some would prefer to say "translated") to bytecode, which is then either compiled or interpreted, depending on the mood of the JIT.
Java does both compilation and interpretation.
In Java, programs are not compiled into executable files; they are compiled into bytecode (as discussed earlier), which the JVM (Java Virtual Machine) then interprets / executes at runtime. Java source code is compiled into bytecode when we use the javac compiler. The bytecode gets saved on the disk with the file extension .class.
When the program is to be run, the bytecode may be converted using the just-in-time (JIT) compiler. The result is machine code, which is then loaded into memory and executed.
Javac is the Java compiler, which compiles Java code into bytecode. The JVM is the Java Virtual Machine, which runs/interprets/translates bytecode into native machine code. Although Java is considered an interpreted language, it may use JIT (just-in-time) compilation once the bytecode is in the JVM. The JIT compiler reads the bytecode in sections (or, rarely, in full) and compiles them dynamically into machine code so the program can run faster; the compiled code is then cached and reused later without needing to be recompiled. So JIT compilation combines the speed of compiled code with the flexibility of interpretation.
An interpreted language is a type of programming language for which most of its implementations execute instructions directly and freely, without previously compiling a program into machine-language instructions. The interpreter executes the program directly, translating each statement into a sequence of one or more subroutines already compiled into machine code.
A compiled language is a programming language whose implementations are typically compilers (translators that generate machine code from source code), and not interpreters (step-by-step executors of source code, where no pre-runtime translation takes place)
In modern programming language implementations like in Java, it is increasingly popular for a platform to provide both options.
Java is a byte-compiled language targeting a platform called the Java Virtual Machine which is stack-based and has some very fast implementations on many platforms.
Quotation from: https://blogs.oracle.com/ask-arun/entry/run_your_java_applications_faster
Application developers can develop the application code on any of the various OS that are available in the market today. Java language is agnostic at this stage to the OS. The brilliant source code written by the Java Application developer now gets compiled to Java Byte code which in the Java terminology is referred to as Client Side compilation. This compilation to Java Byte code is what enables Java developers to ‘write once’. Java Byte code can run on any compatible OS and server, hence making the source code agnostic of OS/Server. Post Java Byte code creation, the interaction between the Java application and the underlying OS/Server is more intimate. The journey continues - The enterprise applications framework executes these Java Byte codes in a run time environment which is known as Java Virtual Machine (JVM) or Java Runtime Environment (JRE). The JVM has close ties to the underlying OS and Hardware because it leverages resources offered by the OS and the Server. Java Byte code is now compiled to a machine language executable code which is platform specific. This is referred to as Server side compilation.
So I would say Java is definitely a compiled language.
