Is every `.class` file generated by `scalac` also obtainable via `javac`? - java

Is it true that for every .class file that was created by the Scala compiler scalac, it is theoretically possible to define a .java file that gets compiled, by javac, to exactly this same .class file?
If not, can you give one or more non-trivial examples of constructions in Scala that get compiled to JVM bytecode for which there is no corresponding Java construction?

Is it true that for every .class file that was created by the Scala compiler scalac, it is theoretically possible to define a .java file that gets compiled, by javac, to exactly this same .class file?
No.
If not, can you give one or more examples of constructions in Scala that get compiled to JVM bytecode for which there is no corresponding Java construction?
class `class`
class (written with backticks, as above) is a legal name for a class in Scala, but not in Java, so it is impossible to get the Java compiler to generate a bytecode file with a class named class.

When Scala is compiled, the compiler inserts a ScalaSig (Scala signature) into the bytecode, i.e. a Scala-specific kind of comment. These "comments" will be absent from class files compiled from Java sources.
Java, when compiled, can generate annotations that are visible through Java reflection at runtime, while Scala can't. Scala can only generate annotations that are visible in sources and class files but not at runtime through Java reflection, although they can be visible at runtime through Scala reflection (because some information is written to the ScalaSig).
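To make the retention point concrete on the Java side, here is a minimal sketch. The names RetentionDemo, VisibleAtRuntime and ClassFileOnly are made up for illustration. It shows only the standard Java behaviour: an annotation with RUNTIME retention can be seen through Java reflection, while one with CLASS retention is written to the class file but is not visible through reflection. According to the answer above, annotations defined in Scala behave like the second case as far as Java reflection is concerned.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class RetentionDemo {
    // Hypothetical annotation kept in the class file AND exposed to reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @interface VisibleAtRuntime {}

    // Hypothetical annotation kept in the class file but hidden from reflection.
    @Retention(RetentionPolicy.CLASS)
    @interface ClassFileOnly {}

    @VisibleAtRuntime
    @ClassFileOnly
    static class Annotated {}

    /** Returns true if Java reflection can see the given annotation on Annotated. */
    static boolean isVisible(Class<? extends Annotation> annotationType) {
        return Annotated.class.isAnnotationPresent(annotationType);
    }

    public static void main(String[] args) {
        System.out.println("RUNTIME retention visible: " + isVisible(VisibleAtRuntime.class));
        System.out.println("CLASS retention visible:   " + isVisible(ClassFileOnly.class));
    }
}
```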

Related

JVM language interoperability

Recently I've been writing a compiler for a JVM programming language and I've realised a problem.
I would like to access a Java method from my programming language and also allow a Java method to access a method in my language. The problem is that I need to know the Java method's signature to call it in the bytecode I generate, and vice versa.
I've been trying to think of how Scala does this. Here are my thoughts.
1) Scala accesses the .java files on the class path and parses them, extracting the method signatures from there.
2) .java files are compiled to .class files. The Java ASM library is then used to access the .class files and get the method signatures. The problem with this method is that the .java files must be compiled first.
3) .java files are loaded dynamically using reflection. The problem with this, I believe, is that the JVM doesn't allow loading classes that are outside of the compiler's class path.
Scala works well with other JVM languages, but I can't find information on exactly how it does it.
How does Scala get method signatures of other JVM language methods?
I think you are confusing class path and source path: there are no .java or .scala files on the class path, there are .class files (possibly inside .jars). So for dependencies (on the class path), you don't need to do anything special. They can have their own dependencies on your language, including previous versions of your project, but they are already compiled by definition.
Now, for mixed projects, where you have Java and your language on the source path, scalac does parse Java with its own parser (i.e. your option 1).
The problem with option 3 is not that "the JVM doesn't allow for loading classes that are outside of the compilers class path", but that reflection also only works on classes, not on source files.
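As an illustration that a compiled class carries everything needed to call it, here is a small sketch that looks up a method on an already compiled class via reflection and produces the JVM method descriptor a bytecode generator would need. The class name SignatureLookup and the helper descriptorOf are invented for this example; it is not a description of scalac's internals, only a demonstration that signatures are recoverable from classes on the class path.

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

public class SignatureLookup {
    /**
     * Finds a public method on a compiled class and returns its JVM descriptor,
     * e.g. "(I)C" for a method taking an int and returning a char.
     */
    static String descriptorOf(String className, String methodName, Class<?>... parameterTypes)
            throws ReflectiveOperationException {
        Class<?> owner = Class.forName(className);            // loads the .class, not a source file
        Method method = owner.getMethod(methodName, parameterTypes);
        return MethodType.methodType(method.getReturnType(), method.getParameterTypes())
                .toMethodDescriptorString();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(descriptorOf("java.lang.String", "charAt", int.class));
        System.out.println(descriptorOf("java.lang.String", "indexOf", String.class));
    }
}
```

A real compiler would more likely read the class file directly (for example with ASM, as in option 2) so that it does not have to load and initialize the classes it compiles against.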

Dynamic linking in C/C++ (dll) vs JAVA (JAR)

I am new to Java programming.
When we work in C/C++, we create DLL files from .h and .c files, where the .h file contains declarations and the .c file contains the definitions of those classes and functions.
When we want to use the functionality of a .dll file in another project, we #include the .h file to tell the compiler about the declarations of variables and functions, and we provide the DLL file's path during compilation so that the linker can resolve against it.
Here is my question: how do they manage this in Java programming, because it doesn't contain any header files? It has only .java files, which are compiled and combined into a JAR file. When I want to use this JAR file in another project we use the "package" or "import" keyword, but when it says import, will the total file be imported with its logic, and how does the linker manage this at the compilation step?
how do they manage this in java programming because it doesn't contain any header files.
It manages this by placing all the information it needs for compiling against the class and at runtime (and possibly debugging) in the .class file so there is no need for additional information.
Often the source and javadoc are placed in JARs as well (sometimes the same JAR)
when i want to use this jar file in another project we use "package" or "import" keyword
You don't have to. This is just a shorthand. You can use the full package.ClassName and there is no need for an import. Note: an import doesn't bring in any code or data; it just allows you to use a shorter name for the class.
e.g. there is no difference between
java.util.Date date = new java.util.Date();
and
import java.util.Date;
Date date = new Date(); // don't need to specify the full package name.
when it says import total file will be imported with logic
There is no way to do this, nor any need to. There is nothing like #include, for example, and inlining only occurs at runtime, not at compile time (except for constants known at compile time).
how do linker manage at compilation step
The linking and compiling to native code is performed at runtime. All the javac compiler does is check the validity of your code and generate byte code for the JVM to read.
Modern languages such as Java (and C#) do not make a distinction between declaration and definition, so the concept of a .h file is gone in these languages.
In many respects the compile-time vs. runtime dualism is lost in newer languages (mainly because they have strong reflection). Java classes or JARs (or C# assemblies) carry the information required to compile against them (like declarations). The Java environment doesn't require special "files for compilation": the same JAR is both the "compile file" and the "runtime binary DLL".
Typical C thinking with .h, .c and .lib files is becoming a niche (IMHO; I'm an old C programmer and I feel good with the new languages).

Compiled Java compiler in JAR form?

I have an application which requires compiling a .java file into a .class file within the application. Ideally, I'd like to have a JAR that I could use its api to compile by giving two arguments: the name of the file to be compiled and the directory to store the .class file. My application will use the Java compiler, which will be packaged and shipped with the software.
I actually have a small java compiler like I describe in JAR form, but it only has a subset of what java 7.0 has.
Is the java compiler available in Jar form like this?
The compiler is in tools.jar (in JDK 8 and earlier; since JDK 9 it lives in the jdk.compiler module).
Refer to http://docs.oracle.com/javase/6/docs/api/javax/tools/package-summary.html
The Java Compiler API (JSR 199) was created for this purpose; you create a compiler instance like this:
JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
Here is a good educational tutorial on Generating Java classes dynamically
and here is a related question with example code
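For the two-argument use case in the question (source file in, output directory out), a minimal sketch with the standard javax.tools API could look like the following. The class name InProcessCompile and its compile helper are made up for this example. It assumes the application runs on a JDK, because ToolProvider.getSystemJavaCompiler() returns null when no compiler is available.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class InProcessCompile {
    /** Compiles one source file into outputDir; returns true if compilation succeeded. */
    static boolean compile(Path sourceFile, Path outputDir) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler available (not running on a JDK?)");
        }
        // Passing null for the three streams means System.in, System.out and System.err.
        // The remaining arguments are ordinary javac command-line arguments.
        int status = compiler.run(null, null, null,
                "-d", outputDir.toString(), sourceFile.toString());
        return status == 0; // javac convention: 0 means success
    }

    public static void main(String[] args) throws IOException {
        // Self-contained demo: write a tiny source file to a temp directory and compile it.
        Path dir = Files.createTempDirectory("compile-demo");
        Path source = dir.resolve("Hello.java");
        Files.write(source, "public class Hello {}".getBytes(StandardCharsets.UTF_8));
        System.out.println("compiled: " + compile(source, dir));
        System.out.println("class file exists: " + Files.exists(dir.resolve("Hello.class")));
    }
}
```

For finer control (in-memory sources, collected diagnostics), the same API offers JavaCompiler.getTask together with a StandardJavaFileManager; the run method is the shortest path for the simple case.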

How does the java compiler find classes without header files?

When we refer to a class className in a JAR, how does the compiler know whether it's defined or not when there are no header files (like in C/C++)?
Java works with classloaders. Classes are needed for compilation, since the compiler performs static type checking to ensure that you are using the correct signature of every method.
After compilation, though, they are not linked as they would be by a C/C++ toolchain, so basically every .class file is standalone. Of course, this means that you have to provide the compiled classes used by your program when you execute it. So it's a little bit different from how C and C++ prepare executables. You don't have a separate linking phase at all; it is not needed.
The classloader will dynamically load them by adding them to the runtime base used by the JVM.
Actually, the JVM uses many classloaders with different permissions and properties, and you can also invoke one explicitly to ask for a class to be loaded. Loading can also be "lazy", in which case the compiled .class code is loaded only when it is needed (and this loading process can throw a ClassNotFoundException if the requested class is not on the classpath).
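The explicit loading and the ClassNotFoundException mentioned above can be sketched as follows. The class name LazyLoading and the helper isLoadable are invented for the example, and the missing class name is deliberately one that should not exist on any class path.

```java
public class LazyLoading {
    /** Asks the classloader for a class by name; false if it is not on the class path. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className); // triggers loading through the current classloader
            return true;
        } catch (ClassNotFoundException e) {
            return false;             // no matching .class file was found
        }
    }

    public static void main(String[] args) {
        System.out.println(isLoadable("java.util.ArrayList"));
        System.out.println(isLoadable("no.such.pkg.MissingClass")); // hypothetical, intentionally absent
    }
}
```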
When you run the Java compiler or your application itself, you can specify a classpath which lists all the jars and directories you're loading classes from. A jar just contains a bunch of class files; these files have enough metadata in them that no extra header files are necessary.
The classes in the jar file contain all the required information (class names, method signatures etc) so header files are not needed.
When you compile multiple classes javac is clever enough to compile dependencies automatically so the system still works.
It looks at the classpath and tries to load the class from there to get its definition.
Java files are compiled into class files which are java bytecode. These class files reside in a file structure where the top level is pointed to by the classpath variable. Compiling in C/C++ creates object files which can be linked into executable binaries. Java only compiles into bytecode files which are pulled in by the JVM at runtime. The following provide more explanation.
http://en.wikipedia.org/wiki/Java_bytecode
http://en.wikipedia.org/wiki/Java_compiler
http://en.wikipedia.org/wiki/Java_Virtual_Machine

How exactly does java compilation take place?

Confused by java compilation process
OK i know this: We write java source code, the compiler which is platform independent translates it into bytecode, then the jvm which is platform dependent translates it into machine code.
So from start, we write java source code. The compiler javac.exe is a .exe file. What exactly is this .exe file? Isn't the java compiler written in java, then how come there is .exe file which executes it? If the compiler code is written is java, then how come compiler code is executed at the compilation stage, since its the job of the jvm to execute java code. How can a language itself compile its own language code? It all seems like chicken and egg problem to me.
Now what exactly does the .class file contain? Is it a abstract syntax tree in text form, is it tabular information, what is it?
can anybody tell me clear and detailed way about how my java source code gets converted in machine code.
OK i know this: We write java source code, the compiler which is platform independent translates it into bytecode,
Actually the compiler itself works as a native executable (hence javac.exe). And true, it transforms source file into bytecode. The bytecode is platform independent, because it's targeted at Java Virtual Machine.
then the jvm which is platform dependent translates it into machine code.
Not always. Sun's JVM comes in two variants: client and server. Both can, but do not necessarily have to, compile to native code.
So from start, we write java source code. The compiler javac.exe is a .exe file. What exactly is this .exe file? Isn't the java compiler written in java, then how come there is .exe file which executes it?
This exe file is a wrapper around Java bytecode. It exists for convenience, to avoid complicated batch scripts. It starts a JVM and executes the compiler.
If the compiler code is written is java, then how come compiler code is executed at the compilation stage, since its the job of the jvm to execute java code.
That's exactly what wrapping code does.
How can a language itself compile its own language code? It all seems like chicken and egg problem to me.
True, it is confusing at first glance. It's not unique to Java, though: the Ada compiler is also written in Ada. It may look like a "chicken and egg problem", but in truth it's only a bootstrapping problem.
Now what exactly does the .class file contain? Is it an abstract syntax tree in text form, is it tabular information, what is it?
It's not an abstract syntax tree. The AST is only used by the parser and compiler at compile time to represent code in memory. A .class file is like assembly, but for the JVM. The JVM, in turn, is an abstract machine that runs a specialized machine language targeted only at the virtual machine. At its simplest, a .class file has a structure similar to an ordinary object file: it begins with a header and the constant pool (constants plus symbolic references to classes, fields and methods), followed by the field and method tables, with each method's bytecode stored alongside it.
If you are really curious, you can dig into a class file using the "javap" utility. Here is sample (obfuscated) output of invoking javap -c Main:
0: new #2; //class SomeObject
3: dup
4: invokespecial #3; //Method SomeObject."<init>":()V
7: astore_1
8: aload_1
9: invokevirtual #4; //Method SomeObject.doSomething:()V
12: return
So you should already have an idea of what it really is.
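To see that a .class file is a binary table structure rather than text, one can also read its first few fields by hand. The sketch below uses an invented class name, ClassFileHeader, and reads the fixed header defined by the JVM specification: the magic number 0xCAFEBABE, the minor and major version, and the constant pool count. It reads the class file of java.lang.Object as a convenient input; any .class file would do.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ClassFileHeader {
    /** Returns {magic, minorVersion, majorVersion, constantPoolCount} from a class file stream. */
    static int[] readHeader(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in); // class files are big-endian, like DataInputStream
        int magic = data.readInt();                     // u4: always 0xCAFEBABE
        int minor = data.readUnsignedShort();           // u2: minor_version
        int major = data.readUnsignedShort();           // u2: major_version (e.g. 52 for Java 8)
        int constantPoolCount = data.readUnsignedShort(); // u2: constant_pool_count
        return new int[] {magic, minor, major, constantPoolCount};
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = Object.class.getResourceAsStream("Object.class")) {
            int[] header = readHeader(in);
            System.out.println("magic: " + Integer.toHexString(header[0])); // prints "magic: cafebabe"
            System.out.println("version: " + header[2] + "." + header[1]);
            System.out.println("constant pool count: " + header[3]);
        }
    }
}
```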
can anybody tell me clear and detailed way about how my java source code gets converted in machine code.
I think it should be clearer by now, but here's a short summary:
You invoke javac, pointing it at your source file. javac's internal reader (tokenizer and parser) reads your file and builds an AST out of it. All syntax errors come from this stage.
javac hasn't finished its job yet. Once it has the AST, the true compilation can begin. It uses the visitor pattern to traverse the AST and resolves external dependencies to add meaning (semantics) to the code. The finished product is saved as a .class file containing bytecode.
Now it's time to run the thing. You invoke java with the name of the class stored in the .class file. The JVM starts again, this time to interpret your code. The JVM may or may not compile your abstract bytecode into native machine code. Sun's HotSpot just-in-time (JIT) compiler may do so if needed. The running code is constantly profiled by the JVM and recompiled to native code if certain conditions are met. Most commonly, the hot code is the first to be compiled natively.
Edit: Without the javac wrapper, one would have to invoke the compiler using something similar to this:
%JDK_HOME%/bin/java.exe -cp myclasspath com.sun.tools.javac.Main fileToCompile
As you can see, it calls Sun's private API, so it's bound to the Sun JDK implementation. Build systems would become dependent on it. If one switched to any other JDK (the wiki lists 5 other than Sun's), the above command would have to be updated to reflect the change (since it's unlikely the compiler would reside in the com.sun.tools.javac package). Other compilers could be written in native code.
So the standard way is to ship a javac wrapper with the JDK.
Isn't the java compiler written in java, then how come there is .exe file which executes it?
Where do you get this information from? The javac executable could be written in any programming language, it is irrelevant, all that is important is that it is an executable which turns .java files into .class files.
For details on the binary specification of a .class file you might find these chapters in the Java Language Specification useful (although possibly a bit technical):
Virtual Machine Startup
Loading of Classes and Interfaces
You can also take a look at the Virtual Machine Specification which covers:
The class file format
The Java Virtual Machine instruction set
Compiling for the Java Virtual Machine
The compiler javac.exe is a .exe file. What exactly is this .exe file? Isn't the java compiler written in java, then how come there is .exe file which executes it?
The Java compiler (at least the one that comes with the Sun/Oracle JDK) is indeed written in Java. javac.exe is just a launcher that processes the command line arguments, some of which are passed on to the JVM that runs the compiler, and others to the compiler itself.
If the compiler code is written is java, then how come compiler code is executed at the compilation stage, since its the job of the jvm to execute java code. How can a language itself compile its own language code? It all seems like chicken and egg problem to me.
Many (if not most) compilers are written in the language they compile. Obviously, at some early stage the compiler itself had to be compiled by something else, but after that "bootstrapping", any new version of the compiler can be compiled by an older version.
Now what exactly does the .class file contain? Is it a abstract syntax tree in text form, is it tabular information, what is it?
The details of the class file format are described in the Java Virtual Machine specification.
Well, javac and the JVM are typically native binaries. They're written in C or whatever. It's certainly possible to write them in Java; you just need a native version first. This is called "bootstrapping".
Fun fact: Most compilers that compile to native code are written in their own language. However, they all had to have a native version written in another language first (usually C). The first C compiler, by comparison, was written in assembler. I presume that the first assembler was written in machine code. (Or, using butterflies ;)
.class files are bytecode generated by javac. They're not textual; they're binary code similar to machine code (but with a different instruction set and architecture).
The JVM, at run time, has two options: it can either interpret the bytecode (pretending to be a CPU itself), or it can JIT (just-in-time) compile it into native machine code. The latter is faster, of course, but more complex.
The .class file contains bytecode which is sort of like very high-level Assembly. The compiler could very well be written in Java, but the JVM would have to be compiled to native code to avoid the chicken/egg problem. I believe it is written in C, as are the lower levels of the standard libraries. When the JVM runs, it performs just-in-time compilation to turn that bytecode into native instructions.
Short Explanation
Write code in a text editor and save it in a format that the compiler understands, i.e. with the ".java" file extension. javac (the Java compiler) converts this into a ".class" file (bytecode). The JVM executes the .class file on the operating system that it sits on.
Long Explanation
Always remember that Java is not a language the operating system understands natively. Java code is interpreted for the operating system by a translator called the Java Virtual Machine (JVM). The JVM can't understand the code that you write in an editor; it needs compiled code. This is where a compiler comes into the picture.
Every computer process involves memory manipulation. We can't just write code in a text editor and compile it. We need to store it first, i.e. save it before compiling.
How will javac (the Java compiler) recognize the saved text as the one to be compiled? We have a separate file format that the compiler recognizes, i.e. .java. Save the file with the .java extension and the compiler will recognize it and compile it when asked.
What happens while compiling? The compiler is a second translator (not a technical term) involved in the process; it translates the language the user understands (Java) into the language the JVM understands (bytecode, in the .class format).
What happens after compiling? The compiler produces a .class file that the JVM understands. The program is then executed, i.e. the .class file is executed by the JVM on the operating system.
Facts you should know
1) Java is not multi-platform; it is platform independent.
2) The JVM is developed using C/C++. This is one of the reasons why people call Java a slower language than C/C++.
3) Java bytecode (.class) is a kind of "assembly language", the only language understood by the JVM. Any code that produces a .class file on compilation, or any generated bytecode, can be run on the JVM.
Windows doesn't know how to invoke Java programs before installing a Java runtime, and Sun chose to have native commands which collect arguments and then invoke the JVM instead of binding the jar-suffix to the Java engine.
The compiler was originally written in C with bits of C++ and I assume that it still is (why do you think the compiler is written in Java as well?). javac.exe is just the C/C++ code that is the compiler.
As a side point, you could write the compiler in Java, but you're right, you have to avoid the chicken and egg problem. To do this you would typically write one or more bootstrapping tools in something like C to be able to compile the compiler.
The .class file contains the bytecodes, the output of the javac compilation process, and these are the instructions that tell the JVM what to do. At runtime these bytecodes are translated to native CPU instructions (machine code) so they can execute on the specific hardware under the JVM.
To complicate this a little, the JVM also optimises and caches machine code produced from the bytecodes to avoid repeatedly translating them. This is known as JIT compilation and occurs as the program is running and bytecodes are being interpreted.
.java file
compiler(JAVA BUILD)
.class(bytecode)
JVM (system software, usually built with C)
OPERATING PLATFORM
PROCESSOR
