Is there a way to run plain C code on top of the JVM?
I don't mean connecting via JNI, I mean actually running it, the way you can run Ruby code via JRuby or JavaScript via Rhino.
If there is no current solution, what would you recommend I should do?
Obviously I want to use as many partial solutions as I can to make it happen.
ANTLR seems like a good place to start, having a full "ANSI C" grammar implementation...
Should I build a "toy" VM over the JVM using ANTLR-generated code?
Updated 2012-01-26: According to this page on the company's site the product has been bought out and is no longer available.
Yes.
Here's a commercial C compiler that produces JVM bytecode.
There are two other possibilities, both open-source:
JPC emulates an entire x86 PC within the JVM, and is capable of running both DOS and Linux.
NestedVM provides binary translation for Java Bytecode. This is done by having GCC compile to a MIPS binary which is then translated to a Java class file. Hence any application written in C, C++, Fortran, or any other language supported by GCC can be run in 100% pure Java with no source changes.
It seems that LLJVM can also meet your requirement.
LLJVM: Source code is first compiled to LLVM intermediate representation (IR) by a frontend such as llvm-gcc or clang. LLVM IR is then translated to Jasmin assembly code, linked against other Java classes, and then assembled to JVM bytecode.
As of 2016 there is a young but promising option called gcc-bridge. Its intent is to support a JVM implementation of R: the goal is to use R libraries written in C or Fortran. But gcc-bridge can also be used independently as a regular Maven plugin. Also see the gcc-bridge-example.
A quick question that may seem out of the ordinary. (in reverse)
Instead of calling native code from an interpreted language, is there a way to compile Java or Python code to a .dll/.so and call the code from C/C++?
I'm willing to accept even answers such as manually spawning the interpreter or JVM and forcing it to read the .class/.py files. (Is this a good solution?)
Thank you.
gcj can compile most Java source to native code (linked with a libgcj shared library) instead of to JVM bytecode.
There are a number of Python projects that are similar, like shedskin, but none as mature or active.
Cython is similar, but not quite the same—it compiles modules written in a Python-like language into native C extension modules for CPython. But if you put that together with embedding Python in a C app, it gives you most of what you want. But you are still running a Python interpreter loop to tie all those compiled-to-C functions together.
You can also do the same thing with Java—embed the JVM into your app, use gcj to compile any parts you want to native code, while compiling other parts to bytecode, and using JNI to communicate between them.
And of course you can use Jython to embed your Python code into the JVM, which you can embed into your C program, and because you can use JNI directly from Jython any pair of the three languages can effectively talk to each other without going through the third.
The idea of spawning a JVM or a CPython interpreter as a subprocess, which I think you were suggesting in your question, also works just fine. However, the only interface you will have to it in that case will be the child process's stdin/stdout/stderr (or any pipes or sockets you create manually), which isn't as flexible as being able to call methods directly on objects, etc. (Then again, sometimes that extra indirection can be a good thing, forcing you to define a cleanly-separated API between your components.)
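As a rough sketch of the subprocess approach, here is a parent that talks to a child CPython interpreter purely over stdin/stdout. The parent is written in Python only to keep the example short (the question's host would be C/C++), and `ask_child` and the one-line child program are made up for illustration.

```python
import subprocess
import sys


def ask_child(numbers: str) -> str:
    """Send whitespace-separated integers to a child interpreter, return its reply."""
    # Hypothetical child program: reads all of stdin and prints the sum.
    child_code = "import sys; print(sum(int(x) for x in sys.stdin.read().split()))"
    result = subprocess.run(
        [sys.executable, "-c", child_code],  # spawn a separate interpreter process
        input=numbers,                       # written to the child's stdin
        capture_output=True,                 # collect the child's stdout/stderr
        text=True,
        check=True,                          # raise if the child exits with an error
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(ask_child("1 2 3"))
```

Everything crossing the boundary is text that both sides must agree on how to parse, which is exactly the loss of flexibility (and the forced API separation) described above.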
You can embed a Python interpreter in your C/C++ program.
http://docs.python.org/2/extending/embedding.html
With Java you probably want the Java Native Interface (which works in both directions).
http://en.wikipedia.org/wiki/Java_Native_Interface
You can also look into Lua. While not as widely used as a lot of other scripting languages, it was meant to be embedded easily into executables, and it's relatively small and fast. Just another option. If you want to call other languages from your C/C++, look into SWIG.
I know Java compiles to bytecode, but the JVM needs to interpret it every time at runtime.
Does a compiler exist that generates machine-independent code, let's say for C?
Then, on the target machine, this would be permanently converted to local machine code once, rather than being converted on every run?
Would this address why many developers develop for Windows but not Linux?
Not really, but some stuff comes close.
Some regard C as being as low-level as possible while still being portable. (This, of course, excludes all APIs.) The GHC Haskell compiler internally uses a very C-like language in that regard, C--, which might be very close to the machine-independent code you are looking for.
Most modern compilers do have such intermediate code, for example LLVM. There is even an assembler-like language (so even lower-level than C) for that. But note that LLVM intermediate code is not portable: for example, the pointer size has to be known at compile time (all the sizeofs in C are fixed at that point).
But there is an IMO simpler solution: compile the code for any one platform, and if you are on a different platform, use a dynamic recompiler like QEMU. That still negatively impacts performance.
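To make the sizeof point concrete, here is a small sketch that computes the size of `struct { char tag; void *p; }` for a given pointer width. `struct_size` is a made-up helper, and it assumes a pointer is aligned to its own size, as on common ABIs.

```python
import struct


def struct_size(pointer_bytes: int) -> int:
    """Size of `struct { char tag; void *p; }`, assuming natural alignment."""
    offset = 1                               # the 1-byte `tag` comes first
    padding = (-offset) % pointer_bytes      # pad up to the pointer's alignment
    return offset + padding + pointer_bytes  # tag + padding + pointer


if __name__ == "__main__":
    print("32-bit target:", struct_size(4))
    print("64-bit target:", struct_size(8))
    # Pointer size of the machine running this script, for comparison.
    print("this machine's pointer size:", struct.calcsize("P"))
```

Once a frontend has folded one of those two sizes into the intermediate code as a constant, that code is tied to targets with the same pointer width.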
It's certainly possible, and interpreters exist for C and C++. However, projects using these languages will often use platform-specific code (like the Windows APIs) which stops them from being portable. Interpreted languages generally supply platform-independent core libraries.
Modern compilers – like Clang, LLVM and GCC – all compile your source code to an intermediate language. This means that the same code-level optimizations can be applied to any language that the compiler can convert, and it also enables tools like Emscripten which can effectively compile C to JavaScript! I believe it was used for the recent JavaScript Unreal Engine demo.
A Java example: Android 4.4 introduced a new experimental runtime virtual machine, ART (Android Runtime).
ART straddles an interesting mid-ground between compiled and interpreted code, called ahead-of-time (AOT) compilation. Currently with Android apps, they are interpreted at runtime (using the JIT), every time you open them up. This is slow. (iOS apps, by comparison, are compiled native code, which is much faster.) With ART enabled, each Android app is compiled to native code when you install it. Then, when it’s time to run the app, it performs with all the alacrity of a native app.
Source
I have been hearing a lot lately about Scala, Clojure, etc., which are supposed to run on the JVM.
Does this mean that those languages are implementing the Java API underneath?
What does it mean for a language to run on the JVM?
Thanks.
It means that these languages can be compiled into Java bytecode, which the JVM executes.
It means that the language compiles down to JVM byte code at some point. The language doesn't need to implement the Java API; the Java API is already there (more or less all the time).
It just means if you have a JVM you should be able to run the language without another VM (although you'll need whatever class files the language compiler and libraries need, obviously).
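One way to see this: whatever the source language, the compiler's output is an ordinary class file, which starts with the magic number 0xCAFEBABE followed by big-endian minor and major version numbers. The sketch below reads that header; `read_class_header` is a made-up helper, and the sample bytes are built by hand rather than taken from a real compiler.

```python
import struct


def read_class_header(data: bytes) -> tuple:
    """Return (major, minor) version from the first 8 bytes of a class file."""
    # Layout: u4 magic, u2 minor_version, u2 major_version, all big-endian.
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a JVM class file")
    return major, minor


if __name__ == "__main__":
    # Hand-built header for illustration (major version 52 is Java SE 8).
    # A real file would come from open("Something.class", "rb").read().
    sample = struct.pack(">IHH", 0xCAFEBABE, 0, 52)
    print(read_class_header(sample))
```

Pointing it at a class file produced by scalac or the Clojure compiler should pass the same magic-number check as one produced by javac.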
There is a virtual machine that Java runs on (the JVM), which abstracts away most machine-level worries. These languages just use its bytecode as an intermediate language, as opposed to emitting architecture-specific instructions.
Usually, it just means that you have to install JRE to make sure they can execute.
And usually they don't require JDK, which is used to compile .java code into .class byte file. Instead, they provide their own compiler which runs on the JRE you have installed.
So in summary, you just need a runtime that supports Java (of some specific version).
Is it possible to translate x86 32 bit assembly code into equivalent JVM byte-code and execute it?
I have a Fortran library in .so form. I want to perform an assembly dump on it using GDB and then using a translator of some sort turn it into valid JVM bytecode.
Is this even possible?
For simplicity's sake, let's assume I don't care about platform independence anymore. Both assembly and bytecode will run on the same machine.
Nearly everything is possible, but I don't think you will find a tool that does this for you. You would therefore have to do it manually, which could take weeks or months depending on the size of the library.
Of course this may raise legal problems if the compiled library is a commercial one or copyright protected.
A better approach seems to me to develop a small Java Native Interface (JNI) wrapper in C/C++ and link the library to it. Then you will be able to call library functions from Java.
If you can get the Fortran source code you could try a JVM-Fortran compiler like Fortran-to-Java. Then you would get native JVM byte code.
I have heard that Google App Engine can run any programming language that can be transformed to Java bytecode via its JVM. I wondered if it would be possible to convert LLVM bytecode to Java bytecode, as it would be interesting to run languages that LLVM supports in the Google App Engine JVM.
It does now appear possible to convert LLVM IR bytecode to Java bytecode, using the LLJVM interpreter.
There is an interesting Disqus comment (21/03/11) from Grzegorz of kraytracing.com which explains, along with code, how he has modified LLJVM's Java class output routine to emit non-monolithic Java classes which agree in number with the input C/C++ modules. He suggests that his technique seems to avoid the excessively long 'compound' Java Constructor method argument signatures usually generated by LLJVM, and he provides links to his modifications and examples.
Although LLJVM doesn't look like it's been in active development for a couple of years now, it's still hosted on GitHub, and some documentation can still be found at its former repository at Google Code:
LLJVM # Github
LLJVM documentation # GoogleCode
I also came across the 'Proteuscc' project which also utilises LLVM to output Java Byte code (it suggests that this is specifically for C/C++, although I assume the project could be modified or fed LLVM Intermediate Representation (IR)). From http://proteuscc.sourceforge.net:
The general process of producing a Java executable with Proteus then can be summarised as below.
1. Generate human readable representation of the LLVM intermediate representation (ll file)
2. Pass this ll file as an argument to the proteus compilation system
3. The above will produce a Java jar file which can be executed or used as a library
I've extended a bash script to compile the latest versions of LLVM and Clang on Ubuntu; it can be found as a GitHub Gist, here.
[UPDATE 31/03/14] - LLJVM seems to have been dead for some while; however, Howard Chu (https://github.com/hyc) looks to have made LLJVM compatible with the latest version of LLVM (3.3). See Howard's LLJVM-LLVM3.3 branch at GitHub, here
I doubt you can, at least not without significant effort and run-time abstractions (e.g. building half a Von Neumann machine to execute certain opcodes). LLVM bitcode allows the full range of low-level unsafe "do what you want but we won't clean up the mess" features, from direct, raw, constructor-free memory allocation up to completely unchecked casts - real casts, not conversions - so you can take an i32 and bitcast it to a %stuff * if you wish. Also, JVMs are heavily geared towards objects and methods, while the LLVM guys are lucky they have function pointers and structs.
On the other hand, it seems that C can be compiled to Java bytecode and LLVM bitcode can be compiled to Javascript (although many features, e.g. dynamic loading and stdlib functions, are lacking), so it should be possible, given enough effort.
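To give a feel for the kind of run-time abstraction this would need, here is a toy sketch in which all "C memory" is one flat byte array and a pointer is just an integer index into it. It is written in Python for brevity and is not how LLJVM or any other tool necessarily works; `FlatMemory` and its methods are invented names.

```python
import struct


class FlatMemory:
    """Emulated raw memory: one byte array, with integer addresses as pointers."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        self.next_free = 8  # leave address 0 unused so it can stand in for NULL

    def malloc(self, nbytes: int) -> int:
        """Bump allocator: hands out the next free address and never frees."""
        addr = self.next_free
        if addr + nbytes > len(self.data):
            raise MemoryError("out of emulated memory")
        self.next_free += nbytes
        return addr

    def store_i32(self, addr: int, value: int) -> None:
        struct.pack_into("<i", self.data, addr, value)

    def load_i32(self, addr: int) -> int:
        return struct.unpack_from("<i", self.data, addr)[0]

    def load_f32(self, addr: int) -> float:
        # Reinterprets the same four bytes as a float: a real cast, not a conversion.
        return struct.unpack_from("<f", self.data, addr)[0]


if __name__ == "__main__":
    mem = FlatMemory(64)
    p = mem.malloc(4)
    mem.store_i32(p, 0x3F800000)          # bit pattern of 1.0f
    print(mem.load_i32(p), mem.load_f32(p))
    q = mem.malloc(4)
    mem.store_i32(q, p)                   # an i32 that happens to hold an address
    print(mem.load_i32(mem.load_i32(q)))  # use that integer as a pointer again
```

The price is that every load and store becomes an array access, and none of this memory is visible to the host VM as ordinary objects.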
Late to the discussion: Sulong executes LLVM IR on the JVM. It creates executable nodes (which are Java objects) from the LLVM IR instead of converting the LLVM IR to Java bytecode. These executable nodes form an AST interpreter. You can check out the project at https://github.com/graalvm/sulong or read a paper about it at http://dl.acm.org/citation.cfm?id=2998416. Disclaimer: I'm working on this project.
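For readers unfamiliar with the term, here is a toy illustration of the AST-interpreter idea: each operation becomes an object with an `execute` method, and running the program means calling `execute` on the root node. This is a generic Python sketch with invented class names, not Sulong's actual node classes or API.

```python
class Const:
    """Leaf node: a literal value."""
    def __init__(self, value):
        self.value = value

    def execute(self, env):
        return self.value


class Local:
    """Leaf node: reads a named value (e.g. %a) from the environment."""
    def __init__(self, name):
        self.name = name

    def execute(self, env):
        return env[self.name]


class Add:
    """Inner node: executes both children, then adds the results."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def execute(self, env):
        return self.left.execute(env) + self.right.execute(env)


class Mul:
    """Inner node: executes both children, then multiplies the results."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def execute(self, env):
        return self.left.execute(env) * self.right.execute(env)


if __name__ == "__main__":
    # Roughly:  %t = add i32 %a, 2   followed by   %r = mul i32 %t, %b
    tree = Mul(Add(Local("a"), Const(2)), Local("b"))
    print(tree.execute({"a": 3, "b": 4}))
```

No bytecode is generated from the input program at all; the node objects themselves are what runs.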
Read this: http://vmkit.llvm.org/. I am not sure that it will help you but it seems to be relevant.
Note: This project is no longer maintained.