Which is the fastest way of calling a native library from Java?
The ones I know about are
NativeCall - what we're currently using
JNA - haven't used it, but looks reasonable
JNI - looks horrendous to write, but we'll do it if we get the speed
Swig makes JNI easier too.
In terms of speed, I suspect there will be subtle variations - I strongly suggest you pick a call that you know you'll be making a lot, and benchmark all of the solutions offered.
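If it helps, here is a minimal sketch of the kind of micro-benchmark I mean; the method under test and the iteration counts are placeholders, so treat the printed numbers as rough indications only.

// Minimal timing sketch (the method under test and iteration counts are placeholders).
// Warm up first so the JIT has compiled the call path before measuring.
public class NativeCallBenchmark {

    public static void main(String[] args) {
        final int warmup = 100_000;
        final int iterations = 1_000_000;

        for (int i = 0; i < warmup; i++) {
            callUnderTest();
        }

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            callUnderTest();
        }
        long elapsed = System.nanoTime() - start;
        System.out.printf("%.1f ns per call%n", (double) elapsed / iterations);
    }

    // Replace with the actual binding you want to measure (JNI, JNA, NativeCall, ...).
    private static void callUnderTest() {
        // e.g. MyNativeLib.INSTANCE.someFunction();
    }
}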
JNI is the fastest. JNA is very slow compared to JNI (the call overhead is probably one order of magnitude), but it is a fantastic library because it makes native access so easy. JNA is great if you need to make an occasional call to some native API. If you care about performance, I wouldn't use it in any "tight loops."
I'm not sure where NativeCall fits in the spectrum.
Quite a few parameters influence the performance of interfaces between programming languages: what device the JVM runs on, who developed it (in case it's not the usual Sun JVM), whether you will need to call back into Java code from native code, the threading model of the JVM on your operating system, and how asynchronous the native code will be...
You may not find a reliable benchmark that measures exactly what you need, I'm afraid.
This blog entry claims that due to the introspection mechanisms used by JNA, it'll be significantly slower than JNI. I suspect that NativeCall will use similar mechanisms and thus perform in a similar fashion.
However you should probably benchmark based on the particular objects you're referencing and/or marshalling between Java and C.
I would second the recommendation of SWIG. It makes life considerably easier for Java/C interfacing.
I'm researching methods for computing expensive vector operations in Java, e.g. dot-products or multiplications between large matrices. There are a few good threads on here on this topic, like this and this. It appears that there is no reliable way of having the JIT compile code to use CPU vector instructions (SSE2, AVX, MMX...). Moreover, high-performance linear algebra libraries (ND4J, jblas, ...) do in fact make JNI calls to BLAS/LAPACK libraries for the core routines. And I understand BLAS/LAPACK packages to be the de facto standard choices for native linear algebra computations.
On the other hand, others (JAMA, ...) implement their algorithms in pure Java without native calls.
My questions are:
What are the best practices here?
Is making native calls to BLAS/LAPACK actually a recommended choice? Are there other libraries worth considering?
Is the overhead of JNI calls negligible compared to the performance gain? Does anyone have experience as to where the threshold lies (e.g. how small an input must be for a JNI call to cost more than a pure Java routine)?
How big is the portability tradeoff?
I hope this question could be of help both for those who develop their own computation routines, and for those who just want to make an educated choice between different implementations.
Insights are appreciated!
There are no clear best practices for every case. Whether you could/should use a pure Java solution (without SIMD instructions) or SIMD-optimized native code through JNI depends on your particular application, and specifically on the size of your arrays and possible restrictions on the target system.
There could be a requirement that you are not allowed to install specific native libraries on the target system and BLAS is not already installed. In that case you simply have to use a Java library.
Pure Java libraries tend to perform better for arrays with length much smaller than 100 and at some point after that you get better performance using native libraries through JNI. As always, your mileage may vary.
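As a concrete baseline for finding that crossover point, a pure-Java dot product like the sketch below can be timed against a BLAS-backed equivalent (e.g. via netlib-java or ND4J) for increasing array lengths; the class and the array sizes here are purely illustrative.

// Pure-Java baseline for a crossover benchmark (illustrative only).
public final class DotProduct {

    public static double dot(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] a = new double[1_000];
        double[] b = new double[1_000];
        java.util.Arrays.fill(a, 1.5);
        java.util.Arrays.fill(b, 2.0);
        System.out.println(dot(a, b)); // 3000.0
    }
}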
Pertinent benchmarks have been performed (in random order):
http://ojalgo.org/performance_ejml.html
http://lessthanoptimal.github.io/Java-Matrix-Benchmark/
Performance of Java matrix math libraries?
These benchmarks can be as confusing as they are informative. One library may be faster for some operation and slower for another. Also keep in mind that there may be more than one implementation of BLAS available for your system; I currently have three installed on mine: blas, atlas and openblas. So apart from choosing a Java library that wraps a BLAS implementation, you also have to choose the underlying BLAS implementation.
This answer has a fairly up-to-date list, except that it doesn't mention nd4j, which is rather new. Keep in mind that jeigen depends on eigen, not on BLAS.
Does anybody know of any good libraries of OpenCL procedures (ideally with good documentation)?
I'm also interested in D-language bindings.
Has anybody seen benchmarks comparing the performance of native OpenCL and/or OpenGL applications with the performance of the Java bindings? I know that DLL calls cause some performance decline. Will an application written in C/C++ always be faster than the same one written in Java?
As Jakob already said, my D wrapper is at https://github.com/Trass3r/cl4d
With inlining, -version=NO_CL_EXCEPTIONS and proper dead code elimination the code should be nearly equivalent to a manually coded app using the C API directly.
So the wrapper introduces almost no overhead, performance depends on your kernels and clever memory transport.
How about JavaCL, which works for me?
As far as I have seen the cost of binding is fairly small compared to other overheads such as compiling the CL code and exchanging data with the GPU.
I develop applications/programs in C/C++. I am more versed in these two languages and love being a C++ developer. I am wondering how to create a Java program that contains all my C++ code.
I mean, I would like to wrap all my C++ code (which is already developed) inside a Java class, but I'm clueless about how to do it.
Please post your responses or the methods/steps for integrating C++ inside Java.
(JNI seems to be the way, but I could not figure out from the web how to use it.)
FYI, I use Eclipse IDE to develop.
What packages should I include in my project workspace, and how?
Instead of JNI, or JNI with some assist from an automatic wrapper generator like SWIG, or even JNA, you might consider separating the C/C++ and Java into separate processes and using some form of IPC and/or Java's Process abstraction to call to a program written in C/C++. This approach abandons "wrapping," so in some sense it isn't an answer to this question, but please read on before down-voting. I believe that this is a reasonable answer to the broader issue in some cases.
The reason for such an approach is that when you call C/C++ directly from Java, the JVM is put at risk of any error in the native code. The risk depends somewhat on how much of the native code is yours and how much you link to third party code (and how much access you have to the source code of such third party code).
I've run into a situation where I had to call a C/C++ library from Java and the C/C++ library had bugs that caused the JVM to crash. I didn't have the third party source code, so I couldn't fix the bug(s) in the native code. The eventual solution was to call a separate C/C++ program, linked to the third party library. The Java application then made calls to many ephemeral native processes whenever it needed to call the C/C++ stuff.
If the native code has a problem, you might be able to recover/retry in Java. If the native code is wrapped and called from the JVM process, it could take down the entire JVM.
This approach has performance/resource consumption implications and may not be a good fit for your application, but it is worth considering in certain situations.
Having a separate application that exercises the functionality of the C/C++ code is potentially useful as a stand-alone utility and for testing. And having some clean command-line or IPC interface could ease future integrations with other languages.
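For illustration, here is a minimal sketch of the separate-process approach using ProcessBuilder; the executable name "native_solver" and its line-based output are assumptions, not part of any real project.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Minimal sketch: run a native helper as a separate process so a crash in the
// C/C++ code cannot take down the JVM. The executable name "native_solver"
// and its line-based output are assumptions for illustration.
public class NativeProcessCaller {

    public static String callNative(String input) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder("./native_solver", input);
        pb.redirectErrorStream(true);          // merge stderr into stdout
        Process process = pb.start();

        StringBuilder output = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            // The native side failed; recover or retry here instead of crashing the JVM.
            throw new IOException("native_solver exited with code " + exitCode);
        }
        return output.toString();
    }
}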
As another alternative, you could get into native signal handling to mitigate the risks to the integrity of the JVM process if you like and stick with a wrapping solution.
If you want to call C++ from Java, you'll need to use JNI - Java Native Interface.
Be warned that you lose some of the benefits of the garbage collector, since it can't deal with your C++ objects, and your code won't be portable anymore.
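To make the garbage-collector point concrete, native resources usually end up behind an explicit-release wrapper; the sketch below is hypothetical (the class, library and native method names are made up) and only illustrates the pattern.

// Hypothetical sketch: wrapping a native C++ object behind AutoCloseable so its
// lifetime is managed explicitly. The GC only reclaims the Java wrapper, not
// the underlying C++ allocation.
public class NativeMatrix implements AutoCloseable {
    static {
        System.loadLibrary("matrixjni"); // assumed library name
    }

    private long handle; // opaque pointer returned by the native side

    public NativeMatrix(int rows, int cols) {
        handle = create(rows, cols);
    }

    @Override
    public void close() {
        if (handle != 0) {
            destroy(handle); // frees the C++ object
            handle = 0;
        }
    }

    private static native long create(int rows, int cols);
    private static native void destroy(long handle);
}

// Usage: try (NativeMatrix m = new NativeMatrix(3, 3)) { ... }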
Maybe you'd be better served by learning to write 100% Java and leaving C++ behind, but that's just a suggestion.
You can't "just wrap it", you have to write some C/C++ glue.
For starters, SWIG can do most of the works for you.
There are plenty of tutorials for doing exactly what you want to do. For example, check out: http://www.javamex.com/tutorials/jni/getting_started.shtml
There are also plenty of caveats of using JNI. I've recently started working with it (just for fun, really), and it tends to be a lot less fun than I had first anticipated.
First of all, you have to deal with cryptic code such as:
#include "test_Test.h"
JNIEXPORT jint JNICALL Java_test_Test_getDoubled(JNIEnv *env, jclass clz, jint n) {
return n * 2;
}
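For completeness, the matching Java side would look roughly like this (the shared library name "test" is an assumption):

package test;

// Java counterpart of the native function above; the shared library name
// "test" (libtest.so / test.dll) is an assumption.
public class Test {
    static {
        System.loadLibrary("test");
    }

    // Implemented in C as Java_test_Test_getDoubled.
    public static native int getDoubled(int n);

    public static void main(String[] args) {
        System.out.println(getDoubled(21)); // prints 42
    }
}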
Second of all, it tends to downplay one of the primary reasons why you use Java in the first place: WORA (Write Once, Run Anywhere). As duffymo mentioned, there can also be issues with the garbage collector, but I think that in recent years, the JVM has gotten pretty smart about JNI integration.
With that said, to port all of your C++ code to JNI, you'd need to refactor your interfaces (and maybe even do some internal gymnastics). It's not impossible, but it's really not recommended. The ideal solution is just re-writing your code in Java.
With that said, you could also "convert" your code from C/C++ into Java programmatically, and there are multitudes of such utilities. But, of course, machines are dumber than people, and they are bound to make mistakes, depending on how complex your class is.
I would avoid JNI because it's tedious to write, verbose, and just an altogether pain. Instead, I'd use the JNA library, which makes writing native integrations simple.
https://github.com/twall/jna/
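To show how little boilerplate JNA needs, here is a minimal sketch (essentially the classic JNA example) that calls printf from the C runtime; the library name "c" assumes Linux, on Windows it would be "msvcrt".

import com.sun.jna.Library;
import com.sun.jna.Native;

// Minimal JNA sketch: call printf from the C runtime without writing any
// native glue code.
public class JnaHello {

    public interface CLibrary extends Library {
        int printf(String format, Object... args);
    }

    public static void main(String[] args) {
        CLibrary libc = (CLibrary) Native.loadLibrary("c", CLibrary.class);
        libc.printf("Hello from %s via JNA\n", "Java");
    }
}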
Good luck.
You can write C++ code through JNI but there isn't a direct mapping from C++ classes to Java classes.
I've used JNI to fix problems found in the Android SDK (specifically, an incredibly slow FloatBuffer.put implementation) and I may end up using it for some performance-critical areas. My advice would be to use it sparingly: duck in, do the performance-critical stuff, and leave, without doing any memory allocation if you can help it. Also, don't forget to measure your code to see if it really is faster.
Out of interest, what platform are you developing for? The only platform where it would make sense to wrap a lot of C++ code in a light java layer would be Android - on other platforms, just compile in C++ and have done with it.
A JNI module is not a Java class; it's C. Using JNI incurs many restrictions, and some Java environments don't support JNI well.
There is no supported way to "wrap my C++ code inside a Java class" (EDIT: I mean, there is no way without JNI, and JNI is problematic).
You could investigate a custom C++ compiler that emits Java bytecode, but nobody (including me) would recommend that approach.
BridJ was designed precisely for that (and it's supported by JNAerator, which will parse your C/C++ headers and spit out the Java bindings for you).
It is a recent alternative to JNA, with support for C++.
I've looked at the related threads on StackOverflow and Googled with not much luck. I'm also very new to Java (I'm coming from a C# and .NET background) so please bear with me. There is so much available in the Java world it's pretty overwhelming.
I'm starting on a new Java-on-Linux project that requires some heavy and highly repetitive numerical calculations (e.g. statistics, FFT, linear algebra, matrices, etc.). So maximizing the performance of the mathematical operations is a requirement, as is ensuring the math is correct. Hence I have an interest in finding a Java library that leverages native acceleration such as MKL and is proven (so commercial options are definitely a possibility here).
In the .NET space there are highly optimized and MKL accelerated commercial Mathematical libraries such as Centerspace NMath and Extreme Optimization. Is there anything comparable in Java?
Most of the math libraries I have found for Java either do not seem to be actively maintained (such as Colt) or do not appear to leverage MKL or other native acceleration (such as Apache Commons Math).
I have considered trying to leverage MKL directly from Java myself (e.g. via JNI), but being new to Java (let alone to interoperating between Java and native libraries), it seemed smarter to find a Java library that has already done this correctly and efficiently and is proven.
Again, I apologize if I am mistaken or misguided (even regarding any of the libraries I've mentioned) or for my ignorance of the Java offerings. It's a whole new world for me coming from the heavily commercialized Microsoft stack, so I could easily be wrong about where to look and about the Java libraries I've mentioned. I would greatly appreciate any help or advice.
For things like FFT (bulk operations on arrays), the range check in Java might kill your performance (at least it did recently). You probably want to look for libraries that optimize the provability of their index bounds.
According to the HotSpot spec:
"The Java programming language specification requires array bounds checking to be performed with each array access. An index bounds check can be eliminated when the compiler can prove that an index used for an array access is within bounds."
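As an illustration of what "provable index bounds" looks like in practice, the sketch below keeps the loop bound tied directly to the array length; whether a given JIT actually eliminates the check is VM-dependent, so this is a pattern, not a guarantee.

// Sketch: looping directly against a.length keeps the index trivially within
// bounds, which gives the JIT the best chance of hoisting/eliminating the
// bounds check; whether it actually does so depends on the VM.
public final class ScaleInPlace {

    public static void scale(double[] a, double factor) {
        for (int i = 0; i < a.length; i++) {   // bound tied to a.length
            a[i] *= factor;
        }
    }

    // By contrast, passing an externally supplied bound (e.g. scale(a, n, factor)
    // with n unrelated to a.length) makes the proof harder for the compiler.
}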
I would actually look at JNI and do your bulk operations there if they are individually very large. The longer the operation takes (e.g. solving a large linear system, or a large FFT), the more it's worth it to use JNI (even if you have to memcpy there and back).
Personally, I agree with your general approach, offloading the heavyweight maths from Java to a commercial-grade library.
Googling around for Java/MKL integration, I found this, so what you propose is technically possible. Another option to consider would be the NAG libraries. I use MKL all the time, though I program in Fortran so there are no integration issues. I can certainly recommend its quality and performance. We tested, for instance, the MKL version of FFTW against a version we built from source ourselves. The MKL implementation was faster by a small integer multiple.
If you have concerns about the performance of calling a library through JNI, then you should plan to structure your application to make fewer larger calls in preference to more smaller ones. As to the difficulties of using JNI, my view (I've done some JNI programming) is that the initial effort you have to make in learning how to use the interface will be well rewarded.
I note that you don't seem to be overwhelmed yet with suggestions of what Java maths libraries you could use. Like you I would be suspicious of research-quality, low-usage Java libraries trawled from the net.
You'd probably be better off avoiding them, I think. I could be wrong (it's not an area I'm too familiar with, so don't put too much weight on this unless a few others agree with me), but calling through JNI has quite a large overhead, since it has to leave the JRE to do it. So unless you group a lot of work together into a single call, the slight benefit of the external libraries will be hugely outweighed by the cost of calling them. I'd give up looking for an MKL library and find an optimized pure Java library. I can't say I know of any better than the standard one to recommend, though, sorry.
I need an alternative to Java, because I am working on a genetics-calculation project.
It takes a lot of memory and most of the CPU time, and therefore it won't work when I deploy it on a server, because many people use the program at the same time.
Does anybody know another language that does not run in a virtual machine and is similar to Java (object-oriented, with exceptions and type safety)?
Best regards,
Jonathan
To answer the direct question: there are dozens of languages that fit your explicit requirements. AmmoQ listed a few; Wikipedia has many more.
And I think that you'll be disappointed with every one of them.
Despite what Java haters want you to think, Java's performance is not much different than any other compiled language. Just changing languages won't improve performance much.
You'll probably do better by getting a profiler, and looking at the algorithms that you used.
Good luck!
If your app is consuming most of the CPU and memory on a single-user workstation, I'm skeptical that translating it into some non-VM language is going to help much. With Java, you're depending on the VM for things like memory management; you would have to re-implement its equivalent in your non-VM language. Also, Java's memory management is pretty good. Your application probably isn't real-time sensitive, so having it pause once in a while isn't a problem. Besides, you're going to be running this on a multi-user system anyway, right?
Memory usage will have more to do with your underlying data structures and algorithms than with something magical about the language. Unless you've got a really great memory allocator library for your chosen language, you may find you use just as much memory (if not more) due to bugs in your program.
Since your app is compute-intensive, some other language is unlikely to make it less so, unless you insert some strategic sleep() calls throughout the code to deliberately make it yield the CPU more often. This will slow it down, but will be nicer to the other users.
Try running your app with Java's -server option. That will engage a VM designed for long-running programs and includes a JIT that will compile your Java into native code. It may make your program run a bit faster, but it will still be CPU and memory bound.
If you don't like C++, you might consider D, Objective-C, or the new Go language from Google.
You may try C++, it satisfies all your requirements.
Use Python along with the numpy, scipy, and matplotlib packages. numpy is a Python package that has all the number-crunching code implemented in C, so runtime performance won't be an issue even though it runs on the Python virtual machine.
If you want compiled, statically typed language only, have a look at Haskell.
Can your algorithms be parallelised?
No matter what language you use you may come up against limitations at some point if you use a single process. Using something like Hadoop will mean you can retain Java and ease of use but you can run in parallel across many machines.
On the same theme as @Barry Brown's answer:
If your application is compute / memory intensive in Java, it will probably be compute / memory intensive in C++ or any other "more efficient" language. You might get some extra leeway ... but you'll soon run into the same performance wall.
IMO, you need to do the following things:
You need to profile your application and look for any major performance bottlenecks. You might find some real surprises.
In the light of the previous step, review the design and algorithms, paying attention to space and time complexity issues. Do some research to see if someone has discovered better algorithms for doing the computations that are problematic from a performance perspective.
If the previous steps don't get you ahead of the curve, see if you can upgrade your platform; get a bigger machine with more processors, more memory, etc.
If you are still stuck, your only other option is a scale-out design. Assuming that individual user requests are processed in a single thread, re-architect your system so that you can run "workers" across multiple servers, with a load balancer at the front. If you have a persistent back-end, look into how you can replicate it. And so on.
Figure out if the key algorithms can be parallelized / distributed so that the resource intensive parts of a user request execute in parallel on multiple processors / multiple servers; e.g. using a "map-reduce" framework.
OK, so there is no easy answer. But simply changing programming languages is NOT a good answer.
Regardless of the language, your program will need to share resources with others when running in multiple instances on a single machine. That is simply the way computers work.
The best way to allow your current program to scale to the available hardware resources is to chop your work into small, independent pieces and make them implement the Callable interface. These can then be executed by a suitable Executor, which can be chosen according to the available hardware; see the Executors class for many preconfigured versions. This is what I would recommend you do here, as sketched below.
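Here is a minimal sketch of that Callable/Executor approach; the per-chunk work and the number of chunks are placeholders.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Minimal sketch: split the calculation into independent Callable chunks and
// let an Executor sized to the machine run them. The work in computeChunk()
// is a placeholder.
public class ParallelCalculation {

    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);

        List<Callable<Double>> tasks = new ArrayList<>();
        for (int chunk = 0; chunk < 100; chunk++) {
            final int id = chunk;
            tasks.add(() -> computeChunk(id));   // each piece is independent
        }

        double total = 0.0;
        for (Future<Double> result : pool.invokeAll(tasks)) {
            total += result.get();
        }
        pool.shutdown();
        System.out.println("Result: " + total);
    }

    private static double computeChunk(int id) {
        return id * 0.5; // placeholder for the real per-chunk calculation
    }
}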
If you want to switch languages, then Mac OS X 10.6 allows for programming in the way described above with C and Objective-C, and if you do it properly OS X can distribute the code over all available computing resources (both CPU and GPU and what have you).
If none of the above is interesting to you, then consider one of the Grid frameworks. Terracotta may be a good place to start.
F#, Ruby, or Python; they are very good for calculations, among many other things.
NASA uses Python.
Well... I think you are looking for C#.
C# is object-oriented and has excellent support for generics. You can use it to write both WinForms and server-side applications.
You can read more about C# generics here: http://msdn.microsoft.com/en-us/library/ms379564(VS.80).aspx
Edit:
My mistake: geneTIcs, not geneRIcs. It does not change the fact that C# will do the job, and using generics will reduce the load significantly.
You might find the computer language shootout here interesting.
For example, here's Java vs C++.
You might find OCaml (from which F# is derived) worth a look; it meets your requirements for OO, exceptions, and static types, and it has a native compiler. However, according to the shootout, you may be trading less memory for lower speed.