JAMA Matrix performance - java

First of all, sorry for my bad English, but I need your help.
I have developed a simulation program with Java Swing that uses a lot of matrix calculations. My program is basically finished, but I need to improve its performance, so I used the Java VisualVM profiler to identify performance problems. I noticed that the initialization of JAMA matrices takes a lot of time: after running my program, over 3 MB of objects had been allocated by JAMA. That's a lot, isn't it? I think that's why the performance is bad.
Is there any better library than JAMA for matrices? I am using 3x3 matrices and only need multiplication and inverse operations. Or is there anything else I can do?

Usually matrix math libraries are not optimized for speed on small matrices.
You can see for yourself by taking a few stackshots, which are liable to show a large fraction of time in overhead functions like memory allocation and option checking.
What you can do instead (which I've done) is write special-purpose routines to do the multiplication and inverse, since you know the matrices are 3x3.
Multiplication is trivial and you can unroll the whole thing.
The inverse of a 3x3 matrix can also be written out in less time than a coffee break :)
Wikipedia gives you the formula.
Whatever you do, try to minimize memory allocation.
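As a rough sketch of what such hand-rolled routines could look like (the class and method names are made up here, and the matrices are stored as flat row-major double[9] arrays so nothing is allocated per call):

    // Hypothetical sketch of hand-rolled 3x3 routines; not from JAMA or any library.
    public final class Mat3 {

        // Multiply two 3x3 matrices (flat row-major double[9]),
        // writing the result into a caller-supplied array to avoid allocation.
        static void multiply(double[] a, double[] b, double[] out) {
            out[0] = a[0]*b[0] + a[1]*b[3] + a[2]*b[6];
            out[1] = a[0]*b[1] + a[1]*b[4] + a[2]*b[7];
            out[2] = a[0]*b[2] + a[1]*b[5] + a[2]*b[8];
            out[3] = a[3]*b[0] + a[4]*b[3] + a[5]*b[6];
            out[4] = a[3]*b[1] + a[4]*b[4] + a[5]*b[7];
            out[5] = a[3]*b[2] + a[4]*b[5] + a[5]*b[8];
            out[6] = a[6]*b[0] + a[7]*b[3] + a[8]*b[6];
            out[7] = a[6]*b[1] + a[7]*b[4] + a[8]*b[7];
            out[8] = a[6]*b[2] + a[7]*b[5] + a[8]*b[8];
        }

        // Invert a 3x3 matrix using the adjugate formula (as on Wikipedia).
        // Returns false, leaving 'out' untouched, if the matrix is (near-)singular.
        static boolean invert(double[] m, double[] out) {
            double c00 = m[4]*m[8] - m[5]*m[7];
            double c01 = m[5]*m[6] - m[3]*m[8];
            double c02 = m[3]*m[7] - m[4]*m[6];
            double det = m[0]*c00 + m[1]*c01 + m[2]*c02;
            if (Math.abs(det) < 1e-12) return false;
            double inv = 1.0 / det;
            out[0] = c00 * inv;
            out[1] = (m[2]*m[7] - m[1]*m[8]) * inv;
            out[2] = (m[1]*m[5] - m[2]*m[4]) * inv;
            out[3] = c01 * inv;
            out[4] = (m[0]*m[8] - m[2]*m[6]) * inv;
            out[5] = (m[2]*m[3] - m[0]*m[5]) * inv;
            out[6] = c02 * inv;
            out[7] = (m[1]*m[6] - m[0]*m[7]) * inv;
            out[8] = (m[0]*m[4] - m[1]*m[3]) * inv;
            return true;
        }
    }

Reusing the same output arrays across calls keeps allocation (and hence garbage collection) to a minimum, which is the main point of the advice above.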

Related

Multi-threaded Matrix initializing in Java

I am currently using the JAMA Matrix class.
My program currently initializes a LOT of small matrices (20x20 tops in size) and then does some small calculations and reads the results.
About 80% of the run time is spent initializing and reading the matrices, and I was wondering if there is a way I can do this multi-threaded for increased speed. (I know there are libraries like OjAlgo that are great for multi-threaded matrix manipulation, but all I am doing is initializing the matrices and reading them again.)
If I use another matrix package, will it initialize the matrices with multiple threads, or would the initialization still be single-threaded and only the algorithms multi-threaded?
Multi-threading "within" the matrices won't be of any benefit for such small matrices.
Switching to a library that internally uses double[] rather than double[][] could make a difference, but my guess is you should focus on how you (re)use the matrices. Maybe your program logic can be multi-threaded (see the sketch below).
For very small matrices (2x2, 3x3, 4x4 ...) some libraries have specialised data structures and algorithms that could speed things up significantly.
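A minimal sketch of the "multi-thread the program logic" idea, assuming each matrix can be built and processed independently of the others; buildMatrix and compute are placeholders for your own code:

    import java.util.stream.IntStream;

    class ParallelBatch {
        static double[] run(int count) {
            double[] results = new double[count];
            // Each index is independent, so the work spreads across cores automatically.
            IntStream.range(0, count).parallel().forEach(i -> {
                double[][] m = buildMatrix(i);   // stands in for your existing initialization
                results[i] = compute(m);         // stands in for your small calculation
            });
            return results;
        }

        // Placeholders so the sketch compiles on its own.
        static double[][] buildMatrix(int i) { return new double[20][20]; }
        static double compute(double[][] m)  { return m[0][0]; }
    }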

Custom math functions vs. supplied Math functions?

I am basically making a Java program that will have to run a lot of calculations pretty quickly (each frame, aiming for at least 30 fps). These will mostly be trigonometric and power functions.
The question I'm asking is:
Which is faster: using the Math functions already supplied by Java, or writing my own functions?
The built-in Math functions will be extremely difficult to beat, given that most of them have special JVM magic that makes them use hardware intrinsics. You could conceivably beat some of them by trading away accuracy with a lot of work, but you're very unlikely to beat the Math utilities otherwise.
You will want to use the java.lang.Math functions, as most of them run natively in the JVM. You can see the source code here.
Lots of very intelligent and well-qualified people have put a lot of effort, over many years, into making the Math functions work as quickly and as accurately as possible. So unless you're smarter than all of them, and have years of free time to spend on this, it's very unlikely that you'll be able to do a better job.
Most of them are native too - they're not actually in Java. So writing faster versions of them in Java is going to be a complete no-go. You're probably best off using a mixture of C and Assembly Language when you come to write your own; and you'll need to know all the quirks of whatever hardware you're going to be running this on.
Moreover, the current implementations have been tested over many years, by the fact that millions of people all around the world are using Java in some way. You're not going to have access to the same body of testers, so your functions will automatically be more error-prone than the standard ones. This is unavoidable.
So are you still thinking about writing your own functions?
If you can bear about 1e-15 relative error (or more like 1e-13 for pow(double,double)), you can try this, which should be faster than java.lang.Math if you call it a lot: http://sourceforge.net/projects/jafama/
As others have said, it's usually hard to beat java.lang.Math in pure Java if you want to keep similar (1-ulp-ish) accuracy, but a little less accuracy in double precision is often perfectly bearable (and still much more accurate than computing with floats), and it can allow for a noticeable speed-up.
What might be an option is caching the values. If you know you are only going to need a fixed set of inputs, or you can get away without perfect accuracy, this can save a lot of time. Say you want to draw a lot of circles: precompute the values of sin and cos for each degree, then use those values when drawing. Most circles will be small enough that you can't see the difference, and the small number that are very big can be done using the library functions.
Be sure to test whether this is worth it. On my 5-year-old MacBook I can do about a million evaluations of cos per second.
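A small sketch of such a lookup table; the class name and the one-degree resolution are chosen purely for illustration:

    // Degree-resolution sin/cos table: trades accuracy for lookup speed, as described above.
    final class TrigTable {
        private static final double[] SIN = new double[360];
        private static final double[] COS = new double[360];
        static {
            for (int deg = 0; deg < 360; deg++) {
                double rad = Math.toRadians(deg);
                SIN[deg] = Math.sin(rad);
                COS[deg] = Math.cos(rad);
            }
        }

        // Look up by whole degrees; callers accept the ~1-degree quantisation error.
        static double sinDeg(int degrees) {
            return SIN[((degrees % 360) + 360) % 360];
        }
        static double cosDeg(int degrees) {
            return COS[((degrees % 360) + 360) % 360];
        }
    }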

Few calculations with huge matrices vs. lots of calculations with small matrices

I am working on a Java project which has thousands of matrix calculations. But the matrices are at most 10x10 matrices.
I wonder whether it is better to use a matrix library or to write the simple functions (determinant(), dotproduct(), etc.) myself, because for small matrices it is often advised not to use libraries but to do the operations with custom functions.
I know that matrix libraries like JAMA provides high performance when it comes to 10000x10000 matrices or so.
Instead of making 5-6 calculations with 10000x10000 matrices, I make 100000 calculations with 10x10 matrices. The number of primitive operations is nearly the same.
Are both cases same in terms of performance? Should I treat myself as if I'm working with huge matrices and use a library?
I suspect for a 10x10 matrix you won't see much difference.
In tests I have done hand-coding a 4x4 matrix, the biggest overhead was loading the data into the L1 cache, and how you did it didn't matter very much. For a 3x3 matrix and smaller it did appear to make a significant difference.
Getting the maximum possible speed (with lots of effort)
For maximum possible speed I would suggest writing a C function that uses vector math intrinsics such as Streaming SIMD Extensions (SSE) or Advanced Vector Extensions (AVX) operations, together with multithreading (e.g. via OpenMP).
Your Java program would pass all 100k matrices to this native function, which would then handle all the calculations. Portability becomes an issue, e.g. AVX instructions are only supported on recent CPUs, and developer effort increases a lot too, especially if you are not familiar with SSE/AVX.
Reasonable speed without too much effort
You should use multiple threads by creating a class that extends java.lang.Thread or implements java.lang.Runnable. Each thread iterates through a subset of the matrices, calling your maths routine(s) for each matrix. This part is key to getting decent speed on multi-core CPUs. The maths could be your own Java function to do the calculations on a single matrix, or you could use a library's functions.
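One way this could look, using an ExecutorService instead of raw Thread subclasses; processMatrix is a placeholder for your own maths or a library call:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    class MatrixBatchProcessor {
        // Splits the matrices into one chunk per worker; each task loops over its chunk.
        static void processAll(double[][][] matrices) throws InterruptedException {
            int threads = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            int chunk = (matrices.length + threads - 1) / threads;
            for (int t = 0; t < threads; t++) {
                final int from = t * chunk;
                final int to = Math.min(from + chunk, matrices.length);
                pool.execute(() -> {
                    for (int i = from; i < to; i++) {
                        processMatrix(matrices[i]);   // your own maths, or a library call
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
        }

        static void processMatrix(double[][] m) {
            // placeholder: determinant, dot products, etc.
        }
    }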
I wonder whether it is better to use a matrix library or to write the simple functions (determinant(), dotproduct(), etc.) myself, because for small matrices it is often advised not to use libraries but to do the operations with custom functions.
...
Are both cases the same in terms of performance? Should I treat myself as if I'm working with huge matrices and use a library?
No, using a library and writing your own function for the maths are not the same performance-wise. You may be able to write a faster function that is specialised to your application, but consider this:
The library functions should have fewer bugs than code you will write.
A good library will use efficient implementations (i.e. the fewest operations). Do you have the time to research and implement the most efficient algorithms?
You might find the Apache Commons Math library useful. I would encourage you to benchmark Apache Commons Math and JAMA to choose the fastest.
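If you do benchmark them, something along these lines is usually enough for a first comparison; libraryCall is a placeholder for the JAMA or Commons Math operation you care about, and the warm-up loop matters because of JIT compilation:

    class TinyBench {
        public static void main(String[] args) {
            double[][] m = randomMatrix(10);

            double sink = 0;
            for (int i = 0; i < 100_000; i++) sink += libraryCall(m);   // warm-up so the JIT compiles the hot path

            long start = System.nanoTime();
            for (int i = 0; i < 1_000_000; i++) sink += libraryCall(m);
            long elapsed = System.nanoTime() - start;

            // Printing 'sink' stops the JIT from discarding the loop as dead code.
            System.out.printf("%.1f ns per call (sink=%f)%n", elapsed / 1_000_000.0, sink);
        }

        static double[][] randomMatrix(int n) {
            double[][] m = new double[n][n];
            for (double[] row : m)
                for (int c = 0; c < n; c++) row[c] = Math.random();
            return m;
        }

        // Placeholder: replace with e.g. a determinant or multiply call from the library under test.
        static double libraryCall(double[][] m) {
            return m[0][0];
        }
    }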

Performance comparison of array of arrays vs multidimensional arrays

When I was using C++ in college, I was told to use multidimensional arrays (hereafter MDA) whenever possible, since they exhibit better memory locality, being allocated in one big chunk. Arrays of arrays (AoA), on the other hand, are allocated in multiple smaller chunks, possibly scattered all over physical memory wherever vacancies are found.
So I guess the first question is: is this a myth, or is this an advice worth following?
Assuming that it's the latter, then the next question would be what to do in a language like Java that doesn't have true MDA. It's not that hard to emulate MDA with a 1DA, of course. Essentially, what is syntactic sugar for languages with MDA can be implemented as library support for languages without MDA.
Is this worth the effort? Is this too low level of an optimization issue for a language like Java? Should we just abandon arrays and use Lists even for primitives?
Another question: in Java, does allocating an AoA all at once (new int[M][N]) possibly yield a different memory layout than doing it hierarchically (new int[M][], then new int[N] for each row)?
Java and C# allocate memory in a much different fashion than C++ does. In fact, in .NET all the inner arrays of an AoA will be close together if they are allocated one after another, because memory there is one continuous chunk without any fragmentation.
But the advice still holds for C++ and still makes sense if you want maximum speed. You shouldn't follow it every time you want a multidimensional array, though: write maintainable code first and then profile it if it is slow; premature optimization is the root of all evil.
Is this worth the effort? Is this too low level of an optimization issue for a language like Java?
Generally speaking, it is not worth the effort. The best strategy is to forget about this issue in the first version of your application, and implement it in a straightforward (i.e. easy to maintain) way. If the first version runs too slowly for your requirements, use a profiling tool to find the application's bottlenecks. If the profiling suggests that arrays of arrays are likely to be the problem, do some experiments: change your data structures to simulated multi-dimensional arrays and profile to see if it makes a significant difference. [I suspect that it won't make much difference. But the most important thing is not to waste your time optimizing something unnecessarily.]
Should we just abandon arrays and use Lists even for primitives?
I wouldn't go that far. Assuming that you are dealing with arrays of a predetermined size:
arrays of objects will be a bit faster than equivalent lists of objects, and
arrays of primitives will be considerably faster and take considerably less space than equivalent lists of primitive wrappers (see the sketch after this list).
On the other hand, if your application needs to "grow" the arrays, using a List will simplify your code.
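To illustrate the primitive-array point above (sizes and names here are arbitrary): an int[] stores the values themselves in one contiguous block, while a List<Integer> stores references to boxed Integer objects:

    import java.util.ArrayList;
    import java.util.List;

    class PrimitivesVsBoxing {
        public static void main(String[] args) {
            int n = 1_000_000;

            int[] primitive = new int[n];                 // one contiguous block of ints
            for (int i = 0; i < n; i++) primitive[i] = i;

            List<Integer> boxed = new ArrayList<>(n);     // array of references plus Integer objects
            for (int i = 0; i < n; i++) boxed.add(i);     // each add may allocate an Integer on the heap
        }
    }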
I would not waste the effort of using a 1D array as a multi-dimensional array in Java, because there is no syntax to help. Of course one could define functions (methods) to hide the work for you, but you just end up with a function call instead of following a pointer, as you would with an array of arrays. Even if the compiler/interpreter speeds this up for you, I still don't think it is worth the effort. In addition, you can run into complications when trying to use code that expects 2D (or N-dim) arrays as arrays of arrays; I'm sure most general-purpose code out there is written for arrays of arrays in Java. Also, with arrays of arrays you can cheaply reassign rows (or columns, if you decide to think of them that way).
If you know that this multidim array is a bottleneck, you may disregard what I said and see if manually optimizing helps.
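For reference, the manual optimisation being discussed is usually a thin wrapper that maps 2D indices onto one flat array; the class below is illustrative only:

    // Emulating a 2D matrix with a single row-major double[]: one contiguous allocation.
    final class FlatMatrix {
        final int rows, cols;
        final double[] data;

        FlatMatrix(int rows, int cols) {
            this.rows = rows;
            this.cols = cols;
            this.data = new double[rows * cols];
        }

        double get(int r, int c)           { return data[r * cols + c]; }
        void   set(int r, int c, double v) { data[r * cols + c] = v; }
    }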
From personal experience in Java, multidimensional arrays are far, far slower than one-dimensional arrays when loading a large amount of data or accessing elements at scattered positions. I wrote a program that took a screenshot image in BMP format and then searched the screenshot for a smaller image. Loading the screenshot image (approx. 3 MB) into a multidimensional array (three-dimensional, [xPos][yPos][color], with color=0 being the red value, and so forth) took 14 seconds. Loading it into a single-dimensional array took 1 second.
The gain for finding the smaller image in the larger image was similar. It took around 28 seconds to find the smaller image when both images were stored as multi-dimensional arrays, and around a second when both were stored as one-dimensional arrays. That said, I first wrote my program using multi-dimensional arrays for the sake of readability.

matlab matrix functions in java

I have noticed that MATLAB does some matrix operations really fast. For example, adding 5 to all elements of an n*n array happens almost instantly even if the matrix is large, because you don't have to loop through every element; doing the same in Java with a for loop takes forever if the matrix is large.
I have two questions: are there efficient built-in classes in Java for doing matrix operations, and how can I code something that updates all elements of a big matrix in Java more efficiently?
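For the second question: MATLAB still visits every element under the hood, it just does so in optimized native code, and a plain Java loop over a flat array is usually fast as well. A sketch of the "add 5 to every element" update (nothing here is library-specific; the method names are illustrative):

    import java.util.stream.IntStream;

    class AddScalar {
        // Flat row-major storage: one tight loop, no per-row indirection.
        static void addFlat(double[] data, double value) {
            for (int i = 0; i < data.length; i++) {
                data[i] += value;
            }
        }

        // double[][] storage: parallelise over rows, which can help for very large matrices.
        static void addRows(double[][] m, double value) {
            IntStream.range(0, m.length).parallel().forEach(r -> {
                double[] row = m[r];
                for (int c = 0; c < row.length; c++) {
                    row[c] += value;
                }
            });
        }
    }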
Just stumbled onto this posting and thought I would throw my two cents in. I am the author of EJML, and I am also working on a performance and stability benchmark for Java libraries. While several issues go into determining how fast an algorithm is, Mikhail is correct that caching is a very important issue in the performance of large matrices. For smaller matrices the library's overhead becomes more important.
Due to overhead in array access, pure Java libraries are slower than highly optimized C libraries, even if the algorithms are exactly the same. Some libraries get around this issue by making calls to native code. You might want to check out
http://code.google.com/p/matrix-toolkits-java/
which does exactly that. There will be some overhead in copying memory from Java to the native library, but for large matrices this is insignificant.
For a benchmark on pure java performance (the one that I'm working on) check out:
http://code.google.com/p/java-matrix-benchmark/
Another benchmark is here:
http://www.ujmp.org/java-matrix/benchmark/
Either of these benchmarks should give you a good idea of performance for large matrices.
Colt may be the fastest.
"Colt provides a set of Open Source Libraries for High Performance Scientific and Technical Computing in Java. " "For example, IBM Watson's Ninja project showed that Java can indeed perform BLAS matrix computations up to 90% as fast as optimized Fortran."
JAMA!
"JAMA is a basic linear algebra package for Java. It provides user-level classes for constructing and manipulating real, dense matrices."
Or the Efficient Java Matrix Library
"Efficient Java Matrix Library (EJML) is a linear algebra library for manipulating dense matrices. Its design goals are; 1) to be as computationally efficient as possible for both small and large matrices, and 2) to be accessible to both novices and experts."
