I am making a voxel-based game where large structures made of voxels can collide with each other. The game runs smoothly until two structures get too close to each other, at which point it quickly drops to about 3 updates/second.
My current approach to adding these colliders to JBullet is to use a compound shape, with a greedy-meshing algorithm to build larger colliders out of adjacent blocks. While this is a drastic improvement over giving each voxel its own collider, it still isn't fast enough to simulate large structures colliding.
I did a bunch of research and it seems that octrees are the only way to make this run in real time, but I can't figure out how to add support for them in JBullet.
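For reference, the compound-shape setup described above looks roughly like the sketch below. MergedBox and the builder method are illustrative names for the output of the greedy-meshing pass, not JBullet API; only the CompoundShape/BoxShape/Transform calls are real JBullet classes.

import java.util.List;
import javax.vecmath.Vector3f;
import com.bulletphysics.collision.shapes.BoxShape;
import com.bulletphysics.collision.shapes.CompoundShape;
import com.bulletphysics.linearmath.Transform;

// Hypothetical record produced by the greedy-meshing pass: the centre and
// half-extents of one merged run of voxels, in structure-local coordinates.
class MergedBox {
    float cx, cy, cz; // centre
    float hx, hy, hz; // half extents
}

class ColliderBuilder {
    static CompoundShape build(List<MergedBox> mergedBoxes) {
        CompoundShape shape = new CompoundShape();
        for (MergedBox box : mergedBoxes) {
            Transform local = new Transform();
            local.setIdentity();
            local.origin.set(box.cx, box.cy, box.cz);
            shape.addChildShape(local,
                    new BoxShape(new Vector3f(box.hx, box.hy, box.hz)));
        }
        return shape;
    }
}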
I am working on a Java project that involves thousands of matrix calculations, but the matrices are at most 10x10.
I wonder whether it is better to use a matrix library or to write the simple functions (determinant(), dotproduct(), etc.) myself, because for small matrices the usual advice is to skip libraries and do the operations with custom functions.
I know that matrix libraries like JAMA provide high performance when it comes to 10000x10000 matrices or so.
Instead of making 5-6 calculations with 10000x10000 matrices, I make 100,000 calculations with 10x10 matrices, so the number of primitive operations is nearly the same.
Are both cases the same in terms of performance? Should I treat myself as if I were working with huge matrices and use a library?
I suspect for a 10x10 matrix you won't see much difference.
In tests I have done hand-coding a 4x4 matrix, the biggest overhead was loading the data into the L1 cache, and how you wrote the maths didn't matter very much. For 3x3 matrices and smaller, the implementation did appear to make a significant difference.
Getting the maximum possible speed (with lots of effort)
For maximum possible speed I would suggest writing a C function that uses vector math intrinsics such as Streaming SIMD Extensions (SSE) or Advanced Vector Extensions (AVX) operations, together with multithreading (e.g. via OpenMP).
Your Java program would pass all 100k matrices to this native function, which would then handle all the calculations. Portability becomes an issue, e.g. AVX instructions are only supported on recent CPUs, and developer effort increases a lot too, especially if you are not familiar with SSE/AVX.
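For illustration, the Java side of such a binding might look like the sketch below. The library name and method are hypothetical, not a real library; the actual SIMD and OpenMP work would live in the C implementation of multiplyBatch.

// Hypothetical JNI binding; "matrixbatch" and multiplyBatch() are
// illustrative names. The C implementation would do the SSE/AVX work.
public final class NativeMatrixBatch {
    static {
        System.loadLibrary("matrixbatch"); // loads libmatrixbatch.so / .dll
    }

    // Multiplies `count` pairs of 10x10 matrices. Each matrix occupies 100
    // consecutive doubles (row-major) in the flat arrays, so a single JNI
    // call moves the whole batch across the Java/native boundary.
    public static native void multiplyBatch(double[] a, double[] b,
                                            double[] out, int count);
}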
Reasonable speed without too much effort
You should use multiple threads by creating a class that extends java.lang.Thread or implements java.lang.Runnable. Each thread iterates through a subset of the matrices, calling your maths routine(s) for each matrix. This part is key to getting decent speed on multi-core CPUs. The maths could be your own Java function to do the calculations on a single matrix, or you could use a library's functions.
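As a rough sketch, the partitioning could look like this; process() is just a placeholder for your per-matrix maths (your own code or a library call), and each 10x10 matrix is assumed to be stored as a row-major double[100]:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Each worker handles a contiguous slice of the matrix batch.
public class BatchWorker implements Runnable {
    private final double[][] matrices;
    private final int from, to; // half-open range [from, to)

    BatchWorker(double[][] matrices, int from, int to) {
        this.matrices = matrices;
        this.from = from;
        this.to = to;
    }

    @Override
    public void run() {
        for (int i = from; i < to; i++) {
            process(matrices[i]);
        }
    }

    private void process(double[] m) {
        // determinant(), dotproduct(), etc. would go here
    }

    public static void main(String[] args) throws InterruptedException {
        double[][] matrices = new double[100000][100];
        int nThreads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        int chunk = (matrices.length + nThreads - 1) / nThreads;
        for (int t = 0; t < nThreads; t++) {
            int from = t * chunk;
            int to = Math.min(from + chunk, matrices.length);
            pool.execute(new BatchWorker(matrices, from, to));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}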
I wonder whether it is better to use a matrix library or to write the simple functions (determinant(), dotproduct(), etc.) myself, because for small matrices the usual advice is to skip libraries and do the operations with custom functions.
...
Are both cases the same in terms of performance? Should I treat myself as if I were working with huge matrices and use a library?
No, using a library and writing your own function for the maths are not the same performance-wise. You may be able to write a faster function that is specialised to your application, but consider this:
The library functions should have fewer bugs than code you will write.
A good library will use implementations that are efficient (i.e. least amount of operations). Do you have the time to research and implement the most efficient algorithms?
You might find the Apache Commons Math library useful. I would encourage you to benchmark Apache Commons Math and JAMA to choose the fastest.
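For instance, a crude comparison along these lines could work (this is not a rigorous benchmark; for real numbers, use a proper harness and warm up the JIT first):

import Jama.Matrix;
import org.apache.commons.math3.linear.Array2DRowRealMatrix;
import org.apache.commons.math3.linear.LUDecomposition;
import org.apache.commons.math3.linear.RealMatrix;

// Crude timing of 10x10 multiply + determinant in both libraries.
// The running checksum keeps the JIT from optimizing the work away.
public class SmallMatrixBench {
    static final int RUNS = 100000;

    public static void main(String[] args) {
        double[][] data = new double[10][10];
        for (int i = 0; i < 10; i++)
            for (int j = 0; j < 10; j++)
                data[i][j] = Math.random();

        long t0 = System.nanoTime();
        RealMatrix cm = new Array2DRowRealMatrix(data);
        double sum = 0;
        for (int k = 0; k < RUNS; k++) {
            sum += new LUDecomposition(cm.multiply(cm)).getDeterminant();
        }
        System.out.printf("Commons Math: %.1f ms (%f)%n",
                (System.nanoTime() - t0) / 1e6, sum);

        t0 = System.nanoTime();
        Matrix jm = new Matrix(data);
        sum = 0;
        for (int k = 0; k < RUNS; k++) {
            sum += jm.times(jm).det();
        }
        System.out.printf("JAMA:         %.1f ms (%f)%n",
                (System.nanoTime() - t0) / 1e6, sum);
    }
}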
Hi,
I have a somewhat hypothetical question. As part of a Computational Intelligence course project, we've just written some code implementing a genetic algorithm to find a solution to a sudoku game. Unfortunately it runs very slowly, which limits our ability to perform an adequate number of runs to find the optimal parameters. The question is whether rewriting the whole thing - the code base is not that big - in Java would be a viable way to boost the speed of the software. We really need something like a 10x performance improvement, and I am doubtful that a Java version would be that much snappier. Any thoughts?
Thanks
=== Update 1 ===
Here is the code of the computationally most expensive function. It's the GA fitness function, which iterates through the population (different sudoku boards) and computes, for each row and column, how many elements are duplicates. The parameter n is currently set to 9; that is, the function counts how many elements of a row occur more than once within the range 1 to 9. The higher the count, the lower the fitness of the board, meaning it is a weak candidate for the next generation.
The profiler reports that the two lines calling intersect inside the for loops are causing the poor performance, but we don't know how to optimize the code further. It follows below:
function [fitness, finished, d, threshold] = fitness(population_, n)
    finished = false;
    threshold = false;
    V = ones(n, 1);
    d = zeros(size(population_, 2), 1);
    s = 1:n;                    % values 1..n each row/column should contain
    for z = 1:size(population_, 2)
        board = population_{z};
        t = 0;                  % duplicates counted over rows
        l = 0;                  % duplicates counted over columns
        for i = 1:n
            l = l + n - length(intersect(s, board(:,i)'));
            t = t + n - length(intersect(s, board(i,:)));
        end
        k = sum(abs(board*V - t));   % k = sum of |row sum - t| over all rows
        f = t + l + k/50;            % combined penalty
        if t == 2 && l == 2
            threshold = true;
        end
        if f == 0
            finished = true;
        else
            fitness(z) = 1/f;        % higher fitness = fewer duplicates
            d(z) = f;
        end
    end
end
=== Update 2 ===
Found a solution here: http://www.mathworks.com/matlabcentral/answers/112771-how-to-optimize-the-following-function
Using histc(V, 1:9), it's much faster :)
This is rather impossible to say without seeing your code and knowing whether you use parallelization, etc. Indeed, as MrAzzaman says, profiling is the first thing to do. If you find a single bottleneck, especially if it is loop-heavy, it might be sufficient to write that part in C and connect it to Matlab via MEX.
With genetic algorithms, I'd believe a 10x speed increase is more likely than not. I do not quite agree with MrAzzaman here - in some cases (for loops, working with dynamic objects) Matlab is much, much slower than C/C++/Java. That is not to say that Matlab is always slow, for it is not, but there are plenty of algorithms where it would be slow.
That is, I'd say that if you don't spend much time looping over things, don't use objects, and are not limited by Matlab's data structures, you might be fine with Matlab. That said, if I were to write GAs in Java or Matlab, I'd rather pick the former (and I'm using Matlab a lot more than Java these days, so it's not just a matter of habit).
Btw., if you don't want to program it yourself, have a look at JGAP; it's a rather useful Java library for GAs.
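For a flavour of it, setting up a sudoku GA in JGAP might look roughly like the sketch below. SudokuFitness is a placeholder you would fill in yourself (e.g. scoring by duplicate counts as in the function above); check JGAP's docs for the exact signatures.

import org.jgap.*;
import org.jgap.impl.*;

// Rough JGAP setup: a sudoku board encoded as 81 integer genes in [1, 9].
public class SudokuGA {
    public static void main(String[] args) throws Exception {
        Configuration conf = new DefaultConfiguration();
        conf.setFitnessFunction(new SudokuFitness());

        Gene[] genes = new Gene[81];
        for (int i = 0; i < genes.length; i++) {
            genes[i] = new IntegerGene(conf, 1, 9);
        }
        conf.setSampleChromosome(new Chromosome(conf, genes));
        conf.setPopulationSize(500);

        Genotype population = Genotype.randomInitialGenotype(conf);
        for (int gen = 0; gen < 1000; gen++) {
            population.evolve();
        }
        System.out.println(population.getFittestChromosome());
    }
}

class SudokuFitness extends FitnessFunction {
    @Override
    protected double evaluate(IChromosome candidate) {
        // Placeholder: decode the 81 genes into a 9x9 board and score it,
        // e.g. by counting row/column duplicates as in the code above.
        int duplicates = 0; // ... count duplicates here ...
        return 1.0 / (1 + duplicates);
    }
}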
OK, the first step is just to write a faster MATLAB function. Save the new languages for later.
I'm going to make the assumption that the board is full of valid guesses: that is, each entry is in [1, 9]. Now, what we're really looking for are duplicate entries in each row/column. To find duplicates, we sort. On a sorted row, if any element is equal to its neighbor, we have a duplicate. In MATLAB, the diff function does sliding pairwise differencing, and a zero in its output means that two neighboring values are equal. Both sort and diff operate on entire matrices, so no need for looping. Here's the code for the columnwise check:
l=sum(sum(diff(sort(board)) == 0));
The rowwise check is exactly the same, just using the transpose. Now let's put that in a test harness to compare results and timing with the previous version:
n = 9;
% Generate a test board: random integers from 1:n
board = randi(n, n);
s = 1:n;
K=1000; % number of iterations to use for timing
% Repeat current code for comparison
tic
for k = 1:K
    t = 0;
    l = 0;
    for i = 1:n
        l = l + n - length(intersect(s, board(:,i)'));
        t = t + n - length(intersect(s, board(i,:)));
    end
end
toc

% New code based on sort/diff for finding repeated values
tic
for k = 1:K
    l2 = sum(sum(diff(sort(board)) == 0));
    t2 = sum(sum(diff(sort(board.')) == 0));
end
toc
% Check that reported values match
disp([l l2])
disp([t t2])
I encourage you to break down the sort/diff/sum code, and build it up on a sample board right at the command line, and try to understand exactly how it works.
On my system, the new code is about 330x faster.
For traditional GA applications for study and research purposes, it is better to use a language compiled to native machine code, like C or C++. That is what I used when working with Genetic Programming in the past, and it is really fast.
However, if you are planning to put this inside a more modern type of application that can be deployed in a web container, run on a mobile device, on different OSes, etc., then Java is your best alternative, as it is platform independent.
Another thing that can be important is concurrency. For example, suppose you want to put your GA on the Internet with a growing number of concurrently connected users, all of whom want to solve a different sudoku: Java applications are very good at scaling horizontally and work well with large numbers of concurrent connections.
Another benefit of migrating to Java is the number of libraries and frameworks you can use; the Java universe is so big that you can find useful tools for almost any kind of application.
Java is compiled for a virtual machine, but it is important to note that current JVMs perform very well and are able to optimize programs; for example, they will find which methods are most heavily used and compile them to native code, which means that for some applications a Java program will be almost as fast as a native binary compiled from C.
Matlab is a platform that is very useful for engineering training and for math, vector, and matrix based calculations, as well as for some control work with Simulink. I used these products during my electrical engineering bachelor's degree; however, their goal is mainly to serve academic purposes, and I definitely would not go for Matlab to build a production application for the real world. It is not scalable, it is expensive to maintain and fine-tune, and there are not a lot of infrastructure providers that will support this kind of technology.
About the complexity of rewriting your code in Java: Matlab and Java syntax are pretty similar, and they live in the same paradigm, procedural OOP. Even if you are not using OO in your code, it can easily be rewritten in Java; the painful parts will be Matlab's shortcuts for math structures like matrices, and passing functions as parameters.
For the matrix stuff, there are lots of Java libraries like EJML that will make your life easier. As for assigning functions to variables and then passing them as parameters to other functions, Java cannot currently do that (Java 8 will, with lambda expressions), but you can get equivalent functionality using class "closures", as sketched below. These will probably be the only mildly painful things you find when migrating.
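A minimal sketch of that pre-Java-8 pattern: you define a one-method interface and hand over an anonymous implementation wherever Matlab would pass a function handle. The interface and method names here are illustrative.

// A routine that takes behaviour as a parameter, Matlab-style.
interface MatrixFunction {
    double apply(double[][] m);
}

public class Closures {
    static double[] mapOverBoards(double[][][] boards, MatrixFunction f) {
        double[] results = new double[boards.length];
        for (int i = 0; i < boards.length; i++) {
            results[i] = f.apply(boards[i]);
        }
        return results;
    }

    public static void main(String[] args) {
        double[][][] boards = new double[10][9][9];
        // The anonymous class plays the role of the passed-in function.
        double[] traces = mapOverBoards(boards, new MatrixFunction() {
            @Override
            public double apply(double[][] m) {
                double trace = 0;
                for (int i = 0; i < m.length; i++) trace += m[i][i];
                return trace;
            }
        });
        System.out.println(traces.length);
    }
}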
First of all, sorry for my bad English, but I need your help.
I have developed a simulation program with Java Swing that uses lots of matrix calculations. The program is finished, but I need to speed it up, so I used the Java VisualVM profiler to identify performance problems. I noticed that the initialization of JAMA matrices takes a lot of time: after running my program I had over 3 MB of objects allocated by JAMA. That's a lot, isn't it? I think that's why the performance is bad.
Is there a better library than JAMA for matrices? I am using 3x3 matrices and I need multiplication and inverse operations. Or is there anything else I can do?
Usually matrix math libraries are not optimized for speed on small matrices.
You can see for yourself by taking a few stackshots, which are liable to show a large fraction of time in overhead functions like memory allocation and option checking.
What you can do instead (which I've done) is write special-purpose routines to do the multiplication and inverse, since you know the matrices are 3x3.
Multiplication is trivial and you can unroll the whole thing.
The inverse of a 3x3 matrix can also be written out in closed form in just a few lines :)
Wikipedia gives you the formula.
Whatever you do, try to minimize memory allocation.
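A sketch of what such special-purpose routines might look like, using flat row-major double[9] arrays and caller-supplied output buffers so the hot path allocates nothing; the inverse follows the adjugate formula from Wikipedia.

public final class Mat3 {
    // out = a * b (standard matrix product, fully unrolled)
    public static void multiply(double[] a, double[] b, double[] out) {
        out[0] = a[0]*b[0] + a[1]*b[3] + a[2]*b[6];
        out[1] = a[0]*b[1] + a[1]*b[4] + a[2]*b[7];
        out[2] = a[0]*b[2] + a[1]*b[5] + a[2]*b[8];
        out[3] = a[3]*b[0] + a[4]*b[3] + a[5]*b[6];
        out[4] = a[3]*b[1] + a[4]*b[4] + a[5]*b[7];
        out[5] = a[3]*b[2] + a[4]*b[5] + a[5]*b[8];
        out[6] = a[6]*b[0] + a[7]*b[3] + a[8]*b[6];
        out[7] = a[6]*b[1] + a[7]*b[4] + a[8]*b[7];
        out[8] = a[6]*b[2] + a[7]*b[5] + a[8]*b[8];
    }

    // out = inverse(m) via the adjugate formula; returns false if singular.
    public static boolean invert(double[] m, double[] out) {
        double c0 = m[4]*m[8] - m[5]*m[7];   // cofactors of the first column
        double c1 = m[5]*m[6] - m[3]*m[8];
        double c2 = m[3]*m[7] - m[4]*m[6];
        double det = m[0]*c0 + m[1]*c1 + m[2]*c2;
        if (det == 0.0) return false;
        double inv = 1.0 / det;
        out[0] = c0 * inv;
        out[1] = (m[2]*m[7] - m[1]*m[8]) * inv;
        out[2] = (m[1]*m[5] - m[2]*m[4]) * inv;
        out[3] = c1 * inv;
        out[4] = (m[0]*m[8] - m[2]*m[6]) * inv;
        out[5] = (m[2]*m[3] - m[0]*m[5]) * inv;
        out[6] = c2 * inv;
        out[7] = (m[1]*m[6] - m[0]*m[7]) * inv;
        out[8] = (m[0]*m[4] - m[1]*m[3]) * inv;
        return true;
    }
}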
I have noticed that Matlab does some matrix operations really fast. For example, adding 5 to all elements of an n*n array happens almost instantly even if the matrix is large, because you don't have to loop through every element; doing the same in Java, the for loop takes forever if the matrix is large.
I have two questions: are there efficient built-in classes in Java for doing matrix operations, and how can I code updates to all elements of a big matrix in Java more efficiently?
Just stumbled onto this posting and thought I would throw in my two cents. I am the author of EJML, and I am also working on a performance and stability benchmark for Java libraries. While several issues go into determining how fast an algorithm is, Mikhail is correct that caching is a very important issue in the performance of large matrices. For smaller matrices, the library's overhead becomes more important.
Due to overhead in array access, pure Java libraries are slower than highly optimized C libraries, even if the algorithms are exactly the same. Some libraries get around this issue by making calls to native code. You might want to check out
http://code.google.com/p/matrix-toolkits-java/
which does exactly that. There will be some overhead in copying memory from Java to the native library, but for large matrices this is insignificant.
For a benchmark on pure java performance (the one that I'm working on) check out:
http://code.google.com/p/java-matrix-benchmark/
Another benchmark is here:
http://www.ujmp.org/java-matrix/benchmark/
Either of these benchmarks should give you a good idea of performance for large matrices.
Colt may be the fastest.
"Colt provides a set of Open Source Libraries for High Performance Scientific and Technical Computing in Java. " "For example, IBM Watson's Ninja project showed that Java can indeed perform BLAS matrix computations up to 90% as fast as optimized Fortran."
JAMA!
"JAMA is a basic linear algebra package for Java. It provides user-level classes for constructing and manipulating real, dense matrices."
Or the Efficient Java Matrix Library
"Efficient Java Matrix Library (EJML) is a linear algebra library for manipulating dense matrices. Its design goals are; 1) to be as computationally efficient as possible for both small and large matrices, and 2) to be accessible to both novices and experts."