First, let me start off by saying that I know there is no way to avoid rounding errors entirely. My question is rather about how to take rounding errors and fix them. "Why does it matter?", you may ask. In a Java project of mine I have used the Separating Axis Theorem to implement collision detection/resolution for moving objects, and it works spectacularly well, but there is a flaw that I keep running into...
Rounding errors
I've made a picture to help illustrate what I'm talking about:
Above is how I do collision resolution in my Java project. Let me show how rounding errors can be problematic in this situation.
SV = MV.scale(2f / 3f)   // 2f is the distance, 3f is the move length
2f / 3f evaluates to the float nearest to 2/3, which prints as roughly 0.6666667, and that nearest float happens to lie slightly above the true value of 2/3.
That is very bad, because it makes the scale slightly too large.
A scale that is even a few billionths too large means SV will be larger than expected, causing the shape to barely clip into the other shape upon collision resolution. Yes, this is so small that it can't be detected by a human, but it does indeed cause clipping, however small it may be, and as such I consider it a FAILED collision resolution.
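A quick check in plain Java makes the direction of the rounding visible (nothing here is project-specific):

public class RoundingDemo {
    public static void main(String[] args) {
        float f = 2f / 3f;
        System.out.println(f);                       // prints 0.6666667
        System.out.println((double) f);              // prints 0.6666666865348816
        System.out.println((double) f > 2.0 / 3.0);  // true: the nearest float lies ABOVE 2/3
    }
}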
My working solution so far has been something like this:
MV.scale((int)((2f / 3f) * 10000f) / 10000f)
While this will probably work for all the applications I would use my algorithm for, the loss in precision, and the probable eventual breaking at high numbers (because floats get sparser there), make it hard to accept. I wouldn't have an issue with choosing 0.666666665 (or something close to it) through an algorithm, but I can't find an algorithm for finding the next closest floating point number below a given number (and if it exists, I'd be wary of its performance cost).
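For illustration, the kind of one-ulp downward step I mean would look like this, assuming java.lang.Math's nextDown (a constant-time bit manipulation, as I understand it) behaves as documented:

public class SafeScale {
    public static void main(String[] args) {
        float distance = 2f, moveLength = 3f;
        float scale = distance / moveLength;     // may round to just above the true ratio
        float safeScale = Math.nextDown(scale);  // adjacent float toward -infinity (Java 8+);
                                                 // Math.nextAfter(scale, Double.NEGATIVE_INFINITY)
                                                 // does the same on older JDKs
        System.out.println(scale + " -> " + safeScale);
        // MV.scale(safeScale) would then be biased to fall short rather than overshoot;
        // note the multiply inside scale() can still round, so the final components
        // may need the same treatment.
    }
}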
Any ideas or alternate strategies? I'm kind of at my wits' end here. Thanks in advance!
I'm trying to sort a collection of AnyLogic GISRegions by their geographical area. Said area is calculated using GISRegion.area(units), which is straightforward enough. The areas I'm using, however, are city-scale and the method returns a double. This appears to cause overflow problems:
I don't think I'm doing anything wrong with my code, so presumably this is an AnyLogic problem. For brevity, I've included a line that prints each region's area rather than the sorting steps:
// For each region of the Australian Capital Territory, print its area in km^2:
areas.forEach(next -> traceln(next.name + ": " + next.gisRegion.area(SQ_KILOMETER)));
Has anyone encountered this issue? How did you get around it?
For non-AnyLogic users, I have all the lat-long points in each geoshape. How might I calculate the area using those points?
[Not really a full answer, but the ideas are too long for a comment.]
I assume you've raised an AnyLogic support request since it seems 100% a bug. Since this is just a basic 'calculate area' function, I can't see any way round it other than, as you suggest, calculating it in an alternative manner from the vertex lat/longs that you have, and can get via getPoints() on the GISRegion.
Since this is just an N-sided polygon, there must surely be standard Java libraries that can calculate its area, though that doesn't allow for the GIS projection (I'm not sure what level of error that might introduce); you'd expect open GIS libraries to cope with the latter. A GISRegion also has a createOMGraphicObject() method to create an OpenMap standard(?) format graphic, which could be useful if that's a format other libraries can work with.
There's code in glennon's answer to this GIS Stack Exchange question that claims to perform the calculation (or you may be able to hook into PostGIS as in fmark's answer).
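To sketch that route for non-AnyLogic users: the formula in glennon's answer approximates the area of a lat/long polygon on a sphere, which should be fine at city scale. Below is a minimal Java translation (the arrays and method name are illustrative; you'd feed it the vertices from getPoints()):

static double sphericalAreaKm2(double[] latDeg, double[] lonDeg) {
    final double R = 6371.0;                       // mean Earth radius in km
    double sum = 0.0;
    int n = latDeg.length;
    for (int i = 0; i < n; i++) {
        int j = (i + 1) % n;                       // wrap around to close the ring
        double lat1 = Math.toRadians(latDeg[i]), lat2 = Math.toRadians(latDeg[j]);
        double lon1 = Math.toRadians(lonDeg[i]), lon2 = Math.toRadians(lonDeg[j]);
        sum += (lon2 - lon1) * (2 + Math.sin(lat1) + Math.sin(lat2));
    }
    return Math.abs(sum) * R * R / 2.0;            // area in km^2
}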
I was searching through Google for an answer but I can't find anything useful. I am making games with libGDX and until now performance was not a big issue. But the game that I am working on now will need some more optimization, so here is my question:
With libGDX there are a lot of floats. I know that int is faster, but what if I cast floats into int? Is this faster than using float numbers, or should I just go with floats?
I don't need very high precision, because I mostly use these numbers for coordinates. But with many multiplications throughout the code (I don't use division because it is slower), I wonder what I should use.
Without even seeing your code the answer is: you're probably looking in the wrong place. The performance problem is not due to the use of floats, and switching to ints will not fix it.
As soon as you get performance problems (and even before), you must find ways to benchmark and measure the performance of your code. You might use profiling, internal tracing, or some other method, but you must do it.
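As a crude illustration of "measure first" (a real harness such as JMH is better, since the JIT can distort naive timings like these):

public class FloatVsInt {
    public static void main(String[] args) {
        final int N = 100_000_000;
        float fAcc = 1f;
        int iAcc = 1;

        long t0 = System.nanoTime();
        for (int i = 0; i < N; i++) fAcc *= 1.0000001f;   // float multiplies
        long t1 = System.nanoTime();
        for (int i = 0; i < N; i++) iAcc = iAcc * 3 + 1;  // int multiplies
        long t2 = System.nanoTime();

        System.out.println(fAcc + " " + iAcc);            // keep the JIT from deleting the loops
        System.out.println("float: " + (t1 - t0) / 1e6 + " ms, int: " + (t2 - t1) / 1e6 + " ms");
    }
}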
Most performance problems are due to bad algorithms and/or a few small functions called millions of times. Fix the algorithms and recode the small critical functions and you will probably solve the problem. Switching your code to int is something you do after you've fixed everything else and need just a little more.
Put up some code on CodeReview and I'm sure you'll get lots more help with it.
I am working on a feed-forward (FF) neural network (used for classification problems) which I am training using particle swarm optimization (PSO). I only have one hidden layer and I can vary the number of neurons in that layer.
My problem is that the NN can learn linearly separable problems quite easily, but cannot learn problems that are not linearly separable (like XOR), as it should be able to.
I believe my PSO is working correctly, because I can see that it tries to minimise the error function of each particle (using mean squared error over the training set).
I have tried using sigmoid and linear activation functions, with similar (bad) results. I also have a bias unit (which also doesn't help much).
What I want to know is whether there is something specific that I might be doing wrong that could cause this type of problem, or just some places I should look for the error.
I am a bit lost at the moment.
Thanks
PSO can train a neural network to solve problems that are not linearly separable, like XOR. I've done this before; my algorithm takes about 50 or so iterations at most. Sigmoid is a good activation function for XOR. If it doesn't converge for non-separable problems, then my guess is that somehow your hidden layer is not having an effect, or is being bypassed, since the hidden layer is what typically allows non-separable problems to be learned.
When I debug AI, I often find it useful to determine first whether my training code or my evaluation code (the neural network in this case) is at fault. You might want to create a second trainer for your network; then you can make sure your network code is calculating the output correctly. You could even use a simple "hill climber": pick a random weight and change it by a random small amount (up or down). Did your error get better? Keep the weight change and repeat. Did your error get worse? Drop the change and try again.
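A minimal sketch of that hill climber, assuming you supply an evaluate function that runs your network over the training set and returns its mean squared error:

import java.util.Random;
import java.util.function.ToDoubleFunction;

public class HillClimber {
    /** evaluate: runs the network with the given weights and returns its MSE. */
    static void climb(double[] w, ToDoubleFunction<double[]> evaluate,
                      int iterations, double step) {
        Random rng = new Random();
        double best = evaluate.applyAsDouble(w);
        for (int it = 0; it < iterations; it++) {
            int i = rng.nextInt(w.length);                    // pick a random weight
            double delta = (rng.nextDouble() * 2 - 1) * step; // small change, up or down
            w[i] += delta;
            double err = evaluate.applyAsDouble(w);
            if (err < best) best = err;                       // better: keep the change
            else w[i] -= delta;                               // worse: revert and retry
        }
    }
}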
I want my Java ME program to run as efficiently as possible. My goal is to do a ray cast, and I want to know the best way to traverse voxels. I have heard that conversion and comparison of floating point numbers is very CPU intensive. So I figured: why not add a certain distance to each ray's x and y, truncate the remainder, and use those coordinates to check an octree for a voxel? Basically, is there a better way of going about something like this in a Java ME program?
Truncating floating point numbers?
"Floating point math is slow" is old wisdom - however, it is also outdated wisdom. On modern desktop CPUs, floating point computations are fast, and there is little to gain on fixed-point computations.
Edit after having reread the question title: The approach you describe is perfectly viable, except that you need to multiply, not add to, each number. However, you should first write a small performance test program that checks whether the kind of computations you intend to do will actually benefit from fixed-point math on the hardware where you intend to run your program.
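For what it's worth, a minimal fixed-point sketch of the kind of computation to benchmark (the 8 fractional bits are an arbitrary choice; whether this beats plain floats on your target device is exactly what the test program should tell you):

public class FixedPointDemo {
    static final int SHIFT = 8;                     // 8 fractional bits
    static final int ONE = 1 << SHIFT;

    static int toFixed(float f) { return (int) (f * ONE); }

    public static void main(String[] args) {
        int x = toFixed(3.5f);                      // ray x position, fixed-point
        int stepX = toFixed(0.25f);                 // distance added per iteration
        for (int i = 0; i < 4; i++) {
            x += stepX;                             // advance the ray: integer add only
            int voxelX = x >> SHIFT;                // truncate to the voxel coordinate
            System.out.println("voxel x = " + voxelX);
        }
    }
}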
I need to solve nonlinear minimization problems (least residual squares of N unknowns) in my Java program. The usual way to solve these is the Levenberg-Marquardt (LM) algorithm. I have a couple of questions:
Does anybody have experience with the different LM implementations available? There exist slightly different flavors of LM, and I've heard that the exact implementation of the algorithm has a major effect on its numerical stability. My functions are pretty well-behaved, so this will probably not be a problem, but of course I'd like to choose one of the better alternatives. Here are some alternatives I've found:
FPL Statistics Group's Nonlinear Optimization Java Package. This includes a Java translation of the classic Fortran MINPACK routines.
JLAPACK, another Fortran translation.
Optimization Algorithm Toolkit.
Javanumerics.
Some Python implementation. Pure Python would be fine, since it can be compiled to Java with jythonc.
Are there any commonly used heuristics for making the initial guess that LM requires?
In my application I need to set some constraints on the solution, but luckily they are simple: I just require that the solutions (in order to be physical solutions) are nonnegative. Slightly negative solutions are the result of measurement inaccuracies in the data and should obviously be zero. I was thinking of using "regular" LM, but iterating so that if one of the unknowns becomes negative, I set it to zero and re-solve for the rest from there. Real mathematicians will probably laugh at me, but do you think this could work?
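Roughly what I have in mind, where solveLM stands in for whichever LM implementation I end up choosing (it should optimize only the parameters whose fixed flag is false):

import java.util.function.BiFunction;

static double[] clampedFit(double[] guess,
                           BiFunction<double[], boolean[], double[]> solveLM) {
    boolean[] fixed = new boolean[guess.length];   // parameters pinned at zero
    double[] x = guess.clone();
    boolean changed = true;
    while (changed) {
        changed = false;
        x = solveLM.apply(x, fixed);               // fit the non-fixed parameters
        for (int i = 0; i < x.length; i++) {
            if (!fixed[i] && x[i] < 0) {
                x[i] = 0;                          // slightly negative = noise, pin at zero
                fixed[i] = true;
                changed = true;
            }
        }
    }
    return x;
}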
Thanks for any opinions!
Update: This is not rocket science; the number of parameters to solve (N) is at most 5, and the data sets are barely big enough to make solving possible, so I believe Java is quite efficient enough to solve this. And I believe this problem has been solved numerous times by clever applied mathematicians, so I'm just looking for some ready solution rather than cooking up my own. E.g. Scipy.optimize.minpack.leastsq would probably be fine if it were pure Python...
The closer your initial guess is to the solution, the faster you'll converge.
You said it was a non-linear problem, but you can compute a linearized least-squares solution and maybe use that as a first guess. A few non-linear iterations will tell you something about how good or bad an assumption that is.
Another idea would be trying another optimization algorithm. Genetic and ant colony algorithms can be a good choice if you can run them on many CPUs. They also don't require continuous derivatives, so they're nice if you have discrete, discontinuous data.
You should not use an unconstrained solver if your problem has constraints. For instance, if you know that some of your variables must be nonnegative, you should tell this to your solver.
If you are happy to use Scipy, I would recommend scipy.optimize.fmin_l_bfgs_b. You can place simple bounds on your variables with L-BFGS-B. Note that L-BFGS-B takes a general nonlinear objective function, not just a nonlinear least-squares problem.
I agree with codehippo; I think that the best way to solve problems with constraints is to use algorithms which are specifically designed to deal with them. The L-BFGS-B algorithm should probably be a good solution in this case.
However, if using python's scipy.optimize.fmin_l_bfgs_b module is not a viable option in your case (because you are using Java), you can try using a library I have written: a Java wrapper for the original Fortran code of the L-BFGS-B algorithm. You can download it from http://www.mini.pw.edu.pl/~mkobos/programs/lbfgsb_wrapper and see if it matches your needs.
The FPL package is quite reliable, but it has a few quirks (array access starts at 1) due to its very literal interpretation of the old Fortran code. The LM method itself is quite reliable if your function is well behaved. A simple way to force non-negative constraints is to use the squares of the parameters instead of the parameters directly. This can introduce spurious solutions, but for simple models these solutions are easy to screen out.
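A sketch of that substitution (the mapping is the whole trick; the helper name is just for illustration):

// Fit over t, but evaluate the model with x = t*t, which can never be negative.
static double[] squaredParams(double[] t) {
    double[] x = new double[t.length];
    for (int i = 0; i < t.length; i++) {
        x[i] = t[i] * t[i];                 // x >= 0 by construction
    }
    return x;
}
// Inside the residual function handed to LM, first map t -> x = t*t and then
// evaluate the model with x as usual. Note that t and -t give the same x;
// that symmetry is the source of the spurious solutions mentioned above.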
There is code available for a "constrained" LM method. Look here http://www.physics.wisc.edu/~craigm/idl/fitting.html for mpfit. There is a Python version (which unfortunately relies on Numeric) and a C version. The LM method is around 1500 lines of code, so you might be inclined to port the C to Java. In fact, the "constrained" LM method is not much different from the method you envisioned: in mpfit, the code adjusts the step size relative to bounds on the variables. I've had good results with mpfit as well.
I don't have that much experience with BFGS, but the code is much more complex and I've never been clear on the licensing of the code.
Good luck.
I haven't actually used any of those Java libraries so take this with a grain of salt: based on the backends I would probably look at JLAPACK first. I believe LAPACK is the backend of Numpy, which is essentially the standard for doing linear algebra/mathematical manipulations in Python. At least, you definitely should use a well-optimized C or Fortran library rather than pure Java, because for large data sets these kinds of tasks can become extremely time-consuming.
For creating the initial guess, it really depends on what kind of function you're trying to fit (and what kind of data you have). Basically, just look for some relatively quick (probably O(N) or better) computation that will give an approximate value for the parameter you want. (I recently did this with a Gaussian distribution in Numpy: I estimated the mean as just average(values, weights = counts), that is, the average of the histogram bin values weighted by their counts, which is the mean of the data set. It wasn't the exact center of the peak I was looking for, but it got close enough, and the algorithm went the rest of the way.)
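In Java terms, that kind of quick first guess is just a weighted mean (names here are illustrative):

// Weighted mean of histogram bin values, the Java analogue of
// Numpy's average(values, weights=counts) mentioned above.
static double weightedMean(double[] values, double[] counts) {
    double num = 0, den = 0;
    for (int i = 0; i < values.length; i++) {
        num += values[i] * counts[i];
        den += counts[i];
    }
    return num / den;
}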
As for keeping the solutions nonnegative, your method seems reasonable. Since you're writing a program to do the work, maybe just add a boolean flag that lets you easily enable or disable the "force-non-negative" behavior, and run it both ways for comparison. Only if you get a large discrepancy (or if one version of the algorithm takes unreasonably long) might it be something to worry about. (And REAL mathematicians would do least-squares minimization analytically, from scratch ;-P so I think you're the one who can laugh at them.... kidding. Maybe.)