I know that similar questions have been asked before, but none of the answers solves my problem.
I have 2 functions:
Java
public static void main(String[] args) {
    double h, z, lat0, n0, eSq;
    z = 4488055.516;
    lat0 = 0.7853981634671384;
    n0 = 6388838.290122733;
    eSq = 0.0066943799901975545;
    h = z / Math.sin(lat0) - n0 * (1 - eSq);
    System.out.println(h);
}
C#
public static void Main(string[] args)
{
    double h, z, lat0, n0, eSq;
    z = 4488055.516;
    lat0 = 0.7853981634671384;
    n0 = 6388838.290122733;
    eSq = 0.0066943799901975545;
    h = z / Math.Sin(lat0) - n0 * (1 - eSq);
    Console.WriteLine(h);
}
or
4488055,516/sin(0,7853981634671384)-6388838,290122733*(1-0,0066943799901975545)
for SpeedCrunch, Maxima and LibreOffice Calc.
Results are:
Java: 1000.0000555226579 (same with and without strictfp)
C#: 1000,00005552359
SpeedCrunch (15): 1000,000055524055155
SpeedCrunch (50): 1000,00005552405515548724762598846216107366705932894830
LibreOffice Calc: 1000,000055523590000
Maxima: 1000.000055523589
Maxima: 1.00000005552391142b3 (bfloat - fpprec:20)
As you can see, Java and C# differ at the 9th decimal place, and the other tools are not entirely consistent either. This was tested on the same OS and the same CPU; tests were also run on 32-bit and 64-bit systems.
How do I solve this kind of problem? I thought the precision should be about 15 significant digits.
The cause of not getting 15 digits is the subtraction of two similar numbers. z / Math.sin(lat0) is about 6347068.978968251. n0 * (1 - eSq) is about 6346068.978912728. With 7 digits before the decimal point, a change in the 9th decimal place in the subtraction result corresponds to a change of less than one part in 10^15 in one of those inputs.
The simplest solution to this type of problem is usually to display only the digits that are actually supported by the accuracy of the inputs. Very few measurements can be made to one part in 10^12, so in this case it is almost certain that the digits that differ due to floating point rounding error would be dropped.
For example, it looks as though your data relates to location. One of the most carefully measured pieces of such data is the height of Mount Everest. The current best estimate, based on high-precision GPS measurement, is "29,035 feet, with an error margin of plus or minus 6.5 feet" (Encyclopedia Britannica). An error in the 13th significant digit corresponds to an error of less than one thousandth of an inch in measuring the circumference of the earth.
If the rounding error really is significant relative to the result requirements, and to what is realistically achievable given the accuracy of the inputs, then you might need to look at more clever ways of arranging the calculation or at higher precision arithmetic.
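As a minimal Java sketch of the "display only the supported digits" advice (the choice of three decimal places is my own assumption; use whatever the accuracy of your inputs justifies):

double h = 1000.0000555226579;    // the Java result from the question
System.out.printf("%.3f%n", h);   // prints 1000.000, on which every platform above agrees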
The precision difference you are observing is not in the floating point support per se. Java and C# will be using the same (IEEE 754) floating point representations, and quite likely the same instructions.
What you are observing is probably differences in the algorithms that calculate transcendental functions; e.g. the sine function. The theoretical calculation involves summing an infinite series. To get the most accurate answer possible, you keep summing the series forever. To get the answer to a certain precision, you keep summing until the "deltas" are less than the required precision.
In practice, this approach is impractically slow. Practical algorithms are implemented using tables and interpolation, where possible. This is much faster, though you don't get maximal precision.
The precision of Java Math.sin is determined by the algorithms used to compute it. These algorithms are platform specific. Here is what the javadocs for Math say on the subject of precision.
Unlike some of the numeric methods of class StrictMath, all implementations of the equivalent functions of class Math are not defined to return the bit-for-bit same results. This relaxation permits better-performing implementations where strict reproducibility is not required.
And the precision guarantee for Math.sin is:
The computed result must be within 1 ulp of the exact result. Results must be semi-monotonic.
where "ulp" and "semi-monotonic" are specified in the javadoc.
By contrast the javadocs for StrictMath state that a specific version of a specific open-source library is used to do the calculations. The goal is reproducibility; i.e. the same answer on all Java platforms.
Q: So why didn't they make Math.sin more precise? It is possible to get to within 0.5 ulp ...
It is an engineering trade-off between speed and precision. Read this wikipedia article to get a feel for the problem.
My advice is that if you want maximal precision, look for a 3rd party open-source Java library that implements the transcendental functions. And be prepared for your code to be a lot slower.
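If you want to see whether your particular JVM's Math.sin differs from the reproducible StrictMath.sin for the input in the question, a quick check like the following will tell you (my own snippet; on many JVMs the two print identical values):

double lat0 = 0.7853981634671384;
System.out.println(Math.sin(lat0));        // platform-tuned, guaranteed within 1 ulp
System.out.println(StrictMath.sin(lat0));  // bit-for-bit reproducible (fdlibm-based)
System.out.println(Math.sin(lat0) == StrictMath.sin(lat0));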
Related
I understand that, due to the nature of float/double, one should not use them for precision-critical calculations. However, I'm a little confused about their limitations, because of mixed answers on similar questions: are floats and doubles always inaccurate regardless of the number of significant digits, or are they only inaccurate from the 16th digit onwards?
I've run a few examples in Java,
System.out.println(Double.parseDouble("999999.9999999999");
// this outputs correctly w/ 16 digits
System.out.println(Double.parseDouble("9.99999999999999");
// This also outputs correctly w/ 15 digits
System.out.println(Double.parseDouble("9.999999999999999");
// But this doesn't output correctly w/ 16 digits. Outputs 9.999999999999998
I can't find the link to the other answer that stated that values like 1.98 and 2.02 would round to 2.0 and therefore create inaccuracies, but my testing shows that such values are printed correctly. So my first question is whether float/double values are always inaccurate, or whether there is a limit below which you can be assured of precision.
My second question is about using BigDecimal. I know that I should be using BigDecimal for precision-critical calculations, and therefore that I should be using BigDecimal's methods for arithmetic and comparison. However, BigDecimal also includes a doubleValue() method which converts the BigDecimal to a double. Would it be safe for me to compare double values that I know for sure have fewer than 16 significant digits? There will be no arithmetic done on them at all, so the underlying values should not have changed.
For example, is it safe for me to do the following?
BigDecimal myDecimal = new BigDecimal("123.456");
BigDecimal myDecimal2 = new BigDecimal("234.567");
if (myDecimal.doubleValue() < myDecimal2.doubleValue()) System.out.println("myDecimal is smaller than myDecimal2");
Edit: After reading some of the responses to my own answer, I've realized that my understanding was incorrect and have deleted it. Here are some snippets from it that might help in the future.
"A double cannot hold 0.1 precisely. The closest representable value to 0.1 is 0.1000000000000000055511151231257827021181583404541015625. Java Double.toString only prints enough digits to uniquely identify the double, not the exact value." - Patricia Shanahan
Sources:
https://stackoverflow.com/a/5749978 - States that a double can hold up to 15 digits
I suggest you read this page:
https://en.wikipedia.org/wiki/Double-precision_floating-point_format
Once you've read and understood it, and perhaps converted several examples to their binary representations in the 64 bit floating point format, then you'll have a much better idea of what significant digits a Double can hold.
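If you want to poke at the bit patterns directly while reading that page, Java will hand you the raw 64-bit encoding of any double (a small exploration snippet of my own):

long bits = Double.doubleToLongBits(0.1);
// 16 hex digits show the whole 64-bit pattern; toBinaryString drops leading zeros
System.out.printf("hex: %016X%n", bits);            // 3FB999999999999A for 0.1
System.out.println("bin: " + Long.toBinaryString(bits));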
As a (perhaps trivial) side note, a nice and reliable way to store a value of known precision is simply to multiply it by the relevant factor and store it as some integral type, since integers are completely precise.
For example:
double costInPounds = <something>; //e.g. 3.587
int costInPence = (int)(costInPounds * 100 + 0.5); //359
Plainly some precision can be lost, but if a required/desired precision is known, this can save a lot of bother with floating point values, and once this has been done, no precision can be lost by further manipulations.
The + 0.5 is there to ensure that rounding works as expected: the (int) cast truncates towards zero (effectively a floor for positive values), so adding 0.5 first makes the value round to the nearest integer.
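Equivalently (my suggestion, not part of the answer above), Math.round performs the round-to-nearest step for you; note that it returns a long, hence the cast:

double costInPounds = 3.587;
int costInPence = (int) Math.round(costInPounds * 100); // 359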
I came to know about these accuracy issues when I executed the following program:
public static void main(String args[])
{
    double table[][] = new double[5][4];
    int i, j;
    for (i = 0, j = 0; i <= 90; i += 15)
    {
        if (i == 15 || i == 75)
            continue;
        table[j][0] = i;
        double theta = StrictMath.toRadians((double) i);
        table[j][1] = StrictMath.sin(theta);
        table[j][2] = StrictMath.cos(theta);
        table[j++][3] = StrictMath.tan(theta);
    }
    System.out.println("angle#sin#cos#tan");
    for (i = 0; i < table.length; i++) {
        for (j = 0; j < table[i].length; j++)
            System.out.print(table[i][j] + "\t");
        System.out.println();
    }
}
And the output is:
angle#sin#cos#tan
0.0 0.0 1.0 0.0
30.0 0.49999999999999994 0.8660254037844387 0.5773502691896257
45.0 0.7071067811865475 0.7071067811865476 0.9999999999999999
60.0 0.8660254037844386 0.5000000000000001 1.7320508075688767
90.0 1.0 6.123233995736766E-17 1.633123935319537E16
(Please forgive the unorganised output).
I've noted several things:
sin 30 i.e. 0.5 is stored as 0.49999999999999994.
tan 45 i.e. 1.0 is stored as 0.9999999999999999.
tan 90 i.e. infinity or undefined is stored as 1.633123935319537E16 (which is a very big number).
Naturally, I was quite confused to see the output (even after deciphering the output).
So I've read this post, and the best answer tells me:
These accuracy problems are due to the internal representation of floating point numbers and there's not much you can do to avoid it.
By the way, printing these values at run-time often still leads to the correct results, at least using modern C++ compilers. For most operations, this isn't much of an issue.
- Konrad Rudolph
So, my question is:
Is there any way to prevent such inaccurate results (in Java)?
Should I round-off the results? In that case, how would I store infinity i.e. Double.POSITIVE_INFINITY?
You have to take a bit of a zen* approach to floating-point numbers: rather than eliminating the error, learn to live with it.
In practice this usually means doing things like:
when displaying the number, use String.format to specify the amount of precision to display (it'll do the appropriate rounding for you)
when comparing against an expected value, don't look for equality (==). Instead, look for a small-enough delta: Math.abs(myValue - expectedValue) <= someSmallError
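A minimal sketch of both points, using one of the values from the question (the tolerance 1e-9 is an arbitrary choice of mine; pick one appropriate to your data):

double sin30 = StrictMath.sin(StrictMath.toRadians(30.0)); // 0.49999999999999994

// display: round to the precision you actually care about
System.out.println(String.format("%.6f", sin30));          // 0.500000

// comparison: use a tolerance instead of ==
double expected = 0.5;
if (Math.abs(sin30 - expected) <= 1e-9) {
    System.out.println("close enough to 0.5");
}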
EDIT: For infinity, the same principle applies, but with a tweak: you have to pick some number to be "large enough" to treat as infinity. This is again because you have to learn to live with, rather than solve, imprecise values. In the case of something like tan(90 degrees), a double can't store π/2 with infinite precision, so your input is something very close to, but not exactly, 90 degrees -- and thus, the result is something very big, but not quite infinity. You may ask "why don't they just return Double.POSITIVE_INFINITY when you pass in the closest double to π/2," but that could lead to ambiguity: what if you really wanted the tan of that number, and not 90 degrees? Or, what if (due to previous floating-point error) you had something that was slightly farther from π/2 than the closest possible value, but for your needs it's still π/2? Rather than make arbitrary decisions for you, the JDK treats your close-to-but-not-exactly π/2 number at face value, and thus gives you a big-but-not-infinity result.
For some operations, especially those relating to money, you can use BigDecimal to eliminate floating-point errors: you can really represent values like 0.1 (instead of a value really really close to 0.1, which is the best a float or double can do). But this is much slower, and doesn't help you for things like sin/cos (at least with the built-in libraries).
* this probably isn't actually zen, but in the colloquial sense
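To illustrate the BigDecimal point (my own example): summing 0.1 ten times is exact when the BigDecimal is built from a String, but not with double:

import java.math.BigDecimal;

public class TenthsDemo {
    public static void main(String[] args) {
        double d = 0.0;
        BigDecimal b = BigDecimal.ZERO;
        BigDecimal tenth = new BigDecimal("0.1"); // exactly one tenth
        for (int i = 0; i < 10; i++) {
            d += 0.1;
            b = b.add(tenth);
        }
        System.out.println(d); // 0.9999999999999999
        System.out.println(b); // 1.0
    }
}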
You have to use BigDecimal instead of double. Unfortunately, StrictMath doesn't support BigDecimal, so you will have to use another library, or your own implementation of sin/cos/tan.
This is inherent in using floating-point numbers, in any language. Actually, it's inherent in using any representation with a fixed maximum precision.
There are several solutions. One is to use an extended-precision math package -- BigDecimal is often suggested for Java. BigDecimal can handle many more digits of precision, and also -- because it's a decimal representation rather than a binary one -- tends to round off in ways that are less surprising to humans who are used to working in base 10. (That doesn't necessarily make it more correct, please note. Binary can't represent 1/3 exactly, but neither can decimal.)
There are also extended-precision binary floating-point representations. Java directly supports float and double (which are usually also supported by the hardware), but it's possible to write versions which support more digits of accuracy.
Of course any of the extended-precision packages will slow down your computations. So you shouldn't resort to them unless you actually need them.
Another approach is to use fixed-point rather than floating-point arithmetic. For example, the standard solution for most financial calculations is simply to compute in terms of the smallest unit of currency -- pennies, in the US -- as integers, converting to and from the display format (e.g. dollars and cents) only for I/O. That's also the approach used for time in Java -- the internal clock reports an integer number of milliseconds (or nanoseconds, if you use the nanoTime call), which gives both more than sufficient precision and a more than sufficient range of values for most practical purposes. Again, this means that roundoff tends to happen in a way that matches human expectations... and again, that's less about accuracy than about not surprising the users. And these representations, because they are processed as ints or longs, allow fast computation -- faster than floating point, in fact.
There are yet other solutions which involve computing in rational numbers, or other variations, in an attempt to compromise between computational cost and precision.
But I also have to ask... Do you really NEED more precision than float is giving you? I know the roundoff is surprising, but in many cases it's perfectly acceptable to just let it happen, possibly rounding off to a less surprising number of fractional digits when you display the results to the user. In many cases, float or double are Just Fine for real-world use. That's why the hardware supports them, and that's why they're in the language.
I'm working on some tool that gets to compute numbers that can get close to 1e-25 in the worst cases, and compare them together, in Java. I'm obviously using double precision.
I have read in another answer that I shouldn't expect more than 1e-15 to 1e-17 precision, and this other question deals with getting better precision when ordering operations in a "better" order.
Which double-precision operations are more prone to losing precision along the way? Should I try to work with numbers as big as possible, or as small as possible? Should I do divisions before multiplications?
I'd rather not use the BigDecimal classes or equivalent, as the code is already slow enough ;) (unless they don't impact speed too much, of course).
Any information will be greatly appreciated!
EDIT: The fact that the numbers are "small" in absolute value (1e-25) does not matter, since a double can go down to about 1e-324. What matters is that, when they are very similar (both around 1e-25), I have to compare, let's say, 4.64563824048517606458e-21 to 4.64563824048517606472e-21 (the difference is in the 19th and 20th significant digits). When computing these numbers, the difference is so small that I hit the rounding error and the remaining digits are essentially noise.
The question is: how should the computation be ordered so that this loss of precision is minimized? It might be a matter of doing divisions before multiplications, or additions first.
If it is important to get the correct answer, you should use BigDecimal. It is slower than double, but for most cases it is fast enough. I can't think of many situations where you do a lot of calculations with such small numbers and it does not matter whether the answer is correct - at least with Java.
If this is a super performance sensitive application, I would consider using a different language.
Thanks to #John for pointing out a very complete article about floating point arithmetics.
It turns out that, when precision is needed, operations should be re-ordered and formulas adapted to avoid loss of precision, as explained in the Cancellation chapter: when comparing numbers that are very close to each other (which is my case), "catastrophic cancellation" may occur, inducing a huge loss of precision. Often, re-writing the formula, or re-ordering operations according to your a priori knowledge of the operand values, can achieve greater accuracy in the calculation.
What I'll remember from this article is:
be careful when subtracting two nearly identical quantities
try to re-arrange operations to avoid catastrophic cancellation
For the latter case, remember that computing (x - y) * (x + y) gives more accurate results than x * x - y * y.
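As a small demonstration of that last point (my own example, with values chosen so the effect shows up in plain IEEE 754 doubles):

double x = 100_000_000.0; // 1e8, exactly representable
double y =  99_999_999.0; // also exact, but y*y (9999999800000001) is not

System.out.println(x * x - y * y);     // 2.0E8        (off by one)
System.out.println((x - y) * (x + y)); // 1.99999999E8 (exact: 199,999,999)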
In Java, the following expression results in
new Double(1.0E22) + new Double(3.0E22) = 4.0E22
but
new Double(1.0E22) + new Double(4.0E22) = 4.9999999999999996E22
I was expecting it to be 5.0E22. The Double limit is 1.7976931348623157E308.
Appreciate your help. My machine's architecture is x64 and the JVM is also 64-bit.
Welcome to the planet of floating point units. Unfortunately, in the real world, you have to give up some precision to gain speed and breadth of representation. You cannot avoid that: double is only an approximate representation; in fact, you can only ever represent a number with finite precision. Still, it's a good approximation: less than 0.00000000001% error. This has nothing to do with double's upper limit, but rather with the CPU's floating point limits. Try doing some more math with Python:
>>> 4.9999999999999996 / 5.
1.0
>>> 5. - 4.9999999999999996
0.0
See? As a side note, never check doubles for equality; use approximate equality instead:
if (Math.abs(a - b) < EPSILON)
Where EPSILON is a very small value. The Java library probably has something more appropriate, but you get the idea.
If you are interested in some theory, the standard for floating point operations is IEEE 754.
There are a few ways that you can reduce floating point errors, e.g. pairwise summation and Kahan summation, but neither of these will help you precisely represent a number like 5.0E22 - as Stefano Sanfilippo stated in his answer, that's due to the limits of what you can represent using floating point, as opposed to a problem with the algorithm used to reach the answer. To represent 5.0E22 you should either use BigDecimal or else use a library that has a Rational data type, e.g. JScience.
I am trying to affect the translation of a 3D model using some UI buttons to shift the position by 0.1 or -0.1.
My model position is a three-dimensional float, so simply adding 0.1f to one of the values causes obvious rounding errors. While I can use something like BigDecimal to retain precision, I still have to convert it from a float and back to a float at the end, and it always results in silly numbers that make my UI look like a mess.
I could just pretty up the displayed values, but the rounding errors will only get worse with more editing, and they make my save files rather hard to read.
So how do I actually avoid these errors when I need to use a float?
The Kahan summation and pairwise summation algorithms help to reduce floating point errors. Here's some Java code for the Kahan algorithm.
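For reference, a minimal sketch of the Kahan (compensated summation) idea in Java - this is my own illustration, not the code linked above:

// Sums an array while carrying a running compensation for lost low-order bits.
static double kahanSum(double[] values) {
    double sum = 0.0;
    double c = 0.0;               // running compensation
    for (double v : values) {
        double y = v - c;         // corrected next term
        double t = sum + y;       // low-order bits of y may be lost here...
        c = (t - sum) - y;        // ...but are recovered into c for the next round
        sum = t;
    }
    return sum;
}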
I would use a Rational class. There are many out there - this one looks like it should work.
One significant cost will be when the Rational is rendered into a float, and another when the denominator is reduced to the GCD. The one I posted keeps the numerator and denominator in a fully reduced state at all times, which should be quite efficient if you are always adding or subtracting 1/10.
This implementation holds the values normalised (i.e. with consistent sign) but unreduced.
You should choose your implementation to best fit your usage.
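If you would rather not pull in a library, a bare-bones Rational that is enough for exact 1/10 steps could look like the sketch below; it keeps values fully reduced via the GCD, which is one of the trade-offs discussed above (my own sketch, not the linked class, and with no overflow protection):

// Minimal immutable rational number, sufficient for exact +/- 1/10 increments.
final class Rational {
    final long num, den; // den > 0, always in lowest terms

    Rational(long num, long den) {
        if (den == 0) throw new ArithmeticException("zero denominator");
        if (den < 0) { num = -num; den = -den; } // keep the sign in the numerator
        long g = gcd(Math.abs(num), den);
        this.num = num / g;
        this.den = den / g;
    }

    Rational add(Rational o) {
        return new Rational(num * o.den + o.num * den, den * o.den);
    }

    float toFloat() {
        return (float) ((double) num / den);
    }

    private static long gcd(long a, long b) {
        while (b != 0) { long t = a % b; a = b; b = t; }
        return a == 0 ? 1 : a;
    }

    @Override public String toString() { return num + "/" + den; }
}

A 0.1 shift then becomes position = position.add(new Rational(1, 10)), and toFloat() is only called when handing the value to the renderer, so repeated edits never accumulate error.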
A simple solution is to use fixed precision, i.e. an integer at 10x or 100x the scale you want.
float f = 10;
f += 0.1f;
becomes
int i = 100;
i += 1; // use as many times as you like
// use i / 10.0 as required.
I wouldn't use float in any case, as you get more rounding errors than with double for next to no benefit (unless you have millions of float values). double gives you 8 more digits of precision, and with sensible rounding you won't see those errors.
If you stick with floats:
The easiest way to avoid the error is to use floats which are exact but close to the desired value, namely
round(2^n * value) * 1/2^n,
where n is the number of fractional bits and value is the number to approximate (in your case 0.1).
In your case, with increasing precision:
n = 4 => 0.125
n = 8 (byte) => 0.1015625
n = 16 (short) => 0.100006103515625
The long digit strings are artefacts of the decimal conversion; the underlying binary number has far fewer bits. Because these floats are exact, addition and subtraction will not introduce any offset error, and the results stay predictable as long as the sums do not need more bits than the float can hold.
If you fear that your display will be compromised by this solution (because the steps are odd-looking floats), use and store only integer step counts (incrementing or decrementing by 1), and compute the value that is actually set internally as x = value * step. Since the step count only ever changes by 1, precision is retained.
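A small demonstration of the difference (my own example): repeatedly adding an exactly representable step such as 0.125f stays exact, while adding 0.1f drifts almost immediately:

float exactStep = 0.125f; // 1/8, exactly representable in binary
float roughStep = 0.1f;   // nearest float to 0.1, slightly too large

float a = 0f, b = 0f;
for (int i = 0; i < 10; i++) {
    a += exactStep;
    b += roughStep;
}
System.out.println(a); // 1.25 exactly - every partial sum is a multiple of 1/8
System.out.println(b); // 1.0000001 - the drift is already visible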