Java double addition rounding off - java

In Java, the following expression results in
new Double(1.0E22) + new Double(3.0E22) = 4.0E22
but
new Double(1.0E22) + new Double(4.0E22) = 4.9999999999999996E22
I was expecting it to be 5.0E22. The Double limit is 1.7976931348623157E308.
Appreciate your help. My machine's architecture is x64 and JVM is also 64 bit.

Welcome to the planet of floating-point units. Unfortunately, in the real world you have to give up some precision to get speed and breadth of representation. You cannot avoid that: a double is only an approximate representation, because a fixed number of bits can only hold a number with finite precision. Still, it's a good approximation: less than 0.00000000001% error. This has nothing to do with the upper limit of double, but rather with the limits of the floating-point format the CPU implements. Try doing some more math with Python:
>>> 4.9999999999999996 / 5.
1.0
>>> 5. - 4.9999999999999996
0.0
See? As a side note, never check doubles for equality; use approximate equality instead:
if (Math.abs(a - b) < EPSILON)
Where EPSILON is a very small value. The Java library probably has something more appropriate, but you get the idea.
If you are interested in some theory, the standard for floating-point operations is IEEE 754.
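As a rough illustration of the approximate-equality idea, here is a minimal sketch that scales the tolerance by the size of the operands, so it also works for large values such as 5.0E22. The class name, the helper nearlyEqual and the tolerance 1e-12 are assumptions made for this example, not something from the Java library.

```java
public class ApproxEquals {
    // Hypothetical helper: true when a and b differ by at most relTol,
    // measured relative to the larger magnitude of the two.
    static boolean nearlyEqual(double a, double b, double relTol) {
        if (a == b) {
            return true; // identical values, including two equal infinities
        }
        if (Double.isInfinite(a) || Double.isInfinite(b)) {
            return false; // an infinity is never "close" to anything else
        }
        double scale = Math.max(Math.abs(a), Math.abs(b));
        return Math.abs(a - b) <= relTol * scale;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);                   // false
        System.out.println(nearlyEqual(sum, 0.3, 1e-12)); // true
    }
}
```

A relative tolerance is used here because a fixed EPSILON that suits values near 1 is far too small for values near 1E22, where neighbouring doubles are millions apart.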

There are a few ways that you can reduce floating point errors, e.g. pairwise summation and Kahan summation, but neither of these will help you precisely represent a number like 5.0E22. As Stefano Sanfilippo stated in his answer, that's due to the limits of what you can represent using floating point, as opposed to a problem with the algorithm used to achieve the answer. To represent 5.0E22 you should either use BigDecimal or else use a library that has a Rational data type, e.g. JScience.
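A minimal sketch of the BigDecimal route, using only java.math.BigDecimal from the standard library. The class and method names are made up for this example. It contrasts the exact decimal sum with the exact value of the double that the floating-point addition actually stores.

```java
import java.math.BigDecimal;

public class ExactSum {
    // Exact decimal addition; the operands are passed as decimal strings
    // so that no binary rounding happens before the addition.
    static BigDecimal exactSum(String a, String b) {
        return new BigDecimal(a).add(new BigDecimal(b));
    }

    // Exact decimal expansion of the double produced by a + b.
    static BigDecimal storedSum(double a, double b) {
        return new BigDecimal(a + b);
    }

    public static void main(String[] args) {
        // Exactly 5 x 10^22
        System.out.println(exactSum("1.0E22", "4.0E22").toPlainString());
        // The nearest double lies slightly below 5 x 10^22
        System.out.println(storedSum(1.0E22, 4.0E22).toPlainString());
    }
}
```

The second line shows that the stored double is an integer a few million below 5 x 10^22, which is the closest value the 53-bit significand can hold at that magnitude.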

Related

The accuracy of a double in general programming and Java

I understand that, due to the nature of a float/double, one should not use them for precision-critical calculations. However, I'm a little confused about their limitations because of mixed answers on similar questions: are floats and doubles always inaccurate regardless of the number of significant digits, or are they only inaccurate from the 16th digit onward?
I've run a few examples in Java,
System.out.println(Double.parseDouble("999999.9999999999"));
// this outputs correctly w/ 16 digits
System.out.println(Double.parseDouble("9.99999999999999"));
// This also outputs correctly w/ 15 digits
System.out.println(Double.parseDouble("9.999999999999999"));
// But this doesn't output correctly w/ 16 digits. Outputs 9.999999999999998
I can't find the link to another answer that stated that values like 1.98 and 2.02 would round down to 2.0 and therefore create inaccuracies but testing shows that the values are printed correctly. So my first question is whether or not floating/double values will always be inaccurate or is there a lower limit where you can be assured of precision.
My second question is in regards to using BigDecimal. I know that I should be using BigDecimal for precision important calculations. Therefore I should be using BigDecimal's methods for arithmetic and comparing. However, BigDecimal also includes a doubleValue() method which will convert the BigDecimal to a double. Would it be safe for me to do a comparison between double values that I know for sure have less than 16 digits? There will be no arithmetic done on them at all so the inherent values should not have changed.
For example, is it safe for me to do the following?
BigDecimal myDecimal = new BigDecimal("123.456");
BigDecimal myDecimal2 = new BigDecimal("234.567");
if (myDecimal.doubleValue() < myDecimal2.doubleValue()) System.out.println("myDecimal is smaller than myDecimal2");
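For reference, the same comparison can be done without leaving BigDecimal at all. This is a minimal sketch using BigDecimal.compareTo; the class and helper names are made up for the example.

```java
import java.math.BigDecimal;

public class DecimalCompare {
    // compareTo orders by numeric value and ignores scale,
    // so 2.0 and 2.00 compare as equal.
    static boolean isSmaller(BigDecimal a, BigDecimal b) {
        return a.compareTo(b) < 0;
    }

    public static void main(String[] args) {
        BigDecimal myDecimal = new BigDecimal("123.456");
        BigDecimal myDecimal2 = new BigDecimal("234.567");
        if (isSmaller(myDecimal, myDecimal2)) {
            System.out.println("myDecimal is smaller than myDecimal2");
        }
    }
}
```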
Edit: After reading some of the responses to my own answer I've realized my understanding was incorrect and have deleted it. Here are some snippets from it that might help in the future.
"A double cannot hold 0.1 precisely. The closest representable value to 0.1 is 0.1000000000000000055511151231257827021181583404541015625. Java Double.toString only prints enough digits to uniquely identify the double, not the exact value." - Patricia Shanahan
Sources:
https://stackoverflow.com/a/5749978 - States that a double can hold up to 15 digits
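The quoted statement about 0.1 can be checked directly. This is a small sketch: the BigDecimal(double) constructor converts the double's binary value exactly, while Double.toString prints a short form. The class and method names are made up for the example.

```java
import java.math.BigDecimal;

public class ExactDoubleValue {
    // Full decimal expansion of the value actually stored in the double.
    static String exactDecimal(double d) {
        return new BigDecimal(d).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(Double.toString(0.1)); // 0.1
        System.out.println(exactDecimal(0.1));
        // 0.1000000000000000055511151231257827021181583404541015625
    }
}
```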
I suggest you read this page:
https://en.wikipedia.org/wiki/Double-precision_floating-point_format
Once you've read and understood it, and perhaps converted several examples to their binary representations in the 64 bit floating point format, then you'll have a much better idea of what significant digits a Double can hold.
As a side note, (perhaps trivial) a nice and reliable way to store a known precision of value is to simply multiply it by the relevant factor and store as some integral type, which are completely precise.
For example:
double costInPounds = <something>; //e.g. 3.587
int costInPence = (int)(costInPounds * 100 + 0.5); //359
Plainly some precision can be lost, but if a required/desired precision is known, this can save a lot of bother with floating point values, and once this has been done, no precision can be lost by further manipulations.
The + 0.5 is to ensure that rounding works as expected. (int) truncates toward zero, which for a positive double is its 'floor', so adding 0.5 makes positive values round up and down as expected. Negative values would need - 0.5 instead.
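A variant of the pence conversion above, sketched with Math.round, which rounds to the nearest long and also behaves sensibly for negative amounts. The class and method names are assumptions for this example, and using long instead of int is a choice made here to leave more headroom.

```java
public class PenceConversion {
    // Converts an amount in pounds to a whole number of pence,
    // rounding to the nearest penny.
    static long toPence(double pounds) {
        return Math.round(pounds * 100.0);
    }

    public static void main(String[] args) {
        System.out.println(toPence(3.587));  // 359
        System.out.println(toPence(-3.587)); // -359
    }
}
```

Once the value is held as a long, further additions and subtractions are exact as long as they stay within the range of long.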

Comma double numbers multiplication

Why does this Java code return 61.004999999999995 instead of 61.005? I don't get it.
System.out.println(105*0.581);
It occurs due to the nature of floating-point numbers. Computers are not very intelligent when working with floating-point numbers, so we have to work with approximations.
Instead of 6.005 == 6.004999, you should do this: 6.005 - 6.004999 <= 0.001
You have run into a floating-point precision problem. In computer science there is a simple (but annoying) fact: you cannot represent all real numbers. This is also true for Java.
If you want to go deeper, you can study how floating-point numbers are stored in memory. The key words are: sign bit, mantissa and exponent. Be aware that the precision also depends on the width of the type (32 or 64 bits).
http://en.wikipedia.org/wiki/Single-precision_floating-point_format
Speaking of Java, for more precision you can use BigDecimal:
System.out.println(new BigDecimal(105).multiply(new BigDecimal(0.581)));
You can also round it with round(MathContext mc) which in this case will give you 61.005 if you set the precision to 5.
System.out.println(new BigDecimal(105).multiply(new BigDecimal(0.581)).round(new MathContext(5)));
https://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html
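One detail worth illustrating: new BigDecimal(0.581) starts from the double, so it carries the double's binary approximation into the BigDecimal. A minimal sketch, with made-up class and method names, that builds the operands from strings instead so that the product is exact:

```java
import java.math.BigDecimal;

public class DecimalProduct {
    // Multiplies two decimal literals given as strings, so no binary
    // floating-point value is ever involved.
    static BigDecimal product(String a, String b) {
        return new BigDecimal(a).multiply(new BigDecimal(b));
    }

    public static void main(String[] args) {
        System.out.println(product("105", "0.581")); // 61.005
        // For comparison: the exact value of the double literal 0.581,
        // which is close to, but not exactly, 0.581.
        System.out.println(new BigDecimal(0.581));
    }
}
```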
If it's just a question of how to display it and the precision doesn't matter, you can use DecimalFormat.
System.out.println(new DecimalFormat("###.###").format(105*0.581));
https://docs.oracle.com/javase/8/docs/api/java/text/DecimalFormat.html

Tan function in java [duplicate]

I came to know about the accuracy issues when I executed the following program:
public static void main(String args[])
{
    double table[][] = new double[5][4];
    int i, j;
    for(i = 0, j = 0; i <= 90; i+= 15)
    {
        if(i == 15 || i == 75)
            continue;
        table[j][0] = i;
        double theta = StrictMath.toRadians((double)i);
        table[j][1] = StrictMath.sin(theta);
        table[j][2] = StrictMath.cos(theta);
        table[j++][3] = StrictMath.tan(theta);
    }
    System.out.println("angle#sin#cos#tan");
    for(i = 0; i < table.length; i++){
        for(j = 0; j < table[i].length; j++)
            System.out.print(table[i][j] + "\t");
        System.out.println();
    }
}
And the output is:
angle#sin#cos#tan
0.0 0.0 1.0 0.0
30.0 0.49999999999999994 0.8660254037844387 0.5773502691896257
45.0 0.7071067811865475 0.7071067811865476 0.9999999999999999
60.0 0.8660254037844386 0.5000000000000001 1.7320508075688767
90.0 1.0 6.123233995736766E-17 1.633123935319537E16
(Please forgive the unorganised output).
I've noted several things:
sin 30 i.e. 0.5 is stored as 0.49999999999999994.
tan 45 i.e. 1.0 is stored as 0.9999999999999999.
tan 90 i.e. infinity or undefined is stored as 1.633123935319537E16 (which is a very big number).
Naturally, I was quite confused to see the output (even after deciphering the output).
So I've read this post, and the best answer tells me:
These accuracy problems are due to the internal representation of floating point numbers and there's not much you can do to avoid it.
By the way, printing these values at run-time often still leads to the correct results, at least using modern C++ compilers. For most operations, this isn't much of an issue.
answered Oct 7 '08 at 7:42
Konrad Rudolph
So, my question is:
Is there any way to prevent such inaccurate results (in Java)?
Should I round-off the results? In that case, how would I store infinity i.e. Double.POSITIVE_INFINITY?
You have to take a bit of a zen* approach to floating-point numbers: rather than eliminating the error, learn to live with it.
In practice this usually means doing things like:
when displaying the number, use String.format to specify the amount of precision to display (it'll do the appropriate rounding for you)
when comparing against an expected value, don't look for equality (==). Instead, look for a small-enough delta: Math.abs(myValue - expectedValue) <= someSmallError
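Both habits can be sketched in a few lines. The class and helper names and the tolerance 1e-9 are assumptions for this example. Locale.ROOT is passed so the decimal separator does not depend on the machine's locale.

```java
import java.util.Locale;

public class DisplayAndCompare {
    // Formats the value with a fixed number of decimals; String.format
    // does the rounding.
    static String show(double value, int decimals) {
        return String.format(Locale.ROOT, "%." + decimals + "f", value);
    }

    // Delta comparison instead of ==.
    static boolean closeTo(double value, double expected, double someSmallError) {
        return Math.abs(value - expected) <= someSmallError;
    }

    public static void main(String[] args) {
        double sin30 = StrictMath.sin(StrictMath.toRadians(30));
        double tan45 = StrictMath.tan(StrictMath.toRadians(45));
        System.out.println(show(sin30, 4));             // 0.5000
        System.out.println(closeTo(tan45, 1.0, 1e-9));  // true
    }
}
```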
EDIT: For infinity, the same principle applies, but with a tweak: you have to pick some number to be "large enough" to treat as infinity. This is again because you have to learn to live with, rather than solve, imprecise values. In the case of something like tan(90 degrees), a double can't store π/2 with infinite precision, so your input is something very close to, but not exactly, 90 degrees -- and thus, the result is something very big, but not quite infinity. You may ask "why don't they just return Double.POSITIVE_INFINITY when you pass in the closest double to π/2," but that could lead to ambiguity: what if you really wanted the tan of that number, and not 90 degrees? Or, what if (due to previous floating-point error) you had something that was slightly farther from π/2 than the closest possible value, but for your needs it's still π/2? Rather than make arbitrary decisions for you, the JDK treats your close-to-but-not-exactly π/2 number at face value, and thus gives you a big-but-not-infinity result.
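A minimal sketch of the "large enough counts as infinity" idea. The method name and the threshold 1e15 are arbitrary assumptions for the example; the right cut-off depends on the application.

```java
public class TanWithInfinity {
    // Returns tan(degrees), but maps any result whose magnitude exceeds
    // the threshold to a signed infinity.
    static double tanOrInfinity(double degrees, double threshold) {
        double t = StrictMath.tan(StrictMath.toRadians(degrees));
        if (Math.abs(t) > threshold) {
            return t > 0 ? Double.POSITIVE_INFINITY : Double.NEGATIVE_INFINITY;
        }
        return t;
    }

    public static void main(String[] args) {
        System.out.println(tanOrInfinity(90, 1e15)); // Infinity
        System.out.println(tanOrInfinity(45, 1e15)); // close to 1.0
    }
}
```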
For some operations, especially those relating to money, you can use BigDecimal to eliminate floating-point errors: you can really represent values like 0.1 (instead of a value really really close to 0.1, which is the best a float or double can do). But this is much slower, and doesn't help you for things like sin/cos (at least with the built-in libraries).
* this probably isn't actually zen, but in the colloquial sense
You have to use BigDecimal instead of double. Unfortunately, StrictMath doesn't support BigDecimal, so you will have to use another library, or your own implementation of sin/cos/tan.
This is inherent in using floating-point numbers, in any language. Actually, it's inherent in using any representation with a fixed maximum precision.
There are several solutions. One is to use an extended-precision math package -- BigDecimal is often suggested for Java. BigDecimal can handle many more digits of precision, and also -- because it's a decimal representation rather than a binary representation -- tends to round off in ways that are less surprising to humans who are used to working in base 10. (That doesn't necessarily make them more correct, please note. Binary can't represent 1/3 exactly, but neither can decimal.)
There are also extended-precision binary floating-point representations. Java directly supports float and double (which are usually also supported by the hardware), but it's possible to write versions which support more digits of accuracy.
Of course any of the extended-precision packages will slow down your computations. So you shouldn't resort to them unless you actually need them.
Another way is to use fixed-point binary rather than floating point. For example, the standard solution for most financial calculations is simply to compute in terms of the smallest unit of currency -- pennies, in the US -- in integers, converting to and from the display format (e.g. dollars and cents) only for I/O. That's also the approach used for time in Java -- the internal clock reports an integer number of milliseconds (or nanoseconds, if you use the System.nanoTime call), which gives both more than sufficient precision and a more than sufficient range of values for most practical purposes. Again, this means that roundoff tends to happen in a way that matches human expectations... and again, that's less about accuracy than about not surprising the users. And these representations, because they process as integers or longs, allow fast computation -- faster than floating point, in fact.
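The smallest-unit approach can be sketched as follows. The class and method names are made up for the example, and formatCents assumes a non-negative amount to keep the sketch short.

```java
import java.util.Locale;

public class FixedPointMoney {
    // Adds the same amount, held as whole cents, count times. Exact.
    static long sumCents(long centsEach, int count) {
        long total = 0;
        for (int i = 0; i < count; i++) {
            total += centsEach;
        }
        return total;
    }

    // The same loop in double, for comparison.
    static double sumDouble(double each, int count) {
        double total = 0.0;
        for (int i = 0; i < count; i++) {
            total += each;
        }
        return total;
    }

    // Converts to the display format only at output time.
    // Assumes cents >= 0.
    static String formatCents(long cents) {
        return String.format(Locale.ROOT, "%d.%02d", cents / 100, cents % 100);
    }

    public static void main(String[] args) {
        System.out.println(formatCents(sumCents(10, 10))); // 1.00
        System.out.println(sumDouble(0.1, 10) == 1.0);     // false
    }
}
```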
There are yet other solutions which involve computing in rational numbers, or other variations, in an attempt to compromise between computational cost and precision.
But I also have to ask... Do you really NEED more precision than float is giving you? I know the roundoff is surprising, but in many cases it's perfectly acceptable to just let it happen, possibly rounding off to a less surprising number of fractional digits when you display the results to the user. In many cases, float or double are Just Fine for real-world use. That's why the hardware supports them, and that's why they're in the language.

Significant precision difference in floating point calculations in Java / C#

I know that similar questions have been asked before, but none of the answers solves my problem.
I have 2 functions:
Java
public static void main(String[] args) {
double h, z, lat0, n0, eSq;
z = 4488055.516;
lat0 = 0.7853981634671384;
n0 = 6388838.290122733;
eSq = 0.0066943799901975545;
h = z / Math.sin(lat0) - n0 * (1 - eSq);
System.out.println(h);
}
C#
public static void Main (string[] args)
{
double h, z, lat0, n0, eSq;
z = 4488055.516;
lat0 = 0.7853981634671384;
n0 = 6388838.290122733;
eSq = 0.0066943799901975545;
h = z / Math.Sin(lat0) - n0 * (1 - eSq);
Console.WriteLine(h);
}
or
4488055,516/sin(0,7853981634671384)-6388838,290122733*(1-0,0066943799901975545)
for SpeedCrunch, Maxima and LibreOffice Calc.
Results are:
Java: 1000.0000555226579 (same with and without strictfp)
C#: 1000,00005552359
SpeedCrunch (15): 1000,000055524055155
SpeedCrunch (50): 1000,00005552405515548724762598846216107366705932894830
LibreOffice Calc: 1000,000055523590000
Maxima: 1000.000055523589
Maxima: 1.00000005552391142b3 (bfloat - fpprec:20)
As you can see, Java and C# are different at the 9th decimal place. Others are not so uniform either. This is tested on the same OS and same CPU. Tests are done also on 32-bit and 64-bit systems.
How to solve this kind of problem? I thought that precision should be equal to 15 decimal places.
The cause of not getting 15 digits is the subtraction of two similar numbers. z / Math.sin(lat0) is about 6347068.978968251. n0 * (1 - eSq) is about 6346068.978912728. With 7 digits before the decimal point, a change in the 9th decimal place in the subtraction result corresponds to a change of less than one part in 10^15 in one of those inputs.
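To make the cancellation visible, this sketch prints the two intermediate terms and the spacing between adjacent doubles at that magnitude, using Math.ulp. The class and method names are made up for the example; the input values are the ones from the question.

```java
public class CancellationDemo {
    static final double Z = 4488055.516;
    static final double LAT0 = 0.7853981634671384;
    static final double N0 = 6388838.290122733;
    static final double E_SQ = 0.0066943799901975545;

    static double firstTerm() {
        return Z / Math.sin(LAT0);
    }

    static double secondTerm() {
        return N0 * (1 - E_SQ);
    }

    public static void main(String[] args) {
        double a = firstTerm();
        double b = secondTerm();
        System.out.println(a);
        System.out.println(b);
        // Gap between neighbouring doubles near 6.3 million: about 9.3e-10.
        System.out.println(Math.ulp(a));
        // The difference is about 1000, so an error of one ulp in either
        // term already shows up around the 9th or 10th decimal place.
        System.out.println(a - b);
    }
}
```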
The simplest solution to this type of problem is usually to display only the digits that are supported by reliable digits in the inputs. Very few measurements can be done to one part in 10^12, so in this case it is almost certain that the digits that differ due to floating point rounding error would be dropped.
For example, it looks as though your data relates to location. One of the most carefully measured pieces of such data is the height of Mount Everest. The current best estimate, based on high precision GPS measurement, is "29,035 feet, with an error margin of plus or minus 6.5 feet" (Encyclopedia Britannica). An error in the 13th significant digit corresponds to an error of less than one thousandth of an inch in measuring the circumference of the earth.
If the rounding error really is significant relative to the result requirements, and to what is realistically achievable given the accuracy of the inputs, then you might need to look at more clever ways of arranging the calculation or at higher precision arithmetic.
The precision difference you are observing is not in floating point support per se. Java and C# will be using the same (IEEE 754) floating point representations, and quite likely the same instructions.
What you are observing is probably differences in the algorithms that calculate transcendental functions; e.g. the sine function. The theoretical calculation involves summing an infinite series. To get the most accurate answer possible, you keep summing the series forever. To get the answer to a certain precision, you keep summing until the "deltas" are less than the required precision.
In practice, this approach is impractically slow. Practical algorithms are implemented using tables and interpolation, where possible. This is much faster, though you don't get maximal precision.
The precision of Java Math.sin is determined by the algorithms used to compute it. These algorithms are platform specific. Here is what the javadocs for Math say on the subject of precision.
Unlike some of the numeric methods of class StrictMath, all implementations of the equivalent functions of class Math are not defined to return the bit-for-bit same results. This relaxation permits better-performing implementations where strict reproducibility is not required.
And the precision guarantee for Math.sin is:
The computed result must be within 1 ulp of the exact result. Results must be semi-monotonic.
where "ulp" and "semi-monotonic" are specified in the javadoc.
By contrast the javadocs for StrictMath state that a specific version of a specific open-source library is used to do the calculations. The goal is reproducibility; i.e. the same answer on all Java platforms.
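A small sketch for checking how far apart Math.sin and StrictMath.sin are on a given machine, measured in ulps. The class and helper names are made up for the example. On many JVMs the two values will be identical for this input, but that is not guaranteed.

```java
public class SinComparison {
    // Distance between a and b, expressed in units of the gap between
    // adjacent doubles at b.
    static double ulpsApart(double a, double b) {
        return Math.abs(a - b) / Math.ulp(b);
    }

    public static void main(String[] args) {
        double x = 0.7853981634671384; // lat0 from the question
        double fast = Math.sin(x);
        double strict = StrictMath.sin(x);
        System.out.println(fast);
        System.out.println(strict);
        System.out.println(ulpsApart(fast, strict));
    }
}
```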
Q: So why didn't they make Math.sin more precise? It is possible to get to within 0.5 ulp ...
It is an engineering trade-off between speed and precision. Read this wikipedia article to get a feel for the problem.
My advice is that if you want maximal precision, look for a 3rd party open-source Java library that implements the transcendental functions. And be prepared for your code to be a lot slower.

Java float and double diff

I am using jdk 1.6. This is my code.
float f = 10.0f;
double d = 10.0;
System.out.println("Equal Status : " + (f == d));
then the system shows the answer as true. But if I modified the value as
float f = 10.1f;
double d = 10.1;
System.out.println("Equal Status : " + (f == d));
then the system shows the answer as false. I know the system uses bit matching for the == check, but what is the reason behind this? Can you explain it? Thanks in advance.
While this is not "my" answer, this is about as close to "must read" literature for programmers who want to move from "meh" to "good." Great is something truly special, so don't think that "good" is anything to sneeze at. :)
What Every Programmer Needs to know about Floating Point
The link @Sam suggested is great but still too technical for me :P
I will just give the OP some opinions on handling floating point (probably a bit off-topic because you are asking for the reason behind it. For that, read the link @Sam suggested).
Never assume a floating-point number is going to give you an accurate representation. Sometimes it can, but not always. Floating point is constrained in its "significant figures": it is only "accurate" for the first n digits.
Your situation is even worse because you are mixing float and double, but the way to solve it is similar.
You need to decide what precision your application needs in the calculation result, and choose an Epsilon value based on it. For example, if your application only needs accuracy to 3 decimal places, an Epsilon of 0.0005 is probably reasonable.
Comparing two floating point number shouldn't be done by ==, you should use
(a + EPSILON > b && a - EPSILON < b). Similarly, a > b should be expressed as a - EPSILON > b
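The two expressions above, wrapped into a runnable sketch and applied to the float/double pair from the question. The class and method names are made up for the example, and EPSILON is fixed at the 0.0005 mentioned above.

```java
public class EpsilonCompare {
    static final double EPSILON = 0.0005;

    // "Equal" within EPSILON on either side.
    static boolean approxEqual(double a, double b) {
        return a + EPSILON > b && a - EPSILON < b;
    }

    // a counts as greater only if it exceeds b by more than EPSILON.
    static boolean definitelyGreater(double a, double b) {
        return a - EPSILON > b;
    }

    public static void main(String[] args) {
        float f = 10.1f;
        double d = 10.1;
        System.out.println(f == d);                 // false
        System.out.println(approxEqual(f, d));      // true
        System.out.println(definitelyGreater(f, d)); // false
    }
}
```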
Points to remember are
10.1 has a repeating binary expansion: 1010.000110011001100...
When comparing a float and a double, the float is converted to a double by adding zeros to fill the number out,
so you will be comparing
1010.00011001100110011010000000... to 1010.00011001100110011001100110..., which are different.
float f = 10.1f;
double d = 10.1;
System.out.println("Equal Status : " + (f == (float)d));
will give the answer of true
IMHO, Generally speaking for 99% of use case double is a better choice because it is more accurate. i.e. don't use float unless you have to.
BigDecimal can be used to display the actual representation of a float or double. You don't normally see this because toString performs a small amount of rounding (it is coded to accommodate the type's representation limitations).
System.out.println("10.1f is actually " + new BigDecimal(10.1f));
System.out.println("10.1 is actually " + new BigDecimal(10.1));
prints
10.1f is actually 10.1000003814697265625
10.1 is actually 10.0999999999999996447286321199499070644378662109375
You can see that the double value is closer to the desired 10.1 but is not exactly this value. The reason the values are different is that, in each case, you get the closest representable value for that type.
float is a 32 bit type whereas double is a 64 bit type.
You ran into the classical floating point precision problem.
Floats are imprecise. The actual values of 10.1f and 10.1 will be slightly different due to rounding. Floats are binary, not decimal, so numbers that look "simple" to us, like 10.1, can't be represented exactly as floats.
You would want to refresh yourself on the IEEE floating point standards for both 32 and 64-bit floating point representations. If you peel into the internals of this, you'll see clearly as to why these floating points behave finicky.
If you're curious about how it's represented internally (which is why it's failing), you can use this code, which shows the hexadecimal representations of these numbers. From there, you can match them up with the exponents and mantissas of single and double precision.
System.out.printf("Float 10.0: 0x%X\n", Float.floatToRawIntBits((float)10.0));
System.out.printf("Double 10.0: 0x%X\n", Double.doubleToRawLongBits(10.0));
System.out.printf("Float 10.1: 0x%X\n", Float.floatToRawIntBits((float)10.1));
System.out.printf("Double 10.1: 0x%X\n", Double.doubleToRawLongBits(10.1));
prints
Float 10.0: 0x41200000
Double 10.0: 0x4024000000000000
Float 10.1: 0x4121999A
Double 10.1: 0x4024333333333333
You'll notice that there is some repetition in the way the values are represented. This is because 1/10 can't be represented in a finite space of base 2.
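To match the hexadecimal output above against the fields of the double format (1 sign bit, 11 exponent bits with a bias of 1023, 52 mantissa bits), a sketch like the following can be used. The class and method names are made up for the example, and rebuild only handles normal numbers, not zero, subnormals, infinities or NaN.

```java
public class DoubleBitFields {
    static long signBit(double d) {
        return Double.doubleToRawLongBits(d) >>> 63;
    }

    // Unbiased exponent: the 11 bits after the sign, minus the bias 1023.
    static int exponent(double d) {
        long bits = Double.doubleToRawLongBits(d);
        return (int) ((bits >>> 52) & 0x7FFL) - 1023;
    }

    // The low 52 bits: the fraction part of the significand.
    static long mantissa(double d) {
        return Double.doubleToRawLongBits(d) & 0xFFFFFFFFFFFFFL;
    }

    // Reassembles (-1)^sign * 1.mantissa * 2^exponent (normal numbers only).
    static double rebuild(double d) {
        double significand = 1.0 + mantissa(d) / 0x1p52;
        double magnitude = Math.scalb(significand, exponent(d));
        return signBit(d) == 0 ? magnitude : -magnitude;
    }

    public static void main(String[] args) {
        System.out.println(exponent(10.1));                    // 3
        System.out.println(Long.toHexString(mantissa(10.1)));  // 4333333333333
        System.out.println(rebuild(10.1) == 10.1);             // true
    }
}
```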
