double numbers are not too accurate - java

I've written a method for polynomial long division, and it works perfectly with "good" polynomials. By "good" I mean coefficients that divide exactly. Today I ran into a problem when I tried to divide 2*x^3-18*x^2+.... by 7.00000(many zeros)0000028*x^2 + 5*x + ... After dividing 2*x^3 by 7.000...000028*x^2 I got 0.285714....53*x. The next step is to multiply 0.2857....53*x by 7.00000...0000028*x^2 + 5*x + .. and subtract the product from the dividend 2*x^3-18*x^2+... to get a new polynomial of degree 2. But because of the limits of the double type I actually got the polynomial 2.220....E-16*x^3 - 6*x^2 + .... I know that the x^3 coefficient is in fact zero. I don't want to invent something new and strange, which is why I am asking how to resolve this cleanly and correctly. Thanks.

Many division results such as 1/7 cannot be represented exactly in either double or BigDecimal. If you go with BigDecimal you would have to pick a number of digits to preserve, and deal with rounding error. For double, you get more convenient arithmetic, but a fixed number of significant bits.
You have two options.
One is to handle rounding error. When a result is very close to zero, so close that it is probably due to rounding error, treat it as zero. I don't know whether that will work for your algorithm or not. If you go this way, you can use either double or BigDecimal.
The second option is to use a rational number package. In rational number arithmetic all division results can be represented exactly. 1/7 remains 1/7, without being rounded to a terminating decimal or binary fraction. If you go this way, search for "java rational number" (no quotes) and decide which one you like best.
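As a rough illustration of the second option, here is a minimal sketch of exact fraction arithmetic built on java.math.BigInteger. The Fraction and FractionDemo classes and their method names are hypothetical, written only for this example; a maintained rational number library would be the better choice in real code. It uses a plain 7 as the divisor coefficient, since the exact digits in the question are elided.

```java
import java.math.BigInteger;

// Hypothetical minimal fraction type for illustration; not from any library.
final class Fraction {
    final BigInteger num;
    final BigInteger den; // kept positive

    Fraction(BigInteger n, BigInteger d) {
        if (d.signum() == 0) throw new ArithmeticException("zero denominator");
        if (d.signum() < 0) { n = n.negate(); d = d.negate(); }
        BigInteger g = n.gcd(d); // d != 0, so g > 0
        num = n.divide(g);
        den = d.divide(g);
    }

    static Fraction of(long n, long d) {
        return new Fraction(BigInteger.valueOf(n), BigInteger.valueOf(d));
    }

    Fraction times(Fraction o) {
        return new Fraction(num.multiply(o.num), den.multiply(o.den));
    }

    Fraction dividedBy(Fraction o) {
        return new Fraction(num.multiply(o.den), den.multiply(o.num));
    }

    Fraction minus(Fraction o) {
        return new Fraction(num.multiply(o.den).subtract(o.num.multiply(den)),
                            den.multiply(o.den));
    }

    boolean isZero() { return num.signum() == 0; }

    @Override public String toString() { return num + "/" + den; }
}

class FractionDemo {
    public static void main(String[] args) {
        Fraction two = Fraction.of(2, 1);
        Fraction seven = Fraction.of(7, 1);
        Fraction q = two.dividedBy(seven);          // quotient coefficient, stays 2/7
        Fraction lead = two.minus(q.times(seven));  // leading coefficient after subtraction
        System.out.println(q + " -> leading term is zero: " + lead.isZero());
    }
}
```

Because every intermediate value is stored as a reduced numerator/denominator pair, the leading coefficient cancels exactly and no tolerance check is needed. The cost is that numerators and denominators can grow large over many steps.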

Related

Appropriate scale for converting via BigDecimal to floating point

I've written an arbitrary precision rational number class that needs to provide a way to convert to floating-point. This can be done straightforwardly via BigDecimal:
return new BigDecimal(num).divide(new BigDecimal(den), 17, RoundingMode.HALF_EVEN).doubleValue();
but this requires a value for the scale parameter when dividing the decimal numbers. I picked 17 as the initial guess because that is approximately the precision of a double precision floating point number, but I don't know whether that's actually correct.
What would be the correct number to use, defined as, the smallest number such that making it any larger would not make the answer any more accurate?
Introduction
No finite precision suffices.
The problem posed in the question is equivalent to:
What precision p guarantees that converting any rational number x to p decimal digits and then to floating-point yields the floating-point number nearest x (or, in case of a tie, either of the two nearest x)?
To see this is equivalent, observe that the BigDecimal divide shown in the question returns num/div to a selected number of decimal places. The question then asks whether increasing that number of decimal places could increase the accuracy of the result. Clearly, if there is a floating-point number nearer x than the result, then the accuracy could be improved. Thus, we are asking how many decimal places are needed to guarantee the closest floating-point number (or one of the tied two) is obtained.
Since BigDecimal offers a choice of rounding methods, I will consider whether any of them suffices. For the conversion to floating-point, I presume round-to-nearest-ties-to-even is used (which BigDecimal appears to use when converting to Double or Float). I give a proof using the IEEE-754 binary64 format, which Java uses for Double, but the proof applies to any binary floating-point format by changing the 2^52 used below to 2^(w−1), where w is the number of bits in the significand.
Proof
One of the parameters to a BigDecimal division is the rounding method. Java’s BigDecimal has several rounding methods. We only need to consider three, ROUND_UP, ROUND_HALF_UP, and ROUND_HALF_EVEN. Arguments for the others are analogous to those below, by using various symmetries.
In the following, suppose we convert to decimal using any large precision p. That is, p is the number of decimal digits in the result of the conversion.
Let m be the rational number 2^52+1+½−10^−p. The two binary64 numbers neighboring m are 2^52+1 and 2^52+2. m is closer to the first one, so that is the result we require from converting m first to decimal and then to floating-point.
In decimal, m is 4503599627370497.4999…, where there are p−1 trailing 9s. When rounded to p significant digits with ROUND_UP, ROUND_HALF_UP, or ROUND_HALF_EVEN, the result is 4503599627370497.5 = 2^52+1+½. (Recognize that, at the position where rounding occurs, there are 16 trailing 9s being discarded, effectively a fraction of .9999999999999999 relative to the rounding position. In ROUND_UP, any non-zero discarded amount causes rounding up. In ROUND_HALF_UP and ROUND_HALF_EVEN, a discarded amount greater than ½ at that position causes rounding up.)
2^52+1+½ is equally close to the neighboring binary64 numbers 2^52+1 and 2^52+2, so the round-to-nearest-ties-to-even method produces 2^52+2.
Thus, the result is 2^52+2, which is not the binary64 value closest to m.
Therefore, no finite precision p suffices to round all rational numbers correctly.
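The counterexample can be reproduced with a short sketch. It assumes the conversion from the question (with the scale turned into a parameter) and builds m with 30 decimal places, which is enough to defeat the scale of 17 used there; the class and method names are made up for this example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

class DoubleRoundingDemo {
    // The question's conversion, with the scale made a parameter.
    static double viaDecimal(BigInteger num, BigInteger den, int scale) {
        return new BigDecimal(num)
                .divide(new BigDecimal(den), scale, RoundingMode.HALF_EVEN)
                .doubleValue();
    }

    // Numerator of m = 2^52 + 1 + 1/2 - 10^-p over the denominator 2 * 10^p.
    static BigInteger numerator(int p) {
        BigInteger tenP = BigInteger.TEN.pow(p);
        BigInteger base = BigInteger.ONE.shiftLeft(52).add(BigInteger.ONE); // 2^52 + 1
        return base.multiply(denominator(p)).add(tenP).subtract(BigInteger.valueOf(2));
    }

    static BigInteger denominator(int p) {
        return BigInteger.TEN.pow(p).shiftLeft(1); // 2 * 10^p
    }

    public static void main(String[] args) {
        int p = 30;
        double lower = 4503599627370497.0; // 2^52 + 1, the double nearest to m
        double upper = 4503599627370498.0; // 2^52 + 2

        // Scale 17 rounds m up to ...497.5, and the tie then goes to the even neighbor.
        double coarse = viaDecimal(numerator(p), denominator(p), 17);
        // Scale p keeps every digit of m, so the conversion sees that m is below the tie.
        double exact = viaDecimal(numerator(p), denominator(p), p);

        System.out.println("scale 17 gives 2^52+2: " + (coarse == upper));
        System.out.println("scale 30 gives 2^52+1: " + (exact == lower));
    }
}
```

Raising the scale to 30 fixes this particular m, but the same construction with a larger p breaks it again, which is the point of the proof.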

How to round a double/float to BINARY precision?

I am writing tests for code performing calculations on floating point numbers. Quite expectedly, the results are rarely exact and I would like to set a tolerance between the calculated and expected result. I have verified that in practice, with double precision, the results are always correct after rounding of last two significant decimals, but usually after rounding the last decimal. I am aware of the format in which doubles and floats are stored, as well as the two main methods of rounding (precise via BigDecimal and faster via multiplication, math.round and division). As the mantissa is stored in binary however, is there a way to perform rounding using base 2 rather than 10?
Just clearing the last 3 bits almost always yields equal results, but if I could push it and instead 'add 2' to the mantissa if its second least significant bit is set, I could probably reach the limit of accuracy. This would be easy enough, except I have no idea how to handle overflow (when all bits 52-1 are set).
A Java solution would be preferred, but I could probably port one for another language if I understood it.
EDIT:
As part of the problem was that my code was generic with regards to arithmetic (relying on scala.Numeric type class), what I did was an incorporation of rounding suggested in the answer into a new numeric type, which carried the calculated number (floating point in this case) and rounding error, essentially representing a range instead of a point. I then overrode equals so that two numbers are equal if their error ranges overlap (and they share arithmetic, i.e. the number type).
Yes, rounding off binary digits makes more sense than going through BigDecimal and can be implemented very efficiently if you are not worried about being within a small factor of Double.MAX_VALUE.
You can round a floating-point double value x with the following sequence in Java (untested):
double t = 9 * x; // beware: this overflows if x is too close to Double.MAX_VALUE
double y = x - t + t;
After this sequence, y should contain the rounded value. Adjust the distance between the two set bits in the constant 9 in order to adjust the number of bits that are rounded off. The value 3 rounds off one bit. The value 5 rounds off two bits. The value 17 rounds off four bits, and so on.
This sequence of instructions is attributed to Veltkamp and is typically used in “Dekker multiplication”. This page has some references.
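A small sketch of the same sequence wrapped in a method, to show its effect on concrete values. The class name, method name and the parameter s are made up for this example; s is the number of low bits rounded off, so s = 3 corresponds to the constant 9 above. It assumes strict IEEE-754 double arithmetic, which Java 17 and later always use.

```java
class VeltkampRoundingDemo {
    /** Rounds x to (53 - s) significant bits using the constant 2^s + 1. */
    static double roundOffBits(double x, int s) {
        double c = (double) ((1L << s) + 1); // s = 3 gives 9, as in the snippet above
        double t = c * x;                    // overflows if x is near Double.MAX_VALUE
        return x - t + t;                    // evaluated left to right: (x - t) + t
    }

    public static void main(String[] args) {
        // One ulp above 1.0 is rounded back to exactly 1.0.
        double x = Math.nextUp(1.0);
        System.out.println(roundOffBits(x, 3) == 1.0);

        // The three lowest significand bits of the result are cleared.
        long bits = Double.doubleToLongBits(roundOffBits(0.1, 3));
        System.out.println((bits & 7L) == 0L);
    }
}
```

For a test tolerance, two results could be compared after passing both through roundOffBits, although values that straddle a rounding boundary will still compare unequal.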

Loss of precision after subtracting double from double [duplicate]

Possible Duplicate:
Retain precision with Doubles in java
Alright so I've got the following chunk of code:
int rotation = e.getWheelRotation();
if(rotation < 0)
    zoom(zoom + rotation * -.05);
else if(zoom - .05 > 0)
    zoom(zoom - rotation * .05);
System.out.println(zoom);
Now, the zoom variable is of type double, initially set to 1. So, I would expect the results to be like 1 - .05 = .95; .95 - .05 = .9; .9 - .05 = .85; etc. This appears to be not the case though when I print the result as you can see below:
0.95
0.8999999999999999
0.8499999999999999
0.7999999999999998
0.7499999999999998
0.6999999999999997
Hopefully someone is able to clearly explain. I searched the internet and I read it has something to do with some limitations when we're storing floats in binary but I still don't quite understand. A solution to my problem is not shockingly important but I would like to understand this kind of behavior.
Java uses IEEE-754 floating point numbers. They're not perfectly precise. The famous example is:
System.out.println(0.1d + 0.2d);
...which outputs 0.30000000000000004.
What you're seeing is just a symptom of that imprecision. You can improve the precision by using double rather than float.
If you're dealing with financial calculations, you might prefer BigDecimal to float or double.
float and double have limited precision because their fractional part is represented as a sum of powers of 2, e.g. 1/2 + 1/4 + 1/8 ... If you have a number like 1/10, it has to be approximated.
For this reason, whenever you deal with floating point you must use reasonable rounding or you can see small errors.
e.g.
System.out.printf("%.2f%n", zoom);
To minimise rounding errors, you could count the number of rotations instead and divide this int value by 20.0. You won't see a rounding error this way, and it will be faster, with fewer magic numbers.
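A minimal sketch of that idea, comparing repeated subtraction with an integer step counter. The class and method names are invented for the example, and it assumes one wheel notch per event.

```java
class ZoomStepsDemo {
    // Zoom level derived from an integer step count: 20 steps == 1.0.
    static double zoomFor(int steps) {
        return steps / 20.0;
    }

    public static void main(String[] args) {
        double accumulated = 1.0; // the approach from the question
        int steps = 20;           // the counter-based approach

        for (int i = 0; i < 6; i++) {
            accumulated -= 0.05;  // error from each subtraction carries over
            steps -= 1;           // integer arithmetic is exact
            System.out.println(accumulated + " vs " + zoomFor(steps));
        }
    }
}
```

Each value in the second column comes from a single correctly rounded division, so it is the closest double to the intended decimal and prints as 0.95, 0.9, 0.85 and so on.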
float and double have precision issues. I would recommend you take a look at the BigDecimal Class. That should take care of precision issues.
Since decimal numbers (and integer numbers as well) can have an infinite number of possible values, they are impossible to map precisely to bits using a standard format. Computers circumvent this problem by limiting the range the numbers can assume.
For example, an int in Java can represent nothing larger than Integer.MAX_VALUE, or 2^31 - 1.
For decimal numbers, there is also a problem with the numbers after the comma, which also might be infinite. This is solved by not allowing all decimal values, but limiting to a (smartly chosen) number of possibilities, based on powers of 2. This happens automatically but is often nothing to worry about, you can interpret your result of 0.899999 as 0.9. In case you do need explicit precision, you will have to resort to other data types, which might have other limitations.

Weird floor rounding [duplicate]

Possible Duplicate:
Moving decimal places over in a double
I am having this weird problem in Java, I have following code:
double velocity = -0.07;
System.out.println("raw value " + velocity*200 );
System.out.println("floored value " + Math.floor(velocity*200) );
I have following output:
raw value -14.000000000000002
floored value -15.0
That trailing 0002 screws everything up, and BTW there should not be a trailing 2 at all; I think it should be all zeroes after the decimal point. Can I get rid of that 2?
Update: Thanks for the help. Do you know any way to floor a BigDecimal object without calling the doubleValue method?
Because floor(-14.000000000000002) is indeed -15!
You see, floor is defined as the greatest whole number less than or equal to the argument. As -14.000000000000002 is not a whole number, the closest whole number downwards is -15.
Now let's clear up why -0.07 * 200 is not exactly -14. This is because the internal representation of floating-point numbers is in base 2, so fractions whose denominator is not a power of 2 cannot be represented with 100% precision. (In the same way, you cannot represent 1/3 as a decimal fraction with a finite number of decimal places.) So, the value of velocity is not exactly -0.07. (When the compiler sees the constant -0.07, it silently replaces it with a binary fraction which is quite close to -0.07, but not actually equal to it.) This is why velocity * 200 is not exactly -14.
From The Floating-Point Guide:
Why don’t my numbers, like 0.1 + 0.2 add up to a nice round 0.3, and instead I get a weird result like 0.30000000000000004?
Because internally, computers use a format (binary floating-point)
that cannot accurately represent a number like 0.1, 0.2 or 0.3 at all.
When the code is compiled or interpreted, your “0.1” is already
rounded to the nearest number in that format, which results in a small
rounding error even before the calculation happens.
If you need numbers that exactly add up to specific expected values, you cannot use double. Read the linked-to site for details.
Use BigDecimal... The problem above is a well-known rounding problem with the representation schemes used on a computer with finite-memory. The problem is that the answer is repetitive in the binary (that is, base 2) system (i.e. like 1/3 = 0.33333333... with decimal) and cannot be presented correctly. A good example of this is 1/10 = 0.1 which is 0.000110011001100110011001100110011... in binary. After some point the 1s and 0s have to end, causing the perceived error.
Hope you're not working on life-critical stuff... for example http://www.ima.umn.edu/~arnold/disasters/patriot.html. 28 people lost their lives due to a rounding error.
Java doubles follow IEEE 754 floating-point arithmetic, which can't represent every real number with infinite accuracy. This rounding is normal. You can't get rid of it in the internal representation. You can of course use String.format to print the result.
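On the follow-up about flooring a BigDecimal without calling doubleValue: one possible approach is setScale with RoundingMode.FLOOR, sketched below. The class and helper names are made up for the example, and the value is constructed from a String so that it is exactly -0.07.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class BigDecimalFloorDemo {
    // Floor without leaving BigDecimal: drop all decimals, rounding toward negative infinity.
    static BigDecimal floor(BigDecimal value) {
        return value.setScale(0, RoundingMode.FLOOR);
    }

    public static void main(String[] args) {
        BigDecimal velocity = new BigDecimal("-0.07");           // exact decimal value
        BigDecimal raw = velocity.multiply(BigDecimal.valueOf(200));

        System.out.println("raw value " + raw);                  // exactly -14.00
        System.out.println("floored value " + floor(raw));       // -14
        System.out.println(floor(new BigDecimal("-14.5")));      // -15
    }
}
```

Constructing the value with new BigDecimal(-0.07) instead would carry over the binary approximation of the double literal, so the String constructor matters here.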

Weird Java behavior: How come adding doubles with EXACTLY two decimal places result to a double with MORE THAN two decimal places?

If I have an array of doubles that each have EXACTLY two decimal places, add them up altogether via a loop, and print out the total, what comes out is a number with MORE THAN two decimal places. Which is weird, because theoretically, adding two numbers that each have 2 and only 2 decimal places will NEVER produce a number that has a non-zero digit beyond the hundredths place.
Try executing this code:
double[] d = new double[2000];
for (int i = 0; i < d.length; i++) {
    d[i] = 9.99;
}
double total = 0.00;
for (int i = 0; i < d.length; i++) {
    total += d[i];
    if (("" + total).matches("[0-9]+\\.[0-9]{3,}")) { // if there are 3 or more decimal places in the total
        System.out.println("total: " + total + ", " + i); // print the total and the iteration when it occurred
    }
}
In my computer, this prints out:
total: 59.940000000000005, 5
If I round off the total to two decimal places then I'd get the same number as I would if I manually added 9.99 six times on a calculator. But how come this is happening and where are the extra decimal places coming from? Am I doing something wrong or (I doubt this is likely) is this a Java bug?
Are you familiar with base 10 to base 2 conversion (decimal to binary) for fractions? If not, look it up.
Then you'll see that although 9.99 looks pretty normal in base 10, it doesn't really look that nice in binary; it looks like a repeating decimal, but in binary. I'm sure you've seen a repeating decimal before, right? It doesn't end. But Java (or any language for that matter) has to save that infinite sequence of digits into a limited number of bytes. And that's when the extra digits appear. When you convert that truncated binary back to decimal, you're really dealing with a different number. The number stored in the variable isn't 9.99 exactly, it's something like 9.9999999991 (just an example, I didn't work out the math).
But you're probably interested in how to solve this, right? Look up the BigDecimal class. That's what you want to use for your calculations, especially when dealing with currency. Also, look up DecimalFormat, which is a class for writing a number as a properly formatted string. I think it does rounding for you when you want to show only 2 decimal digits and your number has a lot more, for example.
If I have an array of doubles that each have EXACTLY two decimal places
Let's stop right there, because I suspect you don't. For example, you give 9.99 in your sample code. That isn't really 9.99. That's "the closest double to 9.99" as 9.99 itself can't be exactly represented in binary floating point.
At that point, the rest of your reasoning goes out of the window.
If you want values with an exact number of decimal digits, you should use a type which stores values in a decimal-centric manner, such as BigDecimal. Alternatively, store everything as integers and "know" that you're actually remembering "the value * 100" instead.
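Both suggestions can be sketched as follows, using the 2000 values of 9.99 from the question. The class and method names are invented for the example, and the integer variant assumes amounts are held as whole hundredths.

```java
import java.math.BigDecimal;

class ExactTotalsDemo {
    // Decimal-centric sum: the price is parsed from a String, so it is exactly 9.99.
    static BigDecimal sumDecimal(String price, int count) {
        BigDecimal unit = new BigDecimal(price);
        BigDecimal total = BigDecimal.ZERO;
        for (int i = 0; i < count; i++) {
            total = total.add(unit);
        }
        return total;
    }

    // Integer sum: every amount is stored as "value * 100".
    static long sumHundredths(long priceInHundredths, int count) {
        long total = 0;
        for (int i = 0; i < count; i++) {
            total += priceInHundredths;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumDecimal("9.99", 2000));   // 19980.00
        long hundredths = sumHundredths(999, 2000);     // 1998000
        System.out.println(hundredths / 100 + "." + String.format("%02d", hundredths % 100));
    }
}
```

In both variants every intermediate total has exactly two decimal places, which is the property the question expected from double.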
Doubles are represented in a binary format on the computer. This means that certain numbers cannot be represented accurately, so the computer will use the closest number that can be represented.
E.g. 10.5 = 2^3+2+2^(-1) = 1.0101 * 2^3 (here the mantissa is in binary)
but 10.1 = 2^3+2+2^(-4)+2^(-5)+(infinite series here) = 1.0100001... * 2^3
9.99 is such a number with an infinite representation. Thus when you add them together, the finite representation used by the computer is used in the calculation, and the result will be even further away from the mathematical sum than the originals were from their true values. This is why you see more digits displayed than were used in the original numbers.
This is because of floating-point arithmetic.
Doubles and floats are not exactly real numbers: there is a finite number of bits to represent them, while there are infinitely many real numbers [in any range], so you cannot represent all real numbers. You get the closest number available in the floating-point representation.
Whenever you deal with floating points - remember that they are only an approximation to the number you are seeking. You might want to use BigDecimal if you want the exact number [or at least control the error].
More info can be found at this article
Use BigDecimal to perform floating point calculations with precision. It's a must when it comes to money.
This is a known issue that stems in the fact that binary calculations don't allow for precise floating point operations. Look at "floating point arithmetics" for more details.
This is due to inaccuracies when it comes to representing decimal numbers using a binary floating point value. In other words, the double literal 9.99 does not actually represent the mathematical value 9.99.
To reveal exactly what number a value, such as 9.99 represents you could let BigDecimal print the value.
Code to reveal the exact value:
System.out.println(new BigDecimal(9.99));
Output:
9.9900000000000002131628207280300557613372802734375
Btw, your reasoning would be completely accurate if you were talking about binary places instead of decimal places, since a number with two binary places can be exactly represented by a binary floating point value.
