I have the following statement:
float diff = tempVal - m_constraint.getMinVal();
tempVal is declared as a float and the getMinVal() returns a float value.
I have the following print out:
diff=0.099999905, tempVal=5.1, m_constraint.getMinVal()=5.0
I expected diff to be 0.1, not the number above. How can I get that?
Floats use the IEEE 754 format to represent numbers, and that format is subject to rounding errors.
Floating point guide
What Every Computer Scientist Should Know About Floating-Point Arithmetic
Wikipedia on IEEE 754
Bottom line: if you are doing arithmetic that needs to be exact, don't use float or double; use BigDecimal.
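As a minimal sketch of that advice, the subtraction from the question can be done both ways. The class and method names here are made up for illustration, and the values 5.1 and 5.0 are taken from the printout in the question:

```java
import java.math.BigDecimal;

class FloatDiffDemo {
    // Binary float subtraction: 5.1f is not exactly representable,
    // so the representation error shows up in the result.
    static float floatDiff() {
        float tempVal = 5.1f;
        float minVal = 5.0f;
        return tempVal - minVal;
    }

    // Decimal subtraction on values built from strings, so 5.1 is stored exactly.
    static BigDecimal decimalDiff() {
        return new BigDecimal("5.1").subtract(new BigDecimal("5.0"));
    }

    public static void main(String[] args) {
        System.out.println(floatDiff());   // the question reports 0.099999905
        System.out.println(decimalDiff()); // 0.1
    }
}
```

Note that the BigDecimal values are built from strings; building them from a float or double would carry the binary rounding error along.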
Because of the way they store values internally, floats and doubles can only store completely accurately numbers which can be decomposed into a sum of powers of 2 (and then, within certain constraints relating to their absolute and relative magnitude).
So as soon as you attempt to store, or perform a calculation involving, a number which cannot be stored exactly, you are going to get an error in the final digit.
Usually this isn't a problem provided you use floats and doubles with some precaution:
use a size of floating point primitive which has "spare" digits of precision beyond what you need;
for many applications, this probably means don't use float at all (use double instead): it has very poor precision and, with the exception of division, has no performance benefit on many processors;
when printing FP numbers, only actually print and consider the number of digits of precision that you need, and certainly don't include the final digit (use String.format to help you);
if you need arbitrary number of digits of precision, use BigDecimal instead.
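The String.format suggestion in the list above could look roughly like this. It is only a sketch: the helper name twoDecimals is hypothetical, and Locale.ROOT is an assumption made so that the decimal separator is always a dot:

```java
import java.util.Locale;

class FormatDemo {
    // Print only the digits you actually need; the formatter rounds for you.
    static String twoDecimals(float value) {
        return String.format(Locale.ROOT, "%.2f", value);
    }

    public static void main(String[] args) {
        float diff = 5.1f - 5.0f;              // slightly less than 0.1
        System.out.println(twoDecimals(diff)); // 0.10
    }
}
```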
You cannot get exact results with floating point numbers. You might need to use a FixedPoint library for that. See : http://sourceforge.net/projects/jmfp/
Java encodes real numbers using binary floating point representations defined in IEEE 754. Like all finite representations it cannot accurately represent all real numbers because there are far more real numbers than potential representations. Numbers which cannot be represented exactly (like 0.1 in your case) are rounded to the nearest representable number.
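One way to see the rounding is to ask BigDecimal for the exact value a double holds, since the BigDecimal(double) constructor converts the binary value without further rounding. This is a small illustration with made-up names, not part of the original answer:

```java
import java.math.BigDecimal;

class ExactValueDemo {
    // new BigDecimal(double) expands the stored binary value exactly.
    static BigDecimal exact(double d) {
        return new BigDecimal(d);
    }

    public static void main(String[] args) {
        System.out.println(exact(0.5));  // 0.5  (1/2, a power of two)
        System.out.println(exact(0.75)); // 0.75 (1/2 + 1/4)
        // 0.1 is rounded to the nearest representable double:
        System.out.println(exact(0.1));
        // 0.1000000000000000055511151231257827021181583404541015625
    }
}
```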
I've written an arbitrary precision rational number class that needs to provide a way to convert to floating-point. This can be done straightforwardly via BigDecimal:
return new BigDecimal(num).divide(new BigDecimal(den), 17, RoundingMode.HALF_EVEN).doubleValue();
but this requires a value for the scale parameter when dividing the decimal numbers. I picked 17 as the initial guess because that is approximately the precision of a double precision floating point number, but I don't know whether that's actually correct.
What would be the correct number to use, defined as, the smallest number such that making it any larger would not make the answer any more accurate?
Introduction
No finite precision suffices.
The problem posed in the question is equivalent to:
What precision p guarantees that converting any rational number x to p decimal digits and then to floating-point yields the floating-point number nearest x (or, in case of a tie, either of the two nearest x)?
To see this is equivalent, observe that the BigDecimal divide shown in the question returns num/div to a selected number of decimal places. The question then asks whether increasing that number of decimal places could increase the accuracy of the result. Clearly, if there is a floating-point number nearer x than the result, then the accuracy could be improved. Thus, we are asking how many decimal places are needed to guarantee the closest floating-point number (or one of the tied two) is obtained.
Since BigDecimal offers a choice of rounding methods, I will consider whether any of them suffices. For the conversion to floating-point, I presume round-to-nearest-ties-to-even is used (which BigDecimal appears to use when converting to Double or Float). I give a proof using the IEEE-754 binary64 format, which Java uses for Double, but the proof applies to any binary floating-point format by changing the 2^52 used below to 2^(w−1), where w is the number of bits in the significand.
Proof
One of the parameters to a BigDecimal division is the rounding method. Java’s BigDecimal has several rounding methods. We only need to consider three, ROUND_UP, ROUND_HALF_UP, and ROUND_HALF_EVEN. Arguments for the others are analogous to those below, by using various symmetries.
In the following, suppose we convert to decimal using any large precision p. That is, p is the number of decimal digits in the result of the conversion.
Let m be the rational number 2^52+1+½−10^(−p). The two binary64 numbers neighboring m are 2^52+1 and 2^52+2. m is closer to the first one, so that is the result we require from converting m first to decimal and then to floating-point.
In decimal, m is 4503599627370497.4999…, where there are p−1 trailing 9s. When rounded to p significant digits with ROUND_UP, ROUND_HALF_UP, or ROUND_HALF_EVEN, the result is 4503599627370497.5 = 2^52+1+½. (Recognize that, at the position where rounding occurs, there are 16 trailing 9s being discarded, effectively a fraction of .9999999999999999 relative to the rounding position. In ROUND_UP, any non-zero discarded amount causes rounding up. In ROUND_HALF_UP and ROUND_HALF_EVEN, a discarded amount greater than ½ at that position causes rounding up.)
2^52+1+½ is equally close to the neighboring binary64 numbers 2^52+1 and 2^52+2, so the round-to-nearest-ties-to-even method produces 2^52+2.
Thus, the result is 2^52+2, which is not the binary64 value closest to m.
Therefore, no finite precision p suffices to round all rational numbers correctly.
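The double rounding can be reproduced with a small sketch. As an assumption to match the scale of 17 decimal places used in the question (rather than p significant digits), the counterexample here is m = 2^52+1+½−10^(−18), one decimal place finer than the scale. The class and method names are illustrative only:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class DoubleRoundingDemo {
    // m = 4503599627370497.5 - 10^-18 = 4503599627370497.499999999999999999
    static final BigDecimal M =
            new BigDecimal("4503599627370497.5").subtract(BigDecimal.ONE.movePointLeft(18));

    // Round to 17 decimal places first (as the division in the question does),
    // then convert to double: the decimal rounding lands exactly on the tie.
    static double viaScale17() {
        return M.setScale(17, RoundingMode.HALF_EVEN).doubleValue();
    }

    // Convert the exact value in a single rounding step.
    static double direct() {
        return M.doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(viaScale17()); // 4.503599627370498E15 (2^52+2)
        System.out.println(direct());     // 4.503599627370497E15 (2^52+1, the nearest double)
    }
}
```

Raising the scale only moves the problem: the same construction with 10^(−(s+1)) defeats any scale s.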
Our teacher asked us to search about this and what I kept on getting from the net are explanations stating what double and float means.
Can you tell me whether it is possible or not, and explain why or why not?
Simple answer: yes, but only if the double is not too large.
floats are single-precision floating point numbers, meaning they use a 23-bit mantissa and 8-bit exponent, corresponding to ~6-7 significant figures of precision and a range of ~10^38.
doubles are double-precision, with a 52-bit mantissa and 11-bit exponent, corresponding to ~15-16 significant figures of precision and a range of ~10^308.
Since doubles have a larger range than floats, adding a float to a very large double can leave the double unchanged, because the float is smaller than the spacing between adjacent doubles at that magnitude (this is usually called absorption or loss of significance rather than underflow). Of course this can happen with two doubles as well.
https://en.wikipedia.org/wiki/Floating_point
Can you add two numbers with varying decimal places (e.g. 432.54385789364 + 432.1)? Yes you can.
In Java, it is the same idea.
From the Java Tutorials:
float: The float data type is a single-precision 32-bit IEEE 754 floating point. Its range of values is beyond the scope of this discussion, but is specified in the Floating-Point Types, Formats, and Values section of the Java Language Specification. As with the recommendations for byte and short, use a float (instead of double) if you need to save memory in large arrays of floating point numbers. This data type should never be used for precise values, such as currency. For that, you will need to use the java.math.BigDecimal class instead. Numbers and Strings covers BigDecimal and other useful classes provided by the Java platform.
double: The double data type is a double-precision 64-bit IEEE 754 floating point. Its range of values is beyond the scope of this discussion, but is specified in the Floating-Point Types, Formats, and Values section of the Java Language Specification. For decimal values, this data type is generally the default choice. As mentioned above, this data type should never be used for precise values, such as currency.
Basically, they are both holders for decimals. The way that they differ is how precise they can be. A float is 32 bits in size, compared to a double which is 64 bits in size. A float has about 6 to 7 significant decimal digits of precision, and a double has about 15 to 16.
Basically... a double can store a decimal better than a float... but takes up more space.
To answer your question, you can add a float to a double and vice versa. Generally, the result will be made into a double, and you will have to cast it back to a float if that is what you want.
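A short sketch of that promotion, with hypothetical names, might look like this. It also shows a small float being absorbed by a very large double, where the sum is simply the large value again:

```java
class MixedAddDemo {
    // The float operand is widened to double before the addition,
    // so the result type is double.
    static double add(float f, double d) {
        return f + d;
    }

    public static void main(String[] args) {
        double sum = add(1.5f, 2.25);
        System.out.println(sum);         // 3.75

        float back = (float) sum;        // an explicit cast is needed to go back to float
        System.out.println(back);        // 3.75

        double big = 1e20;               // adjacent doubles here are thousands apart
        System.out.println(add(1.0f, big) == big); // true: the 1.0 is lost entirely
    }
}
```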
If you want to be really deep about it, you should say yes, it is possible due to value coercion, but that it opens the door for more severe precision errors to accumulate invisibly to the compiler. float has substantially less precision than double. In Java source, a floating-point literal without a suffix is already a double; the f suffix makes it a float, and the optional d suffix makes the double type explicit. In practice, prefer double literals and variables if you have to use floating point.
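To make the precision gap concrete, here is a small sketch (the names are made up) comparing the same decimal written with an explicit f suffix and an explicit d suffix:

```java
class LiteralSuffixDemo {
    // A float literal, silently widened to double on return.
    static double widenedFloat() {
        return 0.1f;
    }

    // The same decimal written as a double literal.
    static double doubleLiteral() {
        return 0.1d;
    }

    public static void main(String[] args) {
        System.out.println(widenedFloat());   // 0.10000000149011612
        System.out.println(doubleLiteral());  // 0.1
        System.out.println(widenedFloat() == doubleLiteral()); // false
    }
}
```

The widening itself is exact; the error was already baked into the float before it was coerced, which is why the compiler has nothing to warn about.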
These precision errors can lead to serious harm and even loss of life in sensitive systems.
Floating point is very hard to use correctly and should be avoided if possible. One extremely obvious thing not to do that is commonly mistakenly done is representing currency as a float or double. This can cause real money to be effectively given to or stolen from people.
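A tiny illustration of the currency problem, assuming a hypothetical price of 0.10 added ten times. This is a sketch and not a full money type:

```java
import java.math.BigDecimal;

class CurrencyDemo {
    // Accumulate a price of 0.10 in binary floating point.
    static double sumAsDouble(int count) {
        double total = 0.0;
        for (int i = 0; i < count; i++) {
            total += 0.10;
        }
        return total;
    }

    // Accumulate the same price as an exact decimal.
    static BigDecimal sumAsDecimal(int count) {
        BigDecimal price = new BigDecimal("0.10");
        BigDecimal total = BigDecimal.ZERO;
        for (int i = 0; i < count; i++) {
            total = total.add(price);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumAsDouble(10));  // 0.9999999999999999
        System.out.println(sumAsDecimal(10)); // 1.00
    }
}
```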
Floating point (preferring double) is appropriate for approximate calculations and certain high performance scientific computing applications. However it is still extremely important to be aware of the precision loss characteristics particularly when a resulting floating point value is fed into further floating-point calculations.
This leads more generally into numerical computing, and now I've really gone afield :)
SAS has a decent paper on this:
http://support.sas.com/resources/papers/proceedings11/275-2011.pdf
While running kmeans clustering in java the absolute difference between the data points 0.33 and 0.99 is displayed as 0.659999999 instead of 0.66.
Why is that?
Both the variables holding the data are of type double and I am using the Math.abs() function.
I saw such a problem only for 0.99. When subtracting using other values, the results were fine.
Thanks for any help
Values of the floating-point datatypes (float and double) often can't be represented exactly in a finite number of bits. They are stored as approximations.
Squeezing infinitely many real numbers into a finite number of bits
requires an approximate representation. Although there are infinitely
many integers, in most programs the result of integer computations can
be stored in 32 bits. In contrast, given any fixed number of bits,
most calculations with real numbers will produce quantities that
cannot be exactly represented using that many bits. Therefore the
result of a floating-point calculation must often be rounded in order
to fit back into its finite representation. This rounding error is the
characteristic feature of floating-point computation
What Every Computer Scientist Should Know About Floating-Point Arithmetic
This is how floating point numbers behave. They are not accurate.
Check this:- What Every Computer Scientist Should Know About Floating-Point Arithmetic
Also, floating-point numbers use binary fractions, not decimal fractions. If you need exact decimal values, you should use java.math.BigDecimal
You may check this answer as well for more reasoning and details:
Floating point rounding errors. 0.1 cannot be represented as
accurately in base-2 as in base-10 due to the missing prime factor of
5.
Doubles are not exact due to the way they are stored in memory. More information here:
https://en.wikipedia.org/wiki/Double-precision_floating-point_format
If you need an exact result, you should look into BigDecimal
I have a String that is formatted correctly to be parsed as a double, and it works fine for most decimals. The issue is that for .33, .67, and possibly others I haven't tested, the decimal becomes something like .6700000000002 or .329999999998. I understand why this happens, but does anyone have a suggestion to fix it?
It's a result of IEEE-754 rounding rules: some numbers cannot be represented precisely in binary floating point. For example, 1/10 is not precisely representable.
You can add more precision (but not infinite) by using BigDecimal.
BigDecimal oneTenth = new BigDecimal("1").divide(new BigDecimal("10"));
System.out.println(oneTenth);
Which outputs 0.1
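For the String case in the question, a sketch along these lines may help. The names are hypothetical, and the scale of 10 is an arbitrary choice for illustration. It also shows that divide without a scale throws for quotients such as 1/3 that have no finite decimal expansion:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class DivideDemo {
    // Parse the decimal text directly, without going through double.
    static BigDecimal parse(String text) {
        return new BigDecimal(text);
    }

    // 1/3 has no finite decimal expansion, so a scale and rounding mode are required.
    static BigDecimal oneThird() {
        return BigDecimal.ONE.divide(new BigDecimal("3"), 10, RoundingMode.HALF_EVEN);
    }

    // Without a scale, divide throws ArithmeticException for non-terminating quotients.
    static boolean exactDivideFails() {
        try {
            BigDecimal.ONE.divide(new BigDecimal("3"));
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse(".67"));       // 0.67
        System.out.println(oneThird());         // 0.3333333333
        System.out.println(exactDivideFails()); // true
    }
}
```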
Some decimal numbers can not be represented accurately with the internal base 2 machine representation.
That's double precision for you. Binary numbers and decimals don't work well together. Unless you are doing something really precise it should be fine; if you are printing it, you should use either DecimalFormat or printf.
Floating point values are not stored as decimal digits but as a binary significand multiplied by a power of two. You may write 3.1233453456356 as a number, but what is stored is the closest value of that form, which can differ slightly from what you wrote.
It shouldn't be a problem unless you're testing for equality. With floating-point tests for equality you'll need to allow a "delta" so that:
if (a == b)
becomes
if (Math.abs(a - b) < 0.000001)
or a similar small delta value. For printing, limit it to two decimal places and the formatter will round it for you.
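Wrapped into a helper, the comparison might look like the sketch below. The name nearlyEqual and the delta are illustrative choices, and a fixed absolute delta like this only suits values of a known, modest magnitude:

```java
class NearlyEqualDemo {
    // Treat two doubles as equal when they differ by less than delta.
    static boolean nearlyEqual(double a, double b, double delta) {
        return Math.abs(a - b) < delta;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);                   // false
        System.out.println(nearlyEqual(sum, 0.3, 1e-6));  // true
    }
}
```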
In java I am using float to store the numbers. I chose the float format as I am working both with integers and double numbers, where the numbers are different, there can be big integers or big double numbers with different number of decimals. But when I insert these numbers into database, the wrong number is stored. For example:
float value = 0f;
value = 67522665;
System.out.println(value);
Printed: 6.7522664E7 and it is stored in the database as 67522664 not as 67522665
Floating point numbers have limited resolution — roughly 7 significant digits. You are seeing round-off error. You can use a double for more resolution or, for exact arithmetic, use BigDecimal.
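The value from the question can be pushed through each type to see where the digit is lost. This is a sketch with made-up method names:

```java
import java.math.BigDecimal;

class ResolutionDemo {
    // Round-trip an int through float: only 24 significant bits survive.
    static long throughFloat(int n) {
        return (long) (float) n;
    }

    // Round-trip through double: 53 significant bits, enough for any int.
    static long throughDouble(int n) {
        return (long) (double) n;
    }

    public static void main(String[] args) {
        System.out.println(throughFloat(67522665));      // 67522664
        System.out.println(throughDouble(67522665));     // 67522665
        System.out.println(new BigDecimal("67522665"));  // 67522665
    }
}
```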
Suggested reading: What Every Computer Scientist Should Know About Floating-Point Arithmetic
Doubles and floats have storage issues.
How is floating point stored?
"The float and double types are designed primarily for scientific and engineering
calculations. They perform binary floating-point arithmetic, which was carefully
designed to furnish accurate approximations quickly over a broad range of magnitudes.
They do not, however, provide exact results and should not be used where
exact results are required."
Don't use float. Use BigDecimal instead. And in my experience with databases, they return their NUMBER-typed elements as BigDecimal. When I fetch them using JDBC, they are BigDecimal objects.
As far as I understand it, this is about the gap size (or ULP, unit in the last place) in the binary representation, that is, the spacing between adjacent floating-point values.
This value is equal to:
2^(e+1-p)
where e is the actual exponent of the number and p is the precision.
Note that the spacing (or gap) increases as the value of the represented number increases:
In IEEE-754 single precision (Java's float), the precision is p = 24, so when e >= 23 the spacing is at least 1 and only integers can be represented.
2^23 = 8388608 --> 8388608 actually stored IEEE-754
8388608.2 --> 8388608 actually stored IEEE-754
Things get worse as numbers get bigger. For example:
164415560 --> 164415552 actually stored IEEE-754
Ref: The Spacing of Binary Floating-Point Numbers
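In Java the gap can be queried with Math.ulp, which gives a quick way to check the figures above. The wrapper name gap is just for illustration:

```java
class UlpDemo {
    // Spacing between x and the next float of larger magnitude.
    static float gap(float x) {
        return Math.ulp(x);
    }

    public static void main(String[] args) {
        System.out.println(gap(1.0f));            // 1.1920929E-7 (2^-23)
        System.out.println(gap(8388608f));        // 1.0  (e = 23)
        System.out.println(gap(164415560f));      // 16.0 (e = 27)
        System.out.println((long) 164415560f);    // 164415552
    }
}
```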