I have been experimenting with the float and double types. In Java, System.out.print(1-.6) prints 0.4, but System.out.print(1-.7) prints the somewhat unexpected 0.30000000000000004. It would be helpful if anyone could direct me to some resources that explain WHY this happens. I am assuming it's not Java specific and is something inherent to these types.
Thanks!
The real types in Java are implementations of the IEEE 754 single and double precision floating point formats. These are approximations of real numbers rather than exact representations. Some real numbers, like 0.8, cannot be represented exactly.
As Vincent said, the float and double types cannot exactly store values that cannot be written as a finite sum of powers of two such as 2^-n (how many terms fit depends on the size of the format's mantissa).
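As a rough illustration of the point above, the BigDecimal(double) constructor converts the stored binary value exactly, so printing it shows what a double really holds. The class and method names here are made up for the example.

```java
import java.math.BigDecimal;

final class ExactValueDemo {
    // Returns the exact decimal expansion of the binary value stored in x.
    static String exact(double x) {
        return new BigDecimal(x).toString();
    }

    public static void main(String[] args) {
        // 0.75 = 2^-1 + 2^-2, so it is stored exactly.
        System.out.println(exact(0.75)); // 0.75
        // 0.1 has no finite binary expansion, so the nearest double is stored instead.
        System.out.println(exact(0.1));  // 0.1000000000000000055511151231257827021181583404541015625
    }
}
```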
Use the BigDecimal class instead.
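A minimal sketch of what that could look like for the subtraction in the question. The helper name exactDifference is hypothetical; the important detail is building the BigDecimal values from strings, since new BigDecimal(0.7) would carry over the double's binary error.

```java
import java.math.BigDecimal;

final class BigDecimalSubtractDemo {
    // Subtracts two decimal numbers given as strings, using decimal arithmetic.
    static BigDecimal exactDifference(String a, String b) {
        return new BigDecimal(a).subtract(new BigDecimal(b));
    }

    public static void main(String[] args) {
        System.out.println(1 - 0.7);                     // 0.30000000000000004
        System.out.println(exactDifference("1", "0.7")); // 0.3
    }
}
```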
Our teacher asked us to search about this and what I kept on getting from the net are explanations stating what double and float means.
Can you tell me whether it is possible or not, and explain why or why not?
Simple answer: yes, but only if the double is not too large.
floats are single-precision floating point numbers, meaning they use a 23-bit mantissa and 8-bit exponent, corresponding to ~6/7 s.f. precision and ~10^38 range.
doubles are double-precision, with a 52-bit mantissa and 11-bit exponent, corresponding to ~15/16 s.f. precision and ~10^308 range.
Since doubles can hold much larger magnitudes than floats, adding a float to a very large double can leave the double unchanged: the float's contribution falls below the double's precision at that magnitude and is rounded away (often called absorption; it is not underflow, which means a result too close to zero to represent). Of course this can happen for two double types as well.
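A small sketch of that effect, with a made-up helper name. Near 1e17 adjacent doubles are 16 apart, so adding 1.0f cannot change the value.

```java
final class AbsorptionDemo {
    // True when adding small to big leaves big unchanged after rounding.
    static boolean isAbsorbed(double big, float small) {
        // The float is widened to double, then the sum is rounded to the nearest double.
        return big + small == big;
    }

    public static void main(String[] args) {
        System.out.println(Math.ulp(1e17));         // 16.0 (gap to the next double)
        System.out.println(isAbsorbed(1e17, 1.0f)); // true
        System.out.println(isAbsorbed(1e3, 1.0f));  // false
    }
}
```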
https://en.wikipedia.org/wiki/Floating_point
Can you add two numbers with varying decimal places (e.g. 432.54385789364 + 432.1)? Yes you can.
In Java, it is the same idea.
From the Java Tutorials:
float: The float data type is a single-precision 32-bit IEEE 754 floating point. Its range of values is beyond the scope of this discussion, but is specified in the Floating-Point Types, Formats, and Values section of the Java Language Specification. As with the recommendations for byte and short, use a float (instead of double) if you need to save memory in large arrays of floating point numbers. This data type should never be used for precise values, such as currency. For that, you will need to use the java.math.BigDecimal class instead. Numbers and Strings covers BigDecimal and other useful classes provided by the Java platform.
double: The double data type is a double-precision 64-bit IEEE 754 floating point. Its range of values is beyond the scope of this discussion, but is specified in the Floating-Point Types, Formats, and Values section of the Java Language Specification. For decimal values, this data type is generally the default choice. As mentioned above, this data type should never be used for precise values, such as currency.
Basically, they both hold decimal (fractional) numbers. They differ in how precise they can be. A float is 32 bits in size, compared to a double, which is 64 bits. A float has about 6 or 7 significant decimal digits of precision, and a double has about 15 or 16.
Basically... a double can store a decimal better than a float... but takes up more space.
To answer your question, you can add a float to a double and vice versa. Generally, the result will be made into a double, and you will have to cast it back to a float if that is what you want.
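For instance, a short sketch (names are illustrative only) showing that the sum is a double and that going back to float needs an explicit cast:

```java
final class MixedAddDemo {
    // f is widened to double before the addition, so the result type is double.
    static double add(float f, double d) {
        return f + d;
    }

    public static void main(String[] args) {
        double sum = add(1.5f, 2.25);
        // float narrowed = sum;     // would not compile: possible lossy conversion
        float narrowed = (float) sum; // explicit cast back to float
        System.out.println(sum);      // 3.75
        System.out.println(narrowed); // 3.75
    }
}
```

The values here were chosen to be exactly representable in both types; with other values the cast back to float may round.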
If you want to be really deep about it, you should say yes, it is possible due to value coercion (the float is widened to double), but that it opens the door for more severe precision errors to accumulate invisibly to the compiler. float has substantially less precision than double. In Java source, a floating-point literal without a suffix is already a double; it is a float only with the f suffix, so avoid that suffix (or add d to be explicit) if you have to use floating point.
These precision errors can lead to serious harm and even loss of life in sensitive systems.
Floating point is very hard to use correctly and should be avoided if possible. One extremely obvious thing not to do that is commonly mistakenly done is representing currency as a float or double. This can cause real money to be effectively given to or stolen from people.
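To make the currency point concrete, here is a hedged sketch that adds ten cents ten times, once with double and once with BigDecimal. The class and method names are invented for the example; storing amounts as a long count of cents would be another common option.

```java
import java.math.BigDecimal;

final class CurrencyDemo {
    // Adds 0.10 repeatedly using binary floating point.
    static double sumAsDouble(int times) {
        double total = 0.0;
        for (int i = 0; i < times; i++) {
            total += 0.10;
        }
        return total;
    }

    // Adds 0.10 repeatedly using decimal arithmetic.
    static BigDecimal sumAsDecimal(int times) {
        BigDecimal dime = new BigDecimal("0.10");
        BigDecimal total = BigDecimal.ZERO;
        for (int i = 0; i < times; i++) {
            total = total.add(dime);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumAsDouble(10));  // 0.9999999999999999
        System.out.println(sumAsDecimal(10)); // 1.00
    }
}
```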
Floating point (preferring double) is appropriate for approximate calculations and certain high performance scientific computing applications. However it is still extremely important to be aware of the precision loss characteristics particularly when a resulting floating point value is fed into further floating-point calculations.
This leads more generally into numerical computing, and now I've really gone afield :)
SAS has a decent paper on this:
http://support.sas.com/resources/papers/proceedings11/275-2011.pdf
I am executing the following code in Java, but I got two different answers for what should mathematically be the same number.
public class TestClass {
    public static void main(String[] args) {
        double a=0.01;
        double b=4.5;
        double c=789;
        System.out.println("Value1---->"+(a*b*c));
        System.out.println("Value2---->"+(b*c*a));
    }
}
Output:
Value1---->35.504999999999995
Value2---->35.505
Floating point numbers have limited precision. Some fractions cannot be represented exactly as floating point numbers, which is why rounding errors can occur.
The results are different because of the order in which the calculations are evaluated. Each of your calculations consists of two multiplications. The multiply * operator in Java has left-to-right associativity. That means that in (a*b*c), a*b is calculated first and then multiplied by c, as in ((a*b)*c). One of those calculation chains happens to produce a rounding error because a number in it simply can't be represented exactly as a floating point number.
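The grouping can be made explicit with parentheses. The following is only a sketch using the question's values, with an invented helper name; it shows that a*b*c is evaluated as (a*b)*c and that the two orderings go through different intermediate values.

```java
final class EvaluationOrderDemo {
    // Written without parentheses, so Java groups it as (x * y) * z.
    static double leftToRight(double x, double y, double z) {
        return x * y * z;
    }

    public static void main(String[] args) {
        double a = 0.01, b = 4.5, c = 789;
        // Same grouping, so always the same result.
        System.out.println(leftToRight(a, b, c) == (a * b) * c); // true
        // The two chains round different intermediates: a*b versus b*c.
        System.out.println(a * b);
        System.out.println(b * c); // 3550.5 (exactly representable)
        // The question reports 35.504999999999995 and 35.505 for these two lines.
        System.out.println(leftToRight(a, b, c));
        System.out.println(leftToRight(b, c, a));
    }
}
```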
Essentially, Java uses binary floating point values to handle all of its decimal based operations. As mentioned in another answer, IEEE 754 is the standard behind the issue you've encountered. Also, as mentioned in Joshua Bloch's Effective Java, refer to Item 48, "Avoid float and double if exact answers are required":
In summary, don’t use float or double for any calculations that require an
exact answer. Use BigDecimal if you want the system to keep track of the decimal
point and you don’t mind the inconvenience and cost of not using a primitive type.
It is because type double is an approximation.
double in Java corresponds to the IEEE 754 standard type binary64 (not decimal64).
To resolve this problem, use Math.round() or the BigDecimal class.
Multiplication of floating points uses a process that introduces precision errors.
To quote Wikipedia:
"To multiply, the significands are multiplied while the exponents are added, and the result is rounded and normalized."
Java multiplies from left to right. In your example, the first parts (a * b and b * c) actually produce no precision errors.
So your final multiplications end up as:
System.out.println("Value1---->" + (0.045 * 789));
System.out.println("Value2---->" + (3550.5 * 0.01));
Now, 0.045 * 789 produces a precision error due to that floating point multiplication process. Whereas 3550.5 * 0.01 does not.
Because double * double yields a double, and that is not perfectly precise.
Try the following code:
System.out.println(1.0-0.9-0.1); // -2.7755575615628914E-17
If you want totally precise real numbers, use BigDecimal instead!
This is because double has finite precision. A binary representation cannot store exactly a value such as 0.01. See also the Wikipedia entry on double-precision floating point numbers.
Order of multiplication can change the way that representation errors are accumulated.
Consider using BigDecimal class, if you need precision.
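As a sketch of that suggestion applied to the question's numbers (the helper name product is hypothetical), decimal multiplication gives the same result in either order:

```java
import java.math.BigDecimal;

final class DecimalProductDemo {
    // Multiplies three decimal numbers, given as strings, left to right.
    static BigDecimal product(String x, String y, String z) {
        return new BigDecimal(x).multiply(new BigDecimal(y)).multiply(new BigDecimal(z));
    }

    public static void main(String[] args) {
        System.out.println(product("0.01", "4.5", "789")); // 35.505
        System.out.println(product("4.5", "789", "0.01")); // 35.505
    }
}
```

Multiplication never needs rounding in BigDecimal; division is different and usually requires an explicit scale or MathContext.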
According to the Javadoc, double is a floating point type and is imprecise by nature. That is why two mathematically identical operations can yield different results: the floating point type (double) is an approximation.
See http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html :
double: The double data type is a double-precision 64-bit IEEE 754
floating point. Its range of values is beyond the scope of this
discussion, but is specified in the Floating-Point Types, Formats, and
Values section of the Java Language Specification. For decimal values,
this data type is generally the default choice. As mentioned above,
this data type should never be used for precise values, such as
currency.
See also the wikipedia http://en.wikipedia.org/wiki/Floating_point :
The floating-point representation is by far the most common way of
representing in computers an approximation to real numbers.
http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
Quote: "double: The double data type is a double-precision 64-bit IEEE 754 floating point."
When you dig into IEEE 754 you will understand how doubles are stored in memory.
For such calculations I would recommend http://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html
See this reply from 2011
Java:Why should we use BigDecimal instead of Double in the real world?
It's called loss of precision and is very noticeable when working with either very big numbers or very small numbers.
See the section
Decimal numbers are approximations
And read down
As mentioned, there is an issue with the floating precision. You can either use printf or you can use Math.round() like so (change the number of zeros to affect precision):
System.out.println("Value 1 ----> " + (double) Math.round((a*b*c) * 100000) / 100000);
System.out.println("Value 2 ----> " + (double) Math.round((b*c*a) * 100000) / 100000);
Output
Value 1 ----> 35.505
Value 2 ----> 35.505
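The printf route mentioned above could look roughly like this. It is a sketch with an invented helper name; Locale.ROOT is my own addition so that the decimal separator does not depend on the system locale.

```java
import java.util.Locale;

final class FormatDemo {
    // Formats v with exactly five digits after the decimal point.
    static String fiveDecimals(double v) {
        return String.format(Locale.ROOT, "%.5f", v);
    }

    public static void main(String[] args) {
        double a = 0.01, b = 4.5, c = 789;
        System.out.println("Value 1 ----> " + fiveDecimals(a * b * c)); // Value 1 ----> 35.50500
        System.out.println("Value 2 ----> " + fiveDecimals(b * c * a)); // Value 2 ----> 35.50500
    }
}
```

Note that this only changes the text that is printed; the stored double values are still different.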
Casting for integers is very straightforward: the extra bits simply disappear.
But, is it important to understand what is happening under the hood for casting floating point? I've tried to read information on how floating point is calculated, but I have yet to find one that explains it well. At least that's my excuse. I get the basic idea although the calculation of the mantissa is a bit difficult.
At least up to Java 7, I understand that floating points cannot be used in bitwise operations. Which makes sense because of how they are stored internally. Is there anything important that is needed to know on how floating points operate or are cast?
So, to Summarize:
Is it important to understand the internal workings of floating point like integers?
What is the internal process of casting a floating point to an integer?
What is the internal process of casting a floating point to an integer?
Java calls the machine code instruction which does this in compliance with the IEEE-754 standard. There is nothing for Java to do as such. If you want to know how casting works I suggest you read the standard.
Basically, the mantissa is shifted by the exponent and the sign applied, i.e. a floating point number is sign * 2^exponent * mantissa, and all the cast does is perform this calculation and drop any fractional part.
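To make the sign * 2^exponent * mantissa formula tangible, here is a rough sketch that pulls the three fields out of a double and rebuilds the value from them. The class and method names are invented, and the code assumes a normal (non-zero, non-subnormal, finite) input.

```java
final class DecodeDoubleDemo {
    // Rebuilds x from its sign, exponent and fraction fields (normal values only).
    static double rebuild(double x) {
        long bits = Double.doubleToLongBits(x);
        int sign = (bits >>> 63) == 0 ? 1 : -1;                 // top bit
        int exponent = (int) ((bits >>> 52) & 0x7FF) - 1023;    // 11 bits, bias removed
        long fraction = bits & ((1L << 52) - 1);                // low 52 bits
        double mantissa = 1.0 + fraction / (double) (1L << 52); // implicit leading 1
        return sign * Math.scalb(mantissa, exponent);           // sign * mantissa * 2^exponent
    }

    public static void main(String[] args) {
        System.out.println(rebuild(6.5));       // 6.5  (1.625 * 2^2)
        System.out.println((int) rebuild(6.5)); // 6    (fractional part dropped)
        System.out.println(rebuild(-0.15625));  // -0.15625
    }
}
```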
First, you need to understand that a floating point number is essentially an approximation. You can put in, say, 1.23 and get out 1.229998 (or some such), because 1.23 cannot be represented exactly. Regardless of whether you will be doing any casts, you need to understand this, and how it affects computations (and especially comparisons).
From the standpoint of cast, casting a float to a double causes no loss of information, since a double can contain every value that a float can contain. But casting from double to float can cause loss of precision (and, for very large or small numbers, exponent overflow/underflow), since there's simply more information in a 64-bit value than in a 32-bit one, so some data's going to end up "on the floor".
Similarly, casting from an int to a double causes no loss of information, since a double can contain every value an int can contain and then some. But casting from int to float or from long to double or float can result in loss of precision (though there can never be an exponent overflow/underflow).
Casting from float or double to int or long can easily result in overflow and major loss of data if the value has a large positive exponent, and any value with a negative exponent (magnitude below 1) becomes 0. And, of course, when you cast from floating-point to fixed, the fractional part of the number is truncated toward zero (a "floor" operation only for non-negative values).
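A few of these cases, sketched in Java with illustrative helper names. In Java specifically, a narrowing cast truncates toward zero, out-of-range values clamp to the nearest int limit, and NaN becomes 0.

```java
final class CastDemo {
    static int toInt(double x) {
        return (int) x;
    }

    static float toFloat(int i) {
        return (float) i;
    }

    public static void main(String[] args) {
        System.out.println(toInt(3.99));        // 3
        System.out.println(toInt(-3.99));       // -3 (truncation toward zero)
        System.out.println(Math.floor(-3.99));  // -4.0 (a true floor, for comparison)
        System.out.println(toInt(1e20));        // 2147483647 (clamped to Integer.MAX_VALUE)
        System.out.println(toInt(Double.NaN));  // 0
        // 16777217 = 2^24 + 1 needs 25 significant bits, one more than a float has.
        System.out.println(toFloat(16777217));  // 1.6777216E7
    }
}
```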
Please explain:
I'm declaring a class with 2 constructors as following:
class A {
public:
    A (double x) {cout << "DOUBLE \n";}
    A (float x) {cout << "FLOAT \n";}
};
Then:
A a (3.7);
This results in DOUBLE as output.
I've tried this in Java as well, with the same result.
Can anyone explain why?
EDIT: I do realise double is the default type for a number such as 3.7. My question is why, and whether there is a good reason for that.
This is because the 3.7 literal is a double. If you want float, use 3.7f. In C++, it is specified in the standard, 2.14.4 Floating Literals. The most relevant section is
The type of a floating literal is double unless explicitly specified by a suffix. The suffixes f and F specify
float, the suffixes l and L specify long double.
This doesn't answer why this is so. I imagine it is because that is the way it was in C, and the reason it is that way in C must be, to some level, arbitrary.
There seem to have been at least a couple of reasons for this.
First of all, the PDP-11 floating point unit had a single precision mode and a double precision mode. Switching between modes was possible, but fairly slow. At the same time, execution in double precision mode was almost as fast as in single precision mode (if memory serves, even faster in a few cases).
Second, early C didn't have a way to specify function parameter types. The standard library functions only accepted double precision floating point (since it gave extra precision almost for free). Writing the library to deal with both single and double precision floating point would have (approximately) doubled the effort, but provided little real advantage.
By default, 3.7 is treated as a double in Java. If you want it treated as a float, you need to append f: 3.7f.
Please refer to the Java Tutorials and the Java Language Specification.
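A Java counterpart of the C++ class from the question, as a sketch. The method name which is made up; overload resolution picks the version matching the literal's type.

```java
final class LiteralTypeDemo {
    static String which(double x) {
        return "DOUBLE";
    }

    static String which(float x) {
        return "FLOAT";
    }

    public static void main(String[] args) {
        System.out.println(which(3.7));  // DOUBLE (no suffix means double)
        System.out.println(which(3.7d)); // DOUBLE (explicit d suffix)
        System.out.println(which(3.7f)); // FLOAT
    }
}
```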
Floating point doesn't have an exact representation for most decimal values. This means that 3.7d != 3.7f, as these have different precision. As 3.7d has more precision, it makes a better choice for the default type of 3.7. If you used 3.7f, you could assign it to a double and be unaware that it lacks the precision of a double, e.g.
double d = 3.7f;
System.out.println(d); // doesn't print 3.7 as expected!
I have one question regarding double precision. When a float value is assigned to a double, I get a different result. For example:
float f= 54.23f;
double d1 = f;
System.out.println(d1);
The output is 54.22999954223633. Can someone explain the reason behind this behaviour? Is it that double defaults to 14 decimal places of precision?
The same value is printed differently for float and double because the Java specification requires printing as many digits as needed to distinguish the value from adjacent representable values in the same type (per my answer here, and see the linked documentation for more precision in the definition).
Since float has fewer bits to represent values, and hence fewer values, they are spaced more widely apart, and you do not need as many digits to distinguish them. When you put the value into a double and print it, the Java rules require that more digits be printed so that the value is distinguished from nearby double values. The println function does not know that the value originally came from a float and does not contain as much information as can fit into a double.
54.23f is exactly 54.229999542236328125 (in hexadecimal, 0x1.b1d70ap+5). The float values just below and just above this are 54.2299957275390625 (0x1.b1d708p+5) and 54.23000335693359375 (0x1.b1d70cp+5). As you can see, printing “54.229999” would distinguish the value from 54.229995… and from 54.23…. However, the double values just below and just above 54.23f are 54.22999954223632101957264239899814128875732421875 and 54.22999954223633523042735760100185871124267578125. To distinguish the value, you need “54.22999954223633”.
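The numbers above can be reproduced with a short sketch like the following. The class and helper names are hypothetical; going through the float's decimal string is one way to get the double nearest to 54.23 instead of the exactly widened value.

```java
import java.math.BigDecimal;

final class FloatPrintDemo {
    // Exact decimal expansion of the value stored in f (widening to double is exact).
    static String exact(float f) {
        return new BigDecimal(f).toString();
    }

    // Re-parses the float's shortest decimal string as a double.
    static double viaDecimalString(float f) {
        return Double.parseDouble(Float.toString(f));
    }

    public static void main(String[] args) {
        float f = 54.23f;
        double widened = f;
        System.out.println(Float.toString(f));        // 54.23
        System.out.println(Double.toString(widened)); // 54.22999954223633
        System.out.println(exact(f));                 // 54.229999542236328125
        System.out.println(viaDecimalString(f));      // 54.23
    }
}
```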
This is because the float hides the extra decimals and double shows them. The double will represent the actual number quite precisely and shows more digits.
Try this:
System.out.println(f.doubleValue()); (you need to make it a Float first, of course)
So as you can see, the information is there, it is just rounded.
Hope this helps
This is due to the Internal Representation.
Floating-point numbers are typically packed into a computer datum as the sign bit, the exponent field, and the significand (mantissa), from left to right.
These are known as accuracy problems.
The fact that floating-point numbers cannot precisely represent all real numbers, and that floating-point operations cannot precisely represent true arithmetic operations, leads to many surprising situations. This is related to the finite precision with which computers generally represent numbers.
It is not a problem. It is how double works. You do not have to handle it or care about it. The precision of double is enough. Consider that the difference between your number and the expected result is in the 14th position after the decimal point.
If you need arbitrarily good precision, use the java.math.BigDecimal class.
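Or, if you still want to use double, do it like this:
import java.text.DecimalFormat;
import java.text.NumberFormat;
double d = 5.5451521841;
NumberFormat nf = new DecimalFormat("##.###"); // at most three fraction digits
System.out.println(nf.format(d));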
Please let me know in case of any doubt.
Actually this is only about different visual representation or converting float / double to String. Let's take a look at internal binary representation
float f = 0.23f;
double d = f;
System.out.println(Integer.toBinaryString(Float.floatToIntBits(f)));
System.out.println(Long.toBinaryString(Double.doubleToLongBits(d)));
output
111110011010111000010100011111
11111111001101011100001010001111100000000000000000000000000000
This means that f was converted to d without any distortion; the significant digits are the same.
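The same comparison can be done programmatically. This is a sketch with an invented helper name, and it assumes a normal (not subnormal) float: the 23 fraction bits of the float should reappear as the top 23 of the double's 52 fraction bits, with the remaining 29 bits zero.

```java
final class WideningBitsDemo {
    // Checks that widening f to double only pads the fraction with zero bits.
    static boolean sameSignificand(float f) {
        int floatFraction = Float.floatToIntBits(f) & 0x7FFFFF;                        // low 23 bits
        long doubleFraction = Double.doubleToLongBits((double) f) & ((1L << 52) - 1); // low 52 bits
        long top23 = doubleFraction >>> 29;
        long low29 = doubleFraction & ((1L << 29) - 1);
        return top23 == floatFraction && low29 == 0;
    }

    public static void main(String[] args) {
        System.out.println(sameSignificand(0.23f));         // true
        System.out.println((float) (double) 0.23f == 0.23f); // true (round trip is lossless)
    }
}
```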
double and float represent numbers in different formats.
Because of this, a decimal value such as 54.23 is approximated differently in each format. Every float value converts to a double exactly, but the float nearest to 54.23 is not the double nearest to 54.23, so once the float is stored in a double its approximation error becomes visible in the extra digits.
This problem can also show itself when two different formatters are used.