This does not answer the question.
I ran the exact same code in Java and C# and got two different results.
Why? The doubles in both languages have the exact same specification:
In Java: double is a type that represents a 64-bit IEEE 754 floating-point number.
In C#: double is a type that represents a 64-bit double-precision number in IEEE 754 format.
Java
double a = Math.pow(Math.sin(3), 2);
double b = Math.pow(Math.cos(3), 2);
System.out.println(a); // 0.01991485667481699
System.out.println(b); // 0.9800851433251829
System.out.println(a+b); // 0.9999999999999999
C#
double a = Math.Pow(Math.Sin(3), 2);
double b = Math.Pow(Math.Cos(3), 2);
Console.WriteLine(a); // 0.019914856674817
Console.WriteLine(b); // 0.980085143325183
Console.WriteLine(a+b); // 1
It's just the precision that C# uses with the WriteLine method. See https://msdn.microsoft.com/en-us/library/dwhawy9k.aspx#GFormatString where it specifies that the G format specifier gives 15-digit precision.
If you write:
Console.WriteLine(a.ToString("R"));
it prints 0.019914856674816989.
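On the Java side, if you want to see more than the shortest round-trip string that println produces, one option (a sketch, not the only way) is BigDecimal's double constructor, which captures the exact decimal value of the stored bits:
double a = Math.pow(Math.sin(3), 2);
System.out.println(a);                           // shortest round-trip string: 0.01991485667481699
System.out.println(new java.math.BigDecimal(a)); // the exact decimal expansion of the stored double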
The root of it is that floating-point numbers are approximations, and library functions such as sin and cos are not required to produce bit-identical results across implementations. But they're close.
Most likely the difference is because the CLR is allowed to evaluate doubles at 80-bit extended precision internally. You never see more than 64 bits, but the processor may carry extra precision through intermediate results. I'm unsure how Java handles floating point internally; it could possibly be the same.
There's tons on the topic, but here's some random light reading from Google which may be of interest.
IEEE 754 makes a distinction between required operations and optional operations:
required operations, like addition, subtraction, etc., must be exactly rounded
optional operations are not required to be exactly rounded; the list includes all trigonometric functions (and others), which are left to the implementation
So the standard gives you no guarantee that sin and cos will match between implementations.
More information here or here; a quick Java-side check is sketched below.
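Java itself illustrates the point: Math.sin may use a faster platform-specific implementation, while StrictMath.sin always uses the reference fdlibm algorithms, and the two are allowed to differ in the last bit. A sketch you can run on your own JVM (the outcome depends on the platform):
double fast   = Math.sin(3);        // may use a platform-specific/intrinsic implementation
double strict = StrictMath.sin(3);  // always the reference fdlibm algorithm
System.out.println(fast == strict); // true on many JVMs, but nothing in the standard requires it
System.out.println(fast);
System.out.println(strict);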
Related
Our teacher asked us to research this, and all I keep finding on the net are explanations of what double and float mean.
Can you tell me whether it is possible or not, and explain why or why not?
Simple answer: yes, but only if the double is not too large.
floats are single-precision floating-point numbers: a 23-bit mantissa and an 8-bit exponent, giving roughly 6-7 significant figures of precision and a range of about 10^38.
doubles are double-precision: a 52-bit mantissa and an 11-bit exponent, giving roughly 15-16 significant figures of precision and a range of about 10^308.
Since a double (or float) has limited precision relative to its magnitude, adding a small float to a very large double can leave the double unchanged: the float's contribution is lost in rounding (this is absorption, not underflow). Of course, this can happen with two doubles as well, as the sketch below shows.
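A quick sketch of that absorption effect in Java (the specific values are just for illustration):
double big = 1.0e18;       // the spacing between doubles at 1.0e18 is larger than 1
float small = 1.0f;
System.out.println(big + small);         // 1.0E18 -- the float had no visible effect
System.out.println(big + small == big);  // true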
https://en.wikipedia.org/wiki/Floating_point
Can you add two numbers with varying decimal places (e.g. 432.54385789364 + 432.1)? Yes you can.
In Java, it is the same idea.
From the Java Tutorials:
float: The float data type is a single-precision 32-bit IEEE 754 floating point. Its range of values is beyond the scope of this discussion, but is specified in the Floating-Point Types, Formats, and Values section of the Java Language Specification. As with the recommendations for byte and short, use a float (instead of double) if you need to save memory in large arrays of floating point numbers. This data type should never be used for precise values, such as currency. For that, you will need to use the java.math.BigDecimal class instead. Numbers and Strings covers BigDecimal and other useful classes provided by the Java platform.
double: The double data type is a double-precision 64-bit IEEE 754 floating point. Its range of values is beyond the scope of this discussion, but is specified in the Floating-Point Types, Formats, and Values section of the Java Language Specification. For decimal values, this data type is generally the default choice. As mentioned above, this data type should never be used for precise values, such as currency.
Basically, they both hold decimal values; they differ in how precise they can be. A float is 32 bits in size, compared to a double, which is 64 bits. A float gives you roughly 6-7 significant decimal digits of precision, and a double roughly 15-16.
Basically... a double can store a decimal better than a float... but takes up more space.
To answer your question, you can add a float to a double and vice versa. Generally, the result will be made into a double, and you will have to cast it back to a float if that is what you want.
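A minimal sketch of that promotion and the cast back (values chosen to be exactly representable so the printed results are clean):
float f = 1.5f;
double d = 2.25;
double sum = f + d;        // the float is widened to double before the addition
float back = (float) sum;  // narrowing back to float requires an explicit cast
System.out.println(sum);   // 3.75
System.out.println(back);  // 3.75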
If you want to go deeper, the answer is yes, it is possible due to value coercion, but it opens the door for precision errors to accumulate invisibly. float has substantially less precision than double. In Java, an unsuffixed floating-point literal is already a double; use the f suffix only where you genuinely need a float, and prefer double if you have to use floating point at all.
These precision errors can lead to serious harm and even loss of life in sensitive systems.
Floating point is very hard to use correctly and should be avoided if possible. One extremely obvious thing not to do that is commonly mistakenly done is representing currency as a float or double. This can cause real money to be effectively given to or stolen from people.
Floating point (preferring double) is appropriate for approximate calculations and certain high performance scientific computing applications. However it is still extremely important to be aware of the precision loss characteristics particularly when a resulting floating point value is fed into further floating-point calculations.
This leads more generally into numerical computing, and now I've really gone afield :)
SAS has a decent paper on this:
http://support.sas.com/resources/papers/proceedings11/275-2011.pdf
This question already has answers here:
Is floating point math broken?
I am executing the following code in Java, but I get two different answers for what should be the same number mathematically.
public class TestClass {
    public static void main(String[] args) {
        double a = 0.01;
        double b = 4.5;
        double c = 789;
        System.out.println("Value1---->" + (a * b * c));
        System.out.println("Value2---->" + (b * c * a));
    }
}
Output:
Value1---->35.504999999999995
Value2---->35.505
Floating point numbers have limited precision. Some fractions cannot be represented exactly as floating point numbers, which is why rounding errors can occur.
The results are different because of the order in which the calculations are performed. Each of your calculations consists of two multiplications. The multiplication operator * in Java has left-to-right associativity. That means that in (a*b*c), a*b is calculated first and then multiplied by c, as in ((a*b)*c). One of those calculation chains happens to produce a rounding error, because a number in it simply can't be represented exactly as a floating point number; the regrouped version below makes this visible.
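Writing the grouping explicitly reproduces both results from the question (a sketch using the same values):
double a = 0.01, b = 4.5, c = 789;
System.out.println((a * b) * c);   // 35.504999999999995 -- same as a*b*c
System.out.println((b * c) * a);   // 35.505             -- same as b*c*a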
Essentially, Java uses binary floating-point values to handle all of its decimal-based operations. As mentioned in another answer, here is a link to IEEE 754, which addresses the issue you've encountered. And as also mentioned in Joshua Bloch's Effective Java, see Item 48, "Avoid float and double if exact answers are required" (a BigDecimal sketch follows the quote):
In summary, don’t use float or double for any calculations that require an
exact answer. Use BigDecimal if you want the system to keep track of the decimal
point and you don’t mind the inconvenience and cost of not using a primitive type.
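For the values in the question, a sketch of what that looks like with BigDecimal (assuming java.math.BigDecimal is imported; note the String constructor, since new BigDecimal(0.01) would start from the already-imprecise double):
BigDecimal a = new BigDecimal("0.01");
BigDecimal b = new BigDecimal("4.5");
BigDecimal c = new BigDecimal("789");
System.out.println(a.multiply(b).multiply(c)); // 35.505
System.out.println(b.multiply(c).multiply(a)); // 35.505 -- the order no longer matters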
It is because the type double is an approximation.
double in Java corresponds to the IEEE 754 binary64 format.
To work around this, use Math.round() or the BigDecimal class.
Multiplication of floating points uses a process that introduces precision errors.
To quote Wikipedia:
"To multiply, the significands are multiplied while the exponents are added, and the result is rounded and normalized."
Java multiplies from left to right. In your example, the first parts (a * b and b * c) actually produce no precision errors.
So your final multiplications end up as:
System.out.println("Value1---->" + (0.045 * 789));
System.out.println("Value2---->" + (3550.5 * 0.01));
Now, 0.045 * 789 produces a precision error due to that floating-point multiplication process, whereas 3550.5 * 0.01 does not.
Because double * double is still a double, and doubles are not totally precise.
Try the following code:
System.out.println(1.0 - 0.9 - 0.1); // -2.7755575615628914E-17
If you want totally precise real numbers, use BigDecimal instead!
This is because double has finite precision. Binary representation can't store exactly the value of for example 0.01. See also wikipedia entry on double precision floating point numbers.
Order of multiplication can change the way that representation errors are accumulated.
Consider using BigDecimal class, if you need precision.
As documented, double is a floating-point type, and it's imprecise by nature. That is why two mathematically equivalent calculations can yield different results: the floating-point type (double) is an approximation.
See http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html :
double: The double data type is a double-precision 64-bit IEEE 754
floating point. Its range of values is beyond the scope of this
discussion, but is specified in the Floating-Point Types, Formats, and
Values section of the Java Language Specification. For decimal values,
this data type is generally the default choice. As mentioned above,
this data type should never be used for precise values, such as
currency.
See also the wikipedia http://en.wikipedia.org/wiki/Floating_point :
The floating-point representation is by far the most common way of
representing in computers an approximation to real numbers.
http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
Quote: "double: The double data type is a double-precision 64-bit IEEE 754 floating point."
When you dig into IEEE 754 you will understand how doubles are stored in memory.
For such calculations I would recommend http://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html
See this reply from 2011: Java: Why should we use BigDecimal instead of Double in the real world?
It's called loss of precision and is very noticeable when working with either very big numbers or very small numbers.
See the section "Decimal numbers are approximations" and read down from there.
As mentioned, there is an issue with floating-point precision. You can either use printf or use Math.round() like so (change the number of zeros to adjust the precision):
System.out.println("Value 1 ----> " + (double) Math.round((a*b*c) * 100000) / 100000);
System.out.println("Value 2 ----> " + (double) Math.round((b*c*a) * 100000) / 100000);
Output
Value 1 ----> 35.505
Value 2 ----> 35.505
Please explain:
I'm declaring a class with 2 constructors as follows:
class A {
public:
    A(double x) { cout << "DOUBLE \n"; }
    A(float x)  { cout << "FLOAT \n"; }
};
Then:
A a(3.7);
This results in DOUBLE as output.
I've also tried this in Java - same result.
Can anyone explain why?
EDIT: I do realise double is the default type for a number such as 3.7. My question is why, and whether there is a good reason for that.
This is because the 3.7 literal is a double. If you want float, use 3.7f. In C++, it is specified in the standard, 2.14.4 Floating Literals. The most relevant section is
The type of a floating literal is double unless explicitly specified by a suffix. The suffixes f and F specify
float, the suffixes l and L specify long double.
This doesn't answer why this is so. I imagine it is because that's the way it was in C, and the reason it is that way in C must be, to some degree, arbitrary.
There seem to have been at least a couple of reasons for this.
First of all, the PDP-11 floating point unit had a single precision mode and a double precision mode. Switching between modes was possible, but fairly slow. At the same time, execution in double precision mode was almost as fast as in single precision mode (if memory serves, even faster in a few cases).
Second, early C didn't have a way to specify function parameter types. The standard library functions only accepted double precision floating point (since it gave extra precision almost for free). Writing the library to deal with both single and double precision floating point would have (approximately) doubled the effort, but provided little real advantage.
By default, 3.7 is treated as a double in Java. If you want it treated as a float, you need to append f: 3.7f.
Please refer to the Java tutorial and the Java Language Specification.
Floating point doesn't have an exact representation of most decimal values. This means that 3.7d != 3.7f, as these have different precision. Since 3.7d has more precision, it is the better choice as the default meaning of 3.7. If the default were 3.7f, you could assign it to a double and be unaware that it lacks the precision of a double, e.g.
double d = 3.7f;
System.out.println(d); // doesn't print 3.7 as expected!
In Java, the following expression results in
new Double(1.0E22) + new Double(3.0E22) = 4.0E22
but
new Double(1.0E22) + new Double(4.0E22) = 4.9999999999999996E22
I was expecting it to be 5.0E22. The Double limit is 1.7976931348623157E308.
Appreciate your help. My machine's architecture is x64 and JVM is also 64 bit.
Welcome to the planet of floating-point units. Unfortunately, in the real world, you have to give up some precision to get speed and breadth of representation. You cannot avoid that: double is only an approximate representation. In fact, you cannot represent most numbers with anything but finite precision. Still, it's a good approximation: less than 0.00000000001% error here. This has nothing to do with double's upper limit, but rather with the precision of the 64-bit format. Try doing some more math in Python:
>>> 4.9999999999999996 / 5.
1.0
>>> 5. - 4.9999999999999996
0.0
See? As a side note, never check doubles for exact equality; use approximate equality instead:
if (Math.abs(a - b) < EPSILON)
where EPSILON is a suitably small value. The Java library probably has something more appropriate, but you get the idea.
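In Java, Math.ulp gives the spacing between adjacent doubles at a given magnitude, which allows a scale-aware tolerance; a sketch (the factor of a few ulps is an arbitrary choice for illustration):
double a = 1.0e22 + 4.0e22;   // 4.9999999999999996E22, as in the question
double b = 5.0e22;
double tolerance = 4 * Math.ulp(Math.max(Math.abs(a), Math.abs(b)));
System.out.println(Math.abs(a - b) <= tolerance);   // true -- the values differ by roughly one ulp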
If you are interested in the theory, the standard for floating-point operations is IEEE 754.
There are a few ways to reduce floating-point error, e.g. pairwise summation and Kahan summation, but neither of these will help you represent a number like 5.0E22 exactly - as Stefano Sanfilippo stated in his answer, that is due to the limits of what you can represent in floating point, not a problem with the algorithm used to compute the answer. To represent 5.0E22 exactly you should either use BigDecimal or a library that has a Rational data type, e.g. JScience.
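For reference, a minimal sketch of Kahan (compensated) summation in Java; it reduces the error that accumulates when adding many doubles, though, as noted above, it cannot make 5.0E22 exactly representable (the method name kahanSum is just for illustration):
// Kahan summation: carry a small correction term alongside the running sum.
static double kahanSum(double[] values) {
    double sum = 0.0;
    double c = 0.0;               // compensation for low-order bits lost so far
    for (double v : values) {
        double y = v - c;         // apply the correction to the next term
        double t = sum + y;       // the low-order bits of y may be lost here
        c = (t - sum) - y;        // recover what was lost
        sum = t;
    }
    return sum;
}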
I use doubles for a uniform implementation of some arithmetic calculations. These calculations may actually be applied to integers too, but there are no C++-like templates in Java and I don't want to duplicate the implementation code, so I simply use the "double" version for ints.
Does the JVM spec guarantee the correctness of integer operations such as <=, >=, +, -, *, and / (when the remainder is 0) when the operations are emulated as the corresponding floating-point ops?
(Any integer involved is, of course, small enough to be represented exactly in a double's mantissa.)
According to the Java Language Specification:
Operators on floating-point numbers
behave as specified by IEEE 754 (with
the exception of the remainder
operator (§15.17.3)).
So you're guaranteed uniform behaviour, and while I don't have access to the official IEEE standard document, I'm pretty sure that it implicitly guarantees that operations on integers that can be represented exactly as a float/double work as expected.
Briefly, yes.
double a = 3.0;
double b = 2.0;
System.out.println(a*b); // 6.0
System.out.println(a+b); // 5.0
System.out.println(a-b); // 1.0
System.out.println(a/b); // 1.5 (cast to int if you want 1 here: (int) (a / b))
System.out.println(a>=b); // true
System.out.println(a<=b); // false
But be careful with multiplication (*): a*b can overflow the integer range when you cast the result back to an integer type. The same applies to + and -, and the 53-bit significand limit matters too, as the check below shows.
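The guarantee only holds while every integer value involved fits in the 53-bit significand; a quick check of where that breaks down:
long ok  = 1L << 52;               // well inside the 53-bit significand
long bad = (1L << 53) + 1;         // 9007199254740993, one past 2^53
System.out.println((long) (double) ok  == ok);   // true  -- the round trip is exact
System.out.println((long) (double) bad == bad);  // false -- rounds to 9007199254740992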
Indeed, I've found the standard and it says "yes".
JVM spec:
The rounding operations of the Java virtual machine always use IEEE 754 round to
nearest mode. Inexact results are rounded to the nearest representable value, with ties going to the value with a zero least-significant bit. This is the IEEE 754 default mode. But Java virtual machine instructions that convert values of floating-point types to values of integral types round toward zero. The Java virtual machine does not give any means to change the floating-point rounding mode.
ANSI/IEEE Std 754-1985 5.
... Except for binary <---> decimal conversion, each of the operations shall be performed as if it first produced an intermediate result correct to infinite precision and with unbounded range, and then coerced this intermediate result to fit in the destination’s format
ANSI/IEEE Std 754-1985 5.4.
Conversions between floating-point integers and integer formats shall be exact unless an exception arises as specified in 7.1.
Summary
1) Operations are exact whenever the exact result fits in the double format (and, therefore, an integer result is always a floating-point integer).
2) int <--> double conversions are always exact for floating-point integers.
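A quick sanity check in Java under those rules (values chosen so every exact result stays well below 2^53):
int x = 123_456;
int y = 654_321;
System.out.println((long) ((double) x * y) == (long) x * y);  // true -- the double product is exact
System.out.println(((double) x <= (double) y) == (x <= y));   // true -- comparisons agree as well
System.out.println((int) (double) x == x);                    // true -- int <-> double round trip is exact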