Please explain:
I'm declaring a class with 2 constructors as follows:
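#include <iostream>
using namespace std;

class A {
public:
    A(double x) { cout << "DOUBLE\n"; }  // chosen for a double argument
    A(float x) { cout << "FLOAT\n"; }    // chosen for a float argument
};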
Then:
A a (3.7);
This results in DOUBLE as output.
I've also tried this in Java, with the same result.
Can anyone explain why?
EDIT: I do realise double is the default type for a number such as 3.7. My question is why, and whether there is a good reason for that.
This is because the 3.7 literal is a double. If you want float, use 3.7f. In C++, it is specified in the standard, 2.14.4 Floating Literals. The most relevant section is
The type of a floating literal is double unless explicitly specified by a suffix. The suffixes f and F specify
float, the suffixes l and L specify long double.
This doesn't answer why this is so. I imagine it is because that is the way it was in C, and the reason it is that way in C must be, to some level, arbitrary.
There seem to have been at least a couple of reasons for this.
First of all, the PDP-11 floating point unit had a single precision mode and a double precision mode. Switching between modes was possible, but fairly slow. At the same time, execution in double precision mode was almost as fast as in single precision mode (if memory serves, even faster in a few cases).
Second, early C didn't have a way to specify function parameter types. The standard library functions only accepted double precision floating point (since it gave extra precision almost for free). Writing the library to deal with both single and double precision floating point would have (approximately) doubled the effort, but provided little real advantage.
By default 3.7 will be considered as double in java. If you want it treated as float, you need to append f, 3.7f.
Please refer to the Java tutorial and the Java Language Specification.
Most decimal fractions don't have an exact floating-point representation. This means that 3.7d != 3.7f, as these have different precision. As 3.7d has more precision, it makes a better choice for the default type of 3.7. If you use 3.7f, you can assign it to a double and be unaware that it lacks the precision of a double, e.g.
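The same behaviour as the C++ constructors in the question can be sketched in Java with two overloaded methods. The class and method names here (LiteralOverloads, which) are made up for illustration; only the literal rules come from the language.

```java
class LiteralOverloads {
    // Hypothetical overloads mirroring the two constructors in the question.
    static String which(double x) { return "DOUBLE"; }
    static String which(float x)  { return "FLOAT"; }

    public static void main(String[] args) {
        System.out.println(which(3.7));         // DOUBLE: an unsuffixed literal is a double
        System.out.println(which(3.7f));        // FLOAT: the f suffix makes it a float
        System.out.println(which((float) 3.7)); // FLOAT: explicit narrowing cast
    }
}
```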
double d = 3.7f;
System.out.println(d); // doesn't print 3.7 as expected!
Related
Why the inconsistency?
There is no inconsistency: the methods are simply designed to follow different specifications.
long round(double a)
Returns the closest long to the argument.
double floor(double a)
Returns the largest (closest to positive infinity) double value that is less than or equal to the argument and is equal to a mathematical integer.
Compare with double ceil(double a)
double rint(double a)
Returns the double value that is closest in value to the argument and is equal to a mathematical integer
So by design round rounds to a long and rint rounds to a double. This has always been the case since JDK 1.0.
Other methods were added in JDK 1.2 (e.g. toRadians, toDegrees); others were added in 1.5 (e.g. log10, ulp, signum, etc.), and yet more were added in 1.6 (e.g. copySign, getExponent, nextUp, etc.) (look for the Since: metadata in the documentation); but round and rint have been the way they are now since the beginning.
Arguably, perhaps instead of long round and double rint, it'd be more "consistent" to name them double round and long rlong, but this is argumentative. That said, if you insist on categorically calling this an "inconsistency", then the reason may be as unsatisfying as "because it's inevitable".
Here's a quote from Effective Java 2nd Edition, Item 40: Design method signatures carefully:
When in doubt, look to the Java library APIs for guidance. While there are plenty of inconsistencies -- inevitable, given the size and scope of these libraries -- there is also a fair amount of consensus.
Distantly related questions
Why does int num = Integer.getInteger("123") throw NullPointerException?
Most awkward/misleading method in Java Base API ?
Most Astonishing Violation of the Principle of Least Astonishment
floor would have been chosen to match the standard c routine in math.h (rint, mentioned in another answer, is also present in that library, and returns a double, as in java).
but round was not a standard function in c at that time (it's not mentioned in C89 - c identifiers and standards; c99 does define round and it returns a double, as you would expect). it's normal for language designers to "borrow" ideas, so maybe it comes from some other language? fortran 77 doesn't have a function of that name and i am not sure what else would have been used back then as a reference. perhaps vb - that does have Round but, unfortunately for this theory, it returns a double (php too). interestingly, perl deliberately avoids defining round.
[update: hmmm. looks like smalltalk returns integers. i don't know enough about smalltalk to know if that is correct and/or general, and the method is called rounded, but it might be the source. smalltalk did influence java in some ways (although more conceptually than in details).]
if it's not smalltalk, then we're left with the hypothesis that someone simply chose poorly (given the implicit conversions possible in java it seems to me that returning a double would have been more useful, since then it can be used both while converting types and when doing floating point calculations).
in other words: functions common to java and c tend to be consistent with the c library standard at the time; the rest seem to be arbitrary, but this particular wrinkle may have come from smalltalk.
I agree that it is odd that Math.round(double) returns long. If large double values are cast to long (which is what Math.round implicitly does), Long.MAX_VALUE is returned. An alternative is to use Math.rint() to avoid that. However, Math.rint() has somewhat strange rounding behavior: ties are settled by rounding to the even integer, i.e. 4.5 is rounded down to 4.0 but 5.5 is rounded up to 6.0. Another alternative is to use Math.floor(x+0.5). But be aware that 1.5 is rounded to 2 while -1.5 is rounded to -1, not -2. Yet another alternative is to use Math.round, but only if the number is in the range between Long.MIN_VALUE and Long.MAX_VALUE. Double precision floating point values outside this range are integers anyhow.
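The cases listed above can be checked with a few lines. This is only a sketch; floorHalfUp is a hypothetical helper name for the Math.floor(x+0.5) idea, not a library method.

```java
class RoundingComparison {
    // Hypothetical helper: round half up via floor, as suggested in the text.
    static double floorHalfUp(double x) {
        return Math.floor(x + 0.5);
    }

    public static void main(String[] args) {
        System.out.println(Math.round(1e19) == Long.MAX_VALUE); // true: result saturates
        System.out.println(Math.rint(4.5));    // 4.0 (tie goes to the even integer)
        System.out.println(Math.rint(5.5));    // 6.0
        System.out.println(floorHalfUp(1.5));  // 2.0
        System.out.println(floorHalfUp(-1.5)); // -1.0, not -2.0
    }
}
```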
Unfortunately, why Math.round() returns long is unknown. Somebody made that decision, and he probably never gave an interview to tell us why. My guess is, that Math.round was designed to provide a better way (i.e., with rounding) for converting doubles to longs.
Like everyone else here I also don't know the answer, but thought someone might find this useful. I noticed that if you want to round a double to an int without casting, you can use the two round implementations long round(double) and int round(float) together:
double d = something;
int i = Math.round(Math.round(d));
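One caveat with the nested call: the inner long is widened to float before the outer round, and a float only holds 24 significant bits, so large values can change. The sketch below compares it with Math.toIntExact (available from Java 8, which is an assumption about the reader's JDK); the class and method names are invented for the example.

```java
class RoundToInt {
    // The nested-round trick from the text: double -> long -> float -> int.
    static int nestedRound(double d) {
        return Math.round(Math.round(d));
    }

    // Alternative: round to long, then convert with an overflow check.
    // Math.toIntExact throws ArithmeticException if the value does not fit in an int.
    static int checkedRound(double d) {
        return Math.toIntExact(Math.round(d));
    }

    public static void main(String[] args) {
        double d = 16777217.0; // 2^24 + 1, not representable as a float
        System.out.println(nestedRound(d));  // 16777216
        System.out.println(checkedRound(d)); // 16777217
    }
}
```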
This does not answer the question.
I ran the exact same code in Java and C# and it gave two different results.
Why? The doubles in both languages have the exact same specifications:
double is a type that represents 64-bit IEEE 754 floating-point number
in Java
double is a type that represents 64-bit double-precision number in
IEEE 754 format in C#.
Java
double a = Math.pow(Math.sin(3), 2);
double b = Math.pow(Math.cos(3), 2);
System.out.println(a); // 0.01991485667481699
System.out.println(b); // 0.9800851433251829
System.out.println(a+b); // 0.9999999999999999
C#
double a = Math.Pow(Math.Sin(3), 2);
double b = Math.Pow(Math.Cos(3), 2);
Console.WriteLine(a); // 0.019914856674817
Console.WriteLine(b); // 0.980085143325183
Console.WriteLine(a+b); // 1
It's just the precision that C# uses with the WriteLine method. See https://msdn.microsoft.com/en-us/library/dwhawy9k.aspx#GFormatString where it specifies that the G format specifier gives 15-digit precision.
If you write:
Console.WriteLine(a.ToString("R"));
it prints 0.019914856674816989.
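The effect of a 15-digit display can be imitated on the Java side. The sketch below takes the two values printed by the Java run in the question and rounds them to 15 significant digits with BigDecimal; to15Digits is a made-up helper, and this only approximates what the C# formatter does.

```java
import java.math.BigDecimal;
import java.math.MathContext;

class FifteenDigits {
    // Hypothetical helper: round a double to 15 significant decimal digits,
    // roughly what a 15-digit display format does before printing.
    static BigDecimal to15Digits(double x) {
        return new BigDecimal(x).round(new MathContext(15));
    }

    public static void main(String[] args) {
        double a = 0.01991485667481699; // values printed by the Java run in the question
        double b = 0.9800851433251829;
        System.out.println(a + b);                              // full-precision sum
        System.out.println(to15Digits(a).toPlainString());      // a at 15 digits
        System.out.println(to15Digits(a + b).toPlainString());  // sum at 15 digits
    }
}
```

At 15 digits the sum is indistinguishable from 1, which matches what the C# output shows.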
The root of it is that floating point numbers are imprecise and calculations can't even really be relied upon to be deterministic. But they're close.
Most likely the difference is because the CLR is allowed to work with doubles as 80-bit numbers internally. You never see more than 64 bits, but the processor may work with 80. I'm unsure how Java handles floating point numbers internally. It could possibly be the same.
There's tons on the topic, but here's some random light reading from Google which may be of interest.
IEEE 754 has a distinction between required operations and optional operations:
required operations, like addition, subtraction, etc., must be exactly rounded
optional operations are not required to be exactly rounded; the list of these operations includes all trigonometric functions (and others), and they are left to the implementation
So you have no guarantee from the standard that sin and cos results will match between implementations.
More information here or here.
Our teacher asked us to search about this and what I kept on getting from the net are explanations stating what double and float means.
Can you tell me whether it is possible or not, and explain why or why not?
Simple answer: yes, but only if the double is not too large.
floats are single-precision floating point numbers, meaning they use a 23-bit mantissa and 8-bit exponent, corresponding to ~6/7 s.f. precision and ~10^38 range.
doubles are double-precision, with a 52-bit mantissa and 11-bit exponent, corresponding to ~15/16 s.f. precision and ~10^308 range.
Since a double only holds a limited number of significant digits, adding a small float to a very large double leaves the double unchanged: the float's contribution falls below the last digit the double can hold and is rounded away (this is usually called absorption rather than underflow). Of course this can happen for two double types as well.
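A small check of the "too large" case, as a sketch: the helper name isAbsorbed is invented here, and the two magnitudes are just convenient examples.

```java
class Absorption {
    // Hypothetical helper: true when adding 'small' leaves 'big' unchanged.
    static boolean isAbsorbed(double big, float small) {
        return big + small == big;
    }

    public static void main(String[] args) {
        // Near 1e17 neighbouring doubles are 16 apart, so adding 1 is rounded away.
        System.out.println(isAbsorbed(1e17, 1.0f)); // true
        // Near 1e10 a double still has room for the units digit.
        System.out.println(isAbsorbed(1e10, 1.0f)); // false
    }
}
```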
https://en.wikipedia.org/wiki/Floating_point
Can you add two numbers with varying decimal places (e.g. 432.54385789364 + 432.1)? Yes you can.
In Java, it is the same idea.
From the Java Tutorials:
float: The float data type is a single-precision 32-bit IEEE 754 floating point. Its range of values is beyond the scope of this discussion, but is specified in the Floating-Point Types, Formats, and Values section of the Java Language Specification. As with the recommendations for byte and short, use a float (instead of double) if you need to save memory in large arrays of floating point numbers. This data type should never be used for precise values, such as currency. For that, you will need to use the java.math.BigDecimal class instead. Numbers and Strings covers BigDecimal and other useful classes provided by the Java platform.
double: The double data type is a double-precision 64-bit IEEE 754 floating point. Its range of values is beyond the scope of this discussion, but is specified in the Floating-Point Types, Formats, and Values section of the Java Language Specification. For decimal values, this data type is generally the default choice. As mentioned above, this data type should never be used for precise values, such as currency.
Basically, they are both holders for decimals. They differ in how precise they can be. A float is 32 bits in size, compared to a double, which is 64 bits. A float has a precision of around 6 or 7 significant decimal digits, and a double around 15 or 16.
Basically... a double can store a decimal better than a float... but takes up more space.
To answer your question, you can add a float to a double and vice versa. Generally, the result will be made into a double, and you will have to cast it back to a float if that is what you want.
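As a rough illustration of that promotion (the class and method names are made up for the example):

```java
class MixedAddition {
    // The float operand is widened to double before the addition,
    // so the expression f + d has type double.
    static double add(float f, double d) {
        return f + d;
    }

    public static void main(String[] args) {
        double sum = add(1.5f, 2.25);   // both values are exact in binary
        float narrowed = (float) sum;   // an explicit cast is needed to get a float back
        System.out.println(sum);        // 3.75
        System.out.println(narrowed);   // 3.75
        // 0.1f carries float-sized rounding error into the double result.
        System.out.println(add(0.1f, 0.1) == 0.2); // false
    }
}
```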
If you want to be really deep about it, you should say yes, it is possible due to value coercion, but that it opens the door for more severe precision errors to accumulate invisibly to the compiler. float has substantially less precision than double. Literal floating-point numbers in Java source are double by default, so a float only appears when you ask for one with the f suffix; you can add the d suffix to literals to make the double type explicit if you have to use floating point.
These precision errors can lead to serious harm and even loss of life in sensitive systems.
Floating point is very hard to use correctly and should be avoided if possible. One extremely obvious thing not to do that is commonly mistakenly done is representing currency as a float or double. This can cause real money to be effectively given to or stolen from people.
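A minimal sketch of the currency problem and the usual BigDecimal alternative. The helper total and the amounts are hypothetical; the amounts are passed as strings so that no binary rounding happens before BigDecimal sees them.

```java
import java.math.BigDecimal;

class CurrencySum {
    // Hypothetical helper: sums decimal amounts given as strings, exactly.
    static BigDecimal total(String... amounts) {
        BigDecimal sum = BigDecimal.ZERO;
        for (String amount : amounts) {
            sum = sum.add(new BigDecimal(amount));
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(0.10 + 0.20);           // 0.30000000000000004
        System.out.println(total("0.10", "0.20")); // 0.30
    }
}
```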
Floating point (preferring double) is appropriate for approximate calculations and certain high performance scientific computing applications. However it is still extremely important to be aware of the precision loss characteristics particularly when a resulting floating point value is fed into further floating-point calculations.
This leads more generally into numerical computing, and now I've really gone afield :)
SAS has a decent paper on this:
http://support.sas.com/resources/papers/proceedings11/275-2011.pdf
I am using jdk 1.6. This is my code.
float f = 10.0f;
double d = 10.0;
System.out.println("Equal Status : " + (f == d));
then the system shows the answer as true. But if I modified the value as
float f = 10.1f;
double d = 10.1;
System.out.println("Equal Status : " + (f == d));
then the system shows the answer as false. I know the system uses bit matching for the == check. But what is the reason behind this? Can you explain it? Thanks in advance.
While this is not "my" answer, this is about as close to "must read" literature for programmers who want to move from "meh" to "good." Great is something truly special, so don't think that "good" is anything to sneeze at. :)
What Every Programmer Needs to know about Floating Point
The link @Sam suggested is great but still too technical for me :P
I will just give the OP some opinions on handling floating point (probably a bit off-topic, because you are asking for the reason behind it; for that, read the link @Sam suggested).
Never assume a floating point number is going to give you an accurate representation. Sometimes it can, but not always. Floating point is limited in "significant figures": it is only "accurate" for the first n digits.
Your situation is even worse because you are mixing float and double, but the idea for solving it is similar.
You need to decide what precision your application needs the calculation result to have, and choose an epsilon value based on it. For example, if your application only needs accuracy to 3 decimal places, an epsilon of 0.0005 is probably reasonable.
Comparing two floating point numbers shouldn't be done with ==; you should use
(a + EPSILON > b && a - EPSILON < b). Similarly, a > b should be expressed as a - EPSILON > b.
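The comparison above could be wrapped up roughly as follows. The method names are invented for this sketch, and the tolerance 0.0005 is simply the example value from the text; a real application would pick its own.

```java
class ApproxCompare {
    static final double EPSILON = 0.0005; // example tolerance for 3 decimal places

    // Equivalent to (a + EPSILON > b && a - EPSILON < b).
    static boolean nearlyEqual(double a, double b) {
        return Math.abs(a - b) < EPSILON;
    }

    // a counts as greater only if it exceeds b by more than the tolerance.
    static boolean definitelyGreater(double a, double b) {
        return a - EPSILON > b;
    }

    public static void main(String[] args) {
        float f = 10.1f;
        double d = 10.1;
        System.out.println(f == d);                  // false
        System.out.println(nearlyEqual(f, d));       // true
        System.out.println(definitelyGreater(f, d)); // false
    }
}
```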
Points to remember are
10.1 is a repeating fraction in binary: 1010.0001100110011...
When comparing a float and a double, the float is converted to a
double by adding zeros to fill the number out,
so you will be comparing
1010.00011001...00000000... to 1010.00011001...10011001..., which are different.
float f = 10.1f;
double d = 10.1;
System.out.println("Equal Status : " + (f == (float)d));
will give the answer of true
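To see the zero padding directly, one can print the bit pattern of the widened float next to that of the double. This is only a sketch; bits is a hypothetical helper name.

```java
class WideningBits {
    // Hypothetical helper: raw IEEE 754 bit pattern of a double, as hex.
    static String bits(double x) {
        return Long.toHexString(Double.doubleToLongBits(x));
    }

    public static void main(String[] args) {
        System.out.println(bits(10.1f)); // 4024333340000000 (float widened, padded with zeros)
        System.out.println(bits(10.1));  // 4024333333333333
        System.out.println((double) 10.1f == 10.1); // false: compared as doubles
        System.out.println(10.1f == (float) 10.1);  // true: compared as floats
    }
}
```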
IMHO, Generally speaking for 99% of use case double is a better choice because it is more accurate. i.e. don't use float unless you have to.
BigDecimal can be used to display the actual representation of a float or double. You don't normally see this, as toString performs a small amount of rounding (it is coded to accommodate the type's representation limitations).
System.out.println("10.1f is actually " + new BigDecimal(10.1f));
System.out.println("10.1 is actually " + new BigDecimal(10.1));
prints
10.1f is actually 10.1000003814697265625
10.1 is actually 10.0999999999999996447286321199499070644378662109375
You can see that the double value is closer to the desired 10.1 but is not exactly this value. The values differ because, in each case, you get the closest representable value for that type.
float is a 32 bit type whereas double is a 64 bit type.
You ran into the classical floating point precision problem.
Floats are imprecise. The actual values of 10.1f and 10.1 will be slightly different due to rounding. Floats are binary, not decimal, so numbers that look "simple" to us, like 10.1, can't be represented exactly as floats.
You would want to refresh yourself on the IEEE floating point standards for both 32 and 64-bit floating point representations. If you peel into the internals of this, you'll see clearly why these floating-point values behave so finickily.
If you're curious about how it's represented internally (which is why it's failing), you can use this code, which shows the hexadecimal representations of these numbers. From there, you can match them up with the exponents and mantissas of single and double precision.
System.out.printf("Float 10.0: 0x%X\n", Float.floatToRawIntBits((float)10.0));
System.out.printf("Double 10.0: 0x%X\n", Double.doubleToRawLongBits(10.0));
System.out.printf("Float 10.1: 0x%X\n", Float.floatToRawIntBits((float)10.1));
System.out.printf("Double 10.1: 0x%X\n", Double.doubleToRawLongBits(10.1));
prints
Float 10.0: 0x41200000
Double 10.0: 0x4024000000000000
Float 10.1: 0x4121999A
Double 10.1: 0x4024333333333333
You'll notice that there is some repetition in the way the values are represented. This is because 1/10 can't be represented in a finite space of base 2.
I have been experimenting with what is wrong with the float and double types in Java. System.out.print(1-.6) prints 0.4, but the result is a bit unexpected (0.30000000000000004) in the case of System.out.print(1-.7). It would be helpful if anyone could direct me towards some resources that explain WHY this happens. I am assuming it's not Java specific and that there is something inherent in these types.
Thanks!
The real types in Java are implementations of IEEE 754 single and double precision floating point notation. These are approximations of real numbers rather than exact representations. Some real numbers like 0.8 cannot be represented exactly.
As Vincent said, the float and double types cannot exactly store values that cannot be written as a finite sum of 2^-n terms (how many terms are available depends on the type).
Use the BigDecimal class instead.
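For the subtraction from the question, that might look like the sketch below. The helper name subtract is invented; note that the operands are built from strings, because new BigDecimal(0.7) would capture the already-inexact double.

```java
import java.math.BigDecimal;

class ExactSubtraction {
    // Hypothetical helper: exact decimal subtraction of two values given as strings.
    static BigDecimal subtract(String a, String b) {
        return new BigDecimal(a).subtract(new BigDecimal(b));
    }

    public static void main(String[] args) {
        System.out.println(1 - .7);               // 0.30000000000000004
        System.out.println(subtract("1", "0.7")); // 0.3
    }
}
```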