float to double giving Strange results [duplicate] - java

This question already has answers here:
Float precision with specific numbers
(3 answers)
Closed 8 years ago.
I came across this behavior of float and double during type casting.
I simplified my actual statements to make them easier to understand.
1
System.out.println((double)((float)(128.12301)));//Output:128.12301635742188
Same output all the time.
2
System.out.println((double)((float)(128888.12301)));//Output:128888.125
Both outputs look strange to me, and I can't understand how this works.
Can anyone help me out?

There are several steps involved here, each with different numbers. Let's split the code up for each statement:
double original = 128.12301; // Or 128888.12301
float floatValue = (float) original;
double backToDouble = (double) floatValue;
System.out.println(backToDouble);
So for each number, the steps are:
1. Compile time: Convert the decimal value in the source code into the nearest exact double value
2. Execution time: Convert the double value to the nearest exact float value
3. Execution time: Convert the float value into a double value (this never loses any information)
4. Execution time: Convert the final double value into a string
Steps 1 and 2 can lose information; step 4 doesn't always print the exact value - it just follows what Double.toString(double) does.
So let's take 128.12301 as an example. That's converted at compile-time to exactly 128.123009999999993624442140571773052215576171875. Then the conversion to float yields exactly 128.123016357421875. So after the conversion back to double (which preserves the value) we print out 128.123016357421875. That prints as 128.12301635742188 because that's the fewest digits it can print without being ambiguous between that value and the nearest double values greater than or less than it.
Now with 128888.12301, the exact double value is 128888.123009999995701946318149566650390625, and the closest float to that is exactly 128888.125. After converting that back to a double, the full value 128888.125 is printed because those digits are already the shortest string that distinguishes it from the neighbouring double values.
Basically, the result will depend on how many significant digits you've included to start with, and how much information is lost when it rounds to the nearest double and then to the nearest float.
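The steps above can be made visible with java.math.BigDecimal, whose BigDecimal(double) constructor exposes the exact binary value without rounding. This is only a minimal sketch; the class name FloatRoundTrip and the helpers roundTrip and exact are made up for illustration.

```java
import java.math.BigDecimal;

public class FloatRoundTrip {
    // Steps 2 and 3: narrow to float (may round), then widen back (lossless).
    static double roundTrip(double original) {
        float narrowed = (float) original;
        return (double) narrowed;
    }

    // new BigDecimal(double) shows the exact value stored in the double.
    static String exact(double value) {
        return new BigDecimal(value).toPlainString();
    }

    public static void main(String[] args) {
        for (double original : new double[] {128.12301, 128888.12301}) {
            double back = roundTrip(original);
            System.out.println("literal as double: " + exact(original));
            System.out.println("after float cast:  " + exact(back));
            System.out.println("Double.toString:   " + back);
        }
    }
}
```

Comparing the second and third line printed for each number shows the difference between the exact stored value and the shortened string that Double.toString produces.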

System.out.println(number) will go through Double.toString() which is a fairly complex method (as seen in its documentation) and will not always behave as you'd expect. It basically gives the shortest string which uniquely determines number.
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0.

A float is a 32 bit IEEE 754 floating point.
A double is a 64 bit IEEE 754 floating point.
The layout is the same for float and double: both are binary floating point types, but double has more precision than float (a 53-bit significand versus 24 bits) and a wider exponent range.
Check this for more details
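As a rough illustration of what the extra bits buy, Math.ulp reports the gap between a value and the next representable value of the same type. The class and method names below are hypothetical.

```java
public class FloatVsDoubleSpacing {
    // Gap between v and the next larger float.
    static float floatGap(float v) {
        return Math.ulp(v);
    }

    // Gap between v and the next larger double.
    static double doubleGap(double v) {
        return Math.ulp(v);
    }

    public static void main(String[] args) {
        System.out.println("float bits:  " + Float.SIZE);
        System.out.println("double bits: " + Double.SIZE);
        System.out.println("float gap near 128:  " + floatGap(128f));
        System.out.println("double gap near 128: " + doubleGap(128d));
    }
}
```

Near 128 the float gap is 2^-16 while the double gap is 2^-45, which is why a value that has passed through a float carries visible error once it is printed with double precision.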

Related

Is there a way to get right results from BigDecimal.floatValue() function? [duplicate]

I am working with an application that is based entirely on doubles, and am having trouble in one utility method that parses a string into a double. I've found a fix where using BigDecimal for the conversion solves the issue, but raises another problem when I go to convert the BigDecimal back to a double: I'm losing several places of precision. For example:
import java.math.BigDecimal;
import java.text.DecimalFormat;
public class test {
public static void main(String [] args){
String num = "299792.457999999984";
BigDecimal val = new BigDecimal(num);
System.out.println("big decimal: " + val.toString());
DecimalFormat nf = new DecimalFormat("#.0000000000");
System.out.println("double: "+val.doubleValue());
System.out.println("double formatted: "+nf.format(val.doubleValue()));
}
}
This produces the following output:
$ java test
big decimal: 299792.457999999984
double: 299792.458
double formatted: 299792.4580000000
The formatted double demonstrates that it's lost the precision after the third place (the application requires those lower places of precision).
How can I get BigDecimal to preserve those additional places of precision?
Thanks!
Update after catching up on this post. Several people mention this is exceeding the precision of the double data type. Unless I'm reading this reference incorrectly:
http://java.sun.com/docs/books/jls/third_edition/html/typesValues.html#4.2.3
then the double primitive has a maximum exponential value of Emax = 2^(K-1) - 1, and the standard implementation has K=11. So, the max exponent should be 511, no?
You've reached the maximum precision for a double with that number. It can't be done. The value gets rounded up in this case. The conversion from BigDecimal is unrelated and the precision problem is the same either way. See this for example:
System.out.println(Double.parseDouble("299792.4579999984"));
System.out.println(Double.parseDouble("299792.45799999984"));
System.out.println(Double.parseDouble("299792.457999999984"));
Output is:
299792.4579999984
299792.45799999987
299792.458
For these cases double has more than 3 digits of precision after the decimal point. They just happen to be zeros for your number and that's the closest representation you can fit into a double. It's closer for it to round up in this case, so your 9's seem to disappear. If you try this:
System.out.println(Double.parseDouble("299792.457999999924"));
You'll notice that it keeps your 9's because it was closer to round down:
299792.4579999999
If you require that all of the digits in your number be preserved then you'll have to change your code that operates on double. You could use BigDecimal in place of them. If you need performance then you might want to explore BCD as an option, although I'm not aware of any libraries offhand.
In response to your update: the maximum exponent for a double-precision floating-point number is actually 1023. That's not your limiting factor here though. Your number exceeds the precision of the 52 fractional bits that represent the significand, see IEEE 754-1985.
Use this floating-point conversion to see your number in binary. The exponent is 18 because 262144 (2^18) is the largest power of two that does not exceed your number. If you take the fractional bits and go up or down one in binary, you can see there's not enough precision to represent your number:
299792.457999999900 // 0010010011000100000111010100111111011111001110110101
299792.457999999984 // here's your number that doesn't fit into a double
299792.458000000000 // 0010010011000100000111010100111111011111001110110110
299792.458000000040 // 0010010011000100000111010100111111011111001110110111
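The same neighbourhood can be inspected from Java with Math.nextDown, Math.nextUp and Math.ulp. This is a small sketch; the class name NearestDouble and the helper parse are assumptions for illustration.

```java
public class NearestDouble {
    // Parse a decimal string to the nearest double.
    static double parse(String decimal) {
        return Double.parseDouble(decimal);
    }

    public static void main(String[] args) {
        double d = parse("299792.457999999984");
        System.out.println("parsed:     " + d);
        System.out.println("next lower: " + Math.nextDown(d));
        System.out.println("next upper: " + Math.nextUp(d));
        System.out.println("gap (ulp):  " + Math.ulp(d));
    }
}
```

The gap at this magnitude is 2^-34 (roughly 5.8e-11), so a decimal that lies within half of that distance from 299792.458 has to collapse onto the same double.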
The problem is that a double can hold only about 15 to 17 significant decimal digits, while a BigDecimal can hold an arbitrary number. When you call doubleValue(), the excess digits are rounded away. However, since you have a lot of 9's at the end, they keep getting rounded up to 0, with a carry to the next-highest digit.
To keep as much precision as you can, you need to change the BigDecimal's rounding mode so that it truncates:
BigDecimal bd1 = new BigDecimal("12345.1234599999998");
System.out.println(bd1.doubleValue());
BigDecimal bd2 = new BigDecimal("12345.1234599999998", new MathContext(15, RoundingMode.FLOOR));
System.out.println(bd2.doubleValue());
Only that many digits are printed so that, when parsing the string back to double, it will result in the exact same value.
Some detail can be found in the javadoc for Double#toString
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0.
If it's entirely based on doubles ... why are you using BigDecimal? Wouldn't Double make more sense? If it's too large a value (or too much precision) for that then ... you can't convert it; that would be the reason to use BigDecimal in the first place.
As to why it's losing precision, from the javadoc
Converts this BigDecimal to a double. This conversion is similar to the narrowing primitive conversion from double to float as defined in the Java Language Specification: if this BigDecimal has too great a magnitude represent as a double, it will be converted to Double.NEGATIVE_INFINITY or Double.POSITIVE_INFINITY as appropriate. Note that even when the return value is finite, this conversion can lose information about the precision of the BigDecimal value.
You've hit the maximum possible precision for the double. If you would still like to store the value in primitives... one possible way is to store the part before the decimal point in a long
long l = 299792;
double d = 0.457999999984;
Since you are not using up (that's a bad choice of words) the precision for storing the decimal section, you can hold more digits of precision for the fractional component. This should be easy enough to do with some rounding etc..
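One way this split could be done is sketched below, using BigDecimal only for the parsing step. The class SplitDecimal and its two helpers are hypothetical names, and the sketch assumes the whole part fits in a long.

```java
import java.math.BigDecimal;

public class SplitDecimal {
    // Whole part, truncated toward zero; throws if it does not fit in a long.
    static long wholePart(String decimal) {
        return new BigDecimal(decimal).toBigInteger().longValueExact();
    }

    // Fractional part as a double; it carries the sign of the input.
    static double fractionPart(String decimal) {
        return new BigDecimal(decimal).remainder(BigDecimal.ONE).doubleValue();
    }

    public static void main(String[] args) {
        String num = "299792.457999999984";
        System.out.println("whole:    " + wholePart(num));
        System.out.println("fraction: " + fractionPart(num));
    }
}
```

The fraction keeps its trailing digits because all of the double's significand is now spent on the part after the decimal point. Any arithmetic on the pair still has to handle carries between the two parts.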

Why does Math.floor(1.23456789 * 1e8) / 1e8 return 1.23456788

Math.floor(1.23456789 * 1e8) / 1e8
returns:
1.23456788
Which is strange considering that:
Math.floor(1.23456789 * 1e9) / 1e9
returns:
1.23456789
and also
Math.floor(1.23456799 * 1e8) / 1e8
returns:
1.23456799
Any idea why this is happening and how to avoid it?
As answered by Daniel Centore, double precision values are imprecise. Here is a list of the actual values (to 20 digits) for double precision numbers. Shown are the two closest encodings for 1.23456789; the first one is closer. When multiplying by 1e8, the multiply doesn't round up. When multiplying by 1e9, the multiply rounds up to an exact integer value.
1.23456789 => 1.2345678899999998901 hex 3ff3c0ca4283de1b
1.23456789 => 1.2345678900000001121 hex 3ff3c0ca4283de1c
1.23456789*1e8 => 123456788.99999998510 hex 419d6f3453ffffff
1.23456789*1e9 => 1234567890.0000000000 hex 41d26580b4800000
Floating point arithmetic is imprecise. The conversion to binary and back to decimal can cause problems (as binary cannot represent most decimal fractions exactly) on top of the fact that floating point has limited precision.
You can use BigDecimal to get exact decimal arithmetic, but it is much slower. This is only noticeable if you are doing many calculations.
Edit: Here's a BigDecimal tutorial.
The Java double type (almost) conforms to an international standard called IEEE-754, for floating point numbers. The numbers that can be expressed in this type all have one thing in common - their representations in binary terminate after at most 53 significant digits.
Most numbers with terminating decimal representations do not have terminating binary representations, which means there's no double that stores them exactly. When you write a double literal in Java, the value stored in the double will generally not be the number you wrote in the literal - instead it will be the nearest available double.
The literal 1.23456789 is no exception. It falls neatly between two numbers that can be stored as double, and the exact values of those two double numbers are 1.2345678899999998900938180668163113296031951904296875 and 1.23456789000000011213842299184761941432952880859375. The rule is that the closer of those two numbers is chosen, so the literal 1.23456789 is stored as 1.2345678899999998900938180668163113296031951904296875.
The literal 1E8 can be stored exactly as a double, so the multiplication in your first example is 1.2345678899999998900938180668163113296031951904296875 times 100000000, which of course is 123456788.99999998900938180668163113296031951904296875. This number can't be stored exactly as a double. The nearest double below it is 123456788.99999998509883880615234375 and the nearest double above it is 123456789 exactly. However, the double below it is closer, so the value of the Java expression 1.23456789 * 1E8 is actually 123456788.99999998509883880615234375. When you apply the Math.floor method to this number, the result is exactly 123456788.
The literal 1E9 can be stored exactly as a double, so the multiplication in your second example is 1.2345678899999998900938180668163113296031951904296875 times 1000000000, which of course is 1234567889.9999998900938180668163113296031951904296875. This number can't be stored exactly as a double. The nearest double below it is 1234567889.9999997615814208984375 and the nearest double above it is 1234567890 exactly. But this time, the double above it is closer, so the value of the Java expression 1.23456789 * 1E9 is exactly 1234567890, which is unchanged by the Math.floor method.
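The two products described above can be checked directly. The following sketch prints the exact value of each intermediate double; the class name ProductRounding and the helper exact are invented for the example.

```java
import java.math.BigDecimal;

public class ProductRounding {
    // Exact decimal expansion of the value stored in a double.
    static String exact(double v) {
        return new BigDecimal(v).toPlainString();
    }

    public static void main(String[] args) {
        double times1e8 = 1.23456789 * 1e8;
        double times1e9 = 1.23456789 * 1e9;
        System.out.println("literal: " + exact(1.23456789));
        System.out.println("* 1e8:   " + exact(times1e8) + " -> floor " + Math.floor(times1e8));
        System.out.println("* 1e9:   " + exact(times1e9) + " -> floor " + Math.floor(times1e9));
        // The very next double above the 1e8 product is the integer itself.
        System.out.println("nextUp:  " + exact(Math.nextUp(times1e8)));
    }
}
```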
The second part of your question was how to avoid this. Well, if you want to do exact calculations involving numbers with terminating decimal representations, you must not store them in double variables. Instead, you can use the BigDecimal class, which lets you do things like this
BigDecimal a = new BigDecimal("1.23456789");
BigDecimal b = new BigDecimal("100000000");
BigDecimal product = a.multiply(b);
and the numbers are represented exactly.
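For the flooring in the question specifically, BigDecimal.setScale with RoundingMode.FLOOR truncates to a fixed number of decimal places without any multiplication or division. A hedged sketch, with made-up helper names, that contrasts it with the double version:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalFloor {
    // Exact: floor a decimal string to the given number of decimal places.
    static BigDecimal floorToPlaces(String decimal, int places) {
        return new BigDecimal(decimal).setScale(places, RoundingMode.FLOOR);
    }

    // The approach from the question, kept for comparison.
    static double floorWithDoubles(double v, double factor) {
        return Math.floor(v * factor) / factor;
    }

    public static void main(String[] args) {
        System.out.println("BigDecimal: " + floorToPlaces("1.23456789", 8));
        System.out.println("double:     " + floorWithDoubles(1.23456789, 1e8));
    }
}
```

Note that the value is passed as a String: constructing the BigDecimal from the double 1.23456789 would bring the binary approximation along with it.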

Double Precision when a float value is passed in double

I have one question regarding double precision. When a float value is assigned to a double, I get a different result. For example:
float f= 54.23f;
double d1 = f;
System.out.println(d1);
The output is 54.22999954223633. Can someone explain the reason behind this behaviour? Is it that double defaults to 14 decimal places of precision?
The same value is printed differently for float and double because the Java specification requires printing as many digits as needed to distinguish the value from adjacent representable values in the same type (per my answer here, and see the linked documentation for more precision in the definition).
Since float has fewer bits to represent values, and hence fewer values, they are spaced more widely apart, and you do not need as many digits to distinguish them. When you put the value into a double and print it, the Java rules require that more digits be printed so that the value is distinguished from nearby double values. The println function does not know that the value originally came from a float and does not contain as much information as can fit into a double.
54.23f is exactly 54.229999542236328125 (in hexadecimal, 0x1.b1d70ap+5). The float values just below and just above this are 54.2299957275390625 (0x1.b1d708p+5) and 54.23000335693359375 (0x1.b1d70cp+5). As you can see, printing “54.229999” would distinguish the value from 54.229995… and from 54.23…. However, the double values just below and just above 54.23f are 54.22999954223632101957264239899814128875732421875 and 54.22999954223633523042735760100185871124267578125. To distinguish the value, you need “54.22999954223633”.
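The effect is easy to reproduce side by side. In this sketch the class FloatPrinting and the helpers exact and viaDecimalString are illustrative names, not anything from the question.

```java
import java.math.BigDecimal;

public class FloatPrinting {
    // Exact decimal expansion of the stored value.
    static String exact(double v) {
        return new BigDecimal(v).toPlainString();
    }

    // Re-parse the float's short decimal string as a double.
    // This changes the value: it yields the double nearest to the printed text.
    static double viaDecimalString(float f) {
        return Double.parseDouble(Float.toString(f));
    }

    public static void main(String[] args) {
        float f = 54.23f;
        double widened = f; // same value, now held in a double
        System.out.println("Float.toString:  " + Float.toString(f));
        System.out.println("Double.toString: " + Double.toString(widened));
        System.out.println("exact value:     " + exact(widened));
        System.out.println("via string:      " + viaDecimalString(f));
    }
}
```

The first three lines describe one and the same number; only the number of digits shown differs. The last line is a different double, which is sometimes what is wanted when the float originally came from decimal text.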
This is because the float hides the extra decimals and double shows them. The double will represent the actual number quite precisely and shows more digits.
Try this:
System.out.println(f.doubleValue()); (you need to make it a Float first, of course)
So as you can see, the information is there, it is just rounded.
Hope this helps
This is due to the Internal Representation.
Floating-point numbers are typically packed into a computer datum as the sign bit, the exponent field, and the significand (mantissa), from left to right.
This is known as an accuracy problem.
The fact that floating-point numbers cannot precisely represent all real numbers, and that floating-point operations cannot precisely represent true arithmetic operations, leads to many surprising situations. This is related to the finite precision with which computers generally represent numbers.
It is not a problem; it is how double works. In most cases you do not need to handle it, because the precision of double is enough. Note that the difference between your number and the expected result is only around the seventh place after the decimal point, which reflects the precision of the original float rather than that of the double.
If you need arbitrarily good precision, use the java.math.BigDecimal class.
Or if you still want to use double. Do like this:
double d = 5.5451521841;
NumberFormat nf = new DecimalFormat("##.###");
System.out.println(nf.format(d));
Please let me know in case of any doubt.
Actually this is only about different visual representation or converting float / double to String. Let's take a look at internal binary representation
float f = 0.23f;
double d = f;
System.out.println(Integer.toBinaryString(Float.floatToIntBits(f)));
System.out.println(Long.toBinaryString(Double.doubleToLongBits(d)));
output
111110011010111000010100011111
11111111001101011100001010001111100000000000000000000000000000
This means that f was converted to d without any distortion; the significant digits are the same.
double and float represent numbers in different formats.
Because of this you are bound to find certain numbers that store perfectly in one format but not in the other. You happen to have found one that correctly fits in a float but does not fit exactly in a double.
This problem can also show itself when two different formatters are used.

Precision for Double.parseDouble() and String.valueOf()

Does the following statement hold for any double (Java primitive double precision IEEE-754) except NaN:
Double.parseDouble(String.valueOf(d)) == d
Said otherwise, does parsing a serialized (using String.valueOf()) double value always yield the exact original double?
With the exception of NaN as you've said, yes, that invariant should hold. If not, that's a JDK bug right there.
Double.toString says this in its Javadoc:
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0.
To summarize, it returns enough digits to identify this double uniquely, so Double.parseDouble should return the exact same double that was converted to a string.
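The invariant can be spot-checked empirically. The sketch below samples random bit patterns with a fixed seed; the class RoundTripCheck and the helper roundTrips are assumptions made for the example, and a random sample is evidence rather than proof.

```java
import java.util.Random;

public class RoundTripCheck {
    // True when printing and re-parsing gives back an equal double.
    // Note: == treats -0.0 and 0.0 as equal and is always false for NaN.
    static boolean roundTrips(double d) {
        return Double.parseDouble(String.valueOf(d)) == d;
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed so runs are repeatable
        int failures = 0;
        for (int i = 0; i < 1_000_000; i++) {
            // Random bit patterns cover the whole exponent range.
            double d = Double.longBitsToDouble(random.nextLong());
            if (Double.isNaN(d)) {
                continue;
            }
            if (!roundTrips(d)) {
                failures++;
            }
        }
        System.out.println("failures: " + failures);
    }
}
```

To compare bit patterns instead of numeric values (for instance to tell -0.0 from 0.0), Double.doubleToLongBits can be applied to both sides.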

Is it possible that a number exactly represented as float can NOT be exactly represented as double?

I have a question which arose from another question about precision of floating numbers.
Now, I know that floating points can not always be represented accurately and hence they are stored as the closest possible floating number that can be represented.
My question is actually about the difference in representation of float and double.
Where does this question arise from?
Suppose I do:
System.out.println(.475d+.075d);
then the output would not be 0.55 but 0.549999 (on my machine)
However, when I do :
System.out.println(.475f+.075f);
I get the correct answer, i.e. 0.55 (a little unexpected for me)
Until now I was under the impression that double has more precision than float (that is, double is accurate to a larger number of decimal places). So, if a value cannot be represented precisely as a double, then its float representation will also be inexact.
However the results I got are a little disturbing for me. I am confused if:
I have an incorrect understanding of what precision means?
float and double are represented differently, apart from the fact that double has more bits?
A number that can be represented as a float can be represented as a double too.
What you read is just formatted output, you don't read actual binary representation.
System.out.println(Long.toBinaryString(Double.doubleToRawLongBits(.475d + .075d)));
// 11111111100001100110011001100110011001100110011001100110011001
System.out.println(Integer.toBinaryString(Float.floatToRawIntBits(.475f + .075f)));
// 111111000011001100110011001101
double d = .475d + .075d;
System.out.println(d);
// 0.5499999999999999
System.out.println((float)d);
// 0.55 (as expected)
System.out.println((double)(float)d);
// 0.550000011920929
System.out.println( .475f + .075f == 0.550000011920929d);
// true
Precision just means more bits. A number that cannot be represented as a float may have an exact representation as a double, but such cases are a vanishingly small fraction of all possible values.
For the simple cases like 0.1, that is not representable as a fixed-length floating-point number, no matter what the number of bits available. This is the same as saying that a fraction such as 1/7 cannot be represented exactly in decimal, regardless of the number of digits you are allowed to use (as long as the number of digits is finite). You can approximate it as 0.142857142857142857... repeating over and over again, but you will never be able to write it EXACTLY no matter how long you go on.
Conversely, if a number is representable exactly as a float, it will also be representable exactly as a double. A double has a larger exponent range and more mantissa bits.
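That every finite float has an exact double counterpart can be checked without trusting the widening conversion itself, by rebuilding the float's value from its IEEE 754 bit fields. The following is a sketch under that idea; all names in it are hypothetical and it handles finite values only.

```java
import java.math.BigDecimal;

public class WideningIsExact {
    // Build the exact value of a finite float from its raw bits,
    // without going through double arithmetic.
    static BigDecimal exactFromBits(float f) {
        int bits = Float.floatToIntBits(f);
        int exponentField = (bits >>> 23) & 0xFF;
        int fraction = bits & 0x7FFFFF;
        // Subnormals have no implicit leading 1 and use the minimum exponent.
        int significand = (exponentField == 0) ? fraction : (fraction | 0x800000);
        int power = (exponentField == 0 ? 1 : exponentField) - 150; // value = significand * 2^power
        BigDecimal scale = (power >= 0)
                ? BigDecimal.valueOf(2).pow(power)
                : new BigDecimal("0.5").pow(-power);
        BigDecimal value = BigDecimal.valueOf(significand).multiply(scale);
        return (bits < 0) ? value.negate() : value;
    }

    // True when the widened double holds exactly the float's value.
    static boolean widensExactly(float f) {
        return exactFromBits(f).compareTo(new BigDecimal((double) f)) == 0;
    }

    public static void main(String[] args) {
        for (float f : new float[] {0.1f, 0.475f, 54.23f, Float.MIN_VALUE, Float.MAX_VALUE}) {
            System.out.println(f + " widens exactly: " + widensExactly(f));
        }
    }
}
```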
For your example, the cause of the apparent discrepancy is that in float, the difference between 0.475 and its float representation was in the 'right' direction so that when truncation occurred it went how you expected it. When increasing the precision available, the representation was "closer" to 0.475 but now on the opposite side. As a gross example, let's say that the closest possible float was 0.475006 but in a double the closest possible value was 0.474999. This would give you the results you see.
Edit: Here's the results of a quick experiment:
public class Test {
public static void main(String[] args)
{
float f = 0.475f;
double d = 0.475d;
System.out.printf("%20.16f", f);
System.out.printf("%20.16f", d);
}
}
Output:
0.4749999940395355 0.4750000000000000
What this means is that the floating-point representation of the number 0.475, if you had a huge number of bits, would be just a tiny bit less than 0.475. This is seen in the double representation. However, the first 'wrong' bit occurs so far to the right that when truncated to fit in a float, it just happens to work out to 0.475. This is purely an accident.
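The printf experiment shows only 16 decimals. To see every digit of the operands and of both sums, the values can be passed through new BigDecimal(double), which does not round. This is a sketch with invented names (ExactSums, exact).

```java
import java.math.BigDecimal;

public class ExactSums {
    // Exact decimal expansion of the stored value (floats widen losslessly).
    static String exact(double v) {
        return new BigDecimal(v).toPlainString();
    }

    public static void main(String[] args) {
        float floatSum = .475f + .075f;
        double doubleSum = .475d + .075d;
        System.out.println("0.475f     = " + exact(.475f));
        System.out.println("0.475d     = " + exact(.475d));
        System.out.println("float sum  = " + exact(floatSum) + " prints as " + floatSum);
        System.out.println("double sum = " + exact(doubleSum) + " prints as " + doubleSum);
    }
}
```

The float sum lands slightly above 0.55 and the double sum slightly below it. Neither is exactly 0.55; the float result simply happens to be the float closest to 0.55, so it prints as 0.55.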
If one regards that floating-point types actually represent ranges of values, rather than discrete values (e.g. 0.1f doesn't represent 13421773/134217728, but rather "something between 13421772.5/134217728 and 13421773.5/134217728"), conversions from double to float will usually be accurate, while conversions from float to double will usually not. Unfortunately, Java allows the usually-inaccurate conversions to be performed implicitly, while requiring a typecast in the usually-accurate direction.
For every value of type float, there exists a value of type double whose range is centered about the center of the float's range. That does not mean the double is an accurate representation of the value in the float. For example, converting 0.1f to double yields a value meaning "something between 13421772.9999999/134217728 and 13421773.0000001/134217728", a value which is off by over a million times the implied tolerance.
For almost every value of type double, there exists a value of type float whose range completely includes the range implied by the double. The only exceptions are values whose range is centered precisely on the boundary between two float values. Converting such values to float would require that the system chose one range or the other; if the system rounds up when the double actually represented a number below the center of its range, or vice versa, the range of the float would not totally encompass that of the double. In practical terms, though, this is a non-issue, since it means that instead of a float cast from a double representing a range like (13421772.5/134217728 to 13421773.5/134217728) it would represent a range like (13421772.4999999/134217728 to 13421773.5000001/134217728). Compared with the horrendous imprecision resulting from a float to double cast, that tiny imprecision is nothing.
BTW, returning to the particular numbers you are using, when you do your calculations as float, the computations are:
0.075f = 20132660±½ / 268435456
0.475f = 31876710±½ / 67108864
Sum = 18454938±½ / 33554432
In other words, the sum represents a number somewhere between roughly 0.54999999701 and 0.55000002682. The most natural representation is 0.55 (since the actual value could be more or less than that, additional digits would be meaningless).
