Precision for Double.parseDouble() and String.valueOf() - java

Does the following statement hold for any double (Java primitive double precision IEEE-754) except NaN:
Double.parseDouble(String.valueOf(d)) == d
Said otherwise, does parsing a serialized (using String.valueOf()) double value always yield the exact original double?

With the exception of NaN as you've said, yes, that invariant should hold. If not, that's a JDK bug right there.
Double.toString says this in its Javadoc:
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0.
To summarize, it returns enough digits to identify this double uniquely, so Double.parseDouble should return the exact same double that was converted to a string.
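The invariant can be spot-checked with a small program. This is only a sketch, not a proof: the class name RoundTripCheck and the helper roundTrips are made up for illustration, and the comparison uses Double.doubleToLongBits so that -0.0 and 0.0 are told apart (== would treat them as equal).

```java
import java.util.Random;

final class RoundTripCheck {

    // True if d survives String.valueOf -> Double.parseDouble bit for bit.
    static boolean roundTrips(double d) {
        double parsed = Double.parseDouble(String.valueOf(d));
        return Double.doubleToLongBits(parsed) == Double.doubleToLongBits(d);
    }

    public static void main(String[] args) {
        double[] edgeCases = {
            0.0, -0.0, 0.1, 206.64,
            Double.MIN_VALUE, Double.MIN_NORMAL, Double.MAX_VALUE,
            Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY
        };
        for (double d : edgeCases) {
            System.out.println(d + " round-trips: " + roundTrips(d));
        }

        // Random bit patterns cover every exponent; NaN is skipped,
        // as in the question.
        Random random = new Random(42);
        int failures = 0;
        for (int i = 0; i < 1_000_000; i++) {
            double d = Double.longBitsToDouble(random.nextLong());
            if (!Double.isNaN(d) && !roundTrips(d)) {
                failures++;
            }
        }
        System.out.println("failures: " + failures);
    }
}
```

A sample of a million values says nothing about the remaining bit patterns, so the guarantee still rests on the Javadoc quoted above.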

Related

How does Double.toString() work if a fraction number cannot be precisely represented in binary?

I am unable to understand how Double.toString() works in Java/JVM.
My understanding is that in general fraction numbers cannot be represented precisely in floating point types such as Double and Float. For example, the binary representation of 206.64 would be 206.6399999999999863575794734060764312744140625. Then how come (206.64).toString() returns "206.64" instead of "206.6399999999999863575794734060764312744140625"?
Test code in Kotlin.
@Test
fun testBigDecimalToString() {
    val value = 206.64
    val expected = "206.64"
    val bigDecimal = BigDecimal(value)
    assertEquals(expected, value.toString()) // success
    assertEquals(expected, bigDecimal.toString()) // failed. Actual: 206.6399999999999863575794734060764312744140625
}
The number of digits you see when a float or a double is printed is a consequence of Java’s rules for default conversion of float and double to decimal.
Java’s default formatting for floating-point numbers uses the fewest significant decimal digits needed to distinguish the number from nearby representable numbers. [1]
In your example, 206.64 in source text is converted to the double value 206.6399999999999863575794734060764312744140625, because, of all the values representable in the double type, that one is closest to 206.64. The next lower and next higher values are 206.639999999999957935870043002068996429443359375 and 206.640000000000014779288903810083866119384765625.
When printing this value, Java only needs to print “206.64”, because that is enough that we can pick out the double value 206.6399999999999863575794734060764312744140625 from its neighbors 206.639999999999957935870043002068996429443359375 and 206.640000000000014779288903810083866119384765625. Note that, starting from the end of the 9s in 206.63999…, the stored value differs from 206.64 by .1364…, whereas the next higher value, 206.64000…, differs by .1477…. So, when Java prints “206.64”, it means the value of the double being printed is the nearest representable value, and that is the 206.6399999999999863575794734060764312744140625 value, not the farther 206.640000000000014779288903810083866119384765625 value.
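The three values can be reproduced with Math.nextDown, Math.nextUp and the BigDecimal(double) constructor, which exposes the exact binary value. A minimal sketch; the class name NeighbourDemo and the helper exact are invented for this illustration.

```java
import java.math.BigDecimal;

final class NeighbourDemo {

    // Exact decimal expansion of the binary value held by d.
    static String exact(double d) {
        return new BigDecimal(d).toPlainString();
    }

    public static void main(String[] args) {
        double value = 206.64;
        System.out.println("below:    " + exact(Math.nextDown(value)));
        System.out.println("value:    " + exact(value));
        System.out.println("above:    " + exact(Math.nextUp(value)));
        // Double.toString stops as soon as the digits identify `value`.
        System.out.println("shortest: " + Double.toString(value));
        // Parsing the short string lands on the middle value again.
        System.out.println(Double.parseDouble("206.64") == value);
    }
}
```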
Footnote
[1] The rule for Java SE 10 can be found in the documentation for java.lang.Float, in the toString(float f) section. The double documentation is similar. The passage, with the most relevant part in bold, is:
Returns a string representation of the float argument. All characters mentioned below are ASCII characters.
If the argument is NaN, the result is the string "NaN".
Otherwise, the result is a string that represents the sign and magnitude (absolute value) of the argument. If the sign is negative, the first character of the result is '-' ('\u002D'); if the sign is positive, no sign character appears in the result. As for the magnitude m:
If m is infinity, it is represented by the characters "Infinity"; thus, positive infinity produces the result "Infinity" and negative infinity produces the result "-Infinity".
If m is zero, it is represented by the characters "0.0"; thus, negative zero produces the result "-0.0" and positive zero produces the result "0.0".
If m is greater than or equal to 10^-3 but less than 10^7, then it is represented as the integer part of m, in decimal form with no leading zeroes, followed by '.' ('\u002E'), followed by one or more decimal digits representing the fractional part of m.
If m is less than 10^-3 or greater than or equal to 10^7, then it is represented in so-called "computerized scientific notation." Let n be the unique integer such that 10^n ≤ m < 10^(n+1); then let a be the mathematically exact quotient of m and 10^n so that 1 ≤ a < 10. The magnitude is then represented as the integer part of a, as a single decimal digit, followed by '.' ('\u002E'), followed by decimal digits representing the fractional part of a, followed by the letter 'E' ('\u0045'), followed by a representation of n as a decimal integer, as produced by the method Integer.toString(int).
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type float. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument f. Then f must be the float value nearest to x; or, if two float values are equally close to x, then f must be one of them and the least significant bit of the significand of f must be 0.
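The shapes described in the quoted rules are easy to see by printing a few values on either side of the 10^-3 and 10^7 thresholds. A small sketch; the class name ToStringShapes is arbitrary and the sample values were picked only to hit each rule once.

```java
final class ToStringShapes {
    public static void main(String[] args) {
        System.out.println(Double.toString(Double.NaN));               // NaN
        System.out.println(Double.toString(Double.NEGATIVE_INFINITY)); // -Infinity
        System.out.println(Double.toString(-0.0));                     // -0.0
        System.out.println(Double.toString(100.0));                    // 100.0 (at least one fractional digit)
        System.out.println(Double.toString(0.001));                    // 0.001 (still plain notation)
        System.out.println(Double.toString(0.0001));                   // 1.0E-4 (below 10^-3)
        System.out.println(Double.toString(9999999.0));                // 9999999.0 (below 10^7)
        System.out.println(Double.toString(1.0e7));                    // 1.0E7 (at 10^7)
        System.out.println(Float.toString(1.0e7f));                    // 1.0E7 (same rule for float)
    }
}
```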
I am somewhat novice, so I hope someone with more experience can answer more thoroughly, but here is what I theorize is the reason...
Formatting
Although this is for the .NET framework and not specifically Java, I imagine that they work similarly: the toString method uses an optional formatter input, and most likely Java uses something similar, formatting the double to a close approximation in the toString method.
Considering that Oracle specifically states that toString should be concise and easy-to-read, likely such a method is implemented for Double.toString().
Only Necessary Digits to Distinguish...
This is about as much documentation as I could find on the specifics of the Double.toString() method -- note the last paragraph:
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0.
I am curious what it means by "adjacent values of type double" (other variables?), but it seems to also concur with the above -- toString and other methods likely only use as few digits as possible to uniquely identify the double, rounding when the number is arbitrarily close enough, as in the case of 23.675999999999 being "close enough" to 23.676.
Or I could be wildly misunderstanding the documentation.

Why do Scanner.nextDouble() and System.out.println behave differently regarding locale?

I teach Java to beginning programmers and many of my Dutch students get confused the first time they have to use Scanner to input floating point values.
The default behavior of nextDouble() is to consult the locale setting of the computer. When the setting is Dutch, this means that a decimal comma has to be used.
On the other hand, System.out.println does not seem to consider the locale and uses a decimal point.
Consider for example the following output (user input in bold):
Give a number below 10.0
10,5
Sorry, 10.5 is not below 10.0
(When one would type 10.5, an InputMismatchException will be thrown.) The output above is produced by the following fragment
Scanner scanner = new Scanner( System.in );
double limit = 10.0;
System.out.println( "Give a number below "+limit );
double x = scanner.nextDouble();
if (x >= limit ) {
    System.out.println( "Sorry, "+x+" is not below "+limit );
}
Is this an inconsistency in the library or do I use it in the wrong way?
The argument provided to this invocation
System.out.println( "Sorry, "+x+" is not below "+limit );
is a String that is the result of concatenating some String literals and some double values.
The JLS states
If only one operand expression is of type String, then string
conversion (§5.1.11) is performed on the other operand to produce a
string at run time.
and then
A value x of primitive type T is first converted to a reference value
as if by giving it as an argument to an appropriate class instance
creation expression (§15.9):
If T is double, then use new Double(x).
It then goes on to say
Otherwise, the conversion is performed as if by an invocation of the
toString method of the referenced object with no arguments; but if the
result of invoking the toString method is null, then the string "null"
is used instead.
The javadoc of Double#toString explains the format
Returns a string representation of this Double object. The primitive
double value represented by this object is converted to a string
exactly as if by the method toString of one argument.
which is the overloaded toString(double) method
Returns a string representation of the double argument. All characters
mentioned below are ASCII characters.
If the argument is NaN, the result is the string "NaN".
Otherwise, the result is a string that represents the sign and magnitude (absolute value) of the argument. If the sign is negative,
the first character of the result is '-' ('\u002D'); if the sign is
positive, no sign character appears in the result. As for the
magnitude m:
If m is infinity, it is represented by the characters "Infinity"; thus, positive infinity produces the result "Infinity" and
negative infinity produces the result "-Infinity".
If m is zero, it is represented by the characters "0.0"; thus, negative zero produces the result "-0.0" and positive zero produces
the result "0.0".
If m is greater than or equal to 10^-3 but less than 10^7, then it is represented as the integer part of m, in decimal form with no
leading zeroes, followed by '.' ('\u002E'), followed by one or more
decimal digits representing the fractional part of m.
If m is less than 10^-3 or greater than or equal to 10^7, then it is represented in so-called "computerized scientific notation." Let
n be the unique integer such that 10^n ≤ m < 10^(n+1); then let a be the
mathematically exact quotient of m and 10^n so that 1 ≤ a < 10. The
magnitude is then represented as the integer part of a, as a single
decimal digit, followed by '.' ('\u002E'), followed by decimal digits
representing the fractional part of a, followed by the letter 'E'
('\u0045'), followed by a representation of n as a decimal integer, as
produced by the method Integer.toString(int).
How many digits must be printed for the fractional part of m or a?
There must be at least one digit to represent the fractional part, and
beyond that as many, but only as many, more digits as are needed to
uniquely distinguish the argument value from adjacent values of type
double. That is, suppose that x is the exact mathematical value
represented by the decimal representation produced by this method for
a finite nonzero argument d. Then d must be the double value nearest
to x; or if two double values are equally close to x, then d must be
one of them and the least significant bit of the significand of d must
be 0.
To create localized string representations of a floating-point value,
use subclasses of NumberFormat.
which describes how to properly format a double.
Alternatively, use printf and provide the appropriate format pattern for floating point values.
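To make the difference concrete, here is a rough sketch that prints the same value through string concatenation, NumberFormat and String.format. The class name LocalizedOutput and the helper localized are hypothetical, and the Dutch locale is built with Locale.forLanguageTag("nl-NL") purely as an example; the localized results assume the JDK ships locale data for Dutch.

```java
import java.text.NumberFormat;
import java.util.Locale;

final class LocalizedOutput {

    // Formats x the way the given locale writes decimal numbers.
    static String localized(double x, Locale locale) {
        return NumberFormat.getInstance(locale).format(x);
    }

    public static void main(String[] args) {
        Locale dutch = Locale.forLanguageTag("nl-NL");
        double x = 10.5;

        // Concatenation goes through Double.toString: never localized.
        System.out.println("Sorry, " + x + " is not below 10.0");

        // NumberFormat follows the locale it was created for.
        System.out.println("Sorry, " + localized(x, dutch) + " is not below 10,0");

        // String.format and printf accept an explicit locale as well.
        System.out.println(String.format(dutch, "%.1f", x));
        System.out.println(String.format(Locale.ROOT, "%.1f", x));
    }
}
```

Passing the locale explicitly keeps the output independent of the machine's settings, which may be easier to explain to beginners than relying on the default.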
String concatenation, Double.toString and Double.valueOf all use the decimal point without a thousands separator, the same notation as Java source code. Fortunately this is a basic, non-localized representation, useful for transporting data as text. String.format and printf are different: unless an explicit Locale is passed, they format floating-point conversions such as %f with the default locale.
Scanner, NumberFormat and MessageFormat involve localization, which is sometimes a bit awkward. They are meant for high-level code that interacts with users, and they handle the thousands separator too.
One has to choose one of the two. Using localized numbers is, at the very least, a good exercise. Or, when being lazy, one might use Double.parseDouble instead of Scanner, or:
Scanner scanner = new Scanner(System.in).useLocale(Locale.US);
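The effect of useLocale can be tried without a keyboard by scanning a String. In this sketch the class name ScannerLocaleDemo and the helper readDouble are invented, and the Dutch locale is again only an example that assumes the JDK's locale data is available.

```java
import java.util.Locale;
import java.util.Scanner;

final class ScannerLocaleDemo {

    // Reads one double from text using the number format of the given locale.
    static double readDouble(String text, Locale locale) {
        try (Scanner scanner = new Scanner(text).useLocale(locale)) {
            return scanner.nextDouble();
        }
    }

    public static void main(String[] args) {
        // Decimal point, as in Java source code.
        System.out.println(readDouble("10.5", Locale.US));
        // Decimal comma, as Dutch users would type it.
        System.out.println(readDouble("10,5", Locale.forLanguageTag("nl-NL")));
        // Double.parseDouble ignores the locale and expects a decimal point.
        System.out.println(Double.parseDouble("10.5"));
    }
}
```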
Another format/value issue is the use of double, which is a binary approximation and, unlike the fixed-point BigDecimal, does not keep track of a decimal precision. Doubles should be used for fast calculations, and BigDecimal where exact decimal values matter, as in financial code.
new BigDecimal("0.20") // Precision of 2, no loss.
0.20 // Not exactly 0.20, no precision on printing

String to double with precision

I have a String that is read from a file, with the value: 38.739793110376837
When I convert this value to double using:
Double.parseDouble("38.739793110376837");
I get the result: 38.73979311037684
How can I have the original value in a double variable?
I don't want to use the BigDecimal data type.
How can i have the original value in a double variable?
You can't. The closest double to your original value is exactly 38.739793110376837148578488267958164215087890625
That's being converted to "38.73979311037684" because that's the shortest value which uniquely identifies that double value. From the documentation:
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0.
If you want to retain exactly your original value, you should be using BigDecimal.
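The difference between the two types can be seen side by side. A minimal sketch, with the class name KeepAllDigits chosen arbitrarily:

```java
import java.math.BigDecimal;

final class KeepAllDigits {
    public static void main(String[] args) {
        String text = "38.739793110376837";

        double asDouble = Double.parseDouble(text);
        BigDecimal asDecimal = new BigDecimal(text);

        // Shortest string that identifies the nearest double.
        System.out.println(asDouble);
        // Exact value actually stored in that double.
        System.out.println(new BigDecimal(asDouble));
        // BigDecimal built from the String keeps the digits as written.
        System.out.println(asDecimal);
    }
}
```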

float to double giving Strange results [duplicate]

I came across this behavior of float and double during type casting.
I modified my actual statements to make them easier to understand.
1
System.out.println((double)((float)(128.12301)));//Output:128.12301635742188
Same output all the time.
2
System.out.println((double)((float)(128888.12301)));//Output:128888.125
Both outputs look strange to me; I can't understand how this works.
Can anyone help me out?
There are several steps involved here, each with different numbers. Let's split the code up for each statement:
double original = 128.12301; // Or 128888.12301
float floatValue = (float) original;
double backToDouble = (double) floatValue;
System.out.println(backToDouble);
So for each number, the steps are:
1. Compile time: Convert the decimal value in the source code into the nearest exact double value
2. Execution time: Convert the double value to the nearest exact float value
3. Execution time: Convert the float value into a double value (this never loses any information)
4. Execution time: Convert the final double value into a string
Steps 1 and 2 can lose information; step 4 doesn't always print the exact value - it just follows what Double.toString(double) does.
So let's take 128.12301 as an example. That's converted at compile-time to exactly 128.123009999999993624442140571773052215576171875. Then the conversion to float yields exactly 128.123016357421875. So after the conversion back to double (which preserves the value) we print out 128.123016357421875. That prints 128.12301635742188 because that's the fewest digits it can print out without being ambiguous between that value and the nearest double value greater than or less than it.
Now with 128888.12301, the exact double value is 128888.123009999995701946318149566650390625 - and the closest float to that is exactly 128888.125. After converting that back to a double, the exact value of that double is printed out, because there are other double values near it and no shorter string would tell it apart from them.
Basically, the result will depend on how many significant digits you've included to start with, and how much information is lost when it rounds to the nearest double and then to the nearest float.
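The intermediate values can be inspected by wrapping each one in a BigDecimal, whose double constructor keeps the exact binary value. A sketch only; the class name FloatRoundTrip and the helper throughFloat are made up for the example.

```java
import java.math.BigDecimal;

final class FloatRoundTrip {

    // Narrow to float, then widen back to double (the widening is exact).
    static double throughFloat(double original) {
        float narrowed = (float) original;
        return (double) narrowed;
    }

    public static void main(String[] args) {
        for (double original : new double[] {128.12301, 128888.12301}) {
            double back = throughFloat(original);
            System.out.println("double, exact:      " + new BigDecimal(original));
            System.out.println("after float, exact: " + new BigDecimal(back));
            System.out.println("printed:            " + back);
        }
    }
}
```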
System.out.println(number) will go through Double.toString(), which is a fairly complex method (as seen in its documentation) and will not always behave as you'd expect. It basically gives the shortest string which uniquely determines number.
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0.
A float is a 32 bit IEEE 754 floating point.
A double is a 64 bit IEEE 754 floating point.
It is the same for float and double, both are binary floating point types, but double has more precision than float.

Regarding BigDecimal

I run the Java print statements below for this double variable:
double test=58.15;
When I do System.out.println(test); and System.out.println(new Double(test).toString()); it prints 58.15.
When I do System.out.println(new BigDecimal(test)) I get the value below:
58.14999999999999857891452847979962825775146484375
I understand that the value of the double variable test is stored internally as 58.1499999... But with the two System.out.println calls below I get 58.15 as output and not 58.1499999...
System.out.println(test);
System.out.println(new Double(test).toString());
It prints the output as 58.15 for the above two.
Are the System.out.println statements above rounding the value 58.1499999... and printing it as 58.15?
System.out.println(new BigDecimal("58.15"));
To construct a BigDecimal from a hard-coded constant, you must always use one of the constants in the class (ZERO, ONE, or TEN) or one of the string constructors. The reason is that once you put the value in a double, you've already lost precision that can never be regained.
EDIT: polygenelubricants is right. Specifically, you're using Double.toString or equivalent. To quote from there:
How many digits must be printed for
the fractional part of m or a? There
must be at least one digit to
represent the fractional part, and
beyond that as many, but only as many,
more digits as are needed to uniquely
distinguish the argument value from
adjacent values of type double. That
is, suppose that x is the exact
mathematical value represented by the
decimal representation produced by
this method for a finite nonzero
argument d. Then d must be the double
value nearest to x; or if two double
values are equally close to x, then d
must be one of them and the least
significant bit of the significand of
d must be 0.
Yes, println (or more precisely, Double.toString) rounds. For proof, System.out.println(.1D); prints 0.1, even though 0.1 cannot be represented exactly in binary.
Also, when using BigDecimal, don't use the double constructor, because that would attempt to precisely represent an imprecise value. Use the String constructor instead.
out.println and Double.toString() use the format specified in Double.toString(double).
new BigDecimal(double) keeps the exact value of the double, as described in its javadoc. Since .15 does not have an exact binary representation, that value is only close to 58.15, and toString() on the BigDecimal prints every digit of its exact decimal expansion, whereas Double.toString stops at the shortest string that identifies the double.
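The ways of building the BigDecimal discussed in these answers can be compared directly. A small sketch with an arbitrary class name (BigDecimalConstructors); BigDecimal.valueOf(double) is not mentioned above but is included because it goes through Double.toString.

```java
import java.math.BigDecimal;

final class BigDecimalConstructors {
    public static void main(String[] args) {
        double test = 58.15;

        // Double.toString: shortest digits that identify the double.
        System.out.println(test);
        // double constructor: exact binary value, slightly below 58.15.
        System.out.println(new BigDecimal(test));
        // String constructor: exactly 58.15.
        System.out.println(new BigDecimal("58.15"));
        // valueOf(double) uses Double.toString, so it also gives 58.15.
        System.out.println(BigDecimal.valueOf(test));
    }
}
```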
