Why do Scanner.nextDouble() and System.out.println behave differently regarding locale?

I teach Java to beginning programmers and many of my Dutch students get confused the first time they have to use Scanner to input floating point values.
The default behavior of nextDouble() is to consult the locale setting of the computer. When the setting is Dutch, this means that a decimal comma has to be used.
On the other hand, System.out.println does not seem to consider the locale and uses a decimal point.
Consider for example the following output (the second line is the user's input):
Give a number below 10.0
10,5
Sorry, 10.5 is not below 10.0
(Typing 10.5 instead throws an InputMismatchException.) The output above is produced by the following fragment:
Scanner scanner = new Scanner( System.in );
double limit = 10.0;
System.out.println( "Give a number below "+limit );
double x = scanner.nextDouble();
if (x >= limit ) {
    System.out.println( "Sorry, "+x+" is not below "+limit );
}
Is this an inconsistency in the library or do I use it in the wrong way?

The argument provided to this invocation
System.out.println( "Sorry, "+x+" is not below "+limit );
is a String that is the result of concatenating some String literals and some double values.
The JLS states
If only one operand expression is of type String, then string
conversion (§5.1.11) is performed on the other operand to produce a
string at run time.
and then
A value x of primitive type T is first converted to a reference value
as if by giving it as an argument to an appropriate class instance
creation expression (§15.9):
If T is double, then use new Double(x).
It then goes on to say
Otherwise, the conversion is performed as if by an invocation of the
toString method of the referenced object with no arguments; but if the
result of invoking the toString method is null, then the string "null"
is used instead.
The javadoc of Double#toString explains the format
Returns a string representation of this Double object. The primitive
double value represented by this object is converted to a string
exactly as if by the method toString of one argument.
which is the overloaded toString(double) method
Returns a string representation of the double argument. All characters
mentioned below are ASCII characters.
If the argument is NaN, the result is the string "NaN".
Otherwise, the result is a string that represents the sign and magnitude (absolute value) of the argument. If the sign is negative,
the first character of the result is '-' ('\u002D'); if the sign is
positive, no sign character appears in the result. As for the
magnitude m:
If m is infinity, it is represented by the characters "Infinity"; thus, positive infinity produces the result "Infinity" and
negative infinity produces the result "-Infinity".
If m is zero, it is represented by the characters "0.0"; thus, negative zero produces the result "-0.0" and positive zero produces
the result "0.0".
If m is greater than or equal to 10^-3 but less than 10^7, then it is represented as the integer part of m, in decimal form with no
leading zeroes, followed by '.' ('\u002E'), followed by one or more
decimal digits representing the fractional part of m.
If m is less than 10^-3 or greater than or equal to 10^7, then it is represented in so-called "computerized scientific notation." Let
n be the unique integer such that 10^n ≤ m < 10^(n+1); then let a be the
mathematically exact quotient of m and 10^n so that 1 ≤ a < 10. The
magnitude is then represented as the integer part of a, as a single
decimal digit, followed by '.' ('\u002E'), followed by decimal digits
representing the fractional part of a, followed by the letter 'E'
('\u0045'), followed by a representation of n as a decimal integer, as
produced by the method Integer.toString(int).
How many digits must be printed for the fractional part of m or a?
There must be at least one digit to represent the fractional part, and
beyond that as many, but only as many, more digits as are needed to
uniquely distinguish the argument value from adjacent values of type
double. That is, suppose that x is the exact mathematical value
represented by the decimal representation produced by this method for
a finite nonzero argument d. Then d must be the double value nearest
to x; or if two double values are equally close to x, then d must be
one of them and the least significant bit of the significand of d must
be 0.
To create localized string representations of a floating-point value,
use subclasses of NumberFormat.
which describes how to properly format a double.
Alternatively, use printf and provide the appropriate format pattern for floating point values.
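As a rough sketch of the two suggestions above, the snippet below formats the same double once through plain concatenation and once through NumberFormat and String.format with an explicit locale. The class name LocalizedOutput, the helper localized, and the choice of the nl-NL locale are only assumptions for illustration.

```java
import java.text.NumberFormat;
import java.util.Locale;

class LocalizedOutput {
    // Formats a value with the decimal and grouping symbols of the given locale.
    static String localized(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    public static void main(String[] args) {
        double x = 10.5;
        // Assumed stand-in for the locale configured on the students' machines.
        Locale dutch = Locale.forLanguageTag("nl-NL");

        // String conversion goes through Double.toString and always uses '.'.
        System.out.println("concatenation: " + x);
        // NumberFormat and String.format follow the locale that is passed in.
        System.out.println("NumberFormat:  " + localized(x, dutch));
        System.out.println(String.format(dutch, "format:        %.1f", x));
    }
}
```

Passing the locale explicitly keeps the example independent of the machine's settings; in a real program you would probably pass Locale.getDefault() so that input and output agree.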

String concatenation, Double.toString, String.valueOf, Double.parseDouble and Double.valueOf all use the decimal point without a thousands separator, the same notation as Java source code. Fortunately this is a basic non-localized representation, well suited to textual transport of data. Note that String.format and printf format %f with the default locale unless you pass one explicitly, for example Locale.ROOT.
Scanner, NumberFormat and MessageFormat involve localization, which is sometimes a bit awkward. They are meant for high-level code that interacts with the user, and they handle the thousands separator too.
You have to pick one of the two. Using localized numbers at least once is a good exercise. Or, to take the lazy route, you can use Double.parseDouble instead of Scanner, or:
Scanner scanner = new Scanner(System.in).useLocale(Locale.US);
Another format/value issue is the use of double, which is a binary approximation and does not keep track of a decimal precision the way the fixed-point BigDecimal does. Use double for fast calculations and BigDecimal for financial precision.
new BigDecimal("0.20") // Precision of 2, no loss.
0.20 // Not exactly 0.20, no precision on printing
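To make the Scanner option concrete, here is a minimal sketch that reads the same kind of input under two locales. It scans a String instead of System.in so it can be run without typing anything; the class name ScannerLocaleDemo and the helper readDouble are made up for this example.

```java
import java.util.Locale;
import java.util.Scanner;

class ScannerLocaleDemo {
    // Reads one double from the text, interpreting it under the given locale.
    static double readDouble(String text, Locale locale) {
        try (Scanner scanner = new Scanner(text).useLocale(locale)) {
            return scanner.nextDouble();
        }
    }

    public static void main(String[] args) {
        // With Locale.US the decimal point is accepted, whatever the machine's locale is.
        System.out.println(readDouble("10.5", Locale.US));
        // With a Dutch locale the decimal comma is expected instead.
        System.out.println(readDouble("10,5", Locale.forLanguageTag("nl-NL")));
        // Double.parseDouble ignores the locale and always expects a decimal point.
        System.out.println(Double.parseDouble("10.5"));
    }
}
```

All three lines print 10.5, because the output goes through Double.toString in every case.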

Related

How does Double.toString() work if a fraction number cannot be precisely represented in binary?

I am unable to understand how Double.toString() works in Java/JVM.
My understanding is that in general fraction numbers cannot be represented precisely in floating point types such as Double and Float. For example, the binary representation of 206.64 would be 206.6399999999999863575794734060764312744140625. Then how come (206.64).toString() returns "206.64" instead of "206.6399999999999863575794734060764312744140625"?
Test code in Kotlin.
@Test
fun testBigDecimalToString() {
    val value = 206.64
    val expected = "206.64"
    val bigDecimal = BigDecimal(value)
    assertEquals(expected, value.toString()) // success
    assertEquals(expected, bigDecimal.toString()) // failed. Actual: 206.6399999999999863575794734060764312744140625
}

The number of digits you see when a float or a double is printed is a consequence of Java’s rules for default conversion of float and double to decimal.
Java’s default formatting for floating-point numbers uses the fewest significant decimal digits needed to distinguish the number from nearby representable numbers (see footnote 1).
In your example, 206.64 in source text is converted to the double value 206.6399999999999863575794734060764312744140625, because, of all the values representable in the double type, that one is closest to 206.64. The next lower and next higher values are 206.639999999999957935870043002068996429443359375 and
206.640000000000014779288903810083866119384765625.
When printing this value, Java only needs to print “206.64”, because that is enough that we can pick out the double value 206.6399999999999863575794734060764312744140625 from its neighbors 206.639999999999957935870043002068996429443359375 and
206.640000000000014779288903810083866119384765625. Note that, starting from the end of the 9s in 206.63999…, that first value differs from 206.64 by .1364…, whereas the third value, 206.64000…, differs by .1477…. So, when Java prints “206.64”, it means the value of the double being printed is the nearest representable value, and that is the 206.6399999999999863575794734060764312744140625 value, not the farther 206.640000000000014779288903810083866119384765625 value.
Footnote
1 The rule for Java SE 10 can be found in the documentation for java.lang.Float, in the toString(float f) section. The Double documentation is similar. The passage, whose last paragraph is the most relevant part, is:
Returns a string representation of the float argument. All characters mentioned below are ASCII characters.
If the argument is NaN, the result is the string "NaN".
Otherwise, the result is a string that represents the sign and magnitude (absolute value) of the argument. If the sign is negative, the first character of the result is '-' ('\u002D'); if the sign is positive, no sign character appears in the result. As for the magnitude m:
If m is infinity, it is represented by the characters "Infinity"; thus, positive infinity produces the result "Infinity" and negative infinity produces the result "-Infinity".
If m is zero, it is represented by the characters "0.0"; thus, negative zero produces the result "-0.0" and positive zero produces the result "0.0".
If m is greater than or equal to 10^-3 but less than 10^7, then it is represented as the integer part of m, in decimal form with no leading zeroes, followed by '.' ('\u002E'), followed by one or more decimal digits representing the fractional part of m.
If m is less than 10^-3 or greater than or equal to 10^7, then it is represented in so-called "computerized scientific notation." Let n be the unique integer such that 10^n ≤ m < 10^(n+1); then let a be the mathematically exact quotient of m and 10^n so that 1 ≤ a < 10. The magnitude is then represented as the integer part of a, as a single decimal digit, followed by '.' ('\u002E'), followed by decimal digits representing the fractional part of a, followed by the letter 'E' ('\u0045'), followed by a representation of n as a decimal integer, as produced by the method Integer.toString(int).
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type float. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument f. Then f must be the float value nearest to x; or, if two float values are equally close to x, then f must be one of them and the least significant bit of the significand of f must be 0.
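If you want to see the neighbouring values for yourself, a small sketch like the following should do it. It uses Math.nextDown and Math.nextUp to get the adjacent doubles and new BigDecimal(double) to print their exact values; the class name NeighbourDemo and the helper exact are just names chosen for this example.

```java
import java.math.BigDecimal;

class NeighbourDemo {
    // Exact decimal expansion of the binary value stored in a double.
    static String exact(double d) {
        return new BigDecimal(d).toString();
    }

    public static void main(String[] args) {
        double value = 206.64;
        System.out.println("below:  " + exact(Math.nextDown(value)));
        System.out.println("stored: " + exact(value));
        System.out.println("above:  " + exact(Math.nextUp(value)));
        // The shortest decimal that still singles out the stored value.
        System.out.println("short:  " + Double.toString(value));
    }
}
```

The "stored" line shows the long 206.6399999... expansion, while the "short" line shows 206.64, which is all Double.toString needs to print to tell the stored value apart from the two neighbours.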

I am somewhat novice, so I hope someone with more experience can answer more thoroughly, but here is what I theorize is the reason...
Formatting
Although this is for the .NET framework and not specifically Java, I imagine that they work similarly: the toString method uses an optional formatter input, and most likely Java uses something similar, formatting the double to a close approximation in the toString method.
Considering that Oracle specifically states that toString should be concise and easy-to-read, likely such a method is implemented for Double.toString().
Only Necessary Digits to Distinguish...
This is about as much documentation as I could find on the specifics of the Double.toString() method -- note the last paragraph:
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0.
I am curious what it means by "adjacent values of type double" (other variables?), but it seems to also concur with the above -- toString and other methods likely only use as few digits as possible to uniquely identify the double, rounding when the number is arbitrarily close enough, as in the case of 23.675999999999 being "close enough" to 23.676.
Or I could be wildly misunderstanding the documentation.

Why does Math.floor(1.23456789 * 1e8) / 1e8 return 1.23456788

Math.floor(1.23456789 * 1e8) / 1e8
returns:
1.23456788
Which is strange considering that:
Math.floor(1.23456789 * 1e9) / 1e9
returns:
1.23456789
and also
Math.floor(1.23456799 * 1e8) / 1e8
returns:
1.23456799
Any idea why this is happening and how to avoid it?

As answered by Daniel Centore, double precision values are imprecise. Here is a list of the actual values (to 20 digits) for double precision numbers. Shown are the two closest encodings for 1.23456789; the first one is closer. When multiplying by 1e8, the multiply doesn't round up. When multiplying by 1e9, the multiply rounds up to an exact integer value.
1.23456789 => 1.2345678899999998901 hex 3ff3c0ca4283de1b
1.23456789 => 1.2345678900000001121 hex 3ff3c0ca4283de1c
1.23456789*1e8 => 123456788.99999998510 hex 419d6f3453ffffff
1.23456789*1e9 => 1234567890.0000000000 hex 41d26580b4800000

Floating point arithmetic is imprecise. The conversion to binary and back to decimal can cause problems (as binary cannot perfectly represent decimal fractions and vice versa) on top of the fact that floating point has limited precision.
You can use BigDecimal to get exact decimal math, but it is much slower. This is only noticeable if you are doing many calculations.
Edit: Here's a BigDecimal tutorial.

The Java double type (almost) conforms to an international standard called IEEE-754, for floating point numbers. The numbers that can be expressed in this type all have one thing in common - their representations in binary terminate after at most 53 significant digits.
Most numbers with terminating decimal representations do not have terminating binary representations, which means there's no double that stores them exactly. When you write a double literal in Java, the value stored in the double will generally not be the number you wrote in the literal - instead it will be the nearest available double.
The literal 1.23456789 is no exception. It falls neatly between two numbers that can be stored as double, and the exact values of those two double numbers are 1.2345678899999998900938180668163113296031951904296875 and 1.23456789000000011213842299184761941432952880859375. The rule is that the closer of those two numbers is chosen, so the literal 1.23456789 is stored as 1.2345678899999998900938180668163113296031951904296875.
The literal 1E8 can be stored exactly as a double, so the multiplication in your first example is 1.2345678899999998900938180668163113296031951904296875 times 100000000, which of course is 123456788.99999998900938180668163113296031951904296875. This number can't be stored exactly as a double. The nearest double below it is 123456788.99999998509883880615234375 and the nearest double above it is 123456789 exactly. However, the double below it is closer, so the value of the Java expression 1.23456789 * 1E8 is actually 123456788.99999998509883880615234375. When you apply the Math.floor method to this number, the result is exactly 123456788.
The literal 1E9 can be stored exactly as a double, so the multiplication in your second example is 1.2345678899999998900938180668163113296031951904296875 times 1000000000, which of course is 1234567889.9999998900938180668163113296031951904296875. This number can't be stored exactly as a double. The nearest double below it is 1234567889.9999997615814208984375 and the nearest double above it is 1234567890 exactly. But this time, the double above it is closer, so the value of the Java expression 1.23456789 * 1E9 is exactly 1234567890, which is unchanged by the Math.floor method.
The second part of your question was how to avoid this. Well, if you want to do exact calculations involving numbers with terminating decimal representations, you must not store them in double variables. Instead, you can use the BigDecimal class, which lets you do things like this
BigDecimal a = new BigDecimal("1.23456789");
BigDecimal b = new BigDecimal("100000000");
BigDecimal product = a.multiply(b);
and the numbers are represented exactly.
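Applied to the original floor question, one possible sketch is to skip the multiply and divide entirely and let BigDecimal drop the extra digits with RoundingMode.FLOOR. The class name FloorDemo and the helper floorTo are assumptions for this example, not part of any library.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class FloorDemo {
    // Rounds toward negative infinity at the given number of decimal places.
    static BigDecimal floorTo(String decimal, int places) {
        return new BigDecimal(decimal).setScale(places, RoundingMode.FLOOR);
    }

    public static void main(String[] args) {
        // double arithmetic: the product lands just below 123456789
        System.out.println(Math.floor(1.23456789 * 1e8) / 1e8);
        // decimal arithmetic: the literal is kept exactly, so nothing is lost
        System.out.println(floorTo("1.23456789", 8));
        System.out.println(floorTo("1.234567899", 8));
    }
}
```

The value has to arrive as a String (or already be a BigDecimal) for this to help; new BigDecimal(1.23456789) would start from the same slightly-too-small double.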

Does "System.out.println( DECIMAL )" always output the same DECIMAL in the code?

Of course,
System.out.println( 0.1 );
outputs 0.1. But is it always true for an arbitrary decimal?
(EXCLUDE cases which result from the precision of double number itself. Such as, System.out.println( 0.10000000000000000001); outputs 0.1)
When I hit System.out.println( DECIMAL ); I think, DECIMAL is converted into binary(double) and that binary is converted into decimal (to output decimal as String)
Think about the following conversion.
decimal[D1] -> (CONVERSION1) -> binary[B] -> (CONVERSION2) -> decimal[D2]
CONVERSION1:
(within the range of significant digits of double) The nearest binary of [D1] is selected as [B]
e.g. [D1] 0.1 -> [B] 0x0.1999999999999a
CONVERSION2:
[D2] is the decimal number which can uniquely distinguish [B] and has smallest digits.
e.g. [B] 0x0.1999999999999a -> [D2] 0.1
QUOTE Java7 API Double.toString(double d)
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0.
My Question :
Is "[D1]=[D2]" always true?
Why I ask this Question :
Thinking about the following case,
save user's decimal input as double -> display that decimal
I'm wondering whether [ user's input = display ] is guaranteed or not.
(As mentioned above, exclude the cases which result from the precision of double number itself. Since that long input is rare case.)
I know when I need accurate arithmetic, I should use BigDecimal. But in this case, I don't need accurate arithmetic. Just want to display the same decimal as user's input.

It depends in part on your definition of equality. If you require exact string match, the answer is no. For example:
System.out.println(0.1e-1);
prints
0.01
Now assume that "equal" means decimal value equality, so that 0.1e-1 and 0.01 are equal.
If you limit your doubles to normal numbers (not subnormal, overflow, or underflow) with less than 16 significant decimal digits, you are safe. An infinity of decimal fractions round to each binary fraction that can be exactly represented in double. To recover the original, it has to be the shortest member of that set. That means the difference between it and the two nearest decimal numbers of the same or shorter length has to be big enough to ensure that they round to different doubles. To get another decimal number of the same or shorter length requires a change of at least one decimal ulp of the original number.
If two decimal numbers differ by more than one part in 2^54, and are in the normal number range, they are too far apart to map to the same double.
This reasoning does not apply to subnormal numbers because they have less precision than normal numbers:
System.out.println(0.123451234512345e-310);
prints
1.2345123451236E-311
even though the input has only 15 significant digits.
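A quick way to try this on particular inputs is sketched below. It parses a decimal string to a double, prints it back with Double.toString, and compares the two as decimal values using BigDecimal.compareTo, so that 0.1e-1 and 0.01 count as equal. The class name RoundTripDemo and the helper survivesRoundTrip are hypothetical.

```java
import java.math.BigDecimal;

class RoundTripDemo {
    // True when printing the parsed double gives back the same decimal value.
    static boolean survivesRoundTrip(String decimal) {
        double parsed = Double.parseDouble(decimal);
        String printed = Double.toString(parsed);
        // compareTo ignores differences in notation such as 0.1e-1 versus 0.01.
        return new BigDecimal(decimal).compareTo(new BigDecimal(printed)) == 0;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "0.1", "58.15", "206.64", "0.1e-1",
            "0.123451234512345e-310"   // subnormal, fewer significant digits available
        };
        for (String input : inputs) {
            System.out.println(input + " -> " + survivesRoundTrip(input));
        }
    }
}
```

The normal-range inputs come back unchanged in value, while the subnormal one does not.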

The first conversion, from decimal to binary, may yield another mathematical value, because not every decimal can be represented exactly as a binary number. This is true independent of the accuracy of the binary, take 0.1 as an example.
The second conversion, from binary to decimal, is always possible in a loss-less fashion, i.e. yielding the same mathematical value. This will in general need a ridiculously long decimal representation so in practice you will round the value to a much shorter representation, which is then no longer the same mathematical value.
The answer to your question "Is [D1]=[D2] always true?" is therefore in general no. It all depends on the accuracy of the binary and the decimal representations.

Precision for Double.parseDouble() and String.valueOf()

Does the following statement hold for any double (Java primitive double precision IEEE-754) except NaN:
Double.parseDouble(String.valueOf(d)) == d
Said otherwise, does parsing a serialized (using String.valueOf()) double value always yield the exact original double?

With the exception of NaN as you've said, yes, that invariant should hold. If not, that's a JDK bug right there.
Double.toString says this in its Javadoc:
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0.
To summarize, it returns enough digits to identify this double uniquely, so Double.parseDouble should return the exact same double that was converted to a string.
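This is not a proof, but the invariant is easy to spot-check. The sketch below turns random 64-bit patterns into doubles and counts how many fail the round trip; the class name ParseRoundTrip, the helper countFailures, the seed and the sample size are all arbitrary choices for this example.

```java
import java.util.Random;

class ParseRoundTrip {
    // Counts the doubles, among random bit patterns, that do not survive
    // String.valueOf followed by Double.parseDouble.
    static int countFailures(long seed, int samples) {
        Random random = new Random(seed);
        int failures = 0;
        for (int i = 0; i < samples; i++) {
            double d = Double.longBitsToDouble(random.nextLong());
            if (Double.isNaN(d)) {
                continue; // NaN never equals itself, so it is excluded as in the question
            }
            if (Double.parseDouble(String.valueOf(d)) != d) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failures: " + countFailures(42L, 100_000));
    }
}
```

Random bit patterns cover the whole exponent range, including subnormals and infinities, which random values from Math.random() would not.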

Regarding BigDecimal

I run the Java print commands below on this double variable:
double test=58.15;
When I call System.out.println(test); and System.out.println(new Double(test).toString()); it prints 58.15.
When I call System.out.println(new BigDecimal(test)) I get the value below:
58.14999999999999857891452847979962825775146484375
I understand that the double variable test is internally stored as 58.1499999... But the two System.out.println calls below print 58.15 and not 58.1499999...
System.out.println(test);
System.out.println(new Double(test).toString());
Both print 58.15.
Are these System.out.println statements rounding the value 58.1499999... and printing it as 58.15?

System.out.println(new BigDecimal("58.15"));
To construct a BigDecimal from a hard-coded constant, you must always use one of the constants in the class (ZERO, ONE, or TEN) or one of the string constructors. The reason is that once you put the value in a double, you've already lost precision that can never be regained.
EDIT: polygenelubricants is right. Specifically, you're using Double.toString or equivalent. To quote from there:
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0.

Yes, println (or more precisely, Double.toString) rounds. For proof, System.out.println(.1D); prints 0.1, which is impossible to represent in binary.
Also, when using BigDecimal, don't use the double constructor, because that would attempt to precisely represent an imprecise value. Use the String constructor instead.

out.println and Double.toString() use the format specified in Double.toString(double).
BigDecimal keeps more precision: as described in the javadoc, new BigDecimal(double) converts the exact binary value of the double, and toString() then prints every digit of that value. Since .15 does not have an exact binary representation, you see the long expansion instead of 58.15.
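To put the three ways of building the BigDecimal side by side, here is a small sketch. BigDecimal.valueOf(double) is documented to go through Double.toString, which is why it gives the short form; the class name BigDecimalConstructors and its two helpers are only illustrative.

```java
import java.math.BigDecimal;

class BigDecimalConstructors {
    // Exact binary value held by the double, digit for digit.
    static BigDecimal exactBinary(double d) {
        return new BigDecimal(d);
    }

    // Shortest decimal that identifies the double (uses Double.toString internally).
    static BigDecimal shortest(double d) {
        return BigDecimal.valueOf(d);
    }

    public static void main(String[] args) {
        double test = 58.15;
        System.out.println(exactBinary(test));        // long 58.1499999... expansion
        System.out.println(shortest(test));           // 58.15
        System.out.println(new BigDecimal("58.15"));  // 58.15, never passes through a double
    }
}
```

Only the String constructor avoids the double altogether; valueOf merely hides the approximation by printing the shortest decimal first.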
