When I try to parse the following string into a float and into a double :
String abc = "8.40";
System.out.println("Double Value: " + Double.parseDouble(abc) * 100);
System.out.println("Float Value: " + Float.parseFloat(abc) * 100);
I get two different results.
Double Value: 840.0
Float Value: 839.99994
But when I try the same code, multiplying the float and double by 10 or 1000, I get the same result for both of them.
String abc = "8.40";
System.out.println("Double Value: " + Double.parseDouble(abc) * 10);
System.out.println("Float Value: " + Float.parseFloat(abc) * 10);
I get two identical results.
Double Value: 84.0
Float Value: 84.0
And when I try this :
String abc = "8.40";
System.out.println("Double Value: " + Double.parseDouble(abc) * 1000);
System.out.println("Float Value: " + Float.parseFloat(abc) * 1000);
I get two identical results.
Double Value: 8400.0
Float Value: 8400.0
This will work fine:
System.out.println("Float Value: "+Math.round((float)Float.parseFloat(abc)*100));
So, this happens because of the different representations of double and float, or more precisely, because of IEEE-754 rounding for float. Read about it here.
float has a smaller range and less precision, so double is the better choice when you have the memory (which you do today). But they are both evil! There is a better option in Java called BigDecimal, and you should use it: its precision is not limited by a fixed size, and today's computers are strong enough that memory and speed will rarely be a problem, even when dealing with a large number of decimal values needing maximum precision. For example, if you work on software that handles a lot of money transactions, it's a must to use BigDecimal.
It is true that double has more precision than float, but both of them suffer from the same problem: their value may not be exact, and they both have some (small) rounding error in their Least Significant Bit (LSB). This is clear in the first result you got: the float value is not accurate. Every multiplication is rounded again to the nearest representable value, and when you multiply by 10 or 1000 that rounding happens to absorb the error in the LSB, so you get the expected answer for both float and double.
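To see the stored values rather than the printed ones, here is a minimal sketch that reuses the question's "8.40" string. The class and method names (FloatVsDoubleScaling, floatTimes, doubleTimes) are made up for illustration; new BigDecimal(double) is used only because it prints the exact binary value held in the variable.

```java
import java.math.BigDecimal;

class FloatVsDoubleScaling {
    // Parse as float and scale; the multiplication is rounded to float precision.
    static float floatTimes(String s, int factor) {
        return Float.parseFloat(s) * factor;
    }

    // Parse as double and scale; the multiplication is rounded to double precision.
    static double doubleTimes(String s, int factor) {
        return Double.parseDouble(s) * factor;
    }

    public static void main(String[] args) {
        String abc = "8.40";
        // new BigDecimal(double) shows the exact value that was actually stored.
        System.out.println("float  stores: " + new BigDecimal(Float.parseFloat(abc)));
        System.out.println("double stores: " + new BigDecimal(Double.parseDouble(abc)));
        for (int factor : new int[] {10, 100, 1000}) {
            System.out.println("x" + factor
                    + ": float=" + floatTimes(abc, factor)
                    + " double=" + doubleTimes(abc, factor));
        }
    }
}
```

Neither stored value is exactly 8.4; whether a product prints as a clean decimal depends on where the rounding of that particular multiplication lands.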
public class doublePrecision {
public static void main(String[] args) {
double total = 0;
total += 5.6;
total += 5.8;
System.out.println(total);
}
}
The above code prints:
11.399999999999
How would I get this to just print (or be able to use it as) 11.4?
As others have mentioned, you'll probably want to use the BigDecimal class, if you want to have an exact representation of 11.4.
Now, a little explanation into why this is happening:
The float and double primitive types in Java are floating point numbers, where the number is stored as a binary representation of a fraction and an exponent.
More specifically, a double-precision floating point value such as the double type is a 64-bit value, where:
1 bit denotes the sign (positive or negative).
11 bits for the exponent.
52 bits for the significand (the fractional part, stored in binary).
These parts are combined to produce a double representation of a value.
(Source: Wikipedia: Double precision)
For a detailed description of how floating point values are handled in Java, see the Section 4.2.3: Floating-Point Types, Formats, and Values of the Java Language Specification.
The byte, char, int, and long types are integer types, which represent whole numbers exactly. Unlike them, floating point numbers will sometimes (safe to assume "most of the time") not be able to hold an exact representation of a decimal number. This is the reason why you end up with 11.399999999999 as the result of 5.6 + 5.8.
When you require a value that is exact, such as 1.5 or 150.1005, you'll want a decimal representation instead: either an integer type holding a scaled value (for example, 15 tenths) or BigDecimal, both of which can represent such a number exactly.
As has been mentioned several times already, Java has a BigDecimal class which will handle very large numbers and very small numbers.
From the Java API Reference for the BigDecimal class:
Immutable, arbitrary-precision signed decimal numbers. A BigDecimal consists of an arbitrary precision integer unscaled value and a 32-bit integer scale. If zero or positive, the scale is the number of digits to the right of the decimal point. If negative, the unscaled value of the number is multiplied by ten to the power of the negation of the scale. The value of the number represented by the BigDecimal is therefore (unscaledValue × 10^-scale).
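The quoted definition can be checked directly with the unscaledValue() and scale() accessors of BigDecimal. This is only a small sketch; the class name UnscaledAndScale and the helper describe are illustrative.

```java
import java.math.BigDecimal;

class UnscaledAndScale {
    // Render a BigDecimal as "unscaledValue x 10^-(scale)".
    static String describe(BigDecimal d) {
        return d.unscaledValue() + " x 10^-(" + d.scale() + ")";
    }

    public static void main(String[] args) {
        // Positive scale: two digits to the right of the decimal point.
        System.out.println(describe(new BigDecimal("11.40")));  // 1140 x 10^-(2)
        // Negative scale: the unscaled value is multiplied by 10^2.
        System.out.println(describe(new BigDecimal("1.5E+3"))); // 15 x 10^-(-2)
    }
}
```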
There have been many questions on Stack Overflow relating to floating point numbers and their precision. Here is a list of related questions that may be of interest:
Why do I see a double variable initialized to some value like 21.4 as 21.399999618530273?
How to print really big numbers in C++
How is floating point stored? When does it matter?
Use Float or Decimal for Accounting Application Dollar Amount?
If you really want to get down to the nitty gritty details of floating point numbers, take a look at What Every Computer Scientist Should Know About Floating-Point Arithmetic.
When you input a double number, for example, 33.33333333333333, the value you get is actually the closest representable double-precision value, which is exactly:
33.3333333333333285963817615993320941925048828125
Dividing that by 100 gives:
0.333333333333333285963817615993320941925048828125
which also isn't representable as a double-precision number, so again it is rounded to the nearest representable value, which is exactly:
0.3333333333333332593184650249895639717578887939453125
When you print this value out, it gets rounded yet again to 17 decimal digits, giving:
0.33333333333333326
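The three rounding steps described above can be reproduced with a short sketch. It relies on new BigDecimal(double), which converts the stored binary value exactly; the class name ExactDoubleValues and the helper exact are illustrative only.

```java
import java.math.BigDecimal;

class ExactDoubleValues {
    // Exact decimal expansion of the binary value stored in a double.
    static String exact(double d) {
        return new BigDecimal(d).toPlainString();
    }

    public static void main(String[] args) {
        double x = 33.33333333333333;
        double q = x / 100;
        System.out.println(exact(x)); // the closest double to the literal
        System.out.println(exact(q)); // the quotient, rounded to a double again
        System.out.println(q);        // the shortest string that identifies q
    }
}
```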
If you just want to process values as fractions, you can create a Fraction class which holds a numerator and denominator field.
Write methods for add, subtract, multiply and divide as well as a toDouble method. This way you can avoid floats during calculations.
EDIT: Quick implementation,
public class Fraction {
    private int numerator;
    private int denominator;

    public Fraction(int n, int d) {
        numerator = n;
        denominator = d;
    }

    public double toDouble() {
        return ((double) numerator) / ((double) denominator);
    }

    public static Fraction add(Fraction a, Fraction b) {
        if (a.denominator != b.denominator) {
            // Cross-multiply in int: the constructor takes ints, not doubles.
            int aTop = b.denominator * a.numerator;
            int bTop = a.denominator * b.numerator;
            return new Fraction(aTop + bTop, a.denominator * b.denominator);
        } else {
            return new Fraction(a.numerator + b.numerator, a.denominator);
        }
    }

    public static Fraction divide(Fraction a, Fraction b) {
        return new Fraction(a.numerator * b.denominator, a.denominator * b.numerator);
    }

    public static Fraction multiply(Fraction a, Fraction b) {
        return new Fraction(a.numerator * b.numerator, a.denominator * b.denominator);
    }

    public static Fraction subtract(Fraction a, Fraction b) {
        if (a.denominator != b.denominator) {
            int aTop = b.denominator * a.numerator;
            int bTop = a.denominator * b.numerator;
            return new Fraction(aTop - bTop, a.denominator * b.denominator);
        } else {
            return new Fraction(a.numerator - b.numerator, a.denominator);
        }
    }
}
Observe that you'd have the same problem if you used limited-precision decimal arithmetic, and wanted to deal with 1/3: 0.333333333 * 3 is 0.999999999, not 1.00000000.
Unfortunately, 5.6, 5.8 and 11.4 just aren't round numbers in binary, because they involve fifths. So the float representation of them isn't exact, just as 0.3333 isn't exactly 1/3.
If all the numbers you use are non-recurring decimals, and you want exact results, use BigDecimal. Or as others have said, if your values are like money in the sense that they're all a multiple of 0.01, or 0.001, or something, then multiply everything by a fixed power of 10 and use int or long (addition and subtraction are trivial: watch out for multiplication).
However, if you are happy with binary for the calculation, but you just want to print things out in a slightly friendlier format, try java.util.Formatter or String.format. In the format string specify a precision less than the full precision of a double. To 10 significant figures, say, 11.399999999999 is 11.4, so the result will be almost as accurate and more human-readable in cases where the binary result is very close to a value requiring only a few decimal places.
The precision to specify depends a bit on how much maths you've done with your numbers - in general the more you do, the more error will accumulate, but some algorithms accumulate it much faster than others (they're called "unstable" as opposed to "stable" with respect to rounding errors). If all you're doing is adding a few values, then I'd guess that dropping just one decimal place of precision will sort things out. Experiment.
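As a rough sketch of the formatting approach, the helper below prints a double to a chosen number of significant figures with String.format and the %g conversion. The class name FriendlyFormat, the helper toSignificant and the choice of 10 digits are assumptions for illustration; Locale.ROOT is passed only so the decimal separator is always a dot.

```java
import java.util.Locale;

class FriendlyFormat {
    // Format value with the given number of significant digits.
    static String toSignificant(double value, int digits) {
        return String.format(Locale.ROOT, "%." + digits + "g", value);
    }

    public static void main(String[] args) {
        double total = 5.6 + 5.8;
        System.out.println(total);                    // full precision, shows the noise digits
        System.out.println(toSignificant(total, 10)); // noise digits rounded away
    }
}
```

The stored value is unchanged; only the printed text is rounded.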
You may want to look into using java's java.math.BigDecimal class if you really need precision math. Here is a good article from Oracle/Sun on the case for BigDecimal. While you can never represent 1/3 as someone mentioned, you can have the power to decide exactly how precise you want the result to be. setScale() is your friend.. :)
Ok, because I have way too much time on my hands at the moment here is a code example that relates to your question:
import java.math.BigDecimal;
/**
* Created by a wonderful programmer known as:
* Vincent Stoessel
* xaymaca#gmail.com
* on Mar 17, 2010 at 11:05:16 PM
*/
public class BigUp {
public static void main(String[] args) {
BigDecimal first, second, result ;
first = new BigDecimal("33.33333333333333") ;
second = new BigDecimal("100") ;
result = first.divide(second);
System.out.println("result is " + result);
//will print : result is 0.3333333333333333
}
}
and to plug my new favorite language, Groovy, here is a neater example of the same thing:
import java.math.BigDecimal
def first = new BigDecimal("33.33333333333333")
def second = new BigDecimal("100")
println "result is " + first/second // will print: result is 0.33333333333333
Pretty sure you could've made that into a three line example. :)
If you want exact precision, use BigDecimal. Otherwise, you can use ints multiplied by 10 ^ whatever precision you want.
As others have noted, not all decimal values can be represented as binary since decimal is based on powers of 10 and binary is based on powers of two.
If precision matters, use BigDecimal, but if you just want friendly output:
System.out.printf("%.2f\n", total);
Will give you:
11.40
You're running up against the precision limitation of type double.
The java.math package has some arbitrary-precision arithmetic facilities.
You can't, because 7.3 doesn't have a finite representation in binary. The closest you can get is 2054767329987789/2**48 = 7.3+1/1407374883553280.
Take a look at http://docs.python.org/tutorial/floatingpoint.html for a further explanation. (It's on the Python website, but Java and C++ have the same "problem".)
The solution depends on what exactly your problem is:
If it's that you just don't like seeing all those noise digits, then fix your string formatting. Don't display more than 15 significant digits (or 7 for float).
If it's that the inexactness of your numbers is breaking things like "if" statements, then you should write if (abs(x - 7.3) < TOLERANCE) instead of if (x == 7.3).
If you're working with money, then what you probably really want is decimal fixed point. Store an integer number of cents or whatever the smallest unit of your currency is.
(VERY UNLIKELY) If you need more than 53 significant bits (15-16 significant digits) of precision, then use a high-precision floating-point type, like BigDecimal.
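The tolerance comparison from the list above might look like this in Java. The TOLERANCE value of 1e-9 and the names ToleranceCompare and nearlyEqual are assumptions; pick a tolerance that suits the magnitude of your own numbers.

```java
class ToleranceCompare {
    // Assumed tolerance for illustration; it is an absolute, not relative, bound.
    static final double TOLERANCE = 1e-9;

    // True when a and b differ by less than TOLERANCE.
    static boolean nearlyEqual(double a, double b) {
        return Math.abs(a - b) < TOLERANCE;
    }

    public static void main(String[] args) {
        double total = 5.6 + 5.8;
        System.out.println(total == 11.4);            // false: exact comparison
        System.out.println(nearlyEqual(total, 11.4)); // true: within tolerance
    }
}
```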
private void getRound() {
    // this is very simple and interesting
    double a = 5, b = 3, c;
    c = a / b;
    System.out.println(" round val is " + c);
    // round val is : 1.6666666666666667
    // If you want only two decimal places with a double, you can use
    // String.format, which takes two parameters: a format specifier
    // giving the number of decimal places, and the double value.
    String s = String.format("%.2f", c);
    double val = Double.parseDouble(s);
    System.out.println(" val is :" + val);
    // now the output will be : val is :1.67
}
Use java.math.BigDecimal
Doubles are binary fractions internally, so they sometimes cannot represent decimal fractions to the exact decimal.
/*
0.8 1.2
0.7 1.3
0.7000000000000002 2.3
0.7999999999999998 4.2
*/
double adjust = fToInt + 1.0 - orgV;
// The following two lines works for me.
String s = String.format("%.2f", adjust);
double val = Double.parseDouble(s);
System.out.println(val); // output: 0.8, 0.7, 0.7, 0.8
Doubles are approximations of the decimal numbers in your Java source. You're seeing the consequence of the mismatch between the double (which is a binary-coded value) and your source (which is decimal-coded).
Java's producing the closest binary approximation. You can use the java.text.DecimalFormat to display a better-looking decimal value.
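A minimal sketch of the DecimalFormat suggestion follows. The pattern "0.##" (at most two decimal places) and the names PrettyDouble and pretty are assumptions for illustration; the Locale.ROOT symbols are used only to keep the decimal separator a dot.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class PrettyDouble {
    // Format with up to two decimal places, dropping trailing zeros.
    static String pretty(double value) {
        DecimalFormat fmt =
                new DecimalFormat("0.##", DecimalFormatSymbols.getInstance(Locale.ROOT));
        return fmt.format(value);
    }

    public static void main(String[] args) {
        System.out.println(pretty(5.6 + 5.8)); // 11.4
    }
}
```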
Short answer: Always use BigDecimal and make sure you are using the constructor with String argument, not the double one.
Back to your example, the following code will print 11.4, as you wish.
import java.math.BigDecimal;

public class doublePrecision {
    public static void main(String[] args) {
        BigDecimal total = new BigDecimal("0");
        total = total.add(new BigDecimal("5.6"));
        total = total.add(new BigDecimal("5.8"));
        System.out.println(total);
    }
}
Multiply everything by 100 and store it in a long as cents.
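A small sketch of the cents idea, assuming non-negative amounts and two decimal places. The class name Cents and the helper format are made up for illustration.

```java
import java.util.Locale;

class Cents {
    // Render a non-negative amount in cents as units.cents, e.g. 1140 -> "11.40".
    static String format(long cents) {
        return String.format(Locale.ROOT, "%d.%02d", cents / 100, cents % 100);
    }

    public static void main(String[] args) {
        long total = 0;
        total += 560; // 5.60
        total += 580; // 5.80
        System.out.println(format(total)); // 11.40
    }
}
```

Addition and subtraction stay exact because they are plain integer operations; multiplication and division by non-integers still need an explicit rounding rule.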
Computers store numbers in binary and can't actually represent most decimal fractions, such as 33.333333333, exactly (a whole number like 100.0 is stored exactly, but the quotient usually is not). This is one of the tricky things about using doubles. You will just have to round the answer before showing it to a user. Luckily, in most applications you don't need that many decimal places anyhow.
Floating point numbers differ from real numbers in that for any given floating point number there is a next higher floating point number. Same as integers. There's no integer between 1 and 2.
There's no way to represent 1/3 as a float. There's a float below it and there's a float above it, and there's a certain distance between them. And 1/3 is in that space.
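The "float below, float above" picture can be explored with Math.nextDown and Math.nextUp, which return the neighbouring representable values. This is only a sketch; the class name Neighbours and the helper oneThirdLiesBetween are illustrative.

```java
class Neighbours {
    // True if 1/3 lies strictly between the two floats. Multiplying by 3 in
    // double is exact here because a float has only 24 significant bits.
    static boolean oneThirdLiesBetween(float lower, float upper) {
        return 3.0 * lower < 1.0 && 1.0 < 3.0 * upper;
    }

    public static void main(String[] args) {
        float f = 1f / 3f; // the float closest to 1/3
        float below = Math.nextDown(f);
        float above = Math.nextUp(f);
        System.out.println(below + " < " + f + " < " + above);
        // 1/3 falls in exactly one of the two gaps around f.
        System.out.println(oneThirdLiesBetween(below, f));
        System.out.println(oneThirdLiesBetween(f, above));
    }
}
```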
Apfloat for Java claims to work with arbitrary precision floating point numbers, but I've never used it. Probably worth a look.
http://www.apfloat.org/apfloat_java/
A similar question was asked here before
Java floating point high precision library
Use a BigDecimal. It even lets you specify rounding rules (like RoundingMode.HALF_EVEN, formerly the constant ROUND_HALF_EVEN, which minimizes statistical error by rounding to the even neighbor if both are the same distance away; i.e. both 1.5 and 2.5 round to 2).
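A short sketch of the rounding rules mentioned above, using setScale with a RoundingMode. The class name RoundingDemo and the helper round are illustrative.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class RoundingDemo {
    // Round a decimal string to a whole number with the given mode.
    static BigDecimal round(String value, RoundingMode mode) {
        return new BigDecimal(value).setScale(0, mode);
    }

    public static void main(String[] args) {
        System.out.println(round("1.5", RoundingMode.HALF_EVEN)); // 2
        System.out.println(round("2.5", RoundingMode.HALF_EVEN)); // 2 (tie goes to the even neighbor)
        System.out.println(round("2.5", RoundingMode.HALF_UP));   // 3
    }
}
```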
Why not use the round() method from Math class?
// The number of 0s determines how many digits you want after the decimal point
// (here one digit)
total = (double)Math.round(total * 10) / 10;
System.out.println(total); // prints 11.4
Check out BigDecimal, it handles problems dealing with floating point arithmetic like that.
The new call would look like this:
term[number].coefficient.add(co);
Use setScale() to set the number of decimal place precision to be used.
If you have no choice other than using double values, you can use the code below.
public static double sumDouble(double value1, double value2) {
double sum = 0.0;
String value1Str = Double.toString(value1);
int decimalIndex = value1Str.indexOf(".");
int value1Precision = 0;
if (decimalIndex != -1) {
value1Precision = (value1Str.length() - 1) - decimalIndex;
}
String value2Str = Double.toString(value2);
decimalIndex = value2Str.indexOf(".");
int value2Precision = 0;
if (decimalIndex != -1) {
value2Precision = (value2Str.length() - 1) - decimalIndex;
}
int maxPrecision = value1Precision > value2Precision ? value1Precision : value2Precision;
sum = value1 + value2;
String s = String.format("%." + maxPrecision + "f", sum);
sum = Double.parseDouble(s);
return sum;
}
You can do the following:
System.out.println(String.format("%.12f", total));
Change the 12 in %.12f to control how many decimal places are printed.
As far as I understand it, the main goal is to get the correct double from a wrong one.
Here is my solution for recovering the intended value from an "approximate" wrong value: it rounds away the last significant digit, counting all digits including those before the decimal point, and tries to keep as many digits after the point as possible. I hope that is enough precision for most cases:
public static double roundError(double value) {
BigDecimal valueBigDecimal = new BigDecimal(Double.toString(value));
String valueString = valueBigDecimal.toPlainString();
if (!valueString.contains(".")) return value;
String[] valueArray = valueString.split("[.]");
int places = 16;
places -= valueArray[0].length();
if ("56789".contains("" + valueArray[0].charAt(valueArray[0].length() - 1))) places--;
//System.out.println("Rounding " + value + "(" + valueString + ") to " + places + " places");
return valueBigDecimal.setScale(places, RoundingMode.HALF_UP).doubleValue();
}
I know it is long code, surely not the best; maybe someone can make it more elegant. Anyway, it works; see the examples:
roundError(5.6+5.8) = 11.399999999999999 = 11.4
roundError(0.4-0.3) = 0.10000000000000003 = 0.1
roundError(37235.137567000005) = 37235.137567
roundError(1.0/3) = 0.3333333333333333 = 0.333333333333333
roundError(3723513756.7000005) = 3.7235137567E9 (3723513756.7)
roundError(3723513756123.7000005) = 3.7235137561237E12 (3723513756123.7)
roundError(372351375612.7000005) = 3.723513756127E11 (372351375612.7)
roundError(1.7976931348623157) = 1.797693134862316
Do not waste your effort using BigDecimal. In 99.99999% of cases you don't need it. Java's double type is of course approximate, but in almost all cases it is sufficiently precise. Mind that you have an error at the 14th significant digit. This is really negligible!
To get nice output use:
System.out.printf("%.2f\n", total);
I was trying to really learn more about floats, doubles and bigdecimals in Java. I wanted to know exactly how a floating point number gets represented in each type, for ex. floats use 2^, big decimals use 10^ plus scaled(32-bit) and unscaled values (arbitrary precision).
I put together simple calcs using all three types and did conversions for each; the result is rather confusing. I would appreciate some hints about why the only correct representation is for float, and why there were trailing imprecisions when converting to double and BigDecimal. Is it to do with binary representation conversions? Anyhow, here are the code and its output:
// Float - 32b
float a = 3.14f;
float b = 3.100004f;
float abAsAFloat = a + b;
double abAsADouble = a + b;
BigDecimal abAsABigDecimal = new BigDecimal(a + b);
System.out.println("a + b as a float: " + abAsAFloat);
System.out.println("a + b as a double: " + abAsADouble);
System.out.println("a + b as a BigDecimal: " + abAsABigDecimal);
// Double - 64b
double c = 3.14;
double d = 3.100004;
double cdAsADouble = c + d;
BigDecimal cdAsABigDecimal = new BigDecimal(c + d);
System.out.println("c + d as a double: " + cdAsADouble);
System.out.println("c + d as a BigDecimal: " + cdAsABigDecimal);
// BigDecimal, arbitrary-precision, signBit*unscaledValue × 10^-scale
BigDecimal e = new BigDecimal(3.14);
BigDecimal f = new BigDecimal(3.100004);
BigDecimal efAsABigDecimal = e.add(f);
System.out.println("e + f: " + efAsABigDecimal);
// Drawbacks. speed, memory, native value equality, no overloads for +/- et al
a + b as a float: 6.240004
a + b as a double: 6.240004062652588
a + b as a BigDecimal: 6.240004062652587890625
c + d as a double: 6.240004000000001
c + d as a BigDecimal: 6.2400040000000007722746886429376900196075439453125
e + f: 6.240004000000000328185478792875073850154876708984375
You're inadvertently mixing types. For example:
BigDecimal e = new BigDecimal(3.14);
BigDecimal f = new BigDecimal(3.100004);
In this case, you're providing doubles as inputs, so e and f will have double residues. Instead, use this:
BigDecimal e = new BigDecimal("3.14");
BigDecimal f = new BigDecimal("3.100004");
The float output is seemingly the most accurate because Java "knows" floats have a limited precision, so it won't print fifteen digits.
float may look correct for this particular case, but it will be just as wrong for other values. Note that when float and double are converted to strings, only as many digits are printed as are necessary to get the right value in that type; this means float may print "the correct answer" even when that representation conceals just as much rounding error as double.
The problem with BigDecimal is that you're not using it correctly: you should be writing new BigDecimal("3.14") instead of new BigDecimal(3.14), which allows double to "mess it up" before BigDecimal has the chance to "fix it."
For the details of representation, https://en.wikipedia.org/wiki/Double-precision_floating-point_format has a thorough explanation with useful diagrams, but the short explanation is that float and double represent numbers as ±1 × 1.<mantissa> × 2^<exponent>, where float stores the mantissa with 23 bits and the exponent with 8 bits, and double uses 52 and 11 bits respectively.
When you convert to either double or BigDecimal it converts to the closest representable value. When you convert to BigDecimal you are actually converting to double first as there is no direct conversion from float.
Usually you want to convert from double to BigDecimal using BigDecimal.valueOf(double). This method assumes a certain level of rounding to match what the double would look like if you printed it.
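To compare the three ways of getting a BigDecimal discussed in this thread, here is a small sketch using the question's own values. The class name BigDecimalInputs and the helper sum are illustrative.

```java
import java.math.BigDecimal;

class BigDecimalInputs {
    // Add two decimal strings exactly.
    static BigDecimal sum(String a, String b) {
        return new BigDecimal(a).add(new BigDecimal(b));
    }

    public static void main(String[] args) {
        // double constructor: carries the binary approximation of 3.14 along
        System.out.println(new BigDecimal(3.14));
        // String constructor: exactly 3.14
        System.out.println(new BigDecimal("3.14"));
        // valueOf: goes through Double.toString, so it also prints 3.14
        System.out.println(BigDecimal.valueOf(3.14));
        System.out.println(sum("3.14", "3.100004")); // 6.240004
    }
}
```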
Read this: Java Language Specification. Chapter 5. Conversions and Promotions
Especially, 5.6. Numeric Promotions
i.e
float a = 3.14f;
float b = 3.100004f;
double abAsADouble = a + b;
In this case, a is first added to b, giving a float result; then that float is converted to double and assigned. So it might lose precision compared to (double)a + b;
The same thing, when using sum result as parameter to the constructor
new BigDecimal(a + b)
First, float a is added to float b, giving a float result; after that it is converted to double, and only then does construction of the BigDecimal object begin.
Any numeric constants with decimal point, unless you specify f at the end, are considered to be a double, so, when passing constant to the constructor:
new BigDecimal(3.100004);
Number is stored as double and passed as double precision to the constructor. To achieve more precision, use String parameter constructor instead:
new BigDecimal("3.100004");
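The effect of adding in float first and widening afterwards is easiest to see with values chosen so that the float sum must round. The sketch below uses 2^24 and 1 rather than the question's numbers, and the names PromotionOrder, addAsFloat and addAsDouble are made up for illustration.

```java
class PromotionOrder {
    // The addition happens in float; the float result is then widened to double.
    static double addAsFloat(float a, float b) {
        return a + b;
    }

    // a is widened first, so b is promoted too and the addition happens in double.
    static double addAsDouble(float a, float b) {
        return (double) a + b;
    }

    public static void main(String[] args) {
        float a = 16777216f; // 2^24, the last point where float holds every integer
        float b = 1f;
        System.out.println(addAsFloat(a, b));  // 1.6777216E7 (the +1 is rounded away)
        System.out.println(addAsDouble(a, b)); // 1.6777217E7
    }
}
```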
The float data type is a single-precision 32-bit IEEE 754 floating point and the double data type is a double-precision 64-bit IEEE 754 floating point.
What does it mean? And when should I use float instead of double or vice-versa?
The Wikipedia page on it is a good place to start.
To sum up:
float is represented in 32 bits, with 1 sign bit, 8 bits of exponent, and 23 bits of the significand (the digits of a number written in scientific notation: in 2.33728*10^12, 33728 is the fractional part of the significand).
double is represented in 64 bits, with 1 sign bit, 11 bits of exponent, and 52 bits of significand.
By default, Java uses double to represent its floating-point numerals (so a literal 3.14 is typed double). It's also the data type that will give you a much larger number range, so I would strongly encourage its use over float.
There may be certain libraries that actually force your usage of float, but in general - unless you can guarantee that your result will be small enough to fit in float's prescribed range, then it's best to opt with double.
If you require accuracy - for instance, you can't have a decimal value that is inaccurate (like 1/10 + 2/10), or you're doing anything with currency (for example, representing $10.33 in the system), then use a BigDecimal, which can support an arbitrary amount of precision and handle situations like that elegantly.
A float gives you approx. 6-7 decimal digits precision while a double gives you approx. 15-16. Also the range of numbers is larger for double.
A double needs 8 bytes of storage space while a float needs just 4 bytes.
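A quick way to see the difference in digits and storage is to print the same value in both types. This is just a sketch; the class name DigitsDemo and the helpers asFloat and asDouble are illustrative.

```java
class DigitsDemo {
    // Shortest decimal string that identifies the value as a float.
    static String asFloat(double v) {
        return Float.toString((float) v);
    }

    // Shortest decimal string that identifies the value as a double.
    static String asDouble(double v) {
        return Double.toString(v);
    }

    public static void main(String[] args) {
        double third = 1.0 / 3;
        System.out.println(asFloat(third));  // 0.33333334
        System.out.println(asDouble(third)); // 0.3333333333333333
        System.out.println(Float.BYTES + " bytes vs " + Double.BYTES + " bytes");
    }
}
```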
Floating-point numbers, also known as real numbers, are used when evaluating expressions that require fractional precision. For example, calculations such as square root, or transcendentals such as sine and cosine, result in a value whose precision requires a floating-point type. Java implements the standard (IEEE–754) set of floating-point types and operators. There are two kinds of floating-point types, float and double, which represent single- and double-precision numbers, respectively. Their width and ranges are shown here:
Name     Width in Bits   Approximate Range
double   64              4.9e–324 to 1.8e+308
float    32              1.4e–045 to 3.4e+038
float
The type float specifies a single-precision value that uses 32 bits of storage. Single precision is faster on some processors and takes half as much space as double precision, but will become imprecise when the values are either very large or very small. Variables of type float are useful when you need a fractional component, but don't require a large degree of precision.
Here are some example float variable declarations:
float hightemp, lowtemp;
double
Double precision, as denoted by the double keyword, uses 64 bits to store a value. Double precision is actually faster than single precision on some modern processors that have been optimized for high-speed mathematical calculations. All transcendental math functions, such as sin( ), cos( ), and sqrt( ), return double values. When you need to maintain accuracy over many iterative calculations, or are manipulating large-valued numbers, double is the best choice.
This will give error:
public class MyClass {
public static void main(String args[]) {
float a = 0.5;
}
}
/MyClass.java:3: error: incompatible types: possible lossy conversion from double to float
float a = 0.5;
This will work perfectly fine
public class MyClass {
public static void main(String args[]) {
double a = 0.5;
}
}
This will also work perfectly fine
public class MyClass {
public static void main(String args[]) {
float a = (float)0.5;
}
}
Reason: Java by default treats floating-point literals such as 0.5 as double to ensure higher precision, so assigning one to a float needs an explicit cast or an f suffix (0.5f).
double takes more space but is more precise during computation; float takes less space but is less precise.
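For completeness, a short sketch of the f suffix next to the cast, plus a check of what the narrowing costs. The class name LiteralSuffix and the helper sameWhenWidened are illustrative.

```java
class LiteralSuffix {
    // Compares a float with a double; the float is widened to double first.
    static boolean sameWhenWidened(float f, double d) {
        return f == d;
    }

    public static void main(String[] args) {
        float a = 0.5f;        // float literal, no cast needed
        float b = (float) 0.5; // double literal narrowed explicitly
        double c = 0.5;        // double literal
        System.out.println(a == b);                    // true
        System.out.println(sameWhenWidened(a, c));     // true: 0.5 is exact in binary
        System.out.println(sameWhenWidened(0.1f, 0.1)); // false: 0.1 is rounded differently
    }
}
```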
Java seems to have a bias towards using double for computations nonetheless:
Case in point the program I wrote earlier today, the methods didn't work when I used float, but now work great when I substituted float with double (in the NetBeans IDE):
package palettedos;
import java.util.*;
class Palettedos{
private static Scanner Z = new Scanner(System.in);
public static final double pi = 3.142;
public static void main(String[]args){
Palettedos A = new Palettedos();
System.out.println("Enter the base and height of the triangle respectively");
int base = Z.nextInt();
int height = Z.nextInt();
System.out.println("Enter the radius of the circle");
int radius = Z.nextInt();
System.out.println("Enter the length of the square");
long length = Z.nextInt();
double tArea = A.calculateArea(base, height);
double cArea = A.calculateArea(radius);
long sqArea = A.calculateArea(length);
System.out.println("The area of the triangle is\t" + tArea);
System.out.println("The area of the circle is\t" + cArea);
System.out.println("The area of the square is\t" + sqArea);
}
double calculateArea(int base, int height){
double triArea = 0.5*base*height;
return triArea;
}
double calculateArea(int radius){
double circArea = pi*radius*radius;
return circArea;
}
long calculateArea(long length){
long squaArea = length*length;
return squaArea;
}
}
According to the IEEE standards, float is a 32 bit representation of a real number while double is a 64 bit representation.
In Java programs we mostly see the double data type used. That is to avoid overflows, as the range of numbers that can be accommodated by the double data type is larger than the range of float.
Also, when high precision is required, the use of double is encouraged. A few library methods that were implemented a long time ago still require the float data type (only because they were implemented using float, nothing else!).
But if you are certain that your program requires small numbers and an overflow won't occur with your use of float, then the use of float will largely improve your space complexity as floats require half the memory as required by double.
This example illustrates how to extract the sign (the leftmost bit), exponent (the 8 following bits) and mantissa (the 23 rightmost bits) from a float in Java.
int bits = Float.floatToIntBits(-0.005f);
int sign = bits >>> 31;
int exp = (bits >>> 23 & ((1 << 8) - 1)) - ((1 << 7) - 1);
int mantissa = bits & ((1 << 23) - 1);
System.out.println(sign + " " + exp + " " + mantissa + " " +
Float.intBitsToFloat((sign << 31) | (exp + ((1 << 7) - 1)) << 23 | mantissa));
The same approach can be used for double’s (11 bit exponent and 52 bit mantissa).
long bits = Double.doubleToLongBits(-0.005);
long sign = bits >>> 63;
long exp = (bits >>> 52 & ((1 << 11) - 1)) - ((1 << 10) - 1);
long mantissa = bits & ((1L << 52) - 1);
System.out.println(sign + " " + exp + " " + mantissa + " " +
Double.longBitsToDouble((sign << 63) | (exp + ((1 << 10) - 1)) << 52 | mantissa));
Credit: http://s-j.github.io/java-float/
You should use double instead of float for precise calculations, and float instead of double when less accuracy is acceptable. float holds an IEEE 754 single-precision (32-bit) floating point number, while double holds an IEEE 754 double-precision (64-bit) one, so double can store and compute numbers more accurately. Hope this helps.
In regular programming calculations, we don’t use float. If we ensure that the result range is within the range of float data type then we can choose a float data type for saving memory. Generally, we use double because of two reasons:-
If we want to use the floating-point number as float data type then method caller must explicitly suffix F or f, because by default every floating-point number is treated as double. It increases the burden to the programmer. If we use a floating-point number as double data type then we don’t need to add any suffix.
float is a single-precision data type, which means it occupies 4 bytes and carries only about 6-7 significant decimal digits, so in large computations the result loses accuracy. double occupies 8 bytes and carries about 15-16 significant digits, so results are far more complete.
Both float and double data types were designed especially for scientific calculations, where approximation errors are acceptable. If accuracy is the top priority, it is recommended to use the BigDecimal class instead of float or double. Source: Float and double datatypes in Java
My coworker did this experiment:
public class DoubleDemo {
public static void main(String[] args) {
double a = 1.435;
double b = 1.43;
double c = a - b;
System.out.println(c);
}
}
For this first-grade operation I expected this output:
0.005
But unexpectedly the output was:
0.0050000000000001155
Why does double fail in such a simple operation? And if double is not the right datatype for this work, what should I use?
double is internally stored as a fraction in binary -- like 1/4 + 1/8 + 1/16 + ...
The value 0.005 -- or the value 1.435 -- cannot be stored as an exact fraction in binary, so double cannot store the exact value 0.005, and the subtracted value isn't quite exact.
If you care about precise decimal arithmetic, use BigDecimal.
You may also find this article useful reading.
double and float are not exactly real numbers.
There is an infinite number of real numbers in any range, but only a finite number of bits to represent them! For this reason, rounding errors are expected for doubles and floats.
The number you get is the closest number possible that can be represented by double in floating point representation.
For more details, you might want to read this article [warning: might be high-level].
You might want to use BigDecimal to get exactly a decimal number [but you will again encounter rounding errors when you try to get 1/3].
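The 1/3 caveat shows up in BigDecimal as follows: an exact division with a non-terminating result throws ArithmeticException, so you have to supply a precision. The sketch below uses MathContext.DECIMAL64 (16 digits) as an arbitrary choice, and the names OneThird, third and exactDivisionFails are illustrative.

```java
import java.math.BigDecimal;
import java.math.MathContext;

class OneThird {
    // 1/3 rounded to the precision of the given MathContext.
    static BigDecimal third(MathContext mc) {
        return BigDecimal.ONE.divide(new BigDecimal(3), mc);
    }

    // Exact division has no finite decimal result, so divide() throws.
    static boolean exactDivisionFails() {
        try {
            BigDecimal.ONE.divide(new BigDecimal(3));
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(exactDivisionFails());          // true
        System.out.println(third(MathContext.DECIMAL64));  // 0.3333333333333333
    }
}
```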
Yes it worked this way using BigDecimal operations
private static void subtractUsingBigDecimalOperation(double a, double b) {
BigDecimal c = BigDecimal.valueOf(a).subtract(BigDecimal.valueOf(b));
System.out.println(c);
}
double and float arithmetic is not guaranteed to be exactly correct because of the rounding that occurs "under the hood".
Essentially, real numbers can have an infinite number of decimals, but in memory doubles and floats must be represented by a finite number of bits. So when you do this decimal arithmetic, rounding occurs, and the result is often off by a very small amount if you take all of the decimals into account.
As suggested earlier, if you need completely exact values then use BigDecimal which stores its values differently. Here's the API
import java.math.BigDecimal;

public class BigDecimalExample {
    public static void main(String[] args) {
        // floating point calculation
        double amount1 = 2.15;
        double amount2 = 1.10;
        System.out.println("difference between 2.15 and 1.10 using double is: " + (amount1 - amount2));
        // Use BigDecimal for financial calculation
        BigDecimal amount3 = new BigDecimal("2.15");
        BigDecimal amount4 = new BigDecimal("1.10");
        System.out.println("difference between 2.15 and 1.10 using BigDecimal is: " + (amount3.subtract(amount4)));
    }
}
Output:
difference between 2.15 and 1.10 using double is: 1.0499999999999998
difference between 2.15 and 1.10 using BigDecimal is: 1.05
// Quick example: round b to the same number of decimal places as a, using BigDecimal.
// Needs java.math.BigDecimal and java.math.RoundingMode; assumes a.toString()
// is in plain notation (no exponent such as 1.0E10).
private double getDesiredPrecision(Double a, Double b) {
    String[] splitter = a.toString().split("\\.");
    int numDecimals = splitter[1].length(); // digits after the decimal point
    BigDecimal bBigDecimal = new BigDecimal(b);
    bBigDecimal = bBigDecimal.setScale(numDecimals, RoundingMode.HALF_EVEN);
    return bBigDecimal.doubleValue();
}