A question of mine was recently closed as a duplicate, but that didn't help me completely. My new, more specific question is:
Can all values (whole numbers, without any decimal part) smaller than 1.7e308 and greater than 0 be stored in a double data type, given that 1.7e308 is the maximum value of a double? I don't want a decimal numeral; I want a whole number so large that it can't be represented even by long long.
Can all whole numbers smaller than 1.7e308 and greater than 0 be stored in a double data type
The simple answer is No.
There are various ways to come to this conclusion, but here is one that doesn't even depend on an understanding of floating point number formats.
We know that double is a 64-bit representation.
Therefore, there can be at most 2^64 distinct double values: that is about 1.8 x 10^19.
You are asking if a double can represent all integers between zero and 1.7 x 10^308.
That is 1.7 x 10^308 distinct values.
1.7 x 10^308 is greater than 1.8 x 10^19.
Therefore what you are asking is impossible.
Your best simple option is to use BigInteger.
But you said this:
... but due to slow operations on BigIntegers, I'm keeping it a last choice. It takes about a second to multiply two 4-digit numbers.
That is simply not true. If you have come to that conclusion by benchmarking, then there is something very wrong with your methodology.
Multiplying two 4-digit numbers using BigInteger should take less than a microsecond.
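For a rough sanity check of that claim (not a rigorous benchmark; a harness like JMH would be more trustworthy, and the class name here is only illustrative), something along these lines can time repeated BigInteger multiplications:

import java.math.BigInteger;

public class BigIntegerTiming {
    public static void main(String[] args) {
        BigInteger a = BigInteger.valueOf(1234);
        BigInteger b = BigInteger.valueOf(5678);
        BigInteger sink = BigInteger.ZERO;

        // Warm up the JIT before measuring.
        for (int i = 0; i < 1_000_000; i++) {
            sink = sink.add(a.multiply(b));
        }

        int iterations = 1_000_000;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = sink.add(a.multiply(b));
        }
        long elapsed = System.nanoTime() - start;

        System.out.println("average ns per multiply (plus one add): " + (double) elapsed / iterations);
        System.out.println(sink.signum()); // use the result so the loop isn't optimized away
    }
}

On a modern JVM this should report far less than a microsecond per iteration, nowhere near a second.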
All floating-point types (halves/floats/doubles/long doubles/etc.) are composed of a mantissa and an exponent.
Take 1.7e308: 1.7 is the mantissa while 308 is the exponent. You can't exactly separate the two in a float; every float is represented in memory as a composition of the two. Hence you can't have a "non-decimal" float.
Is there a data type that can handle/store numbers exceeding type double in Java?
As far as I know, Java type double can store numbers in range 1.0E+38 to 1.0E-45.
What if I need a Java program to read numbers exceeding the range of 1.0E+38 to 1.0E-45, for example 1.0E-180? Currently those numbers are recognized as 0.0.
Don't ask why I need such ridiculous numbers. I got what was given to me.
As far as I know, Java type double can store numbers in range 1.0E+38 to 1.0E-45.
Incorrect; it can go as far as 1.7976931348623157 x 10^308 and 4.9406564584124654 x 10^-324.
However, there are only at most 2^64 numbers that can be stored in a double (reason: think about it, pigeonhole principle). There is an infinite number of values between 0 and 1, let alone between those 2 extremes. doubles work by silently rounding, all the time, to the nearest number that is one of the chosen few 2^64. Let's call those blessed numbers.
These numbers are not equally distributed. Near 0 there are A LOT of these; as you move away from 0 there are fewer and fewer. Eventually (at around 2^52), the distance between any 2 blessed numbers is more than 1.0.
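If you want to see that spacing directly, Math.ulp reports the gap from a given double to the next one up in magnitude; a quick sketch (the class name is just for the example):

public class SpacingDemo {
    public static void main(String[] args) {
        System.out.println(Math.ulp(1.0));             // 2.220446049250313E-16
        System.out.println(Math.ulp(1.0e15));          // 0.125
        System.out.println(Math.ulp(Math.pow(2, 52))); // 1.0 -- from here up, gaps of at least 1.0
        System.out.println(Math.ulp(1.0e308));         // about 2e292 -- enormous gaps near the top of the range
    }
}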
BigDecimal is one solution. There are others (for starters, double CAN represent 1e-180; you're doing something wrong in your 'translate input data to a double value' code), but keep in mind that doubles at those extremes are incredibly inaccurate.
BigDecimals with numbers like that are incredibly slow, and out of the box BigDecimal can't divide (for the same reason you can't divide 1 by 3 and get a perfect result: 0.333333... and where does that stop?); you need to configure it so it cuts off at some point. They're hard to use, but perhaps your only option.
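As a minimal sketch of that configuration (the 50-digit precision and the class name are arbitrary choices for illustration), division works once you supply a MathContext telling BigDecimal where to cut off:

import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class BigDecimalDivision {
    public static void main(String[] args) {
        BigDecimal one = BigDecimal.ONE;
        BigDecimal three = new BigDecimal(3);

        // one.divide(three) with no context would throw ArithmeticException:
        // 1/3 has no terminating decimal expansion.

        // Cut the quotient off at 50 significant digits instead.
        BigDecimal approx = one.divide(three, new MathContext(50, RoundingMode.HALF_UP));
        System.out.println(approx); // 0.33333...3 (50 threes)
    }
}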
This is something that's been on my mind for years, but I never took the time to ask before.
Many (pseudo) random number generators generate a random number between 0.0 and 1.0. Mathematically there are infinite numbers in this range, but double is a floating point number, and therefore has a finite precision.
So the questions are:
Just how many double numbers are there between 0.0 and 1.0?
Are there just as many numbers between 1 and 2? Between 100 and 101? Between 10^100 and 10^100+1?
Note: if it makes a difference, I'm interested in Java's definition of double in particular.
Java doubles are in IEEE-754 format, therefore they have a 52-bit fraction; between any two adjacent powers of two (inclusive of one and exclusive of the next one), there will therefore be 2 to the 52nd power different doubles (i.e., 4503599627370496 of them). For example, that's the number of distinct doubles between 0.5 included and 1.0 excluded, and exactly that many also lie between 1.0 included and 2.0 excluded, and so forth.
Counting the doubles between 0.0 and 1.0 is harder than doing so between powers of two, because there are many powers of two included in that range, and, also, one gets into the thorny issues of denormalized numbers. 10 of the 11 bits of the exponents cover the range in question, so, including denormalized numbers (and I think a few kinds of NaN) you'd have 1024 times the doubles as lie between powers of two -- no more than 2**62 in total anyway. Excluding denormalized &c, I believe the count would be 1023 times 2**52.
For an arbitrary range like "100 to 100.1" it's even harder because the upper bound cannot be exactly represented as a double (not being an exact multiple of any power of two). As a handy approximation, since the progression between powers of two is linear, you could say that said range is 0.1 / 64th of the span between the surrounding powers of two (64 and 128), so you'd expect about
(0.1 / 64) * 2**52
distinct doubles -- which comes to 7036874417766.4004... give or take one or two;-).
Every double value whose representation is between 0x0000000000000000 and 0x3ff0000000000000 lies in the interval [0.0, 1.0]. That's (2^62 - 2^52) distinct values (plus or minus a couple depending on whether you count the endpoints).
The interval [1.0, 2.0] corresponds to representations between 0x3ff0000000000000 and 0x4000000000000000; that's 2^52 distinct values.
The interval [100.0, 101.0] corresponds to representations between 0x4059000000000000 and 0x4059400000000000; that's 2^46 distinct values.
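Those counts can be double-checked with Double.doubleToLongBits: for non-negative finite doubles, the bit patterns are ordered the same way as the values, so subtracting the patterns counts the representable values between two bounds. A small sketch (class and method names made up for the example):

public class CountDoubles {
    // Number of representable doubles in [low, high), assuming both are non-negative and finite.
    static long countBetween(double low, double high) {
        return Double.doubleToLongBits(high) - Double.doubleToLongBits(low);
    }

    public static void main(String[] args) {
        System.out.println(countBetween(0.0, 1.0));     // 4607182418800017408 = 2^62 - 2^52
        System.out.println(countBetween(1.0, 2.0));     // 4503599627370496    = 2^52
        System.out.println(countBetween(100.0, 101.0)); // 70368744177664      = 2^46
    }
}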
There are no doubles between 10^100 and 10^100 + 1. Neither one of those numbers is representable in double precision, and there are no doubles that fall between them. The closest two double precision numbers are:
99999999999999982163600188718701095...
and
10000000000000000159028911097599180...
Others have already explained that there are around 2^62 doubles in the range [0.0, 1.0].
(Not really surprising: there are almost 2^64 distinct finite doubles; of those, half are positive, and roughly half of those are < 1.0.)
But you mention random number generators: note that a random number generator generating numbers between 0.0 and 1.0 cannot in general produce all these numbers; typically it'll only produce numbers of the form n/2^53 with n an integer (see e.g. the Java documentation for nextDouble). So there are usually only around 2^53 (+/-1, depending on which endpoints are included) possible values for the random() output. This means that most doubles in [0.0, 1.0] will never be generated.
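A small sketch of that shape (class name made up for the example): each value produced by java.util.Random.nextDouble, scaled by 2^53, lands exactly on an integer, the n above:

import java.util.Random;

public class NextDoubleGranularity {
    public static void main(String[] args) {
        Random rnd = new Random();
        for (int i = 0; i < 5; i++) {
            double d = rnd.nextDouble();
            double scaled = d * (1L << 53); // exact, since d is n / 2^53 with n < 2^53
            System.out.println(d + "  ->  n = " + (long) scaled
                    + "  (integer: " + (scaled == Math.floor(scaled)) + ")");
        }
    }
}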
The article Java's new math, Part 2: Floating-point numbers from IBM offers the following code snippet to solve this (in floats, but I suspect it works for doubles as well):
public class FloatCounter {
    public static void main(String[] args) {
        float x = 1.0F;
        int numFloats = 0;
        while (x <= 2.0) {
            numFloats++;
            System.out.println(x);
            x = Math.nextUp(x);
        }
        System.out.println(numFloats);
    }
}
They have this comment about it:
It turns out there are exactly 8,388,609 floats between 1.0 and 2.0 inclusive; large but hardly the uncountable infinity of real numbers that exist in this range. Successive numbers are about 0.0000001 apart. This distance is called an ULP for unit of least precision or unit in the last place.
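The same count, and the roughly 0.0000001 spacing quoted there, can also be obtained without looping; a quick sketch using Float.floatToIntBits and Math.ulp (class name is just for the example):

public class CountFloats {
    public static void main(String[] args) {
        // Bit patterns of positive floats are ordered like the values themselves.
        int count = Float.floatToIntBits(2.0f) - Float.floatToIntBits(1.0f) + 1; // +1: both endpoints inclusive
        System.out.println(count);          // 8388609, matching the article

        // The "about 0.0000001" spacing near 1.0:
        System.out.println(Math.ulp(1.0f)); // 1.1920929E-7
    }
}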
2^53 - the number of values the significand/mantissa of a 64-bit floating point number can take, including the hidden bit.
Roughly yes, as the significand is fixed but the exponent changes.
See the wikipedia article for more information.
The Java double is an IEEE 754 binary64 number.
This means that we need to consider:
The mantissa is 52 bits
The exponent is an 11-bit number with a bias of 1023 (i.e. with 1023 added to it)
If the exponent is all 0 and the mantissa is non-zero, then the number is said to be non-normalized (subnormal)
This basically means there is a total of 2^62 - 2^52 + 1 possible double representations that, according to the standard, are between 0 and 1. Note that the -2^52 + 1 term removes the cases of the non-normalized numbers.
Remember that if the mantissa is positive but the exponent is negative, the number is positive but less than 1 :-)
For other numbers it is a bit harder, because the edge integer numbers may not be representable precisely in the IEEE 754 representation, and because more of the exponent's bits are needed to represent larger numbers, so the larger the number, the fewer distinct double values there are in a given range.
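To make the sign/exponent/mantissa split concrete, here is a small sketch that pulls the three fields out of a double with Double.doubleToLongBits (the class name and the example value 0.75 are arbitrary):

public class DecomposeDouble {
    public static void main(String[] args) {
        double d = 0.75; // 1.1 (binary) x 2^-1
        long bits = Double.doubleToLongBits(d);

        long sign = bits >>> 63;                      // 1 bit
        long biasedExponent = (bits >>> 52) & 0x7ffL; // 11 bits, bias 1023
        long mantissa = bits & 0xfffffffffffffL;      // 52 fraction bits

        System.out.println("sign = " + sign);                         // 0
        System.out.println("exponent = " + (biasedExponent - 1023));  // -1
        System.out.println("mantissa = 0x" + Long.toHexString(mantissa)); // 0x8000000000000
    }
}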
Can I reduce the precision of a float number?
In all the searching I've been doing I saw only how to reduce the precision for printing the number. I do not need to print it.
I want, for example, to convert 13.2836 to 13.28. Without even rounding it.
Is it possible?
The suggested answer from the system is not what I am looking for. It also deals with printing the value and I want to have a float.
There isn't really a way to do it, with good reason. While john16384's answer alludes to this, his answer doesn't make the problem clear... so probably you'll try it, it won't do what you want, and perhaps you still won't know why...
The problem is that while we think in decimal and expect that the decimal point is controlled by a power-of-10 exponent, typical floating point implementations (including Java float) use a power-of-2 exponent. Why does it matter?
You know that to represent 1/3 in decimal you'd say 0.3 (repeating) - so if you have a limited number of decimal digits, you can't really represent 1/3. When the base is 2 instead of 10, you can't really represent 1/5 either, or a lot of other numbers that you could represent exactly in decimal.
As it happens .28 is one of those numbers. So you could multiply by 100, pass the result to floor, and divide by 100, but when this gets converted back to a float, the resulting value will be a little different from .28 and so, if you then check its value, you'll still see more than 2 decimal places.
The solution would be to use something like BigDecimal that can exactly represent decimal values of a given precision.
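A minimal sketch of that approach (class name made up): construct the BigDecimal from a String so the decimal digits are captured exactly, then drop everything past two places with RoundingMode.DOWN, i.e. no rounding:

import java.math.BigDecimal;
import java.math.RoundingMode;

public class TruncateTwoPlaces {
    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("13.2836");
        BigDecimal truncated = d.setScale(2, RoundingMode.DOWN); // drop, don't round
        System.out.println(truncated); // 13.28, exactly

        // For contrast, the float route ends up on the nearest binary value instead:
        float f = (float) (Math.floor(13.2836f * 100) / 100);
        System.out.println(new BigDecimal(f)); // 13.27999973297119140625, the value actually stored
    }
}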
The standard warnings about doing precision arithmetic with floats applies, but you can do this:
float f = 13.2836f;
f = (float) (Math.floor(f * 100) / 100);
If you need to save memory in some part of your calculation, and your numbers are smaller than 2^15/100 (the range of short), you can do the following.
Part of this is taken from this post: https://stackoverflow.com/a/25201407/7256243.
float number = 1.2345667f;
number = (short) (100 * number);
number = (float) (number / 100);
You only need to remember that the shorts are 100 times larger.
Most answers went straight to how to represent floats more accurately, which is strange because you're asking:
Can I reduce the precision of a float number
Which is the exact opposite. So I'll try to answer this.
However, there are several ways to "reduce precision":
Reduce precision to gain performance
Reduce memory footprint
Round / floor arbitrarily
Make the number more "fuzzy"
Reduce the number of digits after the comma
I'll tackle those separately.
Reduce precision to gain performance
Just to get it out of the way: simply dropping precision from your calculations on a float doesn't mean they'll be any faster. Quite the contrary. This answer by #john16384:
f = Math.floor(f * 100) / 100;
It only adds computation time. If you know the number of significant digits in the result is low, don't bother removing the extra digits; just carry that information along with the number:
public class NumberWithSignificantDigits {
    private float value;
    private int significantDigits;

    // implement basic operations here, but don't floor/round anywhere
}
If you're doing this because you're worried about performance: stop it now, just use the full precision. If not, read on.
Reduce memory footprint
To actually store a number with less precision, you need to move away from float.
One such representation is using an int with a fixed point convention (i.e. the last 2 digits are past the comma).
If you're trying to save on storage space, do this. If not, read on.
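A minimal sketch of that convention, storing hundredths in an int (the class and helper names are made up for the example):

public class FixedPointHundredths {
    // Store the value as a count of hundredths, e.g. 13.2836 -> 1328.
    static int toFixed(double value) {
        return (int) (value * 100); // truncates toward zero
    }

    static double fromFixed(int hundredths) {
        return hundredths / 100.0;
    }

    public static void main(String[] args) {
        int stored = toFixed(13.2836);
        System.out.println(stored);            // 1328
        System.out.println(fromFixed(stored)); // 13.28 (as the nearest double, once converted back)
    }
}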
Round / floor arbitrarily
To keep using float, but drop its precision, several options exist:
#john16384 proposed:
`f = Math.floor(f * 100) / 100;`
Or even
f = ((int) (f * 100)) / 100f;
If the answer is this, your question is a duplicate. If not, read on.
Make the number more "fuzzy"
Since you just want to lose precision, but haven't stated how much, you could do it with bitwise operations on the raw representation:
float v = 0.123456f; // example input
int bits = Float.floatToIntBits(v);
bits = (bits >> 7) << 7; // precision lost here: the low 7 mantissa bits are cleared
float truncated = Float.intBitsToFloat(bits);
Clearing the low 7 mantissa bits coarsens the relative precision by a factor of 128 (roughly two decimal digits of the significand).
Clearing the low 10 bits coarsens it by a factor of 1024 (roughly three decimal digits).
I haven't tested the performance of those, but if you read this, you probably don't care.
If you want to lose precision, and you don't care about formatting (numbers may still have a large number of digits after the comma, like 0.9765625 instead of 1), do this. If you care about formatting and want a limited number of digits after the comma, read on.
Reduce the number of digits after the comma
For this you can:
Follow #Mark Adelsberger's suggestion of BigDecimals, or
Store as a String (yuk)
Because floats or doubles won't let you do this in most cases.
I am writing tests for code performing calculations on floating point numbers. Quite expectedly, the results are rarely exact and I would like to set a tolerance between the calculated and expected result. I have verified that in practice, with double precision, the results are always correct after rounding off the last two significant decimals, and usually already after rounding off the last one. I am aware of the format in which doubles and floats are stored, as well as the two main methods of rounding (precise via BigDecimal and faster via multiplication, Math.round and division). As the mantissa is stored in binary, however, is there a way to perform rounding using base 2 rather than 10?
Just clearing the last 3 bits almost always yields equal results, but if I could push it and instead 'add 2' to the mantissa if its second least significant bit is set, I could probably reach the limit of accuracy. This would be easy enough, except I have no idea how to handle overflow (when all bits 52-1 are set).
A Java solution would be preferred, but I could probably port one for another language if I understood it.
EDIT:
As part of the problem was that my code was generic with regard to arithmetic (relying on the scala.Numeric type class), what I did was incorporate the rounding suggested in the answer into a new numeric type, which carried the calculated number (floating point in this case) together with the rounding error, essentially representing a range instead of a point. I then overrode equals so that two numbers are equal if their error ranges overlap (and they share arithmetic, i.e. the number type).
Yes, rounding off binary digits makes more sense than going through BigDecimal and can be implemented very efficiently if you are not worried about being within a small factor of Double.MAX_VALUE.
You can round a floating-point double value x with the following sequence in Java (untested):
double t = 9 * x; // beware: this overflows if x is too close to Double.MAX_VALUE
double y = x - t + t;
After this sequence, y should contain the rounded value. Adjust the distance between the two set bits in the constant 9 in order to adjust the number of bits that are rounded off. The value 3 rounds off one bit. The value 5 rounds off two bits. The value 17 rounds off four bits, and so on.
This sequence of instructions is attributed to Veltkamp and is typically used in "Dekker multiplication". This page has some references.
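A small sketch of the same idea in Java, rounding off the three lowest significand bits with the constant 9 = 2^3 + 1 (the class and method names are made up; it assumes, as noted, that x is far enough from Double.MAX_VALUE that the multiplication cannot overflow):

public class BinaryRounding {
    // Round x so that its 3 lowest significand bits become zero,
    // using the Veltkamp-style sequence above with the constant 9 = 2^3 + 1.
    static double roundOffThreeBits(double x) {
        double t = 9 * x; // beware of overflow near Double.MAX_VALUE
        return x - t + t;
    }

    public static void main(String[] args) {
        double x = 0.1;
        double y = roundOffThreeBits(x);
        System.out.println(Long.toHexString(Double.doubleToLongBits(x))); // 3fb999999999999a
        System.out.println(Long.toHexString(Double.doubleToLongBits(y))); // 3fb9999999999998, low 3 bits rounded away
    }
}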
How do you explain floating point inaccuracy to fresh programmers and laymen who still think computers are infinitely wise and accurate?
Do you have a favourite example or anecdote which seems to get the idea across much better than a precise, but dry, explanation?
How is this taught in Computer Science classes?
There are basically two major pitfalls people stumble into with floating-point numbers.
The problem of scale. Each FP number has an exponent which determines the overall "scale" of the number, so you can represent either really small values or really large ones, though the number of digits you can devote for that is limited. Adding two numbers of different scale will sometimes result in the smaller one being "eaten" since there is no way to fit it into the larger scale.
PS> $a = 1; $b = 0.0000000000000000000000001
PS> Write-Host a=$a b=$b
a=1 b=1E-25
PS> $a + $b
1
As an analogy for this case you could picture a large swimming pool and a teaspoon of water. Both are of very different sizes, but individually you can easily grasp how much they roughly are. Pouring the teaspoon into the swimming pool, however, will leave you still with roughly a swimming pool full of water.
(If the people learning this have trouble with exponential notation, one can also use the values 1 and 100000000000000000000 or so.)
Then there is the problem of binary vs. decimal representation. A number like 0.1 can't be represented exactly with a limited amount of binary digits. Some languages mask this, though:
PS> "{0:N50}" -f 0.1
0.10000000000000000000000000000000000000000000000000
But you can “amplify” the representation error by repeatedly adding the numbers together:
PS> $sum = 0; for ($i = 0; $i -lt 100; $i++) { $sum += 0.1 }; $sum
9,99999999999998
I can't think of a nice analogy to properly explain this, though. It's basically the same problem why you can represent 1/3 only approximately in decimal because to get the exact value you need to repeat the 3 indefinitely at the end of the decimal fraction.
Similarly, binary fractions are good for representing halves, quarters, eighths, etc. but things like a tenth will yield an infinitely repeating stream of binary digits.
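For a Java flavour of the same two observations (class name made up): the BigDecimal(double) constructor exposes the exact value stored for 0.1, and a loop makes the accumulated error visible.

import java.math.BigDecimal;

public class TenthDemo {
    public static void main(String[] args) {
        // The exact value of the double closest to 0.1:
        System.out.println(new BigDecimal(0.1));
        // 0.1000000000000000055511151231257827021181583404541015625

        // Adding it 100 times amplifies the representation error:
        double sum = 0;
        for (int i = 0; i < 100; i++) {
            sum += 0.1;
        }
        System.out.println(sum); // just shy of 10, e.g. 9.99999999999998
    }
}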
Then there is another problem, though most people don't stumble into that, unless they're doing huge amounts of numerical stuff. But then, those already know about the problem. Since many floating-point numbers are merely approximations of the exact value this means that for a given approximation f of a real number r there can be infinitely many more real numbers r1, r2, ... which map to exactly the same approximation. Those numbers lie in a certain interval. Let's say that rmin is the minimum possible value of r that results in f and rmax the maximum possible value of r for which this holds, then you got an interval [rmin, rmax] where any number in that interval can be your actual number r.
Now, if you perform calculations on that number—adding, subtracting, multiplying, etc.—you lose precision. Every number is just an approximation, therefore you're actually performing calculations with intervals. The result is an interval too and the approximation error only ever gets larger, thereby widening the interval. You may get back a single number from that calculation. But that's merely one number from the interval of possible results, taking into account precision of your original operands and the precision loss due to the calculation.
That sort of thing is called Interval arithmetic and at least for me it was part of our math course at the university.
Show them that the base-10 system suffers from exactly the same problem.
Try to represent 1/3 as a decimal representation in base 10. You won't be able to do it exactly.
So if you write "0.3333", you will have a reasonably exact representation for many use cases.
But if you move that back to a fraction, you will get "3333/10000", which is not the same as "1/3".
Other fractions, such as 1/2 can easily be represented by a finite decimal representation in base-10: "0.5"
Now base-2 and base-10 suffer from essentially the same problem: both have some numbers that they can't represent exactly.
While base-10 has no problem representing 1/10 as "0.1", in base-2 you'd need an infinite representation starting with "0.000110011...".
How's this for an explanation for the layman? One way computers represent numbers is by counting discrete units. These are digital computers. For whole numbers, those without a fractional part, modern digital computers count powers of two: 1, 2, 4, 8, ... Place value, binary digits, blah, blah, blah. For fractions, digital computers count inverse powers of two: 1/2, 1/4, 1/8, ... The problem is that many numbers can't be represented by a sum of a finite number of those inverse powers. Using more place values (more bits) will increase the precision of the representation of those 'problem' numbers, but never get it exact, because there is only a limited number of bits; some numbers would need an infinite number of bits to be represented exactly.
Snooze...
OK, you want to measure the volume of water in a container, and you only have 3 measuring cups: full cup, half cup, and quarter cup. After counting the last full cup, let's say there is one third of a cup remaining. Yet you can't measure that because it doesn't exactly fill any combination of available cups. It doesn't fill the half cup, and the overflow from the quarter cup is too small to fill anything. So you have an error - the difference between 1/3 and 1/4. This error is compounded when you combine it with errors from other measurements.
In python:
>>> 1.0 / 10
0.10000000000000001
Explain how some fractions cannot be represented precisely in binary. Just like some fractions (like 1/3) cannot be represented precisely in base 10.
Another example, in C
printf (" %.20f \n", 3.6);
incredibly gives
3.60000000000000008882
Here is my simple understanding.
Problem:
The value 0.45 cannot be accurately represented by a float; it is stored as a nearby value, approximately 0.449999988. Why is that?
Answer:
An int value of 45 is represented by the binary value 101101.
In order to make the value 0.45, it would be accurate if you could take 45 x 10^-2 (= 45 / 10^2).
But that’s impossible because you must use the base 2 instead of 10.
So the closest to 10^2 = 100 would be 128 = 2^7. The total number of bits you need is 9 : 6 for the value 45 (101101) + 3 bits for the value 7 (111).
Then the value 45 x 2^-7 = 0.3515625. Now you have a serious inaccuracy problem. 0.3515625 is not nearly close to 0.45.
How do we improve this inaccuracy? Well we could change the value 45 and 7 to something else.
How about 460 x 2^-10 = 0.44921875? You are now using 9 bits for 460 and 4 bits for 10. That's a bit closer, but still not that close. However, if your initial desired value was 0.44921875, then you would get an exact match with no approximation.
So the formula for your value would be X = A x 2^B, where A and B are integer values, positive or negative.
Obviously, the higher these numbers can be, the higher your accuracy becomes; however, as you know, the number of bits to represent the values A and B is limited. For float you have a total of 32 bits; double has 64 and Decimal has 128.
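A small sketch that asks Java which A and B it actually picks for 0.45 as a float (the class name is made up; the field extraction follows the IEEE 754 single-precision layout):

import java.math.BigDecimal;

public class NearestFloatToPoint45 {
    public static void main(String[] args) {
        float f = 0.45f;
        // Exact decimal value of the float nearest to 0.45:
        System.out.println(new BigDecimal(f)); // 0.449999988079071044921875

        int bits = Float.floatToIntBits(f);
        int significand = (bits & 0x7fffff) | 0x800000;   // 23 fraction bits plus the implicit leading 1
        int exponent = ((bits >>> 23) & 0xff) - 127 - 23; // unbias, then account for the 23 fraction bits
        System.out.println(significand + " x 2^" + exponent); // 15099494 x 2^-25
    }
}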
A cute piece of numerical weirdness may be observed if one converts 9999999.4999999999 to a float and back to a double. The result is reported as 10000000, even though that value is obviously closer to 9999999, and even though 9999999.499999999 correctly rounds to 9999999.
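A short sketch reproducing this (class name made up); the interesting part is that the decimal literal is already rounded to the double 9999999.5 before the conversion to float ever happens, and the float conversion then resolves that exact tie upward:

public class DoubleRoundingOddity {
    public static void main(String[] args) {
        double d = 9999999.4999999999;
        System.out.println(d);         // 9999999.5 -- the literal is already rounded up as a double
        System.out.println((float) d); // 1.0E7 -- the exact .5 tie then rounds to the even float 10000000
        System.out.println(Math.round(9999999.499999999)); // 9999999 -- one fewer 9, and the tie never arises
    }
}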