Algorithm speed with double and int? - java

How do algorithms using double compare to ones using int values? Is there much of a difference, or is it negligible?
In my case, I have a canvas that uses integers so far. But now that I'm implementing scaling, I'm probably going to switch everything to double. Would this have a big impact on calculations?
If so, would maybe rounding doubles to only a few fractions optimize performance?
Or am I totally on the path of over-optimization and should just use doubles without any headache?

You're in GWT, so ultimately your code will be JavaScript, and JavaScript has a single type for numeric data: Number, which corresponds to Java's Double.
Using integers in GWT can mean either that the generated code does more work than it would with doubles (performing a narrowing conversion of numbers to integer values), or that the code won't change at all; I don't know exactly what the GWT compiler does, and it may also depend on context, such as crossing JSNI boundaries.
All in all, expect the same or slightly better performance using doubles (unless you later have to convert back to integers, of course). Generally speaking, though, you're over-optimizing. Also: optimization needs measurement and metrics; if you don't have them, you're on the “premature optimization” path.

There is a sizable difference between integers and doubles, though generally doubles are also very fast.
Integers are still faster than doubles, because arithmetic operations on integers take very few clock cycles.
Doubles are also fast because they are generally supported natively by a floating-point unit, which means the calculation is done by dedicated hardware. Unfortunately, they are usually somewhere between 2x and 40x slower than integers.
Having said this, the CPU will usually spend quite a bit of time on housekeeping like loops and function calls, so if it is fast enough with integers, most of the time (perhaps even 99% of the time), it will be fast enough with doubles.
The only time floating point numbers are orders of magnitude slower is when they must be emulated, because there is no hardware support. This generally only occurs on embedded platforms, or where uncommon floating point types are used (eg. 128-bit floats, or decimal floats).
The result of some benchmarks can be found at:
http://pastebin.com/Kx8WGUfg
https://stackoverflow.com/a/2550851/1578925
but generally,
32-bit platforms have a greater disparity between doubles and integers
integers are always at least twice as fast on adding and subtracting
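To get a feel for the gap on your own machine, here is a minimal, hedged sketch of the kind of loop you could time; it is not a rigorous benchmark (no JMH, no warm-up control), the class and variable names are purely illustrative, and the numbers it prints are indicative at best.
public class IntVsDoubleSketch {
    public static void main(String[] args) {
        final int iterations = 100_000_000;

        long start = System.nanoTime();
        int intSum = 0;
        for (int i = 0; i < iterations; i++) {
            intSum += i;                      // integer add: typically a single ALU instruction
        }
        long intTime = System.nanoTime() - start;

        start = System.nanoTime();
        double doubleSum = 0.0;
        for (int i = 0; i < iterations; i++) {
            doubleSum += i;                   // implicit int-to-double conversion plus FPU add
        }
        long doubleTime = System.nanoTime() - start;

        // Print the sums so the JIT cannot discard the loops entirely.
        System.out.println("int sum    = " + intSum + " in " + intTime / 1_000_000 + " ms");
        System.out.println("double sum = " + doubleSum + " in " + doubleTime / 1_000_000 + " ms");
    }
}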

If you are going to change the type from int to double in your program, you will also have to rewrite the lines of code that compare two integers. For example, if a and b are two ints and you compare them with if (a == b), then after changing a and b to double you will have to change this line and use a comparison method for doubles instead.
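For illustration, here is a small sketch of what that change can look like, using Double.compare and an epsilon-style tolerance; the EPSILON value below is an arbitrary choice for the example.
public class ComparisonExample {
    private static final double EPSILON = 1e-9;   // illustrative tolerance, not a universal constant

    public static void main(String[] args) {
        int a = 3, b = 3;
        System.out.println(a == b);                       // fine for ints: true

        double x = 0.1 + 0.2;
        double y = 0.3;
        System.out.println(x == y);                       // false: binary rounding error
        System.out.println(Math.abs(x - y) < EPSILON);    // true: tolerance-based comparison
        System.out.println(Double.compare(x, y));         // 1: x is slightly greater than y
    }
}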

Not knowing the exact needs of your program, my instinct is that you're over-optimizing. When choosing between using ints or doubles, you usually base the decision on what type of value you need over which will run faster. If you need floating point values that allow for (not necessarily precise) decimal values, go for doubles. If you need precise integer values, go for ints.
A couple more points:
Rounding your doubles to certain fractions should have no impact on performance. In fact, the overhead required to round them in the first place would probably have a negative impact.
While I would argue not to worry about the performance differences between int and double, there is a significant difference between int and Integer. While an int is a primitive data type that can be used efficiently, an Integer is an object that essentially just holds an int. This incurs a significant overhead. Integers are useful in that they can be stored in collections like Vectors while ints cannot, but in all other cases it's best to use ints.
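A rough sketch of that distinction (using ArrayList rather than Vector, but the boxing behaviour is the same; the class name is illustrative):
import java.util.ArrayList;
import java.util.List;

public class IntVsIntegerSketch {
    public static void main(String[] args) {
        int[] primitives = new int[3];          // raw 32-bit values, no per-element object overhead
        primitives[0] = 42;

        List<Integer> boxed = new ArrayList<>();
        boxed.add(42);                          // autoboxing: int -> Integer object
        int back = boxed.get(0);                // unboxing: Integer -> int

        System.out.println(primitives[0] + " " + back);
    }
}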

In general, maths that naturally fits as an integer will be faster than maths that naturally fits as a double, BUT trying to force double maths to work as an integer is almost always slower; moving back and forth between the two costs more than the speed boost you get.
If you're considering something like:
I only want one decimal place within my 'quasi-integer float', so I'll just multiply everything by 10;
5.5*6.5
so 5.5 --> 55 and
so 6.5 --> 65
with a special multiplying function:
public int specialIntegerMultiply(int a, int b) {
    return a * b / 10;
}
Then for the love of god don't, it'll probably be slower with all the extra overhead and it'll be really confusing to write.
P.S. Rounding the doubles will make no difference at all, as the remaining decimal places will still exist; they'll just all be 0 (in decimal, that is; in binary that won't even be true).
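A tiny sketch of that last point, assuming you round to one decimal place; new BigDecimal(double) is used here only to reveal the exact binary value actually stored.
import java.math.BigDecimal;

public class RoundingSketch {
    public static void main(String[] args) {
        double rounded = Math.round(0.0999 * 10) / 10.0;   // nominally 0.1
        System.out.println(rounded);                        // prints 0.1
        // The exact binary value stored in the double:
        System.out.println(new BigDecimal(rounded));
        // 0.1000000000000000055511151231257827021181583404541015625
    }
}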

Related

Java - Which data type for physical calculations?

I'm trying to create a physical calculation program in Java. Therefore I used some formulas, but they always returned a wrong value. I split them up and found (I have used long so far):
8 * 830584000 = -1945262592
which is obviously wrong. There are fractions and very large numbers in the formulas, such as 6.095E23 and 4.218E-10.
So what datatype would fit best to get a correct result?
Unless you have a very good reason not to, double is the best type for physical calculations. It was good enough for the wormhole modelling in the film Interstellar so, dare I say it, is probably good enough for you. Note well, though, that as a rough guide it gives you only about 15 significant decimal digits of precision.
But you need to help the Java compiler:
Write 8.0 * 830584000 for that expression to be evaluated in double precision. 8.0 is a double literal and causes the other arguments to be promoted to a similar type.
Currently you are using integer arithmetic, and are observing wrap-around effects (a minimal sketch follows this answer).
Reference: https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
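A minimal sketch of the wrap-around and the effect of the double literal described above (the class name is illustrative):
public class OverflowSketch {
    public static void main(String[] args) {
        System.out.println(8 * 830584000);      // -1945262592: 32-bit int arithmetic wraps around
        System.out.println(8L * 830584000);     // 6644672000: 64-bit long arithmetic
        System.out.println(8.0 * 830584000);    // 6.644672E9: double arithmetic, as recommended above
    }
}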
If you need perfect accuracy for large decimal numbers, BigDecimal is the way to go, though it will be at the cost of performance. If you know your numbers are not that large, you can use long instead, which should be faster, but it has a much more limited range and will require you to convert from and to decimal numbers.
As physics calculations involve a lot of floating-point operations, the float data type can be a good option in such calculations. I hope it helps. :)

Is there any reason that default decimal literal is not BigDecimal type in Clojure?

I learned that Clojure reader interprets decimal literal with suffix 'M', like 1.23M, as BigDecimal. And I also know that decimal numbers with no 'M' become Java double.
But I think it would be better if a normal decimal number were BigDecimal and the host-dependent decimal had a suffix, like 1.23H. Then, when a number is corrupted or truncated because of the precision limit of IEEE doubles, we could easily notice that it is precision-limited. Also, I think the easier expression should be host-independent.
Is there any reason that Clojure interprets a literal decimal as a Java double, other than time performance? I don't think time performance is the answer, because it's not C/C++, and another way to declare a host-dependent decimal could be implemented, just like '1.23H'.
Once upon a time, for integers, Clojure would auto-promote to larger sizes when needed. This was changed so that overflow exceptions are thrown. My sense, from afar, was that:
The powers that be meant for Clojure to be a practical language doing practical things in a practical amount of time. They didn't want performance to blow up because number operations were unexpectedly using arbitrary-precision libraries instead of CPU integer operations. Contrast this with Scheme, which seems to prioritize mathematical niceness over practicality.
People did not like being surprised at run time when inter-op calls would fail because the Java library expected a 32 bit integer instead of an arbitrary sized integer.
So it was decided that the default was to use normal integers (I think Java longs?) and only use arbitrarily large integers when the programmer called for it, when the programmer knowingly decided that they were willing to take the performance hit, and the inter-op hit.
My guess is that similar decisions were made for numbers with decimal points.
Performance could be one thing. Perhaps clojure.core developers could chime in regarding the reasons.
I personally think it is not so much of a big deal not to have BigDecimal by default, since:
there is a literal for that, as you point out: M
there are operations like +', *', -' ... (note the quote) that "support arbitrary precision".

What is meant by Real Numbers or values in java and a very elementary explanation of primitive types in java

A quote from a book on Java "I advise sticking to type double for real numbers" and also "you should stick to the double type for real values". I don't understand what is meant by a real number or value... Real number as opposed to what?
Real number as opposed to integers. In mathematics, a real number can be any value along the continuum, such as 4.2 or pi. The integers are a subset of the real numbers.
Here are the Java primitives. Some of the important ones for numbers include int when you want a whole number and double when you want to allow fractions. If you deviate from those, you generally have a specific reason for doing so.
Integers are very easy to represent in binary, and it's easy to specify a specific range of integers that can be represented exactly with a specified number of bits. An int in Java uses 32 bits and gets you from -2,147,483,648 to 2,147,483,647. The other integral types are similar but with varying numbers of bits.
However, representing numbers along the continuum is more difficult. Between any two points on the real number line, there are infinitely many other real numbers. As such, it's not possible to exactly represent all of the real numbers in an interval. One way around this is with floating-point numbers. I won't get too much into the details, but basically some precision is lost so that there are gaps in what can be represented exactly. For many purposes this is inconsequential, but for things like tracking a bank account balance, this can matter. It might be worth reading the famous What Every Computer Scientist Should Know About Floating-Point Arithmetic.
In Java, one way around some of these issues includes using something like BigDecimal. Some other languages might offer other primitives as alternatives. For instance, C# has a decimal data type while Java does not.
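A small sketch of the kind of precision gap being described, together with the BigDecimal alternative; the values are chosen purely for illustration.
import java.math.BigDecimal;

public class RealNumberSketch {
    public static void main(String[] args) {
        System.out.println(1.03 - 0.42);                     // 0.6100000000000001: binary double can't hold these exactly

        BigDecimal price  = new BigDecimal("1.03");          // String constructor avoids inheriting double's error
        BigDecimal change = new BigDecimal("0.42");
        System.out.println(price.subtract(change));          // 0.61 exactly
    }
}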
Real means floating point, i.e. double or float, as opposed to integral (int or long).

How to confirm whether different results are due to differences in floating point handling?

I have converted a relatively simple algorithm that performs a large number of calculations on numbers of the double type from C++ to Java; however, running the algorithm on the two platforms on the same machine produces slightly different results. The algorithm multiplies and sums lots of doubles and ints. I am casting ints to double in the Java algorithm; the C++ algorithm does not cast.
For example, on one run I get the results:
(Java) 64684970
(C++) 65296408
(Printed to ignore decimal places)
Naturally, there may be an error in my algorithm, however before I start spending time debugging, is it possible that the difference could be explained by different floating point handling in C++ and Java? If so, can I prove that this is the problem?
Update - the place where the types differ is a multiplication between two ints that is then added to a running total double.
Having modified the C++ code, currently in both:
mydouble += (double)int1 * (double)int2;
You could add rounding to each algorithm using the same precision. This would allow each calculation to handle the data the same way. If you do this, it would rule out the algorithm being the problem, as the data would use the same precision at each step of the calculation in both the C++ and Java versions.
AFAIK there are times when the value of a double literal could change between two C++ compiler versions (when the algorithm used to convert the source text to the nearest double value changed).
Also, on some CPUs, floating-point registers are wider than 64/32 bits (greater range and precision), and how that influences the result depends on how the compiler and JIT move values in and out of these registers; this is likely to differ between Java and C++.
Java has the strictfp keyword to ensure that only 64/32-bit precision is used, although that comes with a run-time cost (a minimal sketch follows this answer). There are also a large number of options that influence how C++ compilers treat and optimize floating-point computations by throwing out guarantees/rules made by the IEEE standard.
If the algorithm is mostly the same, you could check where the first difference appears for the same input.
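As a minimal sketch of the strictfp idea mentioned above, with class and method names that are purely hypothetical, here is the question's cast-and-accumulate pattern wrapped in a strictfp class. On older JVMs this forced intermediate results down to strict IEEE 754 32/64-bit values instead of wider hardware registers; since Java 17 all floating-point arithmetic is strict anyway, so the modifier is effectively a no-op there.
public strictfp class StrictAccumulator {
    public static double accumulate(int[] xs, int[] ys) {
        double total = 0.0;
        for (int i = 0; i < xs.length; i++) {
            total += (double) xs[i] * (double) ys[i];   // same cast pattern as the question's update
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(accumulate(new int[] {3, 5}, new int[] {7, 11}));   // 76.0
    }
}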
In Java, double is a 64-bit floating-point number.
In C++, double is a floating-point number guaranteed to have at least 32-bit precision.
To find out the actual size, in bytes, of a C++ double on your system, use sizeof(double).
If it turns out that sizeof(double) == 8, it is almost certain that the difference is due to an error in the translation from one language to another, rather than differences in the handling of floating-point numbers.
(Technically, the size of a byte is platform-dependant, but most modern architectures use 8-bit bytes.)

What is the right data type for calculations in Java

Should we use double or BigDecimal for calculations in Java?
How much is the overhead in terms of performance for BigDecimal as compared to double?
For a serious financial application BigDecimal is a must.
Depending on how many digits you need, you can go with a long and a decimal factor for visualization.
For general floating point calculations, you should use double. If you are absolutely sure that you really do need arbitrary precision arithmetic (most applications don't), then you can consider BigDecimal.
You will find that double will significantly outperform BigDecimal (not to mention being easier to work with) for any application where double is sufficient precision.
Update: You commented on another answer that you want to use this for a finance related application. This is one of the areas where you actually should consider using BigDecimal, otherwise you may get unexpected rounding effects from double calculations. Also, double values have limited precision, and you won't be able to accurately keep track of pennies at the same time as millions of dollars.
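A small sketch of the kind of rounding drift being described, assuming ten 10-cent items are summed (class name and values are illustrative):
import java.math.BigDecimal;

public class MoneySketch {
    public static void main(String[] args) {
        double doubleTotal = 0.0;
        BigDecimal exactTotal = BigDecimal.ZERO;
        for (int i = 0; i < 10; i++) {
            doubleTotal += 0.10;                              // binary double: each add carries a tiny error
            exactTotal = exactTotal.add(new BigDecimal("0.10"));
        }
        System.out.println(doubleTotal);   // 0.9999999999999999
        System.out.println(exactTotal);    // 1.00
    }
}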
How much is the overhead in terms of performance for BigDecimal as compared to double?
A lot. For example, a multiplication of two doubles is a single machine instruction. Multiplying two BigDecimals is probably a minimum of 50 machine instructions, and has complexity of O(N * M) where M and N are the number of bytes used to represent the two numbers.
However, if your application requires the calculation to be "decimally correct", then you need to accept the overhead.
However (#2) ... even BigDecimal can't do this calculation with real number accuracy:
1/3 + 1/3 + 1/3 -> ?
To do that computation precisely you would need to implement a Rational type; i.e. a pair of BigInteger values ... and something to reduce the common factors.
However (#3) ... even a hypothetical Rational type won't give you a precise numeric representation for (say) Pi.
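For illustration, here is a sketch of the 1/3 case with BigDecimal: without an explicit scale and rounding mode the division throws, and with them the sum still falls short of 1. The scale of 20 is an arbitrary choice for the example.
import java.math.BigDecimal;
import java.math.RoundingMode;

public class OneThirdSketch {
    public static void main(String[] args) {
        BigDecimal one = BigDecimal.ONE;
        BigDecimal three = new BigDecimal("3");

        // one.divide(three);  // would throw ArithmeticException: non-terminating decimal expansion

        BigDecimal third = one.divide(three, 20, RoundingMode.HALF_UP);   // 0.33333333333333333333
        BigDecimal sum = third.add(third).add(third);
        System.out.println(sum);   // 0.99999999999999999999, not exactly 1
    }
}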
As always: it depends.
If you need the precision (even for "small" numbers, when representing amounts for example) go with BigDecimal.
In some scientific applications, double may be a better choice.
Even in finance we can't answer without knowing what area. For instance, if you were doing currency conversions of billions of dollars, where the conversion rate could be quoted to 5 decimal places, you might have problems with double. Whereas for simply adding and subtracting balances you'd be fine.
If you don't need to work in fractions of a cent/penny, maybe an integral type might be more appropriate, again it depends on the size of numbers involved.
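A sketch of that integral-type approach, assuming amounts are stored as whole cents in a long and only converted to a decimal string for display (names and values are illustrative):
public class CentsSketch {
    public static void main(String[] args) {
        long balanceCents = 1_000_000_00L;     // $1,000,000.00 held as 100,000,000 cents
        long priceCents   = 19_99L;            // $19.99

        balanceCents -= priceCents;            // exact integer arithmetic, no rounding drift

        System.out.printf("Balance: $%d.%02d%n", balanceCents / 100, balanceCents % 100);
        // Balance: $999980.01
    }
}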
