Why can't Java gracefully return some value for a division by zero, instead of having to throw an exception?
I am getting an ArrayIndexOutOfBoundsException: 0. This is because damageTaken is actually an array of values that stores many different 'damages'.
In Java I'm trying to create a progress bar. Our example: damage incurred in a racing game, with the bar's height set as a percentage of the maximum damage allowed before game over.
At the start of the program damageTaken = 0;
(damageTaken / maximumDamage)
will give numbers between 0 and 1.
Then I just multiply that by the height of the progress bar, to create a fill bar of the appropriate height.
The program crashes. I want the progress bar to have zero height!
You are not dividing by zero, you are dividing zero by something.
It is completely allowed to take two halves of zero. The answer is zero. Say you have zero apples. You split your zero apples between Alice and Bob. Alice and Bob now both have zero apples.
But, you cannot divide by zero. Say you have two apples. Now, you want to give these apples to zero people. How many apples does each person get? The answer is undefined, and so division by zero is impossible.
(damageTaken / maximumDamage)
This gives you a division by zero exception only if maximumDamage is zero.
If damageTaken is zero, there is no problem.
Just add a special case for 0:
private int getProgress() {
    if (damageTaken == 0) {
        return 0;
    } else {
        // cast to double so the ratio isn't truncated to 0 by integer division
        return (int) ((double) damageTaken / maximumDamage * progress.getHeight());
    }
}
However (and it's a big however), the reason you are getting divide by 0 is that maximumDamage is 0, not damageTaken. So what you probably really want is:
private int getProgress() {
    if (maximumDamage == 0) {
        return 0;
    } else {
        // cast to double so the ratio isn't truncated to 0 by integer division
        return (int) ((double) damageTaken / maximumDamage * progress.getHeight());
    }
}
Conceptually, having 4/0 yield some arbitrary number would be no worse than having an attempt to double a count of 2000000000 yield a count of -294967296. Most processors, however, will ignore most kinds of arithmetic overflow unless one explicitly checks for it, but cannot ignore an attempt to divide by zero unless one explicitly checks the operands beforehand (and skips the operation if they are invalid). Given that many processors have "overflow" flags, nothing would prevent a processor from specifying that an attempted divide-by-zero should simply do nothing but set the overflow flag (a successful divide operation should clear it); code which wants to trigger an exception in such a case could do so.
I suspect the reason for the distinct behavior stems from the early days of computing; the hardware for a division instruction could judge that it was complete when the remainder was less than the divisor. If that never happened, the instruction could get stalled until a general supervisory clock circuit (designed to signal a fault if for whatever reason instructions stop being executed) shut things down. Hardware to detect the problem and exit without stalling the CPU would have been trivial by today's standards, but in the days when computers were built from discrete transistors it was cheaper, and almost as effective, to tell programmers "do not attempt to divide by zero, ever".
Let's stop talking about illegality, as though uniformed mathematics police were about to storm the room. ;) For the mathematical reasoning behind this, this related question is useful.
It is fair to ask why, after all this time, modern programming platforms don't handle division-by-zero errors for us in some standard way - but to answer the question: Java, like almost any other platform, throws an exception when integer division by zero occurs. In general, you can either:
Handle before: check the variable to be used as your denominator for zero before doing the division
Handle after: you can catch the exception thrown when the division by zero occurs and handle it gracefully.
Note that, as good practice, the second method is only appropriate when you wouldn't expect the variable to ever be zero under normal system operation.
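A minimal sketch of both options (the method and parameter names here are made up for illustration; in Java, integer division by zero throws ArithmeticException, while floating-point division just yields Infinity or NaN):

// Handle before: check the denominator first.
int progressHeightChecked(int damageTaken, int maximumDamage, int barHeight) {
    if (maximumDamage == 0) {
        return 0; // nothing meaningful to show; skip the division entirely
    }
    return (int) ((double) damageTaken / maximumDamage * barHeight);
}

// Handle after: catch the exception thrown by integer division by zero.
int progressHeightCaught(int damageTaken, int maximumDamage, int barHeight) {
    try {
        return damageTaken / maximumDamage * barHeight;
    } catch (ArithmeticException e) {
        return 0; // division by zero: report an empty bar
    }
}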
So yes, you need a special case to handle this situation. Welcome to programming. ;)
(EDIT: Looking at where this question started, it really ended up in a much better place. It wound up being a nice resource on the limits of RDD sizes in Spark when set through SparkContext.parallelize() vs. the actual size limits of RDDs. Also uncovered some arguments to parallelize() not found in user docs. Look especially at zero323's comments and his accepted answer.)
Nothing new under the sun, but I can't find this question already asked ... The question is about how wrong/inadvisable/improper it might be to run a cast inside a large for loop in Java.
I want to run a for loop to initialize an ArrayList before passing it to the SparkContext.parallelize() method. I have found that passing an uninitialized array to Spark can cause an empty-collection error.
I have seen many posts about how floats and doubles are bad ideas as counters, and I get that; it just seems like this is a bad idea too? Like there must be a better way?
numListLen will be 10^6 * 10^3 for now, maybe as large as 10^12 at some point.
List<Double> numList = new ArrayList<Double>(numListLen);
for (long i = 0; i < numListLen; i++) {
numList.add((double) i);
}
I would love to hear where specifically this code falls down and can be improved. I'm a junior-level CS student so I haven't seen all the angles yet haha. Here's a CMU page seemingly approving this approach in C using implicit casting.
Just for background, numList is going to be passed to Spark to tell it how many times to run a simulation and create a RDD with the results, like this:
JavaRDD dataSet = jsc.parallelize(numList,SLICES_AKA_PARTITIONS);
// the function will be applied to each member of dataSet
Double count = dataSet.map(new Function<Double, Double>() {...
(Actually I'd love to run this ArrayList creation through Spark, but it doesn't seem to take enough time to warrant that: 5 seconds on my i5 dual-core, but if boosted to 10^12 then ... longer)
davidstenberg and Konstantinos Chalkias already covered the problems related to using Doubles as counters, and radiodef pointed out an issue with creating objects in the loop, but at the end of the day you simply cannot allocate an ArrayList larger than Integer.MAX_VALUE. On top of that, even with 2^31 elements, this is a pretty large object, and serialization and network traffic can add substantial overhead to your job.
There are a few ways you can handle this:
using the SparkContext.range method:
range(start: Long, end: Long,
step: Long = 1, numSlices: Int = defaultParallelism)
initializing the RDD using a range object. In PySpark you can use range (xrange in Python 2), in Scala a Range:
val rdd = sc.parallelize(1L to Long.MaxValue)
It requires constant memory on the driver and constant network traffic per executor (all you have to transfer is just the beginning and the end).
In Java 8, LongStream.range could work the same way, but it looks like JavaSparkContext doesn't provide the required constructors yet. If you're brave enough to deal with all the singletons and implicits you can use Scala Range directly, and if not you can simply write a Java-friendly wrapper.
initializing the RDD using the emptyRDD method / a small number of seeds and populating it using mapPartitions(WithIndex) / flatMap (a rough sketch follows this list). See for example Creating array per Executor in Spark and combine into RDD
With a little bit of creativity you can actually generate an infinite number of elements this way (Spark FlatMap function for huge lists).
given your particular use case, you should also take a look at mllib.random.RandomRDDs. It provides a number of useful generators from different distributions.
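As an illustration of the seeds-plus-flatMap idea, here is a rough Java sketch (the partition count, sizes, and variable names are hypothetical; note that flatMap expects a function returning an Iterator in Spark 2.x but an Iterable in 1.x):

// One seed per partition; each executor expands its seed into its share of the
// range, so the driver never materializes the full list.
int slices = 8;                      // hypothetical partition count
long perSlice = 1_000_000L;          // hypothetical number of elements per partition
List<Integer> seeds = new ArrayList<>();
for (int i = 0; i < slices; i++) {
    seeds.add(i);
}
JavaRDD<Double> dataSet = jsc.parallelize(seeds, slices)
    .flatMap(slice -> {
        List<Double> chunk = new ArrayList<>();
        for (long j = slice * perSlice; j < (slice + 1) * perSlice; j++) {
            chunk.add((double) j);
        }
        return chunk.iterator();     // Spark 2.x; on 1.x return the list itself
    });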
The problem is using a double or float as the loop counter. In your case the loop counter is a long and does not suffer the same problem.
One problem with a double or float as a loop counter is that floating-point precision leaves gaps in the series of numbers represented. It is possible to get to a place within the valid range of a floating-point number where adding one falls below the precision of the number being represented (requiring 16 digits when the floating-point format only supports 15, for example). If your loop went through such a point in normal execution, it would stop incrementing and continue in an infinite loop.
The other problem with doubles as loop counters is comparing two floating-point values. Rounding means that to compare the variables successfully you need to look at values within a range. While you might consider 1.0000000 and 0.999999999 equal, your computer would not. So rounding might also make you miss the loop termination condition.
Neither of these problems occurs with your long as the loop counter. So enjoy having done it right.
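A tiny demonstration of the gap problem, assuming a double counter pushed past 2^53:

double d = 9007199254740992.0;    // 2^53, the first point where doubles skip odd integers
System.out.println(d + 1.0 == d); // true: adding 1 falls below the precision, so a d++ loop here would spin forever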
Although I don't recommend the use of floating-point values (either single or double precision) as for-loop counters, in your case, where the step is not a decimal number (you use 1 as a step), everything depends on your largest expected number vs the fraction part of the double representation (52 bits).
Still, double numbers from 2^52..2^53 represent the integer part correctly, but after 2^53, you cannot always achieve integer-part precision.
In practice, and because your loop step is 1, you would not experience any problems up to 9,007,199,254,740,992 (2^53) if you used double as the counter, thus avoiding casting (you can't avoid the boxing from double to Double, though).
Perform a simple increment-test; you will see that 9,007,199,254,740,995 is the first false positive!
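One quick way to see this for yourself (a sketch, not necessarily the original author's exact test):

long a = 9007199254740993L;     // 2^53 + 1
long b = 9007199254740995L;     // 2^53 + 3
System.out.println((double) a); // 9.007199254740992E15 -> rounded down
System.out.println((double) b); // 9.007199254740996E15 -> rounded up, silently landing on a different integer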
FYI: for float numbers, you are safe incrementing till 2^24 = 16777216 (in the article you provided, it uses the number 100000001.0f > 16777216 to present the problem).
I'm trying to implement basic 2D vector math functions for a game, in Java. They will be intensively used by the game, so I want them to be as fast as possible.
I started with integers as the vector coordinates because the game needs nothing more precise for the coordinates, but for all calculations I still would have to change to double vectors to get a clear result (eg. intersection between two lines).
Using doubles, there are rounding errors. I could simply ignore them and use something like
Math.abs(d1 - d2) <= 0.0001
to compare the values, but I assume that with further calculations the error could add up until it becomes significant. So I thought I could round them after every possibly imprecise operation, but that turned out to produce much worse results, presumably because the program also rounds inexact values (eg. 0.33333333... -> 0.3333300...).
Using BigDecimal would be far too slow.
What is the best way to solve this problem?
Inaccurate Method
When you are using numbers that require precise calculations, you need to be sure that you aren't doing something like the following (and this is what it seems like you are currently doing).
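For instance, a sketch of that pattern with made-up numbers:

float value = 4.3f;
int a = Math.round(value);   // 1st rounding
value = a * 1.7f;
int b = Math.round(value);   // 2nd rounding, built on the 1st
value = b * 1.7f;
int c = Math.round(value);   // 3rd rounding, built on the 2nd
value = c * 1.7f;
int d = Math.round(value);   // 4th rounding: every earlier error is baked in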
This will result in the accumulation of rounding errors as the process continues, giving you extremely inaccurate data long-term. In the above example, you are actually rounding off the starting float 4 times, and each time it becomes more and more inaccurate!
Accurate Method
A better and more accurate way of obtaining numbers is to do this:
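Again a sketch with made-up numbers:

float value = 4.3f;               // keep the full-precision original intact
int a = Math.round(value * 1.7f); // one conversion, straight from the original
int b = Math.round(value * 2.9f); // an independent conversion, also from the original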
This will help you to avoid the accumulation of rounding errors, because each calculation is based on only one conversion and the results of that conversion are not compounded into the next calculation.
The best method of attack is to start at the highest precision that is necessary, then convert on an as-needed basis, but leave the original intact. I would suggest following the second, accurate approach above.
I started with integers as the vector coordinates because the game needs nothing more precise for the coordinates, but for all calculations I still would have to change to double vectors to get a clear result (eg. intersection between two lines).
It's important to note that you should not attempt to perform any type of rounding of your values if there is no noticeable impact on your end result; you will simply be doing more work for little to no gain, and may even suffer a performance decrease if done often enough.
This is a minor addition to the prior answer. When converting a floating-point value to an integer, it is important to round rather than just cast. In the following program, d is the largest double that is strictly less than 1.0. It could easily arise as the result of a calculation that would have had the result 1.0 in infinitely precise real-number arithmetic.
The simple cast gets result 0. Rounding first gets result 1.
public class Test {
    public static void main(String[] args) {
        double d = Math.nextDown(1.0);
        System.out.println(d);
        System.out.println((int) d);
        System.out.println((int) Math.round(d));
    }
}
Output:
0.9999999999999999
0
1
I need a value as close to 0 as possible. I need to be able to divide through this value, but it should be effectively 0.
Does Java provide an easy way of generating a double with only the least significant bit set? Or do I have to calculate it myself?
//EDIT: A little background information, because someone requested it. I know that my solution is not a particularly clean one, but here you are:
I am writing a program for homework. It calculates the resistance of a circuit consisting of multiple resistors in parallel and serial circuits.
It is a 2nd year programming class. Our teacher still designs classes for us, we need to implement them according to his design.
Parallel circuits involve calculating 1/resistance, therefore my program prohibits the creation of resistors with 0 Ohm. Physics tells you that this is impossible anyway (every metal has at least a tiny little resistance).
However, the example circuit we should use to test the program contains a 0 Ohm resistor. It is placed in a serial circuit, but resistors do not know where they are (the teacher designed it that way), so I cannot change my program to allow resistors with 0 Ohm resistance in serial circuits only.
Two solutions:
Allow 0 Ohm resistors in any case - if division by 0 occurs, well, bad luck
Set the resistor not to 0, but to a resistance one can neglect.
Neither option is very good, but I had to decide, and I went with the second. It was just a random choice that threw up this problem, and I could not let it go without solving it, so switching to the first one was not an option anymore ;-)
Use Double.MIN_VALUE:
A constant holding the smallest positive nonzero value of type double, 2^-1074. It is equal to the hexadecimal floating-point literal 0x0.0000000000001P-1022 and also equal to Double.longBitsToDouble(0x1L).
If you would like to divide by "zero" you can actually just use Double.POSITIVE_INFINITY as the result.
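A quick check of what actually happens with these values:

System.out.println(Double.MIN_VALUE);       // 4.9E-324
System.out.println(1.0 / Double.MIN_VALUE); // Infinity: the quotient overflows Double.MAX_VALUE
System.out.println(1.0 / 0.0);              // Infinity: floating-point division by zero doesn't throw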
Say you need to track the number of times a method is called and print something when it has been called n times. What would be the most efficient:
Use a long variable _counter and increase it each time the method is called; on each call, test whether _counter % n == 0.
Use an int variable _counter and increase it each time the method is called; when _counter reaches n, print the message and reset _counter to 0.
Some would say the difference is negligible, and they are probably right. I am just curious which method is most commonly used.
In this particular case, since you need to have an if-statement ANYWAY, I would say that you should just set it to zero when it reaches the count.
However, for a case where you use the value every time, and just want to "wrap round to zero when we reach a certain value", then the case is less obvious.
If you can adjust n to be a power of 2 (2, 4, 8, 16, 32 ...), then you can use the trick that counter % n is the same as counter & (n-1) - which makes the operation REALLY quick.
If n is not a power of two, then chances are that you end up doing a real divide, which is a bad idea - divide is very expensive, compared to regular instructions, and a compare and reset is highly likely faster than the divide option.
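A small sketch of the mask trick (the class and field names are made up):

class CallCounter {
    static final int N = 16;               // must be a power of 2 for the mask to work
    private int counter = 0;

    void methodCalled() {
        counter = (counter + 1) & (N - 1); // equivalent to (counter + 1) % N, with no division
        if (counter == 0) {
            System.out.println("called " + N + " times");
        }
    }
}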
Of course, as others have mentioned, if your counter ever reaches the MAX limit for the type, you could end up with all manner of fun and games.
Edit: And of course, if you are printing something, that probably takes 100 times longer than the divide, so it really is micro-optimization, unless n is quite large.
It depends on the value of n... but I bet resetting and a simple equality check is faster.
Additionally, resetting the counter is safer: you will never reach the representation limit for your number.
Edit: also consider readability; doing micro-optimizations may obscure your code.
Why not do both?
If it becomes a problem then look to see if it is worth optimizing.
But there is no point even looking at it until it is a problem (there will be much bigger problems in your algorithms).
count = (count + 1) % countMax;
I believe that it is always better to reset the counter for the following reasons:
The code is clearer to an unfamiliar programmer (for example, the maintenance programmer).
There is less chance of an arithmetic overflow when you reset the counter.
Inspecting Guava's RateLimiter will give you some idea of a similar utility implementation: http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/util/concurrent/RateLimiter.html
Here are performance times for 100,000,000 iterations, in ms:
modTime = 1258
counterTime = 449
po2Time = 108
As we can see, the power-of-2 mask outperforms the other methods by far, but it only works for powers of 2; the plain counter is also almost 2.5 times faster than the modulus. So why would we want to use modulus increments at all? Well, in my opinion they make for clean code, and used properly they are a great tool to know of.
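The benchmark itself was only linked, not shown; a rough re-creation might look like this (numbers will vary by machine and JIT, and a serious measurement would use a harness such as JMH):

public class CounterBench {
    public static void main(String[] args) {
        final int ITERATIONS = 100_000_000;
        final int N = 16; // power of 2 so all three variants are comparable
        int hits = 0, c = 0;

        long t0 = System.nanoTime();
        for (int i = 0; i < ITERATIONS; i++) { if (++c % N == 0) hits++; }
        System.out.println("modTime     = " + (System.nanoTime() - t0) / 1_000_000);

        c = 0; t0 = System.nanoTime();
        for (int i = 0; i < ITERATIONS; i++) { if (++c == N) { c = 0; hits++; } }
        System.out.println("counterTime = " + (System.nanoTime() - t0) / 1_000_000);

        c = 0; t0 = System.nanoTime();
        for (int i = 0; i < ITERATIONS; i++) { if ((++c & (N - 1)) == 0) hits++; }
        System.out.println("po2Time     = " + (System.nanoTime() - t0) / 1_000_000);

        System.out.println(hits); // use the result so the JIT cannot drop the loops
    }
}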
I know this is an old question, asked many times, but I am not able to find any satisfactory answer for it, hence asking again.
Can someone explain what exactly happens in case of integer overflow and underflow?
I have heard about some 'lower order bytes' which handle this; can someone explain what that is?
Thanks!
You could imagine that you have only 2 places in which you are counting (adding 1 each time):
00
01
10
11
100
But the last one gets cut down to "00" again. So there is your "overflow": you're back at 00. Depending on what the bits mean, this can mean several things, but most of the time it means you are going from the highest value to the lowest (11 to 00).
Mark Peters adds a good point in the comments: even without overflow you'll have a problem, because the first bit is used as the sign bit, so you'll go from high to low without losing that bit. You could say that this bit is 'separate' from the others.
Java loops the number around, either to the minimum integer (on overflow) or the maximum (on underflow).
So:
System.out.println(Integer.MAX_VALUE + 1 == Integer.MIN_VALUE);
System.out.println(Integer.MIN_VALUE - 1 == Integer.MAX_VALUE);
prints true twice.
It basically handles them without reporting an exception, performing the 2's complement arithmetic without concern for overflow or underflow and returning the expected (but incorrect) result based on its mechanics.
This means that the bits which overflow or underflow are simply chopped, and that Integer.MIN_VALUE - 1 typically returns Integer.MAX_VALUE.
As far as "lower order bytes" being a workaround, they really aren't. What is happening when you use Java bytes to do the arithmetic is that they get expanded into ints, the arithmetic is generally performed on the ints, and the end result is likely to be completely contained in the returned it as it has far more storage capacity than the starting bytes.
Another way to think of how Java handles overflow/underflow is to picture an analog clock. You can move it forward an hour at a time, but eventually the hours will start over again. You can wind the clock backward, but once you go beyond the start you are at the end again.