How to avoid StackOverflowError for a recursive function

I'm writing a function that will call itself up to about 5000 times. Of course, I get a StackOverflowError. Is there a fairly simple way to rewrite this code?
void checkBlocks(Block b, int amm) {
    //Stuff that might issue a return call
    Block blockDown = (Block) b.getRelative(BlockFace.DOWN);
    if (condition)
        checkBlocks(blockDown, amm);
    Block blockUp = (Block) b.getRelative(BlockFace.UP);
    if (condition)
        checkBlocks(blockUp, amm);
    //Same code 4 more times for each side
}
By the way, what is the limit on how deeply function calls may be nested?

Use an explicit stack of objects and a loop, rather than the call stack and recursion:
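void checkBlocks(Block b, int amm) {
    // Pending blocks are kept on the heap in this stack, not on the call stack
    Stack<Block> blocks = new Stack<Block>();
    blocks.push(b);
    while (!blocks.isEmpty()) {
        b = blocks.pop();
        //Stuff that might skip this block (use continue where the recursive version returned)
        Block blockDown = (Block) b.getRelative(BlockFace.DOWN);
        if (condition)
            blocks.push(blockDown);
        Block blockUp = (Block) b.getRelative(BlockFace.UP);
        if (condition)
            blocks.push(blockUp);
        //Same code 4 more times for each side
    }
}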
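Since Block, BlockFace and condition come from the asker's code and are not shown, here is a self-contained sketch of the same explicit-stack idea on a hypothetical 2D boolean grid (the class and method names are made up for illustration). It also tracks visited cells, which the real code would probably need as well so that neighbouring blocks do not keep pushing each other forever:

```java
import java.util.ArrayDeque;
import java.util.Deque;

class IterativeFill {
    // Counts the cells reachable from (startRow, startCol) through open cells,
    // using an explicit stack instead of recursive calls.
    static int countReachable(boolean[][] open, int startRow, int startCol) {
        int rows = open.length;
        int cols = open[0].length;
        if (!open[startRow][startCol]) {
            return 0;
        }
        boolean[][] visited = new boolean[rows][cols];
        Deque<int[]> stack = new ArrayDeque<>();
        stack.push(new int[] {startRow, startCol});
        visited[startRow][startCol] = true;
        int[][] neighbours = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        int count = 0;
        while (!stack.isEmpty()) {
            int[] cell = stack.pop();
            count++;
            for (int[] d : neighbours) {
                int r = cell[0] + d[0];
                int c = cell[1] + d[1];
                boolean inside = r >= 0 && r < rows && c >= 0 && c < cols;
                // Mark a cell when it is pushed so it is never pushed twice
                if (inside && open[r][c] && !visited[r][c]) {
                    visited[r][c] = true;
                    stack.push(new int[] {r, c});
                }
            }
        }
        return count;
    }
}
```

The depth of the work is now limited by heap memory rather than by the thread's stack size, so a 100 x 100 grid (10,000 cells) is no problem.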

The default thread stack size in Java depends on the JVM and the platform (often 512 KB or 1 MB). If you exceed it, the JVM throws a StackOverflowError.
You can increase the stack size by passing a JVM argument:
-Xss1024k
Now the stack size is 1024 KB. You may give a higher value based on your environment.
I don't think we can change this programmatically for the main thread.
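For threads you create yourself there is one programmatic option: the Thread(ThreadGroup, Runnable, String, long) constructor takes a requested stack size in bytes. The Javadoc describes it as a hint that some platforms may ignore, so treat the following as a sketch rather than a guarantee. The class name, the method names and the 128 MB figure are only illustrative:

```java
class BigStackDemo {
    // Deliberately deep, non-tail recursion: returns n after n nested calls.
    static int depthOf(int n) {
        return n == 0 ? 0 : 1 + depthOf(n - 1);
    }

    // Runs the deep recursion on a worker thread that asks for a larger stack.
    static int runWithStack(int depth, long stackBytes) {
        int[] result = new int[1];
        Thread worker = new Thread(null, () -> result[0] = depthOf(depth),
                "deep-recursion", stackBytes);
        worker.start();
        try {
            worker.join(); // join() makes the worker's write to result[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return result[0];
    }
}
```

Usage would look like BigStackDemo.runWithStack(100_000, 128L * 1024 * 1024). If the platform ignores the hint, the worker can still die with a StackOverflowError, in which case the method returns 0.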

You can increase the stack size by using -Xss4m.

You may put your "Block"s into a queue/stack and iterate as long as Blocks are available.

With recursion this deep and with such a branching factor, a StackOverflowError is to be expected. In some other languages the problem can be avoided through tail call optimization, which the JVM does not perform. I suspect your problem needs a different approach.
Ideally, you perform some check on each Block. Maybe you can obtain a list of all blocks and check each of them iteratively?

In most cases like this, recursion is being used incorrectly. You shouldn't get a stack overflow error.
Your method has no return type/value.
How do you ensure your initial Block b is valid?
If you are using recursion, ask yourself the following questions:
What is my recursion anchor (when do I stop recursing)?
What is my recursion step (how do I reduce the remaining work)?
Example:
n! = n * (n-1)!
My recursion anchor is n == 2 (the result is 2), so I can calculate all results beginning from this anchor.
My recursion step is n-1, so each step brings me closer to the solution (and therefore to my recursion anchor).
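A minimal sketch of the factorial example with the anchor and the step marked. The anchor here is n <= 1 rather than n == 2, an assumption made so that 0! and 1! are also covered:

```java
class Factorial {
    static long factorial(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        if (n <= 1) {
            return 1;                 // recursion anchor: stop here
        }
        return n * factorial(n - 1);  // recursion step: n-1 moves towards the anchor
    }
}
```

Because every call reduces n by one, the recursion depth is at most n. Note that the long result overflows for n > 20.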

Related

Detection of Loops in Java Bytecode - Distinguishing back edge types

Background:
Before asking my question, I wish to state that I have checked the following links:
Identify loops in java byte code
goto in Java bytecode
http://blog.jamesdbloom.com/JavaCodeToByteCode_PartOne.html
I can detect the loops in the bytecode (class files) using dominator analysis algorithm-based approach of detecting back edges on the control-flow graphs (https://en.wikipedia.org/wiki/Control_flow_graph).
My problem:
After detection of loops, you can end up having two loops (defined by two distinct back edges) sharing the same loop head. This can be created by the following two cases as I have realized: (Case 1) In the source code, you have a for or while loop with a continue statement, (Case 2) In the source code, you have two loops - an outer loop that is a do-while and an inner loop; and no instructions between these loops.
My question is the following: By only looking at the bytecode, how can you distinguish between these two cases?
My thoughts:
In a do-while loop (that is without any continue statements), you don't expect a go-to statement that goes back to the loop head, in other words, creating a back edge.
For a while or for loop (that is again without any continue statements), it appears that there can be a go-to statement (I am not sure if there must be one). My compiler generates (I am using a standard 1.7 compiler) this go-to instruction outside of the loop, not as a back edge unlike what is mentioned in the given links (this go-to statement creates a control-flow to the head of the loop, but not as a jump back from the end of the loop).
So, my guess is, (repeating, in case of two back edges), if one of them is a back edge created by a go-to statement, then there is only one loop in the source code and it includes a continue statement (Case 1). Otherwise, there are two loops in the source code (Case 2).
Thank you.

When two loops are equivalent all you can do is to take the simplest one.
e.g. there is no way to tell the difference between while (true), do { } while (true) and for (;;)
If you have do { something(); } while (false) this loop might not appear in the byte code at all.

As Peter Lawrey already pointed out, there is no way to determine the source code form by looking at the bytecode. To name an example closer to your intention, the following single-loop code
do action(); while(condition1() || condition2());
produces exactly the same code as the nested loop
do do action(); while(condition1()); while(condition2());
Likewise the following loop
do {
    action();
    if(condition1()) continue;
    break;
} while(condition2());
produces exactly the same code as
do action(); while(condition1() && condition2());
with current javac, whereas surprisingly
do {
    action();
    if(!condition1()) break;
} while(condition2());
does not, which only shows how much the exact form depends on compiler internals. The next version of javac might compile them differently.

stack overflow explanation from code sample

I saw this code snippet in my exam, and my first guess was that it would throw a StackOverflowError:
for (int i = 10; i > 5; i++) {
    if(i == 1000) i = 10;
    System.out.println(i);
}
It turns out that it doesn't. Can you please explain why this code is not going to throw a StackOverflowError?

To have a StackOverflowError, you have to be adding things to the call stack.
You're adding calls to System.out.println, but they simply don't stack on top of one another, so there would only be one call on the stack at any given time.
Now, an example of StackOverflowError would be recursion that does not sufficiently resolve the previous entries on the call stack; something that simply has too many method calls to itself for a sufficiently large parameter, or creates more calls to itself for every call to itself than it can deal with. (The Ackermann function is a notorious example of this.)
If we define factorial as thus:
public long factorial(long value) {
    return value == 0 ? 1 : value * factorial(value - 1);
}
...and give it a sufficiently large value...
System.out.println(factorial(1891279172981L));
...then we won't have enough stack space to hold all 1891279172981 of those entries.

This snippet causes an infinite loop, but not infinite recursion (you don't have a method calling itself endlessly). Therefore it will not cause a StackOverflowError.
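To make the difference concrete, here is a small sketch (class and method names are invented for illustration). The loop makes a million method calls, but each one returns before the next starts, so the stack never grows. The recursive method never returns, so frames pile up until the JVM throws StackOverflowError:

```java
class LoopVersusRecursion {
    static int depth;

    // Each call starts another call before returning, so frames accumulate.
    static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many nested calls were made before the stack ran out.
    static int measureMaxDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // By the time we get here the stack has unwound back to this frame
        }
        return depth;
    }

    static int touch(int i) {
        return i + 1;
    }

    // One million calls from a loop: at most one touch() frame exists at a time.
    static int loopCalls() {
        int calls = 0;
        for (int i = 0; i < 1_000_000; i++) {
            calls = touch(calls);
        }
        return calls;
    }
}
```

The number returned by measureMaxDepth() depends on the JVM, the platform and the -Xss setting, so it is only a rough indication of how deep calls can nest.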

Why does this method print 4?

I was wondering what happens when you try to catch a StackOverflowError and came up with the following method:
class RandomNumberGenerator {
    static int cnt = 0;

    public static void main(String[] args) {
        try {
            main(args);
        } catch (StackOverflowError ignore) {
            System.out.println(cnt++);
        }
    }
}
Now my question:
Why does this method print '4'?
I thought maybe it was because System.out.println() needs 3 segments on the call stack, but I don't know where the number 3 comes from. When you look at the source code (and bytecode) of System.out.println(), it normally would lead to far more method invocations than 3 (so 3 segments on the call stack would not be sufficient). If it's because of optimizations the Hotspot VM applies (method inlining), I wonder if the result would be different on another VM.
Edit:
As the output seems to be highly JVM specific, I get the result 4 using
Java(TM) SE Runtime Environment (build 1.6.0_41-b02)
Java HotSpot(TM) 64-Bit Server VM (build 20.14-b01, mixed mode)
Explanation why I think this question is different from Understanding the Java stack:
My question is not about why there is a cnt > 0 (obviously because System.out.println() requires stack size and throws another StackOverflowError before something gets printed), but why it has the particular value of 4, respectively 0,3,8,55 or something else on other systems.

I think the others have done a good job at explaining why cnt > 0, but there's not enough details regarding why cnt = 4, and why cnt varies so widely among different settings. I will attempt to fill that void here.
Let
X be the total stack size
M be the stack space used when we enter main the first time
R be the stack space increase each time we enter into main
P be the stack space necessary to run System.out.println
When we first get into main, the space left over is X-M. Each recursive call takes up R more memory. So for 1 recursive call (1 more than original), the memory use is M + R. Suppose that StackOverflowError is thrown after C successful recursive calls, that is, M + C * R <= X and M + (C + 1) * R > X. At the time of the first StackOverflowError, there's X - M - C * R memory left.
To be able to run System.out.println, we need P amount of space left on the stack. If it so happens that X - M - C * R >= P, then 0 will be printed. If println requires more space than that, then we remove frames from the stack, gaining R memory at the cost of cnt++.
When println is finally able to run, X - M - (C - cnt) * R >= P. So if P is large for a particular system, then cnt will be large.
Let's look at this with some examples.
Example 1: Suppose
X = 100
M = 1
R = 2
P = 1
Then C = floor((X-M)/R) = 49, and cnt = ceiling((P - (X - M - C*R))/R) = 0.
Example 2: Suppose that
X = 100
M = 1
R = 5
P = 12
Then C = 19, and cnt = 2.
Example 3: Suppose that
X = 101
M = 1
R = 5
P = 12
Then C = 20, and cnt = 3.
Example 4: Suppose that
X = 101
M = 2
R = 5
P = 12
Then C = 19, and cnt = 2.
Thus, we see that both the system (M, R, and P) and the stack size (X) affect cnt.
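The arithmetic of this simplified model can be written down in a few lines. This is only a sketch of the model above, not of what a real JVM does (the class and method names are made up):

```java
class CntModel {
    // C = floor((X - M) / R): recursive calls that fit before the first overflow
    static int successfulCalls(int x, int m, int r) {
        return (x - m) / r;
    }

    // cnt = ceiling((P - (X - M - C*R)) / R), or 0 if println already fits
    static int predictedCnt(int x, int m, int r, int p) {
        int c = successfulCalls(x, m, r);
        int leftover = x - m - c * r;   // space left at the first overflow
        int missing = p - leftover;     // extra space println still needs
        if (missing <= 0) {
            return 0;
        }
        return (missing + r - 1) / r;   // integer ceiling of missing / r
    }
}
```

Plugging in the four examples gives cnt = 0, 2, 3 and 2, matching the values worked out by hand.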
As a side note, it does not matter how much space catch requires to start. As long as there is not enough space for catch, then cnt will not increase, so there are no external effects.
EDIT
I take back what I said about catch. It does play a role. Suppose it requires T amount of space to start. cnt starts to increment when the leftover space is greater than T, and println runs when the leftover space is greater than T + P. This adds an extra step to the calculations and further muddies up the already muddy analysis.
EDIT
I finally found time to run some experiments to back up my theory. Unfortunately, the theory doesn't seem to match up with the experiments. What actually happens is very different.
Experiment setup:
Ubuntu 12.04 server with default java and default-jdk. Xss starting at 70,000 at 1 byte increments to 460,000.
The results are available at: https://www.google.com/fusiontables/DataSource?docid=1xkJhd4s8biLghe6gZbcfUs3vT5MpS_OnscjWDbM
I've created another version where every repeated data point is removed. In other words, only points that are different from the previous are shown. This makes it easier to see anomalies. https://www.google.com/fusiontables/DataSource?docid=1XG_SRzrrNasepwZoNHqEAKuZlHiAm9vbEdwfsUA

This is the result of a bad recursive call. As for why the value of cnt varies: the stack size depends on the platform. Java SE 6 on Windows has a default stack size of 320k in the 32-bit VM and 1024k in the 64-bit VM.
You can run with different stack sizes and you will see different values of cnt before the stack overflows:
java -Xss1024k RandomNumberGenerator
You don't see the value of cnt printed multiple times, even though it is sometimes greater than 1, because your print statement is also throwing an error. You can confirm this by debugging in Eclipse or another IDE.
You can change the code to the following to debug per statement execution if you'd prefer-
static int cnt = 0;

public static void main(String[] args) {
    try {
        main(args);
    } catch (Throwable ignore) {
        cnt++;
        try {
            System.out.println(cnt);
        } catch (Throwable t) {
            // println itself overflowed the stack; ignore and keep unwinding
        }
    }
}
UPDATE:
As this is getting a lot more attention, let's look at another example to make things clearer:
static int cnt = 0;

public static void overflow() {
    try {
        overflow();
    } catch (Throwable t) {
        cnt++;
    }
}

public static void main(String[] args) {
    overflow();
    System.out.println(cnt);
}
We created another method named overflow to do the bad recursion and removed the println statement from the catch block, so it doesn't start throwing another set of errors while trying to print. This works as expected. You can try putting a System.out.println(cnt); statement after cnt++ above, compile, and then run it multiple times. Depending on your platform, you may get different values of cnt.
This is why we generally do not catch errors: the resulting behavior is hard to predict.

The behavior depends on the stack size (which can be set manually using -Xss). The default stack size is architecture-specific. From the JDK 7 source code:
// Default stack size on Windows is determined by the executable (java.exe
// has a default value of 320K/1MB [32bit/64bit]). Depending on Windows version, changing
// ThreadStackSize to non-zero may have significant impact on memory usage.
// See comments in os_windows.cpp.
So when the StackOverflowError is thrown, it is caught in the catch block. There, println() is another call that needs stack space, so the error is thrown again. This gets repeated.
How many times does it repeat? That depends on when the JVM decides the stack is no longer overflowing, which in turn depends on the stack size of each function call (difficult to find out) and on -Xss. As mentioned above, the default total size and the size of each function call (which depends on memory page size etc.) are platform-specific. Hence the different behavior.
Running java with -Xss4M gives me 41. Hence the correlation.

I think the number displayed is the number of times the System.out.println call throws a StackOverflowError.
It probably depends on the implementation of println and on the number of nested calls it makes.
As an illustration:
The main() call triggers the StackOverflowError at call i.
Call i-1 of main catches the error and calls println, which triggers a second StackOverflowError. cnt is incremented to 1.
Call i-2 of main now catches the error and calls println. Inside println a method is called that triggers a third error. cnt is incremented to 2.
This continues until println can make all the calls it needs and finally displays the value of cnt.
The result therefore depends on the actual implementation of println.
For JDK 7, either it detects the cyclic call and throws the error earlier, or it reserves some stack space and throws the error before reaching the limit to leave room for remediation logic, or the println implementation doesn't make nested calls, or the ++ operation is done after the println call and is therefore bypassed by the error.

1. main recurses on itself until it overflows the stack at recursion depth R.
2. The catch block at recursion depth R-1 is run.
3. The catch block at recursion depth R-1 evaluates cnt++.
4. The catch block at depth R-1 calls println, placing cnt's old value on the stack. println will internally call other methods and use local variables and so on. All of this requires stack space.
5. Because the stack was already grazing the limit, and calling/executing println requires stack space, a new stack overflow is triggered at depth R-1 instead of depth R.
6. Steps 2-5 happen again, but at recursion depth R-2.
7. Steps 2-5 happen again, but at recursion depth R-3.
8. Steps 2-5 happen again, but at recursion depth R-4.
9. Steps 2-4 happen again, but at recursion depth R-5.
10. It so happens that there is enough stack space now for println to complete (note that this is an implementation detail, it may vary).
11. cnt was post-incremented at depths R-1, R-2, R-3, R-4, and finally at R-5. The fifth post-increment returned four, which is what was printed.
12. With main completed successfully at depth R-5, the whole stack unwinds without more catch blocks being run and the program completes.
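The "fifth post-increment returned four" part is plain Java semantics: cnt++ yields the old value and then increments. A tiny sketch (names invented for illustration) that replays the increments without any stack overflow involved:

```java
class PostIncrement {
    static int cnt = 0;

    // Performs the given number of post-increments and returns what the last one yielded.
    static int lastValueYielded(int increments) {
        cnt = 0;
        int yielded = -1;
        for (int i = 0; i < increments; i++) {
            yielded = cnt++; // yields the old value, then adds one
        }
        return yielded;
    }
}
```

After five increments the last expression yielded 4 while cnt itself is 5, which is why println(cnt++) shows 4 and println(++cnt) would show 5.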

After digging around for a while, I can't say that I have found the answer, but I think I'm quite close now.
First, we need to know when a StackOverflowError will be thrown. The stack of a Java thread stores frames, which contain all the data needed to invoke a method and to resume afterwards. According to the Java Language Specification for Java 6, when invoking a method,
If there is not sufficient memory available to create such an activation frame, an StackOverflowError is thrown.
Second, we should make clear what "there is not sufficient memory available to create such an activation frame" means. According to the Java Virtual Machine Specification for Java 6,
frames may be heap allocated.
So, if frames are heap allocated, creating a frame requires enough heap space for the frame itself and enough stack space to store the reference that points to it.
Now let's go back to the question. From the above, we can see that each method invocation may cost the same amount of stack space. Invoking System.out.println (possibly) needs 5 levels of method invocation, so 5 frames need to be created. When the StackOverflowError is thrown, execution has to unwind 5 times to free enough stack space for the references to those 5 frames. Hence 4 is printed. Why not 5? Because you use cnt++. Change it to ++cnt and you will get 5.
You will also notice that when the stack size is set very high, you sometimes get 50. That is because the amount of available heap space then has to be taken into account: when the stack is very large, heap space may run out before stack space does. And (maybe) the actual size of the stack frames of System.out.println is about 51 times that of main, so execution unwinds 51 times and prints 50.

This is not exactly an answer to the question, but I wanted to add something I came across and how I understood the problem:
In the original program the error is caught wherever that is possible:
With JDK 1.7, for example, it is caught at the first place it occurs,
but in earlier versions of the JDK it looks like the error is not caught at the first place it occurs, hence 4, 50, etc.
Now if you remove the try-catch block as follows
public static void main(String[] args) {
    System.out.println(cnt++);
    main(args);
}
Then you will see all the values of cnt and the thrown error (on JDK 1.7).
I used NetBeans to see the output, as cmd will not show all of the output and the thrown error.

Stack Overflow Error java

I'm trying to solve a problem that calls for recursive backtracking, and my solution produces a StackOverflowError. I understand that this error often indicates a bad termination condition, but my termination condition appears correct. Is there anything other than a bad termination condition that would be likely to cause a StackOverflowError? How can I figure out what the problem is?
EDIT: Sorry, I tried to post the code but it's too ugly.

As @irreputable says, even if your code has a correct termination condition, it could be that the problem is simply too big for the stack (so that the stack is exhausted before the condition is reached). There is also a third possibility: that your recursion has entered a loop. For example, in a depth-first search through a graph, if you forget to mark nodes as visited, you'll end up going in circles, revisiting nodes that you have already seen.
How can you determine which of these three situations you are in? Try to find a way to describe the "location" of each recursive call (this will typically involve the function parameters). For instance, if you are writing a graph algorithm where a function calls itself on neighbouring nodes, then the node name or node index is a good description of where the recursive function is. At the top of the recursive function, you can print the description; then you'll see what the function does, and perhaps you can tell whether it does the right thing or whether it goes in circles. You can also store the descriptions in a HashMap in order to detect whether you have entered a cycle.
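A minimal sketch of the "mark nodes as visited" idea, assuming the graph is given as an adjacency map from node index to neighbours (that representation and all names are assumptions for illustration). It uses a HashSet where a HashMap would work equally well, and Set.add doubles as the "have I been here before?" check:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CycleAwareDfs {
    // Returns the nodes in the order a depth-first search first reaches them.
    static List<Integer> visitOrder(Map<Integer, List<Integer>> graph, int start) {
        List<Integer> order = new ArrayList<>();
        dfs(graph, start, new HashSet<>(), order);
        return order;
    }

    private static void dfs(Map<Integer, List<Integer>> graph, int node,
                            Set<Integer> visited, List<Integer> order) {
        // add() returns false if the node was already recorded: we are going in a circle
        if (!visited.add(node)) {
            return;
        }
        order.add(node); // a print statement here would show the "location" of each call
        for (int next : graph.getOrDefault(node, Collections.emptyList())) {
            dfs(graph, next, visited, order);
        }
    }
}
```

Without the visited check, a graph with the cycle 1 -> 2 -> 3 -> 1 would recurse until the stack overflows. With it, each node is expanded once and the recursion depth is bounded by the number of nodes.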

Instead of using recursion, you could always have a loop which uses a stack. E.g. instead of (pseudo-code):
function sum(n){
    if n == 0, return 0
    return n + sum(n-1)
}
Use:
function sum(n){
    Stack stack
    while(n > 0){
        stack.push(n)
        n--
    }
    localSum = 0
    while(stack not empty){
        localSum += stack.pop()
    }
    return localSum
}
In a nutshell, simulate recursion by saving the state in a local stack.
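The loop version of the pseudo-code translates to Java roughly as follows (a sketch; the class name is made up, and ArrayDeque is used as the stack):

```java
import java.util.ArrayDeque;
import java.util.Deque;

class StackSum {
    // Sums 1..n by saving the pending values on a heap-allocated stack.
    static long sum(int n) {
        Deque<Integer> stack = new ArrayDeque<>();
        while (n > 0) {
            stack.push(n);
            n--;
        }
        long localSum = 0;
        while (!stack.isEmpty()) {
            localSum += stack.pop();
        }
        return localSum;
    }
}
```

A call such as StackSum.sum(100_000) works regardless of the thread's stack size, because the saved state lives on the heap.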

You can use the -Xss option to give your stack more memory if your problem is too large to fit in the default stack size.

As others have already mentioned, there may be a few reasons for that:
Your code has a problem in the logic of the recursion. Every recursive function needs a stopping condition (a base case or termination point).
Your stack is too small to hold the number of nested recursive calls. Large Fibonacci numbers might be a good example here. Just FYI, the Fibonacci sequence is as follows (sometimes it starts at zero):
1, 1, 2, 3, 5, 8, 13, ...
F(n) = F(n-1) + F(n-2) for n >= 2
F(0) = 1, F(1) = 1
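For a recurrence like this one, a loop avoids nested calls altogether. A small sketch using the definition above with F(0) = F(1) = 1 (the class name is made up):

```java
class Fibonacci {
    // Iterative Fibonacci: constant stack depth no matter how large n is.
    static long fib(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        long previous = 1; // F(0)
        long current = 1;  // F(1)
        for (int i = 2; i <= n; i++) {
            long next = previous + current;
            previous = current;
            current = next;
        }
        return current;
    }
}
```

The long result overflows for large n, but the call stack is never the limiting factor.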

If your code is correct, then the stack is simply too small for your problem. We don't have real Turing machines.

There are two common coding errors that could cause your program to recurse without end (and therefore cause a stack overflow):
Bad termination condition
Bad recursion call
Example:
public static int factorial( int n ){
    if( n < n ) // Bad termination condition
        return 1;
    else
        return n*factorial(n+1); // Bad recursion call
}
Otherwise, your program could just be functioning properly and the stack is too small.

How can I refactor a large block of if statements in Java?

I recently profiled some code using JVisualVM, and found that one particular method was taking up a lot of execution time, both from being called often and from having a slow execution time. The method is made up of a large block of if statements, like so: (in the actual method there are about 30 of these)
EcState c = candidate;
if (waypoints.size() > 0)
{
    EcState state = defaultDestination();
    for (EcState s : waypoints)
    {
        state.union(s);
    }
    state.union(this);
    return state.isSatisfied(candidate);
}
if (c.var1 < var1)
    return false;
if (c.var2 < var2)
    return false;
if (c.var3 < var3)
    return false;
if (c.var4 < var4)
    return false;
if ((!c.var5) & var5)
    return false;
if ((!c.var6) & var6)
    return false;
if ((!c.var7) & var7)
    return false;
if ((!c.var8) & var8)
    return false;
if ((!c.var9) & var9)
    return false;
return true;
Is there a better way to write these if statements, or should I look elsewhere to improve efficiency?
EDIT: The program uses evolutionary science to develop paths to a given outcome. Specifically, build orders for Starcraft II. This method checks to see if a particular evolution satisfies the conditions of the given outcome.

First, you are using & instead of &&, so you're not taking advantage of short-circuit evaluation. That is, the & operator requires that the operands on both sides be evaluated. If you are genuinely doing a bitwise AND operation, then this wouldn't apply, but if not, see below.
Assuming you return true if the conditions aren't met, you could rewrite it like this (I changed & to &&).
return
    !(c.var1 < var1 ||
      c.var2 < var2 ||
      c.var3 < var3 ||
      c.var4 < var4 ||
      ((!c.var5) && var5) ||
      ((!c.var6) && var6) ||
      ((!c.var7) && var7) ||
      ((!c.var8) && var8) ||
      ((!c.var9) && var9));
Secondly, try to move the conditions that are most likely to be true to the top of the expression chain; that way, the remaining expressions need not be evaluated. For example, if (c.var4 < var4) is likely to be true 99% of the time, you could move that to the top.
Short of that, it seems a bit odd that you'd be getting a significant amount of time spent in this method unless these conditions hit a database or something like that.
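To see the difference between & and && on booleans, here is a small sketch that counts how many operands actually get evaluated (the class and the check helper are invented for the demonstration):

```java
class ShortCircuitDemo {
    static int evaluations = 0;

    // Stands in for an operand; counts every time it is evaluated.
    static boolean check(boolean value) {
        evaluations++;
        return value;
    }

    // Non-short-circuit AND: both operands are always evaluated.
    static int countWithAmpersand() {
        evaluations = 0;
        boolean result = check(false) & check(true);
        return evaluations;
    }

    // Short-circuit AND: the right operand is skipped once the left one is false.
    static int countWithDoubleAmpersand() {
        evaluations = 0;
        boolean result = check(false) && check(true);
        return evaluations;
    }
}
```

With & both operands are evaluated (2 evaluations), with && only the first (1 evaluation). For plain field reads like the ones in the question the saving is tiny, which fits the suspicion that the real cost lies elsewhere.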

First, try rewriting the sequence of if statements into one statement (per @dcp's answer).
If that doesn't make much difference, then the bottleneck might be the waypoints code. Some possibilities are:
You are using some collection type for which waypoints.size() is expensive.
waypoints.size() is a large number
defaultDestination() is expensive
state.union(...) is expensive
state.isSatisfied(...) is expensive
One quick-and-dirty way to investigate this is to move all of that code into a separate method and see if the profiler tells you it is a bottleneck.
If that's not the problem then your problem is intractable, and the only way around it would be to find some clever way to avoid having to do so many tests.
Rearranging the test order might help, if there is an order that is likely to return false more quickly.
If there is a significant chance that this and c are the same object, then an initial test of this == c may help.
If all of your EcState objects are compared repeatedly and they are immutable, then you could potentially implement hashCode to cache its return value, and use hashCode to speed up the equality testing. (This is a long shot ... lots of things have to be "right" for this to help.)
Maybe you could use hashCode equality as a proxy for equality ...

As always, the best thing to do is measure it yourself. You can instrument this code with calls to System.nanoTime() to get very fine-grained durations. Get the starting time, and then compute how long various big chunks of your method actually take. Take the chunk that's the slowest and then put more nanoTime() calls in it. Let us know what you find, too; that will be helpful to other folks reading your question.
So here's my seat of the pants guess ...
Optimizing the if statements will have nearly no measurable effect: these comparisons are all quite fast.
So let's assume the problem is in here:
if (waypoints.size() > 0)
{
    EcState state = defaultDestination();
    for (EcState s : waypoints)
    {
        state.union(s);
    }
    state.union(this);
    return state.isSatisfied(candidate);
}
I'm guessing waypoints is a List and that you haven't overridden the size() method. In this case, List.size() is just accessing an instance variable. So don't worry about the if statement.
The for statement iterates over your List's elements quite quickly, so the for itself isn't it, though the problem could well be the code it executes. Assignments and returns take no time.
This leaves the following potential hot spots:
The one call to defaultDestination().
All the calls to EcState.union().
The one call to EcState.isSatisfied().
I'd be willing to bet your hotspot is in union(), especially since it's building up some sort of larger and larger collection of waypoints.
Measure with nanoTime() first though.
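A minimal sketch of that kind of instrumentation, assuming you can wrap each suspect chunk in a Runnable (the helper class is hypothetical, and EcState itself is not shown here):

```java
class TimingSketch {
    // Runs the given chunk once and returns the elapsed wall-clock time in nanoseconds.
    static long timeNanos(Runnable chunk) {
        long start = System.nanoTime();
        chunk.run();
        return System.nanoTime() - start;
    }
}
```

In the method from the question you might wrap the union loop, for example long unionNanos = TimingSketch.timeNanos(() -> { for (EcState s : waypoints) state.union(s); }); and accumulate the totals over many calls. Single measurements are noisy, especially before the JIT has warmed up, so compare sums over many invocations.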

You aren't going to find too many ways to actually speed that up. The two main ones would be taking advantage of short-circuit evaluation, as has already been said, by switching & to &&, and also making sure that the order of the conditions is efficient. For example, if there's one condition that throws away 90% of the possibilities, put that one condition first in the method.
