This looks like a silly question but I found it is hard to get it right. I have asked different people but couldn't get an ideal answer.
I want to know what happens after we call a normal method in Java (Provided in a single threaded environment).
My understanding is that:
All current stack variables are poped-up and stored somewhere (where?)
The current method call halts
The arguments of the newly called method are pushed to the stack
The method code runs
After the method finished running, the stack is again emptied and the old stack contents is again restored. (What happened if the function returns a value?).
Code continues with the calling method.
This is a very incomplete and possibly wrong answer. Can someone provide a more detailed description?
Many thanks.
No, that's actually fairly accurate:
1) current stack variables remain on the stack
2) The current method pauses
3) The arguments of the newly called method are pushed to the stack
4) The method code runs
5) After the method finished running, we pop the stack. The called method's stack variables are no longer valid - they no longer "exist" at this point.
6) We pass the return value (if any) to the caller
7) Code continues with the calling method. All it's stack variables remain intact.
==============================
ADDENDUM:
#Kevin -
Conceptually, I think you got it just about right. I clarified a few points, I hope that helps.
David Wallace's link is very good if you want to go in depth on how the JVM implements "method calling".
Here is a good overview on how "a stack" works. Any stack, calling any subroutine - not just Java: http://en.wikipedia.org/wiki/Call_stack
Finally, Marko Topolnik is correct. "The reality" is almost always complex enough that it doesn't lend itself to a simple, one-size-fits all answer. But I definitely think your understanding is good. At least at the 10,000 foot level.
IMHO...
For the interpreter, assuming an instance method, and taking some minor liberties:
The object pointer is used to reference the object, and from there the Class object.
The method pointer is located in the Class object. (The lookup to convert method name to method index was largely done when the class was loaded, so this is basically just an array index operation.)
Generally some sort of a "mark" is pushed onto the JVM stack. This would contain the caller's instruction pointer, and a pointer to the base of his stack. (Lots of different implementations here.)
The method's definition is consulted to see how many local vars are needed. That many blank elements are pushed onto the stack.
The object ("this") pointer is stored in local var 0, and any parms are stored in 1,2,3... as appropriate.
Control is transferred to the called method.
On return, the stack is popped down to the point where the call started, any return value is pushed onto the stack, and control is transferred back to the caller.
Compiled code is conceptually similar, only it uses the "C" stack, and interpreted code in a JITC environment will make use of both the JVM stack and the "C" stack.
Related
How does Java generate a thread's stack trace?
Example:
Consider functionA calls functionB calls functionC calls functionD. If at any point in functionD getStackTraceElementArray is used it would give the array of functionCalls:
functionC->functionB->functionA
How does Java fill the StackTraceElement array at runtime? Assuming it fills the calling function when it reaches inside the called function, how does the JVM get the reference of the calling method inside the called method?
In the simplest case the stack trace is obtained from... the stack!
Each thread has a frame pointer (FP) register that points to the base address of a current stack frame. When a Java method is called, it first creates a new stack frame, i.e. it pushes a bunch of information onto the stack:
return address (where the method was called from);
current FP (which points to the caller frame);
method ID of the method being called;
pointers to the constant pool cache etc.
Then it updates FP to point to the newly created frame.
This way, you see, the frame pointers make a linked list: if I read a value pointed by FP, I get the base address of the previous frame.
Now notice that for each frame a method ID is always at the same offset from FP. So the stack walking is as easy as a while loop (written in pseudo-code):
address fp = currentThread.getFP();
while (fp != null) {
methodID m = (methodID) read_stack_at(fp + METHOD_OFFSET);
print_method(m);
fp = (address) read_stack_at(fp);
}
That's how it works inside JVM for interpreted methods. Compiled methods are a bit more complicated. They do not usually save a method reference on the stack. Instead there is a structure that maps addresses of compiled code to the metadata that contains compiled method info. But the idea of stack walking is still the same.
I was reading What and where are the stack and heap?. One thing I am a bit fuzzy on is what happens to the stack after a method exits. Take this image for example:
The stack is cleared upon exiting the method, but what does that mean? Is the pointer at the stack just moved back to the start of the stack making it empty? I hope this is not too broad of a question. I am not really sure what is going on behind the scenes when the stack is cleared from exiting a method.
When a method is called, local variables are located on the stack.
Object references are also stored on the stack, corresponding objects are store in the heap.
The stack is just a region of memory, it has a start and end address.
The JVM (java virtual machine) has a register which points to the current top of the stack (stack pointer). If a new method is called, an offset will be added to the register to get new space on the stack.
When a method call is over, the stack pointer will be decreased by this offset, this frees the allocated space.
Local variables and other stuff (like return address, parameters...) may still on the stack and will be overwritten by next method call.
BTW: this is why java stored all objects in heap. When an object would be located on the stack, and you would return the reference which points to the stack, the object could be destroyed by next method call.
During execution of a function, all local variables are created in the stack. That means that the stack grows to make enough room for those variables.
When the function ends, all the local variables goes out of scope and the stack is rewinded. Nothing else needs to happen, no implicit zeroing memory. But :
semantically the variables go out of scope and can no longer be used
in the hard way, the stack pointer is rewinded, effectively freeing the memory : it will be useable by next function call
Above is not only true for functions but can be the same for any block of code since semantically the variables defined in the block go out of scope at end of block.
Is the pointer at the stack just moved back to the start of the stack making it empty?
the pointer at the stack is moved back to where it was before the function call. The stack would not be empty because it contains data that belongs to calls that brought the program to that point.
To illustrate: if func1 called func2 called func3 the stack will look something like this:
func1 args/local vars... | func2 args/local vars... | func3 args/local vars...
After func3 returns it will be:
func1 args/local vars... | func2 args/local vars...
A stack is just that, a stack of things, usually a stack of frames, with the frames containing parameters, local variables and instances of objects, and some other things depending on your operating system.
If you have instantiated objects on the stack, i.e. MyClass x and not MyClass * x = new MyClass(), then the object x will be torn down and its destructor called when the stack is rewound to the previous frame, which essentially just makes the current stack pointer(internal) point to the previous frame. In most native languages no memory will be cleared, etc.
Finally this is why you should initialise local variables(in most languages) as a call to the next function will setup a new frame which will most likely be in the same place as the previously rewound stack frame, so your local variables will contain garbage.
It might be useful for you to think about what your compiled code might look like at a machine (or, better for us humans, assembly) level. Consider this as a possible example in X86 Assembly:
When the method is called, arguments will either be passed in the registers or passed on the stack itself. Either way, the code calling the method will eventually:
call the_method
When this happens, the current instruction pointer is pushed onto the stack. The stack pointer is pointing at it. Now we're in the function:
the_method:
push ebp
mov ebp, esp
The current base pointer is preserved on the stack and the base pointer is then used to reference things in the stack (like passed in variables).
sub esp, 8
Next, 8 bytes (assuming two four byte integers are allocated) are allocated on the stack.
mov [ebp-4], 4
mov [ebp-8], 2
The local variables are assigned. This could actually be accomplished by simply pushing them but more likely there will be a sub involved. Fast forward to the end:
mov esp, ebp
pop ebp
ret
When this happens, the stack pointer is right back where it was when we started, pointing at the stored base pointer (saved frame pointer). This is popped back into EBP leaving ESP pointing at the return pointer which is then "popped" into EIP with the ret. Effectively, the stack has unwound. Even though the actual memory locations haven't changed for the two local variables, they are effectively above the stack (physically below in memory, but I think you get what I mean.)
Keep in mind the stack is a zone in memory assigned to a process.
In summary, when in your code you call a function (tipically in assembly language), you need to store in memory the registers you're going to use (it could vary if you're following another contract) because these registers could be overwriten by calls to another function (you'd need to the store return address, arguments, and a lot more, but let's omite that). To do that you decrease the stack pointer by that number of registers. Before to exit, you need to make sure you increase the stack pointer by that same number. You don't need to do anything more because the values you were storing are not needed anymore, they will be overwrited by the next function call.
In Java, references to objects are in the stack when the object itself is in the heap. If all the references to an object are removed from the stack, the garbage collector will remove the object from heap.
I hope my answer helps you. Also, check this.
Note: This is regarding the Java Debug Interface (JDI).
I know there's the option to get a thread's stackframe, and from that a list of all visible variables and their values. However, I don't know how to get anonymous values, that is values not stored in a variable but "internally" on the stack (or maybe something else?).
Things like results from if-evaluations, comparisons, etc. For instance let's say we have this in our code:
if(array[i] > x)
How/where is that piece of data (i.e. the result: true or false) stored at runtime and what classes or methods within the JDI provide me access to it?
Thanks
The only way I know to get them without dropping down to debugging at the bytecode level is to change your code to assign them to a variable. Sorry, but that's the way it's designed; if it doesn't have a name it doesn't necessarily last long enough to get out of the execution stack and into a variable, and the debugger is designed to work on the latter, not the former.
Or ask the debugger for the values of x, i, and (once you know i), array[i] and do the computation/comparison yourself.
Or, in this case, simply watch what branch the if takes.
How is recursion implemented in Java? My question is about what happens behind when a recusrsive method is executed in Java. I vaguely understand that it uses the Stack, but i am looking for a clear explanation with example.
Recursion isn't handled much differently in Java than in other (imperative) languages.
There's a stack which holds a stack frame for every method invocation. That stack is the call stack (or simply just "stack", when the context makes it clear what is meant). The element on the stack are called "stack frames".
A stack frame holds the method arguments passed in and the local variables of a method invocation (and possibly some other data, such as the return address).
When a method invokes itself (or, in fact, any method) then a new stack frame is created for the parameters and local variables of the newly-called method.
During the method execution the code can only access the values in the current (i.e. top-most) stack frame.
This way a single (local) variable can seemingly have many different values at the same time.
Recursion isn't handled any other way than normal method calls, except that multiple stack frames will represent invocations of the same method at the same time.
when a method if invoked it needs space to keep its parameters, its local variables and the return adress this space is called activation record (stack frame).
Recursion is calling a method that happens to have the same name as the caller,
therefore a recursive call is not litterally a method calling it self but an instantiation of a method calling
another instantiation of the same original. these invocations are represented internally by different activation records
which means that they are differentiated by the system.
Quick background
I'm a Java developer who's been playing around with C++ in my free/bored time.
Preface
In C++, you often see pop taking an argument by reference:
void pop(Item& removed);
I understand that it is nice to "fill in" the parameter with what you removed. That totally makes sense to me. This way, the person who asked to remove the top item can have a look at what was removed.
However, if I were to do this in Java, I'd do something like this:
Item pop() throws StackException;
This way, after the pop we return either: NULL as a result, an Item, or an exception would be thrown.
My C++ text book shows me the example above, but I see plenty of stack implementations taking no arguments (stl stack for example).
The Question
How should one implement the pop function in C++?
The Bonus
Why?
To answer the question: you should not implement the pop function in C++, since it is already implemented by the STL. The std::stack container adapter provides the method top to get a reference to the top element on the stack, and the method pop to remove the top element. Note that the pop method alone cannot be used to perform both actions, as you asked about.
Why should it be done that way?
Exception safety: Herb Sutter gives a good explanation of the issue in GotW #82.
Single-responsibility principle: also mentioned in GotW #82. top takes care of one responsibility and pop takes care of the other.
Don't pay for what you don't need: For some code, it may suffice to examine the top element and then pop it, without ever making a (potentially expensive) copy of the element. (This is mentioned in the SGI STL documentation.)
Any code that wishes to obtain a copy of the element can do this at no additional expense:
Foo f(s.top());
s.pop();
Also, this discussion may be interesting.
If you were going to implement pop to return the value, it doesn't matter much whether you return by value or write it into an out parameter. Most compilers implement RVO, which will optimize the return-by-value method to be just as efficient as the copy-into-out-parameter method. Just keep in mind that either of these will likely be less efficient than examining the object using top() or front(), since in that case there is absolutely no copying done.
The problem with the Java approach is that its pop() method has at least two effects: removing an element, and returning an element. This violates the single-responsibility principle of software design, which in turn opens door for design complexities and other issues. It also implies a performance penalty.
In the STL way of things the idea is that sometimes when you pop() you're not interested in the item popped. You just want the effect of removing the top element. If the function returns the element and you ignore it then that's a wasted copy.
If you provide two overloads, one which takes a reference and another which doesn't then you allow the user to choose whether he (or she) is interested in the returned element or not. The performance of the call will optimal.
The STL doesn't overload the pop() functions but rather splits these into two functions: back() (or top() in the case of the std::stack adapter) and pop(). The back() function just returns the element, while the pop() function just removes it.
Using C++0x makes the whole thing hard again.
As
stack.pop(item); // move top data to item without copying
makes it possible to efficiently move the top element from the stack. Whereas
item = stack.top(); // make a copy of the top element
stack.pop(); // delete top element
doesn't allow such optimizations.
The only reason I can see for using this syntax in C++:
void pop(Item& removed);
is if you're worried about unnecessary copies taking place.
if you return the Item, it may require an additional copy of the object, which may be expensive.
In reality, C++ compilers are very good at copy elision, and almost always implement return value optimization (often even when you compile with optimizations disabled), which makes the point moot, and may even mean the simple "return by value" version becomes faster in some cases.
But if you're into premature optimization (if you're worried that the compiler might not optimize away the copy, even though in practice it will do it), you might argue for "returning" parameters by assigning to a reference parameter.
More information here
IMO, a good signature for the eqivalent of Java's pop function in C++ would be something like:
boost::optional<Item> pop();
Using option types is the best way to return something that may or may not be available.