How does Java generate a thread's stack trace?
Example:
Consider functionA calls functionB calls functionC calls functionD. If at any point in functionD getStackTraceElementArray is used it would give the array of functionCalls:
functionC->functionB->functionA
How does Java fill the StackTraceElement array at runtime? Assuming it fills the calling function when it reaches inside the called function, how does the JVM get the reference of the calling method inside the called method?
In the simplest case the stack trace is obtained from... the stack!
Each thread has a frame pointer (FP) register that points to the base address of a current stack frame. When a Java method is called, it first creates a new stack frame, i.e. it pushes a bunch of information onto the stack:
return address (where the method was called from);
current FP (which points to the caller frame);
method ID of the method being called;
pointers to the constant pool cache etc.
Then it updates FP to point to the newly created frame.
This way, you see, the frame pointers form a linked list: if I read the value pointed to by FP, I get the base address of the previous frame.
Now notice that for each frame a method ID is always at the same offset from FP. So the stack walking is as easy as a while loop (written in pseudo-code):
address fp = currentThread.getFP();
while (fp != null) {
    methodID m = (methodID) read_stack_at(fp + METHOD_OFFSET);
    print_method(m);
    fp = (address) read_stack_at(fp);
}
That's how it works inside JVM for interpreted methods. Compiled methods are a bit more complicated. They do not usually save a method reference on the stack. Instead there is a structure that maps addresses of compiled code to the metadata that contains compiled method info. But the idea of stack walking is still the same.
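Seen from the Java side, the result of such a walk is available through the standard Throwable.getStackTrace() API. The following is a minimal sketch rather than a description of JVM internals; the class name StackTraceDemo is made up for illustration, and the functionA..functionD chain is borrowed from the question.

```java
import java.util.ArrayList;
import java.util.List;

class StackTraceDemo {
    // Call chain from the question: A -> B -> C -> D.
    static List<String> functionA() { return functionB(); }
    static List<String> functionB() { return functionC(); }
    static List<String> functionC() { return functionD(); }

    static List<String> functionD() {
        // A new Throwable records the current thread's stack when it is constructed.
        StackTraceElement[] trace = new Throwable().getStackTrace();
        List<String> names = new ArrayList<>();
        for (StackTraceElement frame : trace) {
            names.add(frame.getMethodName());
        }
        return names;
    }

    public static void main(String[] args) {
        // The list starts with the innermost frame:
        // functionD, functionC, functionB, functionA, main
        System.out.println(functionA());
    }
}
```

The innermost method comes first because the walk starts at the current frame and follows the chain back towards the thread's entry point.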
In JNLUA, newThread() is a void function in Java but I don't quite understand the C code behind implementing the java side of the function. Also, would someone explain why the original author would return the index/pointer?
newThread pushes the new thread onto the Lua stack. There is no need to return a value. Interfacing via a pointer value would be challenging to make safe (and poor style anyway I guess), and the index is readily available without returning it (e.g. via getTop or simply by keeping track of the stack size or by using negative indices). This is more or less the same for all parts of the API that push a value onto the stack.
Note that JNLUA's newThread differs a bit from Lua's lua_newthread; threads in JNLUA are a little less general and more intended to be used like Lua coroutines (which are, of course, also Lua threads) -- newThread takes the value from the top of the stack and uses it as the "start function" for the coroutine (which will be invoked when calling resume for the thread the first time).
I won't speculate on the author's decision to expose threads in this way; returning a LuaState (and/or exposing lua_tothread) could also be a reasonable way to implement things a bit closer to the original Lua API.
This is the guts of the implementation:
static int newthread_protected (lua_State *L) {
    lua_State *T;
    T = lua_newthread(L);
    lua_insert(L, 1);
    lua_xmove(L, T, 1);
    return 1;
}
This is done in a protected call because lua_newthread can throw an exception (if it runs out of memory). The only argument (and thus the only thing initially on the stack) is the start function at index 1. lua_newthread will push the new thread object to the stack at index 2. Then lua_insert will reverse the order of the new thread object and the start function and lua_xmove transfers the start function to the new thread and then the new thread is returned (which leaves it on top of the caller's stack).
In Lua, a new thread with the start function already on the stack can sort of be treated equivalently to a yielded thread by lua_resume, so this is basically what JNLUA does -- its resume can be used to start the thread with the previously provided function.
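The stack shuffling described above can be traced with a toy model. This sketch does not use JNLua or the Lua C API at all: each Lua stack is modelled as a plain Java list whose last element is the top, and the class name NewThreadStackModel and the string placeholders are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class NewThreadStackModel {
    // Toy model: each "Lua stack" is a list; the last element is the top.
    final List<String> callerStack = new ArrayList<>();  // stands in for L
    final List<String> threadStack = new ArrayList<>();  // stands in for T

    void run() {
        callerStack.add("startFunction");  // the only argument, at index 1
        callerStack.add("threadObject");   // lua_newthread pushes the thread at index 2
        // lua_insert(L, 1): move the top element to index 1, shifting the rest up.
        callerStack.add(0, callerStack.remove(callerStack.size() - 1));
        // lua_xmove(L, T, 1): pop one value from L and push it onto T.
        threadStack.add(callerStack.remove(callerStack.size() - 1));
    }

    public static void main(String[] args) {
        NewThreadStackModel model = new NewThreadStackModel();
        model.run();
        System.out.println("L: " + model.callerStack);  // prints L: [threadObject]
        System.out.println("T: " + model.threadStack);  // prints T: [startFunction]
    }
}
```

At the end only the thread object is left on the caller's stack, which is what "return 1" hands back, and the start function sits on the new thread's stack waiting for the first resume.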
I want to perform the following loop:
for(absolute_date CURRENT_DATE = START_DATE; CURRENT_DATE.COMPARE_TO(END_DATE) <= 0; CURRENT_DATE.SHIFT_SELF(TIMESTEP))
{
CURRENT_STATE = Propagator.PROPAGATE(CURRENT_DATE);
}
Where CURRENT_STATE is an object of class state. propagator::PROPAGATE() is a method that returns an object of class state.
My state class is effectively a class wrapper for a Java library class which I'm calling via the JNI invocation API. The problem I'm having is that I want to delete the local Java references with DeleteLocalRef to prevent memory leaks (especially important as I will be looping many thousands of times).
However, since DeleteLocalRef is called in my state class destructor, the reference to the java jobject is destroyed as the RHS of the assignment returns, thus making CURRENT_STATE invalid as it contains a reference to a jobject which has been deleted.
How do I avoid this?
@Wheezil
Regarding your first point - since I am using the invocation API i.e. creating a virtual machine inside C++ and calling Java functions, I don't think I need to convert to Global References (as all local references remain valid until the JVM is destroyed or until a thread is detached). In this case I am not detaching and re-attaching threads to the JVM so the local references never get deleted. The important thing for me is to make sure the local references get deleted within JVM.
Regarding the second point: I have already inhibited copying by setting my copy constructors/assignment operators = delete. My issue is more specifically about how to ensure these references get deleted.
My state class looks like this:
state::state(JNIEnv* ENV)
{
    this->ENV = ENV;
    this->jclass_state = ENV->FindClass("/path/to/class");
    this->jobject_state = nullptr;
}
state::~state()
{
    if(DOES_JVM_EXIST())
    {
        ENV->DeleteLocalRef(this->jclass_state);
        ENV->DeleteLocalRef(this->jobject_state); //PROBLEMATIC
    }
}
state::state(state&& state_to_move)
{
    this->ENV = state_to_move.ENV;
    //move jobjects from mover => new object
    this->jobject_state = state_to_move.jobject_state;
    this->jclass_state = state_to_move.jclass_state;
}
state& state::operator =(state&& state_to_move)
{
    this->ENV = state_to_move.ENV;
    //move jobjects from mover => current object
    this->jobject_state = state_to_move.jobject_state;
    this->jclass_state = state_to_move.jclass_state;
    return *this;
}
To describe the problem I'm facing in more detail: the propagator::PROPAGATE() method returns a state object by value (currently stack allocated). As soon as this function returns, the following happens:
1) The move assignment operator is invoked. This sets the jobject_state and jclass_state members in the CURRENT_STATE object.
2) The destructor is invoked for the state instance created in the PROPAGATE() function. This deletes the local reference to jobject_state and thus the CURRENT_STATE Object no longer has a valid member variable.
Where to start... JNI is incredibly finicky and unforgiving, and if you don't get things just right, it will blow up. Your description is rather thin (please provide more detail if this doesn't help), but I can make a good guess. There are several problems with your approach. You are presumably doing something like this:
struct state {
    state(jobject thing_) : thing(thing_) {}
    ~state() { env->DeleteLocalRef(thing); }
    jobject thing;
};
The first issue is that storing local refs is dangerous. You cannot hang onto them beyond the current JNI frame. So convert them to global:
struct state {
    state(jobject thing_) : thing(env->NewGlobalRef(thing_)) {
        env->DeleteLocalRef(thing_);
    }
    ~state() { env->DeleteGlobalRef(thing); }
    jobject thing;
};
The second issue is that jobject is basically like the old C++ auto_ptr<> -- really unsafe, because copying it leads to dangling pointers and double-frees. So you either need to disallow copying of state and maybe only pass around state*, or make a copy constructor that works:
state(const state& rhs) : thing(env->NewGlobalRef(rhs.thing)) {}
This should at least get you on the right track.
UPDATE: Ddor, regarding local vs global references, this link describes it well: "Local references become invalid when the execution returns from the native method in which the local reference is created. Therefore, a native method must not store away a local reference and expect to reuse it in subsequent invocations." You can keep local references, but only under strict circumstances. Note that in particular, you cannot hand these to another thread, which it seems you are not doing. Another thing -- there are limits on the total number of local references that can be active. This limit is frustratingly underspecified, but it seems JVM-specific. I advise caution and always convert to global.
I thought that I had read somewhere, that you do not need to delete jclass because FindClass() always returns the same thing, but I'm having a hard time verifying this. In our code, we always convert the jclass to a global reference as well.
ENV->DeleteLocalRef(this->jclass_state);
I must admit ignorance about C++ move semantics; just make sure that the default copy ctor is not called and your jobject_state is not freed twice.
this->jobject_state = state_to_move.jobject_state;
If your move constructor is being called instead of the copy constructor or assignment, I don't know why you would be seeing a delete on the destruction of the temporary. As I said, I am not an expert on move semantics. I've always had the copy constructor create a new global reference.
You can not do this:
this->ENV = ENV;
You are caching the JNIEnv value passed to native code from the JVM.
You can't do that.
OK, to be pedantic, there are some instances where you can, but that can only work from a single thread, so there's no need to cache a JNIEnv * value for later reference when it's used by the same thread.
From the complexity of what you've posted, I seriously doubt you can guarantee your native code gets called by the same thread every single time.
You get passed a JNIEnv * every time native code gets called from the JVM, so there's pretty much never any point in caching the JNIEnv value.
IMO, you are making your native code way too complex. There's no need to be caching and tracking all those references. It looks like you're trying to keep a native C++ object synchronized with a Java object. Why? If native code needs access to a Java object, simply pass that object in the call from Java to the native code.
I was reading What and where are the stack and heap?. One thing I am a bit fuzzy on is what happens to the stack after a method exits. Take this image for example:
The stack is cleared upon exiting the method, but what does that mean? Is the pointer at the stack just moved back to the start of the stack making it empty? I hope this is not too broad of a question. I am not really sure what is going on behind the scenes when the stack is cleared from exiting a method.
When a method is called, local variables are located on the stack.
Object references are also stored on the stack; the corresponding objects are stored in the heap.
The stack is just a region of memory, it has a start and end address.
The JVM (Java virtual machine) has a register which points to the current top of the stack (the stack pointer). When a new method is called, an offset is added to the register to get new space on the stack.
When a method call is over, the stack pointer is decreased by this offset, which frees the allocated space.
Local variables and other data (such as the return address, parameters...) may still be on the stack and will be overwritten by the next method call.
BTW: this is why Java stores all objects on the heap. If an object were located on the stack and you returned a reference pointing into the stack, the object could be destroyed by the next method call.
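A small sketch of that last point, with made-up names (HeapSurvivesDemo, makeArray, clobber): the local variable holding the reference disappears when the method returns, but the array it referred to lives on the heap and stays usable through the copy of the reference handed to the caller.

```java
class HeapSurvivesDemo {
    // The array lives on the heap; only the reference 'values' is a local variable.
    static int[] makeArray() {
        int[] values = new int[] {1, 2, 3};
        return values;  // the reference is copied to the caller; the local slot is discarded
    }

    // Stands in for "the next method call" that reuses the freed stack space.
    static int clobber() {
        int a = 42, b = 43, c = 44;
        return a + b + c;
    }

    public static void main(String[] args) {
        int[] kept = makeArray();
        clobber();
        System.out.println(kept[0] + kept[1] + kept[2]);  // prints 6
    }
}
```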
During execution of a function, all local variables are created in the stack. That means that the stack grows to make enough room for those variables.
When the function ends, all the local variables go out of scope and the stack is rewound. Nothing else needs to happen; there is no implicit zeroing of memory. But:
semantically, the variables go out of scope and can no longer be used
physically, the stack pointer is rewound, effectively freeing the memory: it will be usable by the next function call
Above is not only true for functions but can be the same for any block of code since semantically the variables defined in the block go out of scope at end of block.
Is the pointer at the stack just moved back to the start of the stack making it empty?
The stack pointer is moved back to where it was before the function call. The stack would not be empty, because it still contains data that belongs to the calls that brought the program to that point.
To illustrate: if func1 called func2 called func3 the stack will look something like this:
func1 args/local vars... | func2 args/local vars... | func3 args/local vars...
After func3 returns it will be:
func1 args/local vars... | func2 args/local vars...
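The same growth and shrinkage can be observed in Java by counting frames at each step. This is only a sketch: the class name StackDepthDemo and the depth() helper are invented for the example, and frames are counted with the standard Throwable.getStackTrace() call.

```java
import java.util.ArrayList;
import java.util.List;

class StackDepthDemo {
    static final List<Integer> depths = new ArrayList<>();

    // Number of frames currently on this thread's stack, not counting depth() itself.
    static int depth() {
        return new Throwable().getStackTrace().length - 1;
    }

    static void func1() { depths.add(depth()); func2(); depths.add(depth()); }
    static void func2() { depths.add(depth()); func3(); depths.add(depth()); }
    static void func3() { depths.add(depth()); }

    // Returns the recorded depths relative to the depth inside func1.
    static List<Integer> run() {
        depths.clear();
        func1();
        List<Integer> relative = new ArrayList<>();
        for (int d : depths) relative.add(d - depths.get(0));
        return relative;
    }

    public static void main(String[] args) {
        System.out.println(run());  // prints [0, 1, 2, 1, 0]
    }
}
```

Each call adds one frame, and each return takes the stack back to exactly the depth the caller had before the call.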
A stack is just that, a stack of things, usually a stack of frames, with the frames containing parameters, local variables and instances of objects, and some other things depending on your operating system.
If you have instantiated objects on the stack, i.e. MyClass x and not MyClass * x = new MyClass(), then the object x will be torn down and its destructor called when the stack is rewound to the previous frame, which essentially just makes the current (internal) stack pointer point to the previous frame. In most native languages no memory will be cleared, etc.
Finally, this is why you should initialise local variables (in most languages): a call to the next function will set up a new frame, which will most likely be in the same place as the previously rewound stack frame, so uninitialised local variables will contain garbage.
It might be useful for you to think about what your compiled code might look like at a machine (or, better for us humans, assembly) level. Consider this as a possible example in X86 Assembly:
When the method is called, arguments will either be passed in the registers or passed on the stack itself. Either way, the code calling the method will eventually:
call the_method
When this happens, the current instruction pointer is pushed onto the stack. The stack pointer is pointing at it. Now we're in the function:
the_method:
push ebp
mov ebp, esp
The current base pointer is preserved on the stack and the base pointer is then used to reference things in the stack (like passed in variables).
sub esp, 8
Next, 8 bytes are reserved on the stack (assuming two four-byte integer locals).
mov dword [ebp-4], 4
mov dword [ebp-8], 2
The local variables are assigned. This could actually be accomplished by simply pushing them but more likely there will be a sub involved. Fast forward to the end:
mov esp, ebp
pop ebp
ret
When this happens, the stack pointer is right back where it was when we started, pointing at the stored base pointer (saved frame pointer). This is popped back into EBP leaving ESP pointing at the return pointer which is then "popped" into EIP with the ret. Effectively, the stack has unwound. Even though the actual memory locations haven't changed for the two local variables, they are effectively above the stack (physically below in memory, but I think you get what I mean.)
Keep in mind the stack is a zone in memory assigned to a process.
In summary, when your code calls a function (typically at the assembly level), you need to save in memory the registers you are going to use (this can vary with the calling convention), because those registers could be overwritten by calls to another function. (You would also need to store the return address, arguments, and a lot more, but let's omit that.) To do that, you decrease the stack pointer by the space those registers need. Before exiting, you must increase the stack pointer by the same amount. You don't need to do anything more, because the values you stored are no longer needed; they will be overwritten by the next function call.
In Java, references to objects live on the stack while the objects themselves live on the heap. Once no references to an object remain reachable (from the stack or anywhere else), the object becomes eligible for garbage collection and its heap memory can be reclaimed.
I hope my answer helps you. Also, check this.
This looks like a silly question but I found it is hard to get it right. I have asked different people but couldn't get an ideal answer.
I want to know what happens after we call a normal method in Java (Provided in a single threaded environment).
My understanding is that:
All current stack variables are popped and stored somewhere (where?)
The current method call halts
The arguments of the newly called method are pushed to the stack
The method code runs
After the method finishes running, the stack is again emptied and the old stack contents are restored. (What happens if the function returns a value?)
Code continues with the calling method.
This is a very incomplete and possibly wrong answer. Can someone provide a more detailed description?
Many thanks.
No, that's actually fairly accurate:
1) current stack variables remain on the stack
2) The current method pauses
3) The arguments of the newly called method are pushed to the stack
4) The method code runs
5) After the method finished running, we pop the stack. The called method's stack variables are no longer valid - they no longer "exist" at this point.
6) We pass the return value (if any) to the caller
7) Code continues with the calling method. All its stack variables remain intact.
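A minimal sketch of steps 1 to 7, using made-up names (CallDemo, square, run): the caller's local survives the call untouched, the callee works on its own copy of the argument, and the return value is handed back.

```java
class CallDemo {
    static int square(int n) {
        n = n * n;  // changes only the callee's copy of the argument
        return n;   // the return value is passed back to the caller
    }

    static int[] run() {
        int x = 5;               // caller's local; it stays in the caller's frame
        int result = square(x);  // the caller pauses here until square returns
        return new int[] {x, result};
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1]);  // prints "5 25"
    }
}
```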
==============================
ADDENDUM:
@Kevin -
Conceptually, I think you got it just about right. I clarified a few points, I hope that helps.
David Wallace's link is very good if you want to go in depth on how the JVM implements "method calling".
Here is a good overview on how "a stack" works. Any stack, calling any subroutine - not just Java: http://en.wikipedia.org/wiki/Call_stack
Finally, Marko Topolnik is correct. "The reality" is almost always complex enough that it doesn't lend itself to a simple, one-size-fits all answer. But I definitely think your understanding is good. At least at the 10,000 foot level.
IMHO...
For the interpreter, assuming an instance method, and taking some minor liberties:
The object pointer is used to reference the object, and from there the Class object.
The method pointer is located in the Class object. (The lookup to convert method name to method index was largely done when the class was loaded, so this is basically just an array index operation.)
Generally some sort of a "mark" is pushed onto the JVM stack. This would contain the caller's instruction pointer, and a pointer to the base of his stack. (Lots of different implementations here.)
The method's definition is consulted to see how many local vars are needed. That many blank elements are pushed onto the stack.
The object ("this") pointer is stored in local var 0, and any parms are stored in 1,2,3... as appropriate.
Control is transferred to the called method.
On return, the stack is popped down to the point where the call started, any return value is pushed onto the stack, and control is transferred back to the caller.
Compiled code is conceptually similar, only it uses the "C" stack, and interpreted code in a JITC environment will make use of both the JVM stack and the "C" stack.
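The interpreter steps listed above can be modelled in a few lines of Java. This is a toy sketch under loud assumptions: ToyJvmStack, Frame, invoke and doReturn are invented names, the layout is simplified, and no real JVM is structured exactly like this.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyJvmStack {
    // One activation record: local variable slots plus an operand stack.
    static final class Frame {
        final Object[] locals;
        final Deque<Object> operands = new ArrayDeque<>();
        final int returnPc;  // the "mark": where the caller resumes
        Frame(int maxLocals, int returnPc) {
            this.locals = new Object[maxLocals];
            this.returnPc = returnPc;
        }
    }

    final Deque<Frame> frames = new ArrayDeque<>();

    // Invoke an instance method: new frame, 'this' in slot 0, parameters from slot 1.
    Frame invoke(Object receiver, int maxLocals, int returnPc, Object... args) {
        Frame frame = new Frame(maxLocals, returnPc);
        frame.locals[0] = receiver;
        System.arraycopy(args, 0, frame.locals, 1, args.length);
        frames.push(frame);
        return frame;
    }

    // Return: pop the callee frame, push the (non-null) result onto the caller's operand stack.
    int doReturn(Object value) {
        Frame callee = frames.pop();
        Frame caller = frames.peek();
        if (caller != null) caller.operands.push(value);
        return callee.returnPc;
    }

    public static void main(String[] args) {
        ToyJvmStack stack = new ToyJvmStack();
        stack.invoke("callerObject", 1, -1);                // caller frame, no parameters
        Frame callee = stack.invoke("receiver", 3, 7, 10);  // one parameter, one spare local
        System.out.println(callee.locals[0] + " " + callee.locals[1]);  // prints "receiver 10"
        int resumeAt = stack.doReturn(20);
        System.out.println(resumeAt + " " + stack.frames.peek().operands.peek());  // prints "7 20"
    }
}
```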
How is recursion implemented in Java? My question is about what happens behind the scenes when a recursive method is executed in Java. I vaguely understand that it uses the stack, but I am looking for a clear explanation with an example.
Recursion isn't handled much differently in Java than in other (imperative) languages.
There's a stack which holds a stack frame for every method invocation. That stack is the call stack (or simply just "stack", when the context makes it clear what is meant). The elements on the stack are called "stack frames".
A stack frame holds the method arguments passed in and the local variables of a method invocation (and possibly some other data, such as the return address).
When a method invokes itself (or, in fact, any method) then a new stack frame is created for the parameters and local variables of the newly-called method.
During the method execution the code can only access the values in the current (i.e. top-most) stack frame.
This way a single (local) variable can seemingly have many different values at the same time.
Recursion isn't handled any other way than normal method calls, except that multiple stack frames will represent invocations of the same method at the same time.
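As a small illustration, the sketch below counts how many frames of the same method are on the stack when the recursion reaches its base case. The names RecursionDemo, factorial, countFrames and deepestFrames are made up for the example; the frame count comes from the standard Throwable.getStackTrace() call.

```java
class RecursionDemo {
    static int deepestFrames = 0;

    static long factorial(int n) {
        if (n <= 1) {
            // Base case: record how many factorial frames are live right now.
            deepestFrames = countFrames("factorial");
            return 1;
        }
        // Each call gets its own frame with its own copy of n.
        return n * factorial(n - 1);
    }

    // Counts the frames on the current thread's stack that belong to the named method.
    static int countFrames(String methodName) {
        int count = 0;
        for (StackTraceElement frame : new Throwable().getStackTrace()) {
            if (frame.getMethodName().equals(methodName)) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(factorial(5) + " " + deepestFrames);  // prints "120 5"
    }
}
```

For factorial(5) there are five live frames at the deepest point, one each for n = 5, 4, 3, 2 and 1, which is how the single variable n appears to hold five different values at once.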
When a method is invoked, it needs space to keep its parameters, its local variables and the return address. This space is called an activation record (stack frame).
Recursion is calling a method that happens to have the same name as the caller. A recursive call is therefore not literally a method calling itself, but one instantiation of a method calling another instantiation of the same original. These invocations are represented internally by different activation records, which means the system keeps them distinct.