Java call stack inspection and manipulation - java

My question is: is it possible (in ANY way) to analyze and modify the call stack (both the content of frames and the stack itself) at runtime?
I'm looking for any possibility - a low-level, unsafe or internal API, the possibility of writing a C extension, etc. The only constraint: it should be usable in a standard runtime, without debugging or profiling mode. This is the point where I'm researching "is it possible at all?", not "is it a good idea?".
I'd like to gather all local data from a frame, store it somewhere, and then remove that frame from the stack, with the possibility of restoring it later. Effectively that gives us continuations on the JVM, which would enable fast async frameworks (like gevent in Python) and generator constructs (like Python's).
This may look like a repeated question, but I've only found questions that were answered with "use Thread.currentThread().getStackTrace()" or "that should be done with debugging tools". There was a question similar to mine, but it was only answered in the context of what the asker wanted to do (work on async computations), while I need a more general (Java-stack-oriented) answer. This question is similar too, but as before, it is focused on parallelization, and the answers are focused on that too.
I repeat: this is a research step in the process of coming up with a new language feature proposal. I don't want to risk corrupting anything in the JVM - I'm looking for a possibility first, then I'm going to analyse the possible risks and look out for them. I know that manipulating the stack by hand is ugly, but so is creating instances while omitting the constructor - and that is the basis for Objenesis. Dirty hacks may be dirty, but they may help introduce something cool.
PS. I know that Quasar and Lightwolf exist, but, as above, those are concurrency-focused frameworks.
EDIT
A little clarification: I'm looking for something that will be compatible with future versions of the JVM and libraries. Preferably we're talking about something that is considered stable public API, but if the solution lies in something internal, yet almost standard or becoming standard after being internal (like sun.misc.Unsafe) - that will do too. If it is doable with a C extension using only the C JVM API - that's OK. If it is doable with bytecode manipulation - that's OK too (I think that MAY be possible with ASM).

I think there is a way of achieving what you want using JVMTI.
Although you cannot directly do what you want (as stated in a comment above), you may instrument/redefine methods (or entire classes) at run time. So you could redefine every method to call the next method directly in order to "restore the execution context" and, as soon as you have the stack you want, redefine them with your original code.
For example: let's say you want to restore a stack where just A called B and B called C.
When A is loaded, change its code to directly call B. As soon as B is loaded, redefine it to directly call C. Then call the topmost method (A). As soon as C gets called (which should be very fast now), redefine A and B to their original code.
If there are multiple threads involved and parameter values that must be restored, it gets a little more complicated, but still doable with JVMTI. However, this would then be worth another question ;-).
Hope this helps. Feel free to contact me or comment if you need clarification on anything.
EDIT:
Although I think it IS doable, I also think this is a lot (!!!) of work, especially when you want to restore parameters, local variables, and calling contexts (like this pointers, held locks, ...).
EDIT as requested: Assume the same stack as above (A calling B calling C). Although A, B, and C have arbitrary code inside them, just redefine them like this: void A() { B(); } void B() { C(); } void C() { redefine(); } As soon as you reach the redefine method, redefine all classes with their original code. Then you have the stack you want.
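For what it's worth, the redefine() step above does not strictly need native JVMTI code: the java.lang.instrument package exposes class redefinition to Java agents. Below is a minimal sketch, assuming the JVM is started with -javaagent and the agent jar's manifest sets Can-Redefine-Classes: true. The class name RedefineAgent and the helper redefine(...) are made up for illustration; only Instrumentation.redefineClasses and ClassDefinition are real API. How you obtain the stub and original bytecode (e.g. with ASM) is left out.

```java
import java.lang.instrument.ClassDefinition;
import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;

// Hypothetical agent class; in a real agent jar it would normally be a
// public class in its own file and be named in the Premain-Class manifest entry.
final class RedefineAgent {
    private static volatile Instrumentation instrumentation;

    // Invoked by the JVM before main() when started with -javaagent:agent.jar
    public static void premain(String agentArgs, Instrumentation inst) {
        instrumentation = inst;
    }

    // Swaps the method bodies of 'target' for the ones in 'newBytecode'.
    // The new class file must keep the same fields and method signatures.
    static void redefine(Class<?> target, byte[] newBytecode)
            throws ClassNotFoundException, UnmodifiableClassException {
        Instrumentation inst = instrumentation;
        if (inst == null) {
            throw new IllegalStateException("agent not loaded; start the JVM with -javaagent");
        }
        inst.redefineClasses(new ClassDefinition(target, newBytecode));
    }
}
```

One caveat worth checking before building on this: the Instrumentation Javadoc states that if a redefined method has active stack frames, those frames continue to run the bytecode of the original (pre-redefinition) method, and the new code is only used on new invocations. So it would need to be verified whether the frames of A and B created by the stubs really end up executing the original bodies.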

I'm not sure this tool fits, but you can check GDB: http://en.wikipedia.org/wiki/GNU_Debugger.
GDB offers extensive facilities for tracing and altering the execution of computer programs. The user can monitor and modify the values of programs' internal variables, and even call functions independently of the program's normal behavior.

Related

different approaches in retrieving Caller method details apart from stack trace [Java]

There is a need to pass caller method details in java. I do not want to use StackTrace to find out them.
Are there any alternative means to get them?
I know Aspects will help but there is a concern that it will slow down performance.
Any suggestions will help.
I am not aware of any.
In the end, you are asking for some sort of instrumentation. In other words: you want to tell the JVM to keep track of the call stack and, more importantly, make that information available to you programmatically.
And even when you only want that to happen for specific methods, the JVM still has to track all method invocations, as it can't know whether one of the methods to track is called in the end. And the fact that Java is both interpreted and compiled to native machine code adds to the complexity, too.
So, as said: there is no way of tracking method invocations easily without performance impacts. And the tools I know that can keep that performance impact at a reasonable level, like XRebel, are for later evaluation, not for programmatic consumption.
Finally: you should rather look into your requirements. Java is simply not a good language when you really need such information. It isn't meant to keep call stacks around. So: the real solution would be to either select a platform that works better for you, or (recommended) to step back and design a solution that doesn't have this requirement.
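One addition for readers on Java 9 or later: the JDK has java.lang.StackWalker, which still walks the stack but does so lazily, so you only pay for the frames you actually look at instead of materialising a full StackTraceElement[]. A minimal sketch follows; the class and method names (CallerLookup, callerOf, audited, businessLogic) are made up for the example, and whether this is cheap enough for your case is something you would have to measure.

```java
final class CallerLookup {
    // Returns "ClassName.methodName" for the caller of the method that asks.
    static String callerOf() {
        return StackWalker.getInstance().walk(frames -> frames
                .skip(2) // frame 0 is callerOf itself, frame 1 is the method asking
                .findFirst()
                .map(f -> f.getClassName() + "." + f.getMethodName())
                .orElse("<no caller>"));
    }

    // The method that wants to know who called it.
    static String audited() {
        return callerOf();
    }

    static String businessLogic() {
        return audited();
    }

    public static void main(String[] args) {
        System.out.println(businessLogic());
    }
}
```

Here audited() learns that it was called from businessLogic. If even that is too costly, the remaining option is the boring one: pass the caller information explicitly as a parameter.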

Simulating Destructors in Clojure

Problem Statement
I have two machines, A and B, both running Clojure.
B has some in memory data structure.
A holds an object A_P which is a reference/pointer to some object B_O in B's memory.
Now, as long as A_P is NOT GC-ed by A, I do not want B_O GC-ed by B.
However, once A_P has been GC-ed by A (and nothing else in A refers to B_O, and nothing else in B refers to B_O), then I want B_O to be eligible to be GC-ed.
Solution in Languages with Destructors
In C++, this is easy -- I use destructors. When A_P gets destroyed, A sends B a msg to decrement the number of external references to B_O, and when that's 0, and the number of internal references to B_O is also 0, then B_O gets freed.
Solution in Java/Clojure?
Now, I know that Java does not have destructors. However, I'm wondering if Clojure has a way around this problem.
Thanks!
No good solution exists, without a real distributed garbage collector. Even in C++, you cannot do this safely, because you implemented reference counting and pretended it was a real garbage collector; but if two objects point to each other across the machine divide, and are both unreferenced locally, they still both have a nonzero reference count and cannot be collected.
No, Clojure (based on the JVM or CLR) doesn't have "C++ type destructors" because of the automatic memory management model of the JVM. There are things like finalizers, but it is recommended not to use them. Instead you should model your solution on a message passing mechanism rather than on machine A holding a "pointer/reference" to data in B. I know this answer is very high level, because you haven't provided any specific problem details in your question. If you need more details about how to solve a particular problem, please provide the complete context and I am sure someone will be able to help you.
This is an inherently difficult problem: distributed garbage collection is really hard, if not impossible, to get right.
However you might just be able to make it work using Java finalisers and overriding the finalize() method. You can then implement a messaging technique similar to the one you describe for C++.
This will have issues in the more general case (it won't help you with circular references across machines as amalloy points out) and there are some other quirks to be aware of (mostly around your lack of control over exactly when the finaliser gets called) but you might be able to get it to work in your specific situation.
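A sketch of that idea in Java (callable from Clojure through interop). Instead of overriding finalize(), which has been deprecated since Java 9, it uses java.lang.ref.Cleaner, which serves the same purpose here. Everything named RemoteRefDemo, RemoteRef, Release and sentToB is hypothetical: sentToB is just a list standing in for whatever channel A uses to talk to B.

```java
import java.lang.ref.Cleaner;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class RemoteRefDemo {
    private static final Cleaner CLEANER = Cleaner.create();

    // Stand-in for the network channel from A to B (assumption for the demo).
    static final List<String> sentToB = new CopyOnWriteArrayList<>();

    // The cleanup action must not hold a reference to the RemoteRef itself,
    // otherwise the RemoteRef could never become unreachable.
    private static final class Release implements Runnable {
        private final String remoteId;

        Release(String remoteId) {
            this.remoteId = remoteId;
        }

        @Override
        public void run() {
            sentToB.add("release " + remoteId);
        }
    }

    // A_P: the local handle on machine A for an object B_O living on machine B.
    static final class RemoteRef implements AutoCloseable {
        final String remoteId;
        private final Cleaner.Cleanable cleanable;

        RemoteRef(String remoteId) {
            this.remoteId = remoteId;
            // Runs Release after this handle becomes unreachable, or on close().
            this.cleanable = CLEANER.register(this, new Release(remoteId));
        }

        @Override
        public void close() {
            cleanable.clean(); // the action runs at most once
        }
    }

    public static void main(String[] args) {
        try (RemoteRef ref = new RemoteRef("B_O")) {
            System.out.println("using " + ref.remoteId);
        }
        System.out.println(sentToB);
    }
}
```

The same caveats apply as for finalizers: if you rely on the GC rather than close(), you do not control when (or whether, before exit) the release message is sent, and cycles across machines are still never collected.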
Assuming you're using a data structure like a ref or atom for holding data structure A somewhere inside it, you can use listeners for monitoring the state of that structure for removals of A, and those listeners can send appropriate message to B. clojure.data/diff could be really useful for finding the structures that were removed.
The other option would be to have, immediately after the A structure is dereferenced, the function responsible for doing so send the message. As part of this though, make sure that that code was actually responsible for the removal of A, and not some other update.

how can I get the History of an object or trace an Object

I have a requirement where, in my application, a lot of processing is happening, and at some point an exception occurs because of an object. Now I would like to know the whole history of that object, i.e. whatever happened to that object since the application started.
Is peeking into the history of an object possible in any way, using JMX or anything else?
Thanks
In one word: No
With a few more words:
The JVM does not keep any history on any object past its current state, except for very little information related to garbage collection and perhaps some method call metrics needed for the HotSpot optimizer. Doing otherwise would imply a huge processing and memory overhead. There is also the question of granularity; do you log field changes only? Every method call? Every CPU instruction during a method call? The JVM simply takes the easy way out and does none of the above.
You have to isolate the class and/or specific instance of that object and log any operation that you need on your own. You will probably have to do that manually - I have yet to find a bytecode instrumentation library that would allow me to insert logging code at runtime...
Alternatively, you might be able to use an instrumenting profiler, but be prepared for a huge performance drop when doing that.
That's not possible with standard Java (or any other programming language I'm aware of). You should add sufficient logging to your application, which will allow you to get some idea of what's happened. Also, learn to use your IDE's debugger if you don't already know how.
I generally agree with @thkala and @artbristol (+1 for both).
But you have a requirement and have no choice: you need a solution.
I'd recommend that you try to wrap your objects with dynamic proxies that perform auditing, i.e. record all changes that happen to the object.
You can probably use AspectJ for this. The aspect will note which method was called and which parameters were sent. You can also use other, lower-level tools, e.g. Javassist or cglib.
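To make the dynamic proxy suggestion concrete, here is a minimal sketch using only java.lang.reflect.Proxy from the JDK. The Account interface, SimpleAccount and the audited(...) helper are invented for the example. Note that JDK proxies only work when the object is used through an interface; for plain classes you would need one of the bytecode tools mentioned above.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AuditDemo {
    interface Account {
        void deposit(int amount);
        int balance();
    }

    static final class SimpleAccount implements Account {
        private int balance;

        public void deposit(int amount) {
            balance += amount;
        }

        public int balance() {
            return balance;
        }
    }

    // Wraps 'target' so that every call made through 'iface' is appended to 'log'.
    static <T> T audited(T target, Class<T> iface, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            Object[] shown = (args == null) ? new Object[0] : args;
            log.add(method.getName() + Arrays.toString(shown));
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the real method threw
            }
        };
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        List<String> history = new ArrayList<>();
        Account account = audited(new SimpleAccount(), Account.class, history);
        account.deposit(50);
        account.balance();
        System.out.println(history);
    }
}
```

When the exception finally occurs, the history list holds the sequence of calls made on that particular object, which is about as close to an "object history" as you can get without a profiler.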
The answer is no. The JVM doesn't maintain the history of an object's state. The most you can do is keep track of the states of your object somewhere in memory; when you get the exception, you can serialize that in-memory record and then analyse it.

Understanding this architecture

I have inherited a massive system from my predecessor and I am beginning to understand how it works, but I can't fathom why.
It's in Java and uses interfaces, which should add one extra layer, but here they add 5 or 6.
Here's how it goes: when a button in the user interface is pressed, it calls a function which looks like this
foo.create(stuff...)
{
    bar.create(stuff...);
}
bar.create is exactly the same except it calls foobar.create, and that in turn calls barfoo.create. This goes on through 9 classes before it reaches a function that accesses the database.
As far as I know, each extra function call incurs more performance cost, so this seems stupid to me.
Also, in foo.create all the variables are error checked. This makes sense, but in every other call the error checks happen again; it looks like cut and paste code.
This seems like madness: once the variables have been checked, they should not need to be re-checked, as this is just wasting processor cycles in my opinion.
This is my first project using Java and interfaces, so I'm just confused as to what's going on.
Can anyone explain why the system was designed like this, what benefits/drawbacks it has, and what I can do to improve it if it is bad?
Thank you.
I suggest you look at design patterns, and see if they are being used in the project. Search for words like factory and abstract factory initially. Only then will the intentions of the previous developer be understood correctly.
Also, in general, unless you are running on a resource constrained device, don't worry about the cost of an extra call or level of indirection. If it helps your design, makes it easier to understand or open to extension, then the extra calls are worth making.
However, if there is copy-paste in the code, then that is not a good sign, and the developer probably did not know what he was doing.
It is very hard to understand what exactly is done in your software. Maybe it even makes sense. But I've seen a couple of projects done by "design pattern maniacs". It looked like they wanted to demonstrate their knowledge of all sorts of delegates, indirections, etc. Maybe that is your case.
I cannot comment on the architecture without carefully examining it, but generally speaking, separation of services across different layers is a good idea. That way, if you change the implementation of one service, the other services remain unchanged. However, this will be true only if there is loose coupling between the different layers.
In addition, it is generally the norm that each service handles the exceptions that specifically pertain to the kind of service it provides, leaving the rest to others. This also allows us to reduce the coupling between service layers.
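To illustrate what loosely coupled layers can look like with a single level of indirection, here is a small sketch. All names (UserStore, InMemoryUserStore, UserService) are invented and not taken from the system in the question. The service validates its input once at the boundary and talks to storage only through an interface, so the storage implementation can be replaced without touching the service.

```java
import java.util.ArrayList;
import java.util.List;

final class LayeringDemo {
    // Storage layer contract: the service depends only on this interface.
    interface UserStore {
        void save(String name);
    }

    // One possible implementation; a database-backed one could replace it.
    static final class InMemoryUserStore implements UserStore {
        final List<String> saved = new ArrayList<>();

        public void save(String name) {
            saved.add(name);
        }
    }

    // Service layer: input is checked once here, not again in every layer below.
    static final class UserService {
        private final UserStore store;

        UserService(UserStore store) {
            this.store = store;
        }

        void create(String name) {
            if (name == null || name.trim().isEmpty()) {
                throw new IllegalArgumentException("name must not be blank");
            }
            store.save(name.trim());
        }
    }

    public static void main(String[] args) {
        InMemoryUserStore store = new InMemoryUserStore();
        new UserService(store).create("  ada ");
        System.out.println(store.saved);
    }
}
```

If a layer in the inherited code neither validates, translates, nor decides anything, and only forwards the call, it is a candidate for removal.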

From Static Typing to Dynamic Typing

I have always worked on statically typed languages (C/C++, Java). I have been playing with Clojure and I really like it.
One thing I am worried about is this: say that I have a window function that takes 3 modules as arguments, and along the way the requirements change and I need to pass another module to the function. In a statically typed language I just change the function and the compiler complains everywhere I used it. But in Clojure it won't complain until the function is called. I can do a regex search and replace, but it seems there is a chance of missing a call, and it will go unnoticed until that function is actually called. How do you guys deal with this?
This is one of the reasons automated testing/test driven development is even more important in dynamically typed languages. I haven't used Clojure (I mostly use Ruby), so unfortunately I can't recommend a specific testing framework.
The first thing I'd like to mention is that Bruce Eckel has written a very interesting article called Strong Typing vs Strong Testing (the link is down at the moment, unfortunately, but hopefully it will be up soon).
His idea is that when dealing with compiled languages, the compiler is just acting as the first, automatic step of automatic testing. When making the move to a dynamic language, you lose this first level of automatic testing. But in both cases, this first, automatic level is just one part of testing, and not even a very important part.
His point is that if you're developing programs properly, i.e. doing some form of tests and regression tests, the lack of a compiler will only force you to add some more, somewhat basic tests anyways, which is why it's no big loss.
So I guess the first answer I'd give you is, focus on your testing, something you should be doing anyway, and such changes shouldn't affect you too badly.
The second thing I'd like to mention is many dynamic languages that I've seen (for example, Python) have much better abilities to change what methods/classes do without breaking existing code.
For example, with Python, if your method used to accept two parameters but now requires a third one, you can always add a default parameter without breaking any existing code, but that you can now utilize. This is a very basic technique, but in Python's case (and I assume most other dynamic languages as well), these techniques can get much more interesting; since they're dynamic, you can pretty much change the implementation of functions for specific modules, change what variables mean, etc.
I'd suggest looking at which techniques Clojure has that allow similar things, and deciding if they apply in your situation.
You do the same thing you did if the method was part of a public interface that you weren't the only user of.
You add a new method with the extra module and change the old one to call the new one with a suitable default.
Oh, and if your program is that big, make sure you have good tests (test-is should make it simpler than Java).
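The same trick, written in Java since that is what the asker is coming from (in Clojure the equivalent would be a multi-arity function). The names WindowFactory, createWindow and DEFAULT_MODULE are invented for the sketch, and modules are plain strings to keep it short.

```java
final class WindowFactory {
    // Hypothetical default used for callers that predate the fourth module.
    static final String DEFAULT_MODULE = "default-module";

    // Old three-argument entry point: existing callers keep working unchanged.
    static String createWindow(String m1, String m2, String m3) {
        return createWindow(m1, m2, m3, DEFAULT_MODULE);
    }

    // New entry point with the extra module.
    static String createWindow(String m1, String m2, String m3, String m4) {
        return "window[" + String.join(", ", m1, m2, m3, m4) + "]";
    }

    public static void main(String[] args) {
        System.out.println(createWindow("a", "b", "c"));      // old call site
        System.out.println(createWindow("a", "b", "c", "d")); // new call site
    }
}
```

Old call sites can then be migrated one by one, and none of them breaks in the meantime.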
Test coverage is definitely important. But a dynamically typed language will allow you to work in a different way. In a statically typed language (like Java), a change in the interface means modifying all the callers. In Ruby, you could do this-- but probably won't. Instead, you'll probably add flexibility to the method in one of a few ways. Namely:
you tend to have very few methods that take as many as three parameters in Ruby (as opposed to Java). Because you don't have the statically typed interfaces of Java, you break the problem down into smaller pieces and steps. It's much more common to write methods that take just 1 parameter, and then refactor when it becomes more complex.
it's possible-- and common-- to leave the old behavior in place while adding more arguments. For example, if you have to add a third argument to a two argument method, you will set its default value to preserve the old behavior (and save you a refactor). If you are familiar with Javascript libraries like jQuery, they take advantage of this everywhere with "optional" arguments.
similar to optional arguments, methods can grow to take a flexible parameter list. With solid test coverage, you can quite easily add a new behavior to an existing method and safely know you haven't broken the existing code. In Rails, methods like "render" take a wide range of options.
You're not completely without compiler support in Clojure. In the specific example you give, it's the arity of the function that changed, which would be picked up by compiling the Clojure code. I'm still making the strong -> dynamic typing transition and find this comforting!
You lose some level of refactoring and type safety when you move to dynamic languages. The more information the compiler has, the more it can do at compile time for you.
Tim Bray discusses it here, a critique of which by Cedric is here, and there is a post on Artima discussing it at length.
If you really need static typing, you can use https://github.com/clojure/core.typed and its Leiningen module to test static variable passing.
