Simulating Destructors in Clojure - java

Problem Statement
I have two machines, A and B, both running Clojure.
B has some in memory data structure.
A holds an object A_P which is a reference/pointer to some object B_O in B's memory.
Now, as long as A_P is NOT GC-ed by A, I do not want B_O GC-ed by B.
However, once A_P has been GC-ed by A (and nothing else in A refers to B_O, and nothing else in B refers to B_O), then I want B_O to be eligible to be GC-ed.
Solution in Languages with Destructors
In C++, this is easy -- I use destructors. When A_P gets GC-ed, A sends B a msg to decrement the number of external references to B_O, and when that's 0, and the number of internal references to B_O is also 0, then B_O gets GC-ed.
Solution in Java/Clojure?
Now, I know that Java does not have destructors. However, I'm wondering if Clojure has a way around this problem.
Thanks!

No good solution exists, without a real distributed garbage collector. Even in C++, you cannot do this safely, because you implemented reference counting and pretended it was a real garbage collector; but if two objects point to each other across the machine divide, and are both unreferenced locally, they still both have a nonzero reference count and cannot be collected.

No, Clojure (based on the JVM or CLR) doesn't have C++-style destructors, because of the JVM's automatic memory management model. There are things like finalizers, but it is recommended not to use them. Instead, you should model your solution on a message-passing mechanism rather than on machine A holding a "pointer/reference" to data in B. I know this answer is very high level, because you haven't provided any specific problem details in your question. If you need more details about how to solve a particular problem, please provide the complete context and I am sure someone will be able to help you.

This is an inherently difficult problem: distributed garbage collection is really hard, if not impossible, to get right.
However you might just be able to make it work using Java finalisers and overriding the finalize() method. You can then implement a messaging technique similar to the one you describe for C++.
This will have issues in the more general case (it won't help you with circular references across machines as amalloy points out) and there are some other quirks to be aware of (mostly around your lack of control over exactly when the finaliser gets called) but you might be able to get it to work in your specific situation.
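A minimal sketch of the messaging idea, with some loud assumptions: it uses java.lang.ref.Cleaner (Java 9+) instead of overriding finalize(), machine B is simulated by an in-process object, and the names RemoteHost and RemoteHandle are made up for illustration. In a real setup release() would be a network message, and the cross-machine cycle problem mentioned above still applies.

```java
import java.lang.ref.Cleaner;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for machine B: counts external references per object id.
class RemoteHost {
    private final Map<String, AtomicInteger> externalRefs = new ConcurrentHashMap<>();

    void retain(String id) {
        externalRefs.computeIfAbsent(id, k -> new AtomicInteger()).incrementAndGet();
    }

    void release(String id) {
        // Drop the entry once the count reaches zero, so B no longer pins the object.
        externalRefs.computeIfPresent(id, (k, n) -> n.decrementAndGet() <= 0 ? null : n);
    }

    int externalCount(String id) {
        AtomicInteger n = externalRefs.get(id);
        return n == null ? 0 : n.get();
    }
}

// Hypothetical stand-in for A_P: a local handle to the remote object B_O.
class RemoteHandle implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();
    private final Cleaner.Cleanable cleanable;

    RemoteHandle(RemoteHost host, String id) {
        host.retain(id);
        // The action captures only host and id, never `this`;
        // capturing `this` would keep the handle reachable forever.
        this.cleanable = CLEANER.register(this, () -> host.release(id));
    }

    // Deterministic release. The Cleaner guarantees the action runs at most once,
    // whether it is triggered here or after the handle becomes unreachable.
    @Override
    public void close() {
        cleanable.clean();
    }
}
```

If the handle is simply dropped, the release runs some time after the GC notices it is unreachable, with no guarantee on when; calling close() (or using try-with-resources) sends it immediately.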

Assuming you're using a data structure like a ref or atom for holding data structure A somewhere inside it, you can use listeners for monitoring the state of that structure for removals of A, and those listeners can send appropriate message to B. clojure.data/diff could be really useful for finding the structures that were removed.
The other option would be to have, immediately after the A structure is dereferenced, the function responsible for doing so send the message. As part of this though, make sure that that code was actually responsible for the removal of A, and not some other update.

Related

Duplicate planning entities in the solution

I'm new to Optaplanner, and I try to solve a quite simple problem (for now, I will add more constraints eventually).
My model is the following: I have tasks (MarkerNesting), that must run one at a time on a VirtualMachine; the goal is to assign a list of MarkerNestings to VirtualMachines, having all machines used (we can consider that we have more tasks than machines as a first approximation). As a result, I expect each task to have a start and a end date (as shadow variables - not implemented yet).
I think I must use a chained variable, with the VirtualMachine being the anchor (chained through time pattern) - am I right?
So I wrote a program inspired by some examples (tsp and coach and shuttle) with 4 machines and 4 tasks, and I expect each machine having one task when it is solved. When running it, though, I get some strange results : not all machines are used, but the worst is that I have duplicate MarkerNesting instances (output example):
[VM 1/56861999]~~~>[Nesting(155/2143571436)/[Marker m4/60s]]~~~>[Nesting(816/767511741)/[Marker m2/300s]]~~~>[Nesting(816/418304857)/[Marker m2/300s]]~~~>[Nesting(980/1292472219)/[Marker m1/300s]]~~~>[Nesting(980/1926764753)/[Marker m1/300s]]
[VM 2/1376400422]~~~>[Nesting(155/1815546035)/[Marker m4/60s]]
[VM 3/1619356001]
[VM 4/802771878]~~~>[Nesting(111/548795052)/[Marker m3/180s]]
The instances are different (to read the log: [Nesting(id/hashcode)]), but they have the same id, so they are the same entity in the end. If I understand well, Optaplanner clones the solution whenever it finds a best one, but I don't know why it mixes instances like that.
Is there anything wrong in my code? Is it a normal behavior?
Thank you in advance!
Duplicate MarkerNesting instances that you didn't create, have the same content, but a different memory address, so are != from each other: that means something went wrong in the default solution cloner, which is based on reflection. It's been a while since anyone ran into an issue there. See the docs section on "planning clone". The complex model of chained variables (which will be improved) doesn't help here at all.
Sometimes a well placed @DeepPlanningClone fixes it, but in this case it might as well be due to the @InverseRelationShadowVariable not being picked.
In any case, those system.out's in the setter method are misleading - they can happen both by the solution cloner as well as by the moves, so without the solution hash (= memory address), they tell nothing. Try doing a similar system.out in either your best solution change events, or in the BestSolutionRecaller call to cloneWorkingSolution(), for both the original as well as the clone.
As expected, I was doing something wrong: in Schedule (the PlanningSolution), I had a getter for a collection of VirtualMachine, which was calculated from another field (pools: each Pool holds VirtualMachines). As a result, there was no setter, and the solution cloner was probably not able to clone the solution properly (maybe because pools is not annotated as a problem fact or a planning entity?).
To fix the problem, I removed the Pool class (not really needed), leaving a collection of VirtualMachines in Schedule.
To sum up, never introduce too many classes before you need them ^_^'
I pushed the correct version of my code on github.

Java call stack inspection and manipulation

My question is: is it possible (in ANY way) to analyze and modify call stack (both content of frames and stack content) in runtime?
I'm looking for any possibility - low-level, unsafe or internal API, possibility to write C extension, etc. Only constraint: it should be usable in standard runtime, without debugging or profiling mode. This is the point where I'm doing research "is it possible at all?", not "is it good idea?".
I'd like to gather all local data from a frame, store it somewhere, and then remove that frame from stack, with possibility of restoring it later. Effectively that gives us continuations in JVM, and they will allow fast async frameworks (like gevents from python) and generator constructs (like those from python) to come up.
This may look like a repeated question, but I've only found questions that were answered with "use Thread.currentThread().getStackTrace()" or "that should be done with debugging tools". There was a similar question to mine, but it was only answered in the context of what the asker wanted to do (work on async computations), while I need a more general (Java-stack oriented) answer. This question is similar too, but as before, it is focused on parallelization, and the answers are focused on that too.
I repeat: this is a research step in the process of coming up with a new language feature proposal. I don't wanna risk corrupting anything in the JVM - I'm looking for a possibility, then I'm gonna analyse possible risks and look out for them. I know that manipulating the stack by hand is ugly, but so is creating instances while omitting the constructor - and that is the basis for Objenesis. Dirty hacks may be dirty, but they may help introduce something cool.
PS. I know that Quasar and Lightwolf exist, but, as above, those are concurrency-focused frameworks.
EDIT
Little clarification: I'm looking for something that will be compatible with future JVM and libraries versions. Preferably we're talking about something that is considered stable public API, but if the solution lies in something internal, yet almost standard or becoming standard after being internal (like sun.misc.Unsafe) - that will do too. If it is doable by C-extension using only C JVM API - that's ok. If that is doable with bytecode manipulation - that's ok too (I think that MAY be possible with ASM).
I think there is a way achieving what you want using JVMTI.
Although you cannot directly do what you want (as stated in a comment above), you may instrument/redefine methods (or entire classes) at run time. So you could just define every method to call another method directly to "restore execution context" and as soon as you have the stack you want, redefine them with your original code.
For example: let's say you want to restore a stack where just A called B and B called C.
When A is loaded, change the code to directly call B. As soon as B is loaded, redefine it to directly call C; Call the topmost method (A); As soon as C gets called (which should be very fast now), redefine A and B to their original code.
If there are multiple threads involved and parameter values that must be restored, it gets a little more complicated, but still doable with JVMTI. However, this would then be worth another question ;-).
Hope this helps. Feel free to contact me or comment if you need clarification on anything.
EDIT:
Although I think it IS doable, I also think this is a lot (!!!) of work, especially when you want to restore parameters, local variables, and calling contexts (like this pointers, held locks, ...).
EDIT as requested: Assume the same stack as above (A calling B calling C). Although A, B, and C have arbitrary code inside them, just redefine them like this:
void A() { B(); }
void B() { C(); }
void C() { redefine(); }
As soon as you reach the redefine method, redefine all classes with their original code. Then you have the stack you want.
Not sure about this tool, but you can check http://en.wikipedia.org/wiki/GNU_Debugger.
GDB offers extensive facilities for tracing and altering the execution of computer programs. The user can monitor and modify the values of programs' internal variables, and even call functions independently of the program's normal behavior.
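For the inspection half of the question only: since Java 9 the standard library has java.lang.StackWalker, which is a stable public API for reading the current call stack. A minimal sketch follows (the class and method names are made up for illustration). It is read-only: it does not expose local variables through its public API, and it cannot remove or restore frames, so on its own it does not give you continuations.

```java
import java.util.List;
import java.util.stream.Collectors;

class StackInspection {
    // Collects the method names on the current thread's stack,
    // starting from the method that called walk() and moving outward.
    static List<String> currentMethodNames() {
        return StackWalker.getInstance().walk(frames ->
                frames.map(StackWalker.StackFrame::getMethodName)
                      .collect(Collectors.toList()));
    }

    // A small call chain, mirroring the "A calls B calls C" example.
    static List<String> a() { return b(); }
    static List<String> b() { return c(); }
    static List<String> c() { return currentMethodNames(); }
}
```

Calling StackInspection.a() returns a list in which c appears before b, and b before a, followed by whatever called a.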

Creating Objects on the stack memory in java ?

This is just a simple theoretical question out of curiosity. I have always been something of a Java fanboy. But one thing makes me wonder: why does Java not provide a mechanism for creating objects on the stack? Wouldn't it be more efficient if I could just create a small Point(int x, int y) object on the stack instead of the heap, like creating a struct in C#? Is there any special security reason behind this restriction in Java? :)
The strategy here is that instead of leaking this decision into the language, Java lets the JVM/Hotspot/JIT/runtime decide where and how it wants to allocate memory.
There is research going on to use "escape analysis" to figure out which objects don't actually need to go onto the heap and stack-allocate them instead. I am not sure if this has made it into a mainstream JVM already. But if it does, it will be controlled by the runtime (think -XX:something), not the developer.
The upside of this is that even old code can benefit from these future enhancements without itself being updated.
If you like to manually manage this (but still have the compiler check that it stays "safe"), take a look at Rust.
This will tentatively be coming to Java, there is no real ETA set for this so you could only hope it will come by Java 10.
The proposal is called Value Types and you can follow it in the mailing list of Project Valhalla.
I do not know if there were any prior reasons as to why it wasn't in the language in the first place, maybe originally it was thought of as unneeded or there was simply no time to implement this.
A common problem would be to initialize some global reference with an object created on the stack. When the method which created the object exits what do you point to?
That being said, objects can effectively be created on the stack in Java; it's just done behind your back using escape analysis, which makes sure the above scenario doesn't occur.
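To make the "escapes" distinction concrete, here is a small sketch (Point and EscapeDemo are hypothetical names). In the first method the Point never leaves the method, so a JIT with escape analysis is free to skip the heap allocation; in the second it is stored in a static field, which is exactly the "global reference" scenario, so it has to live on the heap. Whether the optimization actually happens is up to the runtime and cannot be observed from the code itself.

```java
// Hypothetical small value-like class, used only for illustration.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class EscapeDemo {
    // The Point is not stored in a field, returned, or handed to code that
    // keeps it, so it does not escape this method.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            total += (long) p.x * p.x + (long) p.y * p.y;
        }
        return total;
    }

    // A reference that outlives any single method call.
    static Point leaked;

    // Here the Point escapes through the static field, so it must stay
    // valid after the method returns.
    static void leak(int x, int y) {
        leaked = new Point(x, y);
    }
}
```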

how can I get the History of an object or trace an Object

I have a requirement: suppose a lot of processing is happening in my application, and at some point an exception occurred due to an object. Now I would like to know the whole history of that object, I mean whatever happened to that object since the application started.
Is peeping into the history of an object possible in any way, using JMX or anything else?
Thanks
In one word: No
With a few more words:
The JVM does not keep any history on any object past its current state, except for very little information related to garbage collection and perhaps some method call metrics needed for the HotSpot optimizer. Doing otherwise would imply a huge processing and memory overhead. There is also the question of granularity; do you log field changes only? Every method call? Every CPU instruction during a method call? The JVM simply takes the easy way out and does none of the above.
You have to isolate the class and/or specific instance of that object and log any operation that you need on your own. You will probably have to do that manually - I have yet to find a bytecode instrumentation library that would allow me to insert logging code at runtime...
Alternatively, you might be able to use an instrumenting profiler, but be prepared for a huge performance drop when doing that.
That's not possible with standard Java (or any other programming language I'm aware of). You should add sufficient logging to your application, which will allow you to get some idea of what's happened. Also, learn to use your IDE's debugger if you don't already know how.
I generally agree with @thkala and @artbristol (+1 for both).
But you have a requirement and have no choice: you need a solution.
I'd recommend you to try to wrap your objects with dynamic proxies that perform auditing, i.e. write all changes that happen to object.
You can probably use AspectJ for this. The aspect will note which method was called and which parameters were passed. You can also use other, lower-level tools, e.g. Javassist or CGLIB.
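A rough sketch of the dynamic-proxy approach using only java.lang.reflect.Proxy from the JDK. The Account interface, SimpleAccount, and Audit.audited are invented for the example; the technique only works for objects you access through an interface, and it records method calls, not direct field writes.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.List;

// Hypothetical interface for the object whose history we want to record.
interface Account {
    void deposit(int amount);
    int balance();
}

class SimpleAccount implements Account {
    private int balance;

    public void deposit(int amount) { balance += amount; }
    public int balance() { return balance; }
}

class Audit {
    // Wraps target in a proxy that appends "method[args]" to log
    // before delegating the call to the real object.
    static <T> T audited(T target, Class<T> iface, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            // args is null for methods without parameters.
            Object[] shown = (args == null) ? new Object[0] : args;
            log.add(method.getName() + Arrays.toString(shown));
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                // Rethrow the target's own exception rather than the wrapper.
                throw e.getCause();
            }
        };
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(proxy);
    }
}
```

Usage would be along the lines of Account acct = Audit.audited(new SimpleAccount(), Account.class, log); after which every call through acct leaves an entry in log that you can dump when the exception shows up.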
The answer is no. The JVM doesn't maintain the history of an object's state. The most you can do is keep track of your object's states yourself somewhere in memory; when you get the exception, you can serialize that in-memory record and then, I think, do the analysis.

Java or C++ for my particular agent-based model (ABM)?

I unfortunately need to develop an agent-based model. My background is C++; I'm decent but not a professional programmer. My goal is to determine whether, my background aside for the moment, the following kind of algorithm would be faster or dramatically easier to write in C++ or Java.
My agents will be of class Host. Their private member variables include their infection and immune statuses (type int) with respect to different strains. (In C++, I might use an unordered_map or vector to hold this information, depending on the number of strains.) I plan to keep track of all hosts in a vector, vector<Host *> hosts.
The program will need to know at any time all the particular hosts infected with a particular strain or with immunity to a particular strain. For each strain, I could thus maintain two separate structures, e.g., vector<Host *> immune and vector<Host *> infectious (I might make each two-dimensional, indexed by strain and then host).
Hosts can die. It seems like this creates a mess in C++, in that I would have to find the right individual to kill in host and search through the other structures (immune and infectious) to find all pointers to this object. I'm under the impression that Java will delete all these pointers implicitly if I delete the underlying object. Is this true? Is there a dramatically better way to do this in C++ than what I have here?
Thanks in advance for any help.
I should add that if I use C++, I will use smart pointers. That said, I still don't see a slick way to delete all pointers to an object when the object needs to go. (When a host dies, I want to delete it from memory.)
I realize there's a lot to learn in Java. I'm hoping someone with more perspective on the differences between the languages, and who can understand what I need to do (above), can tell me if one language will obviously be more efficient than another.
I'm under the impression that Java will delete all these pointers implicitly if I delete the underlying object. Is this true?
Nope. You actually have it backwards; if you delete all the pointers, Java will delete the underlying object. So you'll still need to search through all three of your data structures (hosts, immune, and infectious) to kill that particular host.
However, this "search" will be fast and simple if you use the right data structures; a HashSet will do the job very nicely.
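// Requires: import java.util.HashSet;
// Initialized at declaration so the sets are usable right away.
private final HashSet<Host> hosts = new HashSet<>();
private final HashSet<Host> immune = new HashSet<>();
private final HashSet<Host> infectious = new HashSet<>();

// Remove the host from every collection that still references it.
public void killHost(Host deadManWalking) {
    hosts.remove(deadManWalking);
    immune.remove(deadManWalking);
    infectious.remove(deadManWalking);
}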
It's really that simple, and each removal takes O(1) expected time, since HashSet is backed by a hash table. (The default equals and hashCode inherited from Object compare by identity, which is what you want here; you only need to override them in Host if two distinct instances should count as the same host.)
My memories of C++ are too hazy for me to give any sort of authoritative comparison between the two languages; I did a ton of C++ work in college, haven't touched it since. Will C++ code run faster? Done right and assuming you don't have any memory leaks, I'd suspect it would, though Java's rep as a slow language is mostly a holdover from its youth; it's pretty decent these days. Easier to write? Well, given that you'd be learning the language, probably not. But the learning curve from C++ to Java is pretty gentle, and I personally don't miss C++ at all. Once you know the languages, Java is, in my opinion, vastly easier to work with. YMMV, natch, but it may well be worth the effort for you.
I can't answer all your questions, but
I'm under the impression that Java will delete all these pointers implicitly if I delete the underlying object.
In Java you don't delete an object; instead, it becomes eligible for garbage collection once it is no longer reachable through any strong reference (the JVM traces reachability rather than counting references). However, you may want to utilize weak references here; this way the object disappears once the last strong reference to it is gone.
Actually, your impression is basically backwards: Java will assume an object (the host in this case) is dead when there's no longer any pointer to give access to that object. At that point it'll clean up the object (automatically).
At a guess, however, there's one collection that "owns" the hosts, and would be responsible for deleting a host when it dies. The other pointers to the host don't own it. If that's the case, then in C++ you'd normally handle this by having the "owning" collection contain a shared_ptr to the host, and the other collections contain weak_ptrs to the host. To use the object via a weak_ptr, you have to first convert that to a shared_ptr that you can dereference to get to the host itself. If, however, the object has been deleted, the attempt at converting the weak_ptr to a shared_ptr will fail, and you'll know the host is dead (and you can then delete your reference to it).
