JNI, Garbage collection and Pointers- Java/C++ who should do what?

C++ has pointers. If we allocate some memory in C++ and pass it to Java as an object reference (using JNI), who should free it, and who actually will?
Will it be that:
1.) The garbage collector frees it automatically in Java?
2.) We need to explicitly delete the pointer in the finalize method of the wrapping JNI class?
3.) We should forget finalize (as finalizers cannot be trusted), and it is Java's responsibility to call C++ code that deletes the object?
4.) There is some way to deallocate the memory directly in Java itself (I am not sure how Java interprets a C++ pointer in order to delete it)?
What is the best practice for this, and for the reverse case (when we pass objects from Java to C++)?

C++ has pointers. If we allocate some memory in C++ and pass it to Java as an object reference (using JNI), who should free it, and who actually will?
The best strategy is usually to have the allocator also be the one to free the data.
1.) The garbage collector frees it automatically in Java?
The problem with this is that you don't know when, if ever, it will run.
2.) We need to explicitly delete the pointer in the finalize method of the wrapping JNI class?
Better to have a release() method in Java rather than imply that C++ has to delete it. You may want C++ to recycle the memory.
3.) We should forget finalize (as finalizers cannot be trusted), and it is Java's responsibility to call C++ code that deletes the object?
If you mean allocating the memory in Java and passing it to C++ to populate, this is my preference.
I would use ByteBuffer.allocateDirect(), and you can call ((DirectBuffer) buffer).cleaner().clean(); to clean it up deterministically. (DirectBuffer here is the internal sun.nio.ch.DirectBuffer interface, so whether this call is accessible depends on the JDK in use.)
This can make recycling the memory simpler; possibly the same buffer can be used for the life of the application.
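As a rough sketch of the "allocate in Java, let native code populate it" idea: the class below allocates one direct buffer and reuses it for every call. The class name and the fill() method are made up for illustration; fill() is a pure-Java stand-in for whatever native method would really write into the buffer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ReusableDirectBuffer {
    // One off-heap buffer, allocated once and reused for the life of the application.
    private static final ByteBuffer BUFFER =
            ByteBuffer.allocateDirect(1024).order(ByteOrder.nativeOrder());

    static ByteBuffer buffer() {
        return BUFFER;
    }

    // Hypothetical stand-in for a native method that populates the buffer.
    // Writes `count` ints and returns the number of bytes ready to be read.
    static int fill(ByteBuffer buf, int count) {
        buf.clear();                 // reset position and limit, contents are overwritten
        for (int i = 0; i < count; i++) {
            buf.putInt(i * i);
        }
        buf.flip();                  // switch from writing to reading
        return buf.remaining();
    }

    public static void main(String[] args) {
        int bytes = fill(BUFFER, 4);
        System.out.println(bytes + " bytes ready, direct=" + BUFFER.isDirect());
        // The same buffer is simply refilled on the next call; nothing is reallocated.
        bytes = fill(BUFFER, 8);
        System.out.println(bytes + " bytes ready");
    }
}
```

Because the buffer is never reallocated, there is nothing to free per call; ownership stays on the Java side for the whole run.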

Related

Dynamic memory allocation across programming languages

I have a question regarding dynamic memory allocation.
In C, memory is allocated using the functions malloc(), calloc() and realloc() and deallocated using free().
However, in object-oriented languages like C++, C# and Java, memory is dynamically allocated using the new operator and, in the case of C++, deallocated using the delete operator.
My question is: why do these object-oriented languages use operators instead of functions for dynamic memory allocation? Even with new, a pointer (or reference) to the object is ultimately returned, just as from a function.
Is this done only to simplify the syntax? Or is there a more profound reason?

In C, the memory allocation functions are just that. They allocate memory. Nothing else. And you have to remember to release that memory when done.
In the OO languages (C++, C#, Java, ...), a new operator will allocate memory, but it will also call the object constructor, which is a special method for initializing the object.
As you can see, that is semantically a totally different thing. The new operator is not just simpler syntax, it's actually different from plain memory allocation.
In C++, you still have to remember to release that memory when done.
In C# and Java, that will be handled for you by the Garbage Collector.
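A small Java sketch of that difference, with a made-up Account class: the constructor runs as part of new, so the caller never sees an allocated-but-uninitialized object, and there is no matching free call.

```java
class Account {
    private final String owner;
    private final long balanceCents;

    // Runs as part of every `new Account(...)`: allocation and initialization are one step.
    Account(String owner, long openingCents) {
        if (openingCents < 0) {
            throw new IllegalArgumentException("negative opening balance");
        }
        this.owner = owner;
        this.balanceCents = openingCents;
    }

    String owner() {
        return owner;
    }

    long balanceCents() {
        return balanceCents;
    }

    public static void main(String[] args) {
        Account a = new Account("alice", 500);
        System.out.println(a.owner() + " " + a.balanceCents());
        // No delete/free: the object becomes garbage once nothing references it.
    }
}
```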

I believe it's done solely to simplify the syntax as you've said.
Operators are simply another way to call methods (or functions).
Using "12 + 13" is no different than using Add(12, 13).
A way to see this is via operator overloading in C#, for example:
// Sample from - https://msdn.microsoft.com/en-us/library/8edha89s.aspx
public static Complex operator +(Complex c1, Complex c2)
{
    return new Complex(c1.real + c2.real, c1.imaginary + c2.imaginary);
}
It's a regular method but allows the usage of operators over complex classes.
I'm using the Add operator as an example since I see it as no different than the memory allocation operators such as "new".

The whole point of Object Oriented design/programming is to provide meaningful abstractions.
When you are doing good OO design, you do not (immediately) think about areas in memory. You think about objects that carry state and provide behavior.
Even when writing code in C++, in most cases I don't have to worry about subtleties like "why will my bits be aligned", "how much memory does one of my objects require at runtime" and so on. Of course, these questions are relevant in certain situations; but within OO design, the true value comes from creating useful abstractions that help to solve "whatever domain" problems as precisely, easily and maintainably as possible.
For the "keyword" versus "function" thing: just have a look at Java. The fathers of the language simply didn't want Java programmers to start thinking about "memory pointers". You only deal with objects and references to objects. Thus, the concept of "allocating" memory and getting back a "pointer" simply does not exist here. So how would you then provide this functionality as a library method? Even if you wanted to, you couldn't.
Finally, to a certain degree, this is a matter of "taste/style" by the people designing the language. Sometimes people prefer a small language core; and do everything in libraries; and other people prefer to have "more" things built-in.

The new keyword is indeed there to simplify the syntax, which is pretty suggestive, but it also does more than memory allocation: it invokes the constructor(s) too.
One thing you have said:
C++,C# and Java, memory is dynamically allocated and de-allocated using the new and delete keywords (operators)
For Java and C# there is only the new keyword; there is no delete. I know that in C# you can use using blocks to ensure that a resource is released when the object is no longer used, but this does not necessarily involve memory deallocation, since what it does is call the Dispose method.
One more thing that needs to be pointed out is that the goal of an object-oriented programming language, as GhostCat just said, is to free the programmer from thinking about how memory is allocated in most cases and, more importantly, how objects are released. This is why the garbage collector was introduced.
The main principle is that the higher-level a programming language is, the more it has to abstract away things like memory management and provide easy ways to solve the actual business problems one is facing. Of course, this should be considered when a programming language is chosen for a specific task.

C: malloc and calloc are basically the only ways to allocate memory in C.
malloc: allocates memory of the requested size without initializing it to any value.
calloc: almost the same as malloc, but it also initializes the memory to zero (0).
In both cases, two things are required:
The requested size must be given at allocation time; it can be increased later with realloc.
The allocated memory must be released with free. Forgetting to free it leaks memory, which can eventually end in an out-of-memory error, although an explicit free is quite handy when you are doing a lot of memory-intensive work.
NOTE: A size (the amount of memory to allocate) is required with malloc and calloc. In C++ the returned void* must also be cast; in C the conversion is implicit.
C++: C++ also has malloc and calloc (and free and realloc) alongside new and delete. new and delete can be thought of as the modern way to allocate and free memory, but not every OOP language has both; e.g. Java doesn't have delete.
new uses constructors to initialize the object, so it is pretty useful when working with objects that can be initialized in various ways through default, parameterized or copy constructors.
NOTE: With new you don't have to do the appropriate casting, unlike with malloc and calloc, and there is no need to give a size for a single object. One less thing to get wrong.
delete is used to release the memory. Calling delete on an object also calls its destructor, the last point in the object's life cycle, where you can do farewell tasks such as saving the current state; then the memory is released.
Note: In C# and Java, deallocation is handled by the garbage collector, which does the memory management for you. It uses algorithms such as mark-and-sweep to reclaim an object once it is no longer reachable, for example because no reference variable points to it any more or the last such variable has been set to null.
A memory leak is still possible if a reference variable keeps pointing to an object that is no longer required.
The downside of GC is runtime overhead: collection costs CPU time and can pause the program.
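To make the "leak through a lingering reference" point concrete, here is a minimal sketch with a hypothetical array-backed stack. If pop() did not clear the slot, the array would keep the popped object reachable and the garbage collector could not reclaim it.

```java
import java.util.Arrays;

class LeakFreeStack {
    private Object[] slots = new Object[4];
    private int size;

    void push(Object value) {
        if (size == slots.length) {
            slots = Arrays.copyOf(slots, size * 2);   // grow when full
        }
        slots[size++] = value;
    }

    Object pop() {
        if (size == 0) {
            throw new IllegalStateException("empty stack");
        }
        Object value = slots[--size];
        slots[size] = null;   // drop the stale reference so the GC may reclaim the object
        return value;
    }

    // Exposed only so the example can show what the backing array still references.
    Object rawSlot(int index) {
        return slots[index];
    }

    public static void main(String[] args) {
        LeakFreeStack stack = new LeakFreeStack();
        stack.push(new byte[1024]);
        stack.pop();
        System.out.println("slot 0 after pop: " + stack.rawSlot(0));
    }
}
```

Without the `slots[size] = null` line the program would behave the same from the caller's point of view, which is exactly why this kind of leak is easy to miss.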

Equivalent code to GCHandle.Alloc() in Java?

I'm working on wrapping a C DLL library for Java using JNA. The library provides a C# wrapper. In the constructor of the C# wrapper, an object is created and its memory is pinned by
this.m_object = _CreateObject();
this.m_objectGCH = GCHandle.Alloc(this.m_object, GCHandleType.Pinned);
m_object is an integer pointing to the created object, and the memory of the object is pinned by GCHandle.Alloc(). I can create an object and get the pointer to it with JNA. However, I have no idea how to pin the object's memory in Java.

Java's GC has no awareness of the native memory allocated for your object. If you are responsible for deleting the memory at some future point, you must do so explicitly in your Java code by calling whatever "free" method is recommended by your object allocation.
If you need to ensure that Java does not GC a given Java object, then you need to ensure there is a reference to it until you no longer need it (the easiest way to do so is by storing it in a static (class) variable).
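A minimal sketch of the "keep a reference in a static variable" approach, assuming you want to tie each Java object to the native handle that uses it. KeepAlive and its methods are invented names. Note that this only keeps the object reachable; it does not fix the object's address the way a pinned GCHandle does.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class KeepAlive {
    // Strong references stored in a static map keep the Java objects reachable.
    private static final Map<Long, Object> HELD = new ConcurrentHashMap<>();

    // Call right after the native object is created.
    static void hold(long nativeHandle, Object javaObject) {
        HELD.put(nativeHandle, javaObject);
    }

    // Call once the native side is finished; returns the object that was held, or null.
    static Object release(long nativeHandle) {
        return HELD.remove(nativeHandle);
    }

    static int heldCount() {
        return HELD.size();
    }

    public static void main(String[] args) {
        long handle = 42L;               // pretend this came from the native library
        Object callback = new Object();  // something native code will use later
        hold(handle, callback);
        System.out.println("held: " + heldCount());
        release(handle);
        System.out.println("held: " + heldCount());
    }
}
```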

Is there any way to pass a Java Array to C through JNI without making a copy of it?

I understand that with GetDoubleArrayElements, it is the JVM that decides whether or not to copy the elements of the array. In this case, is there any way to avoid the copy? If not, is there another way to transfer data from Java to C without copying? I'm passing very big arrays, and I wish I could avoid the copy.
Thanks

The JNI guide says:
In JDK/JRE 1.1, programmers can use Get/ReleaseArrayElements functions to obtain a pointer to primitive array elements. If the VM supports pinning, the pointer to the original data is returned; otherwise, a copy is made.
New functions introduced in JDK/JRE 1.3 allow native code to obtain a direct pointer to array elements even if the VM does not support pinning.
These "new functions" are GetPrimitiveArrayCritical and ReleasePrimitiveArrayCritical, which may disable garbage collection while the pointer is held and thus have to be used with care. So in summary it is a VM problem rather than an API problem. Don't forget that without pinning the garbage collector might decide to compact the heap and physically move your array, so the direct pointer would be of little use after all.
As Peter suggested you could work with a java.nio.DoubleBuffer instead of using arrays. The JNI function
void* GetDirectBufferAddress(JNIEnv* env, jobject buf);
allows you to access it.
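On the Java side, a direct DoubleBuffer could be set up roughly like this (DirectDoubles and allocate() are names chosen for this sketch). The native call that would consume the buffer is left out, since it depends on your library.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;

class DirectDoubles {
    // Allocates off-heap storage for `count` doubles in the platform's byte order.
    static DoubleBuffer allocate(int count) {
        return ByteBuffer.allocateDirect(count * Double.BYTES)
                .order(ByteOrder.nativeOrder())
                .asDoubleBuffer();
    }

    public static void main(String[] args) {
        DoubleBuffer values = allocate(3);
        values.put(0, 1.5);
        values.put(1, 2.5);
        values.put(2, 3.5);
        // A native method receiving `values` could read the same memory without a copy.
        System.out.println("direct=" + values.isDirect() + ", capacity=" + values.capacity());
    }
}
```

Using the native byte order matters here: the C side reads the raw bytes as double, so the layout has to match the platform.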

Is there something like malloc/free in java?

I've never seen such statements, though. Do they exist in the Java world at all?

Java's version of malloc is new -- it creates a new object of a specified type.
In Java, memory is managed for you, so you cannot explicitly delete or free an object.

Java has a garbage collector. That's why you never see such statements in your code (which is nice if you ask me).
In computer science, garbage collection (GC) is a form of automatic memory management. It is a special case of resource management, in which the limited resource being managed is memory. The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program. Garbage collection was invented by John McCarthy around 1959 to solve problems in Lisp.

new instead of malloc, garbage collector instead of free.

No direct equivalents exist in Java:
C malloc creates an untyped heap node and returns you a pointer to it that allows you to access the memory however you want.
Java does not have the concept of an untyped object, and does not allow you to access memory directly. The closest that you can get in Java to malloc would be new byte[size], but that returns you a strongly typed object that you can ONLY use as a byte array.
C free releases a heap node.
Java does not allow you to explicitly release objects. Object deallocation in Java is totally in the hands of the garbage collector. In some cases you can influence the behaviour of the GC; e.g. by assigning null to a reference variable and calling System.gc(). However, this does not force the object to be deallocated ... and is a very expensive way to proceed.
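For comparison, a tiny sketch of the closest Java gets to malloc (ByteArrayAllocation is just a name for the example). Unlike malloc, the array comes back zero-filled, which makes it behave more like calloc, and its length is fixed.

```java
class ByteArrayAllocation {
    // Closest analogue to malloc(size): a typed, zero-filled, fixed-length byte array.
    static byte[] allocate(int size) {
        return new byte[size];
    }

    public static void main(String[] args) {
        byte[] block = allocate(16);
        block[0] = 42;
        System.out.println("length=" + block.length + ", first=" + block[0]);
        // There is no free(): dropping the last reference merely makes the array
        // eligible for collection at some later, unspecified time.
        block = null;
    }
}
```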

If you are up to no good (tm) I suppose you can get access to raw memory through the JNI interface. This is where you can call C programs from Java programs. Of course you have to be running in an environment where your program has the privileges to do so (a browser won't normally allow this unless it is suicidal), but you can access objects via C pointers that way.
I sort of wonder where the original question is coming from. At one point long ago I was totally skeptical of the notion that C-style memory management and C-style pointers were not needed, but at this point I am a true believer.

JNI memory management using the Invocation API

When I'm building a java object using JNI methods, in order to pass it in as a parameter to a java method I'm invoking using the JNI invocation API, how do I manage its memory?
Here's what I am working with:
I have a C object that has a destructor method that is more complex than free(). This C object is to be associated with a Java object, and once the application is finished with the Java object, I have no more need for the C object.
I am creating the Java object like so (error checking elided for clarity):
c_object = c_object_create ();
/* JNI class names use '/' as the package separator, not '.' */
class = (*env)->FindClass (env, "my/class/name");
constructor = (*env)->GetMethodID (env, class, "<init>", "(J)V");
instance = (*env)->NewObject (env, class, constructor, (jlong) c_object);
/* object types in a method descriptor are written as L<name>; */
method = (*env)->GetMethodID (env, other_class, "doSomeWork", "(Lmy/class/name;)V");
(*env)->CallVoidMethod (env, other_class, method, instance);
So, now that I'm done with instance, what do I do with it? Ideally, I'd like to leave the garbage collection up to the VM; when it's done with instance it would be fantastic if it also called c_object_destroy() on the pointer I provided to it. Is this possible?
A separate, but related question has to do with the scope of Java entities that I create in a method like this; do I have to manually release, say, class, constructor, or method above? The JNI doc is frustratingly vague (in my judgement) on the subject of proper memory management.

The JNI spec covers the issue of who "owns" Java objects created in JNI methods here. You need to distinguish between local and global references.
When the JVM makes a JNI call out to native code, it sets up a registry to keep track of all objects created during the call. Any object created during the native call (i.e. returned from a JNI interface function) is added to this registry. References to such objects are known as local references. When the native method returns to the JVM, all local references created during the native method call are destroyed. If you're making calls back into the JVM during a native method call, the local references will still be alive when control returns to the native method. If the JVM invoked from native code makes another call back into the native code, a new registry of local references is created, and the same rules apply.
(In fact, you can implement your own JVM executable (i.e. java.exe) using the JNI interface, by creating a JVM (thereby receiving a JNIEnv * pointer), looking up the class given on the command line, and invoking the main() method on it.)
All references returned from JNI interface functions are local. This means that under normal circumstances you do not need to manually deallocate references returned by JNI functions, since they are destroyed when the native method returns to the JVM. Sometimes you still want to destroy them "prematurely" (with DeleteLocalRef()), for example when you have created lots of local references and want to delete them before returning to the JVM.
Global references are created (from local references) by using NewGlobalRef(). They are added to a special registry and have to be deallocated manually (with DeleteGlobalRef()). Global references are only needed for Java objects that the native code has to hold a reference to across multiple JNI calls, for example if you have native code triggering events which should be propagated back to Java. In that case, the JNI code needs to store a reference to the Java object that is to receive the event.
Hope this clarifies the memory management issue a little bit.

There are a couple of strategies for reclaiming native resources (objects, file descriptors, etc.):
1.) Invoke a JNI method during finalize() which frees the resource. Some people recommend against implementing finalize, and basically you can't really be sure that your native resource is ever freed. For resources such as memory this is probably not a problem, but if you have a file, for example, which needs to be flushed at a predictable time, finalize() is probably not a good idea.
2.) Manually invoke a cleanup method. This is useful if you have a point in time where you know that the resource must be cleaned up. I used this method when I had a resource which had to be deallocated before unloading a DLL in the JNI code. In order to allow the DLL to be reloaded later, I had to be sure that the object was really deallocated before attempting to unload the DLL. Using only finalize(), I would not have had this guarantee. This can be combined with (1) to allow the resource to be deallocated either during finalize() or by the manually called cleanup method. (You probably need a canonical map of WeakReferences to track which objects need to have their cleanup method invoked.)
3.) Supposedly PhantomReference can be used to solve this problem as well, but I'm not sure exactly how such a solution would work.
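One way a phantom-reference solution can look is java.lang.ref.Cleaner (Java 9 and later), which runs a registered action once an object has become phantom reachable. The sketch below is not the approach from the answer above, just an illustration: NativeHandle is an invented class, and a counter stands in for the real native create/destroy functions so the example runs without a native library.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

class NativeHandle implements AutoCloseable {
    // Counts live "native" objects; a stand-in for real create/destroy calls.
    static final AtomicInteger LIVE = new AtomicInteger();
    private static final Cleaner CLEANER = Cleaner.create();

    // The cleanup action must not reference the NativeHandle itself,
    // otherwise the handle could never become phantom reachable.
    private static final class Release implements Runnable {
        private final long pointer;

        Release(long pointer) {
            this.pointer = pointer;
        }

        @Override
        public void run() {
            // A real wrapper would pass `pointer` to its native destroy function here.
            LIVE.decrementAndGet();
        }
    }

    private final Cleaner.Cleanable cleanable;

    NativeHandle() {
        long pointer = LIVE.incrementAndGet();   // pretend this is a native address
        this.cleanable = CLEANER.register(this, new Release(pointer));
    }

    // Deterministic release; Cleanable.clean() runs the action at most once.
    @Override
    public void close() {
        cleanable.clean();
    }

    public static void main(String[] args) {
        try (NativeHandle h = new NativeHandle()) {
            System.out.println("live inside block: " + LIVE.get());
        }
        System.out.println("live after close: " + LIVE.get());
    }
}
```

This combines strategies: close() gives the predictable cleanup, while the Cleaner acts as a safety net if close() is never called. The safety net still runs at an unspecified time, so it is no substitute for calling close().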
Actually, I have to disagree with you on the JNI documentation. I find the JNI specification exceptionally clear on most of the important issues, even if the sections on managing local and global references could have been more elaborate.

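Re: "A separate, but related question"... you do not need to manually release jclass, jfieldID and jmethodID when you use them in a "local" context. jfieldID and jmethodID are plain identifiers rather than references, so they never need releasing. A jclass returned by FindClass is a local reference like any other object reference: it is released automatically when the native method returns, and any local reference you want to drop earlier can be released with DeleteLocalRef.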

The GC would collect your instance, but it will not automatically release the non-Java heap memory allocated in the native code. You should have an explicit method in your class to release the c_object instance.
This is one of the cases where I'd recommend using a finalizer that checks whether c_object has been released and, if it has not, releases it and logs a message.
A useful technique is to create a Throwable instance in the Java class constructor and store it in a field (or just initialize the field inline). If the finalizer detects that the object has not been properly disposed of, it prints the stack trace, pinpointing where the object was allocated.
A further suggestion is to avoid doing straight JNI and go with gluegen or Swig (both generate code and can be statically linked).
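The allocation-site trick can be sketched like this. TrackedResource and describeLeak() are invented names, and the hook that would call describeLeak() (the finalizer in the suggestion above) is left out because its timing is not predictable; the example calls it directly instead.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class TrackedResource {
    // Captured at construction time; its stack trace points at the allocating code.
    private final Throwable allocationSite = new Throwable("allocated here");
    private boolean released;

    // A real wrapper would also free the native object here.
    void release() {
        released = true;
    }

    // Returns the allocation stack trace if release() was never called, else null.
    String describeLeak() {
        if (released) {
            return null;
        }
        StringWriter out = new StringWriter();
        allocationSite.printStackTrace(new PrintWriter(out, true));
        return out.toString();
    }

    public static void main(String[] args) {
        TrackedResource forgotten = new TrackedResource();
        System.out.println("leak report:\n" + forgotten.describeLeak());

        TrackedResource tidy = new TrackedResource();
        tidy.release();
        System.out.println("released cleanly: " + (tidy.describeLeak() == null));
    }
}
```

Creating a Throwable per object has a cost, so in practice this kind of tracking is often only switched on while debugging.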
