I'm trying to learn about low-level Java operations with sun.misc.Unsafe. I was reading this article; however, my question relates to the advantages of using Unsafe.
In this example, Player p = (Player) unsafe.allocateInstance(Player.class);, where is the object created? On the JVM heap, or in non-heap direct memory?
Are all the operations explained in the article non-heap allocations? I ask because when you use the new keyword, it is supposed to create an instance on the heap. If allocateInstance does the same, then what is the actual advantage? It doesn't bypass the GC.
The method Unsafe#allocateInstance(Class<?>) allocates memory only on the heap, without running the initialization phase. The linked page describes how this method skips initialization; you can also review the OpenJDK thread where this question is discussed.
You can allocate non-heap memory using Unsafe#allocateMemory(long). You can also review an example of its usage in the DirectByteBuffer class, which is created by ByteBuffer#allocateDirect(int).
If you want to work with non-heap memory, consider using ByteBuffer#allocateDirect. But the actual advantage of non-heap memory is doubtful; you should run performance benchmarks to be sure you gain anything. Also consider using the -XX:MaxDirectMemorySize parameter to encourage native memory reuse if you use direct buffers.
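As a sketch of what skipping the initialization phase means in practice (the Player class with its level field and the demo class name are made up for illustration; theUnsafe is fetched via reflection because the constructor is private):

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

class Player {
    int level = 42; // field initializers run only as part of the constructor
}

public class AllocateDemo {
    public static void main(String[] args) throws Exception {
        // Obtain the Unsafe singleton via reflection
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        Player viaNew = new Player();
        Player viaUnsafe = (Player) unsafe.allocateInstance(Player.class);

        System.out.println(viaNew.level);     // 42
        System.out.println(viaUnsafe.level);  // 0: constructor and initializers skipped
    }
}
```

Both objects live on the ordinary GC-managed heap; the only difference is that the second one was never constructed.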
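A minimal sketch of Unsafe#allocateMemory for off-heap storage (the demo class name is illustrative); note the explicit freeMemory, since the GC never touches this block:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class OffHeapDemo {
    public static void main(String[] args) throws Exception {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        // 8 bytes of native (off-heap) memory: invisible to the GC,
        // so we must release it ourselves.
        long address = unsafe.allocateMemory(8);
        try {
            unsafe.putLong(address, 12345L);
            System.out.println(unsafe.getLong(address)); // 12345
        } finally {
            unsafe.freeMemory(address); // no GC will do this for us
        }
    }
}
```

This is exactly the kind of bookkeeping that DirectByteBuffer wraps for you, which is why allocateDirect is usually the safer choice.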
Related
As per the documentation of JNA's Memory class, the finalize() method is called to release memory that no longer has any reference. But the JNA example states that a Memory object is released when it goes out of scope.
// note: like Memory, StringArray will free the contiguous block of memory it copied the Strings into when the instance goes out of scope
The questions are:
Does this mean that the Memory class calls finalize() internally when the object goes out of scope, freeing the underlying native memory?
Are the StringArray class and the Memory class the same in behavior w.r.t. memory management? And how?
Update:
As of JNA 5.12.1, JNA's Memory class no longer uses finalize() to free memory. It registers a Cleaner (a custom internal class based on the JDK9+ Cleaner implementation) which releases the native memory on a separate thread.
Along with this change, Memory was made Closeable (JDK6), which extends AutoCloseable in JDK7+ implementations. You may release the native memory simply by calling close(), or better, by allocating the memory in a try-with-resources block:
try (Memory m = new Memory(123)) {
    // use m
}
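The Cleaner-based pattern described above can be sketched with the plain JDK9+ java.lang.ref.Cleaner (an illustrative stand-in, not JNA's actual internal class; the AtomicBoolean simulates the native free):

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

public class NativeBlock implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // Stand-in for the native free; in JNA this would release the
    // malloc'd block behind the Memory object.
    static final AtomicBoolean freed = new AtomicBoolean(false);

    private final Cleaner.Cleanable cleanable;

    public NativeBlock() {
        // The cleanup action must not capture `this`, or the object
        // could never become unreachable.
        cleanable = CLEANER.register(this, () -> freed.set(true));
    }

    @Override
    public void close() {
        cleanable.clean(); // deterministic release; runs the action at most once
    }

    public static void main(String[] args) {
        try (NativeBlock b = new NativeBlock()) {
            // use the block
        }
        System.out.println(freed.get()); // true
    }
}
```

If close() is never called, the Cleaner's own thread runs the same action after the object becomes unreachable, so the memory is freed either way.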
Original answer:
To elaborate on Matthias Bläsing's answer in light of the specific questions asked, I want to add a few points:
You generally don't directly call finalize(). It is called by the JVM as part of the garbage collection process.
In Memory, the finalize() method simply calls the dispose() method. If you really want to get rid of the memory immediately, dispose() would be the preferred method to call. But dispose() is protected, so you'd need to extend Memory to take advantage of this method if you really felt the need to clean up native memory allocations.
One such subclass you might consider is one that implements Closeable, where the close() implementation calls dispose() from the superclass. Then you could, for example, use a try-with-resources block and have the native memory (the resource) cleaned up at the end of the block. You'd still have the Java object hanging around until GC, of course.
Note that freeing native memory comes with a processing cost, and unless you're really short on memory it doesn't gain you much, since you still have the Java heap memory associated with the object until it is GC'd. If you're that short of memory and going to that level of detail to control the timing of native allocation cleanup, you probably want to call malloc() and free() yourself and manage the memory at a higher level, perhaps recycling/reusing it.
You also asked about StringArray, but the closer parallel to Memory is the NativeString objects which are members of the array. In fact, their internal implementation is a StringMemory object which extends Memory, so it behaves identically; that is, it free()s the native memory via dispose() via finalize() at the point the NativeString is garbage collected by the JVM.
The Memory implementation of JNA relies on Java garbage collection (GC). Java has no functions to explicitly release the memory of objects; the memory needed to hold object data is managed by the VM and allocated when an object is instantiated.
The GC is the process that frees all memory that is no longer referenced. Every Java class can declare a finalize method, which the GC calls when objects of that class are about to be cleared, giving them the option to do some final cleanup work. In the case of JNA, this cleanup releases the native memory that was allocated outside the GC-controlled area.
Note that finalize is deprecated and should no longer be used, but switching to alternative methods also means introducing slightly different behavior, which is one of the reasons JNA still relies on the GC for cleanup.
Direct memory was introduced in Java 1.4. The new I/O (NIO) classes introduced a new way of performing I/O based on channels and buffers. NIO added support for direct ByteBuffers, which can be passed directly to native I/O rather than living on the Java heap, making them significantly faster in some scenarios because they avoid copying data between the Java heap and the native heap.
I never understood why we use direct memory. Can someone give an example?
All system calls, such as reading and writing sockets and files, use only native memory; they can't use the heap. This means that while you can copy between native memory and the heap, avoiding that copy can improve efficiency.
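A small sketch of that point: reading a file straight into a direct buffer, so the OS-level read can target that memory without an intermediate heap array (the file contents and demo class name are illustrative):

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DirectReadDemo {
    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("direct", ".bin");
        Files.write(tmp, new byte[] {1, 2, 3, 4});

        // A direct buffer lets the OS-level read target this memory
        // directly, avoiding a copy through a heap-side array.
        ByteBuffer buf = ByteBuffer.allocateDirect(4);
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
            ch.read(buf);
        }
        buf.flip();
        System.out.println(buf.get(3)); // 4: last byte of the file
        Files.delete(tmp);
    }
}
```

With a heap ByteBuffer the JVM would typically stage the data through a temporary native buffer first, which is exactly the copy being avoided here.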
We use off-heap/native memory for storing most of our data, which has a number of advantages:
it can be larger than the heap size.
it can be larger than main memory.
it can be shared between JVMs, i.e. one copy for multiple JVMs.
it can be persisted and retained across restarts of the JVM or even the machine.
it has little to no impact on GC pause times.
depending on usage, it can be faster.
The reason it is not used more is that it is harder to make it both efficient and work like normal Java objects. For this reason, we have libraries such as Chronicle Map which act as a ConcurrentMap but using off-heap memory, and Chronicle Queue which is a journal, logger and persisted IPC between processes.
The JVM relies on garbage collection to reclaim memory that is no longer used. This allows developers in JVM languages (e.g., Java, Scala, etc.) not to worry about memory allocation and deallocation: you simply ask for memory and let the JVM decide when it will be reclaimed, or garbage collected.
While this is extremely convenient, it comes with the added overhead of a separate thread consuming CPU and constantly walking the JVM heap, reclaiming objects that are no longer reachable. There are entire books written about the topic, but if you want to read a bit more about JVM garbage collection, there are plenty of references out there; this one is decent: https://dzone.com/articles/understanding-the-java-memory-model-and-the-garbag
Anyway, if you know your app is going to do massive amounts of copying and updating of objects and values, you can elect to handle those objects and their memory consumption yourself. Then, regardless of how much churn there is in those objects, they will never be moved around in the heap and never be garbage collected, and thus won't impact garbage collection in the JVM. There's a bit more detail in this answer: https://stackoverflow.com/a/6091680/236528
From the Official Javadoc:
Direct vs. non-direct buffers
A byte buffer is either direct or non-direct. Given a direct byte buffer, the Java virtual machine will make a best effort to perform native I/O operations directly upon it. That is, it will attempt to avoid copying the buffer's content to (or from) an intermediate buffer before (or after) each invocation of one of the underlying operating system's native I/O operations.
A direct byte buffer may be created by invoking the allocateDirect factory method of this class. The buffers returned by this method typically have somewhat higher allocation and deallocation costs than non-direct buffers. The contents of direct buffers may reside outside of the normal garbage-collected heap, and so their impact upon the memory footprint of an application might not be obvious. It is therefore recommended that direct buffers be allocated primarily for large, long-lived buffers that are subject to the underlying system's native I/O operations. In general it is best to allocate direct buffers only when they yield a measureable gain in program performance.
https://download.java.net/java/early_access/jdk11/docs/api/java.base/java/nio/ByteBuffer.html
I am using a Java library that uses JNA to bind to the original C library (that library is called Leptonica). I encountered a situation where free(data) has to be called in the C code to free the memory. But is there any function in Java that I can use to free that memory?
In the C code
void ImageData::SetPixInternal(Pix* pix, GenericVector<char>* image_data) {
  l_uint8* data;
  size_t size;
  pixWriteMem(&data, &size, pix, IFF_PNG);
  pixDestroy(&pix);
  image_data->init_to_size(size, 0);
  memcpy(&(*image_data)[0], data, size);
  free(data);
}
The function pixWriteMem() allocates memory for data, which must later be released with free(data).
In Java code I can only access pixWriteMem(), not SetPixInternal(), so I have no way to free data, which creates a memory leak.
The other comments and answers here all seem to suggest that you just rely on the garbage collector or tell it to run. That is not the correct answer for memory allocated in C and used from Java via JNI.
It looks like execution() does free the memory; the last line you show us is free(data). Still, to answer the question as you asked it: the answer is "not directly". If you have the ability to add to the C code, you could create another C function that frees the data and then call that using JNI. Perhaps there is more that we are not seeing which relates to your concern about the memory leak?
Also, be careful about freeing memory allocated by a library you are using. You should make sure that the library doesn't still need it, and that it really is leaking, before you try to free it.
And now back to memory management in general...
Java is indeed a garbage-collected language. This means that you do not explicitly delete objects; instead, you make sure there are no references to them, and the garbage collector takes care of the memory management. This does not mean Java is free from memory leaks, as there are ways to accidentally keep a reference around so that an object never gets garbage collected. If you have a situation like this, you might want to read up on the different kinds of references in Java (strong/weak/etc.).
Again, this is not the problem here. This is a C/Java hybrid, and the code in question is in C being called by Java. In C, you allocate the memory you want to use and then you need to free the memory yourself when you are done with it. Even if the C code is being run by Java via the JNI, you are still responsible for your own memory. You cannot just malloc() a bunch of memory and expect the Java garbage collector to know when to clean it up. Hence the OP's question.
If you need to add the functionality to do a free yourself, even without the source code for the C part, you might still be able to write your own C interface for freeing the memory, provided you have access to the pointer to the memory in question. You could write a tiny library that just frees the memory, create the JNI interface for it, and pass the pointer to that. If you go this route then, depending on your OS, you may need to guarantee that your tiny free library's native code runs in the same process as the rest of the native code, or at least that the process you run it from has write access to the memory owned by the other code's process. This memory/process issue is probably not relevant in your case, but I'm throwing it out there for completeness.
In Java code, I can only access createData(), not the excution(), so I have no way to free up the "data", which create a memory leak.
Then it sucks to be you.
Seriously: if you want to free memory allocated by a native method and not freed before that method returns, then you need to keep a handle of some kind on that memory and later pass it to another native method that frees it. If you do not presently have such a native method available, you'll need to create one.
The other question is how to ensure that the needed native method is invoked. Relying on users to invoke it, directly or indirectly, leaves you open to memory leaks should users fail to do so. There are two main ways to solve that problem:
Give your class a finalizer that ensures the memory is freed. This is the core use case for finalizers, but even so, there are good reasons to prefer to avoid writing them. The other alternative is to
Create a reference object (SoftReference, WeakReference, or PhantomReference), associate the reference with a mechanism for freeing the native-allocated memory belonging to the referenced Java object (but not via that object), and register that object with a reference queue. The reference will be enqueued when the object is GC'd, at which point you know to free the native-allocated memory.
That does not necessarily mean that you should prevent users from explicitly freeing the memory, for with enough bookkeeping you can track whether anything still needs to be freed at any given time. Allowing users to release resources explicitly may help keep your overall resource usage lower. But if you want to avoid memory leaks then you need to have a fallback.
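The reference-queue alternative can be sketched like this (NativeHandle and the handle value are hypothetical; a real implementation would pass the dequeued handle to the native free function):

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

public class PhantomDemo {
    // Holds the native pointer alongside the phantom reference, so the
    // memory can be freed without ever resurrecting the referent.
    static class NativeHandle extends PhantomReference<Object> {
        final long handle;
        NativeHandle(Object referent, ReferenceQueue<Object> q, long handle) {
            super(referent, q);
            this.handle = handle;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object owner = new Object();
        NativeHandle ref = new NativeHandle(owner, queue, 0x1234L);

        owner = null; // drop the only strong reference
        NativeHandle collected = null;
        for (int i = 0; i < 100 && collected == null; i++) {
            System.gc(); // request collection; the GC enqueues the phantom ref
            collected = (NativeHandle) queue.remove(50);
        }
        if (collected != null) {
            // Here we would pass collected.handle to the native free function.
            System.out.println("free 0x" + Long.toHexString(collected.handle));
        }
    }
}
```

Note that the NativeHandle stores only the long handle, never a reference to the Java object itself, which is what makes freeing "not via that object" possible.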
No, there is no function like C's free() in Java, but you can suggest that the garbage collector run by calling System.gc().
Project: Java, JNI (C++), Android.
I'm going to manage a native C++ object's lifetime by creating a managed wrapper class, which will hold a pointer to the native object (as a long member) and delete the native object in its overridden finalize() method. See this question for details.
The C++ object does not consume other types of resources, only memory. The memory footprint of the object is not extremely high, but it is substantially more than the 64 bits of a long in Java. Is there any way to tell Java's GC that my wrapper is responsible for more than just a long value, and that it's not a good idea to create millions of such objects before running garbage collection? In .NET there is the GC's AddMemoryPressure() method, which exists for exactly this purpose. Is there an equivalent in Java?
After some more googling, I found a good article from the IBM Research Center.
Briefly, they recommend using the Java heap instead of the native heap for native objects. This way the memory pressure on the JVM garbage collector realistically reflects the native objects referenced from Java code through handles.
To achieve this, one overrides the default C++ heap allocation and deallocation functions, operator new and operator delete. In operator new, if the JVM is available (JNI_OnLoad has already been called), one calls NewByteArray and GetByteArrayElements, which yield the allocated memory needed. To protect the created byte array from being garbage collected, one also needs to create a NewGlobalRef to it and store it, e.g., in the same allocated memory block. In this case one must allocate as much memory as requested, plus room for the reference. In operator delete, one calls DeleteGlobalRef and ReleaseByteArrayElements. If the JVM is not available, one uses the native malloc and free functions instead.
I believe that native memory is allocated outside the scope of the Java heap. That means you don't have to worry about your allocations taking memory away from the amount you reserved using -Xmx<size>.
That being said, you could use ByteBuffer.allocateDirect() to allocate a buffer and GetDirectBufferAddress to access it from your native code. You can control the size of the direct memory heap using -XX:MaxDirectMemorySize=<size>
I am a C++ programmer currently trying to work in Java. Working in C++, I have a habit of keeping track of dynamic memory allocations and employing techniques like RAII to avoid memory leaks. Java, as we know, provides a garbage collector (GC) to take care of unused memory. So while programming in Java, should one just let go of all worries about heap memory and leave it to the GC, or should one take an approach similar to languages without GC, managing the memory one allocates and letting the GC catch whatever one misses? What should the approach be, and what are the downsides of either?
I'm not sure what you mean by trying to take care of the memory you allocate in the presence of a GC, but I'll try some mind reading.
Basically, you shouldn't "worry" about your memory being collected. If you don't reference objects anymore, they will be picked up. Logical memory leaks are still possible if you create a situation where objects stay referenced for the rest of your program (for example, registering listeners and never un-registering them, or implementing a vector-like collection that doesn't set slots to null when items are removed from the end).
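A sketch of such a logical leak: a listener registry that is never cleaned keeps every listener strongly reachable, so the GC can never reclaim them (the names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class LeakDemo {
    // A registry that is never cleaned: everything added here stays
    // strongly reachable for the life of the program, so the GC can
    // never reclaim it -- a "logical" leak despite garbage collection.
    static final List<Runnable> LISTENERS = new ArrayList<>();

    static void register(Runnable r) {
        LISTENERS.add(r);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            register(() -> {}); // registered but never un-registered
        }
        System.out.println(LISTENERS.size()); // 1000 objects pinned forever
    }
}
```

The fix is the mirror operation the text describes: provide and call an unregister method, or hold listeners through weak references.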
However, if you have a strong RAII background, you'll be disappointed to learn that there is no direct equivalent in Java. The GC is a first-class tool for dealing with memory, but there is no guarantee of when (or even whether) finalizers are called. This means that the first-class treatment applied to memory is not applied to any other resource, such as windows, database connections, sockets, files, or synchronization primitives.
With Java and .NET (and, I imagine, most other GC'ed languages), you don't need to worry much about heap memory at all. What you do need to worry about are native resources such as file handles, sockets, and GUI primitives like fonts. Those generally need to be "disposed", which releases the native resources. (They often dispose of themselves on finalization anyway, but it's iffy to fall back on that; dispose of things yourself.)
With a GC, you have to:
still take care to properly release non-memory resources like file handles and DB connections.
make sure you don't keep references to objects you no longer need (for example, by leaving them in collections); if you do, you have a memory leak.
Apart from that, you can't really "take care of the memory you allocate", and trying to do so would be a waste of time.
Technically you don't need to worry about cleaning up after memory allocations, since the GC tracks all live objects and will take care of everything. In practice, an overactive GC can negatively impact performance, so while Java has no delete operator, you may do well to reuse objects where allocation churn is significant.
Also, Java does not have destructors, since objects exist until the GC gets to them. Java therefore has the finally construct, which you should use to ensure that all non-memory resources (files, sockets, etc.) are closed when you are finished with them. Do not rely on the finalize method to do this for you.
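The finally-based cleanup recommended above is most conveniently written as try-with-resources (JDK7+); a sketch with an in-memory stream standing in for a file or socket:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class ResourceDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // try-with-resources guarantees close() runs even if an
        // exception is thrown -- the reliable replacement for cleanup
        // logic in finalize().
        try (Writer w = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            w.write("done");
        }
        System.out.println(sink.toString("UTF-8")); // done
    }
}
```

The same shape works for any AutoCloseable: FileInputStream, Socket, JDBC Connection, and so on.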
In Java the GC takes care of allocating memory and freeing unused memory. This does not mean you can disregard the issue altogether.
The Java GC frees objects that are not reachable from the roots. This means Java can still have memory leaks if you are not careful to remove references from global contexts, such as caches in global HashMaps.
If a cluster of objects that reference each other is not reachable from the roots, the Java GC will free it. I.e., it does not work with reference counts, so you do not need to null out every object reference (although some coding styles do prefer clearing references as soon as they are no longer needed).