I am trying to implement a proof-of-concept memory-aware scheduling feature by extending an existing Java program. The program uses buffers in the form of byte[]. For my purposes byte[] is problematic because
they are garbage collected
they are allocated upfront instead of lazily (the JVM seems to touch all pages it has allocated when creating the buffer)
they make the JVM allocate more and more memory which is not given back to the OS.
To achieve my goal I would like buffers to be lazily allocated (pages allocated only when written to) and freeable on demand, similar to how it would happen in C++.
In addition, as much as possible, I would like to minimize the changes to the existing code-base.
I looked at nio.ByteBuffer and at the Unsafe classes. Neither fits my case because
java.nio.ByteBuffers don't seem to be lazily allocated. When I allocate an empty 1GB buffer the RSS of the program immediately goes to 1GB.
Memory from Unsafe.allocateMemory is allocated lazily, but I do not know how to reference it as a byte[].
Is there any way to solve this?
Any way to view memory allocated with Unsafe.allocateMemory() as a byte []?
Or change an existing byte [] to point to memory allocated with Unsafe?
Thank you
Java is designed to have separate regions of memory:
on-heap, like byte[], which is designed to be relocatable (the GC can move it) and is zeroed out, so it is not lazy. The reason for this is that the memory is managed.
off-heap, which has the advantage that it can be lazy, but it can't pretend to be a managed data type like byte[].
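A minimal way to see the two regions side by side (the class name is mine; this assumes a standard HotSpot JVM):

```java
import java.nio.ByteBuffer;

public class HeapVsOffHeap {
    public static void main(String[] args) {
        // On-heap: backed by a real byte[], zeroed eagerly, managed by the GC.
        ByteBuffer onHeap = ByteBuffer.allocate(1024);
        System.out.println("on-heap, backed by byte[]: " + onHeap.hasArray());

        // Off-heap: native memory outside the GC-managed heap. It cannot be
        // viewed as a byte[] -- hasArray() returns false for direct buffers.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(1024);
        System.out.println("off-heap, backed by byte[]: " + offHeap.hasArray());
    }
}
```

The hasArray() distinction is exactly the limitation the question runs into: there is no supported way to make a byte[] alias off-heap memory.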
From the Java docs,
The contents of direct buffers may reside outside of the normal garbage-collected heap, and so their impact upon the memory footprint of an application might not be obvious
Also from the Java docs,
MappedByteBuffer: A direct byte buffer whose content is a memory-mapped region of a file.
and
A mapped byte buffer and the file mapping that it represents remain valid until the buffer itself is garbage-collected.
I believe that off-heap memory allocations cannot be garbage-collected by the GC. In this case, these statements make me curious about the memory management of a MappedByteBuffer. What happens when the direct ByteBuffer backing a MappedByteBuffer sits outside the normal heap?
Direct memory has been available since Java 1.4. The new I/O (NIO) classes introduced a new way of performing I/O based on channels and buffers. NIO added support for direct ByteBuffers, which are allocated in native memory rather than on the Java heap. This makes them significantly faster in some scenarios because they can avoid copying data between the Java heap and the native heap.
I never understand why do we use direct memory. Can someone help to give an example?
All system calls such as reading and writing sockets and files only use native memory. They can't use the heap. This means while you can copy to/from native memory from the heap, avoiding this copy can improve efficiency.
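A sketch of the copy being avoided, using a FileChannel (the class name and temp-file usage are my own illustration):

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DirectIo {
    public static void main(String[] args) throws Exception {
        Path path = Files.createTempFile("demo", ".bin");
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.WRITE)) {
            // Heap buffer: the JVM must copy its contents into a native
            // staging buffer before the OS write call can use it.
            ByteBuffer heap = ByteBuffer.wrap(new byte[4096]);
            ch.write(heap);

            // Direct buffer: already in native memory, so the channel can
            // hand its address straight to the system call -- no extra copy.
            ByteBuffer direct = ByteBuffer.allocateDirect(4096);
            ch.write(direct);
        }
        Files.delete(path);
    }
}
```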
We use off-heap/native memory for storing most of our data which has a number of advantages.
it can be larger than the heap size.
it can be larger than main memory.
it can be shared between JVMs. i.e. one copy for multiple JVMs.
it can be persisted and retained across restarts of the JVM or even machine.
it has little to no impact on GC pause times.
depending on usage it can be faster
The reason it is not used more is that it is harder to make it both efficient and work like normal Java objects. For this reason, we have libraries such as Chronicle Map which act as a ConcurrentMap but using off-heap memory, and Chronicle Queue which is a journal, logger and persisted IPC between processes.
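The "shared between JVMs" and "persisted across restarts" points above come from memory-mapped files. A minimal sketch (file name is a placeholder; this is not how Chronicle does it internally, just the underlying JDK mechanism):

```java
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class SharedOffHeap {
    public static void main(String[] args) throws Exception {
        Path path = Paths.get("shared.dat"); // placeholder file name
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // The mapping lives in the OS page cache, not the Java heap:
            // another JVM mapping the same file sees the same pages, and
            // the data survives a JVM restart.
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            map.putInt(0, 42);
            map.force(); // flush dirty pages to the file
        }
        Files.delete(path);
    }
}
```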
The JVM relies on the concept of garbage collection for reclaiming memory that is no longer used. This allows JVM language developers (e.g., Java, Scala, etc) to not have to worry about memory allocation and deallocation. You simply ask for memory, and let the JVM worry about when it will be reclaimed, or garbage collected.
While this is extremely convenient, it comes with the added overhead of a separate thread consuming CPU and having to go through the JVM heap constantly, reclaiming objects that are no longer reachable. There are entire books written about the topic, but if you want to read a bit more about JVM garbage collection, there are plenty of references out there; this one is decent: https://dzone.com/articles/understanding-the-java-memory-model-and-the-garbag
Anyway, if in your app, you know you're going to be doing massive amounts of copying, updating objects and values, you can elect to handle those objects and their memory consumption yourself. So, regardless of how much churn there is in those objects, those objects will never be moved around in the heap, they will never be garbage collected, and thus, won't impact garbage collection in the JVM. There's a bit more detail in this answer: https://stackoverflow.com/a/6091680/236528
From the Official Javadoc:
Direct vs. non-direct buffers
A byte buffer is either direct or non-direct. Given a direct byte buffer, the Java virtual machine will make a best effort to perform native I/O operations directly upon it. That is, it will attempt to avoid copying the buffer's content to (or from) an intermediate buffer before (or after) each invocation of one of the underlying operating system's native I/O operations.

A direct byte buffer may be created by invoking the allocateDirect factory method of this class. The buffers returned by this method typically have somewhat higher allocation and deallocation costs than non-direct buffers. The contents of direct buffers may reside outside of the normal garbage-collected heap, and so their impact upon the memory footprint of an application might not be obvious. It is therefore recommended that direct buffers be allocated primarily for large, long-lived buffers that are subject to the underlying system's native I/O operations. In general it is best to allocate direct buffers only when they yield a measurable gain in program performance.
https://download.java.net/java/early_access/jdk11/docs/api/java.base/java/nio/ByteBuffer.html
I heard that on real-time systems it is preferred to use pre-allocated memory to avoid producing garbage as much as possible. But what exactly does that mean? As I understand it, whenever we call the new operator we use heap memory at runtime. So how do we use pre-allocated memory?
"Pre-allocated memory" means that a program should allocate all the required memory blocks once after startup (using the new operator, as usual), rather than allocate memory multiple times during execution and leave memory which is no longer needed for the garbage collector to free.
Pre-allocated memory is memory that is allocated at the time the program loads; in Java this can be achieved using the static keyword.
I'm allocating a lot of byte buffers. After I'm done with them I set all references to null. This is supposedly the "correct" way to release ByteBuffers? Dereference them and let the GC clean them up? I also call System.gc() to try and help it along.
Anyway, I create a bunch of buffers and dereference them; but after "some time" I get all sorts of memory errors: java.lang.OutOfMemoryError: Direct buffer memory
I can increase the MaxDirectMemorySize but it just delays the above error.
I'm 99% positive I don't have anything referencing the old ByteBuffers. Is there a way to check this to see what the heck still has a ByteBuffer allocated?
You can use a tool like MAT that's free with Eclipse to see what is keeping your byte buffer by letting it do some heapdump analysis.
Another way I can think of is to wrap your byte buffer in something that has a finalizer method.
Also, System.gc() does not guarantee that finalizers will be executed; you need to call System.runFinalization() to increase the likelihood.
Setting the references to null is the correct way to let the garbage collector know that you are finished with that object. There must still be some other dangling reference. The best tool I have found for finding memory leaks is YourKit. A free alternative that is also very good is Visual VM from the JDK.
Remember that the slice() operation creates a new byte buffer that references the first one.
This is a problem with older versions of Java. The latest version of Java 6 will call System.gc() before throwing an OutOfMemoryError. If you don't want to trigger a GC you can release the direct memory manually on the Oracle/Sun JVM (the cast is to the internal sun.nio.ch.DirectBuffer interface) with
((DirectBuffer) buffer).cleaner().clean();
However, it is a better approach to recycle the direct buffers yourself so doing this is not so critical. (Creating direct ByteBuffers is relatively expensive)
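One way to recycle direct buffers as suggested, using only public APIs (the class name is mine; a minimal single-threaded sketch):

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

public class DirectBufferPool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();
    private final int size;

    public DirectBufferPool(int size) { this.size = size; }

    // Reuse a previously returned buffer when possible; allocating a
    // direct ByteBuffer is relatively expensive, so do it rarely.
    public ByteBuffer acquire() {
        ByteBuffer b = free.poll();
        return (b != null) ? b : ByteBuffer.allocateDirect(size);
    }

    public void release(ByteBuffer b) {
        b.clear(); // reset position/limit for the next user
        free.push(b);
    }
}
```

Because the pool keeps the buffers reachable, they are never garbage-collected and the native memory is never churned, which sidesteps the OutOfMemoryError above entirely.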
A direct java.nio.ByteBuffer is (by definition) stored outside of the Java heap space. It is not freed until the GC runs on the heap and collects the buffer object. But you can imagine a situation where heap usage is low, so no GC is needed, while non-heap memory keeps being allocated and exceeds its limit.
Based on very interesting read:
http://www.ibm.com/developerworks/library/j-nativememory-linux/
The pathological case would be that the native heap becomes full and
one or more direct ByteBuffers are eligible for GC (and could be freed
to make some space on the native heap), but the Java heap is mostly
empty so GC doesn't occur.
I have an application that, basically, creates a new byte array (less than 1K), stores some data, and after a few seconds (generally less than a minute, but some data is stored for up to an hour) writes it to disk, after which the data becomes garbage. Approximately 400 packets per second are created. I read some articles saying not to worry about GC, especially for quickly created and released pieces of memory (on Java 6).
GC runs too long and causes problems in my application.
I set some GC parameters (a bigger Xmx and the parallel GC); this decreased the Full GC time, but not enough yet. I have two ideas:
Should I focus on GC parameters or create a byte-array memory-pool mechanism? Which one is better?
The frequency of performing a GC depends on the object size, but the cost (the clean-up time) depends more on the number of objects. I suspect the long-living arrays are being copied between the spaces until they end up in the old generation and are finally discarded. Cleaning the old gen is relatively expensive.
I suggest you try using ByteBuffer to store data. These are like byte[] but have a variable size and can be slightly more efficient if you can use direct byte buffers with NIO. Pre-allocating your buffers can be more efficient (though it can waste virtual memory).
BTW: The direct byte buffers use little heap space as they use memory in the "C" space.
I suggest you do some analysis into why GC is not working well enough for you. You can use jmap to dump out the heap and then use jhat or Eclipse Memory Analyser to see what objects are living in it. You might find that you are holding on to references that you no longer need.
The GC is very clever and you could actually make things worse by trying to outsmart it with your own memory management code. Try tuning the parameters and maybe you can try out the new G1 Garbage Collector too.
Also, remember, that GC loves short-lived, immutable objects.
Use a profiler to identify the code snippet
Try with WeakReferences.
Suggest a GC algorithm to the VM
-Xgc: parallel
Set a big heap and shared memory
-XX:+UseISM -XX:+AggressiveHeap
Set the following for garbage collection:
-XX:SurvivorRatio=8
This may help
http://download.oracle.com/docs/cd/E12840_01/wls/docs103/perform/JVMTuning.html#wp1130305