Is there a way to increase the allocated memory of an off-heap ByteBuffer once it has been created?
A direct byte buffer may also be created by mapping a region of a file directly into memory. An implementation of the Java platform may optionally support the creation of direct byte buffers from native code via JNI. If an instance of one of these kinds of buffers refers to an inaccessible region of memory then an attempt to access that region will not change the buffer's content and will cause an unspecified exception to be thrown either at the time of the access or at some later time.
The API has no provisions, but there might be a JVM that allows it via JNI.
I would say NO.
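Since the capacity of a direct ByteBuffer is fixed at creation, the usual workaround is a sketch like the one below: allocate a larger buffer and copy the bytes written so far into it (the helper name grow is mine, not a standard API):

```java
import java.nio.ByteBuffer;

public class GrowDirectBuffer {
    // A direct buffer's capacity is fixed, so "growing" means allocating a
    // larger buffer and copying the bytes written so far into it.
    static ByteBuffer grow(ByteBuffer old, int newCapacity) {
        ByteBuffer bigger = ByteBuffer.allocateDirect(newCapacity);
        old.flip();      // switch the old buffer from writing to reading
        bigger.put(old); // copy its contents; bigger's position advances past them
        return bigger;
    }
}
```

Note that the old buffer's native memory is only reclaimed once the old ByteBuffer object itself is garbage-collected.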
Related
I am reading the HBase docs and came across the off-heap read path.
As far as I understand, off-heap is a place in memory where Java stores bytes/objects outside the reach of the garbage collector. I also searched for libraries that facilitate using off-heap memory and found Ehcache. However, I could not find any official docs from Oracle or about the JVM on this. So is this standard JVM functionality, or is it some kind of hack? If it is, what are the underlying classes and techniques used to do this?
You should look at ByteBuffer.
Direct vs. non-direct buffers
A byte buffer is either direct or non-direct. Given a direct byte
buffer, the Java virtual machine will make a best effort to perform
native I/O operations directly upon it. That is, it will attempt to
avoid copying the buffer's content to (or from) an intermediate buffer
before (or after) each invocation of one of the underlying operating
system's native I/O operations.
A direct byte buffer may be created by invoking the allocateDirect
factory method of this class. The buffers returned by this method
typically have somewhat higher allocation and deallocation costs than
non-direct buffers. The contents of direct buffers may reside outside
of the normal garbage-collected heap, and so their impact upon the
memory footprint of an application might not be obvious. It is
therefore recommended that direct buffers be allocated primarily for
large, long-lived buffers that are subject to the underlying system's
native I/O operations. In general it is best to allocate direct
buffers only when they yield a measurable gain in program
performance.
A direct byte buffer may also be created by mapping a region of a file
directly into memory. An implementation of the Java platform may
optionally support the creation of direct byte buffers from native
code via JNI. If an instance of one of these kinds of buffers refers
to an inaccessible region of memory then an attempt to access that
region will not change the buffer's content and will cause an
unspecified exception to be thrown either at the time of the access or
at some later time.
Whether a byte buffer is direct or non-direct may be determined by
invoking its isDirect method. This method is provided so that explicit
buffer management can be done in performance-critical code.
It's up to the JVM implementation how it handles direct ByteBuffers, but at least the OpenJDK JVM allocates the memory off-heap.
The JEP 383: Foreign-Memory Access API (Second Incubator) feature is incubating in Java 15. It will make accessing off-heap memory standard by providing a public API.
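A small self-contained illustration of the two allocation paths and the isDirect() check mentioned in the quote (the commented values reflect OpenJDK's behavior; in particular, hasArray() is typically false for direct buffers):

```java
import java.nio.ByteBuffer;

public class BufferKinds {
    public static void main(String[] args) {
        ByteBuffer heap   = ByteBuffer.allocate(1024);       // backed by a byte[] on the heap
        ByteBuffer direct = ByteBuffer.allocateDirect(1024); // backed by native memory
        System.out.println(heap.isDirect());   // false
        System.out.println(direct.isDirect()); // true
        System.out.println(heap.hasArray());   // true: the backing array is accessible
        System.out.println(direct.hasArray()); // false on OpenJDK: no accessible array
    }
}
```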
From the Java docs,
The contents of direct buffers may reside outside of the normal garbage-collected heap, and so their impact upon the memory footprint of an application might not be obvious
Also from the Java docs,
MappedByteBuffer: A direct byte buffer whose content is a memory-mapped region of a file.
and
A mapped byte buffer and the file mapping that it represents remain valid until the buffer itself is garbage-collected.
I believe that off-heap memory allocations cannot be reclaimed by the GC. In this case, these statements make me curious about the memory management of a MappedByteBuffer. What happens if the direct ByteBuffer backing a MappedByteBuffer sits outside the normal heap?
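For illustration, a minimal sketch of creating a MappedByteBuffer via FileChannel.map(): the mapped region itself lives in native memory, while the small buffer object on the heap merely points at it, and (per the docs quoted above) the mapping is released only after that object is garbage-collected. The file name here is arbitrary:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MapDemo {
    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("map-demo", ".bin");
        Files.write(tmp, new byte[]{1, 2, 3, 4});
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
            // The mapped region is native memory; only the buffer object is on the heap.
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            System.out.println(mapped.isDirect()); // true: mapped buffers are always direct
            System.out.println(mapped.get(0));     // 1
        }
        Files.deleteIfExists(tmp);
    }
}
```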
Direct memory was introduced in Java 1.4. The new I/O (NIO) classes introduced a new way of performing I/O based on channels and buffers. NIO added support for direct ByteBuffers, which live in native memory rather than on the Java heap, making them significantly faster in some scenarios because they avoid copying data between the Java heap and the native heap.
I never understood why we use direct memory. Can someone give an example?
I never understood why we use direct memory. Can someone give an example?
All system calls such as reading and writing sockets and files only use native memory. They can't use the heap. This means while you can copy to/from native memory from the heap, avoiding this copy can improve efficiency.
We use off-heap/native memory for storing most of our data which has a number of advantages.
it can be larger than the heap size.
it can be larger than main memory.
it can be shared between JVMs, i.e. one copy for multiple JVMs.
it can be persisted and retained across restarts of the JVM, or even of the machine.
it has little to no impact on GC pause times.
depending on usage, it can be faster.
The reason it is not used more is that it is harder to make it both efficient and behave like normal Java objects. For this reason, we have libraries such as Chronicle Map, which acts as a ConcurrentMap but uses off-heap memory, and Chronicle Queue, which is a journal, logger and persisted IPC between processes.
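A toy sketch of the idea behind such off-heap stores, assuming nothing beyond the standard ByteBuffer API: the payload lives in native memory and only a thin accessor object sits on the heap, so the GC never scans the data itself. This is a simplification of what libraries like Chronicle Map do, not their actual implementation:

```java
import java.nio.ByteBuffer;

public class OffHeapLongArray {
    private final ByteBuffer store; // native memory; never scanned by the GC

    OffHeapLongArray(int length) {
        store = ByteBuffer.allocateDirect(length * Long.BYTES);
    }

    void set(int index, long value) {
        store.putLong(index * Long.BYTES, value); // absolute put, no position change
    }

    long get(int index) {
        return store.getLong(index * Long.BYTES);
    }
}
```

However many longs are stored, the GC only ever sees the single small OffHeapLongArray object.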
The JVM relies on the concept of garbage collection for reclaiming memory that is no longer used. This allows JVM language developers (e.g., Java, Scala, etc) to not have to worry about memory allocation and deallocation. You simply ask for memory, and let the JVM worry about when it will be reclaimed, or garbage collected.
While this is extremely convenient, it comes with the added overhead of a separate thread consuming CPU and constantly walking the JVM heap, reclaiming objects that are no longer reachable. There are entire books written about the topic, but if you want to read a bit more about JVM garbage collection, there are a ton of references out there; this one is decent: https://dzone.com/articles/understanding-the-java-memory-model-and-the-garbag
Anyway, if you know your app is going to do massive amounts of copying and updating of objects and values, you can elect to handle those objects and their memory consumption yourself. So, regardless of how much churn there is in those objects, they will never be moved around in the heap and never be garbage collected, and thus won't impact garbage collection in the JVM. There's a bit more detail in this answer: https://stackoverflow.com/a/6091680/236528
From the Official Javadoc:
Direct vs. non-direct buffers
A byte buffer is either direct or non-direct. Given a direct byte
buffer, the Java virtual machine will make a best effort to perform
native I/O operations directly upon it. That is, it will attempt to
avoid copying the buffer's content to (or from) an intermediate buffer
before (or after) each invocation of one of the underlying operating
system's native I/O operations.
A direct byte buffer may be created by invoking the allocateDirect
factory method of this class. The buffers returned by this method
typically have somewhat higher allocation and deallocation costs than
non-direct buffers. The contents of direct buffers may reside
outside of the normal garbage-collected heap, and so their impact upon
the memory footprint of an application might not be obvious. It is
therefore recommended that direct buffers be allocated primarily for
large, long-lived buffers that are subject to the underlying system's
native I/O operations. In general it is best to allocate direct
buffers only when they yield a measurable gain in program
performance.
https://download.java.net/java/early_access/jdk11/docs/api/java.base/java/nio/ByteBuffer.html
I just read a wiki here; one of the passages says:
Although theoretically these are general-purpose data structures, the
implementation may select memory for alignment or paging
characteristics, which are not otherwise accessible in Java.
Typically, this would be used to allow the buffer contents to occupy
the same physical memory used by the underlying operating system for
its native I/O operations, thus allowing the most direct transfer
mechanism, and eliminating the need for any additional copying
I am curious about the words "eliminating the need for any additional copying". When does the JVM need this extra copy, and how does NIO avoid it?
It's talking about a direct mapping between a kernel data structure and a user-space data structure; normally the data must be copied when moving between the two. However, with NIO and a direct buffer, that extra copy (and the associated overhead) does not occur.
From java.nio package API:
A byte buffer can be allocated as a direct buffer, in which case the Java virtual machine will make a best effort to perform native I/O operations directly upon it.
Example:
FileChannel fc = ...
ByteBuffer buf = ByteBuffer.allocateDirect(8192);
int n = fc.read(buf);
Simply put, the old IO way always copies data from the kernel into memory on the heap. NIO lets you use buffers into which the kernel maps the file/network stream directly. Result: less memory consumption and far better performance.
Many developers know only a single JVM, the Oracle HotSpot JVM, and speak of garbage collection in general when they are referring to Oracle's HotSpot implementation specifically. But the thing is, check Bob's post.
New input/output (NIO) library, introduced with JDK 1.4, provides high-speed, block-oriented I/O in standard Java code.
Few points on NIO,
IO is stream oriented, whereas NIO is buffer oriented.
Offers non-blocking I/O operations.
Avoids an extra copy of data passed between Java and native memory.
Allows reading and writing blocks of data directly from disk, rather than byte by byte.
The NIO API introduces a new primitive I/O abstraction called a channel. A channel represents an open connection to an entity such as a hardware device, a file, or a network socket.
When you use the APIs FileChannel.transferTo() or FileChannel.transferFrom(), the JVM can use the OS's access to DMA (Direct Memory Access), which is a potential advantage.
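A hedged sketch of FileChannel.transferTo(): on platforms that support it, the JVM can delegate the copy to the OS (e.g. sendfile on Linux), so the bytes need never pass through the Java heap. The file names are placeholders:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopy {
    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("src", ".bin");
        Path dst = Files.createTempFile("dst", ".bin");
        Files.write(src, "hello".getBytes());
        try (FileChannel in  = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE)) {
            // The OS may perform this copy entirely in kernel space.
            long transferred = in.transferTo(0, in.size(), out);
            System.out.println(transferred); // 5 bytes moved without touching the heap
        }
        System.out.println(new String(Files.readAllBytes(dst))); // hello
        Files.deleteIfExists(src);
        Files.deleteIfExists(dst);
    }
}
```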
According to Ron Hitchens in Java NIO:
Direct buffers are intended for interaction with channels and native
I/O routines. They make a best effort to store the byte elements in a
memory area that a channel can use for direct, or raw, access by using
native code to tell the operating system to drain or fill the memory
area directly.
Direct byte buffers are usually the best choice for I/O operations. By
design, they support the most efficient I/O mechanism available to the
JVM. Nondirect byte buffers can be passed to channels, but doing so
may incur a performance penalty. It's usually not possible for a
nondirect buffer to be the target of a native I/O operation.
Direct buffers are optimal for I/O, but they may be more expensive to
create than nondirect byte buffers. The memory used by direct buffers
is allocated by calling through to native, operating system-specific
code, bypassing the standard JVM heap. Setting up and tearing down
direct buffers could be significantly more expensive than
heap-resident buffers, depending on the host operating system and JVM
implementation. The memory-storage areas of direct buffers are not
subject to garbage collection because they are outside the standard
JVM heap.
Chapter 2 of the tutorial below will give you more insight (especially sections 2.4 and 2.4.2):
http://blogimg.chinaunix.net/blog/upfile2/090901134800.pdf
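To make the direct-vs-nondirect point above concrete, a small sketch: both kinds of buffer can be handed to a FileChannel and yield the same bytes; the difference is that with a heap buffer the JVM may first read into a hidden direct buffer and then copy into the backing byte[]:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DirectVsHeapRead {
    // Reads the start of the file into the given buffer and returns the bytes.
    // Works for both heap and direct buffers; only the internal copying differs.
    static byte[] read(FileChannel ch, ByteBuffer buf) throws IOException {
        ch.read(buf, 0); // absolute read from offset 0
        buf.flip();
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("cmp", ".bin");
        Files.write(p, new byte[]{1, 2, 3});
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
            byte[] viaHeap   = read(ch, ByteBuffer.allocate(3));       // may copy internally
            byte[] viaDirect = read(ch, ByteBuffer.allocateDirect(3)); // read in place
            System.out.println(java.util.Arrays.equals(viaHeap, viaDirect)); // true
        }
        Files.deleteIfExists(p);
    }
}
```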
I have an Android project (targeting Android 1.6 and up) which includes native code written in C/C++, accessed via NDK. I'm wondering what the most efficient way is to pass an array of bytes from Java through NDK to my JNI glue layer. My concern is around whether or not NDK for Android will copy the array of bytes, or just give me a direct reference. I need read-only access to the bytes at the C++ level, so any copying behind the scenes would be a waste of time from my perspective.
It's easy to find info about this on the web, but I'm not sure what is the most pertinent info. Examples:
Get the pointer of a Java ByteBuffer through JNI
http://www.milk.com/kodebase/dalvik-docs-mirror/docs/jni-tips.html
http://elliotth.blogspot.com/2007/03/optimizing-jni-array-access.html
So does anyone know what is the best (most efficient, least copying) way to do this in the current NDK? GetByteArrayRegion? GetByteArrayElements? Something else?
According to the documentation, GetDirectBufferAddress will give you the reference without copying the array.
However, to call this function you need to allocate a direct buffer with ByteBuffer.allocateDirect() instead of a simple byte array. That has a trade-off, as explained here:
A direct byte buffer may be created by invoking the allocateDirect
factory method of this class. The buffers returned by this method
typically have somewhat higher allocation and deallocation costs than
non-direct buffers. The contents of direct buffers may reside outside
of the normal garbage-collected heap, and so their impact upon the
memory footprint of an application might not be obvious. It is
therefore recommended that direct buffers be allocated primarily for
large, long-lived buffers that are subject to the underlying system's
native I/O operations. In general it is best to allocate direct
buffers only when they yield a measurable gain in program
performance.
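For the NDK case, the Java side of that pattern might look like the sketch below; the native method and library name are hypothetical placeholders, shown commented out, since the C side would call GetDirectBufferAddress() on the buffer:

```java
import java.nio.ByteBuffer;

public class NativeBytes {
    // Hypothetical JNI hookup, shown commented out; the C implementation
    // would obtain a raw pointer via GetDirectBufferAddress(env, buf).
    // static { System.loadLibrary("nativebytes"); }
    // private static native void process(ByteBuffer buf);

    static ByteBuffer newNativeVisibleBuffer(int size) {
        // Must be direct: GetDirectBufferAddress returns NULL for buffers
        // created with ByteBuffer.allocate(), which live on the Java heap.
        return ByteBuffer.allocateDirect(size);
    }
}
```

Java fills this buffer, then passes it down through JNI, and the native code reads the bytes in place with no copy.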