Is metaspace allocated out of native memory? - java

In Java 8, meta space is allocated out of native memory , But i did not get anywhere on net what is native memory ? Per this link it is the memory available to the OS but at Difference between Metaspace and Native Memory in Java , native memory is also shown as part of memory given to JVM process
Example :-
If yes consider the case where i have 15 GB ram on windows OS. I have just one process (Java process) running on machine with -Xmx 4GB.
Does it mean OS can use up to (15-4)=11 GB out of which meta space memory will be allocated ?

Is metaspace allocated out of native memory?
Yes.
Definitive source: https://blogs.oracle.com/poonam/entry/about_g1_garbage_collector_permanent
But i did not get anywhere on net what is native memory ?
The native heap is the malloc / free heap that provides dynamic memory for those parts of the JVM that are implemented in native code (C++). It can also be used by user-supplied native libraries loaded by the JVM. The native heap is not garbage collected per se, but metaspace is.
One benefit of using the native heap to hold metaspace objects is that the native heap does not have a fixed maximum size (by default) like the Java heap does.
If yes consider the case where i have 15 GB ram on windows OS. I have just one process (Java process) running on machine with -Xmx 4GB. Does it mean OS can use up to (15-4)=11 GB out of which meta space memory will be allocated?
Maybe:
There will be other process on a Windows machine. Lots of them. It is just that they are system processes.
There possibly are OS enforced limits on how big the Java process is allowed to grow. (I'm assuming that Windows has something that fills the role of ulimit on a UNIX / Linux system.)
If there is disk space available for paging, the OS may actually allocate the Java process more memory than is available as physical memory pages.

Native memory is the normal memory of an application. This is apposed to heap memory which is managed by the JVM. In a C program for example, it would be called just "memory"

Related

Why the container has used 30GB of memory more than 25Gb expected [duplicate]

For my application, the memory used by the Java process is much more than the heap size.
The system where the containers are running starts to have memory problem because the container is taking much more memory than the heap size.
The heap size is set to 128 MB (-Xmx128m -Xms128m) while the container takes up to 1GB of memory. Under normal condition, it needs 500MB. If the docker container has a limit below (e.g. mem_limit=mem_limit=400MB) the process gets killed by the out of memory killer of the OS.
Could you explain why the Java process is using much more memory than the heap? How to size correctly the Docker memory limit? Is there a way to reduce the off-heap memory footprint of the Java process?
I gather some details about the issue using command from Native memory tracking in JVM.
From the host system, I get the memory used by the container.
$ docker stats --no-stream 9afcb62a26c8
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
9afcb62a26c8 xx-xxxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.0acbb46bb6fe3ae1b1c99aff3a6073bb7b7ecf85 0.93% 461MiB / 9.744GiB 4.62% 286MB / 7.92MB 157MB / 2.66GB 57
From inside the container, I get the memory used by the process.
$ ps -p 71 -o pcpu,rss,size,vsize
%CPU RSS SIZE VSZ
11.2 486040 580860 3814600
$ jcmd 71 VM.native_memory
71:
Native Memory Tracking:
Total: reserved=1631932KB, committed=367400KB
- Java Heap (reserved=131072KB, committed=131072KB)
(mmap: reserved=131072KB, committed=131072KB)
- Class (reserved=1120142KB, committed=79830KB)
(classes #15267)
( instance classes #14230, array classes #1037)
(malloc=1934KB #32977)
(mmap: reserved=1118208KB, committed=77896KB)
( Metadata: )
( reserved=69632KB, committed=68272KB)
( used=66725KB)
( free=1547KB)
( waste=0KB =0.00%)
( Class space:)
( reserved=1048576KB, committed=9624KB)
( used=8939KB)
( free=685KB)
( waste=0KB =0.00%)
- Thread (reserved=24786KB, committed=5294KB)
(thread #56)
(stack: reserved=24500KB, committed=5008KB)
(malloc=198KB #293)
(arena=88KB #110)
- Code (reserved=250635KB, committed=45907KB)
(malloc=2947KB #13459)
(mmap: reserved=247688KB, committed=42960KB)
- GC (reserved=48091KB, committed=48091KB)
(malloc=10439KB #18634)
(mmap: reserved=37652KB, committed=37652KB)
- Compiler (reserved=358KB, committed=358KB)
(malloc=249KB #1450)
(arena=109KB #5)
- Internal (reserved=1165KB, committed=1165KB)
(malloc=1125KB #3363)
(mmap: reserved=40KB, committed=40KB)
- Other (reserved=16696KB, committed=16696KB)
(malloc=16696KB #35)
- Symbol (reserved=15277KB, committed=15277KB)
(malloc=13543KB #180850)
(arena=1734KB #1)
- Native Memory Tracking (reserved=4436KB, committed=4436KB)
(malloc=378KB #5359)
(tracking overhead=4058KB)
- Shared class space (reserved=17144KB, committed=17144KB)
(mmap: reserved=17144KB, committed=17144KB)
- Arena Chunk (reserved=1850KB, committed=1850KB)
(malloc=1850KB)
- Logging (reserved=4KB, committed=4KB)
(malloc=4KB #179)
- Arguments (reserved=19KB, committed=19KB)
(malloc=19KB #512)
- Module (reserved=258KB, committed=258KB)
(malloc=258KB #2356)
$ cat /proc/71/smaps | grep Rss | cut -d: -f2 | tr -d " " | cut -f1 -dk | sort -n | awk '{ sum += $1 } END { print sum }'
491080
The application is a web server using Jetty/Jersey/CDI bundled inside a fat far of 36 MB.
The following version of OS and Java are used (inside the container). The Docker image is based on openjdk:11-jre-slim.
$ java -version
openjdk version "11" 2018-09-25
OpenJDK Runtime Environment (build 11+28-Debian-1)
OpenJDK 64-Bit Server VM (build 11+28-Debian-1, mixed mode, sharing)
$ uname -a
Linux service1 4.9.125-linuxkit #1 SMP Fri Sep 7 08:20:28 UTC 2018 x86_64 GNU/Linux
https://gist.github.com/prasanthj/48e7063cac88eb396bc9961fb3149b58
Virtual memory used by a Java process extends far beyond just Java Heap. You know, JVM includes many subsytems: Garbage Collector, Class Loading, JIT compilers etc., and all these subsystems require certain amount of RAM to function.
JVM is not the only consumer of RAM. Native libraries (including standard Java Class Library) may also allocate native memory. And this won't be even visible to Native Memory Tracking. Java application itself can also use off-heap memory by means of direct ByteBuffers.
So what takes memory in a Java process?
JVM parts (mostly shown by Native Memory Tracking)
1. Java Heap
The most obvious part. This is where Java objects live. Heap takes up to -Xmx amount of memory.
2. Garbage Collector
GC structures and algorithms require additional memory for heap management. These structures are Mark Bitmap, Mark Stack (for traversing object graph), Remembered Sets (for recording inter-region references) and others. Some of them are directly tunable, e.g. -XX:MarkStackSizeMax, others depend on heap layout, e.g. the larger are G1 regions (-XX:G1HeapRegionSize), the smaller are remembered sets.
GC memory overhead varies between GC algorithms. -XX:+UseSerialGC and -XX:+UseShenandoahGC have the smallest overhead. G1 or CMS may easily use around 10% of total heap size.
3. Code Cache
Contains dynamically generated code: JIT-compiled methods, interpreter and run-time stubs. Its size is limited by -XX:ReservedCodeCacheSize (240M by default). Turn off -XX:-TieredCompilation to reduce the amount of compiled code and thus the Code Cache usage.
4. Compiler
JIT compiler itself also requires memory to do its job. This can be reduced again by switching off Tiered Compilation or by reducing the number of compiler threads: -XX:CICompilerCount.
5. Class loading
Class metadata (method bytecodes, symbols, constant pools, annotations etc.) is stored in off-heap area called Metaspace. The more classes are loaded - the more metaspace is used. Total usage can be limited by -XX:MaxMetaspaceSize (unlimited by default) and -XX:CompressedClassSpaceSize (1G by default).
6. Symbol tables
Two main hashtables of the JVM: the Symbol table contains names, signatures, identifiers etc. and the String table contains references to interned strings. If Native Memory Tracking indicates significant memory usage by a String table, it probably means the application excessively calls String.intern.
7. Threads
Thread stacks are also responsible for taking RAM. The stack size is controlled by -Xss. The default is 1M per thread, but fortunately things are not so bad. The OS allocates memory pages lazily, i.e. on the first use, so the actual memory usage will be much lower (typically 80-200 KB per thread stack). I wrote a script to estimate how much of RSS belongs to Java thread stacks.
There are other JVM parts that allocate native memory, but they do not usually play a big role in total memory consumption.
Direct buffers
An application may explicitly request off-heap memory by calling ByteBuffer.allocateDirect. The default off-heap limit is equal to -Xmx, but it can be overridden with -XX:MaxDirectMemorySize. Direct ByteBuffers are included in Other section of NMT output (or Internal before JDK 11).
The amount of direct memory in use is visible through JMX, e.g. in JConsole or Java Mission Control:
Besides direct ByteBuffers there can be MappedByteBuffers - the files mapped to virtual memory of a process. NMT does not track them, however, MappedByteBuffers can also take physical memory. And there is no a simple way to limit how much they can take. You can just see the actual usage by looking at process memory map: pmap -x <pid>
Address Kbytes RSS Dirty Mode Mapping
...
00007f2b3e557000 39592 32956 0 r--s- some-file-17405-Index.db
00007f2b40c01000 39600 33092 0 r--s- some-file-17404-Index.db
^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^
Native libraries
JNI code loaded by System.loadLibrary can allocate as much off-heap memory as it wants with no control from JVM side. This also concerns standard Java Class Library. In particular, unclosed Java resources may become a source of native memory leak. Typical examples are ZipInputStream or DirectoryStream.
JVMTI agents, in particular, jdwp debugging agent - can also cause excessive memory consumption.
This answer describes how to profile native memory allocations with async-profiler.
Allocator issues
A process typically requests native memory either directly from OS (by mmap system call) or by using malloc - standard libc allocator. In turn, malloc requests big chunks of memory from OS using mmap, and then manages these chunks according to its own allocation algorithm. The problem is - this algorithm can lead to fragmentation and excessive virtual memory usage.
jemalloc, an alternative allocator, often appears smarter than regular libc malloc, so switching to jemalloc may result in a smaller footprint for free.
Conclusion
There is no guaranteed way to estimate full memory usage of a Java process, because there are too many factors to consider.
Total memory = Heap + Code Cache + Metaspace + Symbol tables +
Other JVM structures + Thread stacks +
Direct buffers + Mapped files +
Native Libraries + Malloc overhead + ...
It is possible to shrink or limit certain memory areas (like Code Cache) by JVM flags, but many others are out of JVM control at all.
One possible approach to setting Docker limits would be to watch the actual memory usage in a "normal" state of the process. There are tools and techniques for investigating issues with Java memory consumption: Native Memory Tracking, pmap, jemalloc, async-profiler.
Update
Here is a recording of my presentation Memory Footprint of a Java Process.
In this video, I discuss what may consume memory in a Java process, how to monitor and restrain the size of certain memory areas, and how to profile native memory leaks in a Java application.
https://developers.redhat.com/blog/2017/04/04/openjdk-and-containers/:
Why is it when I specify -Xmx=1g my JVM uses up more memory than 1gb
of memory?
Specifying -Xmx=1g is telling the JVM to allocate a 1gb heap. It’s not
telling the JVM to limit its entire memory usage to 1gb. There are
card tables, code caches, and all sorts of other off heap data
structures. The parameter you use to specify total memory usage is
-XX:MaxRAM. Be aware that with -XX:MaxRam=500m your heap will be approximately 250mb.
Java sees host memory size and it is not aware of any container memory limitations. It doesn't create memory pressure, so GC also doesn't need to release used memory. I hope XX:MaxRAM will help you to reduce memory footprint. Eventually, you can tweak GC configuration (-XX:MinHeapFreeRatio,-XX:MaxHeapFreeRatio, ...)
There is many types of memory metrics. Docker seems to be reporting RSS memory size, that can be different than "committed" memory reported by jcmd (older versions of Docker report RSS+cache as memory usage).
Good discussion and links: Difference between Resident Set Size (RSS) and Java total committed memory (NMT) for a JVM running in Docker container
(RSS) memory can be eaten also by some other utilities in the container - shell, process manager, ... We don't know what else is running in the container and how do you start processes in container.
TL;DR
The detail usage of the memory is provided by Native Memory Tracking (NMT) details (mainly code metadata and garbage collector). In addition to that, the Java compiler and optimizer C1/C2 consume the memory not reported in the summary.
The memory footprint can be reduced using JVM flags (but there is impacts).
The Docker container sizing must be done through testing with the expected load the application.
Detail for each components
The shared class space can be disabled inside a container since the classes won't be shared by another JVM process. The following flag can be used. It will remove the shared class space (17MB).
-Xshare:off
The garbage collector serial has a minimal memory footprint at the cost of longer pause time during garbage collect processing (see Aleksey Shipilëv comparison between GC in one picture). It can be enabled with the following flag. It can save up to the GC space used (48MB).
-XX:+UseSerialGC
The C2 compiler can be disabled with the following flag to reduce profiling data used to decide whether to optimize or not a method.
-XX:+TieredCompilation -XX:TieredStopAtLevel=1
The code space is reduced by 20MB. Moreover, the memory outside JVM is reduced by 80MB (difference between NMT space and RSS space). The optimizing compiler C2 needs 100MB.
The C1 and C2 compilers can be disabled with the following flag.
-Xint
The memory outside the JVM is now lower than the total committed space. The code space is reduced by 43MB. Beware, this has a major impact on the performance of the application. Disabling C1 and C2 compiler reduces the memory used by 170 MB.
Using Graal VM compiler (replacement of C2) leads to a bit smaller memory footprint. It increases of 20MB the code memory space and decreases of 60MB from outside JVM memory.
The article Java Memory Management for JVM provides some relevant information the different memory spaces.
Oracle provides some details in Native Memory Tracking documentation. More details about compilation level in advanced compilation policy and in disable C2 reduce code cache size by a factor 5. Some details on Why does a JVM report more committed memory than the Linux process resident set size? when both compilers are disabled.
Java needs a lot a memory. JVM itself needs a lot of memory to run. The heap is the memory which is available inside the virtual machine, available to your application. Because JVM is a big bundle packed with all goodies possible it takes a lot of memory just to load.
Starting with java 9 you have something called project Jigsaw, which might reduce the memory used when you start a java app(along with start time). Project jigsaw and a new module system were not necessarily created to reduce the necessary memory, but if it's important you can give a try.
You can take a look at this example: https://steveperkins.com/using-java-9-modularization-to-ship-zero-dependency-native-apps/. By using the module system it resulted in CLI application of 21MB(with JRE embeded). JRE takes more than 200mb. That should translate to less allocated memory when the application is up(a lot of unused JRE classes will no longer be loaded).
Here is another nice tutorial: https://www.baeldung.com/project-jigsaw-java-modularity
If you don't want to spend time with this you can simply get allocate more memory. Sometimes it's the best.
How to size correctly the Docker memory limit?
Check the application by monitoring it for some-time. To restrict container's memory try using -m, --memory bytes option for docker run command - or something equivalant if you are running it otherwise
like
docker run -d --name my-container --memory 500m <iamge-name>
can't answer other questions.

What is the reason that the size of java native tracking total committed memory and the RES memory of the top command are different? [duplicate]

For my application, the memory used by the Java process is much more than the heap size.
The system where the containers are running starts to have memory problem because the container is taking much more memory than the heap size.
The heap size is set to 128 MB (-Xmx128m -Xms128m) while the container takes up to 1GB of memory. Under normal condition, it needs 500MB. If the docker container has a limit below (e.g. mem_limit=mem_limit=400MB) the process gets killed by the out of memory killer of the OS.
Could you explain why the Java process is using much more memory than the heap? How to size correctly the Docker memory limit? Is there a way to reduce the off-heap memory footprint of the Java process?
I gather some details about the issue using command from Native memory tracking in JVM.
From the host system, I get the memory used by the container.
$ docker stats --no-stream 9afcb62a26c8
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
9afcb62a26c8 xx-xxxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.0acbb46bb6fe3ae1b1c99aff3a6073bb7b7ecf85 0.93% 461MiB / 9.744GiB 4.62% 286MB / 7.92MB 157MB / 2.66GB 57
From inside the container, I get the memory used by the process.
$ ps -p 71 -o pcpu,rss,size,vsize
%CPU RSS SIZE VSZ
11.2 486040 580860 3814600
$ jcmd 71 VM.native_memory
71:
Native Memory Tracking:
Total: reserved=1631932KB, committed=367400KB
- Java Heap (reserved=131072KB, committed=131072KB)
(mmap: reserved=131072KB, committed=131072KB)
- Class (reserved=1120142KB, committed=79830KB)
(classes #15267)
( instance classes #14230, array classes #1037)
(malloc=1934KB #32977)
(mmap: reserved=1118208KB, committed=77896KB)
( Metadata: )
( reserved=69632KB, committed=68272KB)
( used=66725KB)
( free=1547KB)
( waste=0KB =0.00%)
( Class space:)
( reserved=1048576KB, committed=9624KB)
( used=8939KB)
( free=685KB)
( waste=0KB =0.00%)
- Thread (reserved=24786KB, committed=5294KB)
(thread #56)
(stack: reserved=24500KB, committed=5008KB)
(malloc=198KB #293)
(arena=88KB #110)
- Code (reserved=250635KB, committed=45907KB)
(malloc=2947KB #13459)
(mmap: reserved=247688KB, committed=42960KB)
- GC (reserved=48091KB, committed=48091KB)
(malloc=10439KB #18634)
(mmap: reserved=37652KB, committed=37652KB)
- Compiler (reserved=358KB, committed=358KB)
(malloc=249KB #1450)
(arena=109KB #5)
- Internal (reserved=1165KB, committed=1165KB)
(malloc=1125KB #3363)
(mmap: reserved=40KB, committed=40KB)
- Other (reserved=16696KB, committed=16696KB)
(malloc=16696KB #35)
- Symbol (reserved=15277KB, committed=15277KB)
(malloc=13543KB #180850)
(arena=1734KB #1)
- Native Memory Tracking (reserved=4436KB, committed=4436KB)
(malloc=378KB #5359)
(tracking overhead=4058KB)
- Shared class space (reserved=17144KB, committed=17144KB)
(mmap: reserved=17144KB, committed=17144KB)
- Arena Chunk (reserved=1850KB, committed=1850KB)
(malloc=1850KB)
- Logging (reserved=4KB, committed=4KB)
(malloc=4KB #179)
- Arguments (reserved=19KB, committed=19KB)
(malloc=19KB #512)
- Module (reserved=258KB, committed=258KB)
(malloc=258KB #2356)
$ cat /proc/71/smaps | grep Rss | cut -d: -f2 | tr -d " " | cut -f1 -dk | sort -n | awk '{ sum += $1 } END { print sum }'
491080
The application is a web server using Jetty/Jersey/CDI bundled inside a fat far of 36 MB.
The following version of OS and Java are used (inside the container). The Docker image is based on openjdk:11-jre-slim.
$ java -version
openjdk version "11" 2018-09-25
OpenJDK Runtime Environment (build 11+28-Debian-1)
OpenJDK 64-Bit Server VM (build 11+28-Debian-1, mixed mode, sharing)
$ uname -a
Linux service1 4.9.125-linuxkit #1 SMP Fri Sep 7 08:20:28 UTC 2018 x86_64 GNU/Linux
https://gist.github.com/prasanthj/48e7063cac88eb396bc9961fb3149b58
Virtual memory used by a Java process extends far beyond just Java Heap. You know, JVM includes many subsytems: Garbage Collector, Class Loading, JIT compilers etc., and all these subsystems require certain amount of RAM to function.
JVM is not the only consumer of RAM. Native libraries (including standard Java Class Library) may also allocate native memory. And this won't be even visible to Native Memory Tracking. Java application itself can also use off-heap memory by means of direct ByteBuffers.
So what takes memory in a Java process?
JVM parts (mostly shown by Native Memory Tracking)
1. Java Heap
The most obvious part. This is where Java objects live. Heap takes up to -Xmx amount of memory.
2. Garbage Collector
GC structures and algorithms require additional memory for heap management. These structures are Mark Bitmap, Mark Stack (for traversing object graph), Remembered Sets (for recording inter-region references) and others. Some of them are directly tunable, e.g. -XX:MarkStackSizeMax, others depend on heap layout, e.g. the larger are G1 regions (-XX:G1HeapRegionSize), the smaller are remembered sets.
GC memory overhead varies between GC algorithms. -XX:+UseSerialGC and -XX:+UseShenandoahGC have the smallest overhead. G1 or CMS may easily use around 10% of total heap size.
3. Code Cache
Contains dynamically generated code: JIT-compiled methods, interpreter and run-time stubs. Its size is limited by -XX:ReservedCodeCacheSize (240M by default). Turn off -XX:-TieredCompilation to reduce the amount of compiled code and thus the Code Cache usage.
4. Compiler
JIT compiler itself also requires memory to do its job. This can be reduced again by switching off Tiered Compilation or by reducing the number of compiler threads: -XX:CICompilerCount.
5. Class loading
Class metadata (method bytecodes, symbols, constant pools, annotations etc.) is stored in off-heap area called Metaspace. The more classes are loaded - the more metaspace is used. Total usage can be limited by -XX:MaxMetaspaceSize (unlimited by default) and -XX:CompressedClassSpaceSize (1G by default).
6. Symbol tables
Two main hashtables of the JVM: the Symbol table contains names, signatures, identifiers etc. and the String table contains references to interned strings. If Native Memory Tracking indicates significant memory usage by a String table, it probably means the application excessively calls String.intern.
7. Threads
Thread stacks are also responsible for taking RAM. The stack size is controlled by -Xss. The default is 1M per thread, but fortunately things are not so bad. The OS allocates memory pages lazily, i.e. on the first use, so the actual memory usage will be much lower (typically 80-200 KB per thread stack). I wrote a script to estimate how much of RSS belongs to Java thread stacks.
There are other JVM parts that allocate native memory, but they do not usually play a big role in total memory consumption.
Direct buffers
An application may explicitly request off-heap memory by calling ByteBuffer.allocateDirect. The default off-heap limit is equal to -Xmx, but it can be overridden with -XX:MaxDirectMemorySize. Direct ByteBuffers are included in Other section of NMT output (or Internal before JDK 11).
The amount of direct memory in use is visible through JMX, e.g. in JConsole or Java Mission Control:
Besides direct ByteBuffers there can be MappedByteBuffers - the files mapped to virtual memory of a process. NMT does not track them, however, MappedByteBuffers can also take physical memory. And there is no a simple way to limit how much they can take. You can just see the actual usage by looking at process memory map: pmap -x <pid>
Address Kbytes RSS Dirty Mode Mapping
...
00007f2b3e557000 39592 32956 0 r--s- some-file-17405-Index.db
00007f2b40c01000 39600 33092 0 r--s- some-file-17404-Index.db
^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^
Native libraries
JNI code loaded by System.loadLibrary can allocate as much off-heap memory as it wants with no control from JVM side. This also concerns standard Java Class Library. In particular, unclosed Java resources may become a source of native memory leak. Typical examples are ZipInputStream or DirectoryStream.
JVMTI agents, in particular, jdwp debugging agent - can also cause excessive memory consumption.
This answer describes how to profile native memory allocations with async-profiler.
Allocator issues
A process typically requests native memory either directly from OS (by mmap system call) or by using malloc - standard libc allocator. In turn, malloc requests big chunks of memory from OS using mmap, and then manages these chunks according to its own allocation algorithm. The problem is - this algorithm can lead to fragmentation and excessive virtual memory usage.
jemalloc, an alternative allocator, often appears smarter than regular libc malloc, so switching to jemalloc may result in a smaller footprint for free.
Conclusion
There is no guaranteed way to estimate full memory usage of a Java process, because there are too many factors to consider.
Total memory = Heap + Code Cache + Metaspace + Symbol tables +
Other JVM structures + Thread stacks +
Direct buffers + Mapped files +
Native Libraries + Malloc overhead + ...
It is possible to shrink or limit certain memory areas (like Code Cache) by JVM flags, but many others are out of JVM control at all.
One possible approach to setting Docker limits would be to watch the actual memory usage in a "normal" state of the process. There are tools and techniques for investigating issues with Java memory consumption: Native Memory Tracking, pmap, jemalloc, async-profiler.
Update
Here is a recording of my presentation Memory Footprint of a Java Process.
In this video, I discuss what may consume memory in a Java process, how to monitor and restrain the size of certain memory areas, and how to profile native memory leaks in a Java application.
https://developers.redhat.com/blog/2017/04/04/openjdk-and-containers/:
Why is it when I specify -Xmx=1g my JVM uses up more memory than 1gb
of memory?
Specifying -Xmx=1g is telling the JVM to allocate a 1gb heap. It’s not
telling the JVM to limit its entire memory usage to 1gb. There are
card tables, code caches, and all sorts of other off heap data
structures. The parameter you use to specify total memory usage is
-XX:MaxRAM. Be aware that with -XX:MaxRam=500m your heap will be approximately 250mb.
Java sees host memory size and it is not aware of any container memory limitations. It doesn't create memory pressure, so GC also doesn't need to release used memory. I hope XX:MaxRAM will help you to reduce memory footprint. Eventually, you can tweak GC configuration (-XX:MinHeapFreeRatio,-XX:MaxHeapFreeRatio, ...)
There is many types of memory metrics. Docker seems to be reporting RSS memory size, that can be different than "committed" memory reported by jcmd (older versions of Docker report RSS+cache as memory usage).
Good discussion and links: Difference between Resident Set Size (RSS) and Java total committed memory (NMT) for a JVM running in Docker container
(RSS) memory can be eaten also by some other utilities in the container - shell, process manager, ... We don't know what else is running in the container and how do you start processes in container.
TL;DR
The detail usage of the memory is provided by Native Memory Tracking (NMT) details (mainly code metadata and garbage collector). In addition to that, the Java compiler and optimizer C1/C2 consume the memory not reported in the summary.
The memory footprint can be reduced using JVM flags (but there is impacts).
The Docker container sizing must be done through testing with the expected load the application.
Detail for each components
The shared class space can be disabled inside a container since the classes won't be shared by another JVM process. The following flag can be used. It will remove the shared class space (17MB).
-Xshare:off
The garbage collector serial has a minimal memory footprint at the cost of longer pause time during garbage collect processing (see Aleksey Shipilëv comparison between GC in one picture). It can be enabled with the following flag. It can save up to the GC space used (48MB).
-XX:+UseSerialGC
The C2 compiler can be disabled with the following flag to reduce profiling data used to decide whether to optimize or not a method.
-XX:+TieredCompilation -XX:TieredStopAtLevel=1
The code space is reduced by 20MB. Moreover, the memory outside JVM is reduced by 80MB (difference between NMT space and RSS space). The optimizing compiler C2 needs 100MB.
The C1 and C2 compilers can be disabled with the following flag.
-Xint
The memory outside the JVM is now lower than the total committed space. The code space is reduced by 43MB. Beware, this has a major impact on the performance of the application. Disabling C1 and C2 compiler reduces the memory used by 170 MB.
Using Graal VM compiler (replacement of C2) leads to a bit smaller memory footprint. It increases of 20MB the code memory space and decreases of 60MB from outside JVM memory.
The article Java Memory Management for JVM provides some relevant information the different memory spaces.
Oracle provides some details in Native Memory Tracking documentation. More details about compilation level in advanced compilation policy and in disable C2 reduce code cache size by a factor 5. Some details on Why does a JVM report more committed memory than the Linux process resident set size? when both compilers are disabled.
Java needs a lot a memory. JVM itself needs a lot of memory to run. The heap is the memory which is available inside the virtual machine, available to your application. Because JVM is a big bundle packed with all goodies possible it takes a lot of memory just to load.
Starting with java 9 you have something called project Jigsaw, which might reduce the memory used when you start a java app(along with start time). Project jigsaw and a new module system were not necessarily created to reduce the necessary memory, but if it's important you can give a try.
You can take a look at this example: https://steveperkins.com/using-java-9-modularization-to-ship-zero-dependency-native-apps/. By using the module system it resulted in CLI application of 21MB(with JRE embeded). JRE takes more than 200mb. That should translate to less allocated memory when the application is up(a lot of unused JRE classes will no longer be loaded).
Here is another nice tutorial: https://www.baeldung.com/project-jigsaw-java-modularity
If you don't want to spend time with this you can simply get allocate more memory. Sometimes it's the best.
How to size correctly the Docker memory limit?
Check the application by monitoring it for some-time. To restrict container's memory try using -m, --memory bytes option for docker run command - or something equivalant if you are running it otherwise
like
docker run -d --name my-container --memory 500m <iamge-name>
can't answer other questions.

phantom jvm memory usage, native allocations, and sizing k8s pods [duplicate]

For my application, the memory used by the Java process is much more than the heap size.
The system where the containers are running starts to have memory problem because the container is taking much more memory than the heap size.
The heap size is set to 128 MB (-Xmx128m -Xms128m) while the container takes up to 1GB of memory. Under normal condition, it needs 500MB. If the docker container has a limit below (e.g. mem_limit=mem_limit=400MB) the process gets killed by the out of memory killer of the OS.
Could you explain why the Java process is using much more memory than the heap? How to size correctly the Docker memory limit? Is there a way to reduce the off-heap memory footprint of the Java process?
I gather some details about the issue using command from Native memory tracking in JVM.
From the host system, I get the memory used by the container.
$ docker stats --no-stream 9afcb62a26c8
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
9afcb62a26c8 xx-xxxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.0acbb46bb6fe3ae1b1c99aff3a6073bb7b7ecf85 0.93% 461MiB / 9.744GiB 4.62% 286MB / 7.92MB 157MB / 2.66GB 57
From inside the container, I get the memory used by the process.
$ ps -p 71 -o pcpu,rss,size,vsize
%CPU RSS SIZE VSZ
11.2 486040 580860 3814600
$ jcmd 71 VM.native_memory
71:
Native Memory Tracking:
Total: reserved=1631932KB, committed=367400KB
- Java Heap (reserved=131072KB, committed=131072KB)
(mmap: reserved=131072KB, committed=131072KB)
- Class (reserved=1120142KB, committed=79830KB)
(classes #15267)
( instance classes #14230, array classes #1037)
(malloc=1934KB #32977)
(mmap: reserved=1118208KB, committed=77896KB)
( Metadata: )
( reserved=69632KB, committed=68272KB)
( used=66725KB)
( free=1547KB)
( waste=0KB =0.00%)
( Class space:)
( reserved=1048576KB, committed=9624KB)
( used=8939KB)
( free=685KB)
( waste=0KB =0.00%)
- Thread (reserved=24786KB, committed=5294KB)
(thread #56)
(stack: reserved=24500KB, committed=5008KB)
(malloc=198KB #293)
(arena=88KB #110)
- Code (reserved=250635KB, committed=45907KB)
(malloc=2947KB #13459)
(mmap: reserved=247688KB, committed=42960KB)
- GC (reserved=48091KB, committed=48091KB)
(malloc=10439KB #18634)
(mmap: reserved=37652KB, committed=37652KB)
- Compiler (reserved=358KB, committed=358KB)
(malloc=249KB #1450)
(arena=109KB #5)
- Internal (reserved=1165KB, committed=1165KB)
(malloc=1125KB #3363)
(mmap: reserved=40KB, committed=40KB)
- Other (reserved=16696KB, committed=16696KB)
(malloc=16696KB #35)
- Symbol (reserved=15277KB, committed=15277KB)
(malloc=13543KB #180850)
(arena=1734KB #1)
- Native Memory Tracking (reserved=4436KB, committed=4436KB)
(malloc=378KB #5359)
(tracking overhead=4058KB)
- Shared class space (reserved=17144KB, committed=17144KB)
(mmap: reserved=17144KB, committed=17144KB)
- Arena Chunk (reserved=1850KB, committed=1850KB)
(malloc=1850KB)
- Logging (reserved=4KB, committed=4KB)
(malloc=4KB #179)
- Arguments (reserved=19KB, committed=19KB)
(malloc=19KB #512)
- Module (reserved=258KB, committed=258KB)
(malloc=258KB #2356)
$ cat /proc/71/smaps | grep Rss | cut -d: -f2 | tr -d " " | cut -f1 -dk | sort -n | awk '{ sum += $1 } END { print sum }'
491080
The application is a web server using Jetty/Jersey/CDI bundled inside a fat far of 36 MB.
The following version of OS and Java are used (inside the container). The Docker image is based on openjdk:11-jre-slim.
$ java -version
openjdk version "11" 2018-09-25
OpenJDK Runtime Environment (build 11+28-Debian-1)
OpenJDK 64-Bit Server VM (build 11+28-Debian-1, mixed mode, sharing)
$ uname -a
Linux service1 4.9.125-linuxkit #1 SMP Fri Sep 7 08:20:28 UTC 2018 x86_64 GNU/Linux
https://gist.github.com/prasanthj/48e7063cac88eb396bc9961fb3149b58
Virtual memory used by a Java process extends far beyond just Java Heap. You know, JVM includes many subsytems: Garbage Collector, Class Loading, JIT compilers etc., and all these subsystems require certain amount of RAM to function.
JVM is not the only consumer of RAM. Native libraries (including standard Java Class Library) may also allocate native memory. And this won't be even visible to Native Memory Tracking. Java application itself can also use off-heap memory by means of direct ByteBuffers.
So what takes memory in a Java process?
JVM parts (mostly shown by Native Memory Tracking)
1. Java Heap
The most obvious part. This is where Java objects live. Heap takes up to -Xmx amount of memory.
2. Garbage Collector
GC structures and algorithms require additional memory for heap management. These structures are Mark Bitmap, Mark Stack (for traversing object graph), Remembered Sets (for recording inter-region references) and others. Some of them are directly tunable, e.g. -XX:MarkStackSizeMax, others depend on heap layout, e.g. the larger are G1 regions (-XX:G1HeapRegionSize), the smaller are remembered sets.
GC memory overhead varies between GC algorithms. -XX:+UseSerialGC and -XX:+UseShenandoahGC have the smallest overhead. G1 or CMS may easily use around 10% of total heap size.
3. Code Cache
Contains dynamically generated code: JIT-compiled methods, interpreter and run-time stubs. Its size is limited by -XX:ReservedCodeCacheSize (240M by default). Turn off -XX:-TieredCompilation to reduce the amount of compiled code and thus the Code Cache usage.
4. Compiler
JIT compiler itself also requires memory to do its job. This can be reduced again by switching off Tiered Compilation or by reducing the number of compiler threads: -XX:CICompilerCount.
5. Class loading
Class metadata (method bytecodes, symbols, constant pools, annotations etc.) is stored in off-heap area called Metaspace. The more classes are loaded - the more metaspace is used. Total usage can be limited by -XX:MaxMetaspaceSize (unlimited by default) and -XX:CompressedClassSpaceSize (1G by default).
6. Symbol tables
Two main hashtables of the JVM: the Symbol table contains names, signatures, identifiers etc. and the String table contains references to interned strings. If Native Memory Tracking indicates significant memory usage by a String table, it probably means the application excessively calls String.intern.
7. Threads
Thread stacks are also responsible for taking RAM. The stack size is controlled by -Xss. The default is 1M per thread, but fortunately things are not so bad. The OS allocates memory pages lazily, i.e. on the first use, so the actual memory usage will be much lower (typically 80-200 KB per thread stack). I wrote a script to estimate how much of RSS belongs to Java thread stacks.
There are other JVM parts that allocate native memory, but they do not usually play a big role in total memory consumption.
Direct buffers
An application may explicitly request off-heap memory by calling ByteBuffer.allocateDirect. The default off-heap limit is equal to -Xmx, but it can be overridden with -XX:MaxDirectMemorySize. Direct ByteBuffers are included in Other section of NMT output (or Internal before JDK 11).
The amount of direct memory in use is visible through JMX, e.g. in JConsole or Java Mission Control:
Besides direct ByteBuffers there can be MappedByteBuffers - the files mapped to virtual memory of a process. NMT does not track them, however, MappedByteBuffers can also take physical memory. And there is no a simple way to limit how much they can take. You can just see the actual usage by looking at process memory map: pmap -x <pid>
Address Kbytes RSS Dirty Mode Mapping
...
00007f2b3e557000 39592 32956 0 r--s- some-file-17405-Index.db
00007f2b40c01000 39600 33092 0 r--s- some-file-17404-Index.db
^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^
Native libraries
JNI code loaded by System.loadLibrary can allocate as much off-heap memory as it wants with no control from JVM side. This also concerns standard Java Class Library. In particular, unclosed Java resources may become a source of native memory leak. Typical examples are ZipInputStream or DirectoryStream.
JVMTI agents, in particular, jdwp debugging agent - can also cause excessive memory consumption.
This answer describes how to profile native memory allocations with async-profiler.
Allocator issues
A process typically requests native memory either directly from OS (by mmap system call) or by using malloc - standard libc allocator. In turn, malloc requests big chunks of memory from OS using mmap, and then manages these chunks according to its own allocation algorithm. The problem is - this algorithm can lead to fragmentation and excessive virtual memory usage.
jemalloc, an alternative allocator, often appears smarter than regular libc malloc, so switching to jemalloc may result in a smaller footprint for free.
Conclusion
There is no guaranteed way to estimate full memory usage of a Java process, because there are too many factors to consider.
Total memory = Heap + Code Cache + Metaspace + Symbol tables +
Other JVM structures + Thread stacks +
Direct buffers + Mapped files +
Native Libraries + Malloc overhead + ...
It is possible to shrink or limit certain memory areas (like Code Cache) by JVM flags, but many others are out of JVM control at all.
One possible approach to setting Docker limits would be to watch the actual memory usage in a "normal" state of the process. There are tools and techniques for investigating issues with Java memory consumption: Native Memory Tracking, pmap, jemalloc, async-profiler.
Update
Here is a recording of my presentation Memory Footprint of a Java Process.
In this video, I discuss what may consume memory in a Java process, how to monitor and restrain the size of certain memory areas, and how to profile native memory leaks in a Java application.
https://developers.redhat.com/blog/2017/04/04/openjdk-and-containers/:
Why is it when I specify -Xmx=1g my JVM uses up more memory than 1gb
of memory?
Specifying -Xmx=1g is telling the JVM to allocate a 1gb heap. It’s not
telling the JVM to limit its entire memory usage to 1gb. There are
card tables, code caches, and all sorts of other off heap data
structures. The parameter you use to specify total memory usage is
-XX:MaxRAM. Be aware that with -XX:MaxRam=500m your heap will be approximately 250mb.
Java sees host memory size and it is not aware of any container memory limitations. It doesn't create memory pressure, so GC also doesn't need to release used memory. I hope XX:MaxRAM will help you to reduce memory footprint. Eventually, you can tweak GC configuration (-XX:MinHeapFreeRatio,-XX:MaxHeapFreeRatio, ...)
There is many types of memory metrics. Docker seems to be reporting RSS memory size, that can be different than "committed" memory reported by jcmd (older versions of Docker report RSS+cache as memory usage).
Good discussion and links: Difference between Resident Set Size (RSS) and Java total committed memory (NMT) for a JVM running in Docker container
(RSS) memory can be eaten also by some other utilities in the container - shell, process manager, ... We don't know what else is running in the container and how do you start processes in container.
TL;DR
The detail usage of the memory is provided by Native Memory Tracking (NMT) details (mainly code metadata and garbage collector). In addition to that, the Java compiler and optimizer C1/C2 consume the memory not reported in the summary.
The memory footprint can be reduced using JVM flags (but there is impacts).
The Docker container sizing must be done through testing with the expected load the application.
Detail for each components
The shared class space can be disabled inside a container since the classes won't be shared by another JVM process. The following flag can be used. It will remove the shared class space (17MB).
-Xshare:off
The garbage collector serial has a minimal memory footprint at the cost of longer pause time during garbage collect processing (see Aleksey Shipilëv comparison between GC in one picture). It can be enabled with the following flag. It can save up to the GC space used (48MB).
-XX:+UseSerialGC
The C2 compiler can be disabled with the following flag to reduce profiling data used to decide whether to optimize or not a method.
-XX:+TieredCompilation -XX:TieredStopAtLevel=1
The code space is reduced by 20MB. Moreover, the memory outside JVM is reduced by 80MB (difference between NMT space and RSS space). The optimizing compiler C2 needs 100MB.
The C1 and C2 compilers can be disabled with the following flag.
-Xint
The memory outside the JVM is now lower than the total committed space. The code space is reduced by 43MB. Beware, this has a major impact on the performance of the application. Disabling C1 and C2 compiler reduces the memory used by 170 MB.
Using Graal VM compiler (replacement of C2) leads to a bit smaller memory footprint. It increases of 20MB the code memory space and decreases of 60MB from outside JVM memory.
The article Java Memory Management for JVM provides some relevant information the different memory spaces.
Oracle provides some details in Native Memory Tracking documentation. More details about compilation level in advanced compilation policy and in disable C2 reduce code cache size by a factor 5. Some details on Why does a JVM report more committed memory than the Linux process resident set size? when both compilers are disabled.
Java needs a lot a memory. JVM itself needs a lot of memory to run. The heap is the memory which is available inside the virtual machine, available to your application. Because JVM is a big bundle packed with all goodies possible it takes a lot of memory just to load.
Starting with java 9 you have something called project Jigsaw, which might reduce the memory used when you start a java app(along with start time). Project jigsaw and a new module system were not necessarily created to reduce the necessary memory, but if it's important you can give a try.
You can take a look at this example: https://steveperkins.com/using-java-9-modularization-to-ship-zero-dependency-native-apps/. By using the module system it resulted in CLI application of 21MB(with JRE embeded). JRE takes more than 200mb. That should translate to less allocated memory when the application is up(a lot of unused JRE classes will no longer be loaded).
Here is another nice tutorial: https://www.baeldung.com/project-jigsaw-java-modularity
If you don't want to spend time with this you can simply get allocate more memory. Sometimes it's the best.
How to size correctly the Docker memory limit?
Check the application by monitoring it for some-time. To restrict container's memory try using -m, --memory bytes option for docker run command - or something equivalant if you are running it otherwise
like
docker run -d --name my-container --memory 500m <iamge-name>
can't answer other questions.

Understanding JVM Memory Allocation and Java Out of Memory: Heap Space

I'm looking into really understanding how memory allocation works in the JVM.
I'm writing an application in which I'm getting Out of Memory: Heap Space exceptions.
I understand that I can pass in VM arguments such as Xms and Xmx to up the heap space that the JVM allocates for the running process. This is one possible solution to the problem, or I can inspect my code for memory leaks and fix the issue there.
My questions are:
1) How does the JVM actually allocate memory for itself? How does this relate to how the OS communicates available memory to the JVM? Or more generally, how does memory allocation for any process actually work?
2) How does virtual memory come into play? Let's say you have a system with 32GB of physical memory and you allocate all 32GB to your Java process. Let's say that your process actually consumes all 32GB of memory, how can we enforce the process to use virtual memory instead of running into OOM exceptions?
Thanks.
How does the JVM actually allocate memory for itself?
For the heap it allocate one large continuous region of memory of the maximum size. Initially this is virtual memory however, over time it becomes real memory for the portions which are used, under control of the OS
How does this relate to how the OS communicates available memory to the JVM?
The JVM has no idea about free memory in the OS.
Or more generally, how does memory allocation for any process actually work?
In general it uses malloc and free.
How does virtual memory come into play?
Initially virtual memory is allocated and this turns into real memory as used. This is normal for any process.
Let's say you have a system with 32GB of physical memory and you allocate all 32GB to your Java process.
You can't. The OS need some memory and there will be memory for other purposes. Even within the JVM the heap is only a portion of the memory used. If you have 32 GB of memory I suggest as 24 GB heap max.
Let's say that your process actually consumes all 32GB of memory,
Say you have 48 GB and you start a process which uses 32 GB of main memory.
how can we enforce the process to use virtual memory instead of running into OOM exceptions?
The application uses virtual memory right from the start. You cannot make the heap too large because if it starts swapping your machine (not just your application) will become unusable.
You can use more memory than you have physical by using off heap memory, carefully. However managed memory must be in physical memory so if you need a 32 GB heap, buy 64 GB of main memory.
The JVM (or for that matter any process) that wants to allocate memory will call the C runtime 'malloc' function. This function maintains the heap memory of the C runtime. It, in turn, obtains memory from the operating system kernel - the function used for this is platform dependent; in Linux it could be using the brk or sbrk system calls.
Once the memory has been obtained by the JVM, it manages the memory itself, allocating parts of it to the various objects created by the running program.
Virtual memory is handled entirely by the operating system kernel. The kernel manages mapping of physical memory pages to the address space of various processes; if there is less physical memory than is needed by all the processes in the system then the OS Kernel will swap some of it out to disk.
You can't (and don't need to) force processes to use Virtual Memory. It is transparent to your process.
If you are getting 'out of memory' errors, then the causes are likely to be:
The JVM limits are being exceeded. These are controlled by various command line arguments and/or properties as you stated in your question
The OS may have run out of swap space (or not have any swap space configured to start with). Or some OSs don't even support virtual memory, in which case you have run out of real memory.
Most OSs have facilities for the administrator to limit the amount of memory consumed by a process - for example, in Linux the setrlimit system call and/or the ulimit shell command, both of which set limits that the kernel will observe. If a process requests more memory than is allowed by the limits, then the attempt will fail (typically this results in an out of memory message).
This blog looks at Java memory utilisation which you might find useful:
http://www.waratek.com/blog/november-2013/introduction-to-real-world-jvm-memory-utilisation
The JVM allocates Java heap memory from the OS and then manages the
heap for the Java application. When an application creates a new
object, the JVM sub-allocates a contiguous area of heap memory to
store it. An object in the heap that is referenced by any other object
is "live," and remains in the heap as long as it continues to be
referenced. Objects that are no longer referenced are garbage and can
be cleared out of the heap to reclaim the space they occupy. The JVM
performs a garbage collection (GC) to remove these objects,
reorganizing the objects remaining in the heap.
Source: http://pubs.vmware.com/vfabric52/index.jsp?topic=/com.vmware.vfabric.em4j.1.2/em4j/conf-heap-management.html
In a system using virtual memory, the physical memory is divided into
equally-sized pages. The memory addressed by a process is also divided
into logical pages of the same size. When a process references a
memory address, the memory manager fetches from disk the page that
includes the referenced address, and places it in a vacant physical
page in the RAM.
Source: http://searchstorage.techtarget.com/definition/virtual-memory

Java 32bit Xmx vs java 64bit Xmx

I am really confused with this.
Xmx according to the java docs, is the maximum allowable heap size.
Xms is the minimum required java heap size, and is allocated at JVM start.
On a 32 bit JVM (4GB ram), java -Xmx1536M HelloWorld gives a cannot allocate enough memory error.
On a 64 bit JVM (4GB Ram), java -Xmx20G HelloWorld works just fine. But I don't even have that much virtual or physical memory allocated.
So from this, I conclude that Java 32 bit is allocating the 1536M at JVM startup, but Java 64 bit is not.
Why? A simple Hello World should not need 1536M to run. I am just specifying that 1536M is the maximum, not that it is needed.
Explanations anyone?
There is a distinction between allocating the memory and allocating the address space. The Oracle JVM is allocating the address space on startup to ensure the heap is contiguous. This allows for certain optimizations to be used with the heap.
If the allocation fails, then Java won't start... as you have seen. It isn't necessarily using that much memory, but it is allocating the required address space up-front. Since you are passing -Xmx1536m, it is saying ok, I need to allocate that in case you need it... and since it must be contiguous it does it up-front so it can guarantee it (or fails trying).
This behavior is the same on both 32-bit and 64-bit JVMs. What you are seeing is the 2GB per-process address space limitation of 32-bit processes (at least, on Windows this is the limitation - it may be slightly larger on other platforms) causing this allocation to fail on 32-bit, where 64-bit has no issues since it has much larger address space. But, you say, 1536m is less than 2GB, I should be good, right? Not quite - the heap is not the only thing being allocated in the address space, DLLs and other stuff is also allocated in the address space...so getting a contiguous 1536m chunk out of 2GB max on 32-bit is unfortunately very unlikely. I've seen values below 1000m fail on 32-bit processes with particularly bad fragmentation, and usually 1200-1300m is the max heap you can specify on 32-bit.
On modern OSes, ASLR (Address Space Layout Randomization) makes fragmentation of 32-bit process address space worse. It intentionally loads DLLs into random addresses for security reasons... making it even less likely you can get a big, contiguous heap in 32-bit.
In 64-bit, the address space is so large that fragmentation is no longer a problem and giant heaps can be allocated without issues. Even if you have 4GB of RAM on 32-bit, though, the 2GB per process address space limitation (at least on Windows) usually means the max heap is usually only 1300m or so.
Actually, the application is not allocating the Xmx memory at startup.
The -Xms parameter configure the startup memory. (What are the Xms and Xmx parameters when starting JVMs?)
The 64bit environment allows a bigger memory allocation then 32bits. But, in fact, it's using the HD space, not the memory ram.
See this other post for more info.
Estimating maximum safe JVM heap size in 64-bit Java
Under Windows there is a difference in memory allocation operations with native WinAPI low-level functions like VirtualAlloc.
"Reserving" means allocation of a continuous area within process' address space without actually making this area of virtual memory usable. Allocated area is not backed by actual physical RAM or swap space and does not consume any free memory. Any application can reserve any amount of address space limited only by processor's memory addressing capability (bitness).
"Committing" means backing some of previously "reserved" memory with real memory - RAM or swap space, making it actually readable/writable by the process. This memory is taken from available OS virtual memory pool (RAM and swap).
An alternative to "committing" memory (taking a blank page from the pool) is "mapping" a file into the previously "reserved" memory. This does not consume memory from the swap pool but uses the mapped file in a manner similar to a dedicated swap space for that specific region of a process' address space.
Native Windows applications (like JVM itself) reserve memory for all heaps needed in the future, but commit it only as needed.
High-level memory operations like malloc() style functions or "new" operators really do "commits" as needed or even do their own heap management logic user-mode with memory committed ahead by large chunks as it is'a CPU-intensive kernel call and works at page (4k) granularity.
During process startup JVM "reserves" -Xmx memory, but "commits" only -Xms amount of it. The remaining reserved memory is committed on demand as heaps grow. So heaps can grow up to available memory or -Xmx parameter, whichever is smaller.

Categories