We have a Java application running as a long-running service (actual uptime for this JVM: 31 days 3 hrs 35 min).
According to the Windows Task Manager the process uses 1,075,384,320 B - nearly one GB.
Heap size of the JVM is restricted to 256 MB (-Xmx256m)
Memory-Data
Memory:
Size: 268,435,456 B
Max: 268,435,456 B
Used: 100,000,000 up to 200,000,000 B
- no leak here
Buffer Pools
Direct:
Count: 137
Memory Used and Total Capacity: 1,348,354 B
Mapped:
Count: 0
Memory Used and Total Capacity: 0 B
- no leak here
My question: where does the JVM use the additional memory?
Additional information:
Java: version 1.8.0_74 32 bit (Oracle)
Classes:
Total loaded: 17,248
Total unloaded: 35,761
Threads:
Live: 273
Live peak: 285
Daemon: 79
Total started: 486,282
After a restart it takes some days for the process size to grow, so of course regular restarts would help, and maybe a newer Java version would also solve the problem, but I would like an explanation for this behaviour, e.g. a known bug in 1.8.0 before update 111, fixed in ... - I have not found anything yet.
We have about 350 such installations in different places, so changing anything is not easy.
Don't forget to run your JVM in server mode for long-running tasks!
The reason for this kind of memory leak was the JVM running in client mode.
Our solution runs in a couple of chain stores on old Windows XP 32-bit PCs.
The default for JVMs on this platform is client mode.
In most cases we run JRE 1.8.0_74 32-bit. With our application this JVM leaks memory in "Thread Arena Space" - nothing ever seems to be returned.
After switching to server mode by passing the -server flag at JVM start, the problems disappeared.
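For us that meant a launch command along these lines (the jar name and heap size are placeholders, not our real values):
java -server -Xmx256m -jar our-service.jar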
There are two common reasons for off-heap memory consumption:
Your application or one of the libraries you are using (e.g. a JDBC driver) performs something natively (a module invoked via JNI)
Your application or one of the libraries is using off-heap memory in other ways, e.g. with direct buffers or with some "clever" use of sun.misc.Unsafe.
You can verify this by tracking native memory usage with jcmd as explained here (you'll need to restart the application).
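Roughly, the procedure looks like this (the PID is a placeholder):
java -XX:NativeMemoryTracking=summary ...   (restart the application with this flag)
jcmd <pid> VM.native_memory baseline
jcmd <pid> VM.native_memory summary.diff
The diff taken after a few hours or days shows which native category (Thread, Class, Code, Internal, ...) keeps growing.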
Related
I'm running a Windows 2016 (x64) server with 32GB RAM. According to Resource Monitor the memory map looks like this:
1MB Reserved, 17376MB In Use, 96MB Modified, 4113MB Standby, 11016MB Free. Summary:
15280MB Available,
4210 MB Cached,
32767MB Total,
32768MB Installed
I have a java (64-bit JVM) service that I want to run on 8GB of memory:
java -Xms8192m -Xmx8192m -XX:MaxMetaspaceSize=128m ...
which results in
Error occurred during initialization of VM
Could not reserve enough space for object heap
I know that a 32-bit OS and 32-bit JVM would limit the usable heap, but I verified both are 64-bit. I read that on 32-bit Windows / JVM the heap has to be contiguous. But here I had hoped to be able to allocate even 15GB for the heap, as over 15GB are 'Available' (available for whom / what?).
Page file size is automatically managed, and currently at 7680MB.
I'd be thankful for an explanation why Windows refuses to hand out the memory (or why java cannot make use of it), and what are my options (apart from resizing the host or using like 4GB, which works but is insufficient for the service).
I have tried rebooting the server, but when it's this service's turn to start, other services have already "worked" the memory quite a bit.
Edit: I noticed that Resource Monitor has a graph called 'Commit Charge' which is over 90%. Task Manager has a 'Committed' line which (today) lists 32.9/40.6 GB. Commit charge explains the term, and yes, I've seen the mentioned virtual memory popups already. It seems that, for a reason unknown to me, a very high commit charge has built up and prevents the 8GB Java process from starting. This puts even more emphasis on the question: what does '15 GB Available' memory mean - and to whom is it available, if not to a process?
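A note on the mechanics as far as I understand them: -Xms8192m makes the JVM commit the full 8GB up front, and commits are charged against the commit limit (physical RAM plus page file), not against the 'Available' physical memory figure. The limit and the current charge can be inspected, for instance, with:
systeminfo | findstr /C:"Virtual Memory"
which lists the max size, available and in-use figures for the commit limit.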
Given a process that creates a large linux kernel page cache via mmap'd files, running in a docker container (cgroup) with a memory limit causes kernel slab allocation errors:
Jul 18 21:29:01 ip-10-10-17-135 kernel: [186998.252395] SLUB: Unable to allocate memory on node -1 (gfp=0x2080020)
Jul 18 21:29:01 ip-10-10-17-135 kernel: [186998.252402] cache: kmalloc-2048(2412:6c2c4ef2026a77599d279450517cb061545fa963ff9faab731daab2a1f672915), object size: 2048, buffer size: 2048, default order: 3, min order: 0
Jul 18 21:29:01 ip-10-10-17-135 kernel: [186998.252407] node 0: slabs: 135, objs: 1950, free: 64
Jul 18 21:29:01 ip-10-10-17-135 kernel: [186998.252409] node 1: slabs: 130, objs: 1716, free: 0
Watching slabtop I can see the number of buffer_head, radix_tree_node and kmalloc* objects is heavily restricted in a container started with a memory limit. This appears to have pathological consequences for IO throughput in the application, observable with iostat. This does not happen, even when the page cache consumes all available memory, on the host OS running outside a container or in a container with no memory limit.
This appears to be an issue in the kernel memory accounting where the kernel page cache is not counted against the container's memory, but the SLAB objects that support it are. The behavior appears to be aberrant because, when a large slab object pool is preallocated, the memory-constrained container works fine, freely reusing the existing slab space. Only slab allocated in the container counts against the container. No combination of container options for memory and kernel-memory seems to fix the issue (except not setting a memory limit at all, or a limit so large that it does not restrict the slab, but this restricts the addressable space). I have tried to disable kmem accounting entirely, with no success, by passing cgroup.memory=nokmem at boot.
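For context, the containers in question are started with a plain memory limit roughly along these lines (image name, sizes and mount are placeholders):
docker run --memory=8g --kernel-memory=1g -v /mnt/nvme:/data my-app-image
and varying or dropping the --kernel-memory part made no difference.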
System Info:
Linux ip-10-10-17-135 4.4.0-1087-aws #98-Ubuntu SMP
AMI ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-20190204.3
Docker version 18.09.3, build 774a1f4
java 10.0.1 2018-04-17
To reproduce the issue you can use my PageCache java code. This is a bare-bones repro case of an embedded database library that heavily leverages memory mapped files to be deployed on a very fast file system. The application is deployed on AWS i3.baremetal instances via ECS. I am mapping a large volume from the host to the docker container where the memory mapped files are stored. The AWS ECS agent requires setting a non-zero memory limit for all containers. The memory limit causes the pathological slab behavior and the resulting application IO throughput is totally unacceptable.
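The PageCache class itself is not reproduced here; at its core it is essentially the following kind of mmap-and-touch loop (a minimal sketch - the class name and the 1 GiB mapping window are chosen only for illustration):
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MmapTouchSketch {
    public static void main(String[] args) throws Exception {
        // Map a large file and touch one byte per page so the kernel page cache
        // (and the slab objects that back it) grows inside the container's cgroup.
        try (RandomAccessFile raf = new RandomAccessFile(args[0], "r");
             FileChannel ch = raf.getChannel()) {
            long pos = 0, size = ch.size(), sum = 0;
            while (pos < size) {
                long len = Math.min(1L << 30, size - pos); // 1 GiB mapping window
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                for (int i = 0; i < len; i += 4096) {      // touch one byte per 4 KiB page
                    sum += buf.get(i);
                }
                pos += len;
            }
            System.out.println("checksum: " + sum);
        }
    }
}
Run it against a file larger than the container's memory limit to see the effect.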
It is helpful to drop_caches between runs using echo 3 > /proc/sys/vm/drop_caches. This will clear the page cache and the associated pool of slab objects.
Suggestions on how to fix, work around, or even where to report this issue would be welcome.
UPDATE
It appears that updating to Ubuntu 18.04 with the 4.15 kernel does fix the observed kmalloc allocation error. The version of Java seems to be irrelevant. This appears to be because each v1 cgroup can only allocate page cache up to the memory limit (with multiple cgroups it is more complicated, with only one cgroup being "charged" for the allocation via the Shared Page Accounting scheme). I believe this is now consistent with the intended behavior. In the 4.4 kernel we found that the observed kmalloc errors were an intersection of using software raid0 in a v1 cgroup with a memory limit and a very large page cache. I believe the cgroups in the 4.4 kernel were able to map an unlimited number of pages (a bug which we found useful) up to the point at which the kernel ran out of memory for the associated slab objects, but I still don't have a smoking gun for the cause.
With the 4.15 kernel, our Docker containers are required to set a memory limit (via AWS ECS), so we have implemented a task to unset the memory limit in /sys/fs/cgroup/memory/docker/{container_id}/memory.limit_in_bytes as soon as the container is created. This appears to work, though it is not a good practice to be sure. It allows the behavior we want - unlimited sharing of page cache resources on the host. Since we are running a JVM application with a fixed heap, the downside risk is limited.
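The task itself is essentially a one-liner of this form (the container id is a placeholder; on cgroup v1, writing -1 removes the limit):
echo -1 > /sys/fs/cgroup/memory/docker/{container_id}/memory.limit_in_bytes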
For our use case, it would be fantastic to have the option to discount the page cache (mmap'd disk space) and associated slab objects entirely for a cgroup but maintain the limit on heap & stack for the docker process. The present Shared Page Accounting scheme is rather hard to reason about, and we would prefer to allow the LRU page cache (and associated SLAB resources) to use the full extent of the host's memory, as is the case when the memory limit is not set at all.
I have started following some conversations on LWN but I am a bit in the dark. Maybe this is a terrible idea? I don't know... advice on how to proceed or where to go next is welcome.
java 10.0.1 2018-04-17
You should try with a more recent version of java 10 (or 11 or...)
I mentioned in "Docker support in Java 8 — finally!" last May (2019) that new evolutions from Java 10, backported to Java 8, mean that the JVM running in Docker now reports the memory it can use much more accurately.
This article from May 2018 reports:
Success! Without providing any flags, Java 10 (10u46 -- Nightly) correctly detected Docker's memory limits.
The OP David confirms in the comments:
The docker - jvm integration is a big improvement in Java 10.
It really comes down to the JVM picking a sane default heap size and processor count. These now respect the docker container limits rather than picking up the host instance values (you can turn this feature off using -XX:-UseContainerSupport, depending on your use case; see the flag example below).
I have not found it helpful in dealing with the page cache though.
The best solution I have found is to disable the docker memory limit, after the container is created if need be.
This is definitely a hack - user beware.
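For reference, the container-aware sizing mentioned above can be exercised with flags along these lines on Java 10+ (the percentage is only an illustrative value):
java -XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0 -jar app.jar
With UseContainerSupport on (the default since Java 10), the JVM derives its default heap size and CPU count from the cgroup limits instead of the host totals.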
We have a java process running on Solaris 10 serving about 200-300 concurrent users. The administrators have reported that the memory used by the process increases significantly over time. It reaches 2GB in a few days and never stops growing.
We have dumped the heap and analysed it using Eclipse Memory Profiler, but weren't able to see anything out of the ordinary there. The heap size was very small.
After adding memory stat logging to our application, we found a discrepancy between the memory usage reported by the "top" utility, used by the administrator, and the usage reported by MemoryMXBean and Runtime.
Here is an output from both.
Memory usage information
From the Runtime library
Free memory: 381MB
Allocated memory: 74MB
Max memory: 456MB
Total free memory: 381MB
From the MemoryMXBean library.
Heap Committed: 136MB
Heap Init: 64MB
Heap Used: 74MB
Heap Max: 456MB
Non Heap Committed: 73MB
Non Heap Init: 4MB
Non Heap Used: 72MB
Current idle threads: 4
Current total threads: 13
Current busy threads: 9
Current queue size: 0
Max threads: 200
Min threads: 8
Idle Timeout: 60000
PID USERNAME NLWP PRI NICE SIZE RES STATE TIME CPU COMMAND
99802 axuser 115 59 0 2037M 1471M sleep 503:46 0.14% java
How can this be? The top command reports so much more usage; I was expecting RES to be close to heap + non-heap.
pmap -x, however, reports most of the memory as heap:
Address Kbytes RSS Anon Locked Mode Mapped File
*102000 56 56 56 - rwx---- [ heap ]
*110000 3008 3008 2752 - rwx---- [ heap ]
*400000 1622016 1621056 1167568 - rwx---- [ heap ]
*000000 45056 45056 45056 - rw----- [ anon ]
Can anyone please shed some light on this? I'm completely lost.
Thanks.
Update
This does not appear to be an issue on Linux.
Also, based on Peter Lawrey's response, the "heap" reported by pmap is the native heap, not the Java heap.
I have encountered a similar problem and found a resolution:
Solaris 11
JDK10
REST application using HTTPS (jetty server)
There was a significant increase of c-heap (observed via pmap) over time
I decided to do some stress tests with libumem.
So I started the process with
UMEM_DEBUG=default UMEM_LOGGING=transaction LD_PRELOAD=libumem.so.1
and stressed the application with https requests.
After a while I connected to the process with mdb.
In mdb I used the command ::findleaks and it showed this as a leak:
libucrypto.so.1`ucrypto_digest_init
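For completeness, the attach sequence was essentially (PID being a placeholder):
mdb -p <pid>
> ::findleaks
::findleaks works here because the process was started with the libumem debugging environment shown above.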
So it seems that the JCA (Java Cryptography Architecture) implementation OracleUcrypto has some issues on Solaris.
The problem was resolved by updating the $JAVA_HOME/conf/security/java.security file:
I changed the priority of OracleUcrypto to 3 and the SUN implementation to 1:
security.provider.1=SUN
security.provider.2=SunPKCS11 ${java.home}/conf/security/sunpkcs11-solaris.cfg
security.provider.3=OracleUcrypto
After this the problem disappeared.
This also explains why there is no problem on Linux - different implementations of the JCA providers are in play there.
In garbage-collected environments, holding on to unused pointers amounts to a leak and prevents the GC from doing its job. It's really easy to accidentally keep pointers around.
A common culprit is hashtables. Another is arrays or vectors which are logically cleared (by setting the reuse index to 0) but where the actual contents of the array (above the use index) are still pointing to something.
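A minimal sketch of the second case (the class and field names are illustrative only):
import java.util.Arrays;

// A reusable buffer that is "cleared" by resetting an index. Until the slots
// above the index are nulled out, the old elements stay reachable and the GC
// cannot reclaim them.
class ObjectBuffer {
    private final Object[] items = new Object[1024];
    private int size = 0;

    void add(Object o) { items[size++] = o; }

    // Logical clear: looks empty, but every old element is still referenced.
    void clearFast() { size = 0; }

    // Real clear: drop the references so the GC can collect the old elements.
    void clearForGc() {
        Arrays.fill(items, 0, size, null);
        size = 0;
    }
}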
For an update of this question - see below.
I experience a (reproducible, at least for me) JVM crash (not an OutOfMemoryError)
(The application which crashes is eclipse 3.6.2).
However, looking at the crash log makes me wonder:
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 65544 bytes for Chunk::new
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32-bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
Current thread (0x531d6000): JavaThread "C2 CompilerThread1" daemon
[_thread_in_native, id=7812, stack(0x53af0000,0x53bf0000)]
Stack: [0x53af0000,0x53bf0000], sp=0x53bee860, free space=1018k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [jvm.dll+0x1484aa]
V [jvm.dll+0x1434fc]
V [jvm.dll+0x5e6fc]
V [jvm.dll+0x5e993]
V [jvm.dll+0x27a571]
V [jvm.dll+0x258672]
V [jvm.dll+0x25ed93]
V [jvm.dll+0x260072]
V [jvm.dll+0x24e59a]
V [jvm.dll+0x47edd]
V [jvm.dll+0x48a6f]
V [jvm.dll+0x12dcd4]
V [jvm.dll+0x155a0c]
C [MSVCR71.dll+0xb381]
C [kernel32.dll+0xb729]
I am using Windows XP 32-bit SP3. I have 4GB RAM.
Before starting the application I had 2 GB free according to the task manager (+ 1 GB system cache which might be freed as well). I definitely have enough free RAM.
From the start till the crash I logged the JVM memory statistics using visualvm and jconsole.
I acquired the memory consumption statistics until the last moments before the crash.
The statistics shows the following allocated memory sizes:
HeapSize: 751 MB (used 248 MB)
Non-HeapSize(PermGen & CodeCache): 150 MB (used 95 MB)
Size of memory management areas (Edenspace, Old-gen etc.): 350 MB
Thread stack sizes: 17 MB (according to Oracle and due to the fact that 51 threads are running)
I am running the application (jre 6 update 25, server vm) using the parameters:
-XX:PermSize=128m
-XX:MaxPermSize=192m
-XX:ReservedCodeCacheSize=96m
-Xms500m
-Xmx1124m
Question:
Why does the JVM crash when there's obviously enough memory on the VM and OS?
With the above settings I think that I cannot hit the 2GB 32-bit limit (1124MB + 192MB + 96MB + thread stacks < 2GB). In any other case (too much heap allocation), I would rather expect an OutOfMemoryError than a JVM crash.
Can anyone help me figure out what is going wrong here?
(Note: I upgraded recently to Eclipse 3.6.2 from Eclipse 3.4.2 and from Java 5 to Java 6. I suspect that there's a connection between the crashes and these changes because I haven't seen these before)
UPDATE
It seems to be a JVM bug introduced in Java 6 Update 25 and has something to do with the new JIT compiler. See also this blog entry.
According to the blog, the fix of this bug should be part of the next java 6 updates.
In the meanwhile, I got a native stack trace during a crash. I've updated the above crash log.
The proposed workaround, using the VM argument -XX:-DoEscapeAnalysis, works (at least it notably lowers the probability of a crash).
The 2GB figure for a 32-bit JVM on Windows is not correct: https://blogs.sap.com/2019/10/07/does-32-bit-or-64-bit-jvm-matter-anymore/
Since you are on Windows-XP you are stuck with a 32 bit JVM.
The max heap is about 1.5GB on a 32-bit VM on Windows. You are at 1412MB to begin with, without threads. Did you try decreasing the thread stack size with -Xss, and have you tried eliminating the initially allocated PermSize: -XX:PermSize=128m? This sounds like an Eclipse problem, not a memory problem per se.
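Something along these lines (the values are only placeholders to illustrate the idea, and -XX:PermSize is deliberately dropped) frees up address space:
java -Xss512k -Xmx900m -XX:MaxPermSize=192m ...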
Can you move to a newer JVM or a different (64-bit) JVM on a different machine? Even if you are targeting Windows XP, there is no reason to develop on it unless you HAVE to. Eclipse can run, debug and deploy code on remote machines easily.
Eclipse's JVM can be different than the JVM of the things you run in or with Eclipse. Eclipse is a memory pig. You can eliminate unnecessary Eclipse plug-ins to use less Eclipse memory; it comes with things out of the box you probably don't need or want.
Try to null out references (to eliminate circularly un-collectible GC objects), re-use allocated memory, use singletons, and profile your memory usage to eliminate unnecessary objects, references and allocations. Additional tips (see the sketch after this list):
Prefer static memory allocation, i.e. allocate once per VM as opposed to dynamically.
Avoid creating temporary objects within functions - consider a reset() method which can allow the object to be reused.
Avoid String mutations and mutation of auto-boxed types.
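A minimal sketch of the reset() idea (the names are illustrative, not from any real application):
// Reuse one scratch object per worker instead of allocating temporaries per call.
class RequestScratch {
    private final StringBuilder text = new StringBuilder(256);
    private int parts;

    void reset() {          // call at the start of each request
        text.setLength(0);  // keeps the underlying buffer for reuse
        parts = 0;
    }

    void append(String part) {
        text.append(part);
        parts++;
    }

    String result() { return parts + " parts: " + text; }
}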
I think that #ggb667 has nailed it with the reason your JVM is crashing. 32-bit Windows architectural constraints limit the available RAM for a Java application to 1.5GB1 ... not 2GB as you surmised. Also, you have neglected to include the address space occupied by the code segment of the executable, shared libraries, the native heap, and "other things".
Basically, this is not a JVM bug. You are simply running against the limitations of your hardware and operating system.
There is a possible solution in the form of PAE (Physical Address Extension) support in some versions of Windows. According to the link, Windows XP with PAE makes available up to 4GB of usable address spaces to user processes. However, there are caveats about device driver support.
Another possible solution is to reduce the max heap size, and do other things to reduce the application's memory utilization; e.g. in Eclipse reduce the number of "open" projects in your workspace.
See also: Java maximum memory on Windows XP
1 - Different sources say different things about the actual limit, but it is significantly less than 2GB. To be frank, it doesn't matter what the actual limit is.
In an ideal world this question should no longer be of practical interest to anyone. In 2020:
You shouldn't be running Windows XP. It has been end of life since April 2014
You shouldn't be running Java 6. It has been end of life since April 2013
If you are still running Java 6, you should be at the last public patch release: 1.6.0_45. (Or a later 1.6 non-public release if you have / had a support contract.)
Either way, you should not be running Eclipse on this system. Seriously, you can get a new 64-bit machine for a few hundred dollars with more memory, etc that will allow you to run an up-to-date operating system and an up-to-date Java release. You should use that to run Eclipse.
If you really need to do Java development on an old 32-bit machine with an old version of Java (because you can't afford a newer machine) you would be advised to use a simple text editor and the Java 6 JDK command line tools (and a 3rd-party Java build tool like Ant, Maven, Gradle).
Finally, if you are still trying to run / maintain Java software that is stuck on Java 6, you should really be trying to get out of that hole. Life is only going to get harder for you:
If the Java 6 software was developed in-house or you have source code, port it.
If you depend on proprietary software that is stuck on Java 6, look for a new vendor.
If management says no, put it to them that they may need to "turn it off".
You / your organization should have dealt with this issue SEVEN years ago.
I stumbled upon a similar problem at work. We had set -Xmx65536M for our application but kept getting exactly the same kind of errors. The funny thing is that the errors always happened at a time when our application was actually doing pretty lightweight calculations, relatively speaking, and was thus nowhere near this limit.
We found a possible solution for the problem online: http://www.blogsoncloud.com/jsp/techSols/java-lang-OutOfMemoryError-unable-to-create-new-native-thread.jsp , and it seemed to solve our problem. After lowering -Xmx to 50G, we've had none of these issues.
What actually happens in the case is still somewhat unclear to us.
The JVM has its own limits that will stop it long before it hits the physical or virtual memory limits. What you need to adjust is the heap size, which is set with another one of the -X flags. (I think it's something creative like -XHeapSizeLimit but I'll check in a second.)
Here we go:
-Xmsn  Specify the initial size, in bytes, of the memory allocation pool. This value must be a multiple of 1024 greater than 1MB. Append the letter k or K to indicate kilobytes, or m or M to indicate megabytes. The default value is 2MB. Examples:
-Xms6291456
-Xms6144k
-Xms6m
-Xmxn  Specify the maximum size, in bytes, of the memory allocation pool. This value must be a multiple of 1024 greater than 2MB. Append the letter k or K to indicate kilobytes, or m or M to indicate megabytes. The default value is 64MB. Examples:
-Xmx83886080
-Xmx81920k
-Xmx80m
I have a Java EE application running on jboss-5.0.0.GA. The application uses the BIRT report tool to generate several reports.
The server has 4 cores at 2.4 GHz and 8 GB of RAM.
The startup script uses the following options:
-Xms2g -Xmx2g -XX:MaxPermSize=512m
The application has reached some stability with this configuration; some time ago I had a lot of crashes because the memory was totally full.
Right now the application is not crashing, but memory is always fully used.
Example of top command:
Mem: 7927100k total, 7874824k used, 52276k free
The java process shows a use of 2.6g, and this is the only application running on this server.
What can I do to ensure an amount of free memory?
What can I do to try to find a memory leak?
Any other suggestion?
TIA
Based on the answer by mezzie:
If you are using linux, what the kernel does with the memory is different with how windows work. In linux, it will try to use up all the memory. After it uses everything, it will then recycle the memory for further use. This is not a memory leak. We also have jboss tomcat on our linux server and we did research on this issue a while back.
I found more information about this,
https://serverfault.com/questions/9442/why-does-red-hat-linux-report-less-free-memory-on-the-system-than-is-actually-ava
http://lwn.net/Articles/329458/
And indeed, half the memory is cached:
total used free shared buffers cached
Mem: 7741 7690 50 0 143 4469
If you are using linux, what the kernel does with the memory is different with how windows work. In linux, it will try to use up all the memory. After it uses everything, it will then recycle the memory for further use. This is not a memory leak. We also have jboss tomcat on our linux server and we did research on this issue a while back.
I bet those are operating system mem values, not Java mem values. Java uses all the memory up to -Xmx and then starts to garbage collect, to vastly oversimplify. Use jconsole to see what the real Java memory usage is.
To make it simple, the JVM's maximum memory use is roughly equal to MaxPermGen (used permanently while your JVM is running; it contains the class definitions, so it should not grow with the load on your server) + Xmx (max size of the object heap, which contains all instances of the objects currently live in the JVM) + Xss (thread stack space, depending on the number of threads running in your JVM, which can usually be limited for a server) + Direct Memory Space (set by -XX:MaxDirectMemorySize=xxxx).
So do the math. If you want to be sure you have free memory left, you will have to limit MaxPermGen, Xmx and the number of threads allowed on your server.
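For example, with the settings from the question, and assuming (these numbers are assumptions, since the question does not state them) around 100 threads at the default 1 MB stack each:
2048 MB (Xmx) + 512 MB (MaxPermSize) + ~100 x 1 MB (thread stacks) ≈ 2.7 GB
which is in the same ballpark as the 2.6g that top reports for the java process.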
Risk is, if the load on your server grows, you can get an OutOfMemoryError...