Java memory usage on Linux

I'm running a handful of Java application servers that are all running the latest versions of Tomcat 6 and Sun's Java 6 on top of CentOS 5.5 Linux. Each server runs multiple instances of Tomcat.
I'm setting the -Xmx450m -XX:MaxPermSize=192m parameters to control how large the heap and permgen will grow. These settings apply to all the Tomcat instances across all of the Java Application servers, totaling about 70 Tomcat instances.
Here is typical memory usage of one of those Tomcat instances, as reported by Psi-probe:
Eden = 13M
Survivor = 1.5M
Perm Gen = 122M
Code Cache = 19M
Old Gen = 390M
Total = 537M
CentOS, however, reports RAM usage for this particular process at 707M (according to RSS), which leaves 170M of RAM unaccounted for.
I am aware that the JVM itself and some of its dependency libraries must be loaded into memory, so I decided to fire up pmap -d to find out their memory footprint.
According to my calculations that accounts for about 17M.
Next there is the Java thread stack, which is 320k per thread on the 32 bit JVM for Linux.
Again, I use Psi-probe to count the number of threads on that particular JVM, and the total is 129 threads. So 129 × 320k ≈ 42M.
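If you want to sanity-check that arithmetic from inside the JVM, here is a minimal sketch; the 320k per-thread figure is an assumption about the platform default, since the API only reports the thread count:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class StackEstimate {
    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        int count = threads.getThreadCount(); // live threads, daemons included
        long stackKb = 320; // assumed per-thread stack on 32-bit Linux
        System.out.println(count + " threads x " + stackKb + "k = "
                + (count * stackKb / 1024) + "M of stack space");
    }
}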
I've read that NIO uses memory outside of the heap, but we don't use NIO in our applications.
So here I've calculated everything that comes to (my) mind. And I've only accounted for 60M of the "missing" 170M.
What am I missing?

Try using the incremental garbage collector, using the -Xincgc command line option.
It's a little more aggressive in its overall GC effort, and has a special happy little anomaly: it actually hands back some of its unused memory to the OS, unlike the default and other GC choices!
This makes the JVM consume a lot less memory, which is especially good if you're running multiple JVMs on one machine - at the expense of some performance, though you might not notice it. The incgc seems to be a little secret, because no one ever brings it up... It's been there for eons (the 90's, even).
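For example, added to the asker's settings (a sketch; the "..." stands for whatever launch line you already use for Tomcat):

java -Xincgc -Xmx450m -XX:MaxPermSize=192m ...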

Arnar, during JVM initialization the JVM allocates memory (via mmap or malloc) of the size specified by -Xmx and -XX:MaxPermSize, so the JVM reserves 450 + 192 = 642M for the application at the start of the process. The Java memory space for the application is therefore not 537M but 642M. Now if you redo the calculation, it accounts for your missing memory. Hope it helps.
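To see the reserved-versus-used distinction from inside the JVM, a minimal sketch using the standard Runtime API (heap only; permgen is not included in these numbers):

public class HeapReservation {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024 * 1024;
        System.out.println("max:   " + rt.maxMemory() / mb + "M");   // roughly -Xmx
        System.out.println("total: " + rt.totalMemory() / mb + "M"); // currently claimed heap
        System.out.println("free:  " + rt.freeMemory() / mb + "M");  // claimed but unoccupied
    }
}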

Java allocates as much virtual memory as it might need up front; the resident size, however, is how much you actually use. Note: many of the libraries and threads have their own overheads, and while you don't use direct memory yourself, that doesn't mean none of the underlying systems do. E.g. if you use NIO, it will use some direct memory even if you use heap ByteBuffers.
Lastly, 100 MB is worth about £8. It may be that it's not worth spending too much time worrying about it.

Not a direct answer, but, have you also considered hosting multiple sites within the same Tomcat instance? This could save you some memory at the expense of some additional configuration.

Arnar, the JVM also mmaps all jar files in use, which will use NIO and will contribute to the RSS. I don't believe those are accounted for in any of your measurements above. Do you by chance have a significant number of large jar files? If so, the pages used for those could be your missing memory.
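One quick way to check this, given that you already have pmap output (the pid is a placeholder):

pmap -x <pid> | grep '\.jar'

Each line is one mapped jar file; the second column of pmap -x output is the mapped size in KB.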

Related

Java memory usage: Can someone explain the difference between memory reported by jconsole, ps, and prstat?

I'm investigating some memory bloat in a Java project, and I'm confounded by the different statistics reported by different tools (we are using Java 8 on Solaris 10).
jconsole gives me three numbers:
Committed: the amount reserved for this process by the OS
Used: the amount actually being used by this process
Max: the amount available to the process (in our case it is limited to 128MB via Java command line option -Xmx128m).
For my project, jconsole reports 119.5MB max, 61.9MB committed, 35.5MB used.
The OS tools report something totally different:
ps -o vsz,rss and prstat -s rss and pmap -x all report that this process is using around 310MB virtual, 260MB physical
So my questions are:
Why does the OS report that I'm using around 5x as much as jconsole says is "committed" to my process?
Which of these measurements is actually accurate? (By "accurate" I mean: if I have 12GB of memory, can I run 40 of these (@ 300MB) before I hit OutOfMemoryException? Or can I run 200 of them (@ 60MB)? Yes, I know I can't use all 12GB of memory, and yes, I understand that virtual memory exists; I'm just using that number to illuminate the question better.)
This question goes quite deep. I'm just going to mention 3 of the many reasons:
VMs
Shared libraries
Stacks and permgen
VMs
Java is like a virtual mini computer. Imagine you ran an emulator on your computer that emulates an old Macintosh computer, for example. The emulator app has a config screen where you set how much RAM is in the virtual computer. If you pick 1GB and start the emulator, your OS is going to say the 'Old Mac Emulator' application is taking 1GB, even though inside the virtual machine, that virtual old Mac might be reporting 800MB of 1GB free.
A JVM is the same thing. The JVM has its own memory management. As far as the OS is concerned, java.exe is an app that takes 1GB. As far as the JVM is concerned, there's 400MB available on the heap right now.
A JVM is slightly more convoluted, in that the total amount of memory a JVM 'claims' from the OS can fluctuate. Out of the box, a JVM will generally not ask for the maximum right away, but will ask for more over time, interleaved with garbage collection: heap full? Garbage collect. That only freed up maybe 20% or so? Ask the OS for more. -Xms and -Xmx control this; set them to the same value, and the JVM will ask for that much memory on bootup and never ask for more. In general a JVM will never relinquish any memory it has claimed.
JVMs, still, are primarily aimed at server deployments, where you want the RAM dedicated to your VM to be constant. There's little point in having each app take whatever it wants when it wants it, generally. This is in contrast to desktop apps, where you tend to have a ton of apps running and, given that a human is 'operating' it, generally only one app at a time has particularly significant RAM requirements.
This explains jconsole, which is akin to reporting the free memory inside the virtual old mac app: It's reporting on the state of the heap as the JVM sees it.
Whereas ps -o and friends are memory introspection tools at the OS level, and they just see the JVM as a big black box.
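For reference, the numbers jconsole shows come straight from the JVM's own bookkeeping, and you can print them yourself with the standard management API; a minimal sketch:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class JconsoleNumbers {
    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage(); // same numbers jconsole reports
        long mb = 1024 * 1024;
        System.out.println("used:      " + heap.getUsed() / mb + "MB");
        System.out.println("committed: " + heap.getCommitted() / mb + "MB");
        System.out.println("max:       " + heap.getMax() / mb + "MB");
    }
}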
Which one is actually accurate
They both are. From their perspective, they are correct.
Shared library
OSes are highly complex beasts these days. To put things in Java terms: you can have a single JVM that is concurrently handling 100 simultaneous incoming https calls. One could want to see a breakdown of how much memory each of the currently 100 running 'handlers' is taking up. Okay... so how do we 'file' the memory load of String - the class itself, not any particular instance of String; the code, e.g. the instructions for how .toLowerCase() runs, is in memory too, someplace! The web framework needs it, so does the core JVM, and so does probably every single last one of those 100 concurrent handlers. So how do we 'bookkeep' this?
In other words, the memory load on an entire system cannot be strictly divided up as 'that memory is 100% part of that app, and this memory is 10% part of this app'. Shared libraries make that difficult.
The JVM is technically capable of rendering UIs, processing images, opening files both using the synchronous as well as the asynchronous API, and even the random access API if your OS offers a separate access library for it, sending network requests in async mode, in sync mode, and more. In effect, a JVM will immediately tell the OS: I can do allllll these things.
In my experience/recollection, most OSes report the total memory load of a single application as the sum of the memory they need as well as all the memory any (shared) library they load, in full.
That means ps and friends overreport JVMs considerably: The JVM loads in a ton of libraries. This doesn't actually cost RAM (The OS also loaded these libraries, the JVM doesn't use any large DLLs/.SO/.JNILIB files of its own, just hooks up the ones the OS provides, pretty much all of them), but is often 'bookkept' as such. You know this is happening if this trivial app:
class Test {
    public static void main(String[] args) throws Exception {
        System.out.println("Hello!");
        Thread.sleep(100000L);
    }
}
already takes 60MB or more.
I mean, if I have 12GB of memory, can I run 40 of these (@ 300MB)
That shared library stuff means each VM's memory load according to ps and friends are over-inflated by however much the shared libraries 'cost', because each JVM is going to share that library - the OS only loads it once, not 40 times.
Stacks and permgen
The 'heap', which is where newly created objects go, is the largest chunk of any JVM's memory load. It's also generally the only one JVM introspection tools like jconsole show you. However, it's not the only memory a JVM needs. There's a small slice it needs for its core self (the 'C code', so to speak). Each active thread has a stack, and each stack also needs memory: whatever you pass to -Xss, times the number of concurrent threads. But that's not a certainty: you can construct a new thread with an alternate stack size (check the constructors of j.l.Thread). There used to be 'permgen', which is where class code lived. Modern JVM versions got rid of it; in general, newer JVM versions try to do more and more on heap instead of in magic hard-to-introspect things like permgen.
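For illustration, the alternate-stack-size constructor mentioned above; the 64k value is purely an example, and the JVM may round it up or ignore it:

public class SmallStack {
    public static void main(String[] args) {
        Runnable task = new Runnable() {
            public void run() {
                System.out.println("hello from a small-stack thread");
            }
        };
        // The fourth argument is the requested stack size in bytes;
        // it is a hint, not a guarantee.
        new Thread(null, task, "small-stack-thread", 64 * 1024).start();
    }
}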
I mean, if I have 12GB of memory, can I run 40 of these (@ 300MB) before I hit OutOfMemoryException?
Run all 40 at once, and always specify both -Xms and -Xmx, setting them to equal sizes. Assuming all those 40 JVMs are relatively stable in terms of how many concurrent threads they ever run, if you're ever going to run into memory issues, it'll happen immediately (with -Xms and -Xmx equal you've removed the dynamism from this situation: all JVMs pretty much insta-claim all the memory they will ever claim, so it either 'works' or it won't. Stacks mess with the cleanliness of this somewhat, hence the caveat of stable-ish thread counts).

Analyze/track down potential native memory leak in JVM

We're running an application on Linux using Java 1.6 (OpenJDK as well as Oracle JDK). The JVM itself has a maximum of 3.5 GB of heap and 512 MB of permgen space. However, after running for a while, top reports the process is using about 8 GB of virtual memory, and smem -s swap p reports about 3.5 GB being swapped.
After running a bigger import of thousands of image files on one server, almost no swap space is left, and calls to native applications (in our case Im4java calls to ImageMagick) fail because the OS cannot allocate memory for those applications.
In another case the swap space filled over the course of several weeks resulting in the OS killing the JVM due to being out of swap space.
I understand that the JVM will need more than 4 GB of memory for heap (max 3.5 GB), permgen (max 512 MB), code cache, loaded libraries, JNI frames etc.
The problem I'm having is how to find out what is actually using how much of the memory. If the JVM was out of heap memory, I'd get a dump which I could analyze, but in our case it's the OS memory that is eaten up and thus the JVM doesn't generate a dump.
I know there's jrcmd for JRockit, but unfortunately we can't just switch the JVM.
There also seem to be a couple of libraries that allow tracking native memory usage, but most of those seem to require the native code to be recompiled - and besides Im4java (which AFAIK just runs a native process; we don't use DLL/SO integration here) and the JVM, there's no other native code involved that we know of.
Besides that, we can't use a library/tool that might have a huge impact on performance or stability in order to track memory usage on a production system over a long period (several weeks).
So the question is:
How can we get information on what the JVM is actually needing all that memory for, ideally with some detailed information?
You may find references to "zlib/gzip" (PDF handling, or HTTP encoding since Java 7), "java2d" or "jai" when replacing the memory allocator (with jemalloc or tcmalloc) in the JVM.
But to really diagnose a native memory leak, JIT code symbol mapping and recent Linux profiling tools are required: perf, perf-map-agent and bcc.
Please refer to details in related answer https://stackoverflow.com/a/52767721/737790
Many thanks to Brendan Gregg
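In outline, the workflow with those tools looks roughly like this (a sketch: -XX:+PreserveFramePointer requires Java 8u60+, far newer than the Java 6 in the question, and the perf-map-agent path, app.jar and pid are placeholders):

java -XX:+PreserveFramePointer -jar app.jar &
perf-map-agent/bin/create-java-perf-map.sh <pid>   # writes /tmp/perf-<pid>.map so perf can symbolize JIT code
perf record -F 99 -g -p <pid> -- sleep 30          # sample stacks for 30 seconds
perf script | less                                 # inspect the captured stacks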

How to improve the amount of memory used by JBoss?

I have a Java EE application running on jboss-5.0.0.GA. The application uses the BIRT reporting tool to generate several reports.
The server has 4 cores at 2.4 GHz and 8 GB of RAM.
The startup script uses the following options:
-Xms2g -Xmx2g -XX:MaxPermSize=512m
The application has reached some stability with this configuration; some time ago I had a lot of crashes because the memory was totally full.
Right now, the application is not crashing, but memory is always fully used.
Example of top command:
Mem: 7927100k total, 7874824k used, 52276k free
The java process shows a use of 2.6g, and this is the only application running on this server.
What can I do to ensure an amount of free memory?
What can I do to try to find a memory leak?
Any other suggestion?
TIA
Based on the answer by mezzie:
"If you are using Linux, what the kernel does with the memory is different from how Windows works. In Linux, it will try to use up all the memory. After it uses everything, it will then recycle the memory for further use. This is not a memory leak. We also have JBoss/Tomcat on our Linux server and we did research on this issue a while back."
I found more information about this:
https://serverfault.com/questions/9442/why-does-red-hat-linux-report-less-free-memory-on-the-system-than-is-actually-ava
http://lwn.net/Articles/329458/
And indeed, half the memory is cached:
        total    used    free  shared  buffers  cached
Mem:     7741    7690      50       0      143    4469
If you are using Linux, what the kernel does with the memory is different from how Windows works. In Linux, it will try to use up all the memory. After it uses everything, it will then recycle the memory for further use. This is not a memory leak. We also have JBoss/Tomcat on our Linux server and we did research on this issue a while back.
I bet those are operating system mem values, not Java mem values. Java uses all the memory up to -Xmx and then starts to garbage collect, to vastly oversimplify. Use jconsole to see what the real Java memory usage is.
To make it simple, the JVM's maximum memory use is equal to MaxPermSize (used permanently while your JVM is running; it holds the class definitions, so it should not grow with the load on your server) + Xmx (the max size of the object heap, which contains all instances of the objects currently live in the JVM) + Xss (thread stack space, which depends on the number of threads running in your JVM and can usually be capped on a server) + direct memory space (set by -XX:MaxDirectMemorySize=xxxx).
So do the math. If you want to be sure you have free memory left, you will have to limit MaxPermSize, Xmx, and the number of threads allowed on your server.
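For example, with the settings in this question, and guessing 100 threads at a 512k stack (both the thread count and the stack size here are purely illustrative): 2048M (Xmx) + 512M (MaxPermSize) + 100 × 512k ≈ 50M (stacks) + direct memory ≈ 2.6G, which lines up with the 2.6g that top reports for the java process.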
Risk is, if the load on your server grows, you can get an OutOfMemoryError...

Why does the Sun JVM continue to consume ever more RSS memory even when the heap, etc sizes are stable?

Over the past year I've made huge improvements in my application's Java heap usage--a solid 66% reduction. In pursuit of that, I've been monitoring various metrics, such as Java heap size, cpu, Java non-heap, etc. via SNMP.
Recently, I've been monitoring how much real memory (RSS, resident set) is used by the JVM, and I am somewhat surprised. The real memory consumed by the JVM seems totally independent of my application's heap size, non-heap, eden space, thread count, etc.
Heap Size as measured by Java SNMP
Java Heap Used Graph http://lanai.dietpizza.ch/images/jvm-heap-used.png
Real memory in KB. (E.g.: 1 million KB = 1 GB)
Java RSS Graph http://lanai.dietpizza.ch/images/jvm-rss.png
(The three dips in the heap graph correspond to application updates/restarts.)
This is a problem for me because all that extra memory the JVM is consuming is 'stealing' memory that could be used by the OS for file caching. In fact, once the RSS value reaches ~2.5-3GB, I start to see slower response times and higher CPU utilization from my application, mostly due to IO wait. At some point, paging to the swap partition kicks in. This is all very undesirable.
So, my questions:
Why is this happening? What is going on "under the hood"?
What can I do to keep the JVM's real memory consumption in check?
The gory details:
RHEL4 64-bit (Linux - 2.6.9-78.0.5.ELsmp #1 SMP Wed Sep 24 ... 2008 x86_64 ... GNU/Linux)
Java 6 (build 1.6.0_07-b06)
Tomcat 6
Application (on-demand HTTP video streaming)
High I/O via java.nio FileChannels
Hundreds to low thousands of threads
Low database use
Spring, Hibernate
Relevant JVM parameters:
-Xms128m
-Xmx640m
-XX:+UseConcMarkSweepGC
-XX:+AlwaysActAsServerClassMachine
-XX:+CMSIncrementalMode
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintGCApplicationStoppedTime
-XX:+CMSLoopWarn
-XX:+HeapDumpOnOutOfMemoryError
How I measure RSS:
ps x -o command,rss | grep java | grep latest | cut -b 17-
This goes into a text file and is read into an RRD database by the monitoring system at regular intervals. Note that ps outputs kilobytes.
The Problem & Solutions:
While in the end it was ATorras's answer that proved ultimately correct, it was kdgregory who guided me down the correct diagnostic path with the use of pmap. (Go vote up both their answers!) Here is what was happening:
Things I know for sure:
My application records and displays data with JRobin 1.4, something I coded into my app over three years ago.
The busiest instance of the application currently creates:
over 1000 new JRobin database files (at about 1.3MB each) within an hour of starting up
~100+ more each day after start-up
The app updates these JRobin database objects once every 15s, if there is something to write.
In the default configuration JRobin:
uses a java.nio-based file access back-end. This back-end memory-maps the files themselves via MappedByteBuffers (see the sketch after this list).
once every five minutes, a JRobin daemon thread calls MappedByteBuffer.force() on every JRobin database's underlying MBB
pmap listed:
6500 mappings
5500 of which were 1.3MB JRobin database files, which works out to ~7.1GB
That last point was my "Eureka!" moment.
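To make the mechanism concrete, here is a minimal sketch of what an NIO file back-end like JRobin's does; the file name and sizes are hypothetical. Every mapping like this occupies address space and shows up in pmap:

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MapAndForce {
    public static void main(String[] args) throws Exception {
        RandomAccessFile file = new RandomAccessFile("sample.rrd", "rw");
        FileChannel channel = file.getChannel();
        // Map ~1.3MB of the file into the process address space.
        MappedByteBuffer mbb =
                channel.map(FileChannel.MapMode.READ_WRITE, 0, 1363148);
        mbb.putLong(0, System.currentTimeMillis()); // a database 'update'
        mbb.force(); // flush the mapping's dirty pages back to disk
        channel.close();
        file.close();
    }
}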
My corrective actions:
Consider updating to the latest JRobinLite 1.5.2 which is apparently better
Implement proper resource handling on JRobin databases. At the moment, my application creates a database and then never releases it, even after the database is no longer actively used.
Experiment with moving the MappedByteBuffer.force() call to database update events rather than a periodic timer. Will the problem magically go away?
Immediately, change the JRobin back-end to the java.io implementation - a one-line change. This will be slower, but it is possibly not an issue. Here is a graph showing the immediate impact of this change.
Java RSS memory used graph http://lanai.dietpizza.ch/images/stackoverflow-rss-problem-fixed.png
Questions that I may or may not have time to figure out:
What is going on inside the JVM with MappedByteBuffer.force()? If nothing has changed, does it still write the entire file? Part of the file? Does it load it first?
Is there a certain amount of the MBB always in RSS at all times? (RSS was roughly half the total allocated MBB sizes. Coincidence? I suspect not.)
If I move the MappedByteBuffer.force() to database update events, and not a periodic timer, will the problem magically go away?
Why was the RSS slope so regular? It does not correlate to any of the application load metrics.
Just an idea: NIO buffers are placed outside the JVM.
EDIT:
As of 2016 it's worth considering Lari Hotari's comment [ Why does the Sun JVM continue to consume ever more RSS memory even when the heap, etc sizes are stable? ], because back in 2009, RHEL4 had glibc < 2.10 (~2.3).
Regards.
RSS represents pages that are actively in use -- for Java, it's primarily the live objects in the heap, and the internal data structures in the JVM. There's not much that you can do to reduce its size except use fewer objects or do less processing.
In your case, I don't think it's an issue. The graph appears to show 3 meg consumed, not 3 gig as you write in the text. That's really small, and is unlikely to be causing paging.
So what else is happening in your system? Is it a situation where you have lots of Tomcat servers, each consuming 3M of RSS? You're throwing in a lot of GC flags; do they indicate the process is spending most of its time in GC? Do you have a database running on the same machine?
Edit in response to comments
Regarding the 3M RSS size - yeah, that seemed too low for a Tomcat process (I checked my box, and have one at 89M that hasn't been active for a while). However, I don't necessarily expect it to be > heap size, and I certainly don't expect it to be almost 5 times heap size (you use -Xmx640m) -- it should at worst be heap size + some per-app constant.
Which causes me to suspect your numbers. So, rather than a graph over time, please run the following to get a snapshot (replace 7429 by whatever process ID you're using):
ps -p 7429 -o pcpu,cutime,cstime,cmin_flt,cmaj_flt,rss,size,vsize
(Edit by Stu so we can have formatted results to the above request for ps info:)
[stu#server ~]$ ps -p 12720 -o pcpu,cutime,cstime,cmin_flt,cmaj_flt,rss,size,vsize
%CPU - - - - RSS SZ VSZ
28.8 - - - - 3262316 1333832 8725584
Edit to explain these numbers for posterity
RSS, as noted, is the resident set size: the pages in physical memory. SZ holds the number of pages writable by the process (the commit charge); the manpage describes this value as "very rough". VSZ holds the size of the virtual memory map for the process: writable pages plus shared pages.
Normally, VSZ is slightly > SZ, and very much > RSS. This output indicates a very unusual situation.
Elaboration on why the only solution is to reduce objects
RSS represents the number of pages resident in RAM -- the pages that are actively accessed. With Java, the garbage collector will periodically walk the entire object graph. If this object graph occupies most of the heap space, then the collector will touch every page in the heap, requiring all of those pages to become memory-resident. The GC is very good about compacting the heap after each major collection, so if you're running with a partially full heap, most of the pages should not need to be in RAM.
And some other options
I noticed that you mentioned having hundreds to low thousands of threads. The stacks for these threads will also add to the RSS, although it shouldn't be much. Assuming that the threads have a shallow call depth (typical for app-server handler threads), each should only consume a page or two of physical memory, even though there's a half-meg commit charge for each.
Why is this happening? What is going on "under the hood"?
The JVM uses more memory than just the heap. For example, Java methods, thread stacks and native handles are allocated in memory separate from the heap, as are JVM internal data structures.
In your case, possible causes of trouble may be: NIO (already mentioned), JNI (already mentioned), or excessive thread creation.
About JNI: you wrote that the application wasn't using JNI, but... what type of JDBC driver are you using? Could it be a type 2 driver, and leaking? It's very unlikely though, as you said database usage was low.
About excessive thread creation: each thread gets its own stack, which may be quite large. The stack size actually depends on the VM, OS and architecture; e.g. for JRockit it's 256K on Linux x64 (I didn't find the equivalent reference in Sun's documentation for Sun's VM). This impacts thread memory directly (thread memory = thread stack size * number of threads). And if you create and destroy lots of threads, the memory is probably not reused.
What can I do to keep the JVM's real memory consumption in check?
To be honest, hundreds to low thousands of threads seems enormous to me. That said, if you really need that many threads, the thread stack size can be configured via the -Xss option. This may reduce the memory consumption, but I don't think it will solve the whole problem. I tend to think that there is a leak somewhere when I look at the real memory graph.
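For example, capping each stack at 256k on the startup line (the value is purely illustrative; too small a stack will cause StackOverflowErrors, and the "..." is the rest of your existing launch line):

java -Xss256k -Xms128m -Xmx640m ...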
The current garbage collector in Java is well known for not releasing allocated memory, even when the memory is not required anymore. It's quite strange, however, that your RSS size increases to >3GB although your heap size is limited to 640MB. Are you using any native code in your application, or do you have the native performance optimization pack for Tomcat enabled? In that case, you may of course have a native memory leak in your code or in Tomcat.
With Java 6u14, Sun introduced the new "Garbage-First" garbage collector, which is able to release memory back to the operating system if it's not required anymore. It's still categorized as experimental and not enabled by default, but if it is a feasible option for you, I would try to upgrade to the newest Java 6 release and enable the new garbage collector with the command line arguments "-XX:+UnlockExperimentalVMOptions -XX:+UseG1GC". It might solve your problem.

Java memory usage with native processes

What is the best way to tune a server application written in Java that uses a native C++ library?
The environment is a 32-bit Windows machine with 4GB of RAM. The JDK is Sun 1.5.0_12.
The Java process is given 1024MB of memory (-Xmx) at startup but I often see OutOfMemoryErrors due to lack of heap space. If the memory is increased to 1200MB, the OutOfMemoryErrors occur due to lack of swap space. How is the memory shared between the JVM and the native process?
Does the Windows /3GB switch have any effect with native processes and Sun JVM?
I had lots of trouble with that setting (Java on 32-bit systems, msw and others), and all of it was solved by reserving just under 1GB of RAM for the JVM.
Otherwise, as stated, the actual occupied memory in the system for that process would be over 2GB; at that point I was having 'silent deaths' of the process - no errors, no warnings, just the process terminating very quietly.
I got more stability and performance running several JVMs (each with under 1GB of RAM) on the same system.
I found some info on JNI memory management here, and here's the JVM JNI section on memory management.
Well, having a 3GB user space instead of a 2GB user space should help, but if you're having problems running out of swap space at 2GB, I think 3GB is just going to make it worse. How big is your pagefile? Is it maxed out?
You can get a better idea of your heap allocation by hooking jconsole up to your JVM.
How is the memory shared between the JVM and the native process?
Sun's JVM's garbage collector is mark-and-sweep, with options to enable concurrent and incremental GC.
Well, more accurately, it's generational, and the above only applies to tenured (long-lived) objects. For young objects, GC is still done with a stop-and-copy collector, which is much better for working with short-lived objects (and all typical Java programs create many short-lived objects).
A copying collector walks over all elements in the heap, copying them to a new heap if they are referenced, and then discards the former heap. Thus 1M of live objects requires up to 2M of real memory: if every object is alive, there will be two copies of everything during garbage collection.
So the JVM requires far more system memory than is available to the code running within the VM, because there is a substantial overhead to management and garbage collection.
Does the Windows /3GB switch have any effect with native processes and Sun JVM?
The /3GB switch allows the user virtual memory address space to be 3GB, but only for executables whose headers are marked with IMAGE_FILE_LARGE_ADDRESS_AWARE. As far as I am aware, Sun's java.exe is not. I don't have a Windows system here, so I can't verify.
You haven't explained your problem well enough, unfortunately. The real question is -- why is the Java process growing so much? Do you have a memory leak? Do you have a real reason to have that much data in the JVM?
Is the C++ library allocating its own memory from the C stack, or is it allocating memory from the Java object space, or is it doing something else entirely?
