Java FileChannel missing unmap ( RAM consequences ? )

I'm creating and using memory-mapped files in FileChannel.MapMode.READ_WRITE mode in my application. Those files are created and deleted throughout the life cycle of the application.
As the GC does not necessarily free the direct buffers promptly, and so does not unmap the underlying OS mappings, I'm wondering what the consequences are for the OS, and more specifically for RAM usage.
I understand that the "Virtual Memory" of the process stays polluted by unnecessary mappings, but what are the consequences for actual RAM usage? (I guess pages in "Resident Memory" are flushed over time.)
It seems the process can run out of memory at the OS level (crashing the JVM) rather than hitting a Java OutOfMemoryError (there is still plenty of space in the heap).
I'm on a 64-bit Linux box (3.13.0-68-generic / Ubuntu) using the Oracle JRE 1.8.0_66-b17.

Address space is a resource that you can run out of independently (somewhat) of usable memory.
With disk-backed file mappings, you only consume as much memory as you have page cache (cached reads and dirty pages awaiting write). But the reserved address space is the size of the whole mapping.
You also consume file handles, and likely run out of those first.
Java does munmap mappings when they are GCed -- however, that means that it happens on the GC's schedule not yours. This usually is okay, as long as it releases address space quicker than you allocate it. But much like file descriptors or any other finite resource, you definitely can run out.
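As a minimal sketch of the lifecycle described above: the class name, temp-file prefix, and mapping size below are made up for illustration. The point it shows is that a mapping outlives the channel that created it, so the address space stays reserved until the MappedByteBuffer itself is garbage collected.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of the original question.
final class MappedFileDemo {

    /** Maps a temp file READ_WRITE, stores one long at offset 0 and reads it back. */
    static long roundTrip(long value, int mappingSize) {
        try {
            Path file = Files.createTempFile("mapped-demo", ".bin");
            file.toFile().deleteOnExit();

            MappedByteBuffer buffer;
            try (FileChannel channel = FileChannel.open(
                    file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Reserves mappingSize bytes of address space and grows the file to match.
                buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, mappingSize);
            }
            // The channel is closed here, but the mapping is still valid:
            // it is only released once the buffer object is garbage collected.
            buffer.putLong(0, value);
            buffer.force(); // ask the OS to write dirty pages back to the file
            return buffer.getLong(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42L, 4096));
    }
}
```

Dropping the last reference to the buffer is the only portable way to let the mapping go, which is why the timing is up to the GC.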
With 64 bits, that'd take a while. It's much more of an issue on 32 bit systems.
There are lots of calls for improved control of mappings, since relying on finalizers isn't great. But it's really hard to do better without compromising performance or security. To quote Sun's evaluation of the problem:
There is no unmap() method on mapped byte buffers because there is no known
technique for providing one without running into insurmountable security and
performance issues.

Related

out-of-memory error -- why not paging?

Out-of-memory errors occur frequently in Java programs. My question is simple: when the memory limit is exceeded, why does Java kill the program directly rather than swapping it out to disk? I think memory paging/swapping is widely used in modern operating systems, and programming languages like C++ definitely support swapping. Thanks.

@Pimgd is sort of on track, but @Kayaman is right. Java doesn't handle memory beyond requesting it from the system. C++ doesn't support swapping either; it requests memory from the OS, and the OS does the swapping. If you request enough memory for your application with -Xmx, it might start swapping because the OS thinks it can.

Because Java is cross-platform. There might not be a disk.
Other reasons could be that such a thing would affect performance and the developers didn't want that to happen (because Java already carries a performance overhead?).

A few words about paging. Virtual memory using paging (storing 4K or similar chunks of any program that runs on a system) is something an operating system may or may not do. The promise of an address space limited only by the capacity of a machine word used to store an address sounds great, but there's a severe downside, which is called thrashing. This happens when the number of page (re)loads exceeds a certain frequency, which in turn is due to too many processes requesting too much memory in combination with non-locality of memory accesses of those processes. (A process has good locality if it can execute long stretches of code while accessing only a small percentage of its pages.)
Paging also requires (fast) secondary storage.
The ability to limit your program's memory resources (as in Java) is not only a burden; it must also be seen as a blessing when some overall plan for resource usage needs to be devised for a, say, server system.

Burst memory usage in Java

I am trying to get a handle on proper memory usage and garbage collection in Java. I'm not a novice programmer by any means, but it always seems to me that once Java touches some memory, it will never be released for other applications to use. In that case, you have to make sure your peak memory is never too high, or your application will continually use whatever the peak memory usage was.
I wrote a small sample program trying to demonstrate this. It basically has 4 buttons:
Fill the class-scope variable BigList = new ArrayList<String>() with about 25,000,000 long string items.
Call BigList.clear().
Reallocate the list with BigList = new ArrayList<String>() again (to shrink the list size).
Call System.gc(). Yes, I know this doesn't mean that GC will really run, but it's what we have.
So next I did some testing on Windows, Linux, and Mac OS, using the default task monitors to check each process's reported memory usage. Here is what I found:
Windows - Pumping the list, calling clear, and then calling GC several times will not reduce memory usage at all. However, reallocating the list using new and then calling GC several times will reduce the memory usage back to starting levels. IMO, this is acceptable.
Linux (I used the Mint 11 distro with the Sun JVM) - Same results as Windows.
Mac OS - I followed the same steps as above, but even when reinitializing the list, calls to GC seemingly have no effect. The program will sit using hundreds of MB of RAM even though I have nothing in memory.
Can anyone explain this to me? Some people have told me some stuff about "heap" memory, but I still don't fully understand it and I'm not sure it applies here. From what I have heard about it, I shouldn't be seeing the behavior I am on Windows and Linux anyways.
Is this just a difference in the way Mac OS's Activity Monitor measures memory usage or is there something else going on? I would prefer to not have my program idling with tons of RAM usage. Thanks for your insight.

The Sun/Oracle JVM does not return unneeded memory to the system. If you give it a large, maximum heap size, and you actually use that heap space at some point, the JVM won't give it back to the OS for other uses. Other JVMs will do that (JRockit used to, but I don't think it does any more).
So, for Oracle's JVM you need to tune your app and your system for peak usage; that's just how it works. If the memory that you're using can be managed with byte arrays (such as when working with images), then you can use mapped byte buffers instead of Java byte arrays. Mapped byte buffers are taken straight from the system and are not part of the heap. When you free up these objects (AND they are GC'd, I believe, but I'm not sure), the memory will be returned to the system. You'll likely have to experiment with that one, assuming it's even applicable at all.
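One way to see that such buffers live outside the heap is to ask the JVM's buffer pool beans. The sketch below assumes a HotSpot-style JVM, where the pools are named "direct" and "mapped"; the class and method names are made up for illustration.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical helper; pool names "direct" and "mapped" are what HotSpot reports.
final class BufferPoolReport {

    /** Returns bytes used by the named buffer pool, or -1 if this JVM has no such pool. */
    static long bytesUsed(String poolName) {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if (pool.getName().equals(poolName)) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        // 1 MiB allocated outside the Java heap.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(1 << 20);
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.printf("%-8s buffers=%d used=%d bytes%n",
                    pool.getName(), pool.getCount(), pool.getMemoryUsed());
        }
        // Referencing the buffer here keeps it reachable while the pools are printed.
        System.out.println("held buffer capacity: " + offHeap.capacity());
    }
}
```

The numbers only drop again after the buffer objects have been collected, which matches the "AND they are GC'd" caveat above.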

... but it always seems to me that once Java touches some memory, it's gone forever. You will never get it back.
It depends on what you mean by "gone forever".
I've also heard it said that some JVMs do give memory back to the OS when they are ready and able to. Unfortunately, given the way that the low-level memory APIs typically work, the JVM has to give back entire segments, and it tends to be complicated to "evacuate" a segment so that it can be given back.
But I wouldn't rely on that ... because there are various things that could prevent the memory being given back. The chances are that the JVM won't give the memory back to the OS. But it is not "gone forever" in the sense that the JVM will continue to make use of it. Even if the JVM never approaches the peak usage again, all of that memory will help to make the garbage collector run more efficiently.
In that case, you have to make sure your peak memory is never too high, or your application will continually eat up hundreds of MB of RAM.
That is not true. Assuming that you are adopting the strategy of starting with a small heap and letting it grow, the JVM won't ask for significantly more memory than the peak memory. The JVM won't continually eat up more memory ... unless your application has a memory leak and (as a result) its peak memory requirement has no bound.
(The OP's comments below indicate that this is not what he was trying to say. Even so, it is what he did say.)
On the topic of garbage collection efficiency, we can model the cost of a run of an efficient garbage collector as:
cost ~= (amount_of_live_data * W1) + (amount_of_garbage * W2)
where W1 and W2 are (we assume) constants that depend on the collector. (Actually, this is an over-simplification. The first part is not a linear function of the number of live objects. However, I claim that it doesn't matter for the following.)
The efficiency of the collector can then be stated as:
efficiency = cost / amount_of_garbage_collected
which (if we assume that the GC collects all of the garbage) expands to
efficiency ~= (amount_of_live_data * W1) / amount_of_garbage + W2.
When the GC runs,
heap_size ~= amount_of_live_data + amount_of_garbage
so
efficiency ~= W1 * (amount_of_live_data / (heap_size - amount_of_live_data)) + W2.
In other words:
as you increase the heap size, the efficiency tends to a constant (W2), but
you need a large ratio of heap_size to amount_of_live_data for this to happen.
The other point is that for an efficient copying collector, W2 covers just the cost of zeroing the space occupied by the garbage objects in 'from space'. The rest (tracing, copying of live objects to 'to space', and zeroing the 'from space' that they occupied) is part of the first term of the initial equation; i.e. covered by W1. What this means is that W2 is likely to be considerably smaller than W1 ... and that the first term of the final equation is significant for longer.
Now obviously this is a theoretical analysis, and the cost model is a simplification of how real garbage collectors really work. (And it doesn't take account of the "real" work that the application is doing, or the system-level effects of tying down too much memory.) However, the maths tells me that from the standpoint of GC efficiency, a big heap really does help a lot.
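To get a feel for the final formula, here is a small sketch that plugs numbers into it. The weights W1 = 1.0 and W2 = 0.1 and the 100 MB of live data are invented for illustration, not measured from any collector; the result is a cost per unit of garbage collected, so lower is better.

```java
// Hypothetical calculator for the simplified cost model above.
final class GcCostModel {

    /** W1 * (live / (heap - live)) + W2, i.e. GC cost per unit of garbage reclaimed. */
    static double costPerUnitOfGarbage(double liveData, double heapSize, double w1, double w2) {
        if (heapSize <= liveData) {
            throw new IllegalArgumentException("heap must be larger than the live data");
        }
        return w1 * (liveData / (heapSize - liveData)) + w2;
    }

    public static void main(String[] args) {
        double w1 = 1.0;     // assumed weight for tracing/copying live data
        double w2 = 0.1;     // assumed weight for reclaiming garbage
        double live = 100.0; // assumed live data, in MB
        for (double heap : new double[] {200, 400, 1600, 6400}) {
            System.out.printf("heap=%.0f MB -> cost per MB of garbage = %.3f%n",
                    heap, costPerUnitOfGarbage(live, heap, w1, w2));
        }
    }
}
```

With these assumed weights the value falls towards W2 as the heap grows, but only slowly once the heap is already many times the live data.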

Some JVMs do not, or are not able to, release previously acquired memory back to the host OS when it is no longer needed, because doing so is a costly and complex task. The garbage collector only manages the heap memory within the Java virtual machine, so it does not give memory back to the OS (free() in C terms). For example, if a big object isn't used any more, the GC marks its memory as free within the JVM's heap rather than releasing it to the OS.
However, the situation is changing, for example ZGC will return memory to the operating system.

Once the program terminates, does the memory usage go down in Task Manager on Windows? I think the memory is being released but is not shown as released by the default task monitors of the OS you are using. Have a look at this C++ question: "Problem with deallocating vector of pointers".

A common misconception is that Java uses up memory as it runs and therefore should be able to return memory to the OS. Actually, the Oracle/Sun JVM reserves its virtual memory as one contiguous block as soon as it starts. If there isn't enough contiguous virtual memory available, it fails on startup, even if the program isn't going to use that much.
What then happens is that the OS is smart enough not to allocate physical memory to the program until it is used. The OS cannot easily reclaim that memory, but it can swap it to disk if it needs to and the memory hasn't been used for a while. Java doesn't handle having parts of the heap swapped to disk very well, so this should be avoided.

Java allocates memory only for objects. There is no explicit allocation of memory. In fact, Java even treats array types as objects. Each time an object is created, it goes on the heap.
The Java runtime employs a garbage collector that reclaims the memory occupied by an object once it determines that the object is no longer accessible. This is an automatic process.
Calling System.gc() may not collect garbage at the time you call it; that's why your memory usage is not reduced. In general, it is better to let the system decide when it needs to collect the heap, and whether or not to do a full collection.
System.gc() doesn't even force a garbage collection; it's simply a hint to the JVM that "now may be a good time to clean up a bit".

There are some great documents produced by Sun/Oracle describing Java's garbage collection. A quick search for "Java Garbage Collection Tuning" yields results such as:
http://www.oracle.com/technetwork/java/gc-tuning-5-138395.html
and
http://java.sun.com/docs/hotspot/gc1.4.2/
The introduction of the Oracle doc states:
The Java™ 2 Platform Standard Edition (J2SE™ platform) is used for
a wide variety of applications from small applets on desktops to web
services on large servers. In the J2SE platform version 1.4.2 there
were four garbage collectors from which to choose but without an
explicit choice by the user the serial garbage collector was always
chosen. In version 5.0 the choice of the collector is based on the
class of the machine on which the application is started.
This “smarter choice” of the garbage collector is generally better but
is not always the best. For the user who wants to make their own
choice of garbage collectors, this document will provide information
on which to base that choice. This will first include the general
features of the garbage collections and tuning options to take the
best advantage of those features. The examples are given in the
context of the serial, stop-the-world collector. Then specific
features of the other collectors will be discussed along with factors
that should considered when choosing one of the other collectors.
They describe the various types of collectors available and the situations in which they should be used. I remember using this alongside JConsole to monitor how the application performed when started with various options.
These docs will give you a bit more insight into how collection occurs depending on the parameters you are using.

I ran into this problem on Windows and have found a solution, so I'm posting it as an answer in case it can help others.
A lot of answers on here suggest that Java's behavior is 1. good and/or 2. an unavoidable consequence of garbage collecting. These are both false.
The Problem:
If you are like me and want to use Java to write small applications for a workstation, or even to run multiple smaller processes on a server, then Oracle's JVM memory allocation behavior makes it almost completely useless. Even when running with -client, every JVM process hoards memory once allocated and never gives it back. This behavior cannot be disabled. As the OP notices, each JVM process holds on to its unused memory indefinitely, even if it will never use it again and even while other JVM processes are starving. This inexplicable behavior makes Oracle's a useless implementation for all but monolithic, single-application scenarios.
Also: this is NOT a consequence of garbage collection. Witness .Net applications which run on Windows, use garbage collection, and do not suffer from this problem at all.
The Solution:
The solution I found to this was to use the IKVM.NET JVM which you use as a drop-in replacement for java.exe on windows. It compiles Java bytecode to .Net IL code and runs as a .Net process. It also contains utilities to convert .jar files into .Net .dll and .exe assemblies. The performance is often better than Oracle's JVM and after a GC, memory is instantly returned to the OS. (Note: this also works in Linux with Mono)
To be clear, I still rely on Oracle's JVM for all but my small applications and also to debug my small applications, but once stable, I use ikvm to run them as if they were native windows applications and this works so well, I've been amazed. It has numerous beneficial side effects. Once compiled, DLLs shared between processes are loaded only once, and applications show up in the task manager as .exe instead of all showing as javaw.exe.
Unfortunately, not everyone can use ikvm to solve this problem, but I hope this helps those in my situation.

Java: why does it use a fixed amount of memory? or how does it manage the memory?

It seems that the JVM uses some fixed amount of memory. At least I have often seen parameters -Xmx (for the maximum size) and -Xms (for the initial size) which suggest that.
I got the feeling that Java applications don't handle memory very well. Some things I have noticed:
Even some very small sample demo applications load huge amounts of memory. Maybe this is because of the Java library which is loaded. But why is it needed to load the library for each Java instance? (It seems that way because multiple small applications linearly take more memory. See here for some details where I describe this problem.) Or why is it done that way?
Big Java applications like Eclipse often crash with some OutOfMemory exception. This was always strange because there was still plenty of memory available on my system. Often, they consume more and more memory over runtime. I'm not sure if they have some memory leaks or if this is because of fragmentation in the memory pool -- I got the feeling that the latter is the case.
The Java library seems to require much more memory than similarly powerful libraries such as Qt. Why is this? (To compare, start some Qt applications and look at their memory usage, then start some Java apps.)
Why doesn't it just use the underlying system techniques like malloc and free? Or if they don't like the libc implementation, they could use jemalloc (as in FreeBSD and Firefox), which seems to be quite good. I am quite sure that this would perform better than the JVM memory pool. And not only perform better, but also require less memory, especially for small applications.
Addition: Has anybody tried that already? I would be very interested in an LLVM-based JIT compiler for Java which just uses malloc/free for memory handling.
Or maybe this also differs from JVM implementation to implementation? I have used mostly the Sun JVM.
(Also note: I'm not directly speaking about the GC here. The GC is only responsible to calculate what objects can be deleted and to initialize the memory freeing but the actual freeing is a different subsystem. Afaik, it is some own memory pool implementation, not just a call to free.)
Edit: A very related question: Why does the (Sun) JVM have a fixed upper limit for memory usage? Or to put it differently: Why does JVM handle memory allocations differently than native applications?

You need to keep in mind that the Garbage Collector does a lot more than just collecting unreachable objects. It also optimizes the heap space and keeps track of exactly where there is memory available to allocate for the creation of new objects.
Knowing immediately where there is free memory makes the allocation of new objects into the young generation efficient, and prevents the need to run back and forth to the underlying OS. The JIT compiler also optimizes such allocations away from the JVM layer, according to Sun's Jon Masamitsu:
Fast-path allocation does not call
into the JVM to allocate an object.
The JIT compilers know how to allocate
out of the young generation and code
for an allocation is generated in-line
for object allocation. The interpreter
also knows how to do the allocation
without making a call to the VM.
Note that the JVM goes to great lengths to try to get large contiguous memory blocks as well, which likely have their own performance benefits (See "The Cost of Missing the Cache"). I imagine calls to malloc (or the alternatives) have a limited likelihood of providing contiguous memory across calls, but maybe I missed something there.
Additionally, by maintaining the memory itself, the Garbage Collector can make allocation optimizations based on usage and access patterns. Now, I have no idea to what extent it does this, but given that there's a registered Sun patent for this concept, I imagine they've done something with it.
Keeping these memory blocks allocated also provides a safeguard for the Java program. Since the garbage collection is hidden from the programmer, they can't tell the JVM "No, keep that memory; I'm done with these objects, but I'll need the space for new ones." By keeping the memory, the GC doesn't risk giving up memory it won't be able to get back. Naturally, you can always get an OutOfMemoryError either way, but it seems more reasonable not to needlessly give memory back to the operating system every time you're done with an object, since you already went to the trouble to get it for yourself.
All of that aside, I'll try to directly address a few of your comments:
Often, they consume more and more
memory over runtime.
Assuming that this isn't just what the program is doing (for whatever reason, maybe it has a leak, maybe it has to keep track of an increasing amount of data), I imagine that it has to do with the free heap space ratio defaults set by the (Sun/Oracle) JVM. The default value for -XX:MinHeapFreeRatio is 40%, while -XX:MaxHeapFreeRatio is 70%. This means that any time less than 40% of the heap space is free, the heap will be resized by claiming more memory from the operating system (provided that this won't exceed -Xmx). Conversely, it will only* free heap memory back to the operating system if the free space exceeds 70%.
Consider what happens if I run a memory-intensive operation in Eclipse; profiling, for example. My memory consumption will shoot up, resizing the heap (likely multiple times) along the way. Once I'm done, the memory requirement falls back down, but it likely won't drop so far that 70% of the heap is free. That means that there's now a lot of underutilized space allocated that the JVM has no intention of releasing. This is a major drawback, but you might be able to work around it by customizing the percentages to your situation. To get a better picture of this, you really should profile your application so you can see the utilized versus allocated heap space. I personally use YourKit, but there are many good alternatives to choose from.
*I don't know if this is actually the only time and how this is observed from the perspective of the OS, but the documentation says it's the "maximum percentage of heap free after GC to avoid shrinking," which seems to suggest that.
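Short of a full profiler, the utilized versus allocated heap can be read from inside the process with the standard Runtime methods. The sketch below is only a rough illustration; the class name and the freeRatio() helper are made up, and the free ratio it prints is the quantity that the two -XX ratio flags are compared against after a GC.

```java
// Hypothetical snapshot helper built on java.lang.Runtime.
final class HeapSnapshot {
    final long used;      // bytes occupied by objects (live or not yet collected)
    final long committed; // bytes the JVM currently holds for the heap
    final long max;       // upper bound the heap may grow to (-Xmx)

    private HeapSnapshot(long used, long committed, long max) {
        this.used = used;
        this.committed = committed;
        this.max = max;
    }

    static HeapSnapshot take() {
        Runtime rt = Runtime.getRuntime();
        long committed = rt.totalMemory();
        long free = Math.min(rt.freeMemory(), committed);
        return new HeapSnapshot(committed - free, committed, rt.maxMemory());
    }

    /** Fraction of the committed heap that is currently free, between 0 and 1. */
    double freeRatio() {
        return (double) (committed - used) / committed;
    }

    public static void main(String[] args) {
        HeapSnapshot s = take();
        System.out.printf("used=%d MB committed=%d MB max=%d MB free=%.0f%%%n",
                s.used >> 20, s.committed >> 20, s.max >> 20, s.freeRatio() * 100);
    }
}
```

A large gap between committed and used that persists after several collections is the underutilized space described above.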
Even some very small sample demo
applications load huge amounts of
memory.
I guess this depends on what kind of applications they are. I feel that Java GUI applications run memory-heavy, but I don't have any evidence one way or another. Did you have a specific example that we could look at?
But why is it needed to load the
library for each Java instance?
Well, how would you handle loading multiple Java applications if not creating new JVM processes? The isolation of the processes is a good thing, which means independent loading. I don't think that's so uncommon for processes in general, though.
As a final note, the slow start times you asked about in another question likely come from several initial heap reallocations necessary to get to the baseline application memory requirement (due to -Xms and -XX:MinHeapFreeRatio), depending on what the default values are with your JVM.

Java runs inside a Virtual Machine, which constrains many parts of its behavior. Note the term "Virtual Machine." It is literally running as though the machine is a separate entity, and the underlying machine/OS are simply resources. The -Xmx value is defining the maximum amount of memory that the VM will have, while the -Xms defines the starting memory available to the application.
The VM is a product of the binary being system agnostic - this was a solution used to allow the byte code to execute wherever. This is similar to an emulator - say for old gaming systems. It is emulating the "machine" that the game runs on.
The reason you run into an OutOfMemoryError is that the Virtual Machine has hit the -Xmx limit: it has literally run out of memory.
As far as smaller programs go, the VM will often account for a larger percentage of their memory. Also, Java has a default -Xmx and -Xms (I don't remember what they are right now) that it will always start with. The overhead of the VM and the libraries becomes much less noticeable when you start to build and run "real" applications.
The memory argument related to Qt and the like is true, but is not the whole story. While Java uses more memory than some of those, those are compiled for specific architectures. It has been a while since I have used Qt or similar libraries, but I remember the memory management not being very robust, and memory leaks are still common today in C/C++ programs. The nice thing about garbage collection is that it removes many of the common "gotchas" that cause memory leaks. (Note: not all of them. It is still very possible to leak memory in Java, just a bit harder.)
Hope this helps clear up some of the confusion you may have been having.

To answer a portion of your question:
Java at start-up allocates a "heap" of memory, or a fixed size block (the -Xms parameter). It doesn't actually use all this memory right off the bat, but it tells the OS "I want this much memory". Then as you create objects and do work in the Java environment, it puts the created objects into this heap of pre-allocated memory. If that block of memory gets full then it will request a little more memory from the OS, up until the "max heap size" (the -Xmx parameter) is reached.
Once that max size is reached, Java will no longer request more RAM from the OS, even if there is a lot free. If you try to create more objects, there is no heap space left, and you will get an OutOfMemoryError.
Now if you are looking at Windows Task Manager or something like that, you'll see "java.exe" using X megs of memory. That sort of corresponds to the amount of memory that it has requested for the heap, not really the amount of memory inside the heap that's used.
In other words, I could write the application:
class myfirstjavaprog
{
    public static void main(String[] args)
    {
        System.out.println("Hello World!");
    }
}
This would take very little memory. But if I ran it with the command line:
java.exe -Xms1024M myfirstjavaprog
then on startup Java will immediately ask the OS for 1,024 MB of RAM, and that's what will show in Windows Task Manager. In actuality, that RAM isn't being used, but Java has reserved it for later use.
Conversely, if I had an app that tried to create a 10,000 byte large array:
class myfirstjavaprog
{
    public static void main(String[] args)
    {
        byte[] myArray = new byte[10000];
    }
}
but ran it with the command line:
java.exe -Xms100 -Xmx100 myfirstjavaprog
then Java could only allocate up to 100 bytes of memory. Since a 10,000 byte array won't fit into a 100 byte heap, that would throw an OutOfMemoryError, even though the OS has plenty of RAM. (A real HotSpot JVM refuses to start with a heap this tiny, so treat the numbers as illustrative.)
I hope that makes sense...
Edit:
Going back to "why Java uses so much memory": why do you think it's using a lot of memory? If you are looking at what the OS reports, then that isn't what it's actually using; it's only what it has reserved for use. If you want to know what Java has actually used, then you can do a heap dump, explore every object in the heap, and see how much memory each one is using.
To answer "why doesn't it just let the OS handle it?": well, I guess that is a fundamental question for those who designed Java. The way I look at it, Java runs in the JVM, which is a virtual machine. If you create a VMware instance or just about any other "virtualization" of a system, you usually have to specify how much memory that virtual system will/can consume. I consider the JVM to be similar. Also, this abstracted memory model lets the JVMs for different OSes all act in a similar way. So, for example, Linux and Windows have different RAM allocation models, but the JVM can abstract that away and follow the same memory usage for the different OSes.

Java does use malloc and free, or at least the implementations of the JVM may. But since Java tracks allocations and garbage collects unreachable objects, they are definitely not enough.
As for the rest of your text, I'm not sure if there's a question there.

Even some very small sample demo applications load huge amounts of memory. Maybe this is because of the Java library which is loaded. But why is it needed to load the library for each Java instance? (It seems that way because multiple small applications linearly take more memory. See here for some details where I describe this problem.) Or why is it done that way?
That's likely due to the overhead of starting and running the JVM
Big Java applications like Eclipse often crash with some OutOfMemory exception. This was always strange because there was still plenty of memory available on my system. Often, they consume more and more memory over runtime. I'm not sure if they have some memory leaks or if this is because of fragmentation in the memory pool -- I got the feeling that the latter is the case.
I'm not entirely sure what you mean by "often crash," as I don't think this has happened to me in quite a long time. If it is, it's likely due to the "maximum size" setting you mentioned earlier.
Your main question asking why Java doesn't use malloc and free comes down to a matter of target market. Java was designed to eliminate the headache of memory management from the developer. Java's garbage collector does a reasonably good job of freeing up memory when it can be freed, but Java isn't meant to rival C++ in situations with memory restrictions. Java does what it was intended to do (remove developer level memory management) well, and the JVM picks up the responsibility well enough that it's good enough for most applications.

The limits are a deliberate design decision from Sun. I've seen at least two other JVMs which do not have this design: the Microsoft one, and the IBM one for their non-PC AS/400 systems. Both grow as needed, using as much memory as needed.

Java doesn't use a fixed amount of memory; it is always in the range from -Xms to -Xmx.
If Eclipse crashes with OutOfMemoryError, then it required more memory than was granted by -Xmx (a configuration issue).
Java can't just use malloc/free (for object creation), since its memory handling is quite different because of garbage collection (GC). The GC automatically removes unused objects, which is a benefit compared to being responsible for memory management yourself.
For details on this complex topic, see Tuning Garbage Collection.

Java memory usage on Linux

I'm running a handful of Java application servers that are all running the latest versions of Tomcat 6 and Sun's Java 6 on top of CentOS 5.5 Linux. Each server runs multiple instances of Tomcat.
I'm setting the -Xmx450m -XX:MaxPermSize=192m parameters to control how large the heap and permgen will grow. These settings apply to all the Tomcat instances across all of the Java Application servers, totaling about 70 Tomcat instances.
Here is a typical memory usage of one of those Tomcat instances as reported by Psi-probe
Eden = 13M
Survivor = 1.5M
Perm Gen = 122M
Code Cache = 19M
Old Gen = 390M
Total = 537M
CentOS, however, is reporting RAM usage for this particular process at 707M (according to RSS), which leaves 170M of RAM unaccounted for.
I am aware that the JVM itself and some of its dependency libraries must be loaded into memory, so I decided to fire up pmap -d to find out their memory footprint.
According to my calculations that accounts for about 17M.
Next there is the Java thread stack, which is 320k per thread on the 32-bit JVM for Linux.
Again, I use Psi-probe to count the number of threads on that particular JVM, and the total is 129 threads. So 129 * 320k is about 42M.
I've read that NIO uses memory outside of the heap, but we don't use NIO in our applications.
So here I've calculated everything that comes to (my) mind. And I've only accounted for 60M of the "missing" 170M.
What am I missing?
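The accounting described in the question can also be done from inside the JVM. The sketch below lists the memory pools (the same kind of figures a monitoring tool reports) and multiplies the live thread count by a per-thread stack size. The class name FootprintEstimate is hypothetical, and the 320 KB figure is simply the value quoted in the question, not something the code discovers.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

final class FootprintEstimate {
    /** Worst-case stack reservation in KB: thread count times per-thread stack size. */
    static long stackKb(int threads, long stackKbPerThread) {
        return threads * stackKbPerThread;
    }

    public static void main(String[] args) {
        long pooledBytes = 0;
        // Heap and non-heap pools: eden, survivor, old gen, code cache, and so on.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            pooledBytes += usage.getUsed();
            System.out.println(pool.getName() + " = " + usage.getUsed() / 1024 + "K");
        }
        System.out.println("pools total = " + pooledBytes / 1024 + "K");

        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        // Assumption: 320 KB per thread, as quoted in the question.
        System.out.println("threads = " + threads
                + ", stack estimate = " + stackKb(threads, 320) + "K");
    }
}
```

For the numbers in the question, stackKb(129, 320) gives 41280 KB, which is an upper bound: stack pages only count toward RSS once they have been touched.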

Try using the incremental garbage collector, using the -Xincgc command line option.
It's a little more aggressive on the whole GC effort, and has a special happy little anomaly: it actually hands back some of its unused memory to the OS, unlike the default and other GC choices!
This makes the JVM consume a lot less memory, which is especially good if you're running multiple JVMs on one machine. It comes at the expense of some performance, but you might not notice it. The incgc is a little secret it seems, because no one ever brings it up... It's been there for eons (since the 90's even).

Arnar, during initialization the JVM allocates memory (via mmap or malloc) of the size specified by -Xmx and MaxPermSize, so the JVM will allocate 450+192=642m of heap space for the application at the start of the JVM process. So the Java heap space for the application is not 537m but 642m. If you redo the calculation with that figure, it will give you your missing memory. Hope it helps.

Java allocates as much virtual memory as it might need up front; the resident size, however, will be how much you actually use. Note: many of the libraries and threads have their own overheads, and while you don't use direct memory, it doesn't mean none of the underlying systems do. E.g. if you use NIO, it will use some direct memory even if you use heap ByteBuffers.
Lastly, 100 MB is worth about £8. It may be that it's not worth spending too much time worrying about it.
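To check whether anything in the process is using direct or mapped buffers, the JVM exposes buffer pool statistics through BufferPoolMXBean. This is a minimal sketch that assumes Java 7 or later (the API does not exist in the Java 6 mentioned in the question); the class name BufferPools is made up, and the pool names "direct" and "mapped" are the ones HotSpot reports.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

final class BufferPools {
    /** Returns the bytes used by the named NIO buffer pool, or -1 if no such pool exists. */
    static long memoryUsed(String poolName) {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if (pool.getName().equals(poolName)) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Allocate 1 MB outside the Java heap so the "direct" pool is non-empty.
        ByteBuffer direct = ByteBuffer.allocateDirect(1024 * 1024);
        System.out.println("direct pool: " + memoryUsed("direct") + " bytes");
        System.out.println("mapped pool: " + memoryUsed("mapped") + " bytes");
        System.out.println("held buffer capacity: " + direct.capacity());
    }
}
```

A non-zero direct pool in an application that never calls allocateDirect itself would point at a library or the JDK doing so on its behalf.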

Not a direct answer, but, have you also considered hosting multiple sites within the same Tomcat instance? This could save you some memory at the expense of some additional configuration.

Arnar, the JVM also mmap's all jar files in use, which will use NIO and will contribute to the RSS. I don't believe those are accounted for in any of your measurements above. Do you by chance have a significant number of large jar files? If so, the pages used for those could be your missing memory.

Why does the Sun JVM continue to consume ever more RSS memory even when the heap, etc sizes are stable?

Over the past year I've made huge improvements in my application's Java heap usage--a solid 66% reduction. In pursuit of that, I've been monitoring various metrics, such as Java heap size, cpu, Java non-heap, etc. via SNMP.
Recently, I've been monitoring how much real memory (RSS, resident set) is used by the JVM and am somewhat surprised. The real memory consumed by the JVM seems totally independent of my application's heap size, non-heap, eden space, thread count, etc.
Heap Size as measured by Java SNMP
Java Heap Used Graph http://lanai.dietpizza.ch/images/jvm-heap-used.png
Real memory in KB (e.g. a value of 1M KB = 1 GB)
Java Heap Used Graph http://lanai.dietpizza.ch/images/jvm-rss.png
(The three dips in the heap graph correspond to application updates/restarts.)
This is a problem for me because all that extra memory the JVM is consuming is 'stealing' memory that could be used by the OS for file caching. In fact, once the RSS value reaches ~2.5-3GB, I start to see slower response times and higher CPU utilization from my application, mostly due to IO wait. At some point paging to the swap partition kicks in. This is all very undesirable.
So, my questions:
Why is this happening? What is going on "under the hood"?
What can I do to keep the JVM's real memory consumption in check?
The gory details:
RHEL4 64-bit (Linux - 2.6.9-78.0.5.ELsmp #1 SMP Wed Sep 24 ... 2008 x86_64 ... GNU/Linux)
Java 6 (build 1.6.0_07-b06)
Tomcat 6
Application (on-demand HTTP video streaming)
High I/O via java.nio FileChannels
Hundreds to low thousands of threads
Low database use
Spring, Hibernate
Relevant JVM parameters:
-Xms128m
-Xmx640m
-XX:+UseConcMarkSweepGC
-XX:+AlwaysActAsServerClassMachine
-XX:+CMSIncrementalMode
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintGCApplicationStoppedTime
-XX:+CMSLoopWarn
-XX:+HeapDumpOnOutOfMemoryError
How I measure RSS:
ps x -o command,rss | grep java | grep latest | cut -b 17-
This goes into a text file and is read into an RRD database by the monitoring system at regular intervals. Note that ps outputs kilobytes.
The Problem & Solutions:
While in the end it was ATorras's answer that proved ultimately correct, it was kdgregory who guided me to the correct diagnostics path with the use of pmap. (Go vote up both their answers!) Here is what was happening:
Things I know for sure:
My application records and displays data with JRobin 1.4, something I coded into my app over three years ago.
The busiest instance of the application currently creates
Over 1000 new JRobin database files (at about 1.3MB each) within an hour of starting up
~100+ each day after start-up
The app updates these JRobin database objects once every 15s, if there is something to write.
In the default configuration JRobin:
uses a java.nio-based file access back-end. This back-end maps MappedByteBuffers to the files themselves.
once every five minutes a JRobin daemon thread calls MappedByteBuffer.force() on every JRobin underlying database MBB
pmap listed:
6500 mappings
5500 of which were 1.3MB JRobin database files, which works out to ~7.1GB
That last point was my "Eureka!" moment.
My corrective actions:
Consider updating to the latest JRobinLite 1.5.2 which is apparently better
Implement proper resource handling on JRobin databases. At the moment, my application creates a database and then never dumps it once the database is no longer actively used.
Experiment with moving the MappedByteBuffer.force() to database update events, and not a periodic timer. Will the problem magically go away?
Immediately, change the JRobin back-end to the java.io implementation, a one-line change. This will be slower, but it is possibly not an issue. Here is a graph showing the immediate impact of this change.
Java RSS memory used graph http://lanai.dietpizza.ch/images/stackoverflow-rss-problem-fixed.png
Questions that I may or may not have time to figure out:
What is going on inside the JVM with MappedByteBuffer.force()? If nothing has changed, does it still write the entire file? Part of the file? Does it load it first?
Is there a certain amount of the MBB always in RSS at all times? (RSS was roughly half the total allocated MBB sizes. Coincidence? I suspect not.)
If I move the MappedByteBuffer.force() to database update events, and not a periodic timer, will the problem magically go away?
Why was the RSS slope so regular? It does not correlate to any of the application load metrics.
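The pattern described in the question (a file mapped through java.nio, written to, then flushed with force()) looks roughly like the sketch below. It is not JRobin's code; the class MappedFileDemo, the method roundTrip and the 4096-byte size are all invented for illustration, and it assumes Java 8 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedFileDemo {
    /** Writes one byte through a read-write mapping of a temp file and reads it back from disk. */
    static byte roundTrip(byte value) {
        try {
            Path file = Files.createTempFile("mapped-demo", ".bin");
            file.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Mapping past the end of the file in READ_WRITE mode grows the file.
                MappedByteBuffer mbb = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
                mbb.put(0, value);
                mbb.force(); // ask the OS to write this mapping's dirty pages to the file
            }
            // Closing the channel does NOT unmap. The mapping stays in the process's
            // address space until the MappedByteBuffer is garbage collected.
            return Files.readAllBytes(file)[0];
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("byte read back from disk: " + roundTrip((byte) 42));
    }
}
```

If buffers like mbb are kept reachable for every database ever created, each one stays mapped, which is consistent with the thousands of 1.3MB mappings pmap reported.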

Just an idea: NIO buffers are placed outside the JVM.
EDIT:
As of 2016, it's worth considering @Lari Hotari's comment [ Why does the Sun JVM continue to consume ever more RSS memory even when the heap, etc sizes are stable? ] because back in 2009, RHEL4 had glibc < 2.10 (~2.3)
Regards.

RSS represents pages that are actively in use -- for Java, it's primarily the live objects in the heap, and the internal data structures in the JVM. There's not much that you can do to reduce its size except use fewer objects or do less processing.
In your case, I don't think it's an issue. The graph appears to show 3 meg consumed, not 3 gig as you write in the text. That's really small, and is unlikely to be causing paging.
So what else is happening in your system? Is it a situation where you have lots of Tomcat servers, each consuming 3M of RSS? You're throwing in a lot of GC flags, do they indicate the process is spending most of its time in GC? Do you have a database running on the same machine?
Edit in response to comments
Regarding the 3M RSS size - yeah, that seemed too low for a Tomcat process (I checked my box, and have one at 89M that hasn't been active for a while). However, I don't necessarily expect it to be > heap size, and I certainly don't expect it to be almost 5 times heap size (you use -Xmx640) -- it should at worst be heap size + some per-app constant.
Which causes me to suspect your numbers. So, rather than a graph over time, please run the following to get a snapshot (replace 7429 by whatever process ID you're using):
ps -p 7429 -o pcpu,cutime,cstime,cmin_flt,cmaj_flt,rss,size,vsize
(Edit by Stu so we can have formatted results to the above request for ps info:)
[stu@server ~]$ ps -p 12720 -o pcpu,cutime,cstime,cmin_flt,cmaj_flt,rss,size,vsize
%CPU - - - - RSS SZ VSZ
28.8 - - - - 3262316 1333832 8725584
Edit to explain these numbers for posterity
RSS, as noted, is the resident set size: the pages in physical memory. SZ holds the number of pages writable by the process (the commit charge); the manpage describes this value as "very rough". VSZ holds the size of the virtual memory map for the process: writable pages plus shared pages.
Normally, VSZ is slightly > SZ, and very much > RSS. This output indicates a very unusual situation.
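To make the comparison concrete, the ps row quoted above can be parsed and converted to gigabytes. This is just a sketch for reading those three columns; the class name PsSnapshot is hypothetical and the parser assumes RSS, SZ and VSZ are the last three whitespace-separated fields, as in the command shown.

```java
final class PsSnapshot {
    final long rssKb;
    final long szKb;
    final long vszKb;

    PsSnapshot(long rssKb, long szKb, long vszKb) {
        this.rssKb = rssKb;
        this.szKb = szKb;
        this.vszKb = vszKb;
    }

    /** Parses one data row of the ps command above, taking the last three columns. */
    static PsSnapshot parse(String row) {
        String[] cols = row.trim().split("\\s+");
        int n = cols.length;
        return new PsSnapshot(Long.parseLong(cols[n - 3]),
                Long.parseLong(cols[n - 2]),
                Long.parseLong(cols[n - 1]));
    }

    /** Converts kilobytes (as printed by ps) to gigabytes. */
    static double kbToGb(long kb) {
        return kb / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // The row quoted in the post.
        PsSnapshot s = parse("28.8 - - - - 3262316 1333832 8725584");
        System.out.printf("RSS %.2f GB, SZ %.2f GB, VSZ %.2f GB%n",
                kbToGb(s.rssKb), kbToGb(s.szKb), kbToGb(s.vszKb));
        // The unusual part: resident memory is larger than the writable size.
        System.out.println("RSS exceeds SZ: " + (s.rssKb > s.szKb));
    }
}
```

RSS being larger than SZ suggests that much of the resident memory is not private writable memory, which fits file-backed mappings.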
Elaboration on why the only solution is to reduce objects
RSS represents the number of pages resident in RAM -- the pages that are actively accessed. With Java, the garbage collector will periodically walk the entire object graph. If this object graph occupies most of the heap space, then the collector will touch every page in the heap, requiring all of those pages to become memory-resident. The GC is very good about compacting the heap after each major collection, so if you're running with a partial heap, then most of the pages should not need to be in RAM.
And some other options
I noticed that you mentioned having hundreds to low thousands of threads. The stacks for these threads will also add to the RSS, although it shouldn't be much. Assuming that the threads have a shallow call depth (typical for app-server handler threads), each should only consume a page or two of physical memory, even though there's a half-meg commit charge for each.

Why is this happening? What is going on "under the hood"?
The JVM uses more memory than just the heap. For example, Java methods, thread stacks and native handles are allocated in memory separate from the heap, as are JVM internal data structures.
In your case, possible causes of trouble may be: NIO (already mentioned), JNI (already mentioned), excessive thread creation.
About JNI, you wrote that the application wasn't using JNI but... What type of JDBC driver are you using? Could it be a type 2, and leaking? It's very unlikely though as you said database usage was low.
About excessive thread creation, each thread gets its own stack, which may be quite large. The stack size actually depends on the VM, OS and architecture, e.g. for JRockit it's 256K on Linux x64; I didn't find the reference in Sun's documentation for Sun's VM. This directly impacts the thread memory (thread memory = thread stack size * number of threads). And if you create and destroy lots of threads, the memory is probably not reused.
What can I do to keep the JVM's real memory consumption in check?
To be honest, hundreds to low thousands of threads seems enormous to me. That said, if you really need that many threads, the thread stack size can be configured via the -Xss option. This may reduce the memory consumption. But I don't think this will solve the whole problem. I tend to think that there is a leak somewhere when I look at the real memory graph.
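Besides -Xss, which sets the default for every thread, a stack size can be requested per thread through the four-argument Thread constructor. The sketch below assumes Java 8 or later; the class SmallStackThread and the 256 KB value are illustrative only, and the JVM treats the requested size as a hint that it may round up or ignore.

```java
final class SmallStackThread {
    /** Runs the task on a new thread that requests the given stack size; returns true once it finished. */
    static boolean runWithStack(Runnable task, long stackBytes) {
        // stackBytes is only a suggestion to the JVM, not a guarantee.
        Thread t = new Thread(null, task, "small-stack-worker", stackBytes);
        t.start();
        try {
            t.join();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        boolean finished = runWithStack(
                () -> System.out.println("running on " + Thread.currentThread().getName()),
                256 * 1024);
        System.out.println("finished = " + finished);
    }
}
```

This can help when only one pool of handler threads is numerous enough to matter and the rest of the application should keep the default stack size.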

The current garbage collector in Java is well known for not releasing allocated memory, even when the memory is no longer required. It's quite strange, however, that your RSS size increases to >3GB although your heap size is limited to 640MB. Are you using any native code in your application, or do you have the native performance optimization pack for Tomcat enabled? In that case, you may of course have a native memory leak in your code or in Tomcat.
With Java 6u14, Sun introduced the new "Garbage-First" garbage collector, which is able to release memory back to the operating system if it's not required anymore. It's still categorized as experimental and not enabled by default, but if it is a feasible option for you, I would try to upgrade to the newest Java 6 release and enable the new garbage collector with the command line arguments "-XX:+UnlockExperimentalVMOptions -XX:+UseG1GC". It might solve your problem.
