One application I have to deal with regularly launches shell helpers using ProcessBuilder. For reasons untold, it still runs on a 32-bit JVM (Sun, 1.6.0_25) even though the underlying OS is 64-bit (RHEL 5.x, for what it's worth).
This application is memory-happy, so the heap size is set to its maximum of 3 GB, and the permgen is 128M.
However... At random moments, shell helpers fail to launch. Not with an OutOfMemoryError, but with ENOMEM... The only cause I can see for this is a lack of address space.
Well, sure, but at that very moment memory is not really under pressure, and top reports that the JVM's actual memory usage, and even its virtual set size, is not even 3 GB...
Looking at what can be seen of the code of Process, I see that the core method is called forkAndExec(), which is pretty much self-explanatory... From what I know of both syscalls, it just shouldn't fail. But it does. And not always.
Why?
Edit: it should be noted that Neo4j is used. It seems to use FileChannel a lot; could that be the cause of the lack of address space?
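For reference, the helpers are launched with the plain ProcessBuilder pattern; a minimal sketch (the helper command here is a stand-in, not the real one):

import java.io.IOException;

public class HelperLauncher {
    public static void main(String[] args) throws IOException, InterruptedException {
        // stand-in helper command, just to illustrate the launch pattern
        ProcessBuilder pb = new ProcessBuilder("/bin/true");
        pb.redirectErrorStream(true);
        Process p = pb.start();   // UNIXProcess.forkAndExec() runs here; ENOMEM surfaces as an IOException
        System.out.println("helper exited with " + p.waitFor());
    }
}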
I would decrease the heap size. The amount of heap actually used could be leaving less and less space for the forked process to run (it inherits resources from its parent)
It is highly likely that just upgrading to a 64-bit JVM would fix the problem. Can you try Java 6 update 30, 64-bit, instead (just to see if it helps)? Whether it does or not, it should tell you more about what the cause is (and then you can decide if it's worth switching).
I think that you are being bitten by Linux memory overcommit handling killing your processes. That blog post suggests a sysctl variable that you can tune.
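If that's it, the variable to look at is vm.overcommit_memory; a quick check (treat the values as a sketch to verify against your own setup, not a recommendation):

# current policy: 0 = heuristic overcommit, 1 = always overcommit, 2 = never overcommit
cat /proc/sys/vm/overcommit_memory
# with 2 ("never"), fork() of a ~3 GB JVM must be able to reserve a full copy of its
# address space, which can fail with ENOMEM; 0 relaxes that
sysctl -w vm.overcommit_memory=0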
Related question:
I have a Solaris sparc (64-bit) server, which has 16 GB of memory. There are a lot of small Java processes running on it, but today I got the "Could not reserve enough space for object heap" error when trying to launch a new one. I was surprised, since there was still more than 4GB free on the server. The new process was able to successfully launch after some of the other processes were shut down; the system had definitely hit a ceiling of some kind.
After searching the web for an explanation, I began to wonder if it was somehow related to the fact that I'm using the 32-bit JVM (none of the java processes on this server require very much memory).
I believe the default max memory pool is 64MB, and I was running close to 64 of these processes. So that would be 4GB all told ... right at the 32-bit limit. But I don't understand why or how any of these processes would be affected by the others. If I'm right, then in order to run more of these processes I'll either have to tune the max heap to be lower than the default, or else switch to using the 64-bit JVM (which may mean raising the max heap to be higher than the default for these processes). I'm not opposed to either of these, but I don't want to waste time and it's still a shot in the dark right now.
Can anyone explain why it might work this way? Or am I completely mistaken?
If I am right about the explanation, then there is probably documentation on this: I'd very much like to find it. (I'm running Sun's JDK 6 update 17 if that matters.)
Edit: I was completely mistaken. The answers below confirmed my gut instinct that there's no reason why I shouldn't be able to run as many JVMs as I can hold. A little while later I got an error on the same server trying to run a non-java process: "fork: not enough space". So there's some other limit I'm encountering that is not java-specific. I'll have to figure out what it is (no, it's not swap space). Over to serverfault I go, most likely.
"I believe the default max memory pool is 64MB, and I was running close to 64 of these processes. So that would be 4GB all told ... right at the 32-bit limit."
No. The 32-bit limit is per process (at least on a 64-bit OS). But the default maximum heap is not fixed at 64MB:
initial heap size: Larger of 1/64th of the machine's physical memory on the machine or some reasonable minimum.
maximum heap size: Smaller of 1/4th of the physical memory or 1GB.
Note: The boundaries and fractions given for the heap size are correct for J2SE 5.0. They are likely to be different in subsequent releases as computers get more powerful.
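A quick way to see what that default actually resolves to on a given machine is to ask the VM itself; a minimal sketch (run it without any -Xmx setting):

public class DefaultHeap {
    public static void main(String[] args) {
        // Runtime.maxMemory() reports the maximum heap this VM will attempt to use
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("default max heap for this JVM: " + (maxBytes / (1024 * 1024)) + " MB");
    }
}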
I suspect the memory is fragmented. See also "Tools to view/solve Windows XP memory fragmentation" for confirmation that memory fragmentation can cause such errors.
Tomcat 5.5.x and 6.0.x
Grails 1.6.x
Java 1.6.x
OS CentOS 5.x (64bit)
VPS server with 384M of memory
JAVA_OPTS: tried many combinations, including the following
export JAVA_OPTS='-Xms128M -Xmx512M -XX:MaxPermSize=1024m'
export JAVA_OPTS='-server -Xms128M -Xmx128M -XX:MaxPermSize=256M'
(As advised by http://www.grails.org/Deployment)
I have created a blank Grails application, i.e. simply by running grails create-app, and then WARed it
I am running Tomcat on a VPS Server
When I simply start the Tomcat server, with no apps deployed, the free memory is about 236M and the used memory is about 156M
When I deploy my "blank" application, the memory consumption spikes to 360M, and eventually the Tomcat instance is killed as soon as it takes up all the free memory
As you have seen, my app is as light as it can be.
Not sure why the memory consumption is as high as it is.
I am actually troubleshooting a real application, but have narrowed down to this scenario which is easier to share and explain.
UPDATE
I tested the same "blank" application on my local Tomcat 5.5.x on Windows and it worked fine
The memory consumption of the Java process shot from 32M to 107M, but it did not crash and it remained within acceptable limits
So the hunt for an answer continues... I wonder if something is wrong with my Linux box. Not sure what, though...
UPDATE 2
Also see this http://www.grails.org/Grails+Test+On+Virtual+Server
It confirms my belief that my simple-blank app should work on my configuration.
It is a false economy to try to run a long-running Java-based application in the minimum possible memory. The garbage collector, and hence the application, will run much more efficiently if it has plenty of regular heap memory. Give an application too little heap and it will spend too much time garbage collecting.
(This may seem a bit counter-intuitive, but trust me: the effect is predictable in theory and observable in practice.)
EDIT
In practical terms, I'd suggest the following approach:
Start by running Tomcat + Grails with as much memory as you can possibly give it so that you have something that runs. (Set the permgen size to the default ... unless you have clear evidence that Tomcat + Grails are exhausting permgen.)
Run the app for a bit to get it to a steady state and figure out what its average working set is. You should be able to figure that out from a memory profiler, or by examining the GC logging (see the example options after this list).
Then set the Java heap size to be (say) twice the measured working set size or more. (This is the point I was trying to make above.)
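For example, something along these lines turns on GC logging so you can read the live set off the full collections (the heap and permgen sizes here are placeholders to tune, not recommendations, and the log path is just an example):

export JAVA_OPTS='-server -Xms256M -Xmx256M -XX:MaxPermSize=128M -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/var/log/tomcat/gc.log'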
Actually, there is another possible cause for your problems. Even though you are telling Java to use heaps of a given size, it may be that it is unable to do this. When the JVM requests memory from the OS, there are a couple of situations where the OS will refuse.
If the machine (real or virtual) that you are running the OS on does not have any more unallocated "real" memory, and the OS's swap space is fully allocated, it will have to refuse requests for more memory.
It is also possible (though unlikely) that per-process memory limits are in force. That would cause the OS to refuse requests beyond that limit.
Finally, note that Java uses more virtual memory than can be accounted for by simply adding the stack, heap and permgen numbers together. There is the memory used by the executable + DLLs, memory used for I/O buffers, and possibly other stuff.
384MB is pretty small. I'm running a small Grails app in a 512MB VPS at enjoyvps.net (not affiliated in any way, just a happy customer) and it's been running for months at just under 200MB. I'm running a 32-bit Linux and JDK though, no sense wasting all that memory in 64-bit pointers if you don't have access to much memory anyway.
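If it helps as a starting point, on a 384MB VPS I'd try something much more modest than the options in the question, e.g. (these numbers are just a guess to tune from, not a recipe):

export JAVA_OPTS='-server -Xms96M -Xmx160M -XX:MaxPermSize=96M'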
Can you try deploying a tomcat monitoring webapp e.g. psiprobe and see where the memory is being used?
We have recently been migrating a number of applications from running under RedHat linux JDK1.6.0_03 to Solaris 10u8 JDK1.6.0_16 (much higher spec machines) and we have noticed what seems to be a rather pressing problem: under certain loads our JVMs get themselves into a "Death Spiral" and eventually go out of memory. Things to note:
this is not a case of a memory leak. These are applications which have been running just fine (in one case for over 3 years), and the out-of-memory errors do not occur consistently in any case. Sometimes the applications work, sometimes they don't
this is not us moving to a 64-bit VM - we are still running 32 bit
In one case, using the latest G1 garbage collector on 1.6.0_18 seems to have solved the problem. In another, moving back to 1.6.0_03 has worked
Sometimes our apps are falling over with HotSpot SIGSEGV errors
This is affecting applications written in Java as well as Scala
The most important point is this: the behaviour manifests itself in those applications which suddenly get a deluge of data (usually via TCP). It's as if the VM decides to keep adding more data (possibly promoting it to the tenured generation) rather than running a GC on "newspace", until it realises that it has to do a full GC and then, despite practically everything in the VM being garbage, it somehow decides not to collect it!
It sounds crazy, but I just don't see what else it could be. How else can you explain an app which one minute falls over with a max heap of 1GB and the next works just fine (never going above 256M when the app is doing exactly the same thing)?
So my questions are:
Has anyone else observed this kind of behaviour?
Has anyone any suggestions as to how I might debug the JVM itself (as opposed to my app)? How do I prove this is a VM issue?
Are there any VM-specialist forums out there where I can ask the VM's authors (assuming they aren't on SO)? (We have no support contract)
If this is a bug in the latest versions of the VM, how come no-one else has noticed it?
Interesting problem. Sounds like one of the garbage collectors works poorly in your particular situation.
Have you tried changing the garbage collector being used? There are a LOT of GC options, and figuring out which ones are optimal seems to be a bit of a black art, but I wonder if a basic change would work for you.
I know there is a "Server" GC that tends to work a lot better than the default ones. Are you using that?
Threaded GC (which I believe is the default) is probably the worst for your particular situation; I've noticed that it tends to be much less aggressive when the machine is busy.
One thing I've noticed, it often takes two GCs to convince Java to actually take out the trash. I think the first one tends to unlink a bunch of objects and the second actually deletes them. What you might want to do is occasionally force two garbage collections. This WILL cause a significant GC pause, but I've never seen a case where it took more than two to clean out the entire heap.
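If you want to try that, a minimal sketch of the idea (note that System.gc() is only a hint to the VM, so treat this as a diagnostic rather than a fix):

public final class ForceGc {
    public static void force() {
        // ask for a collection, give pending finalizers a chance to run,
        // then ask for a second collection to reclaim the now-finalized objects
        System.gc();
        System.runFinalization();
        System.gc();
    }
}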
I have had the same issue on Solaris machines, and I solved it by decreasing the maximum size of the JVM. The 32 bit Solaris implementation apparently needs some overhead room beyond what you allocate for the JVM when doing garbage collections. So, for example, with -Xmx3580M I'd get the errors you describe, but with -Xmx3072M it would be fine.
Yes, I've observed this behavior before, and usually after countless hours of tweaking JVM parameters it starts working.
Garbage collection, especially in multithreaded situations, is nondeterministic. Defining a bug in nondeterministic code can be a challenge. But you could try DTrace if you are using Solaris, and there are a lot of JVM options for peering into HotSpot.
Go on Scala IRC and see if Ismael Juma is hanging around (ijuma). He's helped me before, but I think real in-depth help requires paying for it.
I think most people doing this kind of stuff accept that they either need to be JVM tuning experts, have one on staff, or hire a consultant. There are people who specialize in JVM tuning.
In order to solve these problems I think you need to be able to replicate them in a controlled environment where you can precisely duplicate runs with different tuning parameters and/or code changes. If you can't do that hiring an expert probably isn't going to do you any good, and the cheapest way out of the problem is probably buying more RAM.
What kind of OutOfMemoryError are you getting? Is the heap space exhausted, or is the problem related to any of the other memory pools (the error usually has a message giving more details on its cause)?
If the heap is exhausted and the problem can be reproduced (it sounds as if it can), I would first of all configure the VM to produce a heap dump on OutOfMemoryErrors. You can then analyze the heap and make sure that it's not filled with objects, which are still reachable through some unexpected references.
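For reference, the HotSpot options for that are along these lines (available in Sun's Java 6 VMs; the dump path is just an example):

-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/app-heap.hprof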
It's of course not impossible that you are running into a VM bug, but if your application is relying on implementation-specific behaviour in 1.6.0_03, it may for some reason or another end up as a memory hog when running on 1.6.0_16. Such problems may also be found if you are using some kind of server container for your application. Some developers are obviously unable to read documentation, but tend to observe the API behaviour and draw their own conclusions about how something is supposed to work. This is of course not always correct, and I've run into similar problems both with Tomcat and with JBoss (both products at least used to work only with specific VMs).
Also make sure it's not a hardware fault (try running MemTest86 or similar on the server.)
Which kind of SIGSEGV errors exactly do you encounter?
If you run a 32-bit VM, it could be what I described here: http://janvanbesien.blogspot.com/2009/08/mysterious-jvm-crashes-explained.html
We ship Java applications that are run on Linux, AIX and HP-Ux (PA-RISC). We seem to struggle to get acceptable levels of performance on HP-Ux from applications that work just fine in the other two environments. This is true of both execution time and memory consumption.
Although I've yet to find a definitive article on "why", I believe that measuring memory consumption using "top" is a crude approach, because things like shared code give misleading results. However, it's about all we have to go on with a customer site where memory consumption on HP-Ux has become an issue. It only became an issue this time when we moved from Java 1.4 to Java 1.5 (on HP-Ux 11.23 PA-RISC). By "an issue", I mean that the machine ceased to create new processes because we had exhausted all 16GB of physical memory.
By measuring "before" and "after" total "free memory" we are trying to gauge how much has been consumed by a Java application. I wrote a quick app that stores 10,000 random 64 bit strings in an ArrayList and tried this approach to measuring consumption on Linux and HP-Ux under Java 1.4 and Java 1.5.
The results:
HP Java 1.4 ~60MB
HP Java 1.5 ~150MB
Linux Java 1.4 ~24MB
Linux Java 1.5 ~16MB
Can anyone explain why these results might arise? Is this some idiosyncrasy of the way "top" measures free memory? Does Java 1.5 on HP really consume 2.5 times more memory than Java 1.4?
Thanks.
The JVMs might just have different default parameters. The heap will grow to whatever size you have configured it to. The default on the Sun VM is a certain percentage of the RAM in the machine - that is to say, Java will, by default, use more memory if you run it on a machine with more memory.
I'd be really surprised if the HP-UX VM hadn't had lots of tuning for this sort of thing by HP. I'd suggest you fiddle with the parameters on both - figure out the smallest max heap size you can use without hurting performance or throughput.
I don't have an HP box right now to test my hypothesis. However, if I were you, I would use a profiler like JConsole (which comes with the JDK) or YourKit to measure what is happening.
That said, it appears that you started measuring only after you saw something amiss; so I'm not discounting that it's happening -- just pointing you at something I'd have done in the same situation.
First, it's not clear what you measured with the "10,000 random 64-bit strings" test. You are supposed to start the application, measure its bootstrap memory footprint, and then run your test. It could easily be that Java 1.5 acquires more heap right after start (due to heap manager settings, for instance).
Second, we do run Java apps under 1.4, 1.5 and 1.6 under HP-UX, and they don't demonstrate any special memory requirements. We have Itanium hardware, though.
Third, why do you use top? Why not just print Runtime.getRuntime().totalMemory()? (See the sketch at the end of this answer.)
Fourth, by adding values to an ArrayList you create memory fragmentation. ArrayList has to double its internal storage now and then. Depending on GC settings and the ArrayList.ensureCapacity() implementation, the amount of non-collected memory may differ dramatically between 1.4 and 1.5.
Essentially, instead of figuring out the cause of the problem you have run a random test that gives you no useful information. You should run a profiler on the application to figure out where the memory is going.
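For what it's worth, a minimal sketch of the in-VM measurement I mean in the third point (the "random 64-bit strings" here are only an approximation of the test described in the question):

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class FootprintTest {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long before = rt.totalMemory() - rt.freeMemory();

        List<String> strings = new ArrayList<String>();
        Random rnd = new Random();
        for (int i = 0; i < 10000; i++) {
            strings.add(Long.toBinaryString(rnd.nextLong())); // up to 64 characters per string
        }

        long after = rt.totalMemory() - rt.freeMemory();
        System.out.println("approximate heap used by the test data: " + (after - before) + " bytes");
    }
}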
You might also want to look at the problem you are trying to solve... I don't imagine there are many problems that eat 16GB of memory that aren't due for a good round of optimization.
Are you launching multiple VMs? Are you reading large datasets into memory, and not discarding them quickly enough? etc etc etc.
We have a java program that requires a large amount of heap space - we start it with (among other command line arguments) the argument -Xmx1500m, which specifies a maximum heap space of 1500 MB. When starting this program on a Windows XP box that has been freshly rebooted, it will start and run without issues. But if the program has run several times, the computer has been up for a while, etc., when it tries to start I get this error:
Error occurred during initialization of VM
Could not reserve enough space for object heap
Could not create the Java virtual machine.
I suspect that Windows itself is suffering from memory fragmentation, but I don't know how to confirm this suspicion. At the time that this happens, Task Manager and Sysinternals procexp report 2000MB of free memory. I have looked at this question related to internal fragmentation.
So the first question is: how do I confirm my suspicion?
The second question is: if my suspicions are correct, does anyone know of any tools to solve this problem? I've looked around quite a bit, but I haven't found anything that helps, other than periodic reboots of the machine.
ps - changing operating systems is also not currently a viable option.
Agree with Torlack; a lot of this is because other DLLs are getting loaded at certain spots, breaking up the amount of contiguous memory you can get for the VM in one big chunk.
You can do some work on WinXP if you have more than 3G of memory to get some of the windows stuff moved around, look up PAE here:
http://www.microsoft.com/whdc/system/platform/server/PAE/PAEdrv.mspx
Your best bet, if you really need more than 1.2G of memory for your Java app, is to look at 64-bit Windows or Linux or OS X. If you're using any kind of native libraries with your app you'll have to recompile them for 64-bit, but it's going to be a lot easier than trying to rebase DLLs and stuff to maximize the memory you can get on 32-bit Windows.
Another option would be to split your program up into multiple VMs and have them communicate with each other via RMI or messaging or something. That way each VM can hold some subset of the memory you need. Without knowing what your app does, I'm not sure that this will help in any way, though...
Unless you are running out of page file space, this issue isn't that the computer is running out of memory. The whole point of virtual memory is to allow the processes to use more virtual memory than is physically available.
Not knowing how the JVM handles the heap, it is a bit hard to say exactly what the problem is, but one of the common issues is that there isn't enough contiguous free address space available in your process to allow the heap to be extended. Why this would be a problem after the machine has been running a while is a bit confusing.
I've been working on a similar problem at work. I have found that running the program under WinDbg and using the "!address" and "!address -summary" commands has been invaluable in tracking down why a process's virtual address space has become fragmented. You can also try running the program after a reboot and using the "!address" command to take a picture of the address space, and then do the same when the program no longer runs. This might clue you in on the problem. Something as simple as an extra DLL getting loaded might cause the problem.
I suspect that the problem is Windows memory fragmentation. There is another question here on StackOverflow called Java Maximum Memory on Windows XP that mentions using Process Explorer to look at where DLLs are mapped into memory, and then addressing the problem by rebasing the DLLs so that they load into memory in a more compact way.
Using Minimem (http://minimem.kerkia.net/) for that application might fix your problem. However, I'm not sure this is the answer you are looking for. I hope it helps.
Maybe you should consider starting the program once, reserving the memory, and not ending the VM after each run. Look into different GC options and release your objects.
Use vmmap from Microsoft's Sysinternals tools to view the fragmentation of the virtual address space, and identify what's breaking up the space.