Jade: java.lang.OutOfMemoryError: Java Heap Space

I've been using JADE (Java Agent Development Framework) to build a network-based messaging system.
JADE had been running without issues until, one fine day, I got this message:
A JVM heap space error (java.lang.OutOfMemoryError: Java heap space)!
After investigating, I suspect that a collection somewhere is holding on to objects and filling up the JVM heap without ever being flushed. (The exception's stack trace points into JADE code, not into my own code.)
How can I fix this?
My code consists of a simple TickerBehaviour class as below:
import jade.core.AID;
import jade.core.Agent;
import jade.core.behaviours.TickerBehaviour;
import jade.lang.acl.ACLMessage;

public class MyBehaviour extends TickerBehaviour {

    public MyBehaviour(Agent a, long period) {
        super(a, period); // tick every `period` milliseconds
    }

    @Override
    public void onTick() {
        // Runs every second.
        ACLMessage msg = new ACLMessage(ACLMessage.INFORM);
        msg.setOntology(username);
        msg.addReceiver(new AID(username, AID.ISLOCALNAME));
        msg.setContent(<my intended message to that identifier>);
        myAgent.send(msg); // myAgent is the Agent reference inherited from Behaviour
    }
}
To check whether my own code was creating unnecessary referenced objects, I commented out the code that generates my message, step by step, until all of my functionality was removed and only the bare JADE agent was running. Surprisingly, the JADE task itself still causes the issue.
I used VisualVM to watch live object creation on the heap and to check how many referenced objects are still held in the JVM heap space.
Older solutions aren't helping much either. Can anyone help me tackle this issue?
I have used the options recommended for starting the JADE container, but referenced objects are still present that aren't being removed by the GC.
System Setup:
OS: Linux 64-bit.
JVM Version: IcedTea 1.6.0.27, 64-bit.
JVM Options: -Xms1024m, -Xmx2048m and -XX:MaxPermSize=512M
Thank you in advance.

How can I fix this?
You seem to have investigated this as a memory leak, and concluded that the leak is in Jade.
If that is the case, then the first thing to do is to trawl the Jade mailing lists and bug tracker to see if this is a known problem, and if there is a known fix or workaround.
If that fails, you've got three choices:
Investigate further and track down the cause of the memory leak, and develop a fix for it. If the fix is general, contribute it back to the Jade team.
Report the bug on the Jade bug tracker and hope that this results in a fix ... eventually.
Band-aid: run your application with a larger heap, and restart it whenever you get an OOME.
The other possibility is that the memory leak is in, or is caused by your code. For instance, you say:
After investigating, I suspect that a collection somewhere is holding on to objects and filling up the JVM heap without ever being flushed. (The exception's stack trace points into JADE code, not into my own code.)
This is not watertight evidence that the problem is in Jade code. All it means is that you were executing a Jade method when the memory finally ran out. I'd advise you to download the Jade source code and investigate this (supposed) memory leak further. Figure out exactly what really causes it rather than basing your diagnosis on assumptions and faulty inferences.
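If you want hard numbers before diving into the Jade source, one low-tech option is to log heap usage from inside the agent and correlate the growth with what the agent is doing. This is only a minimal sketch using the standard MemoryMXBean (the class name and logging format here are mine, not part of Jade):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapLogger {
    private static final MemoryMXBean MEMORY = ManagementFactory.getMemoryMXBean();

    // Call this, for example, at the end of every onTick() to see whether used heap keeps climbing.
    public static void logHeap(String where) {
        MemoryUsage heap = MEMORY.getHeapMemoryUsage();
        System.out.printf("%s: used=%dMB committed=%dMB max=%dMB%n",
                where, heap.getUsed() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20);
    }
}

If the used figure grows steadily even with your own functionality commented out, a heap dump taken at that point is the next thing to inspect.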
Bear in mind that Jade is a stable product that many people are using successfully ... without memory leak issues.

One of the simplest things I can recommend is to use Plumbr. It is meant exactly for such cases. If Plumbr reports that the problem lies in Jade code, then you should submit a bug report to them. Otherwise, it will help you spot the problem in your own application.

The problem was with another engine that was buffering objects for processing; JADE wasn't the culprit. I was using a shared Esper engine and creating new objects for event processing from the data being parsed.
I'm investigating how to flush out those contents periodically without crashing the application.
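For reference, the direction I'm exploring is to bound how long Esper retains events, e.g. by selecting from a time window instead of keeping everything. A rough sketch against Esper's pre-8 client API (the ParsedEvent class and the 30-second window are made up for illustration):

import com.espertech.esper.client.Configuration;
import com.espertech.esper.client.EPServiceProvider;
import com.espertech.esper.client.EPServiceProviderManager;

public class EsperWindowSketch {

    // Hypothetical event class standing in for the parsed data.
    public static class ParsedEvent {
        private final String payload;
        public ParsedEvent(String payload) { this.payload = payload; }
        public String getPayload() { return payload; }
    }

    public static void main(String[] args) {
        Configuration config = new Configuration();
        config.addEventType("ParsedEvent", ParsedEvent.class);
        EPServiceProvider engine = EPServiceProviderManager.getDefaultProvider(config);

        // Only the last 30 seconds of events are retained by this statement's window,
        // so older events become eligible for GC instead of accumulating.
        engine.getEPAdministrator().createEPL(
                "select count(*) from ParsedEvent.win:time(30 sec)");

        engine.getEPRuntime().sendEvent(new ParsedEvent("example"));
    }
}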
Sorry for the trouble!

Related

What are the main possible causes for getting OutOfMemoryError?

Today while working I encountered this problem in one of our applications.
The scenario: we have a Load button that loads lakhs (hundreds of thousands) of records from the database. Loading all the records works fine on the first attempt, but reloading the same records by clicking the Refresh button gives an OutOfMemoryError. Can anyone briefly explain the possible cause?
Any good resource for studying this scenario would also help a lot.
Thanks in advance...
The only reason is that you are constantly creating new objects without freeing enough, or you are creating too many threads.
You can use Java VisualVM's profiler to do some memory profiling. This gives you an overview of which objects are in memory and which other objects or threads hold references to them.
The Java VisualVM should be part of Sun's JDK.
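As a contrived illustration of the first cause (purely hypothetical code, not taken from the application in question): a collection reachable from a static field keeps every loaded record alive across refreshes, and shows up in VisualVM as an ever-growing object count.

import java.util.ArrayList;
import java.util.List;

public class RecordCache {
    // Static reference: everything added here stays strongly reachable forever.
    private static final List<String> LOADED_RECORDS = new ArrayList<String>();

    public static void load(List<String> recordsFromDb) {
        // Appending on every refresh instead of clearing first
        // grows the heap until an OutOfMemoryError is thrown.
        LOADED_RECORDS.addAll(recordsFromDb);
    }
}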
See also:
How to identify the issue when Java OutOfMemoryError?
What is an OutOfMemoryError and how do I debug and fix it
It turned out that our application does a lot of thread processing. I reduced the thread stack size on the server using the option below, and it worked.
-Xss512k (sets the Java thread stack size to 512 KB)
Here is the resource I used to resolve this issue.
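If the root cause really is a very large number of threads, another angle (just a sketch of a general technique, not something I did here) is to cap the thread count with a fixed-size pool so that the total stack memory stays bounded no matter how many tasks arrive:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BoundedWorkers {
    public static void main(String[] args) {
        // At most 8 worker threads exist, so at most 8 thread stacks are ever allocated.
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 10000; i++) {
            final int taskId = i;
            pool.submit(new Runnable() {
                public void run() {
                    System.out.println("processing task " + taskId);
                }
            });
        }
        pool.shutdown();
    }
}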

Stable JVM but system memory increase

I am running a Java application on an Ubuntu 16.04 server. After extensive investigation I have found that the JVM heap size is more or less constant; at any rate, there is no memory increase there.
However, when I look at the server using htop, the memory consumption of the server grows at an alarming rate. I am not sure what exactly is causing this, but it is definitely originating from the Java process.
I have looked at the hprof files but I can't really tell what I'm looking for.
I am running two libraries that might be responsible, but I am not intimately familiar with them:
OrientDB (plocal)
Hazelcast
I'm not sure whether either (or both) of these could cause memory growth outside the JVM.
Any advice on the best plan to help identify the problem would be great.
Thanks to @the8472, @davmac, @qwwdfsad and @andrey-lomakin for your comments. I appreciate that the details provided in the question were very thin, but I was trying to avoid providing unrelated data that might lead down a rabbit hole.
I systematically tested each suggestion, and it turns out that the problem was originating from OrientDB. I can't say 100% which of the following fixed it (possibly both). As per @andrey-lomakin's suggestion, I upgraded from 2.1.19 to 2.2-rc1. In doing so, the application's batch inserts started throwing exceptions, so I converted them all into single linear queries. Once complete, the memory leak was gone.
As a side note, in case it affects anybody else: while testing for a direct I/O leak I discovered, to my surprise, that -Djdk.nio.maxCachedBufferSize=... works with the Java(TM) SE Runtime Environment (build 1.8.0_91-b14).
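If anyone else needs to watch direct (off-heap) buffer usage while hunting this kind of growth, here is a minimal sketch using the standard BufferPoolMXBean (available since Java 7); it only reads the JDK-provided counters:

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

public class DirectBufferMonitor {
    public static void main(String[] args) {
        // Reports the "direct" and "mapped" buffer pools, which live outside the Java heap.
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            System.out.printf("%s: count=%d used=%d bytes capacity=%d bytes%n",
                    pool.getName(), pool.getCount(), pool.getMemoryUsed(), pool.getTotalCapacity());
        }
    }
}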

How do I debug Segfaults occurring in the JVM when it runs my code?

My Java application has started to crash regularly with a SIGSEGV and a dump of stack data and a load of information in a text file.
I have debugged C programs in gdb and I have debugged Java code from my IDE. I'm not sure how to approach C-like crashes in a running Java program.
I'm assuming I'm not looking at a JVM bug here. Other Java programs run just fine, and the JVM from Sun is probably more stable than my code. However, I have no idea how I could even cause segfaults with Java code. There definitely is enough memory available, and when I last checked in the profiler, heap usage was around 50% with occasional spikes around 80%. Are there any startup parameters I could investigate? What is a good checklist when approaching a bug like this?
Though I'm not so far able to reliably reproduce the event, it does not seem to occur entirely at random either, so testing is not completely impossible.
ETA: Some of the gory details
(I'm looking for a general approach, since the actual problem might be very specific. Still, there's some info I already collected and that may be of some value.)
A while ago, I had similar-looking trouble after upgrading my CI server (see here for more details), but that fix (setting -XX:MaxPermSize) did not help this time.
Further investigation revealed that in the crash log files the thread marked as "current thread" is never one of mine, but is either one called "VMThread" or one called "GCTaskThread". If it's the latter, it is additionally marked with the comment "(exited)"; if it's the former, the GCTaskThread is not in the list at all. This makes me suppose that the problem might occur around the end of a GC operation.
I'm assuming I'm not looking at a JVM bug here. Other Java programs run just fine, and the JVM from Sun is probably more stable than my code.
I don't think you should make that assumption. Without using JNI, you should not be able to write Java code that causes a SIGSEGV (although we know it happens). My point is, when it happens, it is either a bug in the JVM (not unheard of) or a bug in some JNI code. If you don't have any JNI in your own code, that doesn't mean that you aren't using some library that is, so look for that. When I have seen this kind of problem before, it was in an image manipulation library. If the culprit isn't in your own JNI code, you probably won't be able to 'fix' the bug, but you may still be able to work around it.
First, you should get an alternate JVM on the same platform and try to reproduce it. You can try one of these alternatives.
If you cannot reproduce it, it likely is a JVM bug. From that, you can either mandate a particular JVM or search the bug database, using what you know about how to reproduce it, and maybe get suggested workarounds. (Even if you can reproduce it, many JVM implementations are just tweaks on Oracle's Hotspot implementation, so it might still be a JVM bug.)
If you can reproduce it with an alternative JVM, the fault might be that you have some JNI bug. Look at what libraries you are using and what native calls they might be making. Sometimes there are alternative "pure Java" configurations or jar files for the same library or alternative libraries that do almost the same thing.
Good luck!
The following will almost certainly be useless unless you have native code. However, here goes.
Start the Java program in the Java debugger, with a breakpoint well before the possible SIGSEGV.
Use the ps command to obtain the process ID of the java process.
gdb /usr/lib/jvm/sun-java6/bin/java processid
Make sure that the gdb 'handle' command is set to stop on SIGSEGV.
Continue in the Java debugger from the breakpoint.
Wait for the explosion.
Use gdb to investigate.
If you've really managed to make the JVM take a sigsegv without any native code of your own, you are very unlikely to make any sense of what you will see next, and the best you can do is push a test case onto a bug report.
I found a good list at http://www.oracle.com/technetwork/java/javase/crashes-137240.html. As I'm getting the crashes during GC, I'll try switching between garbage collectors.
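For completeness, the collectors I mean are selected with standard HotSpot flags along these lines (exact availability depends on the JVM build):
-XX:+UseSerialGC, -XX:+UseParallelGC, -XX:+UseConcMarkSweepGC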
I tried switching between the serial and the parallel GC (the latter being the default on a 64-bit Linux server); this only changed the error message accordingly.
Reducing the max heap size from 16G to 10G after a fresh analysis in the profiler (which showed heap usage flattening out at 8G) did lead to a significantly lower "virtual memory" footprint (16G instead of 60G), but I don't even know what that means, and the Internet says it doesn't matter.
Currently, the JVM is running in client mode (using the -client startup option thus overriding the default of -server). So far, there's no crash, but the performance impact seems rather large.
If you have a corefile you could try running jstack on it, which would give you something a little more comprehensible - see http://download.oracle.com/javase/6/docs/technotes/tools/share/jstack.html, although if it's a bug in the gc thread it may not be all that helpful.
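For reference, the JDK 6 jstack syntax for a core file is roughly the following (paths will differ on your system):
jstack /usr/lib/jvm/sun-java6/bin/java core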
Try to check whether a crash in C/native code caused the Java crash. Use Valgrind to detect invalid memory accesses, and also cross-check the stack size.

JVM OutOfMemory error "death spiral" (not memory leak)

We have recently been migrating a number of applications from running under RedHat linux JDK1.6.0_03 to Solaris 10u8 JDK1.6.0_16 (much higher spec machines) and we have noticed what seems to be a rather pressing problem: under certain loads our JVMs get themselves into a "Death Spiral" and eventually go out of memory. Things to note:
this is not a case of a memory leak. These are applications which have been running just fine (in one case for over 3 years), and the out-of-memory errors are not guaranteed to occur in any given run. Sometimes the applications work, sometimes they don't
this is not us moving to a 64-bit VM - we are still running 32 bit
In one case, using the latest G1 garbage collector on 1.6.0_18 seems to have solved the problem. In another, moving back to 1.6.0_03 has worked
Sometimes our apps are falling over with HotSpot SIGSEGV errors
This is affecting applications written in Java as well as Scala
The most important point is this: the behaviour manifests itself in those applications which suddenly get a deluge of data (usually via TCP). It's as if the VM decides to keep adding more data (possibly promoting it to the tenured generation) rather than running a GC on the new generation, until it realises that it has to do a full GC and then, despite practically everything in the VM being garbage, it somehow decides not to collect it!
It sounds crazy, but I just don't see what else it could be. How else can you explain an app which one minute falls over with a max heap of 1Gb and the next works just fine (never going above 256M while doing exactly the same thing)?
So my questions are:
Has anyone else observed this kind of behaviour?
Has anyone any suggestions as to how I might debug the JVM itself (as opposed to my app)? How do I prove this is a VM issue?
Are there any VM-specialist forums out there where I can ask the VM's authors (assuming they aren't on SO)? (We have no support contract)
If this is a bug in the latest versions of the VM, how come no-one else has noticed it?
Interesting problem. Sounds like one of the garbage collectors works poorly on your particular situation.
Have you tried changing the garbage collector being used? There are a LOT of GC options, and figuring out which ones are optimal seems to be a bit of a black art, but I wonder if a basic change would work for you.
I know there is a "Server" GC that tends to work a lot better than the default ones. Are you using that?
Threaded GC (which I believe is the default) is probably the worst for your particular situation; I've noticed that it tends to be much less aggressive when the machine is busy.
One thing I've noticed: it often takes two GCs to convince Java to actually take out the trash. I think the first one tends to unlink a bunch of objects and the second actually deletes them. What you might want to do is occasionally force two garbage collections. This WILL cause a significant GC pause, but I've never seen a case where it took more than two to clean out the entire heap.
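A sketch of what "force two garbage collections" might look like in code (bearing in mind that System.gc() is only a hint to the VM, not a guarantee):

public class ForceGc {
    // Requests two back-to-back collections; the JVM is free to ignore the hint.
    public static void collectTwice() {
        System.gc();
        System.gc();
    }
}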
I have had the same issue on Solaris machines, and I solved it by decreasing the maximum size of the JVM. The 32 bit Solaris implementation apparently needs some overhead room beyond what you allocate for the JVM when doing garbage collections. So, for example, with -Xmx3580M I'd get the errors you describe, but with -Xmx3072M it would be fine.
Yes, I've observed this behavior before, and usually after countless hours of tweaking JVM parameters it starts working.
Garbage Collection, especially in multithreaded situations is nondeterministic. Defining a bug in nondeterministic code can be a challenge. But you could try DTrace if you are using Solaris, and there are a lot of JVM options for peering into HotSpot.
Go on Scala IRC and see if Ismael Juma is hanging around (ijuma). He's helped me before, but I think real in-depth help requires paying for it.
I think most people doing this kind of stuff accept that they either need to be JVM tuning experts, have one on staff, or hire a consultant. There are people who specialize in JVM tuning.
In order to solve these problems I think you need to be able to replicate them in a controlled environment where you can precisely duplicate runs with different tuning parameters and/or code changes. If you can't do that hiring an expert probably isn't going to do you any good, and the cheapest way out of the problem is probably buying more RAM.
What kind of OutOfMemoryError are you getting? Is the heap space exhausted, or is the problem related to any of the other memory pools? (The error usually has a message giving more details on its cause.)
If the heap is exhausted and the problem can be reproduced (it sounds as if it can), I would first of all configure the VM to produce a heap dump on OutOfMemoryErrors. You can then analyze the heap and make sure that it's not filled with objects, which are still reachable through some unexpected references.
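With HotSpot, that is typically configured with flags along these lines (the dump path is just an example):
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dumps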
It's of course not impossible that you are running into a VM bug, but if your application is relying on implementation-specific behaviour in 1.6.0_03, it may for some reason or another end up as a memory hog when running on 1.6.0_16. Such problems may also be found if you are using some kind of server container for your application. Some developers are obviously unable to read documentation, but tend to observe the API behaviour and draw their own conclusions about how something is supposed to work. This is of course not always correct, and I've run into similar problems both with Tomcat and with JBoss (both products at least used to work only with specific VMs).
Also make sure it's not a hardware fault (try running MemTest86 or similar on the server.)
Which kind of SIGSEGV errors exactly do you encounter?
If you run a 32-bit VM, it could be what I described here: http://janvanbesien.blogspot.com/2009/08/mysterious-jvm-crashes-explained.html

How should I diagnose and prevent JVM crashes?

What should I (as a Java programmer who doesn't know anything about JVM internals) do when I come across a JVM crash?
In particular, how would you produce a reproducible test case? What should I be searching for in Sun's (or IBM's) bug database? What information can I get from the log files produced (e.g. hs_err_pidXYZ.log)?
If the crashes occur only on one specific machine, run memtest. I've seen recurring JVM crashes only two times, and in both cases the culprit turned out to be a hardware problem, namely faulty RAM.
In my experience they are nearly always caused by native code using JNI, either mine or someone else's. If you can, try re-running without the native code to see if you can reproduce it.
Sometimes it is worth trying with the JIT compiler turned off, if your bug is easily reproducible.
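On HotSpot this can be done by running in interpreted-only mode, e.g. with the following flag (expect a large slowdown, so use it only to narrow down the cause):
-Xint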
As others have pointed out, faulty hardware can also cause this, I've seen it for both Memory and video cards (when the crash was in swing code). Try running whatever hardware diagnostics are most appropriate for your system.
As JVM crashes are rare I'd report them to Sun. This can be done at their bug database. Use category Java SE, Subcategory jvm_exact or jit.
Under Unix/Linux you might get a core dump. Under Windows the JVM will usually tell you where it has stored a log of what has happened. These files often give some hint, but will vary from JVM to JVM. Sun gives full details of these files on their website. For IBM JVMs, the files can be analysed using the Java Core Analyzer and Java Heapdump Analyzer from IBM's alphaWorks.
Unfortunately Java debuggers in my experience tend to hurt more than help. However, attaching an OS specific debugger (eg Visual Studio) can help if you are familiar with reading C stack traces.
Trying to get a reproducible test case is hard. If you have a large amount of code that always (or nearly always) crashes it is easier, just slowly remove parts while it keeps crashing, getting the result as small as possible. If you have no reproducible test code at all then it is very difficult. I'd suggest getting hints from my numbered selection above.
Sun documents the details of the crash log here. There is also a nice tutorial written up here, if you want to get into the dirty details (it sounds like you don't, though)
However, as a commenter mentioned, a JVM crash is a pretty rare and serious event, and it might be worthwhile to call Sun or IBM professional support in this situation.
When an IBM JVM crashes, it might have written to the file /tmp/dump_locations; there it lists any heapdump or javacore files it has written.
These files can be analysed using the Java Core Analyzer and Java Heapdump Analyzer from IBM's alphaWorks.
There's a great page on the Oracle website to troubleshoot these types of problems.
Check out the relevant sections for:
Hung processes (e.g. the jstack utility)
Post Mortem diagnostics
