Stable JVM but system memory increase - java

I am running a Java application on an Ubuntu 16.04 server. After extensive investigation I have discovered that the JVM heap size is more or less constant; at any rate, there is no memory increase there.
However, when I look at the server using htop, the memory consumption of the server grows at an alarming rate. I am not sure what exactly is causing this, but it's 100% originating from the Java process.
I have looked at the hprof files, but I can't really tell what I'm looking for.
I am running two libraries that might be responsible, but I am not intimately familiar with them:
OrientDB (plocal)
Hazelcast
I'm not sure whether either or both of these could cause a memory increase outside the JVM heap.
Any advice on the best plan to help identify the problem would be great.

Thanks to #the8472, #davmac, #qwwdfsad and #andrey-lomakin for your comments. I appreciate that the details provided in the question were very thin, but I was trying to avoid providing unrelated data that might lead down a rabbit hole.
I systematically tested each suggestion and it turns out that the problem was originating from OrientDB. I can't say with 100% certainty which of the following fixed the problem (possibly both). As per #andrey-lomakin's suggestion I upgraded from 2.1.19 to 2.2-rc1. In doing this the application's batch inserts started throwing exceptions, so I converted them all into single linear queries. Once complete, the memory leak was gone.
As a side note, in case it affects anybody else: while testing for a direct I/O leak I discovered, to my surprise, that -Djdk.nio.maxCachedBufferSize=... works with Java(TM) SE Runtime Environment (build 1.8.0_91-b14).
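For anyone else chasing a similar symptom: direct NIO buffers are one common way a Java process can keep growing in htop while the heap stays flat, because they are allocated in native memory outside the heap (bounded only by -XX:MaxDirectMemorySize). A purely illustrative sketch of the effect; the sizes and class name are made up:

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class DirectBufferDemo {
    public static void main(String[] args) {
        List<ByteBuffer> buffers = new ArrayList<ByteBuffer>();
        for (int i = 0; i < 256; i++) {
            // Each direct buffer is backed by native memory, not by the Java heap.
            buffers.add(ByteBuffer.allocateDirect(1024 * 1024)); // 1 MB each
        }
        // The heap figure stays small, yet the process RSS reported by htop grows by ~256 MB.
        long heapUsed = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
        System.out.println("Heap used: " + heapUsed / (1024 * 1024) + " MB");
        System.out.println("Direct buffers held: " + buffers.size());
    }
}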

Related

Performance issues with applications deployed on Tomcat

Recently we migrated all our company's applications from WebSphere to the Tomcat application server. As part of this process we had performance testing done.
We found that a couple of applications show over 100% performance degradation in Tomcat. We increased the number of threads, configured the datasource settings to accommodate our test, and also increased the read and write buffer sizes in the Tomcat server.
Application Background:
-> Spring Framework
-> Hibernate
-> Oracle 12c
-> JSPs
-> OpenJDK 8
We already checked the database and found no issues with performance in DB.
The CPU utilization while running the test is always less than 10%.
Heap settings are -Xms1.5G and -Xmx2G, and the application never utilizes more than 1.2G.
We also have two nodes and HAProxy on top to balance the load. (We don't have a web server in place).
Despite our best efforts we couldn't pinpoint the issue that is causing the performance degradation. I am aware that this information isn't enough to provide a solution for our problem; however, any suggestion on how to proceed would be very helpful, as we have hit a dead end and are unable to move forward.
Appreciate it if you can share any points that will be helpful in finding the issue.
Thanks.
Take thread dumps, analyze which part of the application is having issues, and start troubleshooting from there.
Follow this article for a detailed explanation of thread dump analysis: https://dzone.com/articles/how-analyze-java-thread-dumps
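For example, on a HotSpot JDK a thread dump can be captured with jstack (here <pid> is just a placeholder for the Tomcat process id); taking a few dumps several seconds apart makes it easier to spot threads that are stuck in the same place:

jstack -l <pid> > threaddump-1.txt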
There are plenty of possible reasons for the problem you've mentioned, and there really isn't much data to work with. Regardless, as kann commented, a good way to start would be gathering thread dumps of the Java process.
I'd also ask whether you're running on the same servers or on newly set-up ones, and how they are looking resource-wise. Are there any CPU/memory/IO constraints during the test?
Regarding the -Xmx setting, it sounds like you're not passing the -XX:+AlwaysPreTouch flag to the JVM. I would advise you to look into it, as it makes the JVM touch and zero the heap memory at start-up instead of doing it at runtime (which can otherwise show up as a performance hit).
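For reference, a launch line with a fixed-size, pre-touched heap might look something like this (the jar name is only a placeholder):

java -Xms2g -Xmx2g -XX:+AlwaysPreTouch -jar myapp.jar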

Jade: java.lang.OutOfMemoryError: Java Heap Space

I've been using JADE (Java Agent Development Framework) to create a network-based messaging system.
So far, JADE had been running without issues, but one fine day I got this message:
A JVM Heap Space error!
After investigating, I found that this seems to be due to a collection variable that keeps accumulating objects, occupying JVM heap space without ever being flushed out. (The exception is raised from the JADE side, not from my code.)
How can I remove this?
My code consists of a simple TickerBehaviour class as below:
public class MyBehaviour extends TickerBehaviour {

    public MyBehaviour(Agent a, long period) {
        super(a, period); // TickerBehaviour has no no-arg constructor
    }

    @Override
    public void onTick() {
        // Runs every second.
        ACLMessage msg = new ACLMessage(ACLMessage.INFORM);
        msg.setOntology(username);
        msg.addReceiver(new AID(username, AID.ISLOCALNAME));
        msg.setContent(<my intended message to that identifier>);
        myAgent.send(msg); // myAgent is the owning agent, inherited from Behaviour
    }
}
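For completeness, the behaviour is registered from the agent's setup() in the usual JADE way, roughly like this (names here are illustrative):

public class MyAgent extends Agent { // jade.core.Agent
    @Override
    protected void setup() {
        // Tick every 1000 ms.
        addBehaviour(new MyBehaviour(this, 1000));
    }
}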
I further checked whether my code was creating unnecessarily referenced objects (by commenting out the code that generates my intended message). I brought it down to removing all my functionality and just running the JADE agent, and surprisingly, the JADE task itself still shows this issue.
I used VisualVM to inspect the heap and track live object creation, to check how many referenced objects are still held in JVM heap space.
Older solutions aren't helping much either. Can anyone help me tackle this issue?
I have used the options recommended when starting the JADE container, but there are still referenced objects present which aren't being removed by GC.
System Setup:
OS: Linux 64-bit.
JVM Version: IcedTea 1.6.0.27, 64-bit.
JVM Options: -Xms1024m, -Xmx2048m and -XX:MaxPermSize=512M
Thank you in advance.
How can I remove this?
You seem to have investigated this as a memory leak, and concluded that the leak is in Jade.
If that is the case, then the first thing to do is to trawl the Jade mailing lists and bug tracker to see if this is a known problem, and if there is a known fix or workaround.
If that fails, you've got three choices:
Investigate further and track down the cause of the memory leak, and develop a fix for it. If the fix is general, contribute it back to the Jade team.
Report the bug on the Jade bug tracker and hope that this results in a fix ... eventually.
Bandaid. Run your application with a larger heap, and restart it whenever you get an OOME.
The other possibility is that the memory leak is in, or is caused by your code. For instance, you say:
After investigating, I found that this seems to be due to a collection variable that keeps accumulating objects, occupying JVM heap space without ever being flushed out. (The exception is raised from the JADE side, not from my code.)
This is not watertight evidence that the problem is in Jade code. All it means is that you were executing a Jade method when the memory finally ran out. I'd advise you to download the Jade source code and investigate this (supposed) memory leak further. Figure out exactly what really causes it rather than basing your diagnosis on assumptions and faulty inferences.
Bear in mind that Jade is a stable product that many people are using successfully ... without memory leak issues.
One of the simplest things I can recommend is to use Plumbr. It is meant exactly for such cases. If Plumbr reports that the problem lies in Jade code, then you should submit a bug report to them. Or it will help you spot the problem in your own application.
The problem was with another engine that was buffering objects for processing. JADE wasn't the culprit. I was using a common ESPER Engine and creating new objects for event processing from the data being parsed.
I'm investigating how to flush out those contents periodically without crashing the application.
Sorry for the trouble!

Does the application server affect Java memory usage?

Let's say I have a very large Java application that's deployed on Tomcat. Over the course of a few weeks, the server will run out of memory, application performance is degraded, and the server needs a restart.
Obviously the application has some memory leaks that need to be fixed.
My question is.. If the application were deployed to a different server, would there be any change in memory utilization?
Certainly the services offered by the application server might vary in their memory utilization, and if the server includes its own unique VM -- i.e., if you're using J9 or JRockit with one server and Oracle's JVM with another -- there are bound to be differences. One relevant area that does matter is class loading: some app servers have better behavior than others with regard to administration. Warm-starting the application after a configuration change can result in serious memory leaks due to class loading problems on some server/VM combinations.
But none of these are really going to help you with an application that leaks. It's the program using the memory, not the server, so changing the server isn't going to affect much of anything.
There will probably be a slight difference in memory utilisation, but only in as much as the footprint differs between servlet containers. There is also a slight chance that you've encountered a memory leak with the container - but this is doubtful.
The most likely issue is that your application has a memory leak - in any case, the cause is more important than a quick fix - what would you do if the 'new' container just happens to last an extra week etc? Moving the problem rarely solves it...
You need to start analysing the application's heap memory to locate the source of the problem. If your application is crashing with an OOME, you can add this to the JVM arguments:
-XX:+HeapDumpOnOutOfMemoryError
If the performance is just degrading until you restart the container manually, you should get into the routine of triggering periodic heap dumps. A timeline of dumps is often the most helpful, as you can see which object stores just keep growing over time.
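A simple way to take such periodic dumps on a HotSpot JDK is jmap (here <pid> is a placeholder for the JVM's process id; the live option triggers a full GC first so the dump only contains reachable objects):

jmap -dump:live,format=b,file=heap-1.hprof <pid>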
To do this, you'll need a heap analysis tool:
JHat or IBM Heap Analyser or whatever your preference :)
Also see this question:
Recommendations for a heap analysis tool for Java?
Update:
And this may help (for obvious reasons):
How do I analyze a .hprof file?

Performance drop after 5 days running web application, how to spot the bottleneck?

I've developed a web application using the following tech stack:
Java
Mysql
Scala
Play Framework
DavMail integration (for calendar and Exchange server)
Javamail
Akka actors
For the first few days, the application runs smoothly and without lag. But after 5 days or so, the application gets really slow! And now I have no clue how to profile this, since I have huge dependencies and it's hard to reproduce this kind of thing. I have looked into the memory and it seems that everything is okay.
Any pointers on the matter?
Try using VisualVM - you can monitor GC behaviour, memory usage, heap, threads, CPU usage, etc. You can use it to connect to a remote VM.
VisualVM is also a great tool for such purposes; you can connect to a remote JVM as well and see what's inside.
I suggest doing this:
take a snapshot of the application after it has been running for a few hours, and another after 5 days
compare thread counts
compare object counts, search for increasing numbers
see if your program spends more time in particular methods on the 5th day than on the 1st one
check for disk space, maybe you are running out of it
jconsole comes with the JDK and is an easy tool for spotting bottlenecks. Connect it to your server, look into memory usage and GC times, and take a look at how many threads are alive, because it could be that the server creates many threads that never exit.
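If you would rather check the same numbers from inside the application, the standard JMX beans that jconsole reads are also available programmatically; a minimal sketch (the class name is illustrative):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

public class VmStats {
    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        // A steadily climbing live/peak thread count suggests threads that are created but never exit.
        System.out.println("Live threads: " + threads.getThreadCount());
        System.out.println("Peak threads: " + threads.getPeakThreadCount());
        // Heap and non-heap usage, in bytes.
        System.out.println("Heap used: " + memory.getHeapMemoryUsage().getUsed());
        System.out.println("Non-heap used: " + memory.getNonHeapMemoryUsage().getUsed());
    }
}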
I agree with tulskiy. On top of that, you could also use JMeter if the investigations you make with jconsole are inconclusive.
The probable causes of the performance degradation are threads (that are created but never exit) and memory leaks: if you allocate more and more memory, you may encounter some performance degradation before hitting the OutOfMemoryError (this happened to me a few weeks ago).
To rule out your database, you can monitor slow queries (and/or queries that are not using an index) using the slow query log; a sample configuration is sketched below.
see: http://dev.mysql.com/doc/refman/5.1/en/slow-query-log.html
I would hazard a guess that you have a missing index, and it has only become apparent as your data volumes have increased.
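To turn on the slow query log, a my.cnf fragment along these lines should work; the thresholds are only example values:

# my.cnf (example values)
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1
log_queries_not_using_indexes = 1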
Yet another profiler is YourKit.
It is commercial, but comes with a trial period (two weeks).
Actually, I first tried VisualVM as #axel22 suggested, but our remote server was only reachable over SSH and we had problems connecting via VisualVM (not saying that it is impossible, I just gave up after a few hours).
You might just want to try the 'play status' command, which will list web app state (threads, jobs, etc). This might give you a hint on what's going on.
So, in this specific case, I was running Play in development mode, which makes the compiler run every now and then.
After changing to production mode, everything was lightning fast and the problems went away. But thanks for all the help.

JVM OutOfMemory error "death spiral" (not memory leak)

We have recently been migrating a number of applications from running under Red Hat Linux JDK 1.6.0_03 to Solaris 10u8 JDK 1.6.0_16 (much higher-spec machines), and we have noticed what seems to be a rather pressing problem: under certain loads our JVMs get themselves into a "Death Spiral" and eventually run out of memory. Things to note:
this is not a case of a memory leak. These are applications which have been running just fine (in one case for over 3 years), and the out-of-memory errors are not deterministic in any case: sometimes the applications work, sometimes they don't
this is not us moving to a 64-bit VM - we are still running 32 bit
In one case, using the latest G1 garbage collector on 1.6.0_18 seems to have solved the problem. In another, moving back to 1.6.0_03 has worked
Sometimes our apps are falling over with HotSpot SIGSEGV errors
This is affecting applications written in Java as well as Scala
The most important point is this: the behaviour manifests itself in those applications which suddenly get a deluge of data (usually via TCP). It's as if the VM decides to keep adding more data (possibly promoting it to the tenured generation) rather than running a GC on the young generation, until it realises that it has to do a full GC; and then, despite practically everything in the VM being garbage, it somehow decides not to collect it!
It sounds crazy, but I just don't see what else it could be. How else can you explain an app which one minute falls over with a max heap of 1Gb and the next works just fine (never going above 256M while doing exactly the same thing)?
So my questions are:
Has anyone else observed this kind of behaviour?
has anyone any suggestions as to how I might debug the JVM itself (as opposed to my app)? How do I prove this is a VM issue?
Are there any VM-specialist forums out there where I can ask the VM's authors (assuming they aren't on SO)? (We have no support contract)
If this is a bug in the latest versions of the VM, how come no-one else has noticed it?
Interesting problem. Sounds like one of the garbage collectors works poorly on your particular situation.
Have you tried changing the garbage collector being used? There are a LOT of GC options, and figuring out which ones are optimal seems to be a bit of a black art, but I wonder if a basic change would work for you.
I know there is a "Server" GC that tends to work a lot better than the default ones. Are you using that?
Threaded GC (which I believe is the default) is probably the worst for your particular situation; I've noticed that it tends to be much less aggressive when the machine is busy.
One thing I've noticed, it often takes two GCs to convince Java to actually take out the trash. I think the first one tends to unlink a bunch of objects and the second actually deletes them. What you might want to do is occasionally force two garbage collections. This WILL cause a significant GC pause, but I've never seen a case where it took more than two to clean out the entire heap.
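For what it's worth, "forcing two garbage collections" just means calling System.gc() back to back; keep in mind it is only a hint to the VM, and it is ignored entirely if -XX:+DisableExplicitGC is set:

// Request a full collection twice; the JVM is free to ignore these hints.
System.gc();
System.gc();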
I have had the same issue on Solaris machines, and I solved it by decreasing the maximum size of the JVM. The 32 bit Solaris implementation apparently needs some overhead room beyond what you allocate for the JVM when doing garbage collections. So, for example, with -Xmx3580M I'd get the errors you describe, but with -Xmx3072M it would be fine.
Yes, I've observed this behavior before, and usually after countless hours of tweaking JVM parameters it starts working.
Garbage Collection, especially in multithreaded situations is nondeterministic. Defining a bug in nondeterministic code can be a challenge. But you could try DTrace if you are using Solaris, and there are a lot of JVM options for peering into HotSpot.
Go on Scala IRC and see if Ismael Juma is hanging around (ijuma). He's helped me before, but I think real in-depth help requires paying for it.
I think most people doing this kind of stuff accept that they either need to be JVM tuning experts, have one on staff, or hire a consultant. There are people who specialize in JVM tuning.
In order to solve these problems I think you need to be able to replicate them in a controlled environment where you can precisely duplicate runs with different tuning parameters and/or code changes. If you can't do that hiring an expert probably isn't going to do you any good, and the cheapest way out of the problem is probably buying more RAM.
What kind of OutOfMemoryError are you getting? Is the heap space exhausted, or is the problem related to any of the other memory pools (the error usually has a message giving more details on its cause)?
If the heap is exhausted and the problem can be reproduced (it sounds as if it can), I would first of all configure the VM to produce a heap dump on OutOfMemoryErrors. You can then analyze the heap and make sure that it's not filled with objects that are still reachable through some unexpected references.
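On a HotSpot 1.6 VM that configuration would be something along these lines (the dump path is only an example); enabling basic GC logging at the same time usually helps:

-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/tmp/dumps -verbose:gc -XX:+PrintGCDetails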
It's of course not impossible that you are running into a VM bug, but if your application is relying on implementation-specific behaviour in 1.6.0_03, it may for some reason or another end up as a memory hog when running on 1.6.0_16. Such problems may also be found if you are using some kind of server container for your application. Some developers are obviously unable to read documentation, but tend to observe the API behaviour and draw their own conclusions about how something is supposed to work. This is of course not always correct, and I've run into similar problems both with Tomcat and with JBoss (both products at least used to work only with specific VMs).
Also make sure it's not a hardware fault (try running MemTest86 or similar on the server.)
Which kind of SIGSEGV errors exactly do you encounter?
If you run a 32bit VM, it could be what I described here: http://janvanbesien.blogspot.com/2009/08/mysterious-jvm-crashes-explained.html
