Preventing CloudFoundry from killing an application on OOM (out of memory) - java

We have a Java application, deployed on CloudFoundry, that occasionally throws an OOM (OutOfMemoryError) caused by requests that produce a large response payload.
When this happens, CloudFoundry kills the app and restarts it.
When the application is running on a development machine (rather than in CF), the OOM does not result in a crash (but does display an "out of heap memory" message in the output); usually the request-handler thread ends and the memory that was allocated for the request is garbage-collected. The application continues to run and successfully serves more requests.
Is there a way to configure CF to avoid restarting the app on OOM?
Thanks.

The short answer is no. The platform will always kill your app when you exceed the memory limit that you've assigned to it. This is the intended behavior. You cannot bypass this because that would essentially mean that your application has no memory limit.
On a side note, I would highly recommend using the Java buildpack v4.x (the latest), if you are not already. It is much better about configuring the JVM so that you get meaningful errors like JVM OOMEs instead of just letting your application crash. It also dumps helpful diagnostic information when this happens that will direct you to the source of the problem.
One other side note...
the OOM does not result in a crash (but does display an "out of heap memory" message in the output); usually the request-handler thread ends and the memory that was allocated for the request is garbage-collected.
You don't want to rely on this behavior. Once an OOME happens in the JVM, all bets are off. It may recover, or it may be left in a broken, unusable state; there's no way to know, because there's no way to know exactly where the OOME will strike. When you get an OOME, the best course of action is to obtain any diagnostic information you need and restart. This is exactly what the Java buildpack (v4+) does when your app runs on CF.
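To make the "all bets are off" point concrete, here is a minimal, CF-independent sketch (the class name is mine) of why the behavior on a dev machine looks recoverable: an OutOfMemoryError raised in one request-handler thread does not by itself terminate the JVM, but nothing guarantees the heap is healthy afterwards.

```java
public class OomeSurvivalDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                // Provoke an allocation failure confined to this thread.
                long[] huge = new long[Integer.MAX_VALUE];
                System.out.println(huge.length); // never reached
            } catch (OutOfMemoryError e) {
                // The error unwinds only this thread's stack.
                System.out.println("worker thread hit: " + e);
            }
        });
        worker.start();
        worker.join();
        // The JVM itself is still running -- but its state is now suspect.
        System.out.println("main thread still alive");
    }
}
```

This is the scenario the dev machine shows; the platform's memory limit is a separate, harder boundary that the JVM cannot survive.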
Hope that helps!

Related

Why does a Spring Boot web app respond slower after being idle?

I usually use Spring Boot + JPA + Hibernate + Postgres.
At the end of development I package the web application as a JAR, run it directly with Java, and put it behind an Apache (httpd) reverse proxy.
I have noticed that there are no problems or latency at startup; the website works very quickly. But after several hours pass without anyone making a request to the server, the next access takes at least 20 seconds before the server responds. After that, I can continue to access the site normally.
Why does this happen? It is as if Spring were going into a standby mode whenever it detects that it has no request load, but I am not sure if that is the case or if it is a problem. If it is some native Spring functionality, how can I disable it?
Even if it means using a little more memory while idle, I want responses to be fast regardless of whether the app is under load or not.
Without knowing more, it is likely that while your webapp sits idle, other programs on your server are using memory and causing the JVM's memory to be swapped to disk.
When you then access the webapp again, the OS has to swap that JVM memory back into RAM, one page at a time. That takes time, but once the memory is back in RAM, your webapp will run normally.
Unfortunately, the way Java memory works, swapping JVM memory to disk is very bad for performance. That is an issue for most languages that rely on garbage collectors to free memory. Languages with manual memory management, e.g. C++ code, will usually not be hit as badly, when memory is swapped to disk, because memory use is more "focused" in those languages.
Solution: If my guess at the cause of your problem is correct, reconfigure your server so the JVM memory won't be swapped to disk.
Note that when I say server, I mean the physical machine. The "other programs" that your JVM is fighting with for memory might be running in different VMs, i.e. not in the same OS.
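If my swapping guess is right, the usual knob on Linux (an assumption about your OS) is vm.swappiness; setting it near zero tells the kernel to avoid swapping anonymous pages such as the JVM heap except under real memory pressure:

```
# /etc/sysctl.d/99-jvm.conf -- discourage swapping of the JVM heap (Linux)
# Apply with: sudo sysctl --system
vm.swappiness = 1
```

The more thorough fix is simply to give the machine enough RAM for everything that runs on it, so nothing needs to be swapped in the first place.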

Access Memory Usage of JVM from within my Application?

I have a Grails/Spring application which runs in a servlet container on a web server like Tomcat. Sometimes my app crashes because the JVM reaches its maximum allowed memory (-Xmx).
The error that follows is a "java.lang.OutOfMemoryError" because the Java heap space is full.
To prevent this error I want to check from within my app how much memory is in use and how much memory the current JVM has remaining.
How can I access these parameters from within my application?
Try to understand when the OOM is thrown instead of trying to work around it from within the application. And even if you were able to capture those values from within your application, how would you prevent the error? By calling GC explicitly? Note that:
The Java Virtual Machine specification says:
OutOfMemoryError: The Java virtual machine implementation has run out of either virtual or physical memory, and the automatic storage manager was unable to reclaim enough memory to satisfy an object creation request.
Therefore, GC is guaranteed to run before an OOM is thrown. Your application throws an OOME only after it has just run a full garbage collection and discovered that it still doesn't have enough free heap to proceed.
This points to either a memory leak or a genuinely high memory requirement. If the OOM is thrown within a short span of starting the application, it usually means the application needs more memory; if your server runs fine for some time and then throws an OOM, it is most likely a memory leak.
To discover the memory leak, use the tools mentioned in the other answers. I use New Relic to monitor my application and check the frequency of GC runs.
PS Scavenge (aka minor GC, the parallel object collector) runs on the young generation only, and PS MarkSweep (aka major GC, the parallel mark-and-sweep collector) is for the old generation. When both run, it's considered a full GC. Minor GC runs are pretty frequent; a full GC is comparatively less frequent. Note the consumption of the different heap spaces to analyze your application.
You can also try the following option: if you get OOMs too often, start Java with the correct flags, get a heap dump, and analyze it with jhat or with the Eclipse Memory Analyzer (http://www.eclipse.org/mat/):
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<path to dump file>
You can try the Grails Melody plugin, which displays the info at the URL /monitoring relative to your context.
To prevent this error I want to check from within my app how much memory is in use and how much memory the current JVM has remaining.
I don't think proceeding this way is the best idea. It is much better to investigate what actually breaks your app and eliminate the error, or add some limitation there. There could be many different scenarios, and your app can become unpredictable. To sum up: capturing memory levels for monitoring purposes is fine (though there are many dedicated tools for that), but in my opinion, depending on these values in application logic is bad practice and not recommended.
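That said, for pure monitoring the numbers are easy to read from inside the JVM, either via Runtime or the java.lang.management API; a small sketch (the class name is mine):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapStats {
    public static void main(String[] args) {
        // Coarse view: used = total allocated minus currently free.
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        System.out.printf("used=%d bytes, max=%d bytes%n", used, rt.maxMemory());

        // Finer view: the same data, split into heap and non-heap pools.
        MemoryMXBean mx = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = mx.getHeapMemoryUsage();
        System.out.printf("heap used=%d, committed=%d, max=%d%n",
                heap.getUsed(), heap.getCommitted(), heap.getMax());
    }
}
```

Feeding these numbers into a dashboard is fine; branching application logic on them is exactly the bad practice warned about above.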
To do this you would use a profiler to profile your application and JVM, rather than having code inside your application to monitor such metrics.
Profiling is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or frequency and duration of function calls
Here are some good java profilers:
http://visualvm.java.net/ (Free)
http://www.ej-technologies.com/products/jprofiler/overview.html (Paid)

Performance --- response time slows down

We have a Java EE application (JSP/Servlet, JDBC) running on an Apache Tomcat server. The response time slows down over time, and degrades faster when the server is worked on continuously.
The response time is back to normal after restart of the server.
I connected JConsole to the server and watched the heap memory usage (screenshot attached): it goes up during intensive work, the garbage collector kicks in periodically, and memory usage comes back down.
However, when testing towards the end, the response time does not go down even after triggering the garbage collector manually.
I also checked the connections, and they seem to be closing properly, i.e. I do not notice any connection leaks.
Any help is appreciated.
Attach with jvisualvm (included in the JDK). It allows you to profile Tomcat and find out where the time goes.
My guess right now is the database connections. Either they go stale or the pool runs dry.
How much slower are the response times? Have you done any profiling or logging to help know which parts of your app are slower? It might be useful to setup a simple servlet to see if that also slows down as the other does. That might tell you if Tomcat or something in your app is slowing down.
Did you fine-tune your Tomcat memory settings? Perhaps you need to increase the perm gen size a bit, e.g.
-XX:MaxPermSize=512M
(Note that on Java 8 and later, perm gen was replaced by metaspace, and the corresponding flag is -XX:MaxMetaspaceSize.)
You can know it for sure if you can get a heap dump and load it into a tool like Memory Analyzer.

Does the application server affect Java memory usage?

Let's say I have a very large Java application that's deployed on Tomcat. Over the course of a few weeks, the server will run out of memory, application performance is degraded, and the server needs a restart.
Obviously the application has some memory leaks that need to be fixed.
My question is.. If the application were deployed to a different server, would there be any change in memory utilization?
Certainly the services offered by the application server may vary in their memory utilization, and if the server bundles its own VM (i.e., if you're using J9 or JRockit with one server and Oracle's JVM with another) there are bound to be differences. One relevant area that does matter is class loading: some app servers behave better than others here. Warm-starting the application after a configuration change can result in serious memory leaks, due to class-loading problems, on some server/VM combinations.
But none of these are really going to help you with an application that leaks. It's the program using the memory, not the server, so changing the server isn't going to affect much of anything.
There will probably be a slight difference in memory utilisation, but only in as much as the footprint differs between servlet containers. There is also a slight chance that you've encountered a memory leak with the container - but this is doubtful.
The most likely issue is that your application has a memory leak - in any case, the cause is more important than a quick fix - what would you do if the 'new' container just happens to last an extra week etc? Moving the problem rarely solves it...
You need to start analysing the application's heap memory to locate the source of the problem. If your application is crashing with an OOME, you can add this to the JVM arguments:
-XX:+HeapDumpOnOutOfMemoryError
If the performance just degrades until you restart the container manually, you should get into the routine of triggering periodic heap dumps. A timeline of dumps is often the most helpful, as you can see which object stores just grow over time.
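If running jmap on a schedule is awkward, dumps can also be triggered from inside the JVM via HotSpot's diagnostic MBean (a sketch; the com.sun.management API is HotSpot-specific, and the class name is mine):

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class HeapDumper {
    public static void main(String[] args) throws Exception {
        // Look up HotSpot's diagnostic MBean on the platform MBean server.
        HotSpotDiagnosticMXBean diag = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        // live=true forces a GC first, so the dump holds only reachable objects.
        String file = "heap-" + System.currentTimeMillis() + ".hprof";
        diag.dumpHeap(file, true);
        System.out.println("wrote " + file);
    }
}
```

Calling something like this from a scheduled task produces the timeline of dumps described above, which you can then diff in a heap analysis tool.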
To do this, you'll need a heap analysis tool:
JHat or IBM Heap Analyser or whatever your preference :)
Also see this question:
Recommendations for a heap analysis tool for Java?
Update:
And this may help (for obvious reasons):
How do I analyze a .hprof file?

top reasons why an app server crashes

What are the most likely causes for application server failure?
For example: "out of disk space" is more likely than "2 of the drives in a RAID 4 setup die simultaneously".
My particular environment is Java, so Java-specific answers are welcome, but not required.
EDIT: just to clarify, I'm looking for downtime-related crashes (out of memory is a good example), not just one-time issues (like a temporary network glitch).
If you are trying to keep an application server up, start monitoring it. Nagios, Big Sister, and other Network Monitoring tools can be very useful.
Watch memory availability/usage, disk availability/usage, CPU availability/usage, etc.
The most common reason why a server goes down is rarely the same reason twice. Someone "fixes" the last-most-common-reason, and a new-most-common-reason is born.
Edwin is right - you need monitoring to understand what the problem is. Or better - understand what the problem is AND prevent it from causing downtime.
You should not only track resource consumption but also demand. The difference between the two shows you if you have sized your server correctly.
There are a ton of open source tools like Nagios, collectd, etc. that can give you server-specific data; that's only monitoring, though, not prevention. Librato Silverline (disclosure: I work there) allows you to monitor individual processes and then throttle the resources they use by placing them in application containers for which you define resource policies.
If your server is 8 cores or less you can use it for free.
"Out of Memory" exception due to memory leaks.
All sorts of things can cause a server to crash, ranging from busted hardware (e.g. disk failures) to faulty code (memory leak resulting in an out of memory exception, network failure that got rethrown as a runtime exception and was never caught, in servers that aren't Java servers a SEGFAULT, etc.)
At first, it is usually because of memory leaks, disk space problems, or endless loops eating up the CPU.
Once you monitor those issues and set up correct logging and warning mechanisms, they turn meta on you: exploding error handling becomes a possible reason for a full lockup. An error occurs (or more likely, two in an unhappy combination), but when the handler tries to write to the log files or send a warning (by mail or otherwise), it hits another error, which it tries to handle by writing to the log file or sending a warning... and this continues until one of the resources gives out. It can lead to skyrocketing server load, memory problems, a full disk, or locked-up network traffic, which means a remote user won't be able to get in to correct the problem.
