Java - programmatically reduce application load when runs out of memory - java

No, really, that's what I'm trying to do. Server is holding onto 1600 users - back end long-running process, not web server - but sometimes the users generate more activity than usual, so it needs to cut its load down, specifically when it runs out of "resources," which pretty much means heap memory. This is a big design question - how to design this?
This will likely involve preventing OOM rather than recovering from it. Ideally something like
if (nearlyOutOfMemory()) throw new MyRecoverableOOMException();
would happen.
But I don't really know what that nearlyOutOfMemory() function might look like.

Split the server into shards, each holding fewer users but residing in different physical machines.
If you have lots of caches, try to use soft references, which get cleared out when the VM runs out of heap.
In any case, profile, profile, profile first to see where CPU time is consumed and memory is allocated and held onto.
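To make the soft-reference suggestion concrete, here is a minimal sketch of a cache whose values the VM is allowed to discard under memory pressure. The class name SoftCache and the loader callback are hypothetical, chosen only for illustration; a real cache would also want to purge cleared entries.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper: values are held through SoftReference, so the GC
// may clear them when heap runs short instead of throwing OutOfMemoryError.
class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    /** Returns the cached value, reloading it if it was never cached or was cleared. */
    V get(K key, Function<K, V> loader) {
        SoftReference<V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);
            map.put(key, new SoftReference<>(value));
        }
        return value;
    }

    /** Number of entries, including ones whose referent has been cleared. */
    int size() {
        return map.size();
    }
}
```

Callers must be prepared for the loader to run again at any time, since a cleared entry is indistinguishable from a missing one.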

I actually asked a similar question about handling OOM, and it turns out there aren't many options for recovering from it. Basically you can:
1) invoke external shell script (-XX:OnOutOfMemoryError="cmd args;cmd args") which would trigger some action. The problem is that if OOM has happened in some thread which doesn't have a decent recovery strategy, you're doomed.
2) Define a threshold for Old gen which technically isn't OOM but a few steps ahead, say 80% and act if the threshold has been reached. More details here.
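One way to sketch option 2 is with the standard java.lang.management API, which can raise a notification when a heap pool stays above a threshold after a collection. The class name HeapThresholdWatcher and the overloaded flag are hypothetical; the 80% figure is the one mentioned above. This is an assumption about how such a check might be wired, not the approach behind the missing link.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.management.NotificationEmitter;

// Hypothetical watcher: sets a flag when a heap pool is still above the
// given fraction of its maximum right after a garbage collection.
class HeapThresholdWatcher {
    private static final AtomicBoolean overloaded = new AtomicBoolean(false);

    /** Arms every heap pool that supports the feature; returns how many were armed. */
    static int install(double fraction) {
        int armed = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP
                    || !pool.isCollectionUsageThresholdSupported()) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null || usage.getMax() <= 0) {
                continue; // pool invalid or maximum undefined
            }
            pool.setCollectionUsageThreshold((long) (usage.getMax() * fraction));
            armed++;
        }
        // The platform MemoryMXBean emits the threshold notifications.
        NotificationEmitter emitter =
                (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener((notification, handback) -> {
            if (MemoryNotificationInfo.MEMORY_COLLECTION_THRESHOLD_EXCEEDED
                    .equals(notification.getType())) {
                overloaded.set(true);
            }
        }, null, null);
        return armed;
    }

    static boolean isOverloaded() {
        return overloaded.get();
    }
}
```

The collection usage threshold is used here rather than the plain usage threshold because it is checked after a GC, so uncollected garbage does not trigger it. Application code would call install(0.8) once at startup and consult isOverloaded() before accepting new work; resetting the flag once memory recovers is left out of this sketch.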

You could use Runtime.getRuntime() and the following methods:
freeMemory()
totalMemory()
maxMemory()
But I agree with the other posters: using SoftReference, WeakReference or a WeakHashMap will probably save you the trouble of manually recovering from that condition.
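As a rough sketch, the three Runtime methods can be combined into the nearlyOutOfMemory() check the question asks about. The class name HeapCheck and the limit parameter are hypothetical. Note that the "used" figure includes garbage that has not been collected yet, so this check can report pressure that a GC would relieve.

```java
// Hypothetical helper built only on Runtime's memory accessors.
class HeapCheck {
    /** Fraction of the maximum heap currently occupied (live objects plus uncollected garbage). */
    static double usedFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    /** True when heap occupancy has reached the given fraction, e.g. 0.8. */
    static boolean nearlyOutOfMemory(double limit) {
        return usedFraction() >= limit;
    }
}
```

A caller could then write if (HeapCheck.nearlyOutOfMemory(0.8)) and shed load at that point, keeping in mind that the reading is only a snapshot.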

A throttling, resource regulating servlet filter may be of use too. I did encounter DoSFilter of jetty/eclipse.

Related

frequent garbage collection java web app

I have a web app that serializes a java bean into xml or json according to the user request.
I am facing a mind-bending problem: when I put a little bit of load on it, it quickly uses all of its allocated memory and reaches max capacity. I then observe full GC working really hard every 20-40 seconds.
It doesn't look like a memory leak... but I am not quite sure how to troubleshoot this.
The bean that is serialized to xml/json has references to other beans, and those to others. I use json-lib and jaxb to serialize the beans.
The YourKit memory profiler is telling me that char[] is the most memory-consuming live object type...
Any insight is appreciated.
There are two possibilities: you've got a memory leak, or your webapp is just generating lots of garbage.
The brute-force way to tell if you've got a memory leak is to run it for a long time and see if it falls over with an OOME. Or turn on GC logging, and see if the average space left after garbage collection continually trends upwards over time.
Whether or not you have a memory leak, you can probably improve performance (reduce the percentage GC time) by increasing the max heap size. The fact that your webapp is seeing lots of full GCs suggests to me that it needs more heap. (This is just a bandaid solution if you have a memory leak.)
If it turns out that you are not suffering from a memory leak, then you should take a look at why your application is generating so much garbage. It could be down to the way that you are doing the XML and JSON serialization.
Why do you think you have a problem? GC is a natural and normal thing to happen. We have customers that GC every second (for less than 100ms duration), and that's fine as long as memory keeps getting reclaimed.
GCing every 20-40 seconds isn't a problem IMO - as long as it doesn't take a large % of that 20-40s. Most major commercial JVMs aim to keep GC in the 5-10% of time range (so 1-4 seconds of that 20-40s). Posting more data in the form of the GC logs might help, and I'd also suggest tools like GCMV would help you visualize and get recommendations on what your GC profile looks like.
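For a quick reading of the percentage discussed above without parsing GC logs, the JVM's management beans expose cumulative collection time. The following is a sketch only; the class name GcOverhead is hypothetical, and for concurrent collectors the reported time is not purely pause time, so treat the result as an approximation.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: estimates how much of the JVM's uptime went to GC.
class GcOverhead {
    /** Total milliseconds spent in GC so far, summed over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Approximate fraction of JVM uptime spent in garbage collection. */
    static double gcFraction() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptime <= 0 ? 0.0 : (double) totalGcMillis() / uptime;
    }
}
```

Sampling gcFraction() periodically and comparing it with the 5-10% range mentioned above gives a first indication of whether GC time is actually a problem.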
It's impossible to diagnose this without a lot more information - code and GC logs - but my guess would be that you're reading data in as large strings, then chopping out little bits with substring(). On older JVMs (before JDK 7u6), the substring is made using the same underlying character array as the parent string, and so as long as it's alive, it will keep that array in memory. That means code like this:
String big = new String(new char[1000000]); // a string of one million characters
String small = big.substring(0, 1);
big = null;
will still keep the huge string's character data in memory. If this is the case, then you can address it by forcing the small strings to use fresh, smaller character arrays by constructing new instances:
small = new String(small);
Since JDK 7u6, substring() copies the characters it needs, so this workaround only matters on earlier versions.
But like I said, this is just a guess.
I'm not sure how much of it is in your code and how much might be in the tools you are using, but there are some key things to watch for.
One of the worst is if you constantly add to strings in loops. A simple "hello"+"world" is no problem at all, it's actually very smart about that, but if you do it in a loop it will constantly reallocate the string. Use StringBuilder where you can.
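A small sketch of the difference, with hypothetical method names: both methods produce the same result, but the first creates a new String on every iteration while the second appends into one growing buffer.

```java
// Hypothetical demo of string concatenation in a loop.
class JoinDemo {
    /** Allocates a fresh String (and copies all previous characters) on each iteration. */
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    /** Appends into a single StringBuilder and converts to a String once at the end. */
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }
}
```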
There are profilers for Java that should quickly point you to where the allocations are taking place. Just fool around with a profiler for a while while your java app is running and you will probably be able to reduce your GCs to virtually nothing unless the problem is inside your libraries--and even then you may figure out some way to fix it.
Things you allocate and then free quickly don't require time in the GC phase--it's pretty much free. Be sure you aren't keeping Strings around longer than you need them. Bring them in, process them and return to your previous state before returning from your request handler.
You should attach yourkit and record allocations (e.g., every 10th allocation; including all large ones). They have a step by step guide on diagnosing excessive gc:
http://www.yourkit.com/docs/90/help/excessive_gc.jsp
To me that sounds like you are trying to serialize a recursive object by some encoder which is not prepared for it.
(or at least: very deep/almost recursive)
Java's native XML API is really "noisy" and generally wasteful in terms of resources which means that if your requests and XML/JSON generation cycles are short-lived, the GC will have lots to clean up for.
I have debugged a very similar case and found this out the hard way. The only way I could at least somewhat improve the situation without major refactoring was explicitly calling GC, with the appropriate VM flags that turn System.gc() from a no-op call into a maybe-op call.
I would start by inspecting my running application to see what was being created on the heap.
HPROF can collect this information for you, which you can then analyse using HAT.
To debug issues with memory allocations, InMemProfiler can be used at the command line. Collected object allocations can be tracked and collected objects can be split into buckets based on their lifetimes.
In trace mode this tool can be used to identify the source of memory allocations.

out of memory error , my app's fault?

I have an application on the Android Market, in which exceptions and errors are caught and sent to me by ACRA.
But I receive quite a lot of out-of-memory errors,
in different kinds of classes... some from my app, some general Java.
Does this always mean there is a problem in my app, or can it also be that the phone ran out of memory due to another process?
Will users also get a force-close (FC) dialog?
Additional Information
There is nothing memory intensive in my app:
no images, no big chunks of data,
only a simple view, and the most intensive part is a Mobclix ad.
I'm new to Java, so I may have a leak somewhere, but I find it hard to debug that.
But at this point I'm not even sure there is something wrong.
I get about 25-50 OOM errors daily, but compared to the 60,000 ads it shows a day
(I show only 1 or 2 ads each time it's started), that is not too much.
I receive errors like:
"java.lang.OutOfMemoryError
at org.apache.http.impl.io.AbstractSessionInputBuffer.init(AbstractSessionInputBuffer.java:79)
at org.apache.http.impl.io.SocketInputBuffer.<init>(SocketInputBuffer.java:93)
at android.net.http.AndroidHttpClientConnection.bind(AndroidHttpClientConnection.java:114)
at android.net.http.HttpConnection.openConnection(HttpConnection.java:61)
at android.net.http.Connection.openHttpConnection(Connection.java:378)
at android.net.http.Connection.processRequests(Connection.java:237)
at android.net.http.ConnectionThread.run(ConnectionThread.java:125)
"
"java.lang.OutOfMemoryError
at java.io.BufferedReader.<init>(BufferedReader.java:102)
at com.mobclix.android.sdk.Mobclix$FetchResponseThread.run(Mobclix.java:1422)
at com.mobclix.android.sdk.MobclixAdView$FetchAdResponseThread.run(MobclixAdView.java:390)
at java.util.Timer$TimerImpl.run(Timer.java:290)
"
"java.lang.OutOfMemoryError
at org.apache.http.util.ByteArrayBuffer.<init>(ByteArrayBuffer.java:53)
at org.apache.http.impl.io.AbstractSessionOutputBuffer.init(AbstractSessionOutputBuffer.java:77)
at org.apache.http.impl.io.SocketOutputBuffer.<init>(SocketOutputBuffer.java:76)
at android.net.http.AndroidHttpClientConnection.bind(AndroidHttpClientConnection.java:115)
at android.net.http.HttpConnection.openConnection(HttpConnection.java:61)
at android.net.http.Connection.openHttpConnection(Connection.java:378)
at android.net.http.Connection.processRequests(Connection.java:237)
at android.net.http.ConnectionThread.run(ConnectionThread.java:125)
"
So the main question is: am I leaking somewhere,
or can this be considered normal because in a small % of cases the phone may be out of memory due to other applications running on it?
A common JVM problem is that only unreferenced objects can be removed by the Garbage Collector. If you have large persistent objects then it's important to set unused variables in those objects to null so that they are dereferenced. A classic problem is keeping something like a HashMap object around with a lot of values in it when you don't need it since every entry in the HashMap is chewing up memory.
Have you used allocation tracker in DDMS? Could help you find unexpected memory leaks.
http://developer.android.com/resources/articles/track-mem.html
(I haven't used it myself so far though)
As Thomas suggested, you really want to use the DDMS to look at your memory usage.
Also, a very common problem for leaks is use of static variables - use them only if you know what you're doing.
Handling bitmaps can also get very expensive on Android. What does your app do? Also, do you have lots of references to any UI elements? Are any of them defined as static?
There are things that may be out of your control (memory on the phone is an example) but nonetheless you're responsible for the behavior of your application.
How you handle memory issues will influence how users view your application. If it plays well with other applications, users will be more likely to use it. If it doesn't, they won't.
What do you mean by "general java" exceptions and if these are unrelated to your piece of software, then why are you receiving them?
As you probably know, the Dalvik virtual machine only has a small amount of memory allotted to itself (and to your application). This is implemented this way to avoid the possibility of a process growing out of control and draining all of the available resources, making the phone unusable. So if your application is performing many memory-intensive operations (like loading pictures) and you are not careful with your allocations (and clearing them as soon as they are unneeded), then bizarre outcomes may be observed.
About the force close: since you are catching these exceptions, they should not cause a crash of your application, unless you have failed to re-instantiate something after catching an exception.
Maybe inspection of your code and elimination of unneeded memory allocations will prove helpful. Also, you can test as my boss does - he just freaks out pushing buttons at random until something crashes :D
EDIT
Since you say that there is nothing memory expensive in your code (sans the ads probably), then you can have a simple check to see if the whole system is being low on memory when the error occurs, or it is your application that causes it. Have a look at the onLowMemory callback. It is called when the whole phone is low on memory.
When you get an OutOfMemoryError, you can be sure it is your application and not another one that causes it. Each Android app runs in its own Dalvik VM with a maximum memory allocation of 16 MB.
If you do not use bitmaps (which are a frequent source of memory leaks), you also have to check if you handle orientation changes correctly, that is without keeping in memory any reference to an object relative to the UI.

Why is it bad practice to call System.gc()?

After answering a question about how to force-free objects in Java (the guy was clearing a 1.5GB HashMap) with System.gc(), I was told it's bad practice to call System.gc() manually, but the comments were not entirely convincing. In addition, no one seemed to dare to upvote, nor downvote my answer.
I was told there that it's bad practice, but then I was also told that garbage collector runs don't systematically stop the world anymore, and that it could also effectively be used by the JVM only as a hint, so I'm kind of at loss.
I do understand that the JVM usually knows better than you when it needs to reclaim memory. I also understand that worrying about a few kilobytes of data is silly. I also understand that even megabytes of data isn't what it was a few years back. But still, 1.5 gigabytes? And you know there's like 1.5 GB of data hanging around in memory; it's not like it's a shot in the dark. Is System.gc() systematically bad, or is there some point at which it becomes okay?
So the question is actually double:
Why is or isn't it bad practice to call System.gc()? Is it really merely a hint to the JVM under certain implementations, or is it always a full collection cycle? Are there really garbage collector implementations that can do their work without stopping the world? Please shed some light over the various assertions people have made in the comments to my answer.
Where's the threshold? Is it never a good idea to call System.gc(), or are there times when it's acceptable? If so, what are those times?
The reason everyone always says to avoid System.gc() is that it is a pretty good indicator of fundamentally broken code. Any code that depends on it for correctness is certainly broken; any that rely on it for performance are most likely broken.
You don't know what sort of garbage collector you are running under. There are certainly some that do not "stop the world" as you assert, but some JVMs aren't that smart or for various reasons (perhaps they are on a phone?) don't do it. You don't know what it's going to do.
Also, it's not guaranteed to do anything. The JVM may just entirely ignore your request.
The combination of "you don't know what it will do," "you don't know if it will even help," and "you shouldn't need to call it anyway" are why people are so forceful in saying that generally you shouldn't call it. I think it's a case of "if you need to ask whether you should be using this, you shouldn't"
EDIT to address a few concerns from the other thread:
After reading the thread you linked, there's a few more things I'd like to point out.
First, someone suggested that calling gc() may return memory to the system. That's certainly not necessarily true - the Java heap itself grows independently of Java allocations.
As in, the JVM will hold memory (many tens of megabytes) and grow the heap as necessary. It doesn't necessarily return that memory to the system even when you free Java objects; it is perfectly free to hold on to the allocated memory to use for future Java allocations.
To show that it's possible that System.gc() does nothing, view
JDK bug 6668279
and in particular that there's a -XX:DisableExplicitGC VM option:
By default calls to System.gc() are enabled (-XX:-DisableExplicitGC). Use -XX:+DisableExplicitGC to disable calls to System.gc(). Note that the JVM still performs garbage collection when necessary.
It has already been explained that calling System.gc() may do nothing, and that any code that "needs" the garbage collector to run is broken.
However, the pragmatic reason that it is bad practice to call System.gc() is that it is inefficient. And in the worst case, it is horribly inefficient! Let me explain.
A typical GC algorithm identifies garbage by traversing all non-garbage objects in the heap, and inferring that any object not visited must be garbage. From this, we can model the total work of a garbage collection as consisting of one part that is proportional to the amount of live data, and another part that is proportional to the amount of garbage; i.e. work = (live * W1 + garbage * W2).
Now suppose that you do the following in a single-threaded application.
System.gc(); System.gc();
The first call will (we predict) do (live * W1 + garbage * W2) work, and get rid of the outstanding garbage.
The second call will do (live* W1 + 0 * W2) work and reclaim nothing. In other words we have done (live * W1) work and achieved absolutely nothing.
We can model the efficiency of the collector as the amount of work needed to collect a unit of garbage; i.e. efficiency = (live * W1 + garbage * W2) / garbage. So to make the GC as efficient as possible, we need to maximize the value of garbage when we run the GC; i.e. wait until the heap is full. (And also, make the heap as big as possible. But that is a separate topic.)
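Dividing the model through by the amount of garbage makes the dependence explicit. The numbers in the worked example are illustrative assumptions (arbitrary units, W1 = W2 = 1), not measurements.

```latex
\[
\frac{\text{live}\cdot W_1 + \text{garbage}\cdot W_2}{\text{garbage}}
  \;=\; W_2 + \frac{\text{live}\cdot W_1}{\text{garbage}}
\]
% Worked example with live = 100 and W_1 = W_2 = 1:
%   garbage = 10    gives  1 + 100/10   = 11   units of work per unit of garbage
%   garbage = 1000  gives  1 + 100/1000 = 1.1  units of work per unit of garbage
```

The fixed cost of tracing the live data is spread over more reclaimed memory as the amount of garbage grows, so the work per unit of garbage falls toward W2.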
If the application does not interfere (by calling System.gc()), the GC will wait until the heap is full before running, resulting in efficient collection of garbage [1]. But if the application forces the GC to run, the chances are that the heap won't be full, and the result will be that garbage is collected inefficiently. And the more often the application forces GC, the more inefficient the GC becomes.
Note: the above explanation glosses over the fact that a typical modern GC partitions the heap into "spaces", the GC may dynamically expand the heap, the application's working set of non-garbage objects may vary and so on. Even so, the same basic principle applies across the board to all true garbage collectors [2]. It is inefficient to force the GC to run.
[1] - This is how the "throughput" collector works. Concurrent collectors such as CMS and G1 use different criteria to decide when to start the garbage collector.
[2] - I'm also excluding memory managers that use reference counting exclusively, but no current Java implementation uses that approach ... for good reason.
Lots of people seem to be telling you not to do this. I disagree. If, after a large loading process like loading a level, you believe that:
You have a lot of objects that are unreachable and may not have been gc'ed. and
You think the user could put up with a small slowdown at this point
there is no harm in calling System.gc(). I look at it like the c/c++ inline keyword. It's just a hint to the gc that you, the developer, have decided that time/performance is not as important as it usually is and that some of it could be used reclaiming memory.
Advice to not rely on it doing anything is correct. Don't rely on it working, but giving the hint that now is an acceptable time to collect is perfectly fine. I'd rather waste time at a point in the code where it doesn't matter (loading screen) than when the user is actively interacting with the program (like during a level of a game.)
There is one time when I will force collection: when attempting to find out if a particular object leaks (either native code or large, complex callback interaction. Oh, and any UI component that so much as glances at Matlab.) This should never be used in production code.
People have been doing a good job explaining why NOT to use, so I will tell you a couple situations where you should use it:
(The following comments apply to Hotspot running on Linux with the CMS collector, where I feel confident saying that System.gc() does in fact always invoke a full garbage collection).
1) After the initial work of starting up your application, you may be in a terrible state of memory usage. Half your tenured generation could be full of garbage, meaning that you are that much closer to your first CMS. In applications where that matters, it is not a bad idea to call System.gc() to "reset" your heap to the starting state of live data.
2) Along the same lines as #1, if you monitor your heap usage closely, you want to have an accurate reading of what your baseline memory usage is. If the first 2 minutes of your application's uptime is all initialization, your data is going to be messed up unless you force (ahem... "suggest") the full gc up front.
3) You may have an application that is designed to never promote anything to the tenured generation while it is running. But maybe you need to initialize some data up-front that is not so huge as to automatically get moved to the tenured generation. Unless you call System.gc() after everything is set up, your data could sit in the new generation until the time comes for it to get promoted. All of a sudden your super-duper low-latency, low-GC application gets hit with a HUGE (relatively speaking, of course) latency penalty for promoting those objects during normal operations.
4) It is sometimes useful to have a System.gc() call available in a production application for verifying the existence of a memory leak. If you know that the set of live data at time X should exist in a certain ratio to the set of live data at time Y, then it could be useful to call System.gc() at time X and time Y and compare memory usage.
This is a very bothersome question, and I feel contributes to many being opposed to Java despite how useful of a language it is.
The fact that you can't trust System.gc() to do anything is incredibly daunting and can easily give the language a "Fear, Uncertainty, Doubt" feel.
In many cases, it would be nice to deal with memory spikes that you cause on purpose before an important event occurs, rather than have users conclude that your program is badly designed or unresponsive.
Having the ability to control garbage collection would be a great educational tool, in turn improving people's understanding of how garbage collection works and how to make programs exploit its default behavior as well as controlled behavior.
Let me review the arguments of this thread.
It is inefficient:
Often, the program may not be doing anything and you know it's not doing anything because of the way it was designed. For instance, it might be doing some kind of long wait with a large wait message box, and at the end it may as well add a call to collect garbage because the time to run it will take a really small fraction of the time of the long wait but will avoid gc from acting up in the middle of a more important operation.
It is always a bad practice and indicates broken code.
I disagree; it doesn't matter what garbage collector you have. Its job is to track garbage and clean it.
By calling the GC during times when usage is less critical, you reduce the odds of it running at a moment when your life depends on a specific piece of code being run but the JVM decides to collect garbage instead.
Sure, it might not behave the way you want or expect, but when you do want to call it, you know nothing else is happening and the user is willing to tolerate slowness/downtime. If System.gc() works, great! If it doesn't, at least you tried. There's simply no downside unless the garbage collector has inherent side effects that make it do something horribly unexpected when invoked manually, and that by itself causes distrust.
It is not a common use case:
It is a use case that cannot be achieved reliably, but could be if the system was designed that way. It's like building traffic lights where some or all of the buttons don't do anything: it makes you question why the button is there to begin with. JavaScript doesn't have a garbage collection function, so we don't scrutinize it as much for this.
The spec says that System.gc() is a hint that GC should run and the VM is free to ignore it.
What is a "hint"? What is "ignore"? A computer cannot simply take hints or ignore something; there are strict behavior paths it takes, possibly dynamic, that are guided by the intent of the system. A proper answer would include what the garbage collector is actually doing, at implementation level, that causes it to not perform collection when you request it. Is the feature simply a no-op? Are there conditions that must be met? What are these conditions?
As it stands, Java's GC often seems like a monster that you just don't trust. You don't know when it's going to come or go, you don't know what it's going to do, how it's going to do it. I can imagine some experts having better idea of how their Garbage Collection works on per-instruction basis, but vast majority simply hopes it "just works", and having to trust an opaque-seeming algorithm to do work for you is frustrating.
There is a big gap between reading about something or being taught something, and actually seeing the implementation of it, the differences across systems, and being able to play with it without having to look at the source code. This creates confidence and feeling of mastery/understanding/control.
To summarize, there is an inherent problem with the answers "this feature might not do anything, and I won't go into details how to tell when it does do something and when it doesn't and why it won't or will, often implying that it is simply against the philosophy to try to do it, even if the intent behind it is reasonable".
It might be okay for Java's GC to behave the way it does, or it might not, but it is difficult to know where to go for a comprehensive overview of what you can and cannot trust the GC to do, so it's too easy to simply distrust the language. The purpose of a language is to give you controlled behavior up to whatever philosophical extent you are capable of tolerating (it's easy for programmers, especially novices, to fall into an existential crisis over certain system/language behaviors), and if you can't tolerate it, you just won't use the language until you have to. Having more things that you can't control, for no known reason, is inherently harmful.
Sometimes (not often!) you do truly know more about past, current and future memory usage than the run time does. This does not happen very often, and I would claim never in a web application while normal pages are being served.
Many years ago I worked on a report generator that:
Had a single thread
Read the “report request” from a queue
Loaded the data needed for the report from the database
Generated the report and emailed it out.
Repeated forever, sleeping when there were no outstanding requests.
It did not reuse any data between reports and did not do any caching.
Firstly, as it was not real time and the users expected to wait for a report, a delay while the GC ran was not an issue, but we needed to produce reports at a rate that was faster than they were requested.
Looking at the above outline of the process, it is clear that:
We knew there would be very few live objects just after a report had been emailed out, as the next request had not started being processed yet.
It is well known that the cost of running a garbage collection cycle depends on the number of live objects; the amount of garbage has little effect on the cost of a GC run.
When the queue is empty, there is nothing better to do than run the GC.
Therefore clearly it was well worth while doing a GC run whenever the request queue was empty; there was no downside to this.
It may be worth doing a GC run after each report is emailed, as we know this is a good time for a GC run. However, if the computer had enough RAM, better results would be obtained by delaying the GC run.
This behaviour was configured on a per-installation basis; for some customers, enabling a forced GC after each report greatly sped up the production of reports. (I expect this was due to low memory on their server and it running lots of other processes, so a well-timed forced GC reduced paging.)
We never detected an installation that did not benefit from a forced GC run every time the work queue was empty.
But, let's be clear, the above is not a common case.
These days I would be more inclined to run each report in a separate process, leaving the operating system to clear up memory rather than the garbage collector, and to have the custom queue manager service use multiple worker processes on large servers.
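The "collect when the queue is empty" idea can be sketched roughly as follows. The class and method names are hypothetical, and requests are modelled as plain strings; the original system's code is not shown in this answer.

```java
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical single-threaded worker step: drain the queue, then use the
// idle moment to suggest a collection while few objects are live.
class ReportWorker {
    /**
     * Handles every request currently in the queue, then hints a GC.
     * Returns the number of requests handled.
     */
    static int drainThenCollect(Queue<String> requests, Consumer<String> generateReport) {
        int handled = 0;
        String request;
        while ((request = requests.poll()) != null) {
            generateReport.accept(request);
            handled++;
        }
        System.gc(); // only a hint; the JVM may ignore it
        return handled;
    }
}
```

A long-running service would call this in a loop and sleep or block between calls when it returns 0.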
GC efficiency relies on a number of heuristics. For instance, a common heuristic is that write accesses to objects usually occur on objects which were created not long ago. Another is that many objects are very short-lived (some objects will be used for a long time, but many will be discarded a few microseconds after their creation).
Calling System.gc() is like kicking the GC. It means: "all those carefully tuned parameters, those smart organizations, all the effort you just put into allocating and managing the objects such that things go smoothly, well, just drop the whole lot, and start from scratch". It may improve performance, but most of the time it just degrades performance.
To use System.gc() reliably(*) you need to know how the GC operates in all its fine details. Such details tend to change quite a bit if you use a JVM from another vendor, or the next version from the same vendor, or the same JVM but with slightly different command-line options. So it is rarely a good idea, unless you want to address a specific issue in which you control all those parameters. Hence the notion of "bad practice": that's not forbidden, the method exists, but it rarely pays off.
(*) I am talking about efficiency here. System.gc() will never break a correct Java program. Nor will it conjure extra memory that the JVM could not have obtained otherwise: before throwing an OutOfMemoryError, the JVM does the job of System.gc(), even if only as a last resort.
Maybe I write crappy code, but I've come to realize that clicking the trash-can icon on eclipse and netbeans IDEs is a 'good practice'.
Some of what I am about to write is simply a summarization of what has already been written in other answers, and some is new.
The question "Why is it bad practice to call System.gc()?" does not compute. It assumes that it is bad practice, while it is not. It greatly depends on what you are trying to accomplish.
The vast majority of programmers out there have no need for System.gc(), and it will never do anything useful to them in the vast majority of use cases. So, for the majority, calling it is bad practice because it will not do whatever it is that they think it will do, it will only add overhead.
However, there are a few rare cases where invoking System.gc() is actually beneficial:
When you are absolutely sure that you have some CPU time to spare now, and you want to improve the throughput of code that will run later. For example, a web server that discovers that there are no pending web requests at the moment can initiate garbage collection now, so as to reduce the chances that garbage collection will be needed during the processing of a barrage of web requests later on. (Of course this can hurt if a web request arrives during collection, but the web server could be smart about it and abandon collection if a request comes in.) Desktop GUIs are another example: on the idle event (or, more broadly, after a period of inactivity,) you can give the JVM a hint that if it has any garbage collection to do, now is better than later.
When you want to detect memory leaks. This is often done in combination with a debug-mode-only finalizer, or with the java.lang.ref.Cleaner class from Java 9 onwards. The idea is that by forcing garbage collection now, and thus discovering memory leaks now as opposed to some random point in time in the future, you can detect the memory leaks as soon as possible after they have happened, and therefore be in a better position to tell precisely which piece of code has leaked memory and why. (Incidentally, this is also one of, or perhaps the only, legitimate use cases for finalizers or the Cleaner. The practice of using finalization for recycling of unmanaged resources is flawed, despite being very widespread and even officially recommended, because it is non-deterministic. For more on this topic, read this: https://blog.michael.gr/2021/01/object-lifetime-awareness.html)
When you are measuring the performance of code, (benchmarking,) in order to reduce/minimize the chances of garbage collection occurring during the benchmark, or in order to guarantee that whatever overhead is suffered due to garbage collection during the benchmark is due to garbage generated by the code under benchmark, and not by unrelated code. A good benchmark always starts with an as thorough as possible garbage collection.
When you are measuring the memory consumption of code, in order to determine how much garbage is generated by a piece of code. The idea is to perform a full garbage collection so as to start in a clean state, run the code under measurement, obtain the heap size, then do another full garbage collection, obtain the heap size again, and take the difference. (Incidentally, the ability to temporarily suppress garbage collection while running the code under measurement would be useful here, alas, the JVM does not support that. This is deplorable.)
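The measurement procedure described in the last item can be sketched roughly as follows. This is only an illustration: the class and method names (GarbageMeter, requestGc, estimateGarbageBytes) are made up for this example, and because System.gc() is only a hint, the result is a noisy estimate rather than an exact figure.

```java
import java.util.ArrayList;
import java.util.List;

class GarbageMeter {
    /** Bytes currently occupied in the heap (live objects plus garbage not yet collected). */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Requests a collection a few times; this is best effort, since System.gc() is only a hint. */
    static void requestGc() {
        for (int i = 0; i < 3; i++) {
            System.gc();
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    /** Rough estimate of the garbage left behind by the task; may be noisy or even negative. */
    static long estimateGarbageBytes(Runnable task) {
        requestGc();                        // start from a (hopefully) clean state
        task.run();                         // run the code under measurement
        long afterRun = usedHeapBytes();    // heap use including the task's garbage
        requestGc();                        // collect that garbage
        long afterGc = usedHeapBytes();
        return afterRun - afterGc;
    }

    public static void main(String[] args) {
        long garbage = estimateGarbageBytes(() -> {
            List<byte[]> scratch = new ArrayList<>();
            for (int i = 0; i < 100; i++) {
                scratch.add(new byte[64 * 1024]);
            }
        });
        System.out.println("Estimated garbage: " + garbage + " bytes");
    }
}
```

If a collection happens to run in the middle of the task, the estimate comes out too low, which is exactly the limitation mentioned above.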
Note that of the above use cases, only one is in a production scenario; the rest are in testing / diagnostics scenarios.
This means that System.gc() can be quite useful under some circumstances, which in turn means that it being "only a hint" is problematic.
(For as long as the JVM is not offering some deterministic and guaranteed means of controlling garbage collection, the JVM is broken in this respect.)
Here is how you can turn System.gc() into a bit less of a hint:
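import java.lang.ref.WeakReference;

private static void runGarbageCollection()
{
    // The sentinel object is only weakly reachable, so a collection that processes it clears the reference.
    for( WeakReference<Object> ref = new WeakReference<>( new Object() ); ; )
    {
        System.gc(); //optional
        Runtime.getRuntime().runFinalization(); //optional
        if( ref.get() == null )
            break; // the sentinel has been collected, so some garbage collection has taken place
        Thread.yield();
    }
}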
This still does not guarantee that you will get a full GC, but it gets a lot closer. Specifically, it will give you some amount of garbage collection even if the -XX:+DisableExplicitGC VM option has been used. (So, it truly uses System.gc() as a hint; it does not rely on it.)
Yes, calling System.gc() doesn't guarantee that it will run, it's a request to the JVM that may be ignored. From the docs:
Calling the gc method suggests that the Java Virtual Machine expend effort toward recycling unused objects
It's almost always a bad idea to call it because the automatic memory management usually knows better than you when to gc. It will do so when its internal pool of free memory is low, or if the OS requests some memory be handed back.
It might be acceptable to call System.gc() if you know that it helps. By that I mean you've thoroughly tested and measured the behaviour of both scenarios on the deployment platform, and you can show it helps. Be aware though that the gc isn't easily predictable - it may help on one run and hurt on another.
First, there is a difference between spec and reality. The spec says that System.gc() is a hint that GC should run and the VM is free to ignore it. The reality is, the VM will never ignore a call to System.gc().
Calling GC comes with a non-trivial overhead, and if you do this at some random point in time it's likely you'll see no reward for your efforts. On the other hand, a naturally triggered collection is very likely to recoup the costs of the call. If you have information that indicates that a GC should be run, then you can make the call to System.gc() and you should see benefits. However, it's my experience that this happens only in a few edge cases, as it's very unlikely that you'll have enough information to understand if and when System.gc() should be called.
One example listed here is hitting the garbage can button in your IDE. If you're off to a meeting, why not hit it? The overhead isn't going to affect you, and the heap might be cleaned up by the time you get back. Do this in a production system, and frequent calls to collect will bring it to a grinding halt! Even occasional calls, such as those made by RMI, can be disruptive to performance.
In my experience, using System.gc() is effectively a platform-specific form of optimization (where "platform" is the combination of hardware architecture, OS, JVM version and possibly more runtime parameters such as RAM available), because its behaviour, while roughly predictable on a specific platform, can (and will) vary considerably between platforms.
Yes, there are situations where System.gc() will improve (perceived) performance. One example is if delays are tolerable in some parts of your app, but not in others (the game example cited above, where you want GC to happen at the start of a level, not during the level).
However, whether it will help or hurt (or do nothing) is highly dependent on the platform (as defined above).
So I think it is valid as a last-resort platform-specific optimization (i.e. if other performance optimizations are not enough). But you should never call it just because you believe it might help (without specific benchmarks), because chances are it will not.
Since objects are dynamically allocated by using the new operator, you might be wondering how such objects are destroyed and their memory released for later reallocation. In some languages, such as C++, dynamically allocated objects must be manually released by use of a delete operator. Java takes a different approach; it handles deallocation for you automatically. The technique that accomplishes this is called garbage collection. It works like this: when no references to an object exist, that object is assumed to be no longer needed, and the memory occupied by the object can be reclaimed. There is no explicit need to destroy objects as in C++.
Garbage collection only occurs sporadically (if at all) during the execution of your program. It will not occur simply because one or more objects exist that are no longer used. Furthermore, different Java run-time implementations will take varying approaches to garbage collection, but for the most part, you should not have to think about it while writing your programs.

Java: flushing memory out to disk

Let's say I have a Java application which does roughly the following:
1. Initialize (takes a long time because this is complicated)
2. Do some stuff quickly
3. Wait idly for a long time (your favorite mechanism here)
4. Go to step 2.
Is there a way to encourage or force the JVM to flush its memory out to disk during long periods of idleness? (e.g. at the end of step 2, make some function call that effectively says "HEY JVM! I'm going to be going to sleep for a while.")
I don't mind using a big chunk of virtual memory, but physical memory is at a premium on the machine I'm using because there are many background processes.
The operating system should handle this, I'd think.
Otherwise, you could manually store your application to disk or database post-initialization, and do a quicker initialization from that data, maybe?
Instead of having your program sit idle and use up resources, why not schedule it with cron? Or better yet, since you're using Java, schedule it with Quartz? Do your best to cache elements of your lengthy initialization procedure so you don't have to pay a big penalty each time the scheduled task runs.
The very first thing you must make sure of, is that your objects are garbage collectable. But that's just the first step.
Secondly, the memory used by the JVM may not be returned to the OS at all.
For instance, let's say you have 100 MB of Java objects; your VM size will be approximately 100 MB. After garbage collection the heap usage may drop to 10 MB, but the VM will stay at around 100 MB. This strategy lets the VM keep memory available for new objects.
To have the application returning "physical" memory to the system you have to check if your VM supports such a thing.
There are additional VM options that may allow your app to return more memory to the OS:
-XX:MaxHeapFreeRatio=70 Maximum percentage of heap free after GC to avoid shrinking.
-XX:MinHeapFreeRatio=40 Minimum percentage of heap free after GC to avoid expansion.
My interpretation is that with those options the VM will shrink the heap when more than 70% of it is free after a GC. But quite frankly, I don't know whether the freed memory is returned to the OS or only released inside the VM.
For a complete description of how HotSpot memory management works, see:
Description of HotSpot GCs: Memory Management in the Java HotSpot Virtual Machine White Paper: https://www.oracle.com/technetwork/java/javase/memorymanagement-whitepaper-150215.pdf
And please, please. Give it a try and measure and let us know back here if that effectively reduces the memory consumption.
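One way to do that measuring from inside the process is to compare the used and committed heap sizes before and after a collection. The sketch below uses the standard MemoryMXBean; the class name HeapShrinkProbe and the 32 MB of ballast are arbitrary choices for illustration. Whether the committed size actually drops depends on the collector and the flags in use, so treat the output as an observation, not a guarantee.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class HeapShrinkProbe {
    /** Snapshot of the heap as reported by the platform MemoryMXBean. */
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    /** Formats used vs. committed heap; "committed" is what the JVM has reserved from the OS. */
    static String describe(String label) {
        MemoryUsage u = heap();
        return String.format("%s: used=%d KB, committed=%d KB",
                label, u.getUsed() / 1024, u.getCommitted() / 1024);
    }

    public static void main(String[] args) {
        // Allocate some ballast so the heap has a reason to grow.
        byte[][] ballast = new byte[32][];
        for (int i = 0; i < ballast.length; i++) {
            ballast[i] = new byte[1024 * 1024];
        }
        System.out.println(describe("after allocation"));

        ballast = null;   // make the ballast collectable
        System.gc();      // only a hint; the JVM may or may not shrink the heap afterwards
        System.out.println(describe("after gc"));
    }
}
```

Running it with and without the -XX:MinHeapFreeRatio / -XX:MaxHeapFreeRatio options, and comparing the "committed" figures with what the OS reports for the process, should show whether the options have any effect on your VM.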
It's a bit of a hack to say the very least, but assuming you are on Win32 and if you are prepared to give up portability - write a small DLL that calls SetProcessWorkingSetSize and call into it using JNI. This allows you to suggest to the OS what the WS size should be. You can even specify -1, in which case the OS will attempt to page out as much as possible.
Assuming this is something like a server that's waiting for a request, could you do this?
Make two classes, Server and Worker.
Server only listens and launches Worker when required.
If Worker has never been initialised, initialise it.
After Worker has finished doing whatever it needed to do, serialize it, write it to disk, and set the Worker object to null.
Wait for a request.
When a request is received, read the serialized Worker object from disk and load it into memory.
Perform Worker tasks, when done, serialize, write out and set Worker object to null.
Rinse and repeat.
This means that the memory-intensive Worker object gets unloaded from memory (when the gc next runs, and you can encourage the gc to run by calling System.gc() after setting the Worker object to null), but since you saved its state, you have the ability to reload it from disk and let it do its work without going through initialization again. If it needs to run every "x" hours, you can put a java.util.Timer in the Server class instead of listening on a socket.
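A minimal sketch of the save/unload/reload part, using plain Java serialization. The WorkerStore class, the toy Worker with its precomputed table, and the temp file are all placeholders invented for this example; a real Worker would hold whatever the expensive initialization produces.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

class WorkerStore {
    /** Hypothetical stand-in for the expensive-to-initialise Worker. */
    static class Worker implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int[] table;

        Worker(int size) {
            table = new int[size];
            for (int i = 0; i < size; i++) {
                table[i] = i * i;   // pretend this is the slow initialisation
            }
        }

        int lookup(int i) {
            return table[i];
        }
    }

    /** Writes the worker's state to disk so the in-memory copy can be dropped. */
    static void save(Worker worker, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(worker);
        }
    }

    /** Reads a previously saved worker back into memory. */
    static Worker load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (Worker) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("worker", ".ser");
        Worker worker = new Worker(1000);
        save(worker, file);
        worker = null;                     // drop the only reference; the GC can now reclaim it

        Worker restored = load(file);      // later, when a request arrives
        System.out.println(restored.lookup(12));
        Files.delete(file);
    }
}
```

Note that deserializing a large object graph is not free either, so this only pays off when reloading is clearly cheaper than re-running the initialization.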
EDIT: There is also a JVM option -Xmx which sets the maximum size of the JVM's heap. This is probably not helpful in this case, but just thought I'd throw it in there.
Isn't this what page files are for? If your JVM is idle for any length of time and doesn't access its memory pages, they will very likely get paged out, and so it won't be using much actual RAM.
One thing you could do though... Most daemon programs have a startup phase (where they parse files and create data structures etc.) and a running phase where they use the objects created at startup. If the JVM is allowed to, it will start on the second phase without doing a garbage collection, potentially causing the size of the process to grow and then stay that big for the lifetime of the process (since GC never, or only infrequently, reduces the actual size of the process).
If you make sure that all memory allocated at each distinct phase of the program's life is GCable before the next phase starts, then you can use the -Xmx setting to force down the maximum size of the process and cause your program to constantly GC between phases. I've done that before with some success.

Finding Memory Usage in Java

Following is the scenario I need to solve. I am stuck choosing between two solutions.
I need to maintain a cache of data fetched from a database, to be shown on a Swing GUI.
Whenever my JVM memory usage exceeds 70% of its allocated memory, I need to warn the user about excessive usage. Once usage exceeds 80%, I have to halt all database querying, clean up the existing cache fetched as part of the user operations, and notify the user. During the cleanup process, I will manually delete some data based on some rules and ask the JVM for a GC. Whenever a GC occurs, if memory is cleaned up and usage falls to 60% of the allocated memory, I need to restart all the database handling and give control back to the user.
To check JVM memory statistics I found the following two solutions. I cannot decide which is the better way, and why.
Runtime.freeMemory() - A thread runs every 10 seconds and checks the free memory; if usage exceeds the limits mentioned, popups inform the user, and methods are called to halt the operations and free up the memory.
MemoryPoolMXBean.getUsage() - Java 5 introduced JMX to get a snapshot of the memory at runtime. With JMX I cannot use threshold notifications, since they only notify when memory reaches or exceeds the given threshold. The only way to use it is to poll MemoryMXBean and check the memory statistics over a period.
With polling, it seems to me that both implementations are going to be the same.
Please suggest the advantages of each method, and whether there are any alternatives or corrections to the methods I am using.
Just a side note: Runtime.freeMemory() doesn't state the amount of memory that's left for allocation; it's just the amount of memory that's free within the currently allocated memory (which is initially smaller than the maximum memory the VM is configured to use, but grows over time).
When starting a VM, the max memory (Runtime.maxMemory()) just defines the upper limit of memory that the VM may allocate (configurable using the -Xmx VM option).
The total memory (Runtime.totalMemory()) is the amount of memory currently allocated for the heap. It starts at the initial size (configurable using the -Xms VM option) and grows dynamically whenever you allocate more than the currently free portion of it (Runtime.freeMemory()), until it reaches the max memory.
The metric you're interested in is the memory available for further allocation:
long usableFreeMemory = Runtime.getRuntime().maxMemory()
    - Runtime.getRuntime().totalMemory()
    + Runtime.getRuntime().freeMemory();
or:
double usedPercent = (double) (Runtime.getRuntime().totalMemory()
    - Runtime.getRuntime().freeMemory()) / Runtime.getRuntime().maxMemory();
The usual way to handle this sort of thing is to use WeakReferences and SoftReferences. You need to use both - the weak reference means you are not holding multiple copies of things, and the soft references mean that the GC will hang onto things until it starts running out of memory.
If you need to do additional cleanup, then you can add references to queues, and override the queue notification methods to trigger the cleanup. It's all good fun, but you do need to understand what these classes do.
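As a rough illustration of the soft-reference-plus-queue idea, here is a small cache sketch. The SoftCache class and its Entry helper are invented for this example, and it only covers the soft-reference half of the advice above: values stay cached until the GC decides memory is tight, and cleared entries are removed by polling the ReferenceQueue on each access.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

class SoftCache<K, V> {
    /** A soft reference that remembers its key, so the map entry can be removed once it is cleared. */
    private static final class Entry<K, V> extends SoftReference<V> {
        final K key;

        Entry(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, Entry<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    synchronized void put(K key, V value) {
        purge();
        map.put(key, new Entry<>(key, value, queue));
    }

    /** Returns the cached value, or null if it was never cached or has been reclaimed by the GC. */
    synchronized V get(K key) {
        purge();
        Entry<K, V> entry = map.get(key);
        return entry == null ? null : entry.get();
    }

    synchronized int size() {
        purge();
        return map.size();
    }

    /** Drops map entries whose values the GC has already cleared; extra cleanup logic could go here. */
    private void purge() {
        Reference<? extends V> ref;
        while ((ref = queue.poll()) != null) {
            @SuppressWarnings("unchecked")
            Entry<K, V> entry = (Entry<K, V>) ref;
            map.remove(entry.key, entry);   // only remove if the key still maps to this entry
        }
    }
}
```

A caller has to treat a null from get() as a cache miss and reload the value, for example from the database.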
It is entirely normal for a JVM to go up to 100% memory usage and then back to, say, 10% after a GC, and to do this every few seconds.
You shouldn't need to try managing the memory in this way.
You cannot say how much memory is being retained until a full GC has been run.
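If you do want a number that reflects retained memory rather than retained memory plus garbage, one option is to ask each heap memory pool for its usage as of its most recent collection. A sketch, assuming the JVM's pools support getCollectionUsage() (it returns null where they don't); the RetainedHeap class name is made up, and since pools are collected at different times the sum is only approximate.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class RetainedHeap {
    /**
     * Sums, over all heap pools, the bytes in use right after each pool's most recent collection.
     * Returns -1 if no heap pool reports post-collection usage.
     */
    static long bytesUsedAfterLastGc() {
        long total = 0;
        boolean anyReported = false;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;   // skip non-heap pools such as the code cache
            }
            MemoryUsage afterGc = pool.getCollectionUsage();   // null if this pool does not support it
            if (afterGc != null) {
                total += afterGc.getUsed();
                anyReported = true;
            }
        }
        return anyReported ? total : -1;
    }

    public static void main(String[] args) {
        System.gc();   // a hint only; makes it more likely that the pools have been collected at least once
        System.out.println("Used after last GC: " + bytesUsedAfterLastGc() + " bytes");
    }
}
```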
I suggest you work out what you are really trying to achieve and look at the problem another way.
The requirements you mention are a clear contradiction with how Garbage Collection works in a JVM.
Because of the behaviour of the JVM, it will be very hard to warn your users in a correct way.
Stopping all database manipulation altogether, cleaning stuff up and starting again really is not the way to go.
Let the JVM do what it is supposed to do: handle everything memory-related for you.
Modern generations of the JVM are very good at it, and with some fine-tuning of the GC parameters you will get much cleaner memory handling than by forcing things yourself.
Articles like http://www.kodewerk.com/advice_on_jvm_heap_tuning_dont_touch_that_dial.htm mention the pros and cons and offer a nice explanation of what the VM does for you
I've only used the first method for similar task and it was OK.
One thing you should note, for both methods, is to implement some kind of debouncing - i.e. once you recognize you've hit 70% of memory, wait for a minute (or any other time you find appropriate) - GC can run at that time and clean up lots of memory.
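One simple way to debounce is to count samples instead of waiting on a clock: only react once the threshold has been exceeded on several consecutive polls. The sketch below does that; the DebouncedThreshold class and its parameters are hypothetical names for this example, and with a 10-second polling interval, 6 required samples would correspond to roughly the one minute mentioned above.

```java
class DebouncedThreshold {
    private final double threshold;
    private final int requiredSamples;
    private int consecutive;   // how many polls in a row have been at or above the threshold

    DebouncedThreshold(double threshold, int requiredSamples) {
        this.threshold = threshold;
        this.requiredSamples = requiredSamples;
    }

    /** Returns true once the threshold has been met on requiredSamples consecutive calls. */
    boolean sample(double usedFraction) {
        consecutive = usedFraction >= threshold ? consecutive + 1 : 0;
        return consecutive >= requiredSamples;
    }

    /** Fraction of the maximum heap currently in use, including garbage not yet collected. */
    static double usedFraction() {
        Runtime rt = Runtime.getRuntime();
        return (double) (rt.totalMemory() - rt.freeMemory()) / rt.maxMemory();
    }

    public static void main(String[] args) throws InterruptedException {
        DebouncedThreshold warn = new DebouncedThreshold(0.70, 6);
        for (int i = 0; i < 3; i++) {            // a real monitor would loop for the life of the app
            if (warn.sample(usedFraction())) {
                System.out.println("Memory has stayed above 70% for a while");
            }
            Thread.sleep(100);                   // shortened here; the question polls every 10 seconds
        }
    }
}
```

A single dip below the threshold (for example right after a GC) resets the counter, so short spikes never trigger the warning.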
If you implement a Runtime.freeMemory() graph in your system you'll see how the memory is constantly going up and down, up and down.
VisualVM is a bit nicer than JConsole because it gives you a nice visual Garbage Collector view.
Look into JConsole. It graphs the information you need, so it is a matter of adapting this to your needs (given that you run on Sun Java 6).
This also allows you to detach the monitoring process from the process you want to look at.
Very late after the original post, I know, but I thought I'd post an example of how I've done it. Hopefully it'll be of some use to someone (I stress, it's a proof-of-principle example, nothing else... not particularly elegant either :) )
Just stick these two functions in a class, and it should work.
EDIT: Oh, and you need these imports:
import java.util.ArrayList;
import java.util.List;

//memory still available for allocation, in MB: unallocated headroom (max - total) plus the free part of the current heap
public static int MEM(){
    Runtime rt = Runtime.getRuntime();
    //divide before casting to int, so that heaps larger than 2 GB do not overflow
    return (int) ((rt.maxMemory() - rt.totalMemory() + rt.freeMemory()) / 1024 / 1024);
}

public static void main(String[] args) throws InterruptedException
{
    List<Double> list = new ArrayList<>();
    //get available memory before filling list
    int initMem = MEM();
    int lowMemWarning = (int) (initMem * 0.2);
    int highMem = (int) (initMem * 0.8);
    int iteration = 0;
    while(true)
    {
        //use up some memory
        list.add(Math.random());
        //report
        if(++iteration % 10000 == 0)
        {
            System.out.printf("Available Memory: %dMb \tListSize: %d\n", MEM(), list.size());
            //if low on memory, clear list and await garbage collection before continuing
            if(MEM() < lowMemWarning)
            {
                System.out.printf("Warning! Low memory (%dMb remaining). Clearing list and cleaning up.\n", MEM());
                //clear list
                list = new ArrayList<>(); //obviously, here is a good place to put your warning logic
                //ensure garbage collection occurs before continuing to re-add to list, to avoid immediately entering this block again
                while(MEM() < highMem)
                {
                    System.out.printf("Awaiting gc...(%dMb remaining)\n", MEM());
                    //give it a nudge
                    Runtime.getRuntime().gc();
                    Thread.sleep(250);
                }
                System.out.printf("gc successful! Continuing to fill list (%dMb remaining). List size: %d\n", MEM(), list.size());
                Thread.sleep(3000); //just to view output
            }
        }
    }
}
EDIT: This approach still relies on sensible setting of memory in the jvm using -Xmx, however.
EDIT2: It seems that the gc request line really does help things along, at least on my jvm. ymmv.
