I'm trying to get the amount of available RAM on a system from a Java application, specifically on Linux, but it needs to be cross-platform. Not the amount of memory available to the JVM, but the physical RAM that is actually available. And not the free RAM either; I mean available.
I tried using the OperatingSystemMXBean, but it only returns free RAM. The problem, of course, is that Linux will consume free RAM as disk cache to speed up the system, reducing the "free" amount to almost zero even though the kernel will drop that cache at any time if more RAM is needed; hence the need for an "available" value.
So after a week or so my app will start complaining that my system is almost out of RAM, and I look at it like "no, the system may only have 100MB RAM free, but it's got 3GB of disk cache it can free up as needed".
Even "used" memory would be more useful than free. Every tutorial I read on getting "used" RAM says to use "total - free": not the same thing. Total - Used != available either, but it's closer than "free" and would give me more accurate tracking.
I feel like I've got to be missing something. "Free" RAM isn't a very useful metric in most cases; whenever someone says they want "free" RAM they almost always mean "available", how much more RAM can be used by applications. I'm pretty sure they're the same thing on Windows, but in *nix the distinction between "free" and "available" is incredibly important and it seems like a major oversight on Oracle/Sun's part.
This code snippet solves the problem:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public static long getAvailableMem()
{
    String osName = System.getProperty("os.name");
    if (osName.equals("Linux"))
    {
        // Parse /proc/meminfo, which exposes the kernel's own "MemAvailable" estimate.
        try (BufferedReader memInfo = new BufferedReader(new FileReader("/proc/meminfo")))
        {
            String line;
            while ((line = memInfo.readLine()) != null)
            {
                if (line.startsWith("MemAvailable: "))
                {
                    // The value is reported in kB, which is close enough; convert to bytes.
                    return Long.parseLong(line.split("[^0-9]+")[1]) * 1024;
                }
            }
        } catch (IOException e)
        {
            e.printStackTrace();
        }
        // Checks for FreeBSD and SunOS, which report available memory differently, could be added here.
    } else
    {
        // Fall back to the (less useful) free physical memory figure on other platforms.
        OperatingSystemMXBean osBean = ManagementFactory.getOperatingSystemMXBean();
        com.sun.management.OperatingSystemMXBean sunOsBean = (com.sun.management.OperatingSystemMXBean) osBean;
        return sunOsBean.getFreePhysicalMemorySize();
    }
    return -1;
}
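A trivial usage sketch, assuming the method above sits in the same class (the output formatting is just illustrative):
public static void main(String[] args)
{
    long bytes = getAvailableMem();
    if (bytes < 0) {
        System.out.println("Available memory could not be determined");
    } else {
        System.out.println("Available: " + (bytes / (1024 * 1024)) + " MB");
    }
}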
Related
I have a Java class that processes all files within a directory (6 GB in total), doing some text processing on each file. When I check the RAM usage, I can see that when I finish with one file and move on to the next, the RAM used for the previous file is not released - bad garbage collection, I guess. Is there a way to programmatically free the finished file and its data?
public void fromDirectory(String path) {
    File folder = new File(path);
    disFile = path + "/dis.txt";
    if (folder.isDirectory()) {
        File[] listOfFiles = folder.listFiles();
        for (int i = 0; i < listOfFiles.length; i++) {
            File file = listOfFiles[i];
            if (file.isFile() && file.getName().contains("log")) {
                System.out.println("The file will be processed is: "
                        + file.getPath());
                forEachFile(file.getPath());
                //Runtime.getRuntime().exec("purge");
                //System.gc();
            } else
                System.out.println("The file " + file.getName()
                        + " doesn't contain log");
        }
    } else {
        System.out.println("The path: " + path + " is not a directory");
    }
}
private void forEachFile(String filePath) {
    File in = new File(filePath);
    File out = new File(disFile);
    try {
        out.createNewFile();
        FileWriter fw = new FileWriter(out.getAbsoluteFile());
        BufferedWriter bw = new BufferedWriter(fw);
        BufferedReader reader = new BufferedReader(new FileReader(in));
        String line = null;
        while ((line = reader.readLine()) != null) {
            if (line.toLowerCase().contains("keyword")) {
                bw.write(line);
                bw.newLine();
                numberOfLines++;
            }
        }
        reader.close();
        bw.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
You can strongly suggest that the VM do a garbage collection by calling System.gc(), though doing so is generally considered a code smell.
I think you are conflating two things here: the JVM's memory allocation and the real memory usage within that allocated space.
The JVM may allocate a lot of memory and not return it to the OS even after the objects that were using it have been garbage collected internally. It may be released after some time, or not released at all.
You could try to reduce the memory footprint of your application, for example by not using toLowerCase, since it creates a new object. Maybe a precompiled regex search would be faster?
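For example, a minimal sketch of that precompiled-pattern idea (KeywordFilter and containsKeyword are illustrative names; the literal "keyword" is taken from your code):
import java.util.regex.Pattern;

public class KeywordFilter {
    // Compiled once and reused; CASE_INSENSITIVE avoids allocating a lowercased
    // copy of every line the way line.toLowerCase() does.
    private static final Pattern KEYWORD = Pattern.compile("keyword", Pattern.CASE_INSENSITIVE);

    static boolean containsKeyword(String line) {
        return KEYWORD.matcher(line).find();
    }
}
In forEachFile you would then call KeywordFilter.containsKeyword(line) instead of line.toLowerCase().contains("keyword").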
Using System.gc() the way you did is, in your case and in my opinion, acceptable. Whether it actually helps, I don't know.
As long as you have plenty of memory available and Java doesn't slow down because it can't allocate more, I would leave it as it is. The code looks fine.
Even if you checked the memory with a profiler and "correctly" deduced that the file remains in memory, why do you think it should be released immediately?
The JVM will garbage collect when memory is running out (depending on the JVM configuration), not when developers think it should.
Also, judging from your question, I doubt you used a profiler or a similar tool to gauge JVM memory usage; it's more likely you checked the memory used by the JVM process as a whole.
Also you shouldn't worry about these things unless you are encountering out of memory errors.
As stated, the garbage collector runs when there is no more memory available. If you have 10 files of 100MB each, and you set your heap to 4GB, then chances are that you simply won't ever get any GC.
Now, for the "free the finished file and its data" part, you cannot really do this by yourself, and should not try to do so.
If you want your application to be memory-efficient, then you can just set the maximum heap size to a small value.
On the other hand, if you want your application to be really fast, then you don't want to suffer from any GC, therefore eliminating every System.gc() call and giving your heap as much memory as possible.
Trying to free memory yourself means you have given your heap too much memory (your app is not memory-efficient) and are triggering GC yourself (so it is not time-efficient either).
Note that in some cases, the JVM can give back memory to the OS. For instance, with G1, it will, but with CMS, it won't. See this article for more details.
Finally, if you use Java 7, you should wrap your readers and writers in a try-with-resources statement, or at least wrap the .close() calls in a finally block.
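For illustration, a rough sketch of what the question's forEachFile could look like with try-with-resources (opening the FileWriter in append mode is my assumption here, since the current code re-creates dis.txt on every call):
private void forEachFile(String filePath) {
    File in = new File(filePath);
    File out = new File(disFile);
    // Both the reader and the writer are closed automatically, even if readLine/write throws.
    try (BufferedReader reader = new BufferedReader(new FileReader(in));
         BufferedWriter bw = new BufferedWriter(new FileWriter(out, true))) { // true = append (assumption)
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.toLowerCase().contains("keyword")) {
                bw.write(line);
                bw.newLine();
                numberOfLines++;
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}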
Hope that helps!
I use a while loop to fetch messages from Amazon SQS. Partial code is as follows:
ReceiveMessageRequest receiveMessageRequest = new ReceiveMessageRequest(myQueueUrl);
while (true) {
    List<Message> messages = sqs.receiveMessage(receiveMessageRequest).getMessages();
    if (messages.size() > 0) {
        MemcachedClient c = new MemcachedClient(new BinaryConnectionFactory(), AddrUtil.getAddresses(memAddress));
        for (Message message : messages) {
            // get message from aws sqs
            String messageid = message.getBody();
            String messageRecieptHandle = message.getReceiptHandle();
            sqs.deleteMessage(new DeleteMessageRequest(myQueueUrl, messageRecieptHandle));
            // get details info from memcache
            String result = null;
            String key = null;
            key = "message-" + messageid;
            result = c.get(key);
        }
        c.shutdown();
    }
}
Will this cause a memory leak?
I checked using "ps aux". What I found is that the RSS (resident set size, the non-swapped physical memory a task uses) is growing slowly.
You can't evaluate whether your Java application has a memory leak simply based on the RSS of the process. Most JVMs are pretty greedy; they would rather take more memory from the OS than spend a lot of effort on garbage collection.
That said, your while loop doesn't seem to have any obvious memory "leaks" either, but that depends on what some of the method calls do (which isn't shown above). If you are storing things in static variables, that can be a cause for concern, but if the only references are within the scope of the loop you're probably fine.
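For example, a hypothetical pattern (not taken from your code) that would turn the loop into a genuine leak:
import java.util.ArrayList;
import java.util.List;

public class LeakyConsumer {
    // A static collection that only ever grows keeps every element reachable
    // for the lifetime of the JVM - the classic "leak" mentioned above.
    private static final List<String> seen = new ArrayList<String>();

    static void handle(String messageBody) {
        seen.add(messageBody); // never removed, so memory use grows without bound
    }
}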
The simplest way to know if you have a memory leak in a certain area of code is to rigorously exercise that code within a single run of your application (potentially set with a relatively low maximum heap size). If you get an OutOfMemoryError, you probably have a memory leak.
Sorry, but I don't see any code here to remove the message from the message queue. Did you clean the message list? If DeleteRequest removes the message from the queue, then you are trying to modify the message list that you are iterating over.
Also, you can get better memory usage statistics with the VisualVM tool, which is now part of the JDK.
So I'm using Java to do multi-way external merge sorts of large on-disk files of line-delimited tuples. Batches of tuples are read into a TreeSet, which are then dumped into on-disk sorted batches. Once all of the data have been exhausted, these batches are then merge-sorted to the output.
Currently I'm using magic numbers for figuring out how many tuples we can fit into memory. This is based on a static figure indicating roughly how many tuples fit per MB of heap space, and on how much heap space is available, computed using:
long max = Runtime.getRuntime().maxMemory();    // maximum heap the JVM may grow to
long used = Runtime.getRuntime().totalMemory(); // heap currently committed
long free = Runtime.getRuntime().freeMemory();  // unused portion of the committed heap
long space = free + (max - used);               // estimate of heap still available
However, this does not always work so well, since we may be sorting tuples of different lengths (for which the static tuples-per-MB figure might be too conservative), and I now want to use flyweight patterns to jam more in there, which may make the figure even more variable.
So I'm looking for a better way to fill the heap-space to the brim. Ideally the solution should be:
reliable (no risk of heap-space exceptions)
flexible (not based on static numbers)
efficient (e.g., not polling runtime memory estimates after every tuple)
Any ideas?
Filling the heap to the brim might be a bad idea due to garbage collector thrashing. (As the heap gets nearly full, the efficiency of garbage collection approaches zero, because the effort of a collection depends on heap size, while the amount of memory freed depends only on the size of the objects found to be unreachable.)
However, if you must, can't you simply do it as follows?
for (;;) {
    long freeSpace = getFreeSpace();
    if (freeSpace < 1000000) break;
    while (freeSpace > 0) {
        treeSet.add(readRecord());
        freeSpace -= MAX_RECORD_SIZE;
    }
}
The calls to discover the free memory will be rare, so shouldn't tax performance much. For instance, if you have 1 GB heap space, and leave 1MB empty, and MAX_RECORD_SIZE is ten times average record size, getFreeSpace() will be invoked a mere log(1000) / -log(0.9) ~= 66 times.
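getFreeSpace() above is left to the reader; a minimal sketch, reusing the Runtime arithmetic from the question:
static long getFreeSpace() {
    Runtime rt = Runtime.getRuntime();
    // unused portion of the committed heap, plus heap the JVM is still allowed to commit
    return rt.freeMemory() + (rt.maxMemory() - rt.totalMemory());
}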
Why bother calculating how many items you can hold? How about letting Java tell you when you've used up all your memory, catching the error, and continuing? For example:
// prepare output medium now so we don't need to worry about having enough
// memory once the treeset has been filled.
BufferedWriter writer = new BufferedWriter(new FileWriter("output"));
Set<Object> set = new TreeSet<Object>();
int linesRead = 0;
{
    BufferedReader reader = new BufferedReader(new FileReader("input"));
    try {
        String line = reader.readLine();
        while (line != null) {
            set.add(parseTuple(line));
            linesRead += 1;
            line = reader.readLine();
        }
        // end of file reached
        linesRead = -1;
    } catch (OutOfMemoryError e) {
        // while loop broken; the set holds as many tuples as fit in memory
    } finally {
        reader.close();
    }
    // since reader and line were declared in a block their resources will
    // now be released
}
// output treeset to file
for (Object o : set) {
    writer.write(o.toString());
}
writer.close();
// use linesRead to find position in file for next pass
// or continue on to next file, depending on value of linesRead
If you still have trouble with memory, just make the reader's buffer extra large so as to reserve more memory.
The default size for the buffer in a BufferedReader is 4096 bytes. So when finishing reading you will release upwards of 4k of memory. After this your additional memory needs will be minimal. You need enough memory to create an iterator for the set, let's be generous and assume 200 bytes. You will also need memory to store the string output of your tuples (but only temporarily). You say the tuples contain about 200 characters. Let's double that to take account for separators -- 400 characters, which is 800 bytes. So all you really need is an additional 1k bytes. So you're fine as you've just released 4k bytes.
The reason you don't need to worry about the memory used to store the string output of your tuples is because they are short lived and only referred to within the output for loop. Note that the Writer will copy the contents into its buffer and then discard the string. Thus, the next time the garbage collector runs the memory can be reclaimed.
I've checked, and an OOME in add will not leave a TreeSet in an inconsistent state; the memory allocation for a new Entry (the internal implementation for storing a key/value pair) happens before the internal representation is modified.
You can really fill the heap to the brim using direct memory writing (it does exist in Java!). It's in sun.misc.Unsafe, but isn't really recommended for use. See here for more details. I'd probably advise writing some JNI code instead, and using existing C++ algorithms.
I'll add this as an idea I was playing around with, involving using a SoftReference as a "sniffer" for low memory.
SoftReference<byte[]> sniffer = new SoftReference<byte[]>(new byte[8192]);
while (iter.hasNext()) {
    tuple = iter.next();
    treeset.add(tuple);
    if (sniffer.get() == null) {
        dump(treeset);
        treeset.clear();
        sniffer = new SoftReference<byte[]>(new byte[8192]);
    }
}
This might work well in theory, but I don't know the exact behaviour of SoftReference.
All soft references to softly-reachable objects are guaranteed to have been cleared before the virtual machine throws an OutOfMemoryError. Otherwise no constraints are placed upon the time at which a soft reference will be cleared or the order in which a set of such references to different objects will be cleared. Virtual machine implementations are, however, encouraged to bias against clearing recently-created or recently-used soft references.
I'd like to hear feedback, as it seems to me like an elegant solution, although behaviour might vary between VMs.
Testing on my laptop, I found that the soft reference is cleared infrequently, but is sometimes cleared too early, so I'm thinking of combining it with meriton's answer:
SoftReference<byte[]> sniffer = new SoftReference<byte[]>(new byte[8192]);
while (iter.hasNext()) {
    tuple = iter.next();
    treeset.add(tuple);
    if (sniffer.get() == null) {
        free = MemoryManager.estimateFreeSpace();
        if (free < MIN_SAFE_MEMORY) {
            dump(treeset);
            treeset.clear();
            sniffer = new SoftReference<byte[]>(new byte[8192]);
        }
    }
}
Again, thoughts welcome!
Sorry I can't post code, but I have a BufferedReader with 50,000,000 bytes set as the buffer size. It works as you would expect for half an hour: the HDD light flashes every two minutes or so as it reads in the big chunk of data, then goes quiet again while the CPU processes it. But after about half an hour (this is a very big file), the HDD starts thrashing as if it were reading one byte at a time. It is still in the same loop, and I think I checked free RAM to rule out swapping (heap size is the default).
Probably won't get any helpful answers, but worth a try.
OK, I have changed the heap size to 768 MB and still nothing. There is plenty of free memory, and java.exe is only using about 300 MB.
Now I have profiled it, and the heap stays at about 200 MB, well below what is available. CPU stays at 50%, yet the HDD starts thrashing like crazy. I have no idea. I am going to rewrite the whole thing in C#; that is my solution.
Here is the code (it is just a throw-away script, not pretty):
// a through g below are presumably precompiled java.util.regex.Pattern objects (not shown in this snippet)
BufferedReader s = null;
HashMap<String, Integer> allWords = new HashMap<String, Integer>();
HashSet<String> pageWords = new HashSet<String>();
long[] pageCount = new long[78592];
long pages = 0;
Scanner wordFile = new Scanner(new BufferedReader(new FileReader("allWords.txt")));
while (wordFile.hasNext()) {
    allWords.put(wordFile.next(), Integer.parseInt(wordFile.next()));
}
s = new BufferedReader(new FileReader("wikipedia/enwiki-latest-pages-articles.xml"), 50000000);
StringBuilder words = new StringBuilder();
String nextLine = null;
while ((nextLine = s.readLine()) != null) {
    if (a.matcher(nextLine).matches()) {
        continue;
    }
    else if (b.matcher(nextLine).matches()) {
        continue;
    }
    else if (c.matcher(nextLine).matches()) {
        continue;
    }
    else if (d.matcher(nextLine).matches()) {
        nextLine = s.readLine();
        if (e.matcher(nextLine).matches()) {
            if (f.matcher(s.readLine()).matches()) {
                pageWords.addAll(Arrays.asList(words.toString().toLowerCase().split("[^a-zA-Z]")));
                words.setLength(0);
                pages++;
                for (String word : pageWords) {
                    if (allWords.containsKey(word)) {
                        pageCount[allWords.get(word)]++;
                    }
                    else if (!word.isEmpty() && allWords.containsKey(word.substring(0, word.length() - 1))) {
                        pageCount[allWords.get(word.substring(0, word.length() - 1))]++;
                    }
                }
                pageWords.clear();
            }
        }
    }
    else if (g.matcher(nextLine).matches()) {
        continue;
    }
    words.append(nextLine);
    words.append(" ");
}
Have you tried removing the buffer size and trying it out with the defaults?
It may not be that the file buffering isn't working, but that your program is using enough memory that the virtual memory system is paging to disk. What happens if you try a smaller buffer size? What about a larger one?
I'd bet that you are running out of heap space and getting stuck doing back-to-back GCs. Have you profiled the app to see what is going on during that time? Also, try running with -verbose:gc to see garbage collection as it happens. You could also try starting with a larger heap, like:
-Xms1000m -Xmx1000m
That will give you 1 GB of heap, so if you do use it all up, it should happen much later than it currently does.
It appears to me that if the file you are reading is very large, then the following lines could result in a large portion of the file being copied to memory via a StringBuilder. If the process' memory footprint becomes too large, you will likely swap and/or throw your garbage collector into a spin.
...
words.append(nextLine);
words.append(" ");
Hopefully this may help: http://www.velocityreviews.com/forums/t131734-bufferedreader-and-buffer-size.html
Before you assume there is something wrong with Java and file IO, I suggest you write a simple program that just reads the file as fast as it can. With default buffering you should be able to read the file at 20 MB/s or more, regardless of file size. You can do this by stripping your application down to just the read loop, and then prove to yourself how long reading the file actually takes.
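A minimal sketch of that kind of stripped-down read test (the class name is made up, and since readLine drops line terminators the throughput figure is only approximate):
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ReadSpeedTest {
    public static void main(String[] args) throws IOException {
        long start = System.currentTimeMillis();
        long chars = 0;
        try (BufferedReader r = new BufferedReader(new FileReader(args[0]))) {
            String line;
            while ((line = r.readLine()) != null) {
                chars += line.length() + 1; // +1 roughly accounts for the stripped newline
            }
        }
        long ms = System.currentTimeMillis() - start;
        System.out.println("Read roughly " + (chars / (1024 * 1024)) + " M chars in " + ms + " ms");
    }
}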
You have used quite a lot of expensive operations. Perhaps you should look at how to make your parser more efficient using a profiler, e.g.
word.substring(0, word.length() - 1)
is the same as
word
so the first if clause and the second are the same.
I work on a legacy system that has a VB6 app that needs to call Java code. The solution we use is to have the VB app call a C++ dll that uses JNI to call the Java code. A bit funky, but it's actually worked pretty well. However, I'm moving to a new dev box, and I've just run into a serious problem with this. The built VB app works fine on the new box, but when I try to run it from VB, the dll fails to load the VM, getting a return code of -4 (JNI_ENOMEM) from JNI_CreateJavaVM.
Both the built app and VB are calling the exact same dll, and I've tried it with both Java 1.5 and 1.6. I've tried the suggestions here (redirecting stdout and stderr to files, adding a vfprintf option, adding an -Xcheck:jni option), but to no avail. I can't seem to get any additional information out of the JVM. As far as I can tell, the new box is configured pretty much the same as the old one (installed software, Path, Classpath, etc.), and both are running the same release of Windows Server 2003. The new machine is an x64 box with more memory (4 GB rather than 2 GB), but it's running 32-bit Windows.
Any suggestions or ideas about what else to look into? Rewriting the whole thing in a more sane way is not an option -- I need to find a way to have the dll get the jvm to load without thinking that it's out of memory. Any help would be much appreciated.
OK, I've figured it out. As kschneid points out, the JVM needs a pretty large contiguous chunk of memory inside the application's memory space. So I used the Sysinternals VMMap utility to see what VB's memory looked like. There was, in fact, no large chunk of memory available, and there were some libraries belonging to Visio that were loaded in locations that seemed designed to fragment memory. It turns out that when I installed Visio on the new machine, it automatically installed the Visio UML add-in into VB. Since I don't use this add-in, I disabled it. With the add-in disabled, there was a large contiguous chunk of free memory available, and now the JVM loads just fine.
FYI - I found the following extremely useful article: https://forums.oracle.com/forums/thread.jspa?messageID=6463655
I'm going to repeat some insanely useful code here because I'm not sure that I trust Oracle to keep the above forum around.
When I set up my JVM, I use a call to getMaxHeapAvailable(), then set my heap space accordingly (-Xmx<n>m) - works great for workstations with less RAM available, without having to penalize users with large amounts of RAM.
// Needed for VirtualAlloc/GetSystemInfo; NUM_BYTES_PER_MB is assumed to be defined as below.
#include <windows.h>
#define NUM_BYTES_PER_MB (1024 * 1024)

bool canAllocate(DWORD bytes)
{
    LPVOID lpvBase;
    lpvBase = VirtualAlloc(NULL, bytes, MEM_RESERVE, PAGE_READWRITE);
    if (lpvBase == NULL) return false;
    VirtualFree(lpvBase, 0, MEM_RELEASE);
    return true;
}

int getMaxHeapAvailable(int permGenMB, int maxHeapMB)
{
    DWORD originalMaxHeapBytes = 0;
    DWORD maxHeapBytes = 0;
    int numMemChunks = 0;
    SYSTEM_INFO sSysInfo;
    DWORD maxPermBytes = permGenMB * NUM_BYTES_PER_MB; // Perm space is in addition to the heap size
    DWORD numBytesNeeded = 0;

    GetSystemInfo(&sSysInfo);

    // jvm aligns as follows:
    // quoted from size_t GenCollectorPolicy::compute_max_alignment() of jdk 7 hotspot code:
    //   The card marking array and the offset arrays for old generations are
    //   committed in os pages as well. Make sure they are entirely full (to
    //   avoid partial page problems), e.g. if 512 bytes heap corresponds to 1
    //   byte entry and the os page size is 4096, the maximum heap size should
    //   be 512*4096 = 2MB aligned.

    // card_size computation from CardTableModRefBS::SomePublicConstants of jdk 7 hotspot code
    int card_shift = 9;
    int card_size = 1 << card_shift;

    DWORD alignmentBytes = sSysInfo.dwPageSize * card_size;

    maxHeapBytes = maxHeapMB * NUM_BYTES_PER_MB;

    // make it fit in the alignment structure
    maxHeapBytes = maxHeapBytes + (maxHeapBytes % alignmentBytes);
    numMemChunks = maxHeapBytes / alignmentBytes;
    originalMaxHeapBytes = maxHeapBytes;

    // loop and decrement requested amount by one chunk
    // until the available amount is found
    numBytesNeeded = maxHeapBytes + maxPermBytes;
    while (!canAllocate(numBytesNeeded + 50 * NUM_BYTES_PER_MB) && numMemChunks > 0) // 50 MB is an overhead fudge factor per https://forums.oracle.com/forums/thread.jspa?messageID=6463655 (they had 28; bumping it 'just in case')
    {
        numMemChunks--;
        maxHeapBytes = numMemChunks * alignmentBytes;
        numBytesNeeded = maxHeapBytes + maxPermBytes;
    }

    if (numMemChunks == 0) return 0;

    // we can allocate the requested size, return it now
    if (maxHeapBytes == originalMaxHeapBytes) return maxHeapMB;

    // calculate the new MaxHeapSize in megabytes
    return maxHeapBytes / NUM_BYTES_PER_MB;
}
I had the same problem described by "the klaus" and read "http://support.microsoft.com/kb/126962". I changed the registry as described in the mentioned article. I exaggerated my change to: "%SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows SharedSection=3072,3072,3072 Windows=On SubSystemType=Windows ServerDll=basesrv,1 ServerDll=winsrv:UserServerDllInitialization,3 ServerDll=winsrv:ConServerDllInitialization,2 ProfileControl=Off MaxRequestThreads=16"
The field to look at is "SharedSection=3072,3072,3072". It solved my problem, but there may be side effects because of this change.