I have a file of size 5 GB, and I would like to memory-map it in Java. I understand that a single memory-mapped region cannot be larger than 2 GB.
My question is: is it possible to create 5 × 1 GB memory-mapped regions covering the complete 5 GB file, and access them all in the same Java application?
No, it's not possible.
There are two issues here:
First of all, a 32-bit machine (or a 32-bit OS on a 64-bit machine) only has an address space of 4 GB (2^32 bytes), so you can't map a 5 GB file all at once, even from C.
The other issue is the limitation of Java's implementation of memory mapping, which is handled via a MappedByteBuffer. Even though FileChannel.map() takes longs for offset and size, it returns a MappedByteBuffer, which can only use ints for its limit and position. This means that even on a 64-bit machine and OS, where you could map the whole 5 GB file as a single region from C, in Java you have to manually create a series of mapped regions, each no larger than 2 GB. Still, you will at least be able to map the 5 GB in chunks, while on a 32-bit OS you can't have them all mapped at the same time. And given that in Java unmapping a file region requires some ugly tricks, it's not convenient (though possible) to map and unmap regions as needed in order to stay within the limit. You can have a look at the source code of Lucene or Cassandra; as far as I remember, they also use libraries with native code when possible, in order to handle mapping and unmapping more efficiently than pure Java allows.
To make things even more complicated, 2 GB is the theoretical limit, which may not be reachable on a 32-bit OS due to memory fragmentation. Some OSes may also be configured with a 3/1 memory split, which leaves just 1 GB of address space available to user-space programs, with the rest going to the OS. So, in practice, the chunks you try to map should be much smaller than 2 GB; you are more likely to succeed in mapping 4-6 chunks of 250 MB each than a single 2 GB chunk.
Please see MappedByteBuffer and FileChannel.map() javadocs.
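For illustration, here is a minimal sketch of mapping a file as a series of read-only regions (the chunk size here is tiny just to exercise the logic; in practice you would use something well under 2 GB, per the answer above — the class and method names are illustrative):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

public class ChunkedMap {
    // Map a file as a series of regions, each no larger than chunkSize.
    static List<MappedByteBuffer> mapInChunks(Path file, long chunkSize) throws IOException {
        List<MappedByteBuffer> chunks = new ArrayList<>();
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = fc.size();
            for (long offset = 0; offset < size; offset += chunkSize) {
                long len = Math.min(chunkSize, size - offset);
                chunks.add(fc.map(FileChannel.MapMode.READ_ONLY, offset, len));
            }
        } // mappings remain valid after the channel is closed
        return chunks;
    }

    // Read the byte at an absolute file position through the right chunk.
    static byte byteAt(List<MappedByteBuffer> chunks, long chunkSize, long pos) {
        return chunks.get((int) (pos / chunkSize)).get((int) (pos % chunkSize));
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("chunked", ".bin");
        Files.write(tmp, new byte[] {10, 20, 30, 40, 50});
        List<MappedByteBuffer> chunks = mapInChunks(tmp, 2); // tiny chunks for the demo
        System.out.println(chunks.size());        // 3 regions cover 5 bytes
        System.out.println(byteAt(chunks, 2, 4)); // 50
        Files.delete(tmp);
    }
}
```

The translation from absolute file position to (chunk index, offset within chunk) is the only bookkeeping the caller has to do.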
I'm not an expert in Java NIO, so I'm not sure whether the byte buffer handles chunks automatically or whether you have to use multiple MappedByteBuffers. Feel free to code a simple class to test and play around with your huge file.
Related
I have just encountered an error in my open-source library code, which allocates a large buffer for making modifications to a large FLAC file. The error only occurs on an old PC with 3 GB of memory, running 32-bit Java 1.8.0_74 (25.74-b02).
Originally I just allocated a buffer:
ByteBuffer audioData = ByteBuffer.allocateDirect((int)(fc.size() - fc.position()));
But for some time I have had it as:
MappedByteBuffer mappedFile = fc.map(MapMode.READ_WRITE, 0, totalTargetSize);
My (mis)understanding was that mapped buffers use less memory than direct buffers, because the whole mapped buffer doesn't have to be resident in memory at the same time, only the part currently being used. But this answer says that using mapped byte buffers is a bad idea, so I'm not quite clear on how it works:
Java Large File Upload throws java.io.IOException: Map failed
The full code can be seen at here
Although a mapped buffer may use less physical memory at any one point in time, it still requires an available (logical) address space equal to the total (logical) size of the buffer. To make things worse, it may (and probably does) require that address space to be contiguous. For whatever reason, that old computer appears unable to provide sufficient additional logical address space. Two likely explanations are (1) a limited logical address space plus hefty buffer memory requirements, and (2) some internal limitation the OS imposes on the amount of memory that can be mapped as a file for I/O.
Regarding the first possibility, consider the fact that in a virtual memory system every process executes in its own logical address space (and so has access to the full 2^32 bytes worth of addressing). So if, at the point in time at which you try to instantiate the MappedByteBuffer, the current size of the JVM process plus the total (logical) size of the MappedByteBuffer is greater than 2^32 bytes (~4 gigabytes), then you would run into an OutOfMemoryError (or whatever error/exception that class chooses to throw in its stead, e.g. IOException: Map failed).
Regarding the second possibility, probably the easiest way to evaluate this is to profile your program / the JVM as you attempt to instantiate the MappedByteBuffer. If the JVM process' allocated memory + the required totalTargetSize are well below the 2^32 byte ceiling, but you still get a "map failed" error, then it is likely that some internal OS limit on the size of memory-mapped files is the root cause.
So what does this mean as far as possible solutions go?
Just don't use that old PC. (preferable, but probably not feasible)
Make sure everything else in your JVM has as low a memory footprint as possible for the lifespan of the MappedByteBuffer. (plausible, but maybe irrelevant and definitely impractical)
Break that file up into smaller chunks, then operate on only one chunk at a time. (might depend on the nature of the file)
Use a different / smaller buffer. ...and just put up with the decreased performance. (this is the most realistic solution, even if it's the most frustrating)
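As a rough sketch of the last two options, here is how a file could be processed with a small reusable heap buffer instead of one huge mapping (the method name and buffer size are illustrative; the operation shown, summing bytes, just stands in for whatever per-chunk work the real program does):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChunkedReader {
    // Process an arbitrarily large file through a small, reused buffer,
    // so the address-space footprint stays at bufferSize no matter how big the file is.
    static long sumBytes(Path file, int bufferSize) throws IOException {
        long sum = 0;
        ByteBuffer buf = ByteBuffer.allocate(bufferSize); // small heap buffer, reused
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            while (fc.read(buf) != -1) {
                buf.flip();
                while (buf.hasRemaining()) sum += buf.get() & 0xFF;
                buf.clear();
            }
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("chunks", ".bin");
        Files.write(tmp, new byte[] {1, 2, 3, 4, 5});
        System.out.println(sumBytes(tmp, 2)); // 15
        Files.delete(tmp);
    }
}
```

This trades the mapped buffer's performance for a bounded, predictable memory footprint.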
Also, what exactly is the totalTargetSize for your problem case?
EDIT:
After doing some digging, it seems clear that the IOException is due to running out of address space in a 32-bit environment. This can happen even when the file itself is under 2^32 bytes either due to the lack of sufficient contiguous address space, or due to other sufficiently large address space requirements in the JVM at the same time combined with the large MappedByteBuffer request (see comments). To be clear, an IOE can still be thrown rather than an OOM even if the original cause is ENOMEM. Moreover, there appear to be issues with older [insert Microsoft OS here] 32-bit environments in particular (example, example).
So it looks like you have three main choices.
Use "the 64-bit JRE or...another operating system" altogether.
Use a smaller buffer of a different type and operate on the file in chunks. (and take the performance hit due to not using a mapped buffer)
Continue to use the MappedByteBuffer for performance reasons, but also operate on the file in smaller chunks in order to work around the address-space limitations.
The reason I put using MappedByteBuffer in smaller chunks third is the well-established and unresolved problems with unmapping a MappedByteBuffer (example), which is something you would necessarily have to do between processing each chunk in order to avoid hitting the 32-bit ceiling due to the combined address-space footprint of accumulated mappings. (NOTE: this only applies if it is the 32-bit address-space ceiling, and not some internal OS restriction, that is the problem; if the latter, then ignore this paragraph.) You could attempt this strategy (delete all references, then run the GC), but you would essentially be at the mercy of how the GC and your underlying OS interact regarding memory-mapped files. And other potential workarounds that attempt to manipulate the underlying memory-mapped file more-or-less directly (example) are exceedingly dangerous and specifically condemned by Oracle (see last paragraph). Finally, considering that GC behavior is unreliable anyway, and moreover that the official documentation explicitly states that "many of the details of memory-mapped files [are] unspecified", I would not recommend using MappedByteBuffer like this regardless of any workaround you may read about.
So unless you're willing to take the risk, I'd suggest either following Oracle's explicit advice (point 1), or processing the file as a sequence of smaller chunks using a different buffer type (point 2).
When you allocate a buffer, you basically get a chunk of virtual memory from your operating system (and this virtual memory is finite; the theoretical upper bound is your RAM plus whatever swap is configured, minus whatever was grabbed first by other programs and the OS).
A memory map just adds the space occupied by your on-disk file to your virtual memory (OK, there is some overhead, but not that much), so you can get more of it.
Neither of those has to be resident in RAM constantly; parts of either can be swapped out to disk at any given time.
I have a serialized object on disk of a Patricia trie (https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/trie/PatriciaTrie.html). On disk, it occupies roughly 7.4 GB. I am using a 64 GB RAM server. When deserialized, the memory consumption of the corresponding process goes up to 40 GB. Is this sensible? The highest-voted answer at Serialized object size vs in memory object size in Java says that "the size in memory will be usually between half and double the serializable size!" I was expecting the in-memory size not to go beyond 15 GB, but 40 GB is too much, as other processes need to be loaded as well.
I thought of using http://docs.oracle.com/javase/7/docs/api/java/lang/instrument/Instrumentation.html for measuring size in memory, but Calculate size of Object in Java says that it "can be used to get the implementation specific approximation of object size." So it would again be only an approximate measure.
Is there something I am missing here? I am closing the file and the buffered reader as well. What could be hogging all the memory? I can't share the code for corporate-policy reasons; any help or pointers would be highly appreciated. Thanks
Serialized size on disk has little to do with the size of the data in memory. Every object in Java has some memory overhead (which can vary depending on the JVM mode and version). A single array of bytes would serialize and deserialize to about the same size in memory. However, an array of a billion 8-byte arrays would not.
If you create a heap dump of the data after deserializing the data you should be able to determine exactly where the memory is going.
How to collect heap dumps of any java process
What is the memory consumption of an object in Java?
Trick behind JVM's compressed Oops
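The per-object overhead described above can be observed directly. The following is only a rough illustration, not a precise measurement: System.gc() is advisory and the exact numbers are JVM-specific. It compares the heap cost of one large byte[] against the same payload split into a million tiny arrays, each of which pays for its own object header:

```java
public class OverheadDemo {
    // Approximate bytes currently in use on the heap (after requesting a GC).
    static long usedMemory() {
        System.gc();
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Returns {cost of one big array, cost of n small arrays with the same payload}.
    static long[] compare(int n) {
        long before = usedMemory();
        byte[] single = new byte[8 * n];        // all payload in a single object
        long singleCost = usedMemory() - before;

        before = usedMemory();
        byte[][] many = new byte[n][];
        for (int i = 0; i < n; i++) many[i] = new byte[8]; // same payload, n headers
        long manyCost = usedMemory() - before;

        // Touch both so neither can be collected mid-measurement.
        if (single[0] != 0 || many[0][0] != 0) System.out.print("");
        return new long[] {singleCost, manyCost};
    }

    public static void main(String[] args) {
        long[] r = compare(1_000_000);
        System.out.println("one big array:   ~" + r[0] / 1024 + " KB");
        System.out.println("1M tiny arrays:  ~" + r[1] / 1024 + " KB");
    }
}
```

On a typical HotSpot JVM the tiny-array version costs roughly three times as much, which is exactly the kind of expansion a pointer-heavy structure like a trie exhibits after deserialization.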
I am running a memory intensive application. Some info about the environment:
64 bit debian
13 GB of RAM
64 bit JVM (I output System.getProperty("sun.arch.data.model") when my program runs, and it says "64")
Here is the exact command I am issuing:
java -Xmx9000m -jar "ale.jar" testconfig
I have run the program with same exact data, config, etc. on several other systems, and I know that the JVM uses (at its peak) 6 GB of memory on those systems. However, I am getting an OutOfMemory error. Furthermore, during the execution of the program, the system never drops below 8.5 GB of free memory.
When I output Runtime.getRuntime().maxMemory() during execution, I get the value 3044540416, i.e. ~ 3 GB.
I don't know whether it is relevant, but this is a Google Compute Engine instance.
The only explanation I can think of is that there may be some sort of system restriction on the maximum amount of memory that a single process may use.
-Xmx only sets the maximum heap size. Use -Xms to specify the minimum heap size. Setting them to the same value makes the heap's memory footprint fixed.
The only explanation I can think of is that there may be some sort of system restriction on the maximum amount of memory that a single process may use.
That is one possible explanation.
Another one is that you are attempting to allocate a really large array. The largest possible arrays have 2^31 - 1 elements, but the actual size in bytes depends on the element size:
byte[] or boolean[] ... ~2 GB
char[] or short[] ... ~4 GB
int[] ... ~8 GB
long[] or Object[] ... ~16 GB
If you allocate a really large array, the GC needs to find a contiguous region of free memory of the required size. Depending on the array size, and how the heap space is split into spaces, it may be able to find considerably less contiguous space than you think.
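The byte limits above follow from multiplying the maximum element count by the element width; a quick sanity check (heap object headers shave a handful of elements off the true maximum length, which is ignored here):

```java
public class MaxArraySizes {
    // An array can have at most Integer.MAX_VALUE (2^31 - 1) elements;
    // the total size in bytes scales with the element width.
    static long maxBytes(int elementWidth) {
        return (long) Integer.MAX_VALUE * elementWidth;
    }

    public static void main(String[] args) {
        System.out.println(maxBytes(1)); // byte[]/boolean[]:  2147483647  (~2 GB)
        System.out.println(maxBytes(2)); // char[]/short[]:    4294967294  (~4 GB)
        System.out.println(maxBytes(4)); // int[]:             8589934588  (~8 GB)
        System.out.println(maxBytes(8)); // long[]/Object[]:  17179869176 (~16 GB)
    }
}
```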
A third possibility is that you are getting OOMEs because the GC is hitting the GC overhead limit, i.e. too much time is being spent running the GC.
Some of these theories could be confirmed or dismissed if you showed us the stacktrace ...
My problem in short:
I have a machine with 500 GB of RAM and no swap (more than enough): the top command shows 500 GB of free RAM
I have a 20 GB file containing triplets (stringOfTypeX, stringOfTypeY, double val). The meaning is that for one string of type X, the file has on average 20-30 lines, each containing this string of type X plus one (different) string of type Y and the associated double value
I want to load the file in an in-memory index HashMap < StringOfTypeX, TreeMap < StringOfTypeY, val > >
I wrote a Java program using BufferedReader.readLine()
in this program, the hashmap is initialized in the constructor using an initCapacity of 2 times the expected number of distinct strings of type X (the expected number of keys)
I ran the program using: java -jar XXX.jar -Xms500G -Xmx500G -XX:-UseGCOverheadLimit
the program seems to process file lines more and more slowly: at first it processes 2M lines per minute, but with each chunk of 2M lines it gets slower and slower. After 16M lines it has almost stopped, and eventually it throws a java.lang.OutOfMemoryError (GC overhead limit exceeded)
before it throws that error, the top command shows me that it consumes 6% of the 500 GB of RAM (and this value is constant; the program doesn't consume more RAM than this for the rest of its lifetime)
I've read all possible internet threads regarding this. Nothing seems to work. I guess the GC starts doing a lot of work, but I don't understand why, given that I tried to give the hashmap enough RAM from the start. Anyway, it seems that the JVM cannot be forced to pre-allocate a big amount of RAM, no matter what command-line args I give. If this is true, what is the real use of the -Xmx and -Xms params?
Anyone has any ideas? Many thanks !!
Update:
my jvm is 64-bit
6.1% of the 515 GB of RAM is ~32 GB. It seems the JVM is not allowing the usage of more than 32 GB. Following this post, I tried to disable the use of compressed pointers using the flag -XX:-UseCompressedOops. However, nothing changed. The limit is still 32 GB.
no swap is done at any point in time (checked using top)
running with -Xms400G -Xmx400G doesn't solve the issue
It is fairly common to mis-diagnose these sorts of problem.
500 GB should be more than enough, provided you have more than 500 GB of main memory; swap will not do.
A 20 GB file is likely to have a significant expansion ratio if you are creating Strings. E.g. a 16-character String will use about 80 bytes of memory, and a Double uses around 24 bytes in a 64-bit JVM, not the 8 bytes you might expect.
HashMap and TreeMap use about 24 extra bytes per entry.
Using readLine() and doubling the capacity is fine. Actually, expected-size * 4/3 is enough, though the map always rounds the capacity up to the next power of 2.
Setting -Xms does preallocate the specified amount of memory (or almost that number; it is often off by about 1% for no apparent reason).
2 M lines per minute is pretty slow. It suggests your overhead is already very high. I would be looking for something closer to 1 million lines per second.
16 million entries is nothing compared with the size of your JVM. My guess is that you have started to swap, and the error you see is because the GC is taking too long, not because the heap is too full.
How much free main memory do you have? e.g. in top what do you see after the application dies.
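On the capacity point above: to avoid rehashing for a known number of entries at HashMap's default 0.75 load factor, the initial capacity needs to be expected-size * 4/3 (HashMap then rounds it up to the next power of two internally). A minimal sketch, with illustrative names:

```java
import java.util.HashMap;
import java.util.Map;

public class PresizedMap {
    // Initial capacity that avoids any rehashing for `expected` entries
    // at the default load factor of 0.75 (i.e. expected * 4/3, rounded up).
    static int capacityFor(int expected) {
        return (int) Math.ceil(expected / 0.75);
    }

    public static void main(String[] args) {
        int expected = 10_000_000;
        // Pre-sized: the table never resizes while loading the expected entries.
        Map<String, Double> index = new HashMap<>(capacityFor(expected));
        System.out.println(capacityFor(expected)); // 13333334; rounded up to 2^24 internally
        index.put("example-key", 1.0);
    }
}
```

Passing the expected entry count directly as the initialCapacity (a common mistake) still triggers one resize once the map passes 75% of it.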
Problem solved:
java -jar XXX.jar -Xms500G -Xmx500G -XX:-UseGCOverheadLimit is not correct. JVM options must be specified before -jar; otherwise they are passed as arguments to main. The correct command line is java -Xms500G -Xmx500G -XX:-UseGCOverheadLimit -jar XXX.jar args[0] args[1] ... .
Sorry for this and thanks for you answers!
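A cheap way to catch this kind of mistake early is to print the effective heap limit at startup: if -Xmx was swallowed as a program argument instead of being parsed as a JVM flag, maxMemory() shows the default heap size and the flag shows up in args[] instead. A sketch:

```java
public class HeapCheck {
    public static void main(String[] args) {
        // If -Xmx was parsed as a JVM flag, maxMemory() reflects it;
        // if it landed after -jar, it appears here as a program argument.
        System.out.println("max heap: " + Runtime.getRuntime().maxMemory() / (1 << 20) + " MB");
        for (String a : args) System.out.println("program arg: " + a);
    }
}
```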
You say you have 500 GB of RAM. You shouldn't set -Xmx to 500 GB, because that is only the heap size; the VM itself has some memory overhead too. So it is not advisable to give all of the memory to the heap.
I would recommend profiling your application, for example with JVisualVM, or making a heap dump to see what is really in memory. Maybe something is not being cleaned up.
Background:
I have a Java application which does intensive IO on quite large memory-mapped files (> 500 MB). The program reads data, writes data, and sometimes does both.
All read/write functions have similar computation complexity.
I benchmarked the IO layer of the program and noticed strange performance characteristics of memory-mapped files:
It performs 90k reads per second (read 1KB every iteration at random position)
It performs 38k writes per second (write 1KB every iteration sequentially)
It performs 43k writes per second (write 4 bytes every iteration at random position)
It performs only 9k read/write combined operation per second (read 12 bytes then write 1KB every iteration, at random position)
The program runs on 64-bit JDK 1.7, Linux 3.4.
The machine is an ordinary Intel PC with a CPU exposing 8 hardware threads and 4 GB of physical memory. Only 1 GB was assigned to the JVM heap when conducting the benchmark.
If more details are needed, here is the benchmark code: https://github.com/HouzuoGuo/Aurinko2/blob/master/src/test/scala/storage/Benchmark.scala
And here is the implementation of the above read, write, read/write functions: https://github.com/HouzuoGuo/Aurinko2/blob/master/src/main/scala/aurinko2/storage/Collection.scala
So my questions are:
Given fixed file size and memory size, what factors affect memory mapped file random read performance?
Given fixed file size and memory size, what factors affect memory mapped file random write performance?
How do I explain the benchmark result of read/write combined operation? (I was expecting it to perform over 20K iterations per second).
Thank you.
Memory-mapped file performance depends on disk performance, file system type, free memory available for the file system cache, and the read/write block size. The page size on Linux is 4 KB, so you should expect the best performance with 4 KB reads/writes. An access at a random position causes a page fault if the page is not mapped, and pulls in a new page read. Usually, you want a memory-mapped file when you want to treat the file as one memory array (or ByteBuffer in Java).
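To make the page-size point concrete, here is a small sketch that rounds a random access position down to a 4 KB page boundary before touching the mapping, so each random access faults in at most one page (the 4096-byte page size is an assumption, not queried from the OS):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class AlignedReads {
    static final int PAGE = 4096; // typical Linux page size (assumed)

    // Round an arbitrary offset down to its page boundary.
    static long pageAlign(long offset) {
        return offset & ~(long) (PAGE - 1);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("aligned", ".bin");
        byte[] data = new byte[PAGE * 4];
        data[PAGE * 2] = 42; // marker at the start of the third page
        Files.write(tmp, data);

        try (FileChannel fc = FileChannel.open(tmp, StandardOpenOption.READ)) {
            MappedByteBuffer map = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
            long want = PAGE * 2 + 100;          // arbitrary random position
            long page = pageAlign(want);         // page that position lives on
            System.out.println(page);            // 8192
            System.out.println(map.get((int) page)); // 42
        }
        Files.delete(tmp);
    }
}
```

Reading at the aligned offset and slicing out what you need keeps one logical access from straddling two pages and paying two faults.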