I have two files: a dictionary containing words of length 3 to 6, and a dictionary containing words of length 7. The words are stored in text files, one word per line. This method loads a file and inserts its words into an ArrayList which I store in an Application class.
The files are 386 KB and 380 KB and contain fewer than 200k words each.
private void loadDataIntoDictionary(String filename) throws Exception {
    Log.d(TAG, "loading file: " + filename);
    AssetFileDescriptor descriptor = getAssets().openFd(filename);
    FileReader fileReader = new FileReader(descriptor.getFileDescriptor());
    BufferedReader bufferedReader = new BufferedReader(fileReader);
    String word = null;
    int i = 0;
    MyApp appState = ((MyApp) getApplicationContext());
    while ((word = bufferedReader.readLine()) != null) {
        appState.addToDictionary(word);
        i++;
    }
    Log.d(TAG, "added " + i + " words to the dictionary");
    bufferedReader.close();
}
The program crashes on an emulator running Android 2.3.3 with a 64 MB SD card. The errors reported in logcat are as follows:
The heap grows past 24 MB. I then see clamp target GC heap from 25.XXX to 24.000 MB.
GC_FOR_MALLOC freed 0K, 12% free, external 1657k/2137K, paused 208ms.
GC_CONCURRENT freed XXK, 14% free
Out of memory on a 24-byte allocation and then FATAL EXCEPTION, memory exhausted.
How can I load these files without getting such a large heap?
Inside MyApp:
private ArrayList<String> dictionary = new ArrayList<String>();
public void addToDictionary(String word) {
    dictionary.add(word);
}
Irrespective of any other problems/bugs, ArrayList can be wasteful for this kind of storage: when a growing ArrayList runs out of space it allocates a new, larger backing array (roughly 50% bigger on most implementations, double on some older ones) and copies into it, so a sizeable fraction of your storage can end up as unused slack. If you can pre-size a storage array or ArrayList to the correct size, you may get a significant saving.
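For example, a minimal sketch (the 200000 capacity is just an assumed upper bound based on the word counts you mention, not a measured value):

// Pre-size the list so it never has to grow and copy its backing array.
// 200000 is an assumption about the maximum total word count.
private ArrayList<String> dictionary = new ArrayList<String>(200000);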
Also (with paranoid data-cleansing hat on) make sure that there's no extra whitespace in your input files - you can use String.trim() on each word if necessary, or clean up the input files first. But I don't think this can be a significant problem given the file sizes you mention.
I'd expect your inputs to take less than 2MB to store the text itself (remember that Java uses UTF-16 internally, so it would typically take 2 bytes per character), but there's maybe 1.5MB of overhead for the String object references, plus 1.5MB for the String lengths, plus possibly the same again for the offset and again for the hashcode fields (take a look at String.java). While 24MB of heap still sounds a little excessive, it's not far off if you are getting the near-doubling effect of an unlucky ArrayList resize.
In fact, rather than speculate, how about a test? The following code, run with -Xmx24M gets to about 560,000 6-character Strings before stalling (on a Java SE 7 JVM, 64-bit). It eventually crawls up to around 580,000 (with much GC thrashing, I imagine).
import java.util.ArrayList;

public class StringHeapTest {
    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<String>();
        int x = 0;
        while (true) {
            // new String(...) forces a distinct object per entry
            list.add(new String("123456"));
            if (++x % 1000 == 0) System.out.println(x);
        }
    }
}
So I don't think there's a bug in your code - storing large numbers of small Strings is just not very efficient in Java - for the test above it takes over 7 bytes per character because of all the overheads (which may differ between 32-bit and 64-bit machines, incidentally, and depend on JVM settings too)!
You might get slightly better results by storing an array of byte arrays, rather than ArrayList of Strings. There are also more efficient data structures for storing strings, such as Tries.
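A rough sketch of the byte-array idea (not drop-in code: the 200000 capacity is an assumption, and it only covers adding and reading back):

private byte[][] words = new byte[200000][]; // assumed capacity
private int wordCount = 0;

public void addToDictionary(String word) {
    // ASCII words take ~1 byte per character as UTF-8, half the UTF-16 cost
    words[wordCount++] = word.getBytes(java.nio.charset.Charset.forName("UTF-8"));
}

public String wordAt(int index) {
    // decode back to a String only when a word is actually needed
    return new String(words[index], java.nio.charset.Charset.forName("UTF-8"));
}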
Related
I've found a few other questions on SO that are close to what I need but I can't figure this out. I'm reading a text file line by line and getting an out of memory error. Here's the code:
System.out.println("Total memory before read: " + Runtime.getRuntime().totalMemory()/1000000 + "MB");
String wp_posts = new String();
try(Stream<String> stream = Files.lines(path, StandardCharsets.UTF_8)){
wp_posts = stream
.filter(line -> line.startsWith("INSERT INTO `wp_posts`"))
.collect(StringBuilder::new, StringBuilder::append,
StringBuilder::append)
.toString();
} catch (Exception e1) {
System.out.println(e1.getMessage());
e1.printStackTrace();
}
try {
System.out.println("wp_posts Mega bytes: " + wp_posts.getBytes("UTF-8").length/1000000);
} catch (UnsupportedEncodingException e) {
e.printStackTrace();
}
System.out.println("Total memory after read: " + Runtime.getRuntime().totalMemory()/1000000 + "MB");
Output is like (when run in an environment with more memory):
Total memory before read: 255MB
wp_posts Mega bytes: 18
Total memory after read: 1035MB
Note that in my production environment I cannot increase the memory heap.
I've tried explicitly closing the stream, forcing a gc, and putting the stream in parallel mode (which consumed even more memory).
My questions are:
Is this amount of memory usage expected?
Is there a way to use less memory?
Your problem is in collect(StringBuilder::new, StringBuilder::append, StringBuilder::append). When you append to a StringBuilder and its internal array is too small, the builder allocates a new array of roughly double the size and copies the old contents into it.
Use new StringBuilder(int capacity) to pre-size the internal array and avoid those repeated reallocations.
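For example, replacing the collect(...) call in your try block with something like this (the 20000000 capacity is an assumed upper bound on the total characters collected, not a measured value):

wp_posts = stream
        .filter(line -> line.startsWith("INSERT INTO `wp_posts`"))
        // supply a builder that is already big enough, so append() rarely reallocates
        .collect(() -> new StringBuilder(20000000),
                 StringBuilder::append,
                 StringBuilder::append)
        .toString();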
The second problem is that you have a big file but collect the whole result into a single StringBuilder anyway. That seems odd to me: it is effectively the same as reading the whole file into one String without using a Stream at all.
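If the matched lines only need to end up somewhere else, a sketch like this avoids holding them all in memory at once. The file names dump.sql and wp_posts_out.sql are hypothetical placeholders:

import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class WpPostsExtractor {
    public static void main(String[] args) throws IOException {
        Path path = Paths.get("dump.sql");            // hypothetical input file
        Path target = Paths.get("wp_posts_out.sql");  // hypothetical output file
        try (Stream<String> stream = Files.lines(path, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            stream.filter(line -> line.startsWith("INSERT INTO `wp_posts`"))
                  .forEach(line -> {
                      try {
                          // write each matching line straight out instead of buffering them all
                          out.write(line);
                          out.newLine();
                      } catch (IOException e) {
                          throw new UncheckedIOException(e);
                      }
                  });
        }
    }
}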
Your Runtime.totalMemory() calculation is pointless if you allow the JVM to resize the heap. Java will allocate heap memory as needed, as long as it doesn't exceed the -Xmx value. Since the JVM is smart, it won't grow the heap one byte at a time, because that would be very expensive; instead it requests a larger amount of memory at once (the actual amount is platform and JVM implementation specific).
Your code currently loads the content of the file into memory, so objects will be created on the heap. Because of that the JVM will most likely request memory from the OS, and you will observe an increased Runtime.totalMemory() value.
Try running your program with a strictly sized heap, e.g. by adding the -Xms300m -Xmx300m options. If you don't get an OutOfMemoryError, decrease the heap until you do. However, you also need to pay attention to GC cycles; heap size and GC activity go hand in hand and are a trade-off.
Alternatively you can create a heap dump after the file is processed and then explore the data with MemoryAnalyzer.
The way you calculated memory is incorrect due to the following reasons:
You have taken the total memory (not the used memory). The JVM allocates memory lazily, and when it does, it does so in chunks. So when it needs an additional 1 byte of memory, it may allocate 1 MB (provided the total does not exceed the configured max heap size). Thus a good portion of the allocated heap may remain unused. Therefore you need to calculate the used memory: Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory()
A good portion of the memory you see with the above formula may already be eligible for garbage collection. The JVM will definitely attempt garbage collection before throwing an OutOfMemoryError. Therefore, to get a realistic idea, you should do a System.gc() before calculating used memory. Of course, you don't call gc in production, and calling it does not guarantee that the JVM will actually run a collection, but for testing purposes it works well enough.
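A tiny test-only helper along those lines (the method name is mine; as noted, System.gc() is only a hint to the VM):

static long usedMemoryBytes() {
    System.gc(); // hint only; acceptable for rough test measurements, not production
    Runtime rt = Runtime.getRuntime();
    return rt.totalMemory() - rt.freeMemory();
}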
You got the OutOfMemory error while the stream processing was still in progress. At that point the final String had not been formed yet, and the StringBuilder still held a strong reference to its buffer. You should call the capacity() method of the StringBuilder to get the actual number of char elements in its internal array, and then multiply by 2 to get the number of bytes, because Java internally uses UTF-16, which needs 2 bytes even for an ASCII character.
Finally, the way your code is written (i.e. not specifying a big enough initial size for the StringBuilder), every time the StringBuilder runs out of space it doubles the size of its internal array by creating a new array and copying the content across. That means roughly triple the size of the actual String can be allocated at once during a resize. You cannot observe this directly, because it happens inside the StringBuilder class, and once control returns from it the old array is ready for garbage collection. So there is a high chance that when you get the OutOfMemory error, you get it at the point where the StringBuilder tries to allocate the doubled array, or more specifically in the Arrays.copyOf method.
How much memory is expected to be consumed by your program as is? (A rough estimate)
Let's consider the program which is similar to yours.
public static void main(String[] arg) {
    // Initialize the ArrayList to emulate a file
    // with 32 lines, each containing 1000 ASCII characters
    List<String> strList = new ArrayList<String>(32);
    for (int i = 0; i < 32; i++) {
        strList.add(String.format("%01000d", i));
    }

    StringBuilder str = new StringBuilder();
    strList.stream().map(element -> {
        // Print the number of chars currently
        // reserved by the StringBuilder
        System.out.print(str.capacity() + ", ");
        return element;
    }).collect(() -> {
        return str;
    }, (response, element) -> {
        response.append(element);
    }, (response, element) -> {
        response.append(element);
    }).toString();
}
Here after every append, I'm printing the capacity of the StringBuilder.
The output of the program is as follows:
16, 1000, 2002, 4006, 4006, 8014, 8014, 8014, 8014,
16030, 16030, 16030, 16030, 16030, 16030, 16030, 16030,
32062, 32062, 32062, 32062, 32062, 32062, 32062, 32062,
32062, 32062, 32062, 32062, 32062, 32062, 32062,
If your file has n lines (where n is a power of 2) and each line has an average of m ASCII characters, the capacity of the StringBuilder at the end of the program will be approximately (n * m + 2 ^ (a + 1)), where 2 ^ a = n.
E.g. if your file has 256 lines and an average of 1500 ASCII characters per line, the total capacity of the StringBuilder at the end of the program will be approximately (256 * 1500 + 2 ^ 9) = 384512 characters.
Assuming you have only ASCII characters in your file, each character will occupy 2 bytes in the UTF-16 representation. Additionally, every time the StringBuilder's array runs out of space, a new array twice the size of the original is created (see the capacity growth numbers above) and the content of the old array is copied into the new one; the old array is then left for garbage collection. So if you add another 2 ^ (a + 1), i.e. 2 ^ 9, characters, the StringBuilder will create a new array holding (n * m + 2 ^ (a + 1)) * 2 + 2 characters and start copying the content of the old array into it. Thus there will be two big arrays inside the StringBuilder while the copying goes on.
Thus the total memory will be: 384512 * 2 + (384512 * 2 + 2) * 2 = 2,307,076 bytes ≈ 2.2 MB, to hold only about 0.7 MB of data.
I have ignored the other memory consuming items like array header, object header, references etc. as those will be negligible or constant compared to the array size.
So, in conclusion, 256 lines with 1500 characters each consume about 2.2 MB to hold only 0.7 MB of data; the useful data is only about a third of the memory used.
If you had initialized the StringBuilder with a capacity of 384,512 at the beginning, you could have accommodated the same number of characters in roughly one third of the memory, and there would also have been much less work for the CPU in terms of array copying and garbage collection.
What you may consider doing instead
Finally, for this kind of problem you may want to work in chunks: write the content of your StringBuilder to a file or database as soon as it has processed, say, 1000 records, clear the StringBuilder, and start over for the next batch. That way you never hold more than 1000 records' worth of data in memory; see the sketch below.
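A rough sketch of that approach, reusing the question's path and filter; flushToDatabase() is a hypothetical placeholder for whatever sink you actually use:

StringBuilder batch = new StringBuilder(1 << 20); // pre-sized for roughly 1M chars per batch
int linesInBatch = 0;
try (Stream<String> stream = Files.lines(path, StandardCharsets.UTF_8)) {
    for (String line : (Iterable<String>) stream::iterator) {
        if (!line.startsWith("INSERT INTO `wp_posts`")) continue;
        batch.append(line).append('\n');
        if (++linesInBatch == 1000) {
            flushToDatabase(batch.toString()); // hypothetical sink
            batch.setLength(0);                // reuse the builder for the next chunk
            linesInBatch = 0;
        }
    }
    if (linesInBatch > 0) {
        flushToDatabase(batch.toString());
    }
}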
I am trying to read a 512MB file into java memory. Here is my code:
String url_part = "/homes/t1.csv";
File f = new File(url_part);
BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(f)));
ArrayList<String> mem = new ArrayList<String>();

System.out.println("Start loading.....");
System.gc();
double start = System.currentTimeMillis();

String line;
int count = 0;
while ((line = br.readLine()) != null) {
    mem.add(line);
    count++;
    if (count % 500000 == 0) {
        System.out.println(count);
    }
}
The file contains 40,000,000 lines. Performance is totally fine up to about 18,500,000 lines, but it gets stuck somewhere after reading about 20,000,000 lines (it freezes there, then continues after a long wait of roughly 10 seconds).
I kept track of memory use and found that even though the file is only 512 MB, memory grows to about 2 GB while the program runs. Also, all 8 CPU cores stay at 100% utilization.
I just want to read the file into memory so that I can later access the data faster. Am I doing this the right way? Thanks!
First, Java stores strings in UTF-16, so if your input file contains mostly Latin-1 characters you will need twice as much memory to store them; roughly 1 GB is used just for the chars. Second, there is a per-line overhead, which we can roughly estimate:
Reference from ArrayList to String - 4 bytes (assuming compressed oops)
Reference from String to char[] array - 4 bytes
String object header - at least 8 bytes
hash String field (to store hashCode) - 4 bytes
char[] object header - at least 8 bytes
char[] array length - 4 bytes
So in total at least 32 bytes are wasted per line. Usually it's more, as objects must be padded. So for 20,000,000 lines you have at least 640,000,000 bytes (about 640 MB) of overhead.
I'm trying to figure out why I am getting an OOM error even though the byte array I am initializing, plus the currently used memory, is less than the max heap size (1000 MB).
Right before the array is initialized I'm using 373 MB with 117 MB free. When I try to initialize the array, which takes up 371 MB, I get the error. The strange thing is that the error persists until I allocate 1.2 GB or more for the JVM.
373 + 371 is 744, so I should still have about 256 MB free; this is driving me nuts.
In a second case, using 920 MB with 117 MB free, initializing a 918 MB array requires at least 2800 MB of heap.
Is this somehow part of how Java works? If so, is there a workaround so that something simple like an array copy operation can be done in less than 3n memory?
(Memory numbers are from Runtime, and the max heap size is set with -Xmx.)
test.java:
byte[] iv = new byte[32];
byte[] key = new byte[32];
new SecureRandom().nextBytes(iv);
new SecureRandom().nextBytes(key);

byte[] plaintext = FileUtils.readFileToByteArray(new File("sampleFile"));
EncryptionResult out = ExperimentalCrypto.doSHE(plaintext, key, iv);
ExperimentalCrypto.java:
public static byte[] doSHE(byte[] input, byte[] key, byte[] iv) {
    if (input.length % 32 != 0) {
        int length = input.length;
        byte[] temp = null;
        System.out.println((input.length / 32 + 1) * 32 / (1024 * 1024));
        temp = new byte[(input.length / 32 + 1) * 32]; // encounter error here
Typical JVM implementations split the Java heap into several regions dedicated to objects of a certain lifetime. Allocations of larger arrays typically bypass the stages for younger objects, both because those areas are usually smaller and to avoid unnecessary copying, so they end up directly in the "old generation" space, for which a size of ⅔ of the heap is not unusual. As you are using JVisualVM, I recommend installing the Visual GC plugin, which shows you a live view of the different memory areas and their fill state.
You can use the -XX:MaxPermSize=… and -XX:MaxNewSize=… startup options to reduce the sizes of the areas for the young and permanent generation and thus indirectly raise the fraction of the old generation’s area where your array will be allocated.
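For example, something along these lines (the values are illustrative only and would need tuning; -XX:MaxPermSize applies to pre-Java 8 JVMs such as the one used here, and YourMainClass is a placeholder):

java -Xmx1000m -XX:MaxNewSize=128m -XX:MaxPermSize=128m YourMainClass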
Java has a tendency to create a large number of objects that need to be garbage collected when processing large data sets. This happens fairly frequently when streaming large amounts of data from the database, creating reports, etc. Is there a strategy to reduce the memory churn?
In this example, the object-based version spends a significant amount of time (2+ seconds) generating objects and performing garbage collection, whereas the boolean-array version completes in a fraction of a second without any garbage collection whatsoever.
How do I reduce the memory churn (the need for a large number of garbage collections) when processing large data sets?
java -verbose:gc -Xmx500M UniqChars
...
----------------
[GC 495441K->444241K(505600K), 0.0019288 secs] x 45 times
70000007
================
70000007
import java.util.HashSet;
import java.util.Set;

public class UniqChars {
    static String a = null;

    public static void main(String[] args) {
        // Generate data set
        StringBuffer sb = new StringBuffer("sfdisdf");
        for (int i = 0; i < 10000000; i++) {
            sb.append("sfdisdf");
        }
        a = sb.toString();
        sb = null; // free sb
        System.out.println("----------------");
        compareAsSet();
        System.out.println("================");
        compareAsAry();
    }

    public static void compareAsSet() {
        Set<String> uniqSet = new HashSet<String>();
        int n = 0;
        for (int i = 0; i < a.length(); i++) {
            String chr = a.substring(i, i + 1);
            uniqSet.add(chr);
            n++;
        }
        System.out.println(n);
    }

    public static void compareAsAry() {
        boolean uniqSet[] = new boolean[65536];
        int n = 0;
        for (int i = 0; i < a.length(); i++) {
            int chr = (int) a.charAt(i);
            uniqSet[chr] = true;
            n++;
        }
        System.out.println(n);
    }
}
Well, as pointed out by one of the comments, it's your code, not Java, at fault for the memory churn. So let's see what you've written: code that builds an insanely large String from a StringBuffer, calls toString() on it, then calls substring() in a loop on that insanely large String, creating a.length() new Strings. It then does some in-place work on an array, which really will perform pretty fast since there is no object creation; it just ends up writing true to the same four locations in a huge array. Waste much? So what did you think would happen? For a start, ditch StringBuffer and use StringBuilder, since it's not synchronized, which will be a little faster.
OK, so here's where your algorithm is probably spending its time. The StringBuffer allocates an internal character array to store things in. Each time you call append() and that array fills up completely, it has to allocate a larger character array, copy everything you've written so far into it, and then append what you passed in. So your code fills the array up, allocates a bigger chunk, copies the old contents across, and repeats that process until it has done so 10,000,000 times. You can speed that up by pre-allocating the character array for the buffer; roughly that's 10000000 * "sfdisdf".length(). That keeps Java from creating tons of memory that it just dumps over and over.
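Roughly like this, using StringBuilder as suggested above (a sketch based on the question's own constants and its static field a):

String piece = "sfdisdf";
// capacity for the initial "sfdisdf" plus 10,000,000 appended copies
StringBuilder sb = new StringBuilder(piece.length() * (10000000 + 1));
sb.append(piece);
for (int i = 0; i < 10000000; i++) {
    sb.append(piece);
}
a = sb.toString();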
Next is the compareAsSet() mess. Your line String chr = a.substring(i, i + 1); creates a new String a.length() times. Since each substring is only a single character, you could just use charAt(i) and there's no allocation at all (see the sketch below). There's also CharSequence via String.subSequence(), which in older JVMs didn't create a new String with its own character array but simply pointed at the original underlying char[] with an offset and length (since Java 7u6, substring and subSequence copy the characters instead).
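For instance, a sketch of compareAsSet() rewritten with charAt():

public static void compareAsSet() {
    Set<Character> uniqSet = new HashSet<Character>();
    int n = 0;
    for (int i = 0; i < a.length(); i++) {
        // Character.valueOf() caches values 0-127, so the four ASCII
        // characters in this input don't even allocate new boxed objects
        uniqSet.add(a.charAt(i));
        n++;
    }
    System.out.println(n);
}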
Plug this same code into almost any other language and it'll be slow there too; in fact I'd say far, far worse. Just try this in C++ and watch it do significantly worse than Java when you allocate and deallocate this much: Java's memory allocation is much faster than C++'s because everything in Java is allocated from a memory pool, so creating objects is magnitudes faster. But there are limits. Furthermore, Java compacts its heap should it become too fragmented; C++ doesn't. So as you allocate memory and dump it in just the same way, you run the risk of fragmenting the memory in C++. That could mean your StringBuffer loses the ability to grow large enough to finish, and crashes.
In fact that might also explain some of the GC performance issues, because the collector has to make room for a contiguous block big enough for the buffer after lots of trash has been taken out. So Java is not only cleaning up the memory, it's also having to compact the heap's address space so it can get a block big enough for your StringBuffer.
Anyway, I'm sure you're just kicking the tires, but testing with code like this isn't really smart because it'll never perform well; it's unrealistic memory allocation. You know the old adage: Garbage In, Garbage Out. And that's what you got: garbage.
In your example your two methods are doing very different things.
In compareAsSet() you are generating the same 4 Strings ("s", "d", "f" and "i") and calling String.hashCode() and String.equals(String) (HashSet does this when you try to add them) 70000007 times. What you end up with is a HashSet of size 4. While you are doing this, you are allocating a String object each time String.substring(int, int) returns, which forces a minor collection every time the 'new' generation of the garbage collector fills up.
In compareAsAry() you allocate a single array 65536 elements wide, change some values in it, and then it goes out of scope when the method returns. That is a single heap allocation vs the 70000007 done in compareAsSet(). You do have a local int variable being changed 70000007 times, but that happens in stack memory, not heap memory. This method does not really generate much garbage in the heap compared to the other method (basically just the array).
Regarding churn, your options are to recycle objects or to tune the garbage collector.
Recycling is not really possible with Strings in general, as they are immutable; though the VM may perform interning, this only reduces the total memory footprint, not the garbage churn. A recycling solution targeted at the above scenario could be written, but the implementation would be brittle and inflexible.
Tuning the garbage collector so that the 'new' generation is larger could reduce the total number of collections performed during your method call and thus increase its throughput; you could also just increase the heap size in general, which would accomplish the same thing.
For further reading on garbage collector tuning in Java 6 I recommend the Oracle white paper linked below.
http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html
For comparison, if you wrote this it would do the same thing.
public static void compareLength() {
    // All the loop does is count the length in a complex way.
    System.out.println(a.length());
}

// I assume you intended to write this.
public static void compareAsBitSet() {
    BitSet uniqSet = new BitSet();
    for (int i = 0; i < a.length(); i++)
        uniqSet.set(a.charAt(i));
    System.out.println(uniqSet.size());
}
Note: the BitSet uses 1 bit per element, rather than 1 byte per element. It also expands as required, so if you have ASCII text the BitSet might use 128 bits, i.e. 16 bytes (plus roughly 32 bytes of object overhead). The boolean[] uses 64 KB, which is much more. Ironically, using a boolean[] can still be faster, as it involves less bit shifting and only the portion of the array actually used needs to be in memory.
As you can see, with either solution, you get a much more efficient result because you use a better algorithm for what needs to be done.
I have a very large array of doubles that I am handling with a disk-based file and a paging List of MappedByteBuffers; see this question for more background. I am running on Windows XP using Java 1.5.
Here is the key part of my code that does the allocation of the buffers against the file...
try
{
    // create a random access file and size it so it can hold all our data = the extent x the size of a double
    f = new File(_base_filename);
    _filename = f.getAbsolutePath();
    _ioFile = new RandomAccessFile(f, "rw");
    _ioFile.setLength(_extent * BLOCK_SIZE);
    _ioChannel = _ioFile.getChannel();

    // make enough MappedByteBuffers to handle the whole lot
    _pagesize = bytes_extent;
    long pages = 1;
    long diff = 0;
    while (_pagesize > MAX_PAGE_SIZE)
    {
        _pagesize /= PAGE_DIVISION;
        pages *= PAGE_DIVISION;

        // make sure we are at double boundaries. We cannot have a double spanning pages
        diff = _pagesize % BLOCK_SIZE;
        if (diff != 0) _pagesize -= diff;
    }

    // what is the difference between the total bytes associated with all the pages and the
    // total overall bytes? There is a good chance we'll have a few left over because of the
    // rounding down that happens when the page size is halved
    diff = bytes_extent - (_pagesize * pages);
    if (diff > 0)
    {
        // check whether adding on the remainder to the last page will tip it over the max size
        // if not then we just need to allocate the remainder to the final page
        if (_pagesize + diff > MAX_PAGE_SIZE)
        {
            // need one more page
            pages++;
        }
    }

    // make the byte buffers and put them on the list
    int size = (int) _pagesize; // safe cast because of the loop which drops the page size below Integer.MAX_VALUE
    int offset = 0;
    for (int page = 0; page < pages; page++)
    {
        offset = (int) (page * _pagesize);

        // the last page should be just big enough to accommodate any left over odd bytes
        if ((bytes_extent - offset) < _pagesize)
        {
            size = (int) (bytes_extent - offset);
        }

        // map the buffer to the right place
        MappedByteBuffer buf = _ioChannel.map(FileChannel.MapMode.READ_WRITE, offset, size);

        // stick the buffer on the list
        _bufs.add(buf);
    }

    Controller.g_Logger.info("Created memory map file :" + _filename);
    Controller.g_Logger.info("Using " + _bufs.size() + " MappedByteBuffers");
    _ioChannel.close();
    _ioFile.close();
}
catch (Exception e)
{
    Controller.g_Logger.error("Error opening memory map file: " + _base_filename);
    Controller.g_Logger.error("Error creating memory map file: " + e.getMessage());
    e.printStackTrace();
    Clear();
    if (_ioChannel != null) _ioChannel.close();
    if (_ioFile != null) _ioFile.close();
    if (f != null) f.delete();
    throw e;
}
I get the error mentioned in the title after I allocate the second or third buffer.
I thought it was something to do with contiguous memory available, so have tried it with different sizes and numbers of pages, but to no overall benefit.
What exactly does "Not enough storage is available to process this command" mean and what, if anything, can I do about it?
I thought the point of MappedByteBuffers was the ability to be able to handle structures larger than you could fit on the heap, and treat them as if they were in memory.
Any clues?
EDIT:
In response to an answer below (#adsk) I changed my code so I never have more than a single active MappedByteBuffer at any one time. When I refer to a region of the file that is currently unmapped, I discard the existing map and create a new one. I still get the same error after about 3 map operations.
The bug quoted with GC not collecting the MappedByteBuffers still seems to be a problem in JDK 1.5.
I thought the point of MappedByteBuffers was the ability to be able to handle structures larger than you could fit on the heap, and treat them as if they were in memory.
No. The idea is / was to allow you to address more than 2**31 doubles ... on the assumption that you had enough memory and were using a 64-bit JVM.
(I am assuming that this is a followup question to this question.)
EDIT: Clearly, more explanation is needed.
There are a number of limits that come into play.
Java has a fundamental restriction that the length attribute of an array and array indexes have type int. This, combined with the fact that int is signed and an array cannot have a negative size, means that the largest possible array has 2**31 - 1 elements. This restriction applies to 32-bit AND 64-bit JVMs. It is a fundamental part of the Java language ... like the fact that char values go from 0 to 65535.
Using a 32-bit JVM places a (theoretical) upper bound of 2**32 on the number of bytes that are addressable by the JVM. This includes the entire heap, your code and the library classes that you use, the JVM's native code core, memory used for mapped buffers ... everything. (In fact, depending on your platform, the OS may give you considerably less than 2**32 bytes of address space.)
The parameters that you give on the java command line determine how much heap memory the JVM will allow your application to use. Memory mapped using MappedByteBuffer objects does not count towards this.
The amount of memory that the OS will give you depends (on Linux/UNIX) on the total amount of swap space configured, the process limits and so on. Similar limits probably apply on Windows. And of course, you can only run a 64-bit JVM if the host OS is 64-bit capable and you are using 64-bit capable hardware. (If you have a Pentium, you are plain out of luck.)
Finally, the amount of physical memory in your system comes into play. In theory, you can ask your JVM to use a heap that is many times bigger than your machine's physical memory. In practice, this is a bad idea: if you over-allocate virtual memory, your system will thrash and application performance will go through the floor.
The take away is this:
If you use a 32 bit JVM, you probably are limited to somewhere between 2**31 and 2**32 bytes of addressable memory. That's enough space for a MAXIMUM of between 2**29 and 2**30 doubles, whether you use an array or a mapped Buffer.
If you use a 64 bit JVM, you can represent a single array of 2**31 doubles. The theoretical limit for a mapped Buffer would be 2**63 bytes, i.e. 2**60 doubles, but the practical limit is roughly the amount of physical memory your machine has.
When memory mapping a file, it is possible to run out of address space in a 32-bit VM. This happens even if the file is mapped in small chunks and those ByteBuffers are no longer reachable. The reason is that GC never kicks in to free the buffers.
Refer to the bug at http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6417205