When reading the source code of SparseArray in the Android SDK, I came across a method named ArrayUtils.newUnpaddedObjectArray(capacity). I don't understand what an "unpadded" array means.
You can find more information in class VMRuntime, which is used by class ArrayUtils.
The JavaDoc of VMRuntime.newUnpaddedArray(...) says:
Returns an array of at least minLength, but potentially larger. The increased size comes from avoiding any padding after the array. The amount of padding varies depending on the componentType and the memory allocator implementation.
Java data types use different sizes in memory. If only the requested capacity were allocated, some space at the end would be left unused before the next memory allocation. This space is called padding. So, in order not to waste that space, the array is made a little larger.
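To make the idea concrete, here is a toy sketch; the rounding rule below is made up for illustration, the real one lives in VMRuntime and depends on the memory allocator:

public class UnpaddedDemo {
    // Hypothetical bucket size of 8 references; the real value depends on the allocator.
    private static final int BUCKET = 8;

    // Returns an array of at least minLength, rounded up so the allocation fills a whole bucket.
    static Object[] newUnpaddedObjectArray(int minLength) {
        int rounded = ((minLength + BUCKET - 1) / BUCKET) * BUCKET;
        return new Object[rounded];
    }

    public static void main(String[] args) {
        Object[] backing = newUnpaddedObjectArray(10);
        // Callers such as SparseArray treat the actual length, not the requested 10,
        // as the capacity, so the padding becomes usable slots instead of waste.
        System.out.println("requested 10, got " + backing.length);  // prints 16 here
    }
}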
Related
Buffer is an abstract class having concrete subclasses such as ByteBuffer, IntBuffer, etc. It seems to be a container of data of a specific primitive type. What are the benefits of a Buffer? Why wouldn't I just use an array or a list?
A buffer can be defined, in its simplest form, as a contiguous block of memory of some type. Hence a byte buffer of size 4K (4096 bytes) may occupy memory locations 0xf000 through 0xffff inclusive.
As to why a buffer type may be used instead of an array or list, neither of those two alternatives has the in-built features of limit, position or mark.
On the first item, a buffer separates the capacity from the limit in that you can have a capacity of 1000 with a current limit of 10. In other words, it enforces the ability to have a variable size up to and including the capacity.
For the other two features, the current position provides an in-built way to read or write the next element, easing sequential processing, and the mark provides a way to save the current position for later reset.
All these features would require extra variables if you needed them in conjunction with an array or list.
Of course, if you don't need any of these features then, by all means, use an array.
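A small runnable example of those features using a ByteBuffer; doing the same with a plain byte[] would mean tracking the position, limit and mark in separate variables yourself:

import java.nio.ByteBuffer;

public class BufferDemo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(1000);    // capacity 1000
        buf.put((byte) 1).put((byte) 2).put((byte) 3); // position advances with each write

        buf.flip();                       // limit = 3 (what was written), position = 0
        System.out.println(buf.get());    // 1
        buf.mark();                       // remember position 1
        System.out.println(buf.get());    // 2
        buf.reset();                      // jump back to the mark
        System.out.println(buf.get());    // 2 again

        System.out.println(buf.capacity() + " / " + buf.limit());  // 1000 / 3
    }
}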
How do I declare an array of size 10^9 in Java? I have tried ArrayList, but the problem is that I need to find the minimum and maximum element in the array, so I need to compare the 0th element with all the other elements, and I initially need a fixed-size array, which is required by the input format on CodeChef. Can anybody help? I tried using a long array but it gave an out-of-memory error.
A Java array can have a maximum size equal to Integer.MAX_VALUE (or in some cases a slightly smaller value), which is about 2.1 * 10^9, so creating an array that large is theoretically possible. However, since 10^9 corresponds to the prefix giga (to make it easier to read), the array would have a size of at least 1 GB (when using byte[]). Depending on the data type you're using, the array probably just takes too much memory (an int[] would already take up 4 GB).
You could try to increase the JVM's maximum heap size with the -Xmx option (e.g. to allow a maximum of 4 GB you could use -Xmx4g), but you're still limited by the amount of addressable memory (e.g. IIRC a 32-bit JVM can only address up to 4 GB in total) and by the memory actually available.
Alternatively you could try and split the array over multiple machines or JVMs and employ some distributed approach. Or you could write the array to a (memory mapped) file and keep only a part of the array in memory.
The best approach, however, would probably be to check whether you really need that much memory. In many cases using some clever algorithms or structures can dramatically reduce memory requirements. What to use depends on what you're trying to achieve in the end though.
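For the concrete task in the question (minimum and maximum), you usually don't need to store the numbers at all. A sketch, assuming the input is a count followed by whitespace-separated values, which may not match the exact CodeChef format:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.StringTokenizer;

public class MinMax {
    public static void main(String[] args) throws Exception {
        // Assumed input format: a count n on the first line, then n whitespace-separated longs.
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        int n = Integer.parseInt(in.readLine().trim());

        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        StringTokenizer tok = new StringTokenizer("");
        for (int i = 0; i < n; i++) {
            while (!tok.hasMoreTokens()) {
                tok = new StringTokenizer(in.readLine());
            }
            long v = Long.parseLong(tok.nextToken());
            if (v < min) min = v;   // keep only the running extremes,
            if (v > max) max = v;   // never all 10^9 values at once
        }
        System.out.println(min + " " + max);
    }
}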
I wish to make a large int array that very nearly fills all of the memory available to the JVM. Take this code, for instance:
final int numBuffers = (int) ((runtime.freeMemory() - 200000L) / (BUFFER_SIZE));
System.out.println(runtime.freeMemory());
System.out.println(numBuffers*(BUFFER_SIZE/4)*4);
buffers = new int[numBuffers*(BUFFER_SIZE / 4)];
When run with a heap size of 10M, this throws an OutOfMemoryError, despite the output from the printlns being:
9487176
9273344
I realise the array is going to have some overhead, but not 200k, surely? Why does Java fail to allocate memory for something it claims to have enough space for? I have to set the subtracted constant to around 4M before Java will run this (by which time the printlns look more like:
9487176
5472256
)
Even more bewilderingly, if I replace buffers with a 2D array:
buffers = new int[numBuffers][BUFFER_SIZE / 4];
Then it runs without complaint using the 200k subtraction shown above, even though the number of integers being stored is the same in both cases (and wouldn't the overhead of a 2D array be larger than that of a 1D array, since it has all those references to other arrays to store?).
Any ideas?
The VM will divide the heap memory into different areas (mainly for the garbage collector), so you will run out of memory when you attempt to allocate a single object of nearly the entire heap size.
Also, some memory will already have been used up by the JRE. 200k is nothing with today's memory sizes, and a 10M heap is almost unrealistically small for most applications.
The actual overhead of an array is relatively small; on a 32-bit VM it's 12 bytes IIRC (plus whatever gets wasted if the size is less than the minimal granularity, which is AFAIK 8 bytes). So in the worst case you have something like 19 bytes of overhead per array.
Note that Java has no 2D (multi-dimensional) arrays, it implements this internally as an array of arrays.
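A quick illustration of that last point: each row of a "2D" array is an independent array object, and the rows can even have different lengths:

int[][] grid = new int[3][];   // 3 references, no row storage allocated yet
grid[0] = new int[10];
grid[1] = new int[20];         // rows need not share a length
grid[2] = new int[5];
System.out.println(grid.length + " rows, row 1 holds " + grid[1].length + " ints");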
In the 2D case, you are allocating more, smaller objects. The memory manager is objecting to the single large object taking up most of the heap. Why this is objectionable is a detail of the garbage collection scheme; it's probably because the collector can move the smaller objects between generations, whereas the heap won't accommodate moving the single large object around.
This might be due to memory fragmentation and the JVM's inability to allocate an array of that size given the current heap.
Imagine your heap is 10 x long:
xxxxxxxxxx
Then you allocate an object, marked 0, somewhere. This makes your heap look like:
xxxxxxx0xx
Now, you can no longer allocate those 10 x spaces. You can not even allocate 8 xs, despite the fact that available memory is 9 xs.
The fact is that an array of arrays does not suffer from the same problem because it's not contiguous.
EDIT: Please note that the above is a very simplistic view of the problem. When in need of space in the heap, Java's garbage collector will try to collect as much memory as it can and, if really, really necessary, try to compact the heap. However, some objects might not be movable or collectible, creating heap fragmentation and putting you in the above situation.
There are also many other factors that you have to consider, some of which include: memory leaks either in the VM (not very likely) or your application (also not likely for a simple scenario), unreliability of using Runtime.freeMemory() (the GC might run right after the call and the available free memory could change), implementation details of each particular JVM, etc.
The point is, as a rule of thumb, don't always expect to have the full amount of Runtime.freeMemory() available to your application.
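A rough sketch of how to budget allocations more defensively; the 80% factor below is an arbitrary safety margin, not a rule:

Runtime rt = Runtime.getRuntime();
long max   = rt.maxMemory();    // upper bound the heap may grow to (-Xmx)
long total = rt.totalMemory();  // heap currently reserved from the OS
long free  = rt.freeMemory();   // unused part of the currently reserved heap
long reachable = max - total + free;   // optimistic estimate of what could still be allocated

// Leave generous headroom rather than trusting the numbers exactly; the GC, other threads
// and heap layout can all eat into this before your allocation actually happens.
long budget = (long) (reachable * 0.8);
System.out.println("max=" + max + " total=" + total + " free=" + free + " budget=" + budget);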
I want to read some XML files and convert them to a graph (no graphics, just a model). But because the files are very large (2.2 GB), my model object, which holds all the information, becomes even larger (4x the size of the file...).
Googling through the net, I tried to find ways to reduce the object size. I tried different collection types but would like to stick with a HashMap (because I need random access). The actual keys and values make up just a small amount of the allocated memory. Most of the hash table is empty...
If I'm not totally wrong, a garbage collection doesn't help me free the allocated memory and reduce the size of the HashMap. Is there another way to release unused memory and shrink the HashMap? Or is there a way to do perfect hashing? Or should I just use another collection?
Thanks in advance,
Sebastian
A HashMap is typically just a large array of references filled to a certain percentage of capacity. If only 80% of the map is filled, the remaining 20% of the array cells are unused (i.e., are null). The extra overhead is really only just the empty (null) cells.
On a 32-bit CPU, each array cell is usually 4 bytes in size (although some JVM implementations may allocate 8 bytes). That's not really that much unused space overall.
Once your map is filled, you can copy it to another HashMap with a more appropriate (smaller) size giving a larger fill percentage.
Your question seems to imply that there are more allocated but unused objects that you're worried about. But how is that the case?
Addendum
Once a map is filled past its load factor (by default 75% of its capacity), a larger array is allocated, the old array's contents are copied to the new array, and then the smaller array is left to be garbage collected. This is obviously an expensive operation, so choosing a reasonably large initial size for the map is key to improving performance.
If you can (over)estimate the number of cells needed, preallocating a map can reduce or even eliminate the resizing operations.
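A sketch of both ideas, preallocating and later compacting; the Node value type and the size estimate are placeholders for whatever the graph model actually uses:

int expectedSize = 1_000_000;   // hypothetical estimate derived from the XML size

// Preallocate so expectedSize entries fit under the default 0.75 load factor,
// avoiding the resize-and-rehash passes that happen while the map grows.
Map<String, Node> nodes = new HashMap<>((int) (expectedSize / 0.75f) + 1);

// ... fill the map while parsing ...

// If the estimate was far too generous, copy into a right-sized map and drop the old one;
// the oversized backing array then becomes garbage.
Map<String, Node> compact = new HashMap<>((int) (nodes.size() / 0.75f) + 1);
compact.putAll(nodes);
nodes = compact;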
What you are asking is not entirely clear: is the memory taken by the objects that you put inside the HashMap, or by the HashMap itself? The latter shouldn't be the case, since it only holds references.
In any case, take a look at WeakHashMap; maybe it is what you are looking for. It is a HashMap which doesn't guarantee that keys are kept inside it, so it should be used as a sort of cache, but from your description I can't really tell whether that fits your case or not.
If you get nowhere with reducing the memory footprint of your hashmap, you could always put the data in a database. Depending on how the data is accessed, you might still get reasonable performance if you introduce a cache in front of the db.
One thing that might come into play is that you might have substrings that are referencing old larger strings, and those substrings are then making it impossible for the GC to collect the char arrays that are too big.
This happens when you are using some XML parsers that are returning attributes/values as substring from a larger string. (A substring is only a limited view of the larger string).
Try to put your strings in the map by doing something like this:
map.put(new String(key), new String(value));
Note that the GC then might get more work to do when you are populating the map, and this might not help you if you don't have that many substrings that are referencing larger strings.
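To make the problem concrete, a hedged sketch; hugeXmlChunk(), map and node are placeholders, and the sharing behaviour described only applies to older JDKs, since String.substring makes a copy from Java 7u6 onwards:

// On older JDKs, substring() shared the parent string's char[]:
String record = hugeXmlChunk();          // hypothetical multi-megabyte string
String id = record.substring(10, 20);    // 10 characters, but it pins the whole backing array
map.put(id, node);                       // the map now keeps the huge char[] alive

// A defensive copy breaks the link, so the large char[] can be collected:
map.put(new String(id), node);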
If you're really serious about this and you have time to spare, you can make your own implementation of the Map interface based on minimal perfect hashing.
If your keys are Strings, then there apparently is a map available for you here.
I haven't tried it myself but it brags about reduced memory usage.
You might give the Trove collections a shot. They advertise it as a more time and space efficient drop-in replacement for the java.util Collections.
Say I instantiate 100,000 Vectors:
a[0..100k] = new Vector<Integer>();
If I do this instead:
a[0..100k] = new Vector<Integer>(1);
Will they take less memory? That is, ignoring whether they have anything in them, and ignoring the overhead of expanding them when there has to be more than one element.
According to the Javadoc, the default capacity for a vector is 10, so I would expect it to take more memory than a vector of capacity 1.
In practice, you should probably use an ArrayList unless you need to work with another API that requires vectors.
When you create a Vector, you either specify the size you want it to have at the start or leave the default value. But it should be noted that in any case, everything stored in a Vector is just a bunch of references, which take up very little space compared to the objects they actually point at.
So yes, you will save space initially, but only by an amount equal to the difference between the default size and the specified size, multiplied by the size of a reference. If you create a really large number of vectors, as in your case, the initial size does matter.
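A sketch of the difference in the scenario from the question; the byte arithmetic assumes 4-byte references on a 32-bit VM:

// The no-arg constructor reserves room for 10 references per Vector:
Vector<Integer>[] a = new Vector[100_000];   // unchecked warning, but legal
for (int i = 0; i < a.length; i++) {
    a[i] = new Vector<Integer>();            // backing array of length 10
}

// Passing an initial capacity of 1 reserves a single slot instead:
for (int i = 0; i < a.length; i++) {
    a[i] = new Vector<Integer>(1);           // backing array of length 1
}
// Rough saving: 100,000 vectors * 9 fewer references * 4 bytes ≈ 3.6 MB,
// before any growth caused by actually adding elements.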
Well, sort of yes. Vector internally allocates room for 10 elements by default, which means that, given byte alignment and other work done by the underlying VM, you'll save a considerable amount of memory initially.
What are you trying to accomplish, though?
Yes, they will. Putting in reasonable "initial sizes" for collections is one of the first things I do when confronted with a need to radically improve memory consumption of my program.
Yes, it will. By default, Vector allocates space for 10 elements.
Vector()
Constructs an empty vector so that its internal data array has size 10
and its standard capacity increment is zero.
Therefore, it reserves memory for 10 memory references.
That being said, in real-life situations this is rarely a concern. If you are truly generating 100,000 Vectors, you need to rethink your design.