I am working on a project that involves heuristics, and I built it in Java (I probably should have made it in C).
I am running into problems with memory.
My tree is built up with object nodes, and each object contains an array, a matrix, and three integers.
I have already cut down many other values to try to save memory, but it still isn't enough.
So I was thinking that I could also cut down the matrix and turn it into an array.
However, my whole project relies on coordinates to reach a given point in the matrix.
So before I make any change, I would like to know how much (or not so much) this would affect memory usage.
Edit: The array and matrix both are made of int primitives.
The array is array[25] and the matrix is matrix[5][5].
The matrix represents the board of the game, with information of whether the field is empty, or has a certain type of piece inside it (all int).
I am talking about 16GB of RAM usage, and 25 million nodes.
I made this method, to clone arrays:
public int[] cloneArray(int[] array) {
    int i = 0;
    int[] clone = new int[array.length];
    while (i < array.length) {
        clone[i] = array[i];
        i++;
    }
    return clone;
}
Similar methods were made to clone matrices and the objects themselves.
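For comparison, a minimal sketch of the same copies using the JDK's built-in helpers (Arrays.copyOf and clone()); note that cloning an int[][] needs a per-row copy, because clone() on the outer array is only shallow:

import java.util.Arrays;

public class CopyHelpers {

    // Copies a 1D int array; equivalent to the hand-written loop above.
    static int[] copyArray(int[] array) {
        return Arrays.copyOf(array, array.length);   // or array.clone()
    }

    // Deep-copies a 2D int array: the outer array and every row.
    static int[][] copyMatrix(int[][] matrix) {
        int[][] copy = new int[matrix.length][];
        for (int i = 0; i < matrix.length; i++) {
            copy[i] = matrix[i].clone();
        }
        return copy;
    }
}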
Edit:
After finding out that profilers exist, I ran a check.
Here is a screenshot of the results:
I think these numbers make sense, because the number of nodes counted in the console is close to the number of states shown in the profiler (in the console, "estados" is the counter of the state that is currently being expanded).
So, in the profiler, we can see almost 20m states, which are the generated nodes.
Each state contains 1 array and 1 matrix.
We can see 138m int arrays, which divided by 6 gives 23m.
That fits, because each state accounts for 6 int arrays: the 5 rows of the 5x5 matrix plus the standalone array. So 5 x 23m of those arrays belong to the matrices, and the other 23m are the standalone arrays.
Am I making sense? Is this interpretation accurate?
Here is a dropbox link, so you can check the full resolution image:
https://www.dropbox.com/s/7wxz8vch1wnrsyr/Untitled.png?dl=0
Here are a couple of examples:
int[] array = new int[25];
int[][] matrix = new int[5][5];
The space occupied by the array is:
25 x 4-byte ints (the array contents) = 100 bytes
12 bytes of array object header
total: 112 bytes
A 2D int matrix in Java is actually an array of arrays, so the space occupied by the matrix is:
(5 x 4-byte ints + 12 bytes of array header) x 5 = 160 bytes for the five row arrays
5 x 4-byte references + 12 bytes of array header = 32 bytes for the outer array
total: 192 bytes
(The above assumes a 32 bit JVM, and typical array header sizes. Those are platform specific assumptions, but for any JVM platform you should be able to tie them down with specificity. And for Oracle HotSpot / OpenJDK JVMs since Java 6, the source code is available for anyone to see.)
Note of course that as the arrays / matrices get larger, the relative saving for an int[N^2] versus an int[N][N] becomes smaller.
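For illustration, a minimal sketch (assuming a 5x5 board stored row-major; the names are illustrative, not from the question) of how coordinate-based access can be kept after flattening the matrix into a single array:

public class Board {

    private static final int SIZE = 5;
    // One flat array replaces the int[5][5]; row-major layout.
    private final int[] cells = new int[SIZE * SIZE];

    int get(int row, int col) {
        return cells[row * SIZE + col];
    }

    void set(int row, int col, int value) {
        cells[row * SIZE + col] = value;
    }
}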
Your question may point to a hidden problem in your code rather than a genuine "out of memory" problem. The heap does not fill up that quickly; your code has to be extremely heavy to get there.
Still, I'll dare to say that changing the 2-dimensional matrix into an array wouldn't change the memory usage much.
Speaking of which, the two most common ways to implement higher-dimensional arrays (2 and above) are: 1) slice it into a one-dimensional array, then use the formula:
arr[a][b] = arr[a * COLS + b]
2) use pointers to pointers, so you get an array of pointers, which points to another array of pointers, and so on until the final level, which holds the real values.
That said (again, daringly), Java does not flatten the matrix behind the scenes for you; an int[][] is stored as an array of row arrays, as described above.
Anyway, I strongly suspect you have a memory leak in your code, or a non-terminating recursion, or a combination of the two. Try to rule those out before trying to implement what you suggested.
Related
I'm trying to find a counterexample to the Pólya Conjecture, which will be somewhere in the 900 millions. I'm using a very efficient algorithm that doesn't even require any factorization (similar to a Sieve of Eratosthenes, but with even more information), so a large array of ints is required.
The program is efficient and correct, but requires an array up to the x I want to check (it checks all numbers from 2 to x). So, if the counterexample is in the 900 millions, I need an array that is just as large. Java won't allow me anything over about 20 million. Is there anything I can possibly do to get an array that large?
You may want to extend the max size of the JVM Heap. You can do that with a command line option.
I believe it is -Xmx3600m (3600 megabytes)
Java arrays are indexed by int, so an array can't hold more than 2^31 - 1 elements (there are no unsigned ints). So the maximum size of an array is 2,147,483,647 entries, which for a plain int[] consumes about 8,589,934,588 bytes (roughly 8 GB).
Thus, the int-index is usually not a limitation, since you would run out of memory anyway.
In your algorithm, you should use a List (or a Map) as your data structure instead, and choose an implementation of List (or Map) that can grow beyond 2^31. This can get tricky, since the "usual" implementation ArrayList (and HashMap) uses arrays internally. You will have to implement a custom data structure; e.g. by using a 2-level array (a list/array). When you are at it, you can also try to pack the bits more tightly.
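As an illustration only (not a standard library class), a sketch of such a 2-level structure: a long-indexed int "array" built from fixed-size chunks, so the total length is no longer limited by a single Java array:

public class BigIntArray {

    private static final int CHUNK = 1 << 26;   // ~67M ints (256 MB) per chunk
    private final int[][] chunks;

    public BigIntArray(long length) {
        int chunkCount = (int) ((length + CHUNK - 1) / CHUNK);
        chunks = new int[chunkCount][];
        long remaining = length;
        for (int i = 0; i < chunkCount; i++) {
            chunks[i] = new int[(int) Math.min(CHUNK, remaining)];
            remaining -= CHUNK;
        }
    }

    public int get(long index) {
        return chunks[(int) (index / CHUNK)][(int) (index % CHUNK)];
    }

    public void set(long index, int value) {
        chunks[(int) (index / CHUNK)][(int) (index % CHUNK)] = value;
    }
}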
Java will allow up to about 2 billion array entries. It's your machine (and your limited memory) that cannot handle such a large amount.
900 million 32 bit ints with no further overhead - and there will always be more overhead - would require a little over 3.35 GiB. The only way to get that much memory is with a 64 bit JVM (on a machine with at least 8 GB of RAM) or use some disk backed cache.
If you don't need it all loaded in memory at once, you could segment it into files and store on disk.
What do you mean by "won't allow"? You are probably getting an OutOfMemoryError, so add more memory with the -Xmx command line option.
You could define your own class which stores the data in a 2d array which would be closer to sqrt(n) by sqrt(n). Then use an index function to determine the two indices of the array. This can be extended to more dimensions, as needed.
The main problem you will run into is running out of RAM. If you approach this limit, you'll need to rethink your algorithm or consider external storage (i.e. a file or database).
If your algorithm allows it:
Compute it in slices which fit into memory.
You will have to redo the computation for each slice, but it will often be fast enough.
Use an array of a smaller numeric type such as byte.
Depending on how you need to access the array, you might find that a RandomAccessFile will allow you to use a file which is larger than will fit in memory. However, the performance you get is very dependent on your access behaviour.
I wrote a version of the Sieve of Eratosthenes for Project Euler which worked on chunks of the search space at a time. It processes the first 1M integers (for example), but keeps each prime number it finds in a table. After you've iterated over all the primes found so far, the array is re-initialised and the primes found already are used to mark the array before looking for the next one.
The table maps a prime to its 'offset' from the start of the array for the next processing iteration.
This is similar in concept (if not in implementation) to the way functional programming languages perform lazy evaluation of lists (although in larger steps). Allocating all the memory up-front isn't necessary, since you're only interested in the parts of the array that pass your test for primeness. Keeping the non-primes hanging around isn't useful to you.
This method also provides memoisation for later iterations over prime numbers. It's faster than scanning your sparse sieve data structure looking for the ones every time.
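A rough sketch of that chunked idea (not the original Project Euler code, just the general shape): primes found so far are carried across segments, and only one fixed-size boolean array is ever allocated:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SegmentedSieve {

    // Returns all primes up to 'limit', sieving one segment at a time.
    public static List<Long> primesUpTo(long limit, int segmentSize) {
        List<Long> primes = new ArrayList<>();
        boolean[] composite = new boolean[segmentSize];

        for (long low = 2; low <= limit; low += segmentSize) {
            long high = Math.min(low + segmentSize - 1, limit);
            Arrays.fill(composite, false);

            // Mark multiples of primes found in earlier segments.
            for (long p : primes) {
                if (p * p > high) break;
                long start = Math.max(p * p, ((low + p - 1) / p) * p);
                for (long m = start; m <= high; m += p) {
                    composite[(int) (m - low)] = true;
                }
            }

            // Collect new primes in this segment and sieve with them as well.
            for (long n = low; n <= high; n++) {
                if (!composite[(int) (n - low)]) {
                    primes.add(n);
                    for (long m = n * n; m <= high; m += n) {
                        composite[(int) (m - low)] = true;
                    }
                }
            }
        }
        return primes;
    }
}

At the sizes discussed here the prime table itself gets large, so in practice a primitive collection (for example a growable long[]) would be preferable to an ArrayList<Long>.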
I second #sfossen's idea and #Aaron Digulla. I'd go for disk access. If your algorithm can take in a List interface rather than a plain array, you could write an adapter from the List to the memory mapped file.
Use Tokyo Cabinet, Berkeley DB, or any other disk-based key-value store. They're faster than any conventional database but allow you to use the disk instead of memory.
Could you get by with 900 million bits (maybe stored as a byte array)?
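For illustration, a rough sketch using java.util.BitSet: one flag per number costs about 900,000,000 / 8 ≈ 108 MB, versus roughly 3.4 GB for an int[] of the same length:

import java.util.BitSet;

public class BitFlags {
    public static void main(String[] args) {
        BitSet flags = new BitSet(900_000_000);        // ~108 MB of long words
        flags.set(123_456_789);                        // mark one number
        System.out.println(flags.get(123_456_789));    // prints true
    }
}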
You can try splitting it up into multiple arrays.
List<Integer> myFirstList = new ArrayList<>();
List<Integer> mySecondList = new ArrayList<>();

for (int x = 0; x <= 1000000; x++) {
    myFirstList.add(x);
}
for (int x = 1000001; x <= 2000000; x++) {
    mySecondList.add(x);
}
then iterate over them.
for (int x : myFirstList) {
    for (int y : myFirstList) {
        // Remove multiples
    }
}
// repeat for the second list
Use a memory mapped file (java.nio, available since Java 1.4) instead. Or move the sieve into a small C library and use Java JNI.
Context
I am implementing a seam carving algorithm.
I am representing the pixels in a picture as a 1D array
private int[] picture;
Each int represents the RGB of the pixel.
To access the pixels I use helper methods such as:
private int pixelToIndex(int x, int y) {return (y * width()) + x;}
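(For reference, the inverse helpers indexToX and indexToY used further down would presumably be the mirror image of this, assuming the same row-major layout, with width() being the picture width:)

// Hypothetical inverses of pixelToIndex; not shown in the original code.
private int indexToX(int index) { return index % width(); }
private int indexToY(int index) { return index / width(); }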
The alternative would be to store in a 2D array:
private int[][] picture;
The seam carving algorithm has two parts.
Firstly, it does some image processing where it finds the horizontal or vertical connected seam with lowest energy. Here the pixel accesses jump around a bit across rows.
Secondly it removes this connected seam.
For vertical seams I mark the pixel to be removed with -1 and create a new picture array skipping the removed pixels like so:
// temp is the new, smaller picture array, with one entry per surviving pixel
int i = 0, j = 0;
while (i < temp.length) {
    if (picture[j] != -1) {      // copy only pixels not marked for removal
        temp[i++] = picture[j];
    }
    j++;
}
picture = temp;
For horizontal seams, given a specific column, I shift all the pixels after the deleted pixel of that column up by one row, like so:
for (int i = 0; i < temp.length; i++) {
    int row = indexToY(i);
    int col = indexToX(i);
    int deletedCell = seam[col];                              // row of the removed pixel in this column
    if (row >= deletedCell) temp[i] = picture[i + width()];   // shift up by one row
    else temp[i] = picture[i];
}
picture = temp;
The question
Obviously the 1D array uses less physical memory because of the overhead of each subarray, but given the way I am iterating over the matrix, would the 2D array be cached more effectively by the CPU and therefore be more efficient?
How would the arrays differ in the way they are loaded into the CPU cache and RAM? Would part of the 1D array go into the L1 cache? How would the 1D and 2D arrays be loaded into memory? Would it depend on the size of the array?
An array of ints is represented just as that: an array of int values. An array of arrays adds a certain overhead. So, short answer: when dealing with really large amounts of data, plain 1-dimensional arrays are your friend.
On the other hand: only start optimizing after understanding the bottlenecks. It doesn't help much to optimize your in-memory data structure when your application spends most of its time waiting for IO, for example. And if your attempts to write "high performance" code yield complicated, hard-to-read, and thus hard-to-maintain code, you might have focused on the wrong area.
Besides: concrete performance numbers are affected by many different variables. So you want to do profiling first, and see what happens with different hardware, different data sets, and so on.
And another side note: sometimes, for the real number crunching, it can also be a viable option to implement something in C++ and call it via JNI. It really depends on the nature of your problem, how often things will be used, the response times expected by users, and so on.
Java has arrays of arrays for multi-dimensional arrays. In your case int[][] is an array of int[] (and of course int[] is an array of int). So a matrix is represented as a set of row arrays plus a reference to each row. This means that an NxM matrix occupies NxM cells for the data plus an extra array of N row references.
Since you can represent any matrix as a flat array, you'll get lower memory consumption by storing it that way.
On the other hand, the index arithmetic needed to treat a flat array as a 2D matrix is not that complex.
If you have an NxM matrix and a flat array of size NxM representing it, you can access element Matrix[x][y] as Array[x*M + y] (row-major order).
Array[i] is compact and has a high probability of being in the L1 cache, or even in a register.
Matrix[x][y] requires an extra memory read (to fetch the row reference) plus an addition.
Array[x*M + y] requires one multiplication and one addition.
So I'll put my two cents on the flat array, but it has to be tested anyway (don't forget to allow warm-up time for the JIT compiler).
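A rough (non-JMH) sketch of how such a test might look, with a warm-up phase before timing; for trustworthy numbers a proper harness such as JMH would be preferable:

public class LayoutBench {

    static final int N = 2000;

    public static void main(String[] args) {
        int[][] nested = new int[N][N];
        int[] flat = new int[N * N];

        for (int warm = 0; warm < 5; warm++) {   // warm-up so the JIT compiles the hot loops
            sumNested(nested);
            sumFlat(flat);
        }

        long t0 = System.nanoTime();
        long s1 = sumNested(nested);
        long t1 = System.nanoTime();
        long s2 = sumFlat(flat);
        long t2 = System.nanoTime();

        System.out.printf("nested: %d ms, flat: %d ms (sums %d/%d)%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, s1, s2);
    }

    static long sumNested(int[][] m) {
        long sum = 0;
        for (int x = 0; x < N; x++)
            for (int y = 0; y < N; y++)
                sum += m[x][y];
        return sum;
    }

    static long sumFlat(int[] a) {
        long sum = 0;
        for (int x = 0; x < N; x++)
            for (int y = 0; y < N; y++)
                sum += a[x * N + y];
        return sum;
    }
}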
I'm using a java program to get some data from a DB. I then calculate some numbers and start storing them in an array. The machine I'm using has 4 gigs of RAM. Now, I don't know how many numbers there will be in advance, so I use an ArrayList<Double>. But I do know there will be roughly 300 million numbers.
So, since one double is 8 bytes a rough estimate of the memory this array will consume is 2.4 gigs (probably more because of the overheads of an ArrayList). After this, I want to calculate the median of this array and am using the org.apache.commons.math3.stat.descriptive.rank.Median library which takes as input a double[] array. So, I need to convert the ArrayList<Double> to double[].
I did see many questions where this is raised, and they all mention that there is no way around looping through the entire array. That is fine, but since this also keeps both objects in memory, it brings my memory requirements up to 4.8 gigs. Now we have a problem, since the total RAM available is 4 gigs.
First of all, is my suspicion correct that the program will at some point give me a memory error (it is currently running)? And if so, how can I calculate the median without having to allocate double the memory? I want to avoid sorting the array, since the median can be found in O(n).
Your problem is even worse than you realize, because ArrayList<Double> is much less efficient than 8 bytes per entry. Each entry is actually an object, to which the ArrayList keeps an array of references. A Double object is typically 16-24 bytes (an object header plus the 8-byte double itself, plus alignment padding), and the reference to it adds another 4-8, bringing the total to roughly 24-32 bytes per entry, even excluding overhead for memory management and such.
If the constraints were a little wider, you could implement your own DoubleArray that is backed by a double[] but knows how to resize itself. However, the resizing means you'll have to keep a copy of both the old and the new array in memory at the same time, also blowing your memory limit.
That still leaves a few options though:
Loop through the input twice; once to count the entries, once to read them into a right-sized double[]. It depends on the nature of your input whether this is possible, of course (see the sketch after this list).
Make some assumption on the maximum input size (perhaps user-configurable), and allocate a double[] up front that is this fixed size. Use only the part of it that's filled.
Use float instead of double to cut memory requirements in half, at the expense of some precision.
Rethink your algorithm to avoid holding everything in memory at once.
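As an illustration of the first option, a sketch of the two-pass idea, assuming the data source (for example the database query) can be read twice; the class and method names are illustrative:

import org.apache.commons.math3.stat.descriptive.rank.Median;

public class TwoPassMedian {

    // firstPass and secondPass are two independent reads of the same data
    // source (e.g. the same database query run twice).
    static double medianOf(Iterable<Double> firstPass, Iterable<Double> secondPass) {
        int count = 0;
        for (double ignored : firstPass) {
            count++;                       // pass 1: just count the rows
        }
        double[] values = new double[count];
        int i = 0;
        for (double v : secondPass) {
            values[i++] = v;               // pass 2: fill the exactly-sized array
        }
        return new Median().evaluate(values);
    }
}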
There are many open source libraries that create dynamic arrays for primitives. One of these:
http://trove.starlight-systems.com/
The Median value is the value at the middle of a sorted list. So you don't have to use a second array, you can just do:
Collections.sort(myArray);
final double median = myArray.get(myArray.size() / 2);
And since you get that data from a DB anyways, you could just tell the DB to give you the median instead of doing it in Java, which will save all the time (and memory) for transmitting the data as well.
I agree; use the Trove4j TDoubleArrayList class (see javadoc) to store double, or TFloatArrayList for float. And by combining the previous answers, we get:
// guess initialcapacity to remove requirement for resizing
TDoubleArrayList data = new TDoubleArrayList(initialcapacity);
// fill data
data.sort();
double median = data.get(data.size()/2);
I have a problem with parallelization using Android Renderscript. I have to copy my input data into a Renderscript allocation and copy the results back. I want to do big matrix multiplications with 8x8 or 64x64 matrices. There are two problems:
1) I cannot allocate two-dimensional arrays.
2) forEach runs the kernel once per element of the allocation, e.g. if the input vector has 10 elements, the kernel will be executed 10 times.
To find a solution I did some coding. My matrix is generated randomly as a byte array, and this byte array is encoded row by row (or column by column) into an integer array. So I put a 2D array into a one-dimensional array of the same total length. On the other side (Renderscript) I have to decode it, calculate the result, and copy it back via the allocation. I want to avoid the encoding and speed up the application. Does someone know a better solution for my problem?
array[a][b] --> vector[a] or vector[b], but not vector[a*b]. Is there a possible solution?
I'm not sure that I fully understand your problem.
Let me try to make a general suggestion based on what I understand.
You can create a wrapper class that transforms the input index into the internal index via getters and setters; this wrapper can also implement java.lang.Iterable.
To help with the second part of your problem, bind the matrix Allocations to the Renderscript separately and pass rsForEach another Allocation that is sized to the number of operations you want to perform. You can use values set in this Allocation and/or the x argument of the root() function to help you find where to operate on the matrix data.
My answer for operating per row/column of an image gives more details.
I need to store a 2d matrix containing zip codes and the distance in km between each one of them. My client has an application that calculates the distances which are then stored in an Excel file. Currently, there are 952 places. So the matrix would have 952x952 = 906304 entries.
I tried to map this into a HashMap<Integer, Float>. The Integer is the hash code of the two Strings for the two places, e.g. "A" and "B". The float value is the distance in km between them.
While filling in the data I run into OutOfMemoryErrors after 205k entries. Do you have a tip on how I can store this in a clever way? I don't even know whether it's clever to keep the whole bunch in memory. My options are SQL and MS Access...
The problem is that I need to access the data very quickly and possibly very often, which is why I chose the HashMap, since its lookups run in O(1).
Thanks for your replies and suggestions!
Marco
A 2d array would be more memory efficient.
You can use a small hashmap to map the 952 places to a number between 0 and 951.
Then, just do:
float[][] distances= new float[952][952];
To look things up, just use two hash lookups to convert the two places into two integers, and use them as indexes into the 2d array.
By doing it this way, you avoid the boxing of floats, and also the memory overhead of the large hashmap.
However, 906,304 really isn't that many entries; you may just need to increase the -Xmx maximum heap size.
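A minimal sketch of that layout (names are illustrative): a small map from place name to index, plus one primitive 2D array for the distances:

import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DistanceTable {

    private final Map<String, Integer> indexOf = new HashMap<>();
    private final float[][] distances;

    public DistanceTable(List<String> places) {
        for (int i = 0; i < places.size(); i++) {
            indexOf.put(places.get(i), i);            // place name -> 0..951
        }
        distances = new float[places.size()][places.size()];
    }

    public void setDistance(String from, String to, float km) {
        distances[indexOf.get(from)][indexOf.get(to)] = km;
    }

    public float getDistance(String from, String to) {
        return distances[indexOf.get(from)][indexOf.get(to)];
    }
}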
I would have thought that you could calculate the distances on the fly. Presumably someone has already done this, so you simply need to find out what algorithm they used, and the input data; e.g. longitude/latitude of the notional centres of each ZIP code.
EDIT: There are two commonly used algorithms for finding the (approximate) geodesic distance between two points given by longitude/latitude pairs.
The Vincenty formula is based on an ellipsoid approximation. It is more accurate, but more complicated to implement.
The Haversine formula is based on a spherical approximation. It is less accurate (0.3%), but simpler to implement.
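For reference, a sketch of the Haversine formula in Java (spherical approximation, inputs in degrees, result in kilometres, using the usual 6371 km mean Earth radius):

public class Haversine {

    private static final double EARTH_RADIUS_KM = 6371.0;

    public static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }
}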
Can you simply boost the memory available to the JVM ?
java -Xmx512m ...
By default the maximum memory configuration is 64 MB. Some more tuning tips here. If you can do this then you can keep the data in-process and maximise the performance (i.e. you don't need to calculate on the fly).
I upvoted Chi's and Benjamin's answers, because they're telling you what you need to do, but while I'm here, I'd like to stress that using the hashcode of the two strings directly will get you into trouble. You're likely to run into the problem of hash collisions.
This would not be a problem if you were concatenating the two strings (being careful to use a delimiter which cannot appear in the place designators) and letting HashMap do its magic, but the method you suggested, using the hashcodes of the two strings as a key, is going to get you into trouble.
You will simply need more memory. When starting your Java process, kick it off like so:
java -Xmx256M MyClass
The -Xmx defines the max heap size, so this says the process can use up to 256 MB of memory for the heap. If you still run out, keep bumping that number up until you hit the physical limit.
I recently dealt with similar requirements for my master's thesis.
I ended up with a Matrix class that uses a double[] rather than a double[][], in order to avoid the double-dereference cost (first data[i], which is an array, then [j] within it to reach the double) while letting the VM allocate one big, contiguous chunk of memory:
public class Matrix {

    private final double[] data;
    private final int rows;
    private final int columns;

    public Matrix(int rows, int columns, double[][] initializer) {
        this.rows = rows;
        this.columns = columns;
        this.data = new double[rows * columns];
        int k = 0;
        for (int i = 0; i < initializer.length; i++) {
            System.arraycopy(initializer[i], 0, data, k, initializer[i].length);
            k += initializer[i].length;
        }
    }

    public Matrix set(int i, int j, double value) {
        data[j + i * columns] = value;
        return this;
    }

    public double get(int i, int j) {
        return data[j + i * columns];
    }
}
This class should use less memory than a HashMap since it uses a primitive array (no boxing needed): it needs only 906,304 * 8 ≈ 7 MB (for doubles) or 906,304 * 4 ≈ 3.6 MB (for floats). My 2 cents.
NB
I've omitted some sanity checks for simplicity's sake
Stephen C. has a good point: if the distances are as-the-crow-flies, then you could probably save memory by doing some calculations on the fly. All you'd need is space for the longitude and latitude of the 952 zip codes, and then you could use the Vincenty formula to do your calculation when you need it. This would make your memory usage O(n) in the number of zip codes.
Of course, that solution makes some assumptions that may turn out to be false in your particular case, i.e. that you have longitude and latitude data for your zipcodes and that you're concerned with as-the-crow-flies distances and not something more complicated like driving directions.
If those assumptions are true though, trading a few computes for a whole bunch of memory might help you scale in the future if you ever need to handle a bigger dataset.
The above suggestions regarding heap size will be helpful. However, I am not sure if you gave an accurate description of the size of your matrix.
Suppose you have 4 locations. Then you need to assess the distances between A->B, A->C, A->D, B->C, B->D, C->D. This suggests six entries in your HashMap (4 choose 2).
That would lead me to believe the actual optimal size of your HashMap is (952 choose 2)=452,676; NOT 952x952=906,304.
This is all assuming, of course, that you only store one-way relationships (i.e. from A->B, but not from B->A, since that is redundant), which I would recommend since you are already experiencing problems with memory space.
Edit: Should have said that the size of your matrix is not optimal, rather than saying the description was not accurate.
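A sketch of how that one-way (i < j) storage could look as a flat primitive array of n*(n-1)/2 entries; the integer indices are assumed to come from a place-name-to-index map as suggested in the other answers:

public class TriangularDistances {

    private final int n;
    private final float[] data;

    public TriangularDistances(int n) {
        this.n = n;
        this.data = new float[n * (n - 1) / 2];   // 952 places -> 452,676 entries
    }

    private int index(int a, int b) {
        if (a == b) throw new IllegalArgumentException("no self-distance stored");
        int i = Math.min(a, b);                   // order the pair so A->B == B->A
        int j = Math.max(a, b);
        return i * (2 * n - i - 1) / 2 + (j - i - 1);
    }

    public void set(int a, int b, float km) { data[index(a, b)] = km; }

    public float get(int a, int b) { return data[index(a, b)]; }
}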
Create a new class with 2 slots for location names. Have it always put the alphabetically first name in the first slot. Give it a proper equals and hashcode method. Give it a compareTo (e.g. order alphabetically by names). Throw them all in an array. Sort it.
Also, hash1 == hash2 does not imply object1.equals(object2). Don't ever rely on that. It's a hack.
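A minimal sketch of such a key class (the name is illustrative): the two place names are stored in alphabetical order so that (A, B) and (B, A) hash and compare the same, with equals/hashCode/compareTo kept consistent:

import java.util.Objects;

public final class PlacePair implements Comparable<PlacePair> {

    private final String first;
    private final String second;

    public PlacePair(String a, String b) {
        // Always keep the alphabetically first name in the first slot.
        if (a.compareTo(b) <= 0) { first = a; second = b; }
        else                     { first = b; second = a; }
    }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PlacePair)) return false;
        PlacePair p = (PlacePair) o;
        return first.equals(p.first) && second.equals(p.second);
    }

    @Override public int hashCode() {
        return Objects.hash(first, second);
    }

    @Override public int compareTo(PlacePair o) {
        int c = first.compareTo(o.first);
        return c != 0 ? c : second.compareTo(o.second);
    }
}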