I am trying to fill a one-dimensional array into a two-dimensional array in Java.
I did it this way; is there a better way than this?
public double[][] getResult(double[] data, int rowSize) {
    int columnSize = data.length;
    double[][] result = new double[columnSize][rowSize];
    for (int i = 0; i < columnSize; i++) {
        result[i][0] = data[i];
    }
    return result;
}
Edit: I am not going to reuse the data array. I want to set the reference of the first column in the result array to the data array. Is that possible? If yes, how can I do it?
I don't know if this makes sense in your application/context, but for performance reasons you normally want related data to be as close together in memory as possible.
For a Java array double[row][column], data that shares the same row index is stored contiguously. For example, if you have an array like double[][] d = new double[10][10] and store two values with d[0][0] = 1.0; d[1][0] = 2.0;, they end up in two different row arrays and are therefore some distance apart in memory.
But when you store them as d[0][0] = 1.0; d[0][1] = 2.0;, they sit directly next to each other in memory and are normally loaded into the fast CPU cache at the same time. When you use them iteratively, that is a huge performance gain.
So for your application, a first improvement would be this:
public double[][] getResult(double[] data, int rowSize) {
    int columnSize = data.length;
    double[][] result = new double[rowSize][columnSize];
    for (int i = 0; i < columnSize; i++) {
        result[0][i] = data[i];
    }
    return result;
}
Secondly, you have to consider whether you are going to reuse the data array or not. If you don't reuse it, you can simply set the reference of the first row in your result array to the data array, like this:
result[0] = data;
Again, this gives you another huge performance gain.
Why? Simply because you don't waste memory by copying all the values - and as you might know, memory allocation can be rather slow.
So reuse memory where you can instead of copying it around.
So my full suggestion is to go with this code:
public double[][] getResult(double[] data, int rowSize) {
    double[][] result = new double[rowSize][];
    result[0] = data;
    return result;
}
It is optional whether you specify the column size; here the second dimension is left out because result[0] is simply set to data.
Related
I have been thinking about this for a while, but I couldn't come up with a solution. In Java, we can access a reference to a row of a matrix like this:
int [][] matrix = new int[3][4];
int [] toChangeTo = new int[4];
matrix[0] = toChangeTo;
Here, if I make any changes to matrix[0], they will be reflected in the actual matrix. However, I could not find any such way to access a column. When I looked for answers online, they were:
int[][] matrix = new int[3][4];
int[] column = new int[3];
for (int r = 0; r < matrix.length; r++) {
    column[r] = matrix[r][0];
}
In this code, column does have the values of the first column of matrix, but it is not a reference to it. I want a way to get a reference to any column of a matrix, without for-loop iterations. Thanks a lot in advance.
new int[3][4] means: I'll have a cupboard with three big boxes, and in each box I'll put four smaller boxes. You can take a whole big box out of the cupboard and it still contains its four sub-boxes; but there is nothing that already groups together the first sub-box of each big box, unless you rearrange them manually.
Same thing with Java.
As for a way to do it, we could create a class:
class Column {
    private final int[][] data;
    private final int column;

    public Column(int[][] data, int c) {
        this.data = data;
        column = c;
    }

    int get(int i) {
        return data[i][column];
    }

    void set(int index, int value) {
        data[index][column] = value;
    }
}
We could add more methods like length etc.
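For illustration, here is a short usage sketch of this Column wrapper (the variable names are just examples); because it writes through to the underlying array, changes made via the wrapper are visible in the matrix:

int[][] matrix = new int[3][4];
Column first = new Column(matrix, 0);

first.set(2, 42);                  // writes through to matrix[2][0]
System.out.println(matrix[2][0]);  // prints 42
System.out.println(first.get(2));  // prints 42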
As others have mentioned, there is no way to do that directly. You can read about how a matrix is laid out in memory: each row is a contiguous sequence of memory locations.
If you want a column to be addressable by reference, store column values in rows and row values in columns, i.e. take the transpose of the matrix.
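As a sketch of that idea (the method name is my own), here is a simple transpose; after transposing, what used to be a column becomes a row you can reference directly:

static int[][] transpose(int[][] m) {
    int rows = m.length, cols = m[0].length;
    int[][] t = new int[cols][rows];
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            t[c][r] = m[r][c];
        }
    }
    return t;
}

// t[0] now holds what used to be column 0 of m
// (note: it is a copy of the data, not a live view of the original matrix).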
Currently I have this code, and my game either uses way too much memory when generating (over a GB) or, if I set the sizes low, it gives an error. The sizes are:
WORLD_SIZE_X & WORLD_SIZE_Z = 256;
WORLD_SIZE_Y = 128;
Does anyone know how I could improve this so it doesn't use so much RAM?
Thanks! :)
public void generate() {
    for (int xP = 0; xP < WORLD_SIZE_X; xP++) {
        for (int zP = 0; zP < WORLD_SIZE_Z; zP++) {
            for (int yP = 0; yP < WORLD_SIZE_Y; yP++) {
                try {
                    blocks[xP][yP][zP] = new BlockAir();
                    if (yP == 4) {
                        blocks[xP][yP][zP] = new BlockGrass();
                    }
                    if (yP < 4) {
                        blocks[xP][yP][zP] = new BlockDirt();
                    }
                    if (yP == 0) {
                        blocks[xP][yP][zP] = new BlockUnbreakable();
                    }
                } catch (Exception e) {}
            }
            //Tree Generation :D
            Random rX = new Random();
            Random rZ = new Random();
            if (rX.nextInt(WORLD_SIZE_X) < WORLD_SIZE_X / 6 && rZ.nextInt(WORLD_SIZE_Z) < WORLD_SIZE_Z / 6) {
                for (int j = 0; j < 5; j++) {
                    blocks[xP][5 + j][zP] = new BlockLog();
                }
            }
        }
    }
    generated = true;
}
Delay object creation until you really need to access one of these voxels. You can write a method (I'm assuming Block as the common superclass of all the Block classes):
Block getBlockAt( int x, int y, int z )
using code similar to what you have in your threefold loop, plus a hash map Map<Integer,Block> for storing the random stuff, e.g. trees: from x, y and z compute an integer (x*128 + y)*256 + z and use this as the key.
Also, consider that for all "air", "log", "dirt" blocks you may not need a separate object unless something must be changed at a certain block. Until then, share a single object of a kind.
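A rough sketch of that idea (the class and field names are my own, and it assumes the question's Block classes share a common supertype Block): shared singleton blocks for the regular terrain layers, plus a map that only stores the exceptions such as tree logs:

import java.util.HashMap;
import java.util.Map;

class LazyWorld {
    // One shared instance per block kind (flyweight); nothing is stored per voxel.
    private static final Block AIR = new BlockAir();
    private static final Block GRASS = new BlockGrass();
    private static final Block DIRT = new BlockDirt();
    private static final Block UNBREAKABLE = new BlockUnbreakable();
    private static final Block LOG = new BlockLog();

    // Only the "random stuff" (e.g. tree logs) is stored explicitly.
    private final Map<Integer, Block> overrides = new HashMap<>();

    Block getBlockAt(int x, int y, int z) {
        Block override = overrides.get(key(x, y, z));
        if (override != null) return override;
        if (y == 0) return UNBREAKABLE;
        if (y < 4) return DIRT;
        if (y == 4) return GRASS;
        return AIR;
    }

    void placeLog(int x, int y, int z) {
        overrides.put(key(x, y, z), LOG);
    }

    private static int key(int x, int y, int z) {
        return (x * 128 + y) * 256 + z; // same key scheme as described above
    }
}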
Since you only gave a small piece of code, I can give you two suggestions:
Compact the object size. It sounds very simple, but it is also very easy to do. Imagine you have thousands of objects in memory: if each one can be compacted to half its size, you save half the memory.
Only assign values to the array when you actually need them. This does not work if you really need a fully assigned array, so assign values to as FEW elements in the array as you can. If you can show me more code, I can help you more.
Are you sure the problem is in this method? Unless Block objects are really big, 256*256*128 ~= 8M objects should not require 1 GB ...
That said, if the blocks do not hold state, it would be more memory efficient to use an enum (or even a byte instead), as we would not need a separate object for each block:
enum Block {
air, grass, dirt, log, unbreakable;
}
Block[][][] map = ...
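For completeness, a minimal sketch (sizes taken from the question, layering rules mirroring the original loop) of filling such a map with enum values; each cell is then just a reference to one of five shared constants instead of a separate object:

Block[][][] map = new Block[256][128][256];
for (int x = 0; x < 256; x++) {
    for (int z = 0; z < 256; z++) {
        for (int y = 0; y < 128; y++) {
            if (y == 0)      map[x][y][z] = Block.unbreakable;
            else if (y < 4)  map[x][y][z] = Block.dirt;
            else if (y == 4) map[x][y][z] = Block.grass;
            else             map[x][y][z] = Block.air;
        }
    }
}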
I'm developing an image processing app in Java (Swing), which has lots of calculations.
It crashes when big images are loaded:
java.lang.OutOfMemoryError: Java heap space, due to things like:
double matrizAdj[][] = new double[18658][18658];
So I've decided to experiment with a database that is as light and fast as possible to deal with this problem. I'm thinking of using a table as if it were a 2D array, looping through it and inserting the resulting values into another table.
I'm also thinking about using JNI, but I'm not familiar with C/C++ and I don't have the time needed to learn it.
Currently, my problem is not processing time, only running out of heap space.
I would like to hear what is my best option to solve this.
EDIT :
Little explanation: first I get all white pixels from a binarized image into a list. Let's say I get 18k pixels. Then I perform a lot of operations with that list - variance, standard deviation, covariance, and so on. At the end I have to multiply two 2D arrays ([18000][2] and [2][18000]), resulting in a double[18000][18000] that is causing me trouble. After that, other operations are done with this 2D array, resulting in more than one big 2D array.
I can't deal with requiring large amounts of RAM to use this app.
Well, for trivia's sake, that matrix you're showing consumes roughly 2.6 GB of RAM. So that's a benchmark of how much memory you need should you decide to pursue that tack.
If it's efficient for you, you could store the rows of the matrix as blobs in a database. In this case you'd have 18658 rows, each with a serialized double[18658] stored in it.
I wouldn't suggest that though.
A better tack would be to use the image file directly: look at NIO and byte buffers, and use mmap to map it into your program's address space.
Then you can use things like DoubleBuffer to access the data. This lets the VM page in as much of the original file as necessary, and it also keeps the data off the Java heap (it's stored in process RAM associated with the JVM). The big benefit is that it keeps these monster data structures away from the garbage collector.
You'll still need physical RAM on the machine, of course, but it's not Java Heap RAM.
But this would likely be the most efficient way to access this data for your process.
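To make that concrete, here is a minimal sketch (the file name and sizes are placeholders, not from the question) of backing a double matrix with a memory-mapped file via NIO. Note that a single mapping is limited to 2 GB, so a matrix as large as yours would have to be split across several mappings:

import java.io.IOException;
import java.nio.DoubleBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import static java.nio.file.StandardOpenOption.*;

public class MappedMatrixDemo {
    public static void main(String[] args) throws IOException {
        int rows = 1000, cols = 1000;            // example size; a full 18658x18658 matrix needs several mappings
        long bytes = (long) rows * cols * Double.BYTES;

        Path file = Paths.get("matrix.bin");     // hypothetical backing file
        try (FileChannel ch = FileChannel.open(file, CREATE, READ, WRITE)) {
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_WRITE, 0, bytes);
            DoubleBuffer matrix = mapped.asDoubleBuffer();

            // Element (row, col) lives at index row * cols + col.
            matrix.put(3 * cols + 5, 42.0);
            System.out.println(matrix.get(3 * cols + 5)); // 42.0
        }
    }
}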
Since you stated you "can't deal with requiring large amounts of RAM to use this app", your only option is to store the big array off-RAM - disk being the most obvious choice (using a relational database would just be unnecessary overhead).
You can use a little utility class which provides persistent 2-dimensional double array functionality. Here is my solution to that, using RandomAccessFile. This solution also has the advantage that you can keep the array and reuse it when you restart the application!
Note: the presented solution is not thread-safe. Synchronization is needed if you want to access it from multiple threads concurrently.
Persistent 2-dimensional double array:
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class FileDoubleMatrix implements Closeable {

    private final int rows;
    private final int cols;
    private final long rowSize;
    private final RandomAccessFile raf;

    public FileDoubleMatrix(File f, int rows, int cols) throws IOException {
        if (rows < 0 || cols < 0)
            throw new IllegalArgumentException(
                    "Rows and cols cannot be negative!");
        this.rows = rows;
        this.cols = cols;
        rowSize = cols * 8L;
        raf = new RandomAccessFile(f, "rw");
        raf.setLength(rowSize * rows);
    }

    /**
     * Absolute get method.
     */
    public double get(int row, int col) throws IOException {
        pos(row, col);
        return get();
    }

    /**
     * Absolute set method.
     */
    public void set(int row, int col, double value) throws IOException {
        pos(row, col);
        set(value);
    }

    public void pos(int row, int col) throws IOException {
        if (row < 0 || col < 0 || row >= rows || col >= cols)
            throw new IllegalArgumentException("Invalid row or col!");
        raf.seek(row * rowSize + col * 8);
    }

    /**
     * Relative get method. Useful if you want to go through the whole array or
     * through a continuous part; use {@link #pos(int, int)} to position.
     */
    public double get() throws IOException {
        return raf.readDouble();
    }

    /**
     * Relative set method. Useful if you want to go through the whole array or
     * through a continuous part; use {@link #pos(int, int)} to position.
     */
    public void set(double value) throws IOException {
        raf.writeDouble(value);
    }

    public int getRows() { return rows; }

    public int getCols() { return cols; }

    @Override
    public void close() throws IOException {
        raf.close();
    }
}
The presented FileDoubleMatrix supports relative get() and set() methods, which are very useful if you process your whole array or a continuous part of it (e.g. you iterate over it). Use the relative methods when you can for faster operations.
Example using the FileDoubleMatrix:
final int rows = 10;
final int cols = 10;

try (FileDoubleMatrix arr = new FileDoubleMatrix(
        new File("array.dat"), rows, cols)) {
    System.out.println("BEFORE:");
    for (int row = 0; row < rows; row++) {
        for (int col = 0; col < cols; col++) {
            System.out.print(arr.get(row, col) + " ");
        }
        System.out.println();
    }

    // Process array; here we increment the values
    for (int row = 0; row < rows; row++)
        for (int col = 0; col < cols; col++)
            arr.set(row, col, arr.get(row, col) + (row * cols + col));

    System.out.println("\nAFTER:");
    for (int row = 0; row < rows; row++) {
        for (int col = 0; col < cols; col++)
            System.out.print(arr.get(row, col) + " ");
        System.out.println();
    }
} catch (IOException e) {
    e.printStackTrace();
}
More about the relative get and set methods:
The absolute get and set methods require the position (row and column) of the element to be returned or set. The relative get and set methods do not require the position; they return or set the current element. The current element is determined by the file pointer of the underlying file. The position can be set with the pos() method.
Whenever a relative get() or set() method is called, it implicitly moves the pointer to the next element, in a row-continuous manner (moving to the next element in the row, and when the end of the row is reached, moving to the first element of the next row, etc.).
For example here is how we can zero the whole array using the relative set method:
// Fill the whole array with zeros using relative set.
// First position to the beginning:
arr.pos(0, 0);
// And execute a "set zero" operation
// as many times as the array has elements:
for (int i = rows * cols; i > 0; i--)
    arr.set(0);
The relative get and set methods automatically move the pointer to the next element.
It should be obvious that in my implementation the absolute get and set methods also change the pointer, which must not be forgotten when relative and absolute get/set methods are mixed.
Another example: let's set the sum of each row as the last element of the row, but also include the last element in the sum! For this we will use a mixture of relative and absolute get/set methods:
// Start with the first row:
arr.pos(0, 0);
for (int row = 0; row < rows; row++) {
    double sum = 0;
    for (int col = 0; col < cols; col++)
        sum += arr.get(); // Relative get to calculate row sum
    // Now set the sum to the end of the row.
    // For this we have to position back, so we use the absolute set.
    arr.set(row, cols - 1, sum);
    // The absolute set method also moves the pointer, and since
    // it is the end of the row, it moves to the first element of the next row.
}
And that's all. Using the relative get/set methods we don't have to pass the "matrix indices" when processing continuous parts of the array, and also the implementation does not have to move the internal pointer which is more than handy when processing millions of elements as in your example.
I would recommend the following things, in order.
1. Investigate why your app is running out of memory. Are you creating arrays or other objects bigger than what you need? I hope you have already done that, but I still think it's worth mentioning because it should not be ignored.
2. If you think there is nothing wrong with step 1, check that you are not running with too-low memory settings, or on a 32-bit JVM.
3. If there is no issue with step 2: it is not always true that a lightweight database will give you the best performance. If you don't need to search the temporary data, you may not gain much from implementing a lightweight database. But if your application needs a lot of searching/querying of the temporary data, it may be a different case. If you don't need searching, a custom file format may be fast and efficient.
I hope this helps you solve the issue at hand :)
The simplest fix would be simply to give your program more memory. For example, if you specify -Xmx11g on your Java command line, the JVM will be able to allocate up to 11 GB of heap space - enough memory to hold several copies of your array, which is around 2.6 GB in size, in memory at a time.
If speed is really not an issue, you can do this even if you don't have enough physical memory, by allocating enough virtual memory and letting the OS swap the memory to disk.
I personally also think this is the best solution. Memory on this scale is cheaper than programmer time.
I would suggest a different approach.
Since most image processing operations are done by going over all of the pixels in some order exactly once, it's usually possible to do them on one piece of the image at a time. What I mean is that there's usually no random access to the pixels of the image. If I'm not mistaken, all of the operations you mention in your question fit this description.
Therefore, I would suggest loading the image lazily, a piece at a time. Then, implement methods that retrieve the next chunk of pixels once the previous one is processed, and feeds these chunks to the algorithms you use.
In order to support that, I would suggest converting the images to an uncompressed format that you can easily create a lazy reader for.
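As a minimal sketch of that approach (the file name, dimensions and raw 8-bit grayscale format are assumptions, not from the question), here is a one-pass reader that processes the image one row at a time and computes mean and variance without ever holding the whole image in memory:

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class StreamingStats {
    public static void main(String[] args) throws IOException {
        int width = 4096, height = 4096;          // assumed raw image dimensions
        byte[] row = new byte[width];

        long n = 0;
        double mean = 0, m2 = 0;                  // Welford's online algorithm

        try (DataInputStream in = new DataInputStream(
                new FileInputStream("image.raw"))) {
            for (int y = 0; y < height; y++) {
                in.readFully(row);                // read one row (chunk) at a time
                for (int x = 0; x < width; x++) {
                    double v = row[x] & 0xFF;
                    n++;
                    double delta = v - mean;
                    mean += delta / n;
                    m2 += delta * (v - mean);
                }
            }
        }
        System.out.println("mean = " + mean + ", variance = " + (m2 / n));
    }
}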
Not sure I would bother with a database for this; just open a temporary file and spill parts of your matrix into it as needed, then delete the file when you're done. Whatever solution you choose depends somewhat on whether your matrix library can use it. If you're using a third-party library, you're probably limited to whatever options (if any) it provides. However, if you've implemented your own matrix operations, I would definitely just go with a temporary file that I manage myself. That will be the fastest and most lightweight option.
You can use a split-and-reduce technique:
split your image into small fragments, or use a sliding-window technique:
http://forums.ni.com/t5/Machine-Vision/sliding-window-technique/td-p/2586621
cheers,
I have:
final int ROWS = 100000;
final int COLS = 2000;
long[][] m = new long[COLS][ROWS];
and then:
public void xor(int row1, int row2) {
    for (int col = 0; col < COLS; col++) {
        m[col][row1] ^= m[col][row2];
    }
}
The above function is, simplified, what takes most of the time in a run. I was wondering if I should invest time to refactor my whole program to read "m = new long[ROWS][COLS]" (instead of the other way around) for better RAM access. Or won't I win much time with it?
I am aware that I could parallelize it, perhaps with GPUs, but that's for a later stage.
In my opinion, it will definitely help to swap ROWS and COLS.
The layout of this array is (roughly) like this: [0][0], [0][1], [0][2], ... [1][0], [1][1], ... and so on. In your code, each column is a contiguous chunk of memory, and a row is not.
Since each column is 800000 bytes, and your xor method touches one element from every one of them, you are forcing a lot of cache misses.
After transposing, each row becomes a contiguous piece of memory, and since you tend to operate on rows, it should make things faster.
If you had long[][] m = new long[ROWS][COLS]; and for (int col=0; col<COLS; col++) m[row1][col] ^= m[row2][col];, you'd only need two 16000-byte-long rows to be in the cache during the execution of the xor method.
But since what I said is based mostly on theory, try to benchmark both variants and check which one is really faster.
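For reference, here is a rough benchmarking sketch of both variants (sizes taken from the question; this is a crude timing, not a proper JMH benchmark, and both copies together need roughly 3.2 GB of heap, e.g. -Xmx4g):

import java.util.Random;

public class XorLayoutBench {
    static final int ROWS = 100000, COLS = 2000;

    public static void main(String[] args) {
        long[][] byCol = new long[COLS][ROWS];   // original layout: m[col][row]
        long[][] byRow = new long[ROWS][COLS];   // transposed layout: m[row][col]
        Random r = new Random(42);
        for (int c = 0; c < COLS; c++)
            for (int ro = 0; ro < ROWS; ro++) {
                long v = r.nextLong();
                byCol[c][ro] = v;
                byRow[ro][c] = v;
            }

        long t0 = System.nanoTime();
        for (int i = 0; i < 1000; i++)
            xorByCol(byCol, i % ROWS, (i + 1) % ROWS);
        long t1 = System.nanoTime();
        for (int i = 0; i < 1000; i++)
            xorByRow(byRow, i % ROWS, (i + 1) % ROWS);
        long t2 = System.nanoTime();

        System.out.println("column layout: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("row layout:    " + (t2 - t1) / 1_000_000 + " ms");
    }

    static void xorByCol(long[][] m, int row1, int row2) {
        for (int col = 0; col < COLS; col++)
            m[col][row1] ^= m[col][row2];
    }

    static void xorByRow(long[][] m, int row1, int row2) {
        for (int col = 0; col < COLS; col++)
            m[row1][col] ^= m[row2][col];
    }
}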
I've been given some lovely Java code that has a lot of things like this (in a loop that executes about 1.5 million times).
code = getCode();
for (int intCount = 1; intCount < vA.size() + 1; intCount++)
{
    oA = (A) vA.elementAt(intCount - 1);
    if (oA.code.trim().equals(code))
        currentName = oA.name;
}
Would I see significant increases in speed from switching to something like the following?
code = getCode();
//AMap is a HashMap
strCurrentAAbbreviation = (String)AMap.get(code);
Edit: The size of vA is approximately 50. The trim shouldn't even be necessary, but it would definitely be nice to call it 50 times instead of 50 * 1.5 million times. The items in vA are unique.
Edit: At the suggestion of several responders, I tested it. Results are at the bottom. Thanks guys.
There's only one way to find out.
Ok, Ok, I tested it.
Results follow for your enlightenment:
Looping: 18391ms
Hash: 218ms
Looping: 18735ms
Hash: 234ms
Looping: 18359ms
Hash: 219ms
I think I will be refactoring that bit ..
The framework:
import java.util.HashMap;
import java.util.Random;
import java.util.Vector;

// Helper types not shown: A is a simple holder with String fields "code" and "value";
// StopWatch is a simple timer with start(), stop() and getElapsedTime() in milliseconds.
public class OptimizationTest {
    private static Random r = new Random();

    public static void main(String[] args) {
        final long loopCount = 1000000;
        final int listSize = 55;
        long loopTime = TestByLoop(loopCount, listSize);
        long hashTime = TestByHash(loopCount, listSize);
        System.out.println("Looping: " + loopTime + "ms");
        System.out.println("Hash: " + hashTime + "ms");
    }

    public static long TestByLoop(long loopCount, int listSize) {
        Vector vA = buildVector(listSize);
        A oA;
        StopWatch sw = new StopWatch();
        sw.start();
        for (long i = 0; i < loopCount; i++) {
            String strCurrentStateAbbreviation;
            int j = r.nextInt(listSize);
            for (int intCount = 1; intCount < vA.size() + 1; intCount++) {
                oA = (A) vA.elementAt(intCount - 1);
                if (oA.code.trim().equals(String.valueOf(j)))
                    strCurrentStateAbbreviation = oA.value;
            }
        }
        sw.stop();
        return sw.getElapsedTime();
    }

    public static long TestByHash(long loopCount, int listSize) {
        HashMap hm = getMap(listSize);
        StopWatch sw = new StopWatch();
        sw.start();
        String strCurrentStateAbbreviation;
        for (long i = 0; i < loopCount; i++) {
            int j = r.nextInt(listSize);
            // Look up by the same String key that was put into the map
            // (hm.get(j) with an int would autobox to Integer and always miss).
            strCurrentStateAbbreviation = (String) hm.get(String.valueOf(j));
        }
        sw.stop();
        return sw.getElapsedTime();
    }

    private static HashMap getMap(int listSize) {
        HashMap hm = new HashMap();
        for (int i = 0; i < listSize; i++) {
            String code = String.valueOf(i);
            String value = getRandomString(2);
            hm.put(code, value);
        }
        return hm;
    }

    public static Vector buildVector(long listSize) {
        Vector v = new Vector();
        for (int i = 0; i < listSize; i++) {
            A a = new A();
            a.code = String.valueOf(i);
            a.value = getRandomString(2);
            v.add(a);
        }
        return v;
    }

    public static String getRandomString(int length) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < length; i++) {
            sb.append(getChar());
        }
        return sb.toString();
    }

    public static char getChar() {
        final String alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
        int i = r.nextInt(alphabet.length());
        return alphabet.charAt(i);
    }
}
Eh, there's a good chance that you would, yes. Retrieval from a HashMap is going to be constant time if you have good hash codes.
But the only way you can really find out is by trying it.
This depends on how large your map is, and how good your hashCode implementation is (such that you do not have collisions).
You should really do some real profiling to be sure if any modification is needed, as you may end up spending your time fixing something that is not broken.
What actually stands out to me a bit more than the elementAt call is the string trimming you are doing with each iteration. My gut tells me that might be a bigger bottleneck, but only profiling can really tell.
Good luck
I'd say yes, since the above appears to be a linear search over vA.size(). How big is vA?
Why don't you use something like YourKit (or insert another profiler) to see just how expensive this part of the loop is.
Using a Map would certainly be an improvement that helps with maintaining that code later on.
Whether you can use a map depends on whether the vector contains unique codes or not. The for loop as given remembers the last object in the list with a given code; if codes are not unique, a hash map is not the solution.
For small (stable) list sizes, simply converting the list to an array of objects would show a performance increase, on top of better readability.
If none of the above holds, at least use an iterator to inspect the list, giving better readability and some (probable) performance increase.
Depends. How much memory you got?
I would guess much faster, but profile it.
I think the dominant factor here is how big vA is, since the loop needs to run n times, where n is the size of vA. With the map, there is no loop, no matter how big vA is. So if n is small, the improvement will be small. If it is huge, the improvement will be huge. This is especially true because even after finding the matching element the loop keeps going! So if you find your match at element 1 of a 2 million element list, you still need to check the last 1,999,999 elements!
Yes, it'll almost certainly be faster. Looping an average of 25 times (half-way through your 50) is slower than a hashmap lookup, assuming your vA contents are decently hashable.
However, speaking of your vA contents, you'll have to trim them as you insert them into your aMap, because aMap.get("somekey") will not find an entry whose key is "somekey ".
Actually, you should do that as you insert into vA, even if you don't switch to the hashmap solution.
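For example, a minimal sketch of building that map once from the existing vector (aMap and the field names are the ones used above), trimming the keys on insert rather than in the hot loop:

Map<String, String> aMap = new HashMap<>();
for (int i = 0; i < vA.size(); i++) {
    A oA = (A) vA.elementAt(i);
    aMap.put(oA.code.trim(), oA.name); // trim once here, not 1.5 million times
}

// Later, inside the loop:
String currentName = aMap.get(code);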