Array access optimization - java

I have a 10x10 array in Java, some of whose elements are not used, and I need to traverse all elements as part of a method. Which would be better:
Go through all elements with 2 for loops and check for null to avoid errors, e.g.
for(int y=0;y<10;y++){
for(int x=0;x<10;x++){
if(array[x][y]!=null)
//perform task here
}
}
Or would it be better to keep a list of all the used addresses... Say an arraylist of points?
Something different I haven't mentioned.
I look forward to any answers :)

Any solution you try needs to be tested in controlled conditions resembling as much as possible the production conditions. Because of the nature of Java, you need to exercise your code a bit to get reliable performance stats, but I'm sure you know that already.
This said, there are several things you may try, which I've used to optimize my Java code with success (but not on Android JVM)
for(int y=0;y<10;y++){
for(int x=0;x<10;x++){
if(array[x][y]!=null)
//perform task here
}
}
should in any case be reworked into
for(int x=0;x<10;x++){
for(int y=0;y<10;y++){
if(array[x][y]!=null)
//perform task here
}
}
Often you will get a performance improvement from caching the row reference. Let us assume the array is of the type Foo[][]:
for(int x=0;x<10;x++){
final Foo[] row = array[x];
for(int y=0;y<10;y++){
if(row[y]!=null)
//perform task here
}
}
Using final with variables was supposed to help the JVM optimize the code, but I think that modern JIT compilers can in many cases figure out on their own whether the variable is changed in the code or not. On the other hand, sometimes the following may be more efficient, although it definitely takes us into the realm of micro-optimizations:
Foo[] row;
for(int x=0;x<10;x++){
row = array[x];
for(int y=0;y<10;y++){
if(row[y]!=null)
//perform task here
}
}
If you don't need to know the element's indices in order to perform the task on it, you can write this as
for(final Foo[] row: array){
for(final Foo elem: row){
if(elem!=null)
//perform task here
}
}
Another thing you may try is to flatten the array and store the elements in a single Foo[] array, ensuring maximum locality of reference. You have no inner loop to worry about, but you need to do some index arithmetic when referencing particular array elements (as opposed to looping over the whole array). Depending on how often you do it, it may or may not be beneficial.
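A minimal sketch of the flattened layout, assuming a hypothetical FlatGrid wrapper (the class and its method names are not from the question). It shows the index arithmetic needed for random access and the single loop that replaces the nested traversal:

```java
// Hypothetical wrapper (an assumption for illustration): one flat array
// holds a width x height grid in row-major order.
final class FlatGrid<T> {
    private final int height;
    private final Object[] cells; // all cells of row x are adjacent in memory

    FlatGrid(int width, int height) {
        this.height = height;
        this.cells = new Object[width * height];
    }

    // Index arithmetic required when addressing a single element.
    private int index(int x, int y) {
        return x * height + y;
    }

    void set(int x, int y, T value) {
        cells[index(x, y)] = value;
    }

    @SuppressWarnings("unchecked")
    T get(int x, int y) {
        return (T) cells[index(x, y)];
    }

    // A full traversal needs one loop and no index arithmetic.
    int countNonNull() {
        int count = 0;
        for (Object cell : cells) {
            if (cell != null) {
                count++;
            }
        }
        return count;
    }
}
```

Whether this beats the nested Foo[][] loops for a 10x10 grid has to be measured; the sketch only shows the shape of the code.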
Since most of the elements will be not-null, keeping them as a sparse array is not beneficial for you, as you lose locality of reference.
Another problem is the null test. The null test itself doesn't cost much, but the conditional statement following it does, as you get a branch in the code and lose time on wrong branch predictions. What you can do is to use a "null object", on which the task will be possible to perform but will amount to a no-op or something equally benign. Depending on the task you want to perform, it may or may not work for you.
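A minimal sketch of the null-object idea, assuming the task is summing a numeric property. The Cell type (standing in for Foo), its EMPTY instance, and the helper names are all assumptions made for illustration:

```java
import java.util.Arrays;

// Hypothetical element type standing in for Foo.
final class Cell {
    // Shared placeholder stored instead of null; it adds nothing to a sum.
    static final Cell EMPTY = new Cell(0.0);

    private final double value;

    Cell(double value) {
        this.value = value;
    }

    double value() {
        return value;
    }
}

final class NullObjectDemo {
    // Builds a grid in which every unused slot holds Cell.EMPTY rather than null.
    static Cell[][] emptyGrid(int rows, int cols) {
        Cell[][] grid = new Cell[rows][cols];
        for (Cell[] row : grid) {
            Arrays.fill(row, Cell.EMPTY);
        }
        return grid;
    }

    // No null test, hence no data-dependent branch in the inner loop.
    static double sum(Cell[][] grid) {
        double sum = 0;
        for (Cell[] row : grid) {
            for (Cell item : row) {
                sum += item.value();
            }
        }
        return sum;
    }
}
```

This only works when the task has a neutral value such as zero for a sum; whether it is actually faster than the null test should be confirmed by profiling.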
Hope this helps.

You're better off using a List than an array, especially since you may not use the whole set of data. This has several advantages.
You're not checking for nulls and may not accidentally try to use a null object.
More memory efficient in that you're not allocating memory which may not be used.

For a hundred elements, it's probably not worth using any of the classic sparse array implementations. However, you don't say how sparse your array is, so profile it and see how much time you spend skipping null items compared to whatever processing you're doing.
(As Tom Hawtin - tackline mentions) you should, when using an array of arrays, try to loop over members of each array rather than looping over the same index of different arrays. Not all algorithms allow you to do that though.
for ( int x = 0; x < 10; ++x ) {
for ( int y = 0; y < 10; ++y ) {
if ( array[x][y] != null )
//perform task here
}
}
or
for ( Foo[] row : array ) {
for ( Foo item : row ) {
if ( item != null )
//perform task here
}
}
You may also find it better to use a null object rather than testing for null, depending on the complexity of the operation you're performing. Don't use the polymorphic version of the pattern - a polymorphic dispatch will cost at least as much as a test and branch - but if you were summing properties, having an object with a zero value is probably faster on many CPUs.
double sum = 0;
for ( Foo[] row : array ) {
for ( Foo item : row ) {
sum += item.value();
}
}
As to what applies to android, I'm not sure; again you need to test and profile for any optimisation.

Holding an ArrayList of points would be "over engineering" the problem. You have a multi-dimensional array; the best way to iterate over it is with two nested for loops. Unless you can change the representation of the data, that's roughly as efficient as it gets.
Just make sure you go in row order, not column order.

Depends on how sparse/dense your matrix is.
If it is sparse, you had better store a list of points; if it is dense, go with the 2D array. If it is in between, you can have a hybrid solution storing a list of sub-matrices.
This implementation detail should be hidden within a class anyway, so your code can also anytime convert between any of these representations.
I would discourage you from settling on any of these solutions without profiling with your real application.

I agree an array with a null test is the best approach unless you expect sparsely populated arrays.
Reasons for this:
1- More memory efficient for dense arrays (a list needs to store the index)
2- More computationally efficient for dense arrays (You need only compare the value you just retrieved to NULL, instead of having to also get the index from memory).
Also, a small suggestion, but in Java especially you are often better off faking a multi-dimensional array with a 1D array where possible (square/rectangular arrays in 2D). Bounds checking only happens once per iteration, instead of twice. Not sure if this still applies in the Android VMs, but it has traditionally been an issue. Regardless, you can ignore it if the loop is not a bottleneck.

Related

Array iteration with static final limits

I have an array:
final int[] exampleArray = new int[ID_DATA_ARRAY_SIZE];
And I can iterate that array several ways, for example:
Way 1:
for (int i = 0; i < exampleArray.length; i++) {
// code where I use 'i' index
}
Way 2:
for (int i = 0; i < ID_DATA_ARRAY_SIZE; i++) {
// code where I use 'i' index
}
Which way is better? Are there any other better ways to do it?
If you don't need i for anything else than extracting the element, then the enhanced for loop looks a bit nicer:
for(int element : exampleArray) {
//code that uses element
}
If you are using i for both accessing the array, and something else, then I would argue Way 1 is best:
for (int i = 0; i < exampleArray.length; i++) {
// code where I use 'i' index
}
The reason is that the next time someone looks at the code, that person will immediately see that you are iterating up to the length of the array. If you go for way 2 (using a constant), the reader might wonder if that constant really is the length of your array.
Tackling both performance, and code readability, way 2 is better.
Rated by performance, by using exampleArray.length you are calling upon a "member" variable which requires additional java bytecode to request when compared to calling a "local" variable. But, the difference in performance is extremely minuscule and you would never notice it unless you were making an extreme amount of calculations.
Rated by readability, ID_DATA_ARRAY_SIZE lays out your intent for whomever is reading, which is more important than it may seem. Yet, too many programmers lay out nonsensical or ambiguous variable names, and it makes reading their code lacking in naturalness. Naming variables and functions in a way that makes sense to our minds in an organic way makes the code much simpler to deal with for yourself in the future, and anyone else, making it a good practice.
The fundamental difference between the two approaches, as I see it, is as follows:
In Way 1: you use the member exampleArray.length in the loop condition
In Way 2: you use the constant ID_DATA_ARRAY_SIZE in the loop condition
Obviously way 2 is superior in terms of performance.
This is because you are accessing a constant rather than accessing a member variable of the exampleArray object. This advantage is realized in every iteration of the for loop where the value of the length member is accessed.
It is largely a matter of personal taste which way you do it, but whenever you work with an array it is better to check the array itself for null before using it.

Fastest way to access a table of data Java

Basically I am amidst a friendly code optimisation battle (to get the fastest program), I am trying to find a way that is faster to access a dictionary of hard coded data than a multidimensional array.
e.g to get the value for x:
int x = array[v1][v2][v3] ;
I have read that nested switch statements in a custom array may possibly be faster. Or is there a way I can possibly access memory more directly similar to pointers in C. Any ideas appreciated!
My 'competitor' is using a truth table and idea is to find something faster!
Many Thanks
Sam
If the array is regular in shape (i.e. MxNxK for some fixed M, N and K), you could try flattening it to achieve better locality of reference:
int array[] = new int[M*N*K];
...
int x = array[v1*N*K + v2*K + v3];
Also, if the entire array doesn't fit in the CPU cache, you might want to examine the patterns in which the array is accessed, to perhaps re-order the indices or change your code to make better use of the caches.

How do you (get around) dynamically naming variables?

I'm not sure if I'm using the right nomenclature, so I'll try to make my question as specific as possible. That said, I imagine this problem comes up all the time, and there are probably several different ways to deal with it.
Let's say I have an array (vector) called main of 1000 random years between 1980 and 2000 and that I want to make 20 separate arrays (vectors) out of it. These arrays would be named array1980, array1981, etc., would also have length 1000 but would contain 1s where the index in the name was equal to the corresponding element in main and 0s elsewhere. In other words:
for(int i=0; i<1000; i++){
if(main[i]==1980){
array1980[i]=1;
} else {
array1980[i]=0;
}
}
Of course, I don't want to have to write twenty of these, so it'd be good if I could create new variable names inside a loop. The problem is that you can't generally assign variable names to expressions with operators, e.g.,
String("array"+ j)=... # returns an error
I'm currently using Matlab the most, but I can also do a little in Java, c++ and python, and I'm trying to get an idea for how people go about solving this problem in general. Ideally, I'd like to be able to manipulate the individual variables (or sub-arrays) in some way that the year remains in the variable name (or array index) to reduce the chance for error and to make things easier to deal with in general.
I'd appreciate any help.
boolean[][] array = new boolean[1000][21];
for (int i=0; i < 1000; i++) {
array[i][main[i]-1980] = true;
}
In many cases a map will be a good solution, but here you could use a two-dimensional array of booleans, since the index range is known in advance (0-20), contiguous, and enumerable.
Some languages, Java among them, initialize every element of a boolean array to false, so you only need to set to true the values to which main[i] points.
Since main[i] returns numbers from 1980 to 2000, main[i]-1980 will return 1980-1980=0 to 2000-1980=20. To find your values, you have to add 1980 to the second index, of course.
The general solution to this is to not create variables with dynamic names, but to instead create a map. Exactly how that's done will vary by language.
For Java, it's worth looking at the map section of the Sun collections tutorial for a start.
Don Roby's answer is correct, but I would like to add to it.
You can use maps for this purpose, and it would look something like this:
Map<Integer,ArrayList<Integer>> yearMap = new HashMap<Integer,ArrayList<Integer>>();
yearMap.put(1980,new ArrayList<Integer>());
for (int i = 0; i < 1000; i++){
yearMap.get(1980).add(0);
}
yearMap.get(1980).set(999,1);
System.out.println(yearMap.get(1980).get(999));
But there is probably a better way to solve the problem that you have. You should not ask how to use X to solve Y, but how to solve Y.
So, what is it, that you are trying to solve?

Starting Size for an ArrayList

I want to use an ArrayList (or some other collection) like how I would use a standard array.
Specifically, I want it to start with an initial size (say, SIZE), and be able to set elements explicitly right off the bat,
e.g.
array[4] = "stuff";
could be written
array.set(4, "stuff");
However, the following code throws an IndexOutOfBoundsException:
ArrayList<Object> array = new ArrayList<Object>(SIZE);
array.set(4, "stuff"); //wah wahhh
I know there are a couple of ways to do this, but I was wondering if there was one that people like, or perhaps a better collection to use. Currently, I'm using code like the following:
ArrayList<Object> array = new ArrayList<Object>(SIZE);
for(int i = 0; i < SIZE; i++) {
array.add(null);
}
array.set(4, "stuff"); //hooray...
The only reason I even ask is because I am doing this in a loop that could potentially run a bunch of times (tens of thousands). Given that the ArrayList resizing behavior is "not specified," I'd rather it not waste any time resizing itself, or memory on extra, unused spots in the Array that backs it. This may be a moot point, though, since I will be filling the array (almost always every cell in the array) entirely with calls to array.set(), and will never exceed the capacity?
I'd rather just use a normal array, but my specs are requiring me to use a Collection.
The initial capacity means how big the array is. It does not mean there are elements there. So size != capacity.
In fact, you can use an array, and then use Arrays.asList(array) to get a collection.
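A minimal sketch of two ways to get a list whose set() works immediately, assuming a hypothetical PresizedLists helper class. Arrays.asList and Collections.nCopies are standard JDK calls; the class and method names are illustrative:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class; only the JDK calls inside are standard API.
final class PresizedLists {
    // Fixed-size List view backed by a new array:
    // set() works, while add() and remove() throw UnsupportedOperationException.
    static List<Object> fixedSize(int size) {
        return Arrays.asList(new Object[size]);
    }

    // Resizable list pre-filled with 'size' nulls, so set() works right away.
    static List<Object> prefilled(int size) {
        return new ArrayList<Object>(Collections.<Object>nCopies(size, null));
    }
}
```

With either list, a call such as list.set(4, "stuff") succeeds for SIZE greater than 4. The first variant never resizes; the second behaves like the add(null) loop in the question but fills the list in one step.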
I recommend a HashMap
HashMap<Integer, String> hash = new HashMap<>();
hash.put(4,"Hi");
Considering that your main point is memory, you could do manually what the Java ArrayList does internally, since ArrayList does not let you control how much it grows. You can do the following:
1) Create a vector.
2) If the vector is full, create a new vector with the old vector's size plus as much as you want.
3) Copy all items from the old vector to your new vector.
This way, you will not waste memory.
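The steps above can be sketched as follows, assuming a hypothetical GrowableArray class with a caller-chosen growth increment (the class, its fields, and its methods are illustrative, not an existing API). Arrays.copyOf performs the allocate-and-copy of steps 2 and 3:

```java
import java.util.Arrays;

// Hypothetical class illustrating manual growth by a fixed increment.
final class GrowableArray {
    private Object[] items;
    private int size;
    private final int increment;

    GrowableArray(int initialCapacity, int increment) {
        if (initialCapacity < 0 || increment <= 0) {
            throw new IllegalArgumentException("capacity must be >= 0 and increment > 0");
        }
        this.items = new Object[initialCapacity]; // step 1: create the backing array
        this.increment = increment;
    }

    void add(Object item) {
        if (size == items.length) {
            // Steps 2 and 3: allocate a larger array and copy the old items into it.
            items = Arrays.copyOf(items, items.length + increment);
        }
        items[size++] = item;
    }

    Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return items[index];
    }

    int size() {
        return size;
    }

    int capacity() {
        return items.length;
    }
}
```

A small increment keeps unused capacity low but copies more often, so the trade-off between memory and time still has to be chosen for the workload.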
Or you can implement a List (not vector) struct. I think Java already has one.
Yes, a HashMap would be a great idea.
Alternatively, you could just start the array with a capacity big enough for your purpose.

What is the fastest way to find an array within another array in Java?

Is there any equivalent of String.indexOf() for arrays? If not, is there any faster way to find an array within another other than a linear search?
Regardless of the elements of your arrays, I believe this is not much different than the string search problem.
This article provides a general intro to the various known algorithms.
Rabin-Karp and KMP might be your best options.
You should be able to find Java implementations of these algorithms and adapt them to your problem.
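As an illustration of adapting one of these algorithms, here is a minimal sketch of Knuth-Morris-Pratt search over int arrays. The KmpSearch class and its method names are assumptions; the element type is fixed to int only to keep the example short:

```java
// Hypothetical class: KMP search for one int array inside another.
final class KmpSearch {
    // Returns the first index at which 'needle' occurs in 'haystack', or -1.
    static int indexOf(int[] haystack, int[] needle) {
        if (needle.length == 0) {
            return 0; // same convention as String.indexOf("")
        }
        int[] fail = failureTable(needle);
        int matched = 0; // number of needle elements matched so far
        for (int i = 0; i < haystack.length; i++) {
            // On a mismatch, fall back to the longest border instead of restarting.
            while (matched > 0 && haystack[i] != needle[matched]) {
                matched = fail[matched - 1];
            }
            if (haystack[i] == needle[matched]) {
                matched++;
            }
            if (matched == needle.length) {
                return i - needle.length + 1;
            }
        }
        return -1;
    }

    // fail[i] is the length of the longest proper prefix of needle[0..i]
    // that is also a suffix of needle[0..i].
    private static int[] failureTable(int[] needle) {
        int[] fail = new int[needle.length];
        int k = 0;
        for (int i = 1; i < needle.length; i++) {
            while (k > 0 && needle[i] != needle[k]) {
                k = fail[k - 1];
            }
            if (needle[i] == needle[k]) {
                k++;
            }
            fail[i] = k;
        }
        return fail;
    }
}
```

The search never moves backwards in the haystack, so the running time is linear in the combined length of both arrays even when most elements match.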
List<Object> list = Arrays.asList(myArray);
Collections.sort(list);
int index = Collections.binarySearch(list, find);
OR
public static int indexOf(Object[][] array, Object[] find){
for (int i = 0; i < array.length; i++){
if (Arrays.equals(array[i], find)){
return i;
}
}
return -1;
}
OR
public static int indexOf(Object[] array, Object find){
for (int i = 0; i < array.length; i++){
if (array[i].equals(find)){
return i;
}
}
return -1;
}
OR
Object[] array = ...
int index = Arrays.asList(array).indexOf(find);
As far as I know, there is NO way to find an array within another without a linear search. String.indexOf uses a linear search, just inside a library.
You should write a little library called indexOf that takes two arrays, then you will have code that looks just like indexOf.
But no matter how you do it, it's a linear search under the covers.
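For object arrays, such a helper does not have to be written from scratch. A minimal sketch, assuming a hypothetical SubArrays class, wraps both arrays as lists and delegates to the JDK's Collections.indexOfSubList, which performs a linear scan:

```java
import java.util.Arrays;
import java.util.Collections;

// Hypothetical helper class; Collections.indexOfSubList is standard JDK API.
final class SubArrays {
    // Returns the index of the first occurrence of 'target' within 'source', or -1.
    // Elements are compared with equals(); primitive arrays are not supported.
    static int indexOf(Object[] source, Object[] target) {
        return Collections.indexOfSubList(Arrays.asList(source), Arrays.asList(target));
    }
}
```

Arrays.asList only creates a view over each array, so no elements are copied.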
edit:
After looking at #ahmadabolkader's answer I kind of take this back. Although it's still a linear search, it's not as simple as just "implement it" unless you are restricted to fairly small test sets/results.
The problem comes when you want to see if ...aaaaaaaaaaaaaaaaaab fits into a string of (x1000000)...aaaaaaaaab (in other words, strings that tend to match most places in the search string).
My thought was that as soon as you found a first character match you'd just check all subsequent characters one-on-one, but that performance would degrade terrifyingly when most of the characters matched most of the time. There was a rolling hash method in #a12r's answer that sounded much better if this is a real-world problem and not just an assignment.
I'm just going to vote for #a12r's answer because of those awesome Wikipedia references.
The short answer is no - there is no faster way to find an array within an array by using some existing construct in Java. Based on what you described, consider creating a HashSet of arrays instead of an array of arrays.
Normally the way you find things in collections in java is
put them in a hashmap (dictionary) and look them up by their hash.
loop through each object and test its equality
(1) won't work for you because an array object's hash won't tell you that the contents are the same. You could write some sort of wrapper that would create a hashcode based on the contents (you'd also have to make sure equals returned values consistent with that).
(2) also will require a bit of work because object equality for arrays will only test that the objects are the same. You'd need to wrap the arrays with a test of the contents.
So basically, not unless you write it yourself.
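A minimal sketch of the wrapper described in (1), assuming a hypothetical ArrayKey class. It relies on the standard Arrays.equals and Arrays.hashCode so that equality and hash code are both based on the contents:

```java
import java.util.Arrays;

// Hypothetical wrapper giving an array content-based equals() and hashCode().
final class ArrayKey {
    private final Object[] elements;

    ArrayKey(Object[] elements) {
        // Defensive copy so later changes to the caller's array cannot alter the key.
        this.elements = elements.clone();
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ArrayKey
                && Arrays.equals(elements, ((ArrayKey) other).elements);
    }

    @Override
    public int hashCode() {
        // Consistent with equals(): equal contents yield equal hash codes.
        return Arrays.hashCode(elements);
    }
}
```

Instances can then be stored in a HashSet or used as HashMap keys, and a lookup with a new ArrayKey wrapping equal contents will find the stored entry.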
Do you mean you have an array whose elements are themselves arrays? If that is the case and the elements are sorted, you might be able to use binarySearch from java.util.Arrays.
