Starting Size for an ArrayList

I want to use an ArrayList (or some other collection) like how I would use a standard array.
Specifically, I want it to start with an initial size (say, SIZE), and be able to set elements explicitly right off the bat,
e.g.
array[4] = "stuff";
could be written
array.set(4, "stuff");
However, the following code throws an IndexOutOfBoundsException:
ArrayList<Object> array = new ArrayList<Object>(SIZE);
array.set(4, "stuff"); //wah wahhh
I know there are a couple of ways to do this, but I was wondering if there was one that people like, or perhaps a better collection to use. Currently, I'm using code like the following:
ArrayList<Object> array = new ArrayList<Object>(SIZE);
for(int i = 0; i < SIZE; i++) {
array.add(null);
}
array.set(4, "stuff"); //hooray...
The only reason I even ask is because I am doing this in a loop that could potentially run a bunch of times (tens of thousands). Given that the ArrayList resizing behavior is "not specified," I'd rather it not waste any time resizing itself, or memory on extra, unused spots in the Array that backs it. This may be a moot point, though, since I will be filling the array (almost always every cell in the array) entirely with calls to array.set(), and will never exceed the capacity?
I'd rather just use a normal array, but my specs are requiring me to use a Collection.

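The initial capacity is how big the backing array is. It does not mean there are elements in the list, so size != capacity: a list created with new ArrayList<Object>(SIZE) has capacity SIZE but size 0, which is why set(4, ...) throws.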
In fact, you can use an array, and then use Arrays.asList(array) to get a collection.
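As a minimal sketch of the point above (the class name PresizedList and the value of SIZE are made up for illustration): wrapping a pre-sized array with Arrays.asList gives a fixed-size List whose size already equals the array length, so set(4, ...) works immediately. If the list also has to grow later, one alternative is to copy Collections.nCopies(SIZE, null) into an ArrayList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PresizedList {
    static final int SIZE = 10; // illustrative value, not from the question

    // Fixed-size list backed directly by a new array of SIZE nulls.
    static List<Object> fixedSize() {
        return Arrays.asList(new Object[SIZE]);
    }

    // Resizable list that already contains SIZE null elements.
    static List<Object> prefilled() {
        return new ArrayList<>(Collections.<Object>nCopies(SIZE, null));
    }

    public static void main(String[] args) {
        List<Object> a = fixedSize();
        a.set(4, "stuff");  // works: size() is already SIZE
        List<Object> b = prefilled();
        b.set(4, "stuff");  // works as well
        b.add("extra");     // allowed here; a.add(...) would throw UnsupportedOperationException
        System.out.println(a.size() + " " + b.size());
    }
}
```

The Arrays.asList variant avoids the fill loop entirely, at the cost of not supporting add or remove.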

I recommend a HashMap:
HashMap<Integer, String> hash = new HashMap<>();
hash.put(4, "Hi");

Considering that your main concern is memory, you could do manually what the Java ArrayList does, since ArrayList does not let you control how much it grows by. You can do the following:
1) Create an array.
2) If the array is full, create a new array of the old size plus however much extra you want.
3) Copy all items from the old array to the new array.
This way, you will not waste memory.
Or you can use a linked list structure instead of an array. I think Java already has one (LinkedList).

Yes, a HashMap would be a great idea.
Otherwise, you could just start the list with a capacity large enough for your purpose.

Related

Best way to read data from a file and store them

I am reading data from a file of students where each line is a student, I am then turning that data into a student object and I want to return an array of student objects. I am currently doing this by storing each student object in an arraylist then returning it as a standard Student[]. Is it better to use an arraylist to have a dynamic size array then turn it into a standard array for the return or should I first count the number of lines in the file, make a Student[] of that size then just populate that array. Or is there a better way entirely to do this.
Here is the code if it helps:
public Student[] readStudents() {
String[] lineData;
ArrayList<Student> students = new ArrayList<>();
while (scanner.hasNextLine()) {
lineData = scanner.nextLine().split(" ");
students.add(new Student(lineData));
}
return students.toArray(new Student[students.size()]);
}

Which is better depends on what you need and on your data set size. Needs could be: simplest code, fastest load, least memory usage, fast iteration over the resulting data set... Options could be:
For one-off script or small data sets (tens of thousands of elements) probably anything would do.
Maybe do not store elements at all, and process them as you read them? - least memory used, good for very large data sets.
Use pre-allocated array - if you know data set size in advance - guaranteed least memory allocations - but counting elements itself might be expensive.
If unsure, use an ArrayList to collect the elements. It works most efficiently if you can estimate an upper bound for your data set size in advance, say you know that normally there are no more than 5000 elements. In that case create the ArrayList with an initial capacity of 5000. It will still resize itself if the backing array fills up.
LinkedList - probably the most conservative - it allocates space as you go but required memory per element is larger and iteration is slower than for arrays or ArrayLists.
Your own data structure optimized for your needs. Usually the effort is not worth it, so use this option only when you already know the problem you want to solve.
Note on ArrayList: it starts by pre-allocating a backing array with a set number of slots, which are then filled without any memory reallocation. As soon as the backing array is full, a new larger one is allocated and all elements are copied into it. The growth factor is an implementation detail (the Javadoc only promises amortized constant-time adds; OpenJDK grows the array by about half its current size rather than doubling it). Normally this is not a problem, but it can cause an out-of-memory error if the new array cannot get a large enough contiguous memory block.
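To illustrate the capacity hint described above, here is a small sketch. The class and method names (CapacityHint, collect) and the numbers are assumptions chosen for the example; only the ArrayList constructor and trimToSize() are real API.

```java
import java.util.ArrayList;
import java.util.List;

class CapacityHint {
    // Collects n items into a list whose backing array was sized up front.
    static List<Integer> collect(int n, int expectedUpperBound) {
        // The argument is a capacity, not a size: the list starts empty.
        List<Integer> items = new ArrayList<>(expectedUpperBound);
        for (int i = 0; i < n; i++) {
            items.add(i); // no reallocation while n <= expectedUpperBound
        }
        return items;
    }

    public static void main(String[] args) {
        List<Integer> items = collect(3000, 5000);
        System.out.println(items.size()); // only added elements count toward size()
        // Optionally release the unused slots once loading is finished.
        ((ArrayList<Integer>) items).trimToSize();
    }
}
```

If the estimate turns out too low, the list simply grows as usual, so the hint affects only performance, never correctness.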

Use an array for a fixed size array. For students that is not the case, so an ArrayList is more suited, as you saw on reading. A conversion from ArrayList to array is superfluous.
Then, use the most general type, here the List interface. Whether the implementation is ArrayList or LinkedList is then a technical implementation question. You might later switch to another implementation with different runtime behavior.
But your code can handle all kinds of Lists which is really a powerful generalisation.
Here is an incomplete list of useful interfaces with some implementations:
List - ArrayList (fast, a tiny bit memory overhead), LinkedList
Set - HashSet (fast), TreeSet (is a SortedSet)
Map - HashMap (fast), TreeMap (is a SortedMap), LinkedHashMap (order of inserts)
So:
public List<Student> readStudents() {
List<Student> students = new ArrayList<>();
while (scanner.hasNextLine()) {
String[] lineData = scanner.nextLine().split(" ");
students.add(new Student(lineData));
}
return students;
}
In a code review one would comment on the constructor Student(String[] lineData) which risks a future change in data.

Java array vs Array

It's been a while since I took a proper course on Java and I'm hoping someone can confirm/correct my understanding.
Consider the variables int[] arr and ArrayList arrLi:
arr has pointers directly to each component. arr[3] goes directly to the fourth element whereas arrLi.get(3) would have to traverse through the first three elements to get to the fourth.
Reassigning a component, such as arr[3] = 0, does not rewrite the entire array.
Each time you want to add an element to arr, you would need to rewrite the entire array. For example, if there are 100 elements in arr, you have to make a new array with size 101 and copy all the elements from arr then add the new one. If you later decide to add yet another element, you'd have to go through the whole process again to add the 102-nd element.
arrLi adds (to end, front, or middle) and removes elements very efficiently because all it does is add/remove nodes and adjust the links.

ArrayList is a resizable array implementation of the List interface. Therefore fetching an element does not require traversing the previous elements.
Rewriting a value does not require rewriting the entire array in either case.
Yes, an array does need to be recreated if you need more space.
While it is called a list, ArrayList internally behaves much more like an array. ArrayList sometimes needs to be resized, meaning the underlying array needs to be recreated. However, this happens infrequently enough to not affect the average performance of an ArrayList over an array by much.
Please refer to https://docs.oracle.com/javase/8/docs/api/java/util/ArrayList.html for more information

How to add an element at the end of an array?

I want to know how to add or append a new element to the end of an array. Is there any simple way to add the element at the end? I know how to use a StringBuffer but I don't know how to use it to add an element to an array. I would prefer to do it without an ArrayList or List. I wonder if the StringBuffer will work on integers.

You cannot add an element to an array, since arrays in Java are fixed-length. However, you could build a new array from the existing one using Arrays.copyOf(array, size):
public static void main(String[] args) {
int[] array = new int[] {1, 2, 3};
System.out.println(Arrays.toString(array));
array = Arrays.copyOf(array, array.length + 1); //create new array from old array and allocate one more element
array[array.length - 1] = 4;
System.out.println(Arrays.toString(array));
}
I would still recommend to drop working with an array and use a List.

Arrays in Java have a fixed length that cannot be changed. So Java provides classes that allow you to maintain lists of variable length.
Generally, there is the List<T> interface, which represents a list of instances of the class T. The easiest and most widely used implementation is the ArrayList. Here is an example:
List<String> words = new ArrayList<String>();
words.add("Hello");
words.add("World");
words.add("!");
List.add() simply appends an element to the list and you can get the size of a list using List.size().

To get the terminology right: arrays are fixed-length structures (the length of an existing array cannot be altered), so the expression "add at the end" is meaningless by itself.
What you can do is create a new array one element larger and fill in the new element in the last slot:
public static int[] append(int[] array, int value) {
int[] result = Arrays.copyOf(array, array.length + 1);
result[result.length - 1] = value;
return result;
}
This quickly gets inefficient, as each time append is called a new array is created and the old array's contents are copied over.
One way to drastically reduce the overhead is to create a larger array and keep track of the index up to which it is actually filled. Adding an element becomes as simple as filling the next index and incrementing the index. If the array fills up completely, a new array is created with more free space.
And guess what ArrayList does: exactly that. So when a dynamically sized array is needed, ArrayList is a good choice. Don't reinvent the wheel.
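For readers who want to see the idea spelled out, here is a rough sketch of such a growable int array. IntArrayBuilder is a hypothetical class written for this example, and the starting capacity of 8 and the doubling policy are arbitrary choices, not what ArrayList itself is specified to do.

```java
import java.util.Arrays;

class IntArrayBuilder {
    private int[] data = new int[8]; // spare capacity; 8 is an arbitrary start
    private int size = 0;            // number of slots actually filled

    void append(int value) {
        if (size == data.length) {
            // Full: allocate a larger array and copy, so copies stay rare.
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = value;
    }

    int size() {
        return size;
    }

    // Returns a copy trimmed to the filled part.
    int[] toArray() {
        return Arrays.copyOf(data, size);
    }

    public static void main(String[] args) {
        IntArrayBuilder builder = new IntArrayBuilder();
        for (int i = 1; i <= 20; i++) {
            builder.append(i);
        }
        System.out.println(Arrays.toString(builder.toArray()));
    }
}
```

Appending 20 values this way triggers only two copies (at 8 and 16 elements) instead of 20.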

The OP says, for unknown reasons, "I prefer it without an arraylist or list."
If the type you are referring to is a primitive (you mention integers, but you don't say if you mean int or Integer), then you can use one of the NIO Buffer classes like java.nio.IntBuffer. These act a lot like StringBuffer does - they act as buffers for a list of the primitive type (buffers exist for all the primitives but not for Objects), and you can wrap a buffer around an array and/or extract an array from a buffer.
Note that the javadocs say, "The capacity of a buffer is never negative and never changes." It's still just a wrapper around an array, but one that's nicer to work with. The only way to effectively expand a buffer is to allocate() a larger one and use put() to dump the old buffer into the new one.
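A possible way to put this together is sketched below. The helper names grow and append are invented for the example; the IntBuffer calls used (allocate, flip, put, hasRemaining, array, position) are the standard java.nio ones.

```java
import java.nio.IntBuffer;
import java.util.Arrays;

class IntBufferAppend {
    // Returns a buffer with twice the capacity holding the same written values.
    static IntBuffer grow(IntBuffer old) {
        IntBuffer bigger = IntBuffer.allocate(Math.max(1, old.capacity() * 2));
        old.flip();      // switch the old buffer from writing to reading
        bigger.put(old); // bulk copy of everything written so far
        return bigger;
    }

    // Appends a value, replacing the buffer with a larger one when it is full.
    static IntBuffer append(IntBuffer buf, int value) {
        if (!buf.hasRemaining()) {
            buf = grow(buf);
        }
        buf.put(value);
        return buf;
    }

    public static void main(String[] args) {
        IntBuffer buf = IntBuffer.allocate(2);
        for (int i = 1; i <= 5; i++) {
            buf = append(buf, i);
        }
        // array() exposes the backing array; position() is the count written.
        int[] result = Arrays.copyOf(buf.array(), buf.position());
        System.out.println(Arrays.toString(result));
    }
}
```

Because the buffer may be replaced, callers have to keep the reference returned by append, much like with Arrays.copyOf.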
If it's not a primitive, you should probably just use List, or come up with a compelling reason why you can't or won't, and maybe somebody will help you work around it.

As many others pointed out, if you are trying to put a new element at the end of the array, then something like array[array.length-1]=x; will do, but this replaces the existing last element.
For continuous addition to the array, you can keep track of the next free index and keep adding elements until you reach the end. Have the function that does the addition return the next index, which in turn tells you how many more elements can fit in the array.
Of course, in both cases the size of the array is predefined. Vector can be your other option since you do not want ArrayList; it gives you the same features and functions and additionally takes care of growing the size.
As for the part where you want to go from a StringBuffer to an array, I believe what you are looking for is the getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin) method. Look into it; that might clear up your doubts. Again, I would like to point out that after getting an array out of it, you can still only replace the last existing element (a character in this case).

one-liner with streams
Stream.concat(Arrays.stream( array ), Stream.of( newElement )).toArray();
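One caveat with the one-liner as written: for an object array, the no-argument toArray() returns Object[], and for an int[] Arrays.stream produces an IntStream, which Stream.concat does not accept. A sketch of typed variants follows; the class name StreamAppend and the method name append are made up for illustration.

```java
import java.util.Arrays;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class StreamAppend {
    // Object arrays: pass an array constructor so the result keeps its type.
    static String[] append(String[] array, String newElement) {
        return Stream.concat(Arrays.stream(array), Stream.of(newElement))
                     .toArray(String[]::new);
    }

    // Primitive int arrays: use the IntStream counterparts.
    static int[] append(int[] array, int newElement) {
        return IntStream.concat(Arrays.stream(array), IntStream.of(newElement))
                        .toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(append(new String[] {"a", "b"}, "c")));
        System.out.println(Arrays.toString(append(new int[] {1, 2, 3}, 4)));
    }
}
```

Like Arrays.copyOf, both variants allocate a new array on every call.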

is there any faster way to generate List of N integer

Well, I know it is a very novice question, but nothing is coming to mind. Currently I am trying this, but it is the least efficient way for such a big number. Can anyone help me?
int count = 66000000;
LinkedList<Integer> list = new LinkedList<Integer>();
for (int i=1;i<=count;i++){
list.add(i);
//System.out.println(i);
}
EDIT:
Actually, I have to perform an operation on the whole list (queue) repeatedly (say, on some condition, remove some elements and add them again), so having to iterate over the whole list became very slow; with this many elements it took more than 10 minutes.

The size of your output is O(n), therefore it is literally impossible to have an algorithm that populates your list with better than O(n) time complexity.
You're spending a whole lot more time just printing your numbers to the screen than you actually are spending generating the list. If you really want to speed this code up, remove the
System.out.println(i);
On a separate note, I've noticed that you're using a LinkedList. If you used an array (or an array-based list) it should be faster.

You could implement a List where the get(int index) method simply returns the index (or some value based on the index). The creation of the list would then be constant time (O(1)). The list would have to be immutable.
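A minimal sketch of such a list, built on java.util.AbstractList (the class name RangeList is hypothetical). AbstractList only needs get and size, and it throws UnsupportedOperationException for add, set and remove by default, which gives the immutability mentioned above.

```java
import java.util.AbstractList;
import java.util.List;

class RangeList extends AbstractList<Integer> {
    private final int count;

    RangeList(int count) {
        this.count = count;
    }

    @Override
    public Integer get(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        return index + 1; // values 1..count, matching the question's loop
    }

    @Override
    public int size() {
        return count;
    }

    public static void main(String[] args) {
        // Construction is O(1): no element is stored anywhere.
        List<Integer> list = new RangeList(66_000_000);
        System.out.println(list.get(0) + " " + list.get(list.size() - 1));
    }
}
```

This only helps if the list is read and never modified, which does not fit the remove-and-re-add workload described in the question's edit.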

Your question isn't just about building the list, it includes deletion and re-insertion. I suspect you should be using a HashSet, maybe even a BitSet instead of a List of any kind.
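As a rough illustration of the BitSet idea, assuming the elements are the integers 1..count and that "removing" an element just means clearing its bit: the class name, the helper methods and the multiples-of-three condition below are invented for the example.

```java
import java.util.BitSet;

class BitSetMembership {
    // Marks every value in 1..count as present.
    static BitSet fullRange(int count) {
        BitSet present = new BitSet(count + 1);
        present.set(1, count + 1); // end index is exclusive
        return present;
    }

    // Example condition: drop every present value divisible by k.
    static void removeMultiplesOf(BitSet present, int k) {
        for (int i = present.nextSetBit(0); i >= 0; i = present.nextSetBit(i + 1)) {
            if (i % k == 0) {
                present.clear(i);
            }
        }
    }

    public static void main(String[] args) {
        BitSet present = fullRange(66_000_000); // one bit per value, roughly 8 MB
        removeMultiplesOf(present, 3);
        System.out.println(present.cardinality() + " values remain");
        present.set(9); // re-adding a value is a single bit flip
    }
}
```

Unlike a list, a BitSet does not remember insertion order, so this fits only if the queue order is not essential.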

Array access optimization

I have a 10x10 array in Java, some of whose items are not used, and I need to traverse all elements as part of a method. What would be better to do:
Go through all elements with 2 for loops and check for null to avoid errors, e.g.
for(int y=0;y<10;y++){
for(int x=0;x<10;x++){
if(array[x][y]!=null)
//perform task here
}
}
Or would it be better to keep a list of all the used addresses... Say an arraylist of points?
Something different I haven't mentioned.
I look forward to any answers :)

Any solution you try needs to be tested in controlled conditions resembling as much as possible the production conditions. Because of the nature of Java, you need to exercise your code a bit to get reliable performance stats, but I'm sure you know that already.
This said, there are several things you may try, which I've used to optimize my Java code with success (but not on Android JVM)
for(int y=0;y<10;y++){
for(int x=0;x<10;x++){
if(array[x][y]!=null)
//perform task here
}
}
should in any case be reworked into
for(int x=0;x<10;x++){
for(int y=0;y<10;y++){
if(array[x][y]!=null)
//perform task here
}
}
Often you will get a performance improvement from caching the row reference. Let us assume the array is of the type Foo[][]:
for(int x=0;x<10;x++){
final Foo[] row = array[x];
for(int y=0;y<10;y++){
if(row[y]!=null)
//perform task here
}
}
Using final with variables was supposed to help the JVM optimize the code, but I think that modern JIT Java compilers can in many cases figure out on their own whether the variable is changed in the code or not. On the other hand, sometimes this may be more efficient, although this definitely takes us into the realm of microoptimizations:
Foo[] row;
for(int x=0;x<10;x++){
row = array[x];
for(int y=0;y<10;y++){
if(row[y]!=null)
//perform task here
}
}
If you don't need to know the element's indices in order to perform the task on it, you can write this as
for(final Foo[] row: array){
for(final Foo elem: row){
if(elem!=null)
//perform task here
}
}
Another thing you may try is to flatten the array and store the elements in a one-dimensional Foo[] array, ensuring maximum locality of reference. You have no inner loop to worry about, but you need to do some index arithmetic when referencing particular array elements (as opposed to looping over the whole array). Depending on how often you do it, it may or may not be beneficial.
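The flattening could look roughly like the sketch below. FlatGrid is a hypothetical class, String stands in for the Foo element type, and the row-major index formula x * cols + y is one common convention chosen to match the array[x][y] order used above.

```java
class FlatGrid {
    private final int rows;
    private final int cols;
    private final String[] cells; // all rows stored back to back

    FlatGrid(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.cells = new String[rows * cols];
    }

    // Index arithmetic replaces the second array dereference.
    private int index(int x, int y) {
        return x * cols + y;
    }

    void set(int x, int y, String value) {
        cells[index(x, y)] = value;
    }

    String get(int x, int y) {
        return cells[index(x, y)];
    }

    // Full traversal needs only a single loop over the flat array.
    int countNonNull() {
        int count = 0;
        for (String cell : cells) {
            if (cell != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        FlatGrid grid = new FlatGrid(10, 10);
        grid.set(2, 3, "a");
        grid.set(9, 9, "b");
        System.out.println(grid.get(2, 3) + " " + grid.countNonNull());
    }
}
```

Whether this beats the nested version for a 10x10 grid is something only profiling can tell.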
Since most of the elements will be not-null, keeping them as a sparse array is not beneficial for you, as you lose locality of reference.
Another problem is the null test. The null test itself doesn't cost much, but the conditional statement following it does, as you get a branch in the code and lose time on wrong branch predictions. What you can do is use a "null object", on which the task can still be performed but amounts to a no-op or something equally benign. Depending on the task you want to perform, it may or may not work for you.
Hope this helps.

You're better off using a List than an array, especially since you may not use the whole set of data. This has several advantages.
You're not checking for nulls and may not accidentally try to use a null object.
More memory efficient in that you're not allocating memory which may not be used.

For a hundred elements, it's probably not worth using any of the classic sparse array implementations. However, you don't say how sparse your array is, so profile it and see how much time you spend skipping null items compared to whatever processing you're doing.
(As Tom Hawtin - tackline mentions) you should, when using an array of arrays, try to loop over the members of each array rather than looping over the same index of different arrays. Not all algorithms allow you to do that, though.
for ( int x = 0; x < 10; ++x ) {
for ( int y = 0; y < 10; ++y ) {
if ( array[x][y] != null )
//perform task here
}
}
or
for ( Foo[] row : array ) {
for ( Foo item : row ) {
if ( item != null )
//perform task here
}
}
You may also find it better to use a null object rather than testing for null, depending what the complexity of the operation you're performing is. Don't use the polymorphic version of the pattern - a polymorphic dispatch will cost at least as much as a test and branch - but if you were summing properties having an object with a zero is probably faster on many CPUs.
double sum = 0;
for ( Foo[] row : array ) {
for ( Foo item : row ) {
sum += item.value();
}
}
As to what applies to android, I'm not sure; again you need to test and profile for any optimisation.

Holding an ArrayList of points would be "over engineering" the problem. You have a multi-dimensional array; the best way to iterate over it is with two nested for loops. Unless you can change the representation of the data, that's roughly as efficient as it gets.
Just make sure you go in row order, not column order.

Depends on how sparse/dense your matrix is.
If it is sparse, you better store a list of points, if it is dense, go with the 2D array. If in between, you can have a hybrid solution storing a list of sub-matrices.
This implementation detail should be hidden within a class anyway, so your code can also anytime convert between any of these representations.
I would discourage you from settling on any of these solutions without profiling with your real application.

I agree an array with a null test is the best approach unless you expect sparsely populated arrays.
Reasons for this:
1- More memory efficient for dense arrays (a list needs to store the index)
2- More computationally efficient for dense arrays (You need only compare the value you just retrieved to NULL, instead of having to also get the index from memory).
Also, a small suggestion, but in Java especially you are often better off faking a multi-dimensional array with a 1D array where possible (square/rectangular arrays in 2D). Bounds checking only happens once per iteration, instead of twice. Not sure if this still applies in the Android VMs, but it has traditionally been an issue. Regardless, you can ignore it if the loop is not a bottleneck.
