LinkedList<E> vs ArrayList<E> cost - java

I've read quite a few questions here that discuss the cost of using ArrayLists vs LinkedLists in Java. One of the most useful I've seen thus far is here: When to use LinkedList over ArrayList?
I want to be sure that I understand it correctly.
In my current use case, I have multiple situations where I have objects stored in a List structure. The number of objects in the list changes for each run, and random access to objects in the list is never required. Based on this information, I have elected to use LinkedLists with ListIterators to traverse the entire content of the list.
For example, my code may look something like this:
for (Object thisObject : theLinkedList) {
// do something
}
If this is a bad choice, please help me understand why.
My current understanding is that traversing the entire list of objects in a LinkedList would incur O(n) cost using the iterative solution. Since there is no random access to the list (i.e., no need to get item #3, for example), my current understanding is that this would be basically the same as looping over the content of an ArrayList and requesting each element with an index.
Assuming I knew the number of objects to be stored in the list beforehand, my current line of thinking is that it would be better to initialize an ArrayList to the appropriate size and switch to that structure entirely without using a ListIterator. Is this logic sound?
As always, I greatly appreciate everyone's input!

Iteration over a LinkedList and ArrayList should take roughly the same amount of time to complete, since in each case the cost of stepping from one element to the next is a constant. The ArrayList might be a bit better due to locality of reference, though, so it might be worth profiling to see what happens.
If you are guaranteed that there will always be a fixed number of elements, and there won't be insertions and deletions in random locations, then a raw array might be a good choice, since it's extremely fast and well-optimized for this case.
That said, your analysis of why to use LinkedList seems sound. Again, it doesn't hurt to profile the program and see if ArrayList would actually be faster for your use case.
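As a rough illustration of the profiling advice, the sketch below fills a presized ArrayList and a LinkedList with the same values and times one full for-each traversal of each. The class name IterationSketch and the element count are made up for the example, and timing with System.nanoTime() like this is only a crude indicator (no warm-up, no repetition); a proper benchmark harness would be needed for numbers you can trust.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

final class IterationSketch {

    // Walks the list through its iterator (which is what for-each uses): O(n) for both list types.
    static long sum(List<Integer> list) {
        long total = 0;
        for (int value : list) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000_000; // arbitrary size chosen for this example
        List<Integer> arrayList = new ArrayList<>(n); // presized, so it never regrows while filling
        List<Integer> linkedList = new LinkedList<>();
        for (int i = 0; i < n; i++) {
            arrayList.add(i);
            linkedList.add(i);
        }
        for (List<Integer> list : Arrays.asList(arrayList, linkedList)) {
            long start = System.nanoTime();
            long total = sum(list);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println(list.getClass().getSimpleName() + ": sum=" + total + " in " + micros + " us");
        }
    }
}
```

Both traversals are linear; any difference you measure comes from constant factors such as memory locality, not from the asymptotic cost.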
Hope this helps!

Related

Implement a list with O(1) running time

Could I make a list where every operation has a fast running time? Is it possible to have this type of list? I can't wrap my head around how you could keep search or add times constant if it needs to go through nodes to search for others, much less adding.
At first you only said get(), add() and set(), but then you said search() as well. The first three all have O(1) average run time in an ArrayList and similar implementations. You can't have O(1) search time in anything that would normally be considered a list.
Edit: Some people have pointed out, correctly, that you could get O(1) lookup time if the list implementation also stored element indices in a hashmap. Strictly speaking, as long as it implements a List interface, it is a list. I should have said that you can't do that with only a list.
Just posting my comment as an answer...
If you used an expanding list that grows as you need it, you could achieve amortized O(1) getting, setting, and adding. "Amortized" just means that you'd occasionally need to expand its size, but that doesn't happen often enough to change the average cost per operation. Pretty sure this is how Java's ArrayList class works.
You can read more about it here. https://stackoverflow.com/a/4450659/1572906
You mentioned searching, but with this kind of approach, O(1) doesn't seem very feasible. You could achieve O(1) using a hash table alongside the array.
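Here is a minimal sketch of the "hash table alongside the array" idea. IndexedList is a hypothetical class, not something from the JDK, and it makes two simplifying assumptions: elements are unique and the list is append-only (supporting removal would mean updating the stored indices).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical append-only list that also keeps a hash index of element positions.
final class IndexedList<E> {
    private final List<E> items = new ArrayList<>();           // amortized O(1) add, O(1) get
    private final Map<E, Integer> positions = new HashMap<>(); // element -> index, expected O(1)

    // Appends an element; rejects duplicates so each element maps to exactly one index.
    boolean add(E element) {
        if (positions.containsKey(element)) {
            return false;
        }
        positions.put(element, items.size());
        items.add(element);
        return true;
    }

    E get(int index) {
        return items.get(index);
    }

    // Returns the index of the element, or -1 if absent, without scanning the list.
    int indexOf(E element) {
        Integer index = positions.get(element);
        return index == null ? -1 : index;
    }

    int size() {
        return items.size();
    }

    public static void main(String[] args) {
        IndexedList<String> list = new IndexedList<>();
        list.add("a");
        list.add("b");
        System.out.println(list.indexOf("b")); // prints 1
    }
}
```

The price is extra memory for the map and the requirement that elements have sound hashCode() and equals() implementations.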

Faster Access Version of ArrayList?

Does anyone know of something similar to ArrayList that is better geared to handling really large amounts of data as quickly as possible?
I've got a program with a really large ArrayList that's getting choked up when it tries to explore or modify the ArrayList.
Presumably when you do:
//i is an int;
arrayList.remove(i);
The code behind the scenes runs something like:
public T remove(int i){
//Let's say ArrayList stores its data in a T [] array called "contents".
T output = contents[i];
T [] overwrite = new T [contents.length - 1];
//Yes, I know generic arrays aren't created this simply. Bear with me here...
for(int x=0;x<i;x++){
overwrite[x] = contents[x];
}
for(int x=i+1;x<contents.length;x++){
overwrite[x-1] = contents[x];
}
contents = overwrite;
return output;
}
When the size of the ArrayList is a couple million units or so, all those cycles rearranging the positions of items in the array would take a lot of time.
I've tried to alleviate this problem by creating my own custom ArrayList subclass which segments its data storage into smaller ArrayLists. Any process that requires the ArrayList to scan its data for a specific item generates a new search thread for each of the smaller ArrayLists within (to take advantage of my multiple CPU cores).
But this system doesn't work, because when the thread calling the search has an item in any of the ArrayLists synchronized, it can block those separate search threads from completing their search, which in turn locks up the original thread that called the search, essentially deadlocking the whole program.
I really need some kind of data storage class oriented to containing and manipulating large amounts of objects as quickly as the PC is capable.
Any ideas?
I really need some kind of data storage class oriented to containing and manipulating large amounts of objects as quickly as the PC is capable.
The answer depends a lot on what sort of data you are talking about and the specific operations you need. You use the word "explore" without defining it.
If you are talking about looking up a record then nothing beats a HashMap (ConcurrentHashMap for threaded operation). If you are talking about keeping the data in order, especially when dealing with threads, then I'd recommend a ConcurrentSkipListMap, which has O(logN) lookup, insert, remove, etc.
You may also want to consider using multiple collections. You need to be careful that the collections don't get out of sync, which can be especially challenging with threads, but that might be faster depending on the various operations you are making.
When the size of the ArrayList is a couple million units or so, all those cycles rearranging the positions of items in the array would take a lot of time.
As mentioned, ConcurrentSkipListMap is O(logN) for rearranging an item, i.e. removing it and adding it back at its new position.
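To make the "remove and add with new position" step concrete, here is a small sketch. It assumes the position is an Integer key; SkipListSketch, reposition and the sample data are invented for the example.

```java
import java.util.concurrent.ConcurrentSkipListMap;

final class SkipListSketch {

    // Moves the value stored under oldKey to newKey: one O(log N) remove plus one O(log N) put.
    // Returns false if oldKey is absent. The two steps are not atomic as a pair, and an
    // existing entry at newKey would be overwritten.
    static <V> boolean reposition(ConcurrentSkipListMap<Integer, V> map, int oldKey, int newKey) {
        V value = map.remove(oldKey);
        if (value == null) {
            return false;
        }
        map.put(newKey, value);
        return true;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<Integer, String> byPosition = new ConcurrentSkipListMap<>();
        byPosition.put(10, "alpha");
        byPosition.put(20, "beta");
        byPosition.put(30, "gamma");
        reposition(byPosition, 10, 25);
        System.out.println(byPosition); // iteration follows ascending key order
    }
}
```

If other threads must never observe the item as missing between the two steps, you would need additional coordination on top of the map.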
The [ArrayList.remove(i)] code behind the scenes runs something like: ...
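Well, not really. You can look at the code in the JDK, right? ArrayList uses System.arraycopy(...) for these sorts of operations, shifting the tail of the existing array in place instead of allocating a new array and copying element by element. That may still not be efficient enough for your case, since the shift remains O(N) in the number of elements after i, but the constant factor is much smaller than in your sketch.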
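For comparison with the code in the question, here is a simplified sketch of an in-place remove(i) built on System.arraycopy. SimpleArrayList is a hypothetical class written for illustration; it is not the JDK source, only the same general approach. Note that the copy still touches every element after i.

```java
import java.util.Arrays;

// Simplified, hypothetical array-backed list; not the actual JDK implementation.
final class SimpleArrayList<T> {
    private Object[] contents = new Object[10];
    private int size;

    void add(T item) {
        if (size == contents.length) {
            contents = Arrays.copyOf(contents, size * 2); // grow only when the array is full
        }
        contents[size++] = item;
    }

    @SuppressWarnings("unchecked")
    T get(int i) {
        checkIndex(i);
        return (T) contents[i];
    }

    @SuppressWarnings("unchecked")
    T remove(int i) {
        checkIndex(i);
        T output = (T) contents[i];
        int toMove = size - i - 1;
        if (toMove > 0) {
            // Shift the tail one slot to the left inside the same array; nothing is allocated.
            System.arraycopy(contents, i + 1, contents, i, toMove);
        }
        contents[--size] = null; // clear the vacated slot so the object can be collected
        return output;
    }

    int size() {
        return size;
    }

    private void checkIndex(int i) {
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException("Index: " + i + ", Size: " + size);
        }
    }
}
```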
One example of good usage for a linked list is where the list elements are very large, i.e. large enough that only one or two can fit in CPU cache at the same time. At this point the advantage that contiguous block containers like vectors or arrays have for iteration is more or less nullified, and a performance advantage may be possible if many insertions and removals are occurring in real time.
ref: Under what circumstances are linked lists useful?
ref : https://coderanch.com/t/508171/java/Collection-datastructure-large-data
Different collection types have different time complexities for various operations. Typical complexities are O(1), O(N), and O(log(N)). To choose a collection, you first need to decide which operations you use often, and avoid collections which have O(N) complexity for those operations. Here you often use ArrayList.remove(i), which is O(N). Switching to LinkedList only helps when you remove the element you are currently positioned on: Iterator.remove() on a LinkedList is O(1), but LinkedList.remove(i) is also O(N), and LinkedList.remove(element) is O(N) as well because it has to search for the element first.
I doubt that a List with remove(i) complexity of O(1) can be implemented. The best practical time is O(log(N)), which is definitely better than O(N). The Java standard library has no such implementation. You can try searching for the keywords "binary indexed tree".
But the first thing I would do is to review the algorithm and try to get rid of List.remove(i) operation.
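For the curious, the O(log(N)) remove(i) idea can be sketched with a binary indexed (Fenwick) tree that counts which slots are still occupied. FenwickList is a hypothetical class made up for this example, and it is deliberately limited: it is built once from an existing list and supports only get(i) and remove(i), with no insertion afterwards.

```java
import java.util.List;

// Hypothetical fixed-capacity list: get(i) and remove(i) in O(log N), no insertion after construction.
final class FenwickList<E> {
    private final Object[] slots; // original elements; removed ones are nulled out
    private final int[] tree;     // 1-based Fenwick tree over "slot is still live" flags (1 or 0)
    private int size;

    FenwickList(List<E> source) {
        int n = source.size();
        slots = source.toArray();
        tree = new int[n + 1];
        size = n;
        for (int i = 1; i <= n; i++) { // O(N) build with every flag set to 1
            tree[i] += 1;
            int parent = i + (i & -i);
            if (parent <= n) {
                tree[parent] += tree[i];
            }
        }
    }

    // Finds the slot holding the element that is currently at the given logical index.
    private int slotOf(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        int remaining = index + 1; // rank of the wanted live element, counting from 1
        int pos = 0;
        for (int step = Integer.highestOneBit(tree.length - 1); step > 0; step >>= 1) {
            int next = pos + step;
            if (next < tree.length && tree[next] < remaining) {
                pos = next;              // everything up to 'next' holds fewer live slots than needed
                remaining -= tree[next];
            }
        }
        return pos; // 0-based slot index
    }

    @SuppressWarnings("unchecked")
    E get(int index) {
        return (E) slots[slotOf(index)];
    }

    @SuppressWarnings("unchecked")
    E remove(int index) {
        int slot = slotOf(index);
        E removed = (E) slots[slot];
        slots[slot] = null;
        for (int i = slot + 1; i < tree.length; i += i & -i) { // clear the live flag
            tree[i]--;
        }
        size--;
        return removed;
    }

    int size() {
        return size;
    }
}
```

Nothing is ever shifted: a removal only decrements O(log N) counters, and later indexes are resolved by skipping over the dead slots. Whether this beats a plain ArrayList for your data is something to measure.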

Java: What collection type should I use for this case?

What I need:
Fastest put/remove, this is used a lot.
Iteration, also used frequently.
Holds an object, e.g. Player. remove should be O(1) so maybe hashmap?
No duplicate keys
direct get() is never used, mainly iterating to retrieve data.
I don't worry about memory, I just want the fastest speed possible even if it's at the cost of memory.
For iteration, nothing is faster than a plain old array. Entries are saved sequentially in memory, so the JVM can get to the next entry simply by adding the length of one entry to its address.
Arrays are typically a bit of a hassle to deal with compared to maps or lists (e.g: no dictionary-style lookups, fixed length). However, in your case I think it makes sense to go with a one or two dimensional array since the length of the array will not change and dictionary-style lookups are not needed.
So if I understand you correctly you want to have a two-dimensional grid that holds information of which, if any, player is in specific tiles? To me it doesn't sound like you should be removing, or adding things to the grid. I would simply use a two-dimensional array that holds type Player or something similar. Then if no player is in a tile you can set that position to null, or some static value like Player.none() or Tile.empty() or however you'd want to implement it. Either way, a simple two-dimensional array should work fine. :)
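A minimal sketch of that two-dimensional array idea, using null for an empty tile. TileGrid and its nested Player class are stand-ins invented for the example; the real Player type and grid size would come from your code.

```java
final class TileGrid {

    // Minimal stand-in for the asker's Player type (an assumption for this sketch).
    static final class Player {
        final String name;

        Player(String name) {
            this.name = name;
        }
    }

    private final Player[][] tiles; // null means "no player on this tile"

    TileGrid(int width, int height) {
        tiles = new Player[height][width];
    }

    void place(int x, int y, Player player) {
        tiles[y][x] = player;
    }

    void clear(int x, int y) {
        tiles[y][x] = null;
    }

    Player at(int x, int y) {
        return tiles[y][x];
    }

    // Row-by-row traversal reads each row's references sequentially.
    int countPlayers() {
        int count = 0;
        for (Player[] row : tiles) {
            for (Player p : row) {
                if (p != null) {
                    count++;
                }
            }
        }
        return count;
    }
}
```

Placing and clearing are single array writes, so both are O(1); iteration always visits every tile, occupied or not.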
The best Collection for your case is a LinkedList. Linked lists will allow for fast iteration, and fast removal and addition at any place in the linked list. For example, if you use an ArrayList, and you want to insert something at index i, then you have to move all the elements from i to the end one entry to the right. The same would happen if you want to remove. In a linked list you can add and remove in constant time once you have an iterator positioned at the element.
Since you need two dimensions, you can use linked lists inside of linked lists:
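// Requires java.util.List and java.util.LinkedList.
// LinkedList has no capacity constructor, so the lists are created empty.
List<List<Tile>> players = new LinkedList<List<Tile>>();
for (int i = 0; i < 20; ++i) {
    // One inner list of 20 tiles per row.
    List<Tile> tiles = new LinkedList<Tile>();
    for (int j = 0; j < 20; ++j) {
        tiles.add(new Tile());
    }
    players.add(tiles);
}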
Use a map of sets: it gives O(1) expected vertex lookup and amortized O(1) edge insertion and deletion.
HashMap<VertexT, HashSet<EdgeT>> incidenceMap;
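A short sketch of how such a map of sets might be wrapped. The class name IncidenceMap and its three methods are assumptions made for the example; only the HashMap-of-HashSet layout comes from the declaration above.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class IncidenceMap<V, E> {
    private final Map<V, Set<E>> incidenceMap = new HashMap<>();

    // Adds the edge to the vertex's set, creating the set on first use.
    void addEdge(V vertex, E edge) {
        incidenceMap.computeIfAbsent(vertex, v -> new HashSet<>()).add(edge);
    }

    // Returns true only if the vertex exists and the edge was present.
    boolean removeEdge(V vertex, E edge) {
        Set<E> edges = incidenceMap.get(vertex);
        return edges != null && edges.remove(edge);
    }

    // Read-only view of the vertex's edges; empty for an unknown vertex.
    Set<E> edgesOf(V vertex) {
        Set<E> edges = incidenceMap.get(vertex);
        return edges == null ? Collections.<E>emptySet() : Collections.unmodifiableSet(edges);
    }
}
```

All three operations are a constant number of hash lookups, so their expected cost does not depend on how many vertices or edges are stored.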
There is no simple one-size-fits-all solution to this.
For example, if you only want to append, iterate and use Iterator.remove(), there are two obvious options: ArrayList and LinkedList
ArrayList uses less memory, but Iterator.remove() is O(N)
LinkedList uses more memory, but Iterator.remove() is O(1)
If you also want to do fast lookup (e.g. Collection.contains tests), or removal using Collection.remove, then HashSet is going to be better ... if the collections are likely to be large. A HashSet won't allow you to put an object into the collection multiple times, but that could be an advantage. It also uses more memory than either ArrayList or LinkedList.
If you were more specific on the properties required, and what you are optimizing for (speed, memory use, both?) then we could give you better advice.
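To illustrate the append, iterate and Iterator.remove() pattern, here is a small helper that works on any List. The names IteratorRemoveSketch and removeMatching are invented for the example; the same code runs against ArrayList and LinkedList, and only the cost of each it.remove() call differs.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

final class IteratorRemoveSketch {

    // Removes matching elements during a single traversal and reports how many were removed.
    static <T> int removeMatching(List<T> list, Predicate<? super T> shouldRemove) {
        int removed = 0;
        for (Iterator<T> it = list.iterator(); it.hasNext(); ) {
            if (shouldRemove.test(it.next())) {
                it.remove(); // O(1) on LinkedList; shifts the tail on ArrayList
                removed++;
            }
        }
        return removed;
    }
}
```

Removing through the iterator also avoids the ConcurrentModificationException you would get from calling list.remove(...) inside a for-each loop.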
The requirement of not allowing duplicates is effectively adding a requirement for efficient get().
Your options are either hash-based, or O(Log(N)). Most likely, hashing will be faster, unless for whatever reason, calling hashCode() + equals() once is much slower than calling compareTo() Log(N) times. This could be, for instance, if you're dealing with very long strings. Log(N) is not very much, by the way: Log2(1,000,000,000) ~= 30.
If you want to use a hash-based data structure, then HashSet is your friend. Make sure that Player has a good fast implementation of hashCode(). If you know the number of entries ahead of time, specify the HashSet's initial capacity: ceil(N/load_factor)+1. The default load factor is 0.75.
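The sizing rule can be written down directly. This is just the formula from the paragraph above turned into code; HashSetSizing and its method names are made up for the example.

```java
import java.util.HashSet;
import java.util.Set;

final class HashSetSizing {
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // Initial capacity from the rule of thumb: ceil(N / load_factor) + 1.
    static int initialCapacityFor(int expectedEntries, float loadFactor) {
        return (int) Math.ceil(expectedEntries / (double) loadFactor) + 1;
    }

    // Creates a set that should hold expectedEntries elements without rehashing.
    static <T> Set<T> newPresizedSet(int expectedEntries) {
        return new HashSet<>(initialCapacityFor(expectedEntries, DEFAULT_LOAD_FACTOR), DEFAULT_LOAD_FACTOR);
    }
}
```

For example, 100 expected players gives ceil(100 / 0.75) + 1 = 135 as the requested capacity.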
If you want to use a sort-based structure, implement an efficient Player.compareTo(). Your choices are TreeSet, or Skip List. They're pretty comparable in terms of characteristics. TreeSet is nice in that it's available out of the box in the JDK, whereas only a concurrent SkipList is available. Both need to be rebalanced as you add data, which may take time, and I don't know how to predict which will be better.

Efficiency-wise, would it be quicker to make an ArrayList or use an array when adding to the first index?

I am using Java. I want to add to the start of an Array. Would it be more efficient to move all variables up one space in the array, leaving one spot for a new variable to be added in index 0, or to just use an ArrayList?
I am aware an ArrayList will move the values for me, but I have heard that they are very inefficient, is this true?
Are there any other APIs that will do this efficiently?
Apart from the method call overhead and some small maintenance cost, ArrayList is no more inefficient than copying array elements yourself. Some implementations of ArrayList may even be faster at moving data, by allowing the list to start somewhere else in the backing array than at index 0, as ArrayDeque does.
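Since ArrayDeque came up: if all you need is cheap insertion at the front, it can be used directly. A minimal sketch follows; FrontInsertSketch and prependAll are names invented for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class FrontInsertSketch {

    // Inserts each value at the front. ArrayDeque.addFirst is amortized O(1)
    // because existing elements are not shifted.
    static List<Integer> prependAll(int[] values) {
        Deque<Integer> deque = new ArrayDeque<>();
        for (int value : values) {
            deque.addFirst(value);
        }
        return new ArrayList<>(deque); // copies in head-to-tail order
    }
}
```

The trade-off is that ArrayDeque does not implement List, so there is no get(index); the copy into an ArrayList at the end is only needed if you want indexed access afterwards.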
Neither would be efficient, because each insertion at the beginning needs to move what you've added so far. This means that inserting N elements takes O(N^2) time, which is rather inefficient.
LinkedList<T>s are better for situations when you need to insert at the beginning of the list. However, they have memory overhead, and do not allow fast lookup based on the index.
If you do not need to use your list until after all elements have been inserted, you may be better off inserting elements at the back of the list, and then reversing the list before starting to use it.
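The two approaches side by side, as a small sketch (the class and method names are invented for the example). Both produce the same list; only the amount of element shifting differs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class AppendThenReverse {

    // O(N^2) overall: every add(0, x) shifts all elements inserted so far.
    static List<Integer> insertAtFront(int[] values) {
        List<Integer> list = new ArrayList<>();
        for (int value : values) {
            list.add(0, value);
        }
        return list;
    }

    // O(N) overall: amortized O(1) appends followed by one linear reversal.
    static List<Integer> appendThenReverse(int[] values) {
        List<Integer> list = new ArrayList<>(values.length);
        for (int value : values) {
            list.add(value);
        }
        Collections.reverse(list);
        return list;
    }
}
```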
ArrayList also uses an array internally to store its data, but Sun/Oracle implemented a fast routine for inserting an item at index 0 and shifting the items that follow. So prefer ArrayList for simpler code; if you can come up with a better algorithm, go for a plain array.
If you add to the first index very frequently, it will be very expensive, because every existing element has to be moved one position towards the end (and the backing array resized when it is full) to make room for the new element at the front.
LinkedLists provide better performance in such cases, but they do not offer random-access behaviour.
ArrayList provides enough performance for normal usage and, even more importantly, it is safe, so you don't need to worry about out-of-bounds accesses, null pointers, etc.
To make it "faster" you can, for example, get rid of ArrayList's capacity checks etc., but then you are making your code unsafe, which means you must be sure you are setting the right parameters, because if not you will be getting IndexOutOfBoundsException etc.
You can read a very interesting post about Trove - using primitive collections for performance, for more information.
But 99 times out of 100, there is no real need. Remember and repeat after me:
Premature optimization is the root of all evil.
Besides, I really recommend checking out the JDK source code yourself. You can learn a lot and, obviously, see how it's made.

Best way to remove and add elements from the java List

I have 100,000 objects in the list. I want to remove a few elements from the list based on a condition. Can anyone tell me the best approach in terms of memory and performance?
Same question for adding objects also based on condition.
Thanks in Advance
Raju
Your container is not just a List. List is an interface that can be implemented by, for example ArrayList and LinkedList. The performance will depend on which of these underlying classes is actually instantiated for the object you are polymorphically referring to as List.
ArrayList can access elements in the middle of the list quickly, but if you delete one of them you need to shift a whole bunch of elements. LinkedList is the opposite in this respect, requiring iteration for the access, but deletion is just a matter of reassigning pointers.
Your performance depends on the implementation of List, and the best choice of implementation depends on how you will be using the List and which operations are most frequent.
If you're going to be iterating a list and applying tests to each element, then a LinkedList will be most efficient in terms of CPU time, because you don't have to shift any elements in the list. It will, however consume more memory than an ArrayList, because each list element is actually held in an entry.
However, it might not matter. 100,000 is a small number, and if you aren't removing a lot of elements the cost to shift an ArrayList will be low. And if you are removing a lot of elements, it's probably better to restructure as a copy-with-filter.
However, the only real way to know is to write the code and benchmark it.
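A copy-with-filter can be as simple as the sketch below, which assumes Java 8 or later for java.util.function.Predicate. CopyWithFilter and keepOnly are names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class CopyWithFilter {

    // Builds a new list holding only the elements to keep: one O(N) pass, no shifting.
    // The source list is left untouched.
    static <T> List<T> keepOnly(List<T> source, Predicate<? super T> keep) {
        List<T> result = new ArrayList<>(source.size());
        for (T item : source) {
            if (keep.test(item)) {
                result.add(item);
            }
        }
        return result;
    }
}
```

If you would rather modify the list in place, list.removeIf(predicate) (also Java 8+) is worth benchmarking against this; on an ArrayList it avoids shifting the tail once per removed element.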
Collections2.filter (from Guava) produces a filtered collection based on a predicate.
List<Number> myNumbers = Arrays.asList(Integer.valueOf(1), Double.valueOf(1e6));
Collection<Number> bigNumbers = Collections2.filter(
myNumbers,
new Predicate<Number>() {
public boolean apply(Number n) {
return n.doubleValue() >= 100d;
}
});
Note that some operations like size() are not efficient with this scheme. If you tend to follow Josh Bloch's advice and prefer isEmpty() and iterators to unnecessary size() checks, then this shouldn't bite you in practice.
LinkedList could be a good choice.
LinkedList does "remove and add elements" more efficiently than ArrayList, and there is no need to call a method such as ArrayList.trimToSize() to release unused memory. But LinkedList is a doubly linked list; each element is wrapped in an Entry, which needs extra memory.
