This API call returns a potentially large List<String>, which is not sorted. I need to sort it, search it, and access random elements. Currently the List is implemented by an ArrayList (I checked the source), but at some unknown point in the future the API developers may choose to switch to a LinkedList implementation (without changing the interface).
Sorting, searching, accessing a potentially large LinkedList would be extremely slow and unacceptable for my program. Therefore I need to convert the List to an ArrayList to ensure the practical efficiency of my program. However, since the List is most likely an ArrayList already, it would be inefficient to needlessly create a new ArrayList copy of the List.
Given these constraints, I have come up with the following method to convert a List into an ArrayList:
private static <T> ArrayList<T> asArrayList(List<T> list) {
    if (list instanceof ArrayList) {
        return (ArrayList<T>) list;
    } else {
        return new ArrayList<T>(list);
    }
}
My question is this: Is this the most efficient way to work with a List with an unknown implementation? Is there a better way to convert a List to an ArrayList? Is there a better option than converting the List into an ArrayList?
You can't really get much simpler than what you've got - looks about as efficient as it could possibly be to me.
That said, this sounds very much like premature optimisation - you should only really need to worry about this if and when the author of the API you're using changes to a LinkedList. If you worry about it now, you are likely to spend a lot of time and effort planning for a future scenario that may not even come to pass - this is time that might be better spent finding other issues to fix. Presumably the only time you'll be changing versions of the API is between versions of your own application - handle the issue at that point, if at all.
As you can see yourself, the code is simple and it is efficient inasmuch as it only creates a copy if it is necessary.
So the answer is that there is no significantly better option, save for a completely different type of solution, something that sorts your list as well, for example.
(Bear in mind that this level of optimisation is rarely required, so it's not a very frequent problem.)
Update: Just an afterthought: as a general rule, well-written APIs don't return data types that are inappropriate for the amount of data they are likely to contain. That is not to say you should trust them blindly, but it's not a completely unreasonable assumption.
Sorting, searching, accessing a potentially large LinkedList would be extremely slow and unacceptable for my program.
Actually, it is not as bad as that. IIRC, the Collections.sort methods copy the list into a temporary array, sort the array, and then write the sorted values back into the original list (via its ListIterator). For large enough lists, the O(N log N) sorting phase will dominate the O(N) copying phases.
Java collections that support efficient random access implement the RandomAccess marker interface. For such a list you could just run Collections.sort on the list directly. For lists without random access you should probably dump the list to an array using one of its toArray methods, sort that array, then wrap it into a random-access List.
T[] array = (T[])list.toArray(); // just suppress the warning if the compiler worries about unsafe cast
Arrays.sort(array);
List<T> sortedList = Arrays.asList(array);
Note that Arrays.asList does not actually return a java.util.ArrayList (it returns a fixed-size list backed by the array), so don't try to cast its result; wrap it in new ArrayList<T>(...) if you really need an ArrayList.
Collections.sort is efficient for all List implementations, as long as they provide an efficient ListIterator. The method dumps the list into an array, sorts that, then uses the ListIterator to copy the values back into the list in O(n).
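To make the RandomAccess point above concrete, here's a minimal sketch (the method and class names are mine, and it assumes the elements are Comparable): sort RandomAccess lists in place, and sort an array-backed copy otherwise.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.RandomAccess;

class SortHelper {
    // Sort in place when the list supports cheap random access (e.g. ArrayList);
    // otherwise copy once into an ArrayList, sort that, and return the copy.
    static <T extends Comparable<? super T>> List<T> sorted(List<T> list) {
        if (list instanceof RandomAccess) {
            Collections.sort(list);
            return list;
        }
        List<T> copy = new ArrayList<T>(list); // one O(n) copy for e.g. LinkedList
        Collections.sort(copy);
        return copy;
    }
}
Either branch ends up sorting an array under the hood; the only difference is whether the sorted values have to be written back through the original list.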
Related
I have implemented a graph.
I want to sort a given subset of vertices with respect to their degrees.
Therefore, I've written a custom comparator named DegreeComparator.
private class DegreeComparator implements Comparator<Integer>
{
    @Override
    public int compare(Integer arg0, Integer arg1)
    {
        if (adj[arg1].size() == adj[arg0].size()) return arg1 - arg0;
        else return adj[arg1].size() - adj[arg0].size();
    }
}
So, which one of the below is more efficient?
Using TreeSet
public Collection<Integer> sort(Collection<Integer> unsorted)
{
    Set<Integer> sorted = new TreeSet<Integer>(new DegreeComparator());
    sorted.addAll(unsorted);
    return sorted;
}
Using ArrayList
Collections.sort(unsorted, new DegreeComparator());
Notice that the second approach is not a function, but a single line of code.
Intuitively, I'd rather choose the second one. But I'm not sure if it is more efficient.
The Java API contains numerous Collection and Map implementations, so it can be confusing to figure out which one to use. Here is a quick flowchart that might help with choosing among the most common implementations.
A TreeSet is a Set. It removes duplicates (elements with the same degree). So both aren't equivalent.
Anyway, if what you naturally want is a sorted list, then sort the list. This will work whether the collection has duplicates or not, and even though it has the same complexity, O(n log n), as populating a TreeSet, it is probably faster (because it just has to move elements in an array, instead of having to create lots of tree nodes).
If you only sort once, then the ArrayList is an obvious winner. The TreeSet is better if you add or remove items often as sorting a list again and again would be slow.
Note also that all tree structures need more memory and memory access indirection which makes them slower.
In the case of medium-sized lists that change rather frequently by a single element, the fastest solution might be to use an ArrayList and insert into the proper position (obviously assuming the array gets sorted initially).
You'd need to determine the insert position via Arrays.binarySearch (or Collections.binarySearch for a List) and then insert or remove. Actually, I wouldn't do it unless the performance were really critical and a benchmark showed it helps. It gets slow when the list gets really big, and the gain is limited because Java uses TimSort, which is optimized for such a case.
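A minimal sketch of that keep-it-sorted insertion, assuming the list is already sorted by the same comparator (the class and method names are mine):
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortedInsert {
    // Insert 'element' so that 'list' stays sorted under 'comparator'.
    // The ArrayList shift is O(n), but there is no full re-sort afterwards.
    static <T> void insertSorted(List<T> list, T element, Comparator<? super T> comparator) {
        int idx = Collections.binarySearch(list, element, comparator);
        if (idx < 0) {
            idx = -idx - 1; // binarySearch returns (-(insertion point) - 1) when the key is absent
        }
        list.add(idx, element);
    }
}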
As pointed out in a comment, ensuring that the Comparator returns different values for distinct elements is sometimes non-trivial. Fortunately, there's Guava's Ordering#arbitrary, which solves the problem if you don't need to be consistent with equals. In case you do, a similar method can be written (I'm sure I could find it somewhere if requested).
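A hedged sketch of the Ordering#arbitrary idea, assuming Guava is on the classpath; strings ordered by length stand in for vertices ordered by degree:
import com.google.common.collect.Ordering;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

public class ArbitraryTieBreak {
    public static void main(String[] args) {
        // Primary ordering: by length (plays the role of "by degree" above).
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        // Break ties arbitrarily so distinct but equal-length strings both survive in the TreeSet.
        Set<String> sorted = new TreeSet<>(Ordering.from(byLength).compound(Ordering.arbitrary()));
        sorted.addAll(Arrays.asList("bb", "aa", "c"));
        System.out.println(sorted); // e.g. [c, aa, bb]: both equal-length strings are kept
    }
}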
I've read quite a few questions here that discuss the cost of using ArrayLists vs LinkedLists in Java. One of the most useful I've seen thus far is this one: When to use LinkedList over ArrayList?.
I want to be sure that I'm correctly understanding.
In my current use case, I have multiple situations where I have objects stored in a List structure. The number of objects in the list changes for each run, and random access to objects in the list is never required. Based on this information, I have elected to use LinkedLists with ListIterators to traverse the entire content of the list.
For example, my code may look something like this:
for (Object thisObject : theLinkedList) {
    // do something
}
If this is a bad choice, please help me understand why.
My current understanding is that traversing the entire list of objects in a LinkedList would incur O(n) cost using the iterative solution. Since there is no need for random access to the list (i.e. no need to get, say, item #3), this would be basically the same as looping over the content of an ArrayList and requesting each element by index.
Assuming I knew the number of objects to be stored in the list beforehand, my current line of thinking is that it would be better to initialize an ArrayList to the appropriate size and switch to that structure entirely without using a ListIterator. Is this logic sound?
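For reference, here is a minimal sketch of that pre-sized alternative (the size and element type are purely illustrative):
import java.util.ArrayList;
import java.util.List;

class PreSized {
    // Size known up front, no random access needed later: a pre-sized ArrayList
    // avoids the intermediate grow-and-copy steps of a default-capacity one.
    static List<Object> build(int expectedSize) {
        List<Object> items = new ArrayList<>(expectedSize);
        for (int i = 0; i < expectedSize; i++) {
            items.add(new Object()); // amortised O(1) appends, no resizes with the right capacity
        }
        return items;
    }
}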
As always, I greatly appreciate everyone's input!
Iteration over a LinkedList and ArrayList should take roughly the same amount of time to complete, since in each case the cost of stepping from one element to the next is a constant. The ArrayList might be a bit better due to locality of reference, though, so it might be worth profiling to see what happens.
If you are guaranteed that there will always be a fixed number of elements, and there won't be insertions and deletions in random locations, then a raw array might be a good choice, since it's extremely fast and well-optimized for this case.
That said, your analysis of why to use LinkedList seems sound. Again, it doesn't hurt to profile the program and see if ArrayList would actually be faster for your use case.
Hope this helps!
What is faster than ArrayList<String> in Java? I have a list of undefined length (sometimes 4 items, sometimes 100).
What is the FASTEST way to add to and get from any list? arrayList.add(string) and get() are very slow.
Is there a better way to do this? (Is String s[] and then copyArray the slowest approach?)
Faster for what?
"basically arraylist.add(string) and get() is very slow." - based on what evidence? And compared to what? (No need for the word 'basically' here - it's a high tech "um".) I doubt that ArrayList is the issue with your app. Profiling your code is the only way to tell whether or not you're just guessing and grasping at straws.
Even an algorithm that's O(n^2) is likely to be adequate if the data set is small.
You have to understand the Big-Oh behavior of different data structures to answer this question. Adding to the end of an ArrayList is pretty fast, unless you have to resize it. Adding in the middle may take longer.
LinkedList will be faster to add in the middle, but you'll have to iterate to get to a particular element.
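A small sketch of those costs, with the complexities noted inline (purely illustrative):
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class AddGetCosts {
    static void demo() {
        List<Integer> array = new ArrayList<>();
        List<Integer> linked = new LinkedList<>();
        array.add(42);         // append at the end: amortised O(1) (occasional resize copy)
        array.add(0, 7);       // insert at index 0: O(n), existing elements shift right
        linked.add(0, 7);      // insert at index 0: O(1), only the head pointer changes
        int a = array.get(1);  // O(1) random access by index
        int b = linked.get(0); // cheap only because it is the first node; a middle index walks the list
        System.out.println(a + " " + b);
    }
}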
Both add() to end of list and get() should run in O(1). And since length is undefined, you can't use a fixed length array. You can't do any better I'm afraid.
add(int index, E element) takes linear time for worst case though if that's why you think it's slow. If that is the case, either use Hashtable (insertion takes constant time) or TreeMap (insertion takes logarithmic time).
100 items is not very many. Your bottleneck is elsewhere.
Take a look at the Jodd Utilities. They have some collections that implement ArrayList on top of primitives (jodd/util/collection/), such as IntArrayList. So if you're creating an ArrayList of int, float, double, etc., it will be faster and consume less memory.
Even faster than that is what they call a FastBuffer, which excels at add() and can provide a get() at O(1).
The classes have little interdependency, so it's easy to just drop in the class you need into your code.
You can use the javolution library: http://javolution.org
http://javolution.org/target/site/apidocs/javolution/util/FastList.html
FastList is much faster than ArrayList ;)
Try using a Hashtable; it is much faster.
I want to use data structure that needs to be sorted every now and again. The size of the data structure will hardly exceed 1000 items.
Which one is better - ArrayList or LinkedList?
Which sorting algorithm is better to use?
Up to Java 7, it made no difference because Collections.sort would dump the content of the list into an array.
With Java 8, using an ArrayList should be slightly faster because Collections.sort will call List.sort and ArrayList has a specialised version that sorts the backing array directly, saving a copy.
So the bottom line is that ArrayList is the better choice, as it gives similar or better performance depending on the version of Java.
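A short Java 8 sketch of that path (values are illustrative): Collections.sort(list, cmp) delegates to list.sort(cmp), and ArrayList overrides sort to work on its backing array directly.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SortDelegation {
    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("pear", "apple", "fig"));
        words.sort(Comparator.naturalOrder()); // ArrayList.sort: sorts the backing array in place, no copy back
        System.out.println(words);             // [apple, fig, pear]
    }
}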
If you're going to be using java.util.Collections.sort(List) then it really doesn't matter.
The list will get dumped into an array for the purposes of sorting anyway.
(Thanks for keeping me honest Ralph. Looks like I confused the implementations of sort and shuffle. They're close enough to the same thing right?)
If you can use the Apache library, then have a look at TreeList. It addresses your problem correctly.
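A hedged sketch of what that might look like, assuming Apache Commons Collections 4 is on the classpath:
import org.apache.commons.collections4.list.TreeList;
import java.util.List;

class TreeListDemo {
    public static void main(String[] args) {
        // TreeList trades a little on get() for cheap insert/remove at arbitrary indices.
        List<String> list = new TreeList<>();
        list.add("b");
        list.add(0, "a"); // no large shift of elements, unlike ArrayList
        System.out.println(list); // [a, b]
    }
}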
Only 1000 items? Why do you care?
I usually always use ArrayList unless I have specific need to do otherwise.
Have a look at the source code. I think sorting is based on arrays anyway, if I remember correctly.
If you are just sorting and not dynamically updating your sorted list, then either is fine and an array will be more memory efficient. Linked lists are really better if you want to maintain a sorted list. Inserting an object is fast into the middle of a linked list, but slow into an array.
Arrays are better if you want to find an object in the middle. With an array, you can do a binary search and find whether a member is in the list in O(log N) time. With a linked list, you need to walk the entire list, which is very slow.
I guess which is better for your application depends on what you want to do with the list after it is sorted.
I have 100,000 objects in the list. I want to remove a few elements from the list based on a condition. Can anyone tell me the best approach to achieve this in terms of memory and performance?
The same question applies to adding objects based on a condition.
Thanks in Advance
Raju
Your container is not just a List. List is an interface that can be implemented by, for example, an ArrayList or a LinkedList. The performance will depend on which of these underlying classes is actually instantiated for the object you are polymorphically referring to as a List.
ArrayList can access elements in the middle of the list quickly, but if you delete one of them you need to shift a whole bunch of elements. LinkedList is the opposite in this respect, requiring iteration for the access, but deletion is just a matter of reassigning pointers.
Your performance depends on the implementation of List, and the best choice of implementation depends on how you will be using the List and which operations are most frequent.
If you're going to be iterating over a list and applying tests to each element, then a LinkedList will be most efficient in terms of CPU time, because you don't have to shift any elements in the list. It will, however, consume more memory than an ArrayList, because each list element is actually held in an Entry object.
However, it might not matter. 100,000 is a small number, and if you aren't removing a lot of elements the cost to shift an ArrayList will be low. And if you are removing a lot of elements, it's probably better to restructure as a copy-with-filter pass.
However, the only real way to know is to write the code and benchmark it.
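To make the copy-with-filter idea above concrete, a minimal sketch (the predicate and names are illustrative):
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class CopyWithFilter {
    // Single pass over the source, no element shifting in the middle of a list.
    static <T> List<T> keepMatching(List<T> source, Predicate<T> keep) {
        List<T> result = new ArrayList<>(source.size());
        for (T item : source) {
            if (keep.test(item)) {
                result.add(item);
            }
        }
        return result;
    }
}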
Collections2.filter (from Guava) produces a filtered collection based on a predicate.
List<Number> myNumbers = Arrays.asList(Integer.valueOf(1), Double.valueOf(1e6));
Collection<Number> bigNumbers = Collections2.filter(
    myNumbers,
    new Predicate<Number>() {
        public boolean apply(Number n) {
            return n.doubleValue() >= 100d;
        }
    });
Note that some operations like size() are not efficient with this scheme. If you tend to follow Josh Bloch's advice and prefer isEmpty() and iterators to unnecessary size() checks, then this shouldn't bite you in practice.
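If Java 8 or later is an option, a plain-JDK alternative sketch to the Guava snippet above is Collection.removeIf, which removes matching elements in place (for an ArrayList this is one compacting pass rather than repeated shifts):
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveIfDemo {
    public static void main(String[] args) {
        List<Number> myNumbers = new ArrayList<>(Arrays.asList(1, 1e6));
        myNumbers.removeIf(n -> n.doubleValue() < 100d); // keep only the "big numbers"
        System.out.println(myNumbers);                   // [1000000.0]
    }
}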
LinkedList could be a good choice.
LinkedList does "remove and add elements" more effectively than ArrayList, and there is no need to call a method such as ArrayList.trimToSize() to release unused memory. But LinkedList is a doubly-linked list: each element is wrapped in an Entry, which needs extra memory.
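For completeness, a small sketch of removing during iteration through the Iterator, which is where LinkedList's cheap unlinking pays off (the condition is illustrative):
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class IteratorRemoval {
    public static void main(String[] args) {
        List<Integer> values = new LinkedList<>(Arrays.asList(1, 2, 3, 4, 5));
        for (Iterator<Integer> it = values.iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 0) {
                it.remove(); // O(1) unlink for LinkedList; an ArrayList would shift the tail
            }
        }
        System.out.println(values); // [1, 3, 5]
    }
}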