Is there any way to search an ArrayList in Java without using a loop? I have lots of collections to search, and searching them with loops takes a long time.
If you keep your lists sorted, you can search them significantly faster using
Collections.binarySearch(list, key);
in your favorite java.util.Collections class.
Otherwise, you might want to look into TreeSet and HashSet.
But maybe you can improve your overall algorithm? Or build an index?
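As a minimal sketch of the binary-search suggestion above, assuming the elements are Strings sorted in natural order (the class and method names are hypothetical, chosen only for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class; the name is illustrative only.
final class SortedListSearch {

    // The list MUST already be sorted in natural order; otherwise the
    // result of Collections.binarySearch is undefined.
    static boolean containsSorted(List<String> sorted, String key) {
        // binarySearch returns the index if the key is found,
        // or (-(insertion point) - 1) if it is not.
        return Collections.binarySearch(sorted, key) >= 0;
    }

    public static void main(String[] args) {
        List<String> names =
                new ArrayList<>(Arrays.asList("delta", "alpha", "charlie", "bravo"));
        Collections.sort(names); // sort once, then search many times
        System.out.println(containsSorted(names, "charlie"));
        System.out.println(containsSorted(names, "echo"));
    }
}
```

Sorting costs O(n log n) once; each later lookup is O(log n) on an ArrayList. On a LinkedList the same call still works but has to traverse links, so it does not give the same benefit.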
If the elements of your array list are not arranged in any particular order, then you have to loop over the list in one way or another.
If the array list does not change, one possibility might be to pre-sort it and then repeatedly use binary search.
Otherwise you'll need to employ a different data structure, such as a Set.
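A rough sketch of the Set alternative, assuming the elements implement equals() and hashCode() consistently (Integer is used here only as an example element type, and the class name is hypothetical):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class; the name is illustrative only.
final class SetLookup {

    // Copies the list into a HashSet once (O(n)). Later contains() calls
    // are expected O(1) instead of a linear scan over the list.
    static Set<Integer> index(List<Integer> items) {
        return new HashSet<>(items);
    }

    public static void main(String[] args) {
        List<Integer> items = Arrays.asList(7, 3, 9, 3);
        Set<Integer> lookup = index(items);
        System.out.println(lookup.contains(9));
        System.out.println(lookup.contains(4));
    }
}
```

The trade-off is that a HashSet drops duplicates and does not keep insertion order; a TreeSet keeps elements sorted at O(log n) per lookup.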
I have a problem that involves many insertions at the beginning of the list, followed by extensive search and retrieval operations. Which approach is good and efficient?
Approach 1: Use LinkedList as my data structure for the whole program.
Approach 2: Use ArrayList as my data structure for the whole program.
Approach 3: Use LinkedList as my data structure at the beginning for insertion and do
ArrayList al = new ArrayList(ll);
for retrieval operations.
How much does changing the data structure cost? Is it actually worth doing?
Since they both implement the same interface, you can find this out for yourself: write your code so that the constructor can be plugged in, and test your code both ways. Benchmarking can be done with JMH.
You can plug in the constructor by using the Supplier interface.
Depending on the nature of your problem you may find that using a Deque is appropriate.
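Plugging in the constructor through Supplier might look roughly like this. The workload (front insertions followed by indexed reads) and all names are assumptions made for illustration, not the asker's actual code:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical harness; the class and method names are illustrative only.
final class PluggableList {

    // Runs the same workload against whatever List the factory creates.
    static long run(Supplier<List<Integer>> factory, int n) {
        List<Integer> list = factory.get();
        for (int i = 0; i < n; i++) {
            list.add(0, i);      // insertion at the beginning
        }
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);  // retrieval by index
        }
        return sum;
    }

    public static void main(String[] args) {
        // Same code, two implementations: only the constructor reference changes.
        System.out.println(run(ArrayList::new, 1_000));
        System.out.println(run(LinkedList::new, 1_000));
    }
}
```

This only shows the wiring. For actual timings, a method like run would be wrapped in a JMH benchmark rather than timed by hand.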
Allow me to suggest a 4th approach: using ArrayDeque for the whole program. It has efficient insertion at the front and the back (possibly even faster than LinkedList) and search efficiency like ArrayList. ArrayDeque is an unfairly overlooked class in the Java Collections Framework, possibly because it was added later (Java 6, I think). It does not implement the List interface, so you will have to write your program specifically for it.
Other than that, the only two valid answers to your question are
Do not worry about efficiency until you absolutely have to.
If and when you have to worry about efficiency, you will have to make your own measurements of what performs satisfactorily on your data in your environment. No one here can tell you.
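For the ArrayDeque suggestion, a small sketch of front insertion followed by searching, with a hypothetical class name and Integer elements assumed for illustration:

```java
import java.util.ArrayDeque;

// Hypothetical demo class; the name is illustrative only.
final class DequeDemo {

    // Inserts 0..n-1 at the front, so the last value inserted ends up first.
    static ArrayDeque<Integer> build(int n) {
        ArrayDeque<Integer> deque = new ArrayDeque<>(n);
        for (int i = 0; i < n; i++) {
            deque.addFirst(i); // amortized O(1), no element shifting
        }
        return deque;
    }

    public static void main(String[] args) {
        ArrayDeque<Integer> deque = build(5);
        System.out.println(deque.peekFirst());
        System.out.println(deque.peekLast());
        System.out.println(deque.contains(3)); // linear scan over the backing array
    }
}
```

Note that ArrayDeque has no get(index) method, so this only fits if retrieval means searching or iterating rather than positional access.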
It depends on the frequencies of insert and retrieve operations. Here are the complexities:
ArrayList -> Insertion in the beginning : O(n)
ArrayList -> Retrieval based on index : O(1)
LinkedList -> Insertion in the beginning : O(1)
LinkedList -> Retrieval based on index : O(i) where i is number of elements to be scanned.
So, if you have more retrievals than insertions, go for ArrayList, if not, go for LinkedList.
I am new to java.
I am creating a school project in which I am using arrays.
My question is:
Which is better? Sorting an array also takes time. So would it be fine to leave it unsorted for my small school project, and add some logic to retrieve the desired array value when I need it?
It's better to keep it sorted if you think you will need sorted values at some later stage.
It would be better if you would paste your code here.
Whether the array is sorted or not, you need some logic to retrieve values from it. If your array is not very large, you need not worry about performance, since sorting doesn't take much time for small arrays.
It is better to use List over arrays since it has an implementation called ArrayList which grows dynamically.
If possible, it may be better to use a Vector.
Have you thought about what happens when you insert a new item into your array? Or if you change the sort order? If you keep the array sorted, keep these things in mind.
For these reasons, I think it is better to sort the array each time you need to return a value (best of all is using Vectors, as I said at the beginning).
I have a collection of objects that are guaranteed to be distinct (in particular, indexed by a unique integer ID). I also know exactly how many of them there are (and the number won't change), and was wondering whether Array would have a notable performance advantage over HashSet for storing/retrieving said elements.
On paper, Array guarantees constant time insertion (since I know the size ahead of time) and retrieval, but the code for HashSet looks much cleaner and adds some flexibility, so I'm wondering if I'm losing anything performance-wise using it, at least, theoretically.
Depends on your data;
HashSet gives you an O(1) contains() method but doesn't preserve order.
ArrayList contains() is O(n) but you can control the order of the entries.
With an array, inserting anything in between is O(n) in the worst case, since you have to shift the data down to make room for the insertion. With a Set, you can use a SortedSet such as TreeSet directly, which inserts in O(log n) and keeps the elements ordered for you.
I believe Set is more flexible.
The choice greatly depends on what you want to do with it.
If it is what you mentioned in your question:
I have a collection of objects that are guaranteed to be distinct (in particular, indexed by a unique integer ID). I also know exactly how many of them there are
If this is what you need to do, then you need neither of them. Collection has a size() method that tells you how many elements are in the collection.
If what you mean by "collection of objects" is not really a collection yet, and you need to choose a type of collection to store your objects for further processing, then you need to know that different kinds of collections have different capabilities and characteristics.
First, to have a fair comparison, I believe you should consider ArrayList instead of Array, so that you don't need to deal with reallocation yourself.
Then it becomes a choice of ArrayList vs HashSet, which is quite straightforward:
Do you need a List or a Set? They serve different purposes: Lists give you indexed access, and iteration is in index order, while Sets are mainly for keeping a distinct set of data and, by their nature, have no indexed access.
Once you have decided between List and Set, it becomes a choice of implementation: for Lists you normally choose between ArrayList and LinkedList, and for Sets between HashSet and TreeSet.
All of these choices depend on what you want to do with the collection. They perform differently for different operations.
For example, indexed access is O(1) in ArrayList and O(n) in HashSet (though not meaningful there). For your interest, it is O(n) in LinkedList and also O(n) in TreeSet, since both must be iterated to reach the element.
Adding a new element at the end is an amortized O(1) operation for both ArrayList and HashSet. Inserting in the middle is O(n) for ArrayList, while it doesn't make sense for HashSet. Both suffer from reallocation, and both need O(n) for it (HashSet is normally slower, because every element has to be redistributed into the new buckets).
To find if certain element exists in the collection, ArrayList is O(n) and HashSet is O(1).
There are still lots of operations you can do, so it is quite meaningless to discuss for performance without knowing what you want to do.
theoretically, and as SCJP6 Study guide says :D
Arrays are faster than collections, and as said, most collections are built mainly on arrays (Maps are not considered Collections, but they are included in the Collections Framework).
If you can guarantee that the number of elements won't change, why get stuck with objects built on objects (collections built on arrays) when you can use the root objects (arrays) directly?
It looks like you will want a HashMap that maps IDs to counts. In particular,
HashMap<Integer,Integer> counts = new HashMap<Integer,Integer>();
counts.merge(uniqueID, 1, Integer::sum); // stores 1 if the ID is absent, otherwise adds 1 (Java 8+)
This way, you get amortized O(1) adds, contains and retrievals. Essentially, an array with unique id's associated with each object IS a HashMap. By using the HashMap, you get the added bonus of not having to manage the size of the array, not having to map the keys to an array index yourself AND constant access time.
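If the goal is to store and fetch the objects themselves by their unique ID, a sketch along the same lines could look like this. The Item type and all names are hypothetical stand-ins for the asker's objects:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example; Item stands in for the asker's own element type.
final class IdLookup {

    static final class Item {
        final int id;
        final String name;

        Item(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Builds an ID -> object index; get(id) is then expected O(1).
    static Map<Integer, Item> indexById(Item[] items) {
        Map<Integer, Item> byId = new HashMap<>();
        for (Item item : items) {
            byId.put(item.id, item);
        }
        return byId;
    }

    public static void main(String[] args) {
        Item[] items = { new Item(42, "first"), new Item(7, "second") };
        Map<Integer, Item> byId = indexById(items);
        System.out.println(byId.get(7).name);
    }
}
```

If the IDs happen to be dense (0 to n-1), a plain array indexed by ID gives the same lookup without boxing the keys. Whether that difference matters is something to measure rather than assume.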
I have roughly 420,000 elements that I need to store easily in a Set or List of some kind. The restriction, though, is that I need to be able to pick a random element, and that it needs to be fast.
Initially I used an ArrayList and a LinkedList, however with that many elements it was very slow. When I profiled it, I saw that the equals() method in the object I was storing was called roughly 21 million times in a very short period of time.
Next I tried a HashSet. What I gain in performance I lose in functionality: I can't pick a random element. HashSet is backed by a HashMap which is backed by an array of HashMap.Entry objects. However when I attempted to expose them I was hindered by the crazy private and package-private visibility of the entire Java Collections Framework (even copying and pasting the class didn't work, the JCF is very "Use what we have or roll your own").
What is the best way to randomly select an element stored in a HashSet or HashMap? Due to the size of the collection I would prefer not to use looping.
IMPORTANT EDIT: I forgot a really important detail: exactly how I use the Collection. I populate the entire Collection at the beginning of the table. During the program I pick and remove a random element, then pick and remove a few more known elements, then repeat. The constant lookup and changing is what causes the slowness.
There's no reason why an ArrayList or a LinkedList would need to call equals()... although you don't want a LinkedList here as you want quick random access by index.
An ArrayList should be ideal - create it with an appropriate capacity, add all the items to it, and then you can just repeatedly pick a random number in the appropriate range, and call get(index) to get the relevant value.
HashMap and HashSet simply aren't suitable for this.
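A minimal sketch of the random pick described above, assuming a non-empty list (the class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper class; the name is illustrative only.
final class RandomPick {

    // O(1) on an ArrayList: one random index, one get(index).
    // The list must not be empty, because nextInt(0) throws
    // IllegalArgumentException.
    static <T> T pick(List<T> list, Random rng) {
        return list.get(rng.nextInt(list.size()));
    }

    public static void main(String[] args) {
        int capacity = 420_000; // sized up front, as suggested above
        List<Integer> values = new ArrayList<>(capacity);
        for (int i = 0; i < capacity; i++) {
            values.add(i);
        }
        System.out.println(pick(values, new Random()));
    }
}
```

No equals() call is involved in this path, which is why the pick itself should not show up in a profile the way contains() or remove(Object) would.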
If ALL you need to do is get a large collection of values and pick a random one, then ArrayList is (literally) perfect for your needs. You won't get significantly faster (unless you went directly to primitive array, where you lose benefits of abstraction.)
If this is too slow for you, it's because you're using other operations as well. If you update your question with ALL the operations the collection must service, you'll get a better answer.
If you don't call contains() (which will call equals() many times), you can use ArrayList.get(randomNumber) and that will be O(1)
You can't do it with a HashMap: it stores the objects internally in an array, where the index is derived from the object's hash code. Even if you had access to that table, you'd need to guess which buckets contain objects. So a HashMap is not an option for random access.
Assuming that equals() calls are because you sort out duplicates with contains(), you may want to keep both a HashSet (for quick if-already-present lookup) and an ArrayList (for quick random access). Or, if operations don't interleave, build a HashSet first, then extract its data with toArray() or transform it into ArrayList with constructor of the latter.
If your problems are due to remove() call on ArrayList, don't use it and instead:
if the element you remove is not the last one, just replace it (with set()) with the last element;
shrink the list size by 1.
This will of course screw up element order, but apparently you don't need it, judging by description. Or did you omit another important detail?
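The two steps above can be sketched as follows, assuming element order does not matter (the class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the name is illustrative only.
final class SwapRemove {

    // Removes the element at index in O(1) on an ArrayList by overwriting
    // it with the last element and then dropping the last slot.
    // Element order is NOT preserved.
    static <T> T removeAt(List<T> list, int index) {
        int last = list.size() - 1;
        T removed = list.get(index);
        if (index != last) {
            list.set(index, list.get(last)); // move the last element into the gap
        }
        list.remove(last); // removing the final element shifts nothing
        return removed;
    }

    public static void main(String[] args) {
        List<String> items = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println(removeAt(items, 1));
        System.out.println(items);
    }
}
```

Removing a known element by value would still need its index first. One option is to keep a HashMap from element to index alongside the list, updated on each swap, but that is an extension beyond what the answer describes.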
I want to use data structure that needs to be sorted every now and again. The size of the data structure will hardly exceed 1000 items.
Which one is better - ArrayList or LinkedList?
Which sorting algorithm is better to use?
Up to Java 7, it made no difference because Collections.sort would dump the content of the list into an array.
With Java 8, using an ArrayList should be slightly faster because Collections.sort will call List.sort and ArrayList has a specialised version that sorts the backing array directly, saving a copy.
So bottom line is ArrayList is better as it gives a similar or better performance depending on the version of Java.
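As a small illustration of the Java 8 path mentioned above, assuming Integer elements and a hypothetical class name:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class; the name is illustrative only.
final class SortDemo {

    // Copies the input so the caller's list is left untouched, then sorts
    // the copy in place with List.sort (Java 8+).
    static List<Integer> sortedCopy(List<Integer> input) {
        List<Integer> copy = new ArrayList<>(input);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortedCopy(Arrays.asList(3, 1, 2)));
    }
}
```

Collections.sort(copy) would give the same result. For around 1000 items either call is likely to be fast enough that the choice of list matters little.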
If you're going to be using java.util.Collections.sort(List) then it really doesn't matter.
The list will get dumped into an array for purposes of sorting anyway.
(Thanks for keeping me honest Ralph. Looks like I confused the implementations of sort and shuffle. They're close enough to the same thing right?)
If you can use the Apache Commons Collections library, then have a look at TreeList. It is designed for this kind of problem.
Only 1000 items? Why do you care?
I almost always use ArrayList unless I have a specific need to do otherwise.
Have a look at the source code. I think sorting is based on arrays anyway, if I remember correctly.
If you are just sorting and not dynamically updating your sorted list, then either is fine and an array will be more memory efficient. Linked lists are really better if you want to maintain a sorted list. Inserting an object is fast into the middle of a linked list, but slow into an array.
Arrays are better if you want to find an object in the middle. With a sorted array, you can do a binary search and find out whether a member is in the list in O(log N) time. With a linked list, you need to walk the list, which is very slow.
I guess which is better for your application depends on what you want to do with the list after it is sorted.