I need a collection class that offers both fast indexed access and fast hash-based lookup.
Right now I use an ArrayList. It has good indexed access, but its contains method is slow (a linear scan). HashSet has a fast contains implementation but no indexed access. Which collection has both? Perhaps something from Apache?
Or should I create my own collection class that combines both: an ArrayList for indexed access and a HashSet for the contains check?
Just for clarification: I need both get(int index) and contains(Object o).
If indexed access performance is not a problem, the closest match is LinkedHashSet, whose API documentation describes it as a
Hash table and linked list implementation of the Set interface, with predictable iteration order.
LinkedHashSet has no get(int index), so reaching the i-th element means iterating, but at least I don't think the performance will be worse than that of LinkedList. Otherwise I can see no alternative to your ArrayList + HashSet solution.
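To illustrate the LinkedHashSet suggestion, here is a small sketch. The class name LinkedHashSetDemo and the helper elementAt are made up for this example; only LinkedHashSet itself comes from the JDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of any library.
class LinkedHashSetDemo {
    // Returns the elements in insertion order; the repeated "a" is ignored.
    static List<String> insertionOrder() {
        Set<String> set = new LinkedHashSet<>();
        set.add("c");
        set.add("a");
        set.add("b");
        set.add("a"); // already present, so the set and its order are unchanged
        return new ArrayList<>(set);
    }

    // LinkedHashSet has no get(int), so reaching a position means iterating: O(n).
    static <E> E elementAt(Set<E> set, int index) {
        int i = 0;
        for (E e : set) {
            if (i++ == index) {
                return e;
            }
        }
        throw new IndexOutOfBoundsException("index: " + index);
    }
}
```

contains stays a hash lookup, while elementAt is linear, which is why this only fits when indexed access is rare or the traversal is sequential.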
If you are traversing the index from start to finish, I think this might satisfy your needs: LinkedHashSet
If you need random access by index as well as hash access, and no one else has a better suggestion, I guess you can write your own collection that does both.
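Do it like this: combine a hash-based structure with a list to get the best of both worlds :)
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DataStructure { // uses Integer elements to keep it simple
    private final Map<Integer, Integer> hash = new HashMap<Integer, Integer>();
    private final List<Integer> list = new ArrayList<Integer>();
    public void add(Integer i) {
        hash.put(i, i);   // hashed copy, used for contains checks
        list.add(i);      // ordered copy, used for indexed access
    }
    public Integer get(int index) {
        return list.get(index);
    }
    ...
}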
So you keep each object in a HashMap/HashSet as well as in an ArrayList.
Then, if you want to use
the contains method: call the hash-based contains method.
get an object by index: use the list to return the value.
Just make sure that you keep both collections in sync, and apply every update and deletion to both data structures.
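As a rough sketch of what keeping the two collections in sync could look like, assuming set semantics (duplicates are rejected). The name IndexedHashSet is made up for this example and is not a JDK or Apache class.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical wrapper: an ArrayList for get(int) plus a HashSet for contains.
class IndexedHashSet<E> {
    private final List<E> list = new ArrayList<>();
    private final Set<E> set = new HashSet<>();

    // Adds the element to both structures; returns false if it was already present.
    boolean add(E e) {
        if (!set.add(e)) {
            return false;
        }
        list.add(e);
        return true;
    }

    // Indexed access is delegated to the list: O(1).
    E get(int index) {
        return list.get(index);
    }

    // Membership is delegated to the set: O(1) expected.
    boolean contains(Object o) {
        return set.contains(o);
    }

    // Removes from both structures; the list removal is a linear scan.
    boolean remove(Object o) {
        if (!set.remove(o)) {
            return false;
        }
        list.remove(o);
        return true;
    }

    int size() {
        return list.size();
    }
}
```

The trade-off is double memory for the references and an O(n) remove, in exchange for fast get and contains.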
I don't know the exact lookup times, but maybe you could use some implementation of the Map interface. You could store your objects with map.put(objectHash, obj).
Then you could verify that you have a certain object with the call below (note that HashMap.containsValue scans every value, so it is linear; map.containsKey(objectHash) is the constant-time check):
boolean contained = map.containsValue(obj);
And you can use the hash to look up an object in the map:
MyObject object = map.get(objectHash);
The only downside is that you would need to know the hash for this lookup call, which may not be practical in your implementation.
This is a very general question; I'll try to be as clear as I can. Let's say I have some collection of objects; for simplicity, make them integers. Now I want to make a class which represents these integers as some data structure. In this class I want to implement
a sort function, which sorts the collection according to some defined sorting logic
the iterable interface, where the Iterator traverses in insertion order
How could I make it so that, even if I add integers in unsorted order, e.g.
someCollection.add(1);
someCollection.add(3);
someCollection.add(2);
and then call
someCollection.sort(someSortingLogic);
the iterator still traverses in insertion order after the collection is sorted? Is there a particular data structure I could use for this purpose, or would it be a case of manually tracking which elements are inserted in which order, or something else I can't think of?
Many thanks!
Generally, to solve a problem like this, you maintain two indexes to the values. Perhaps one of those indexes contains the actual values, perhaps both indexes contain the actual values, or perhaps the actual values are stored elsewhere.
Then when you want to walk the sorted order, you use the sorted index to the values, and when you want the insertion order, you use the insert index to the values.
An index can be as simple as an array containing the values. Naturally, you can't store two different values into one spot in an array, so a simple solution is to wrap two arrays in an Object, such that calling the Object's sort() method sorts one array, while leaving the insertion order array untouched.
Fancier data structures leverage fancier techniques, but they all basically boil down to maintaining two orders, the insertion order AND the sort order.
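public class SomeCollection {
    // Field declarations are assumed here; the original snippet omits them.
    private int[] insertArray = new int[16]; // values in insertion order
    private int[] sortArray = new int[16];   // the same values, kept sorted
    private int insertIndex = 0;
    private int sortIndex = 0;

    public void add(int value) {
        insertArray = expandIfNeeded(insertArray);
        insertArray[insertIndex++] = value;
        sortArray = expandIfNeeded(sortArray);
        sortArray[sortIndex++] = value;
        // sort must only order the first sortIndex entries, not the unused tail
        sort(sortArray);
    }
    // expandIfNeeded and sort are helper methods elided here
    ...
}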
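For comparison, here is a self-contained sketch of the same two-index idea using two lists instead of raw arrays. The class name DualOrderCollection and its methods are invented for this example; the sorted view is only refreshed when sort is called.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Hypothetical class: one list per ordering, both holding the same values.
class DualOrderCollection implements Iterable<Integer> {
    private final List<Integer> insertionOrder = new ArrayList<>();
    private final List<Integer> sortedOrder = new ArrayList<>();

    void add(int value) {
        insertionOrder.add(value);
        sortedOrder.add(value); // stays unsorted until sort(...) is called again
    }

    // Sorts only the second list; the insertion-order list is never touched.
    void sort(Comparator<? super Integer> logic) {
        sortedOrder.sort(logic);
    }

    // Iteration always follows insertion order.
    @Override
    public Iterator<Integer> iterator() {
        return Collections.unmodifiableList(insertionOrder).iterator();
    }

    // Read-only view of the values as of the last sort(...) call.
    List<Integer> sorted() {
        return Collections.unmodifiableList(sortedOrder);
    }
}
```

With the values 1, 3, 2 added in that order, iterating gives 1, 3, 2 even after sort has been called with natural ordering, while sorted() gives 1, 2, 3.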
I'm not sure you've shown us enough code to give you a good answer, but if you have a class that looks a bit like this:
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class Hand implements Iterable<Card>
{
    private List<Card> cards = new ArrayList<>();
    // Returns iterator for natural ordering of cards
    @Override
    public Iterator<Card> iterator()
    {
        return cards.iterator();
    }
    // Rest of code omitted
}
Then you can implement a sortedIterator(...) method like this:
// Returns iterator for sorted ordering by Comparator c
public Iterator<Card> sortedIterator(Comparator<? super Card> c)
{
return cards.stream().sorted(c).iterator();
}
If you show us some more code for what you have written, there may be better solutions.
Hi, I'm wondering if it is possible to access the contents of a HashSet directly if you have the hash code of the object you're looking for, sort of like using the hash code as a key in a HashMap.
I imagine it might work something sort of like this:
MyObject object1 = new MyObject(1);
Set<MyObject> myHashSet = new HashSet<MyObject>();
myHashSet.add(object1);
int hash = object1.hashCode();
MyObject object2 = myHashSet[hash]; // ??? wished-for lookup by hash code, not valid Java
Thanks!
edit: Thanks for the answers. Okay, I understand that I might be pushing the contract of HashSet a bit, but for this particular project equality is solely determined by the hash code, and I know for sure that there will be only one object per hash code/hash bucket. The reason I was pretty reluctant to use a HashMap is that I would need to convert the primitive ints I'm mapping with to Integer objects, as a HashMap only takes objects as keys, and I'm worried that this might affect performance. Is there anything else I could use to implement something similar?
The common implementation of HashSet is backed (rather lazily) by a HashMap so your effort to avoid HashMap is probably defeated.
On the basis that premature optimization is the root of all evil, I suggest you use a HashMap initially and if the boxing/unboxing overhead of int to and from Integer really is a problem you'll have to implement (or find) a handcrafted HashSet using primitive ints for comparison.
The standard Java library really doesn't want to concern itself with boxing/unboxing costs.
The whole language sold that performance issue for a considerable gain in simplicity long ago.
Notice that these days (since 2004!) the language automatically boxes and unboxes which reveals a "you don't need to be worrying about this" policy. In most cases it's right.
I don't know how 'richly' featured your HashKeyedSet needs to be but a basic hash-table is really not too hard.
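To give an idea of how small a basic hash table keyed by primitive ints can be, here is a rough sketch using open addressing with linear probing. The class IntKeyMap is invented for this example; it has no removal and no iteration, and it is not a substitute for a tested primitive-collections library.

```java
// Hypothetical minimal int-keyed hash table (open addressing, linear probing).
class IntKeyMap<V> {
    private int[] keys = new int[16];
    private Object[] values = new Object[16];
    private boolean[] used = new boolean[16];
    private int size = 0;

    // Capacity is always a power of two, so masking picks a valid slot.
    private static int slot(int key, int capacity) {
        int h = key * 0x9E3779B9; // multiply to spread the bits of the key
        return (h ^ (h >>> 16)) & (capacity - 1);
    }

    // Stores the value under a primitive key; no Integer boxing of keys.
    void put(int key, V value) {
        if ((size + 1) * 2 > keys.length) {
            resize(); // keep the table at most half full
        }
        int i = slot(key, keys.length);
        while (used[i] && keys[i] != key) {
            i = (i + 1) & (keys.length - 1); // probe the next slot
        }
        if (!used[i]) {
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    // Returns the stored value, or null if the key is absent.
    @SuppressWarnings("unchecked")
    V get(int key) {
        int i = slot(key, keys.length);
        while (used[i]) {
            if (keys[i] == key) {
                return (V) values[i];
            }
            i = (i + 1) & (keys.length - 1);
        }
        return null;
    }

    // Doubles the capacity and reinserts every entry.
    @SuppressWarnings("unchecked")
    private void resize() {
        int[] oldKeys = keys;
        Object[] oldValues = values;
        boolean[] oldUsed = used;
        keys = new int[oldKeys.length * 2];
        values = new Object[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        size = 0;
        for (int j = 0; j < oldKeys.length; j++) {
            if (oldUsed[j]) {
                put(oldKeys[j], (V) oldValues[j]);
            }
        }
    }
}
```

Whether this actually beats HashMap<Integer, V> for a given workload is something to measure rather than assume.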
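HashSet is internally backed by a HashMap, which unfortunately for this question is not available through the public API. However, we can use reflection to gain access to the internal map and then find a key with an identical hashCode. Note that on JDK 16 and later this reflective access fails with an InaccessibleObjectException unless the JVM is started with --add-opens java.base/java.util=ALL-UNNAMED.
import java.lang.reflect.Field;
import java.util.HashSet;
import java.util.Map;

private static <E> E getFromHashCode(final int hashcode, HashSet<E> set) throws Exception {
    // reflection stuff: relies on the private field name "map", an implementation detail
    Field field = set.getClass().getDeclaredField("map");
    field.setAccessible(true);
    // get the internal map
    @SuppressWarnings("unchecked")
    Map<E, Object> internalMap = (Map<E, Object>) (field.get(set));
    // attempt to find a key with an identical hashcode (linear scan over all keys)
    for (E elem : internalMap.keySet()) {
        if (elem.hashCode() == hashcode) return elem;
    }
    return null;
}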
Used in an example:
HashSet<String> set = new HashSet<>();
set.add("foo"); set.add("bar"); set.add("qux");
int hashcode = "qux".hashCode();
System.out.println(getFromHashCode(hashcode, set));
Output:
qux
This is not possible: HashSet exposes no public API for looking up an element by its hash code. Also, multiple distinct objects can share the same hash code.
Finally, only arrays can be accessed using the myArray[<index>] syntax.
You can easily write code that will directly access the internal data structures of the HashSet implementation using reflection. Of course, your code will depend on the implementation details of the particular JVM you are coding to. You also will be subject to the constraints of the SecurityManager (if any).
A typical implementation of HashSet uses a HashMap as its internal data structure. The HashMap has an array, which is indexed by the key's hashcode mapped to an index in the array. The hashcode mapping function is available by calling non-public methods in the implementation - you will have to read the source code and figure it out. Once you get to the right bucket, you will just need to find (using equals) the right entry in the bucket.
Basically, what I'm trying to do is store two values in a HashMap. I've tried a Dictionary and failed with that as well. Anyhow, here we go.
private HashMap<Integer, Integer> dropTable = new HashMap<Integer, Integer>();
Then, in my code I have this
for(int npc = 0; npc < 10; npc++){
dropTable.put(npc, Constants.itemDrops[npc][1]);
}
Basically, what I'm trying to do is save the values in this manner (with the ItemID being what is returned from the itemDrops array):
<ArrayIndex, ItemID>
However, when I try to retrieve this information, I can't figure it out.
Here is how I attempted returning this value
for(int i = 0; i < dropTable.size(); i++) {
System.out.println("NPC: " + dropTable.get((Integer)i));
}
However, that returns null, and looking at it, it won't give me what I need.
How would I go about retrieving the key and value separately from the HashMap based on the index of the HashMap? (If HashMaps even have indexes; that's the impression I'm under.)
===============
My idea of a hashmap.
<Integer>, <Integer> Index: 0
<Integer>, <Integer> Index: 1
etc...
HashMaps are unsorted and do not have indexes, and the order in which entries are returned during iteration is not guaranteed to be the order of insertion.
There are other Map implementations available which provide some of these features. TreeMap is sorted, for example.
All Maps provide a keySet() method, which returns the set of keys, and a similar values() method, which returns a Collection of the values. Iterating over these is likely to provide the sort of behaviour you want.
The first thing to ask: is a HashMap really the data structure you need?
If not, then just use a simple List and its get(index) method!
If yes, then you could use a Map implementation, such as LinkedHashMap, that preserves the insertion order. Then you can call values() on that map and iterate over it using an enhanced for loop, or create a List from the returned collection if you need indexed access.
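A minimal sketch of the LinkedHashMap route follows. The class DropTableDemo and its parameter itemDrops are stand-ins invented for this example (the original Constants.itemDrops array is not shown in the question).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class built around the question's drop table.
class DropTableDemo {
    // Maps each array index (the NPC) to the item id in column 1, keeping insertion order.
    static Map<Integer, Integer> buildDropTable(int[][] itemDrops) {
        Map<Integer, Integer> dropTable = new LinkedHashMap<>();
        for (int npc = 0; npc < itemDrops.length; npc++) {
            dropTable.put(npc, itemDrops[npc][1]);
        }
        return dropTable;
    }

    // Copies the values into a List when positional access is needed.
    static List<Integer> itemIdsByPosition(Map<Integer, Integer> dropTable) {
        return new ArrayList<>(dropTable.values());
    }

    // Walks the entries to read key and value separately, one line per entry.
    static String describe(Map<Integer, Integer> dropTable) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Integer, Integer> entry : dropTable.entrySet()) {
            sb.append("NPC: ").append(entry.getKey())
              .append(" item: ").append(entry.getValue())
              .append('\n');
        }
        return sb.toString();
    }
}
```

Iterating entrySet() avoids relying on the keys being 0..size-1, which is the assumption the loop over dropTable.size() makes.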
If you want to get your objects by index, what you are really looking for is an ArrayList, which allows access by index and is basically a dynamic array.
If you want two objects "attached" to the same index, you can use two ArrayLists (one for each type of object), or use a Pair (from Apache Commons) as the element type of the ArrayList.
I have a sorted array of objects, where the order is based on some object attribute. (Sorting is done via a List using Collections.sort() with a custom Comparator and then calling toArray().)
Duplicate instances of SomeObject are not allowed ("duplicate" in this regard depends on multiple attribute values in SomeObject), but it's possible that multiple instances of SomeObject have the same value for attribute1, which is used for sorting.
public class SomeObject {
    public int attribute1; // int is assumed; the original omits the type
    public int attribute2;
}
List<SomeObject> list = ...
Collections.sort(list, new Comparator<SomeObject>() {
    @Override
    public int compare(SomeObject v1, SomeObject v2) {
        if (v1.attribute1 > v2.attribute1) {
            return 1;
        } else if (v1.attribute1 < v2.attribute1) {
            return -1;
        } else {
            return 0;
        }
    }
});
SomeObject[] array = list.toArray(new SomeObject[0]);
How can I efficiently check whether a certain object, based on some attribute, is in that array, while also being able to "mark" objects already found in a previous lookup (e.g. simply by removing them from the array; objects already found don't need to be accessed at a later time)?
Without the latter requirement, one could do an Arrays.binarySearch() with a custom Comparator. But obviously that doesn't work when one wants to remove objects already found.
Use a TreeSet (or TreeMultiset).
You can initialize it with your comparator; it sorts itself; look-up and removal are in logarithmic time.
You can also check for existence and remove in one step, because remove returns a boolean.
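Here is a sketch of the TreeSet approach. One thing to watch: a TreeSet treats elements that compare as equal as duplicates, so this example assumes the comparator breaks ties on attribute2 in addition to attribute1. The SomeObject fields are assumed to be ints, and the class TreeSetLookup is made up for the example.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Assumed shape of SomeObject: two int attributes.
class SomeObject {
    final int attribute1;
    final int attribute2;

    SomeObject(int attribute1, int attribute2) {
        this.attribute1 = attribute1;
        this.attribute2 = attribute2;
    }
}

// Hypothetical helper class around a TreeSet with a custom comparator.
class TreeSetLookup {
    // Orders by attribute1, then attribute2, so equal attribute1 values do not collide.
    static final Comparator<SomeObject> ORDER =
            Comparator.comparingInt((SomeObject o) -> o.attribute1)
                      .thenComparingInt(o -> o.attribute2);

    // Builds the sorted set; insertion is O(log n) per element.
    static TreeSet<SomeObject> index(Iterable<SomeObject> objects) {
        TreeSet<SomeObject> set = new TreeSet<>(ORDER);
        for (SomeObject o : objects) {
            set.add(o);
        }
        return set;
    }

    // Checks for existence and "marks" the object by removing it, in one O(log n) step.
    static boolean findAndMark(TreeSet<SomeObject> set, SomeObject probe) {
        return set.remove(probe);
    }
}
```

A second findAndMark call with the same probe returns false, because the first call already removed the match.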
Building on Arian's answer, you can also use TreeBag from Apache Commons' Collections library. This is backed by a TreeMap, and maintains a count for repeated elements.
If you want you can put all the elements into some sort of linked list whose nodes are also connected in a heap form when you sort them. That way, finding an element would be log n and you can still delete the nodes in place.
I want to create a large (~300,000 entries) List of self defined objects of the class Drug.
Every Drug has an ID and I want to be able to search the Drugs in logarithmic time via that ID.
What kind of List do I have to use?
How do I declare that it should be searchable via the ID?
The various implementations of the Map interface should do what you want.
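Just remember to override both hashCode() and equals() in your Drug class if you plan to use Drug objects themselves as HashMap keys; if the key is the ID (for example an Integer), no override is needed.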
public class Drug implements Comparable<Drug> {
    private final Integer id; // assumed field; the original snippet omits it
    public Drug(Integer id) { this.id = id; }
    public Integer getId() { return id; }
    @Override
    public int compareTo(Drug o) {
        return this.id.compareTo(o.getId());
    }
}
Then in your List you can use binarySearch
List<Drug> drugList;       // list of all drugs
Drug drugToSearchFor;      // the drug that you want to search for, containing the id
// Sort before searching; binarySearch requires a sorted list, and one sort serves many searches
Collections.sort(drugList);
int index = Collections.binarySearch(drugList, drugToSearchFor);
return index >= 0;
Wouldn't you use a TreeMap instead of a List, with the ID as your key?
If searching by a key is important for you, then you probably need to use a Map and not a List. From the Java Collections Trail:
The three general-purpose Map implementations are HashMap, TreeMap and LinkedHashMap. If you need SortedMap operations or key-ordered Collection-view iteration, use TreeMap; if you want maximum speed and don't care about iteration order, use HashMap; if you want near-HashMap performance and insertion-order iteration, use LinkedHashMap.
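A short sketch of the Map approach for the Drug case, keyed by ID. The Drug fields and the DrugIndex class are assumptions made for this example, since the question does not show the Drug class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Drug class; only the id matters for the lookup.
class Drug {
    private final int id;
    private final String name;

    Drug(int id, String name) {
        this.id = id;
        this.name = name;
    }

    int getId() {
        return id;
    }

    String getName() {
        return name;
    }
}

// Hypothetical index: the ID is the key, so Drug needs no hashCode/equals override.
class DrugIndex {
    private final Map<Integer, Drug> byId = new HashMap<>();

    void add(Drug drug) {
        byId.put(drug.getId(), drug);
    }

    // Returns the drug with the given ID, or null if there is none.
    Drug find(int id) {
        return byId.get(id);
    }

    int size() {
        return byId.size();
    }
}
```

Swapping new HashMap<>() for new TreeMap<>() would give O(log n) lookups with key-ordered iteration instead of expected O(1) lookups with no ordering.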
Due to the high number of entries, you might consider using a database instead of holding everything in memory.
If you still want to keep it in memory you might have a look at b-trees.
You could use any list, and as long as it is sorted you can use a binary search.
But I would use a Map which searches in O(1).
I know I am being pretty redundant with this statement, but as everybody has said, isn't this exactly the case for a Map?