I want to write my own Map in Java. I know how map works, but i don't really know where you can keep keys and values. Can i keep them for example in List? So the keys would be store in the list and values would be store in another list?
Best would be if you checked out some of the concepts behind HashMap, TreeMap, HeapMap etc.
Once you understand those concepts, you're far better prepared for writing your own map when it comes to speed.
In other words: unless you know the concepts of all available implementations, it is very unlikely your wheel-re-invention will be a better solution.
Also be sure to test your implementations very thoroughly, as Collection are the backbone and heart of any good application.
Two very very simple (but slow) solutions are these:
1) As suggested above, you can use an ArrayList<Pair> and add your custom getItemByKey() (in Java commonly named 'get') method.
2) You can use two arrays, both keeping the same size, and keeping keys and values matched by their respective indices.
For choosing the data structure there's not better than Array (not all time but almost) of Entries (key/value) because the main goal of map is to map objects for objects, so mapping keys to values.
Using arrays for fast and constant access O(1), but you have a little problem, when your map is full, you have to create new Array and copy old entries.
Note: HashMap works in the same way.
Related
Given set of attributes and a comparator I'd like to generate an order preserving hash code that provides O(1) access. Is there a Java library for this sort of thing or would I have to design the hash function myself?
Try:
java.util.LinkedHashMap()
There is no single collection that will do this. Depending on the detail requirements there are several options to chose from.
For simplicity, I would just use a HashMap for lookups and when I need the sorted data, I'd make a copy of the values and sort it:
List<?> sorted = new ArrayList<?>(hashMap.values());
Collections.sort(sorted, Comparator<?>);
This suffices for most real world use cases.
You could also write your own super-container that internally holds the elements in two collections, one HashMap and maybe a TreeSet. You can then easily provide access methods that make use of the collection better for the purpose of the method. Just make sure you make additions and removals affect both the contained collections.
Input : Let's say I have an object as Person. It has 2 properties namely
ssnNo - Social Security Number
name.
In one hand I have a List of Person objects (with unique ssnNo) and in the other hand I have a Map containing Person's ssnNo as the key and Person's name as the value.
Output : I need Person names using its ssnNo.
Questions :
Which approach to follow out of the 2 I have mentioned above i.e. using list or map? (I think the obvious answer would be the map).
If it is the map, is it always recommended to use map whether the data-set is large or small? I mean are there any performance issues that come with the map.
Map is the way to go. Maps perform very well, and their advantages over lists for lookups get bigger the bigger your data set gets.
Of course, there are some important performance considerations:
Make sure you have a good hashcode (and corresponding equals) implementation, so that you data will be evenly spread across the buckets of the Map.
Make sure you pre-size your Map when you allocate it (if at all possible). The map will automatically resize, but the resize operation essentially requires re-inserting each prior element into the new, bigger Map.
You're right, you should use a map in this case. There are no performance issues using map compared to lists, the performance is significantly better than that of a list when data is large. Map uses key's hashcodes to retrieve entries, in similar way as arrays use indexes to retrieve values, which gives good performance
This looks like a situation appropriate for a Map<Long, Person> that maps a social security number to the relevant Person. You might want to consider removing the ssnNo field from Person so as to avoid any redundancies (since you would be storing those values as keys in your map).
In general, Maps and Lists are very different structures, each suited for different circumstances. You would use the former whenever you want to maintain a set of key-value pairs that allows you to easily and quickly (i.e. in constant time) look up values based on the keys (this is what you want to do). You would use the latter when you simply want to store an ordered, linear collection of elements.
I think it makes sense to have a Person object, but it also makes sense to use a Map over a List, since the look up time will be faster. I would probably use a Map with SSNs as keys and Person objects as values:
Map<SSN,Person> ssnToPersonMap;
It's all pointers. It actually makes no sense to have a Map<ssn,PersonName> instead of a Map<ssn,Person>. The latter is the best choice most of the time.
Using map especially one that implement using a hash table will be faster than the list since this will allow you to get the name in constant time O(1). However using the list you need to do a linear search or may be a binary search which is slower.
I'm making a java application that is going to be storing a bunch of random words (which can be added to or deleted from the application at any time). I want fast lookups to see whether a given word is in the dictionary or not. What would be the best java data structure to use for this? As of now, I was thinking about using a hashMap, and using the same word as both a value and the key for that value. Is this common practice? Using the same string for both the key and value in a (key,value) pair seems weird to me so I wanted to make sure that there wasn't some better idea that I was overlooking.
I was also thinking about alternatively using a treeMap to keep the words sorted, giving me an O(lgn) lookup time, but the hashMap should give an expected O(1) lookup time as I understand it, so I figured that would be better.
So basically I just want to make sure the hashMap idea with the strings doubling as both key and value in each (key,value) pair would be a good decision. Thanks.
I want fast lookups to see whether a given word is in the dictionary or not. What would be the best java data structure to use for this?
This is the textbook usecase of a Set. You can use a HashSet. The naive implementation for Set<T> uses a corresponding Map<T, Object> to simply mark whether the entry exists or not.
If you're storing it as a collection of words in a dictionary, I'd suggest taking a look at Tries. They require less memory than a Set and have quick lookup times of worst case O(string length).
Any class that is a Set should help your purpose. However, Do note that Set will not allow for duplicates. For that matter, even a Map won't allow duplicate keys. I would suggest on using a an ArrayList(assuming synchronization is not needed) if you need to add duplicate entries and treat them as separate.
My only concern would be memory, if you use the HashSet and if you have a very large collection of words... Then you will have to load the entire collection in the memory... If that's not a problem.... (And your collection must be very large for this to be a problem)... Then the HashSet should be fine... If you indeed have a very large collection of words, then you can try to use a tree, and only load in memory the parts that you are interested in.
Also keep in mind that insertion is fast, but not as fast as in a tree, remember that for this to work, Java is going to insert every element sorted. Again, nothing major, but if you add a lot of words at a time, you may consider using a tree...
I have a java class that contains a hash map as a member. This class is created with many objects. In many of these cases, one object of this type is cloned to another object, and then changed. The cloning is required because the changes modify the hash map, and I need to keep the original hash map of the original object intact.
I am wondering if anyone has any suggestions how to speed up the cloning part, or maybe some trick to avoid it. When I profile the code, most time is spent on the cloning these hash maps (which usually have very small set of values, a few hundreds or so).
(I am currently using the colt OpenIntDoubleHashMap implementation.)
You should use more effective algorithms for it. Look at the http://code.google.com/p/pcollections/ library, the PMap structure which allows immutable maps.
UPDATE
If your map is quite small (you said only a few hundreds), maybe more effective would be just two arrays:
int keys[size];
double values[size];
In this case to clone the map you just need do use System.arraycopy which should work very fast.
Maybe implement a copy-on-write wrapper for your map if the original only changes occasionally.
If only a small fraction of the objects change, could you implement a two-layer structure:
Layer 1 is the original map.
Layer 2 keeps the changed elements only.
Any object from the original map that needs to change gets cloned, modified an put into the layer-2 map.
Lookups first consult the layer-2 map and, if the object is not found, fall back to the layer-1 map.
This question relates to using most efficient data structure for a part of a uni-project.
I have to store several instruction objects in a data structure. Each instruction has a unique int ID called Stage. Is HashMap the best choice to find the instruction i need fast ?I havent used it before but from the description it seems that using the int ID as key would make this run efficiently. If you can, please suggest a more efficient way to do it. Thanks
If you only want to lookup entries and not add/delete move, sort or do anything else,
than an array is the fastest data structure for this.
Yes. Some kind of Map seems to be the data structure of choice in your scenario.
Note that a HashMap does not maintain the order of its elements. If order is important to you, I suggest you use LinkedHashMap (or perhaps even some List structure) instead.
I think that's the best way because that way you can access the table in O(1).
Depends also on what's the type of your ids, maybe an array is enough (and even more efficient), but generally a hash-table is more flexible for these purposes.
If you know the domain of the keys, an Arraylist or a plain array may be even more efficient. But there are reasons not to use plain arrays too much.
If the ID's are simply Integers and they are like 0,1,2..n then an array would be the best choice.