I have a requirement where I need to map a set of configurations with a set of values, ideally denoted by a Map<Map<String, Object>, Map<String, Object>> structure.
Both the configurations & the values part of the main Map are arbitrary & hence, I am unable to use a concrete class.
Please provide some feedback on this structure. Can a Map be used as a key for another Map? Doing a bit of research, I was able to establish that the Map's equals method uses all underlying keys and values to deem two Maps equal. Also, the hashCode of a Map is based on the hash codes of its entries (keys and values). This IMO should satisfy the minimum requirements for using a Map as a key.
I would still like someone to validate this before I go ahead with the implementation. In case there is a better solution / design that someone can suggest, please feel free to do so.
EDIT
I ended up using a simple tilde ('~') & pipe ('|') separated String as the key & deconstructed it whenever needed. Thanks to all who helped.
Yes, a HashMap can be used as a key to another map, as the class properly overrides .equals() and .hashCode().
However it's broadly speaking a bad idea to use mutable types (such as HashMap) as Map keys or Set elements, because you violate the invariants these classes expect if the objects are mutated while in the collection.
While not quite what you're looking for, Guava offers several additional data structures such as Multiset, Multimap, BiMap, and Table, which may be useful. They also offer immutable collections such as ImmutableMap which (because they can't be mutated) are safer to use as a Map key. Which isn't to say you should do so, simply that it's safe (if the keys and values are also immutable).
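A minimal sketch of the hazard, using only JDK classes. The class and method names are made up for illustration, and Collections.unmodifiableMap over a private copy stands in for an immutable map such as Guava's ImmutableMap:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; names are illustrative only.
class MapKeyDemo {
    /** Mutates a HashMap after using it as a key; reports whether it can still be found. */
    static boolean lookupAfterMutation() {
        Map<String, Object> config = new HashMap<>();
        config.put("env", "prod");

        Map<Map<String, Object>, String> outer = new HashMap<>();
        outer.put(config, "value-1");

        // Mutating the key changes its hashCode while it sits inside the outer map.
        config.put("region", "eu");
        return outer.containsKey(config);
    }

    /** Uses an unmodifiable snapshot as the key; an equal map still finds the entry. */
    static String lookupWithSnapshot() {
        Map<String, Object> config = new HashMap<>();
        config.put("env", "prod");

        Map<Map<String, Object>, String> outer = new HashMap<>();
        outer.put(Collections.unmodifiableMap(new HashMap<>(config)), "value-1");

        config.put("region", "eu"); // the snapshot is a separate copy and is unaffected

        Map<String, Object> probe = new HashMap<>();
        probe.put("env", "prod");
        return outer.get(probe);
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterMutation());
        System.out.println(lookupWithSnapshot());
    }
}
```

In the first method the entry is still in the outer map but can no longer be reached through its key; the snapshot variant avoids that as long as the contained keys and values are themselves immutable.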
Consider posting a question exploring the problem that led you to conclude a Map<Map<K, V>, Map<K, V>> structure was what you needed. You may get better answers to that question.
I want to write my own Map in Java. I know how a map works, but I don't really know where to keep the keys and values. Can I keep them in a List, for example? So the keys would be stored in one list and the values would be stored in another list?
It would be best to study the concepts behind existing implementations such as HashMap and TreeMap.
Once you understand those concepts, you're far better prepared for writing your own map when it comes to speed.
In other words: unless you know the concepts of all available implementations, it is very unlikely your wheel-re-invention will be a better solution.
Also be sure to test your implementations very thoroughly, as collections are the backbone and heart of any good application.
Two very very simple (but slow) solutions are these:
1) As suggested above, you can use an ArrayList<Pair> and add your custom getItemByKey() (in Java commonly named 'get') method.
2) You can use two arrays, both keeping the same size, and keeping keys and values matched by their respective indices.
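A rough sketch of the second idea, using two parallel lists instead of raw arrays so that growth is handled for us. The class name ListMap and its method set are assumptions made for illustration; every lookup is a linear scan:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** A deliberately simple map backed by two parallel lists; lookups are O(n). */
class ListMap<K, V> {
    private final List<K> keys = new ArrayList<>();
    private final List<V> values = new ArrayList<>();

    /** Stores the value, replacing and returning any previous value for the key. */
    V put(K key, V value) {
        int i = indexOf(key);
        if (i >= 0) {
            return values.set(i, value);
        }
        keys.add(key);
        values.add(value);
        return null;
    }

    /** Returns the value for the key, or null if the key is absent. */
    V get(K key) {
        int i = indexOf(key);
        return i >= 0 ? values.get(i) : null;
    }

    /** Removes the key and its value, keeping both lists aligned by index. */
    V remove(K key) {
        int i = indexOf(key);
        if (i < 0) {
            return null;
        }
        keys.remove(i);
        return values.remove(i);
    }

    int size() {
        return keys.size();
    }

    private int indexOf(K key) {
        for (int i = 0; i < keys.size(); i++) {
            if (Objects.equals(keys.get(i), key)) {
                return i;
            }
        }
        return -1;
    }
}
```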
For the underlying data structure, there is usually nothing better than an array of entries (key/value pairs), because the main goal of a map is to map keys to values.
Arrays give fast, constant-time O(1) access by index, but there is one small problem: when the array is full, you have to create a new, larger array and copy the old entries over.
Note: HashMap works in the same way.
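As a sketch of the array-of-entries idea, here is a minimal separate-chaining hash map that doubles its array when it gets too full. The class name, the initial capacity of 8, and the 3/4 threshold are illustrative choices, and java.util.HashMap is considerably more refined than this:

```java
import java.util.Objects;

/** Minimal separate-chaining hash map; a sketch, not a replacement for java.util.HashMap. */
class SimpleHashMap<K, V> {
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Node<K, V>[] buckets = newTable(8);
    private int size;

    @SuppressWarnings("unchecked")
    private static <K, V> Node<K, V>[] newTable(int capacity) {
        return (Node<K, V>[]) new Node[capacity];
    }

    private static int indexFor(Object key, int capacity) {
        // Math.floorMod keeps the index non-negative for negative hash codes.
        return Math.floorMod(Objects.hashCode(key), capacity);
    }

    V get(K key) {
        for (Node<K, V> n = buckets[indexFor(key, buckets.length)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }

    V put(K key, V value) {
        int i = indexFor(key, buckets.length);
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        buckets[i] = new Node<>(key, value, buckets[i]);
        if (++size > buckets.length * 3 / 4) {
            resize();
        }
        return null;
    }

    int size() {
        return size;
    }

    int capacity() {
        return buckets.length;
    }

    /** Allocates an array twice as large and re-links every existing node into it. */
    private void resize() {
        Node<K, V>[] old = buckets;
        buckets = newTable(old.length * 2);
        for (Node<K, V> head : old) {
            for (Node<K, V> n = head; n != null; ) {
                Node<K, V> next = n.next;
                int i = indexFor(n.key, buckets.length);
                n.next = buckets[i];
                buckets[i] = n;
                n = next;
            }
        }
    }
}
```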
Input : Let's say I have an object as Person. It has 2 properties namely
ssnNo - Social Security Number
name.
In one hand I have a List of Person objects (with unique ssnNo) and in the other hand I have a Map containing Person's ssnNo as the key and Person's name as the value.
Output : I need Person names using its ssnNo.
Questions :
Which approach to follow out of the 2 I have mentioned above i.e. using list or map? (I think the obvious answer would be the map).
If it is the map, is it always recommended to use map whether the data-set is large or small? I mean are there any performance issues that come with the map.
Map is the way to go. Maps perform very well, and their advantages over lists for lookups get bigger the bigger your data set gets.
Of course, there are some important performance considerations:
Make sure you have a good hashCode (and corresponding equals) implementation, so that your data will be evenly spread across the buckets of the Map.
Make sure you pre-size your Map when you allocate it (if at all possible). The map will automatically resize, but the resize operation essentially requires re-inserting each prior element into the new, bigger Map.
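Both points can be sketched as follows. The Ssn key class and the capacityFor helper are hypothetical, and the sizing formula assumes HashMap's default load factor of 0.75:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical key type, used only for illustration. */
final class Ssn {
    private final String digits;

    Ssn(String digits) {
        this.digits = Objects.requireNonNull(digits);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Ssn && digits.equals(((Ssn) o).digits);
    }

    @Override
    public int hashCode() {
        // Consistent with equals: equal keys always produce the same hash code.
        return digits.hashCode();
    }
}

class PresizedMapDemo {
    /** Initial capacity chosen so expectedSize entries fit under the default 0.75 load factor. */
    static int capacityFor(int expectedSize) {
        return (int) (expectedSize / 0.75f) + 1;
    }

    static Map<Ssn, String> build(int expectedSize) {
        // Pre-sizing avoids repeated resize-and-rehash passes while filling the map.
        Map<Ssn, String> names = new HashMap<>(capacityFor(expectedSize));
        for (int i = 0; i < expectedSize; i++) {
            names.put(new Ssn(String.format("%09d", i)), "person-" + i);
        }
        return names;
    }
}
```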
You're right, you should use a map in this case. A map has no performance disadvantage compared to a list here; for lookups, its performance is significantly better than a list's when the data is large. A hash-based map uses the key's hash code to locate entries, in a similar way to how arrays use indexes to retrieve values, which gives good performance.
This looks like a situation appropriate for a Map<Long, Person> that maps a social security number to the relevant Person. You might want to consider removing the ssnNo field from Person so as to avoid any redundancies (since you would be storing those values as keys in your map).
In general, Maps and Lists are very different structures, each suited for different circumstances. You would use the former whenever you want to maintain a set of key-value pairs that allows you to easily and quickly (i.e. in constant time) look up values based on the keys (this is what you want to do). You would use the latter when you simply want to store an ordered, linear collection of elements.
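To make the difference concrete, here is a small sketch that answers the same question from a list and from a map. The Person class shape, the sample data, and the use of String for the SSN (rather than Long, so leading zeros and dashes survive) are assumptions made for illustration:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical Person shape based on the two properties in the question.
class Person {
    final String ssnNo;
    final String name;

    Person(String ssnNo, String name) {
        this.ssnNo = ssnNo;
        this.name = name;
    }
}

class PersonLookup {
    /** Linear scan over the list: O(n) per lookup. */
    static String nameFromList(List<Person> people, String ssnNo) {
        for (Person p : people) {
            if (p.ssnNo.equals(ssnNo)) {
                return p.name;
            }
        }
        return null;
    }

    /** One pass to build the index; each later lookup is expected O(1). */
    static Map<String, Person> indexBySsn(List<Person> people) {
        Map<String, Person> bySsn = new HashMap<>();
        for (Person p : people) {
            bySsn.put(p.ssnNo, p);
        }
        return bySsn;
    }

    /** Made-up sample data. */
    static List<Person> sample() {
        List<Person> people = new ArrayList<>();
        people.add(new Person("111-11-1111", "Alice"));
        people.add(new Person("222-22-2222", "Bob"));
        return people;
    }
}
```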
I think it makes sense to have a Person object, but it also makes sense to use a Map over a List, since the look up time will be faster. I would probably use a Map with SSNs as keys and Person objects as values:
Map<SSN,Person> ssnToPersonMap;
It's all pointers. It actually makes no sense to have a Map<ssn,PersonName> instead of a Map<ssn,Person>. The latter is the best choice most of the time.
Using a map, especially one implemented with a hash table, will be faster than the list, since it lets you get the name in expected constant time, O(1). With the list you need a linear search, or maybe a binary search if the list is sorted, both of which are slower.
I'm making a Java application that is going to store a bunch of random words (which can be added to or deleted from the application at any time). I want fast lookups to see whether a given word is in the dictionary or not. What would be the best Java data structure to use for this? As of now, I was thinking about using a HashMap, and using the same word as both a value and the key for that value. Is this common practice? Using the same string for both the key and value in a (key, value) pair seems weird to me, so I wanted to make sure that there wasn't some better idea that I was overlooking.
I was also thinking about alternatively using a TreeMap to keep the words sorted, giving me an O(log n) lookup time, but the HashMap should give an expected O(1) lookup time as I understand it, so I figured that would be better.
So basically I just want to make sure the HashMap idea with the strings doubling as both key and value in each (key, value) pair would be a good decision. Thanks.
I want fast lookups to see whether a given word is in the dictionary or not. What would be the best java data structure to use for this?
This is the textbook usecase of a Set. You can use a HashSet. The naive implementation for Set<T> uses a corresponding Map<T, Object> to simply mark whether the entry exists or not.
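A minimal sketch of the Set-based approach; the WordDictionary wrapper class is hypothetical and only delegates to a HashSet:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical wrapper around a HashSet of words.
class WordDictionary {
    private final Set<String> words = new HashSet<>();

    /** Returns false if the word was already present. */
    boolean add(String word) {
        return words.add(word);
    }

    /** Returns false if the word was not present. */
    boolean remove(String word) {
        return words.remove(word);
    }

    /** Expected O(1) membership test. */
    boolean contains(String word) {
        return words.contains(word);
    }

    int size() {
        return words.size();
    }
}
```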
If you're storing it as a collection of words in a dictionary, I'd suggest taking a look at tries. They can require less memory than a Set when many words share prefixes, and lookups take worst-case O(string length) time.
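A rough trie sketch follows. Using a HashMap per node keeps the code short but is not the most memory-efficient layout, so treat this as an illustration of the lookup behaviour rather than a tuned implementation; the class and method names are made up:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal trie over the characters of each word.
class Trie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord;
    }

    private final Node root = new Node();

    /** Walks or creates one node per character, then marks the last node as a word. */
    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.isWord = true;
    }

    /** Takes at most word.length() steps, regardless of how many words are stored. */
    boolean contains(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.get(word.charAt(i));
            if (node == null) {
                return false;
            }
        }
        return node.isWord;
    }
}
```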
Any class that is a Set should serve your purpose. However, do note that a Set will not allow duplicates. For that matter, even a Map won't allow duplicate keys. I would suggest using an ArrayList (assuming synchronization is not needed) if you need to add duplicate entries and treat them as separate.
My only concern would be memory, if you use the HashSet and if you have a very large collection of words... Then you will have to load the entire collection in the memory... If that's not a problem.... (And your collection must be very large for this to be a problem)... Then the HashSet should be fine... If you indeed have a very large collection of words, then you can try to use a tree, and only load in memory the parts that you are interested in.
Also keep in mind the insertion cost: a HashSet does not keep its elements sorted and inserts in expected O(1) time (with an occasional resize), whereas a sorted tree such as TreeSet pays O(log n) per insertion. Nothing major, but worth weighing if you add a lot of words at a time.
I was surprised by the fact that Map<?,?> is not a Collection<?>.
I thought it'd make a LOT of sense if it was declared as such:
public interface Map<K,V> extends Collection<Map.Entry<K,V>>
After all, a Map<K,V> is a collection of Map.Entry<K,V>, isn't it?
So is there a good reason why it's not implemented as such?
Thanks to Cletus for a most authoritative answer, but I'm still wondering why, if you can already view a Map<K,V> as a Set<Map.Entry<K,V>> (via entrySet()), it doesn't just extend that interface instead.
If a Map is a Collection, what are the elements? The only reasonable answer is "Key-value pairs"
Exactly, interface Map<K,V> extends Set<Map.Entry<K,V>> would be great!
but this provides a very limited (and not particularly useful) Map abstraction.
But if that's the case then why is entrySet specified by the interface? It must be useful somehow (and I think it's easy to argue for that position!).
You can't ask what value a given key maps to, nor can you delete the entry for a given key without knowing what value it maps to.
I'm not saying that that's all there is to it to Map! It can and should keep all the other methods (except entrySet, which is redundant now)!
From the Java Collections API Design FAQ:
Why doesn't Map extend Collection?
This was by design. We feel that mappings are not collections and collections are not mappings. Thus, it makes little sense for Map to extend the Collection interface (or vice versa).
If a Map is a Collection, what are the elements? The only reasonable answer is "Key-value pairs", but this provides a very limited (and not particularly useful) Map abstraction. You can't ask what value a given key maps to, nor can you delete the entry for a given key without knowing what value it maps to.
Collection could be made to extend Map, but this raises the question: what are the keys? There's no really satisfactory answer, and forcing one leads to an unnatural interface.
Maps can be viewed as Collections (of keys, values, or pairs), and this fact is reflected in the three "Collection view operations" on Maps (keySet, entrySet, and values). While it is, in principle, possible to view a List as a Map mapping indices to elements, this has the nasty property that deleting an element from the List changes the Key associated with every element before the deleted element. That's why we don't have a map view operation on Lists.
Update: I think the quote answers most of the questions. It's worth stressing the part about a collection of entries not being a particularly useful abstraction. For example:
Set<Map.Entry<String,String>>
would allow:
set.add(entry("hello", "world"));
set.add(entry("hello", "world 2"));
(assuming an entry() method that creates a Map.Entry instance)
Maps require unique keys so this would violate this. Or if you impose unique keys on a Set of entries, it's not really a Set in the general sense. It's a Set with further restrictions.
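A runnable version of that comparison, as a sketch: the entry() helper is the assumed method mentioned above, implemented here with the JDK's AbstractMap.SimpleEntry, and the class name is made up.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class contrasting a Set of entries with a Map.
class EntrySetVsMap {
    static Map.Entry<String, String> entry(String key, String value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    /** A plain Set of entries keeps both: the entries differ in value, so they are not equal. */
    static int setSizeAfterTwoAdds() {
        Set<Map.Entry<String, String>> set = new HashSet<>();
        set.add(entry("hello", "world"));
        set.add(entry("hello", "world 2"));
        return set.size();
    }

    /** A Map keeps one mapping per key: the second put replaces the first value. */
    static int mapSizeAfterTwoPuts() {
        Map<String, String> map = new HashMap<>();
        map.put("hello", "world");
        map.put("hello", "world 2");
        return map.size();
    }
}
```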
Arguably you could say the equals()/hashCode() relationship for Map.Entry was purely on the key but even that has issues. More importantly, does it really add any value? You may find this abstraction breaks down once you start looking at the corner cases.
It's worth noting that the HashSet is actually implemented as a HashMap, not the other way around. This is purely an implementation detail but is interesting nonetheless.
The main reason for entrySet() to exist is to simplify traversal so you don't have to traverse the keys and then do a lookup of the key. Don't take it as prima facie evidence that a Map should be a Set of entries (imho).
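The two traversal styles look roughly like this; the class, method names, and sample data are made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class comparing two ways of traversing a map.
class TraversalDemo {
    /** Traverses the keys, then performs one extra lookup per key. */
    static int totalViaKeySet(Map<String, Integer> counts) {
        int total = 0;
        for (String key : counts.keySet()) {
            total += counts.get(key);
        }
        return total;
    }

    /** Traverses the entries directly; key and value arrive together, no second lookup. */
    static int totalViaEntrySet(Map<String, Integer> counts) {
        int total = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            total += e.getValue();
        }
        return total;
    }

    /** Made-up sample data. */
    static Map<String, Integer> sample() {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("a", 1);
        counts.put("b", 2);
        counts.put("c", 3);
        return counts;
    }
}
```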
While you've gotten a number of answers that cover your question fairly directly, I think it might be useful to step back a bit, and look at the question a bit more generally. That is, not to look specifically at how the Java library happens to be written, and look at why it's written that way.
The problem here is that inheritance only models one type of commonality. If you pick out two things that both seem "collection-like", you can probably pick out 8 or 10 things they have in common. If you pick out a different pair of "collection-like" things, they'll also have 8 or 10 things in common -- but they won't be the same 8 or 10 things as the first pair.
If you look at a dozen or so different "collection-like" things, virtually every one of them will probably have something like 8 or 10 characteristics in common with at least one other one -- but if you look at what's shared across every one of them, you're left with practically nothing.
This is a situation that inheritance (especially single inheritance) just doesn't model well. There's no clean dividing line between which of those are really collections and which aren't -- but if you want to define a meaningful Collection class, you're stuck with leaving some of them out. If you leave only a few of them out, your Collection class will only be able to provide quite a sparse interface. If you leave more out, you'll be able to give it a richer interface.
Some also take the option of basically saying: "this type of collection supports operation X, but you're not allowed to use it", by deriving from a base class that defines X, where attempting to use the derived class's X fails (e.g., by throwing an exception).
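Java's own collections take this option in places: an unmodifiable list still exposes add(), but calling it throws UnsupportedOperationException. A small sketch, with a made-up class name:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo of an operation that is declared but not supported.
class OptionalOperationDemo {
    /** Returns true if add() on an unmodifiable view is rejected at runtime. */
    static boolean addIsRejected() {
        List<String> readOnly = Collections.unmodifiableList(new ArrayList<String>());
        try {
            readOnly.add("x"); // compiles, because List declares add()
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }
}
```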
That still leaves one problem: almost regardless of which you leave out and which you put in, you're going to have to draw a hard line between what classes are in and what are out. No matter where you draw that line, you're going to be left with a clear, rather artificial, division between some things that are quite similar.
I guess the why is subjective.
In C#, I think Dictionary extends or at least implements a collection:
public class Dictionary<TKey, TValue> : IDictionary<TKey, TValue>,
ICollection<KeyValuePair<TKey, TValue>>, IEnumerable<KeyValuePair<TKey, TValue>>,
IDictionary, ICollection, IEnumerable, ISerializable, IDeserializationCallback
In Pharo Smalltalk as well:
Collection subclass: #Set
Set subclass: #Dictionary
But there is an asymmetry with some methods. For instance, collect: takes associations (the equivalent of an entry), while do: takes the values. They provide another method, keysAndValuesDo:, to iterate over the dictionary by entry. add: takes an association, but remove: has been "suppressed":
remove: anObject
self shouldNotImplement
So it's definitely doable, but leads to some other issues regarding the class hierarchy.
What is better is subjective.
Cletus's answer is good, but I want to add a semantic argument. Combining both makes no sense: think of the case where you add a key-value pair via the collection interface and the key already exists. The Map interface allows only one value associated with the key. But if you automatically remove the existing entry with the same key, the collection has the same size after the add as before, which is very unexpected for a collection.
Java collections are broken. There is a missing interface, that of Relation. Hence, Map extends Relation extends Set. Relations (also called multi-maps) have unique name-value pairs. Maps (aka "Functions"), have unique names (or keys) which of course map to values. Sequences extend Maps (where each key is an integer > 0). Bags (or multi-sets) extend Maps (where each key is an element and each value is the number of times the element appears in the bag).
This structure would allow intersection, union etc. of a range of "collections". Hence, the hierarchy should be:
       Set
        |
     Relation
        |
       Map
      /   \
    Bag   Sequence
Sun/Oracle/Java ppl - please get it right next time. Thanks.
Map<K,V> should not extend Set<Map.Entry<K,V>> since:
You can't add different Map.Entrys with the same key to the same Map, but
You can add different Map.Entrys with the same key to the same Set<Map.Entry>.
If you look at the respective data structures, you can easily guess why Map is not part of Collection. Each Collection stores single values, whereas a Map stores key-value pairs. So the methods in the Collection interface are incompatible with the Map interface. For example, Collection has add(Object o). What would such a method do in a Map? It doesn't make sense to have it there. Instead, Map has a put(key, value) method.
Same argument goes for addAll(), remove(), and removeAll() methods. So the main reason is the difference in the way data is stored in Map and Collection.
Also, recall that the Collection interface extends the Iterable interface, i.e. its iterator() method must return an iterator that lets us iterate over the values stored in the Collection. Now what would such a method return for a Map? A key iterator or a value iterator? This does not make sense either.
There are ways to iterate over the keys and values stored in a Map, and that is how it is part of the Collections framework.
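Those ways are the keySet(), values(), and entrySet() views. As a sketch (class and method names are made up), the views are ordinary Collections that stay backed by the map, so a removal through a view also removes the mapping:

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical demo of the collection views a Map exposes.
class MapViewsDemo {
    /** Removes a key through the keySet() view and reports the map's resulting size. */
    static int sizeAfterRemovingKeyThroughView() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);

        Set<String> keys = map.keySet();
        keys.remove("a"); // the view is backed by the map, so the mapping is removed too
        return map.size();
    }

    /** Iterates the values() view, which is a Collection rather than a Set. */
    static int sumOfValues() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);

        Collection<Integer> values = map.values();
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }
}
```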
Exactly, interface Map<K,V> extends Set<Map.Entry<K,V>> would be great!
Actually, if it were implements Map<K,V>, Set<Map.Entry<K,V>>, then I tend to agree. It even seems natural. But that doesn't work very well, right? Let's say we have HashMap implements Map<K,V>, Set<Map.Entry<K,V>>, LinkedHashMap implements Map<K,V>, Set<Map.Entry<K,V>>, etc. That is all good, but with entrySet() in the Map interface, nobody can forget to implement that method, and you can be sure that you can get an entry set for any Map, whereas you can't be if you are hoping that the implementor has implemented both interfaces.
The reason I don't want to have interface Map<K,V> extends Set<Map.Entry<K,V>> is simply that there would be more methods. And after all, they are different things, right? Also, very practically, if I type map. in an IDE, I don't want to see both .remove(Object obj) and .remove(Map.Entry<K,V> entry), because then I can't just hit ctrl+space, r, return and be done with it.
Straight and simple.
Collection is an interface whose elements are single objects, whereas Map requires two objects per element (a key and a value).
Collection(Object o);
Map<Object,Object>