Why is Set at all required when we have Maps? - java

Sets are essentially Maps from an existential point of view. There is nothing a Set can do that a Map cannot, I assume. Maps carry the overhead of defining key-value pairs, which Sets do not have. But then again, the elements of a Set are just the keys of the underlying Map, right? So what is the point of having Sets around when Maps can do everything required? I assume a Set takes the same amount of memory as a Map does?
What are key arguments in favor of existence of Sets?
For instance, in the case of Lists, we have ArrayList and LinkedList which have differences and we can choose between these two as per our requirements.

I would argue that a Map is actually a Set!
Map<Key,Value> can be implemented with Set<Entry<Key,Value>>
This is similar to the mathematical foundations of what sets, maps, and functions are.
Firstly, can we agree that a Map is a function from Key=>Value (or Domain=>Range)? Each key corresponds to at most one value, so it is a partial function (or a total function over just those keys that are in the map). So a map is a function. (Scala even goes so far as to have Map implement the Function1 interface.)
Secondly, what is a function? A function is a set of tuples where each first element occurs only once in the set. The second element of the tuple is the value returned by the function.
So we have Map is a Function is a Set.
On a practical note, there are very good reasons for having Sets. They are very often the correct data structure to use from a conceptual point of view, even before you start worrying about performance. I'd use them over a List in most situations.

The primary difference between a Set and a Map is that a Map holds two objects per entry, a key and a value. It may contain duplicate values, but its keys are always unique. A Set holds only keys, and those are unique.
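To make the practical argument concrete, here is a minimal sketch that counts distinct words once with a Set and once with a Map. The class and method names are made up for illustration. The Map version works, but it has to carry a placeholder value that means nothing; `Collections.newSetFromMap` is the JDK's own adapter for viewing a Map's keys as a Set.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class SetVersusMapDemo {

    // With a Set, membership is the only thing stored.
    static int countDistinctWithSet(String... words) {
        Set<String> seen = new HashSet<>();
        for (String word : words) {
            seen.add(word); // returns false and changes nothing for a duplicate
        }
        return seen.size();
    }

    // With a Map, every key needs a placeholder value that carries no meaning.
    static int countDistinctWithMap(String... words) {
        Map<String, Boolean> seen = new HashMap<>();
        for (String word : words) {
            seen.put(word, Boolean.TRUE);
        }
        return seen.size();
    }

    // The JDK can wrap an empty Map so that its keys behave as a Set.
    static Set<String> setBackedByMap() {
        return Collections.newSetFromMap(new HashMap<String, Boolean>());
    }

    public static void main(String[] args) {
        System.out.println(countDistinctWithSet("a", "b", "a")); // 2
        System.out.println(countDistinctWithMap("a", "b", "a")); // 2
        Set<String> wrapped = setBackedByMap();
        wrapped.add("a");
        wrapped.add("a");
        System.out.println(wrapped.size()); // 1
    }
}
```

Both approaches give the same count; the Set version simply states the intent (uniqueness) without the dummy value.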

Related

How does HashTable Order Values?

I want to know how Hashtable orders its values after the put method is called.
For example:
a = Normal, b = 2 weeks, c = Next, d = Save and Finish, e = Go to Cases
hashtable.put("a","Normal"); ...
The order of the values comes out different from the order in which we put them.
I think the order will be like this:
b = 2 weeks, a = Normal, e = Go to Cases, c = Next, d = Save and Finish
Please suggest data structures that solve the problem.
Thanks.
As very often in these cases, the answer is in the documentation:
This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.
Similar to HashMap, Hashtable also does not guarantee insertion order for the elements.
Reason
Hashtable is optimized for fast lookup. This is achieved by calculating a hash of each key stored. As a result, looking up a value by its key takes O(1) time on average, i.e. roughly the same time irrespective of the number of entries in the Hashtable.
Thus the entries are stored based on the hash generated for the key. This is the reason why Hashtable does not guarantee that elements are kept in the order in which they were inserted.
A hash value (or simply hash), also called a message digest, is a number generated from a string of text. The hash is substantially smaller than the text itself, and is generated by a formula in such a way that it is extremely unlikely that some other text will produce the same hash value.
http://www.webopedia.com/TERM/H/hashing.html
http://interactivepython.org/runestone/static/pythonds/SortSearch/Hashing.html
As previously explained, Hashtable iteration order is arbitrary. If you want to preserve insertion order, use LinkedHashMap. If you want natural ordering or a custom one, use TreeMap. By natural ordering I mean the order of the keys: types such as String, Integer and Long implement the Comparable interface, so they are sorted automatically, as is any other class that implements Comparable. A custom order can also be supplied through a Comparator when creating the TreeMap.
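The three options above can be sketched as follows, reusing the keys and values from the question. The class and method names are illustrative only; the keys are inserted in the order c, a, e, b, d so that insertion order and sorted order are visibly different.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class MapOrderDemo {

    // Inserts the same five entries, deliberately not in alphabetical key order.
    private static void fill(Map<String, String> map) {
        map.put("c", "Next");
        map.put("a", "Normal");
        map.put("e", "Go to Cases");
        map.put("b", "2 weeks");
        map.put("d", "Save and Finish");
    }

    // LinkedHashMap iterates in the order the keys were first inserted.
    static List<String> keysInInsertionOrder() {
        Map<String, String> map = new LinkedHashMap<>();
        fill(map);
        return new ArrayList<>(map.keySet());
    }

    // TreeMap without a Comparator uses the keys' natural (Comparable) order.
    static List<String> keysInNaturalOrder() {
        Map<String, String> map = new TreeMap<>();
        fill(map);
        return new ArrayList<>(map.keySet());
    }

    // TreeMap with a Comparator uses whatever order the Comparator defines.
    static List<String> keysInReverseOrder() {
        Map<String, String> map = new TreeMap<>(Comparator.<String>reverseOrder());
        fill(map);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysInInsertionOrder()); // [c, a, e, b, d]
        System.out.println(keysInNaturalOrder());   // [a, b, c, d, e]
        System.out.println(keysInReverseOrder());   // [e, d, c, b, a]
    }
}
```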

Java or guava map implementation to use with multiple keys pointing to single value

I have a situation where many many keys are pointing to a single value. The situation arises from a service locator pattern that I am implementing such that -
each method in an interface is represented as a signature string
All such signatures of a single interface are used as keys
The value being the full canonical name of the implementation class
Thus my need is to retrieve a single value when user requests any of the matching keys.
In a sense I need the opposite of Multimap from Guava.
I am looking for the most optimized solution there is since my keys are very similar though unique for a specific value and I am not sure if using a generic Map implementation like HashMap is efficient enough to handle this case.
e.g. all the below signatures
==============
_org.appops.server.core.service.mocks.MockTestService_testOperationThree
_org.appops.server.core.service.mocks.MockTestService_getService
_org.appops.server.core.service.mocks.MockTestService_start
_org.appops.server.core.service.mocks.MockTestService_testOperationTwo_String_int
_org.appops.server.core.service.mocks.MockTestService_getName
_org.appops.server.core.service.mocks.MockTestService_shutdown
_org.appops.server.core.service.mocks.MockTestService_testOperationOne_String
=======
point to a single class, i.e. org.appops.server.core.service.mocks.MockTestServiceImpl, and I am anticipating hundreds of such classes (values) and thousands of such similar signatures (keys).
In case there is no optimized way I could always use a HashMap with replicated values for each group of keys which I would like to avoid.
Ideally I would like to use a ready utility from Guava.
HashMap is actually what you need, and the issue is that you misunderstand what it does.
"In case there is no optimized way I could always use a HashMap with replicated values for each group of keys which I would like to avoid."
HashMap does not store a copy of the value for each key mapping to that value. HashMap stores a reference to the Java object. It's always the same cost. A HashMap<Integer, BigExpensiveObject> where every key is mapped to the same BigExpensiveObject takes exactly the same amount of memory as a HashMap<Integer, Integer> where every key is mapped to the same Integer. The only memory difference in the whole program would be the memory difference between one BigExpensiveObject and one Integer.
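A small sketch of that point, using three of the signatures from the question. The class and method names are hypothetical. The helper counts value objects by identity (`==`) rather than by `equals()`, which shows that every key refers to the very same object.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

final class SharedValueDemo {

    // Maps every signature to the same implementation name.
    static Map<String, String> buildLocator(List<String> signatures, String implementationName) {
        Map<String, String> locator = new HashMap<>();
        for (String signature : signatures) {
            locator.put(signature, implementationName); // stores a reference, not a copy
        }
        return locator;
    }

    // Counts distinct value objects by identity (==), not by equals().
    static int distinctValueObjects(Map<String, String> map) {
        Map<String, Boolean> identities = new IdentityHashMap<>();
        for (String value : map.values()) {
            identities.put(value, Boolean.TRUE);
        }
        return identities.size();
    }

    public static void main(String[] args) {
        List<String> signatures = Arrays.asList(
                "_org.appops.server.core.service.mocks.MockTestService_getService",
                "_org.appops.server.core.service.mocks.MockTestService_start",
                "_org.appops.server.core.service.mocks.MockTestService_shutdown");
        Map<String, String> locator = buildLocator(
                signatures, "org.appops.server.core.service.mocks.MockTestServiceImpl");
        System.out.println(locator.size());                // 3 keys
        System.out.println(distinctValueObjects(locator)); // 1 shared value object
    }
}
```

The per-key cost is the map entry and the key itself; the value is held once no matter how many keys point to it.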

Is the order of HashMap elements reproducible?

First of all, I want to make it clear that I would never use a HashMap to do things that require some kind of order in the data structure and that this question is motivated by my curiosity about the inner details of Java HashMap implementation.
You can read in the java documentation on Object about the Object method hashCode.
I understand from there that the hashCode implementation for classes such as String and the primitive wrapper types (Integer, Long, ...) is predictable once the value contained by the object is given. For example, a call to hashCode on any String object containing the value hello should always return 99162322.
Suppose an algorithm always inserts the same values, in the same order, into an empty Java HashMap that uses Strings as keys. Then the order of its elements at the end should always be the same, or am I wrong?
Since the hash code for a concrete value is always the same, if there are no collisions the order should be the same.
On the other hand, if there are collisions, I think (I don't know the facts) that the collision resolution should result in the same order for exactly the same input elements.
So, isn't it right that two HashMap objects with the same elements, inserted in the same order should be traversed (by an iterator) giving the same elements sequence?
As far as I know, the order of the elements in a HashMap (assuming we call "order" the order in which the values() iterator returns them) is kept until the map is rehashed. We can influence the probability of that event by providing a capacity and/or loadFactor to the constructor.
Nevertheless, we should never rely on this, because the internal implementation of HashMap is not part of its public contract and is subject to change in the future.
I think you are asking "Is HashMap non-deterministic?". The answer is "probably not" (look at the source code of your favourite implementation to find out).
However, bear in mind that because the Java standard does not guarantee a particular order, the implementation is free to change at any time (e.g. in newer JRE versions), giving a different (yet deterministic) result.
Whether or not that is true is entirely dependent upon the implementation. What's more important is that it isn't guaranteed. If order is important to you, there are options. You could create your own implementation of Map that does preserve order, you can use a SortedMap or a LinkedHashMap, or you can use something like the Apache commons-collections OrderedMap: http://commons.apache.org/proper/commons-collections/javadocs/api-release/org/apache/commons/collections4/OrderedMap.html.
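A short experiment along the lines of the question, as a sketch only: the class and method names are invented, and the second check reports what a given JVM happens to do, not something the HashMap contract promises. The first check is safe to rely on, because the String.hashCode formula is part of the documented API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class HashOrderDemo {

    // Builds a fresh HashMap from the keys, in the given order, and records
    // the order in which the map hands the keys back.
    static List<String> iterationOrder(String... keys) {
        Map<String, Integer> map = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], i);
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // Specified by the String.hashCode documentation, so always the same.
        System.out.println("hello".hashCode()); // 99162322

        // Same keys, same insertion order, same JVM: compare two independent runs.
        List<String> first = iterationOrder("delta", "alpha", "charlie", "bravo");
        List<String> second = iterationOrder("delta", "alpha", "charlie", "bravo");
        System.out.println(first.equals(second));
    }
}
```

Even where the two runs agree, a different JRE version is free to produce a different sequence, which is why LinkedHashMap or a SortedMap is the right tool when order matters.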

Map with two-dimensional key in java

I want a map indexed by two keys (a map in which you put AND retrieve values using two keys) in Java. Just to be clear, I'm looking for the following behavior:
map.put(key1, key2, value);
map.get(key1, key2); // returns value
map.get(key2, key1); // returns null
map.get(key1, key1); // returns null
What's the best way to do it? More specifically, should I use:
Map<K1,Map<K2,V>>
Map<Pair<K1,K2>, V>
Other?
(where K1,K2,V are the types of first key, second key and value respectively)
You should use Map<Pair<K1,K2>, V>
It will only contain one map, instead of N+1 maps
Key construction will be obvious (creation of the Pair)
Nobody will get confused as to the meaning of the Map as its programmer facing API won't have changed.
Dwell time in the data structure would be shorter, which is good if you find you need to synchronize it later.
If you're willing to bring in a new library (which I recommend), take a look at Table in Guava. This essentially does exactly what you're looking for, also possibly adding some functionality where you may want all of the entries that match one of your two keys.
interface Table<R,C,V>
A collection that associates an ordered pair of keys, called a row key and a column key, with a single value. A table may be sparse, with only a small fraction of row key / column key pairs possessing a corresponding value.
I'd recommend going for the second option
Map<Pair<K1,K2>,V>
The first one will generate more overhead when retrieving data, and even more when inserting into or removing from the Map. Every time you put a new value V, you'll need to check whether the Map for K1 exists, create it and put it inside the main Map if it does not, and then put the value under K2.
If you want the interface you showed initially, wrap your Map<Pair<K1,K2>,V> in your own "DoubleKeyMap".
(And don't forget to properly implement the hashCode and equals methods in the Pair class!!)
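A minimal sketch of such a wrapper, assuming a hand-written Pair class (neither Pair nor DoubleKeyMap exists in the JDK; both are written here for illustration). The lookups mirror the three calls from the question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical wrapper exposing the two-key API from the question.
final class DoubleKeyMap<K1, K2, V> {
    private final Map<Pair<K1, K2>, V> entries = new HashMap<>();

    V put(K1 key1, K2 key2, V value) {
        return entries.put(new Pair<>(key1, key2), value);
    }

    // Accepts Object, like Map.get, so keys can be probed in any order.
    V get(Object key1, Object key2) {
        return entries.get(new Pair<>(key1, key2));
    }

    public static void main(String[] args) {
        DoubleKeyMap<String, String, Integer> map = new DoubleKeyMap<>();
        map.put("key1", "key2", 42);
        System.out.println(map.get("key1", "key2")); // 42
        System.out.println(map.get("key2", "key1")); // null
        System.out.println(map.get("key1", "key1")); // null
    }
}

// Immutable ordered pair; equals and hashCode depend on both elements and their order.
final class Pair<A, B> {
    private final A first;
    private final B second;

    Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Pair)) {
            return false;
        }
        Pair<?, ?> that = (Pair<?, ?>) other;
        return Objects.equals(first, that.first) && Objects.equals(second, that.second);
    }

    @Override
    public int hashCode() {
        return Objects.hash(first, second);
    }
}
```

Keeping the pair immutable matters: a key whose hash code changes after insertion can no longer be found in a HashMap.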
While I also am on board with what you proposed (a pair of values to use as the key), you could also consider making a wrapper which can hold/match both keys. This might get somewhat confusing since you would need to override the equals and hashCode methods and make that work, but it could be a straightforward way of indicating to the next person using your code that the key must be of a special type.
Searching a little bit, I found this post which may be of use to you. In particular, out of the Apache Commons Collection, MultiKeyMap. I've never used this before, but it looks like a decent solution and may be worth exploring.
I would opt for the Map<Pair<K1,K2>, V> solution, because:
it directly expresses what you want to do
is potentially faster because it uses fewer indirections
simplifies the client code (the code that uses the Map afterwards)
Logically, your Pair (key1, key2) corresponds to something, since it is the key of your map. Therefore you may consider writing your own class having K1 and K2 as parameters and overriding the hashCode() method together with equals(), which must stay consistent with it (plus maybe other methods for more convenience).
This clearly appears to be a "clean" way to solve your problem.
I have used array for the key: like this
Map<Array[K1,K2], V>
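One caveat with array keys, shown as a small sketch with made-up class and method names: Java arrays inherit equals and hashCode from Object, so they compare by identity. A second array with the same contents will not find the entry, whereas a List key compares by contents.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ArrayKeyDemo {

    // Looks up with a different array object holding the same two keys.
    static String lookupWithArrayKey() {
        Map<String[], String> map = new HashMap<>();
        map.put(new String[] {"k1", "k2"}, "value");
        return map.get(new String[] {"k1", "k2"}); // identity-based equals: no match
    }

    // Looks up with a different List object holding the same two keys.
    static String lookupWithListKey() {
        Map<List<String>, String> map = new HashMap<>();
        map.put(Arrays.asList("k1", "k2"), "value");
        return map.get(Arrays.asList("k1", "k2")); // content-based equals: match
    }

    public static void main(String[] args) {
        System.out.println(lookupWithArrayKey()); // null
        System.out.println(lookupWithListKey());  // value
    }
}
```

An array key therefore only works if the caller keeps and reuses the exact same array instance for lookups.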

What is the difference between Lists, ArrayLists, Maps, Hashmaps, Collections etc..?

I've been using HashMaps since I started programming again in Java, without really understanding these Collections things.
Honestly I am not really sure if using HashMaps all the way would be best for me or for production code. Up until now it didn't matter to me as long as I was able to get the data I need the way I called them in PHP (yes, I admit whatever negative thing you are thinking right now) where $this_is_array['this_is_a_string_index'] provides so much convenience to recall an array of variables.
So now, I have been working with java for more than 3 months and came across the Interfaces I specified above and wondered, why are there so many of these things (not to mention, vectors, abstractList {oh well the list goes on...})?
I mean how are they different from each other?
And more importantly, what is the best Interface to use in my case?
The API is pretty clear about the differences and/or relations between them:
Collection
The root interface in the collection hierarchy. A collection represents a group of objects, known as its elements. Some collections allow duplicate elements and others do not. Some are ordered and others unordered.
http://download.oracle.com/javase/6/docs/api/java/util/Collection.html
List
An ordered collection (also known as a sequence). The user of this interface has precise control over where in the list each element is inserted. The user can access elements by their integer index (position in the list), and search for elements in the list.
http://download.oracle.com/javase/6/docs/api/java/util/List.html
Set
A collection that contains no duplicate elements. More formally, sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element. As implied by its name, this interface models the mathematical set abstraction.
http://download.oracle.com/javase/6/docs/api/java/util/Set.html
Map
An object that maps keys to values. A map cannot contain duplicate keys; each key can map to at most one value.
http://download.oracle.com/javase/6/docs/api/java/util/Map.html
Is there anything in particular you find confusing about the above? If so, please edit your original question. Thanks.
A short summary of common java collections:
'Map': A 'Map' is a container that stores key=>value pairs. This enables fast searches using the key to get to its associated value. The two main implementations in the java.util package are 'HashMap' and 'TreeMap'. The former is implemented as a hash table, while the latter is implemented as a balanced binary search tree (and thus also keeps its keys sorted).
'Set': A 'Set' is a container that holds only unique elements. Inserting the same value multiple times will still result in the 'Set' holding only one instance of it. It also provides fast operations to search, remove, add, merge and compute the intersection of two sets. Like 'Map', it has two main implementations, 'HashSet' and 'TreeSet'.
'List': The 'List' interface is implemented by the 'Vector', 'ArrayList' and 'LinkedList' classes. A 'List' is basically a collection of elements that preserve their relative order. You can add/remove elements and access individual elements at any given position. Unlike a 'Map', 'List' items are indexed by an int that is their position in the 'List' (the first element being at position 0 and the last at 'List.size()'-1). 'Vector' and 'ArrayList' are implemented using an array while 'LinkedList', as the name implies, uses a linked list.
One thing to note is that, unlike PHP's associative arrays (which are more like a Map), an array in Java and many other languages actually represents a contiguous block of memory. The elements in an array are laid out side by side in adjacent "slots", so to speak. This gives very fast lookup and write times, much faster than associative arrays, which are implemented using more complex data structures. But arrays can't be indexed by anything other than the numeric positions within the array, unlike associative arrays.
To get a really good idea of what each collection is good for and their performance characteristics I would recommend getting a good idea about data structures like arrays, linked lists, binary search trees, hashtables, as well as stacks and queues. There is really no substitute to learning this if you want to be an effective programmer in any language.
You can also read the Java Collections trail to get you started.
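As a quick illustration of the three interfaces summarized above, here is a minimal sketch (the class and method names are invented for the example). The Map lookup by a String key is the closest Java equivalent of the PHP-style `$this_is_array['this_is_a_string_index']` access mentioned in the question.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CollectionBasicsDemo {

    // A List keeps every element, duplicates included, at a stable position.
    static List<String> asList(String... items) {
        List<String> list = new ArrayList<>();
        for (String item : items) {
            list.add(item);
        }
        return list;
    }

    // A Set silently ignores elements it already contains.
    static Set<String> asSet(String... items) {
        Set<String> set = new HashSet<>();
        for (String item : items) {
            set.add(item);
        }
        return set;
    }

    // A Map associates each key (here: the word) with a value (here: its length).
    static Map<String, Integer> lengthsByWord(String... words) {
        Map<String, Integer> lengths = new HashMap<>();
        for (String word : words) {
            lengths.put(word, word.length());
        }
        return lengths;
    }

    public static void main(String[] args) {
        System.out.println(asList("a", "b", "a").size());                // 3
        System.out.println(asList("a", "b", "a").get(2));                // a
        System.out.println(asSet("a", "b", "a").size());                 // 2
        System.out.println(lengthsByWord("apple", "fig").get("apple")); // 5
    }
}
```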
In Brief (and only looking at interfaces):
List - a list of values, something like a "resizable array"
Set - a container that does not allow duplicates
Map - a collection of key/value pairs
A Map vs a List.
In a Map, you have key/value pairs. To access a value you need to know the key. There is a relationship between the key and the value that persists and is not arbitrary; they are related somehow. Example: a person's DNA is unique (the key) and maps to the person's name (the value), or a person's SSN (the key) maps to the person's name (the value). There is a strong relationship.
In a List, all you have are values (a person's name), and to access one you need to know its position in the list (its index). But there is no permanent relationship between a value and its index; it is arbitrary.
■ List — An ordered collection of elements that allows duplicate entries
Concrete Classes:
ArrayList — Standard resizable list.
LinkedList — Can easily add/remove from beginning or end.
Vector — Older thread-safe version of ArrayList.
Stack — Older last-in, first-out class.
■ Set — Does not allow duplicates
Concrete Classes:
HashSet — Uses hashCode() to find unordered elements.
TreeSet — Sorted and navigable. Does not allow null values.
■ Queue — Orders elements for processing
Concrete Classes:
LinkedList — Can easily add/remove from beginning or end.
ArrayDeque — First-in, first-out or last-in, first-out. Does not allow null values.
■ Map — Maps unique keys to values
Concrete Classes:
HashMap — Uses hashCode() to find keys.
TreeMap — Sorted map. Does not allow null keys.
Hashtable — Older, synchronized counterpart of HashMap. Does not allow null keys or values.
That is a question that ultimately has a very complex answer--there are entire college classes dedicated to data structures. The short answer is that they all have trade-offs in memory usage and the speed of various operations.
What would be really healthy is some time with a nice book on data structures--I can almost guarantee that your code will improve significantly if you get a nice understanding of data structures.
That said, I can give you some quick, temporary advice from my experience with Java. For most simple internal things, ArrayList is generally preferred. For passing collections of data about, simple arrays are generally used. HashMap is only really used for cases when there is some logical reason to have special keys corresponding to values--I haven't seen anyone use them as a general data structure for everything. Other structures are more complicated and tend to be used in special cases.
As you already know, they are containers for objects. Reading their respective APIs will help you understand their differences.
Since others have described what are their differences about their usage, I will point you to this link which describes complexity of various data structures.
This list is programming language agnostic, and, as always, real world implementations will vary.
It is useful to understand complexity of various operations for each of these structures, since in the real world, it will matter if you're constantly searching for an object in your 1,000,000 element linked list that's not sorted. Performance will not be optimal.
List Vs Set Vs Map
1) Duplicates: List allows duplicate elements. Any number of duplicate elements can be inserted into the list without affecting the existing values or their indexes.
Set doesn’t allow duplicates. Set and all of the classes which implement the Set interface hold only unique elements.
Map stores its elements as key-value pairs. Map doesn’t allow duplicate keys, but it does allow duplicate values.
2) Null values: List allows any number of null values.
Set allows at most a single null value.
Map can have at most a single null key and any number of null values.
3) Order: List and all of its implementation classes maintain insertion order.
Set doesn’t maintain any order in general; still, a few of its classes keep the elements in an order, e.g. LinkedHashSet maintains the elements in insertion order.
Similar to Set, Map also doesn’t store its elements in any order in general, but a few of its classes do. E.g. TreeMap sorts the map in ascending order of keys, and LinkedHashMap keeps the elements in insertion order, i.e. the order in which they were added to the LinkedHashMap.
Difference between Set, List and Map in Java -
Set, List and Map are three important interfaces of the Java collection framework, and the difference between Set, List and Map is one of the most frequently asked Java Collection interview questions. Sometimes this question is asked as "when to use List, Set and Map in Java". Clearly, the interviewer wants to know whether you are familiar with the fundamentals of the Java collection framework. In order to decide when to use List, Set or Map, you need to know what these interfaces are and what functionality they provide. List provides an ordered and indexed collection which may contain duplicates. Set provides an unordered collection of unique objects, i.e. Set doesn't allow duplicates, while Map provides a data structure based on key-value pairs, often implemented with hashing. All three are interfaces, and many concrete implementations of them are available in the Collection API. ArrayList and LinkedList are the two most popular List implementations, while LinkedHashSet, TreeSet and HashSet are frequently used Set implementations. In this Java article we will see the difference between Map, Set and List in Java and learn when to use List, Set or Map.
Set vs List vs Map in Java
As I said, Set, List and Map are interfaces, which define the core contract, e.g. the Set contract says that it cannot contain duplicates. Based upon our knowledge of List, Set and Map, let's compare them on different metrics.
Duplicate Objects
The main difference between the List and Set interfaces in Java is that List allows duplicates while Set doesn't. All implementations of Set honor this contract. Map holds two objects per entry, a key and a value; it may contain duplicate values, but keys are always unique.
Order
Another key difference between List and Set is that List is an ordered collection: List's contract maintains the insertion order of elements. Set is an unordered collection, so you get no guarantee about the order in which elements will be stored, though some Set implementations, e.g. LinkedHashSet, do maintain order. Also, SortedSet and SortedMap implementations, e.g. TreeSet and TreeMap, maintain a sorting order imposed by a Comparator or by Comparable.
Null elements
List allows null elements, and you can have many null objects in a List because it also allows duplicates. Set allows just one null element, as no duplicates are permitted, while in a Map you can have null values and at most one null key. It is worth noting that Hashtable doesn't allow null keys or values, whereas HashMap allows null values and one null key. This is also one of the main differences between these two popular implementations of the Map interface, i.e. HashMap vs Hashtable.
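The null-handling rules above can be probed directly. This is a sketch with invented class and method names; each helper simply tries a put and reports whether the map threw a NullPointerException. The TreeMap result assumes a map with natural ordering (no Comparator) on a current JDK.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.TreeMap;

final class NullHandlingDemo {

    // Returns true if the map accepts a null key, false if it throws.
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    // Returns true if the map accepts a null value, false if it throws.
    static boolean acceptsNullValue(Map<String, String> map) {
        try {
            map.put("key", null);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsNullKey(new HashMap<>()));     // true
        System.out.println(acceptsNullValue(new HashMap<>()));   // true
        System.out.println(acceptsNullKey(new Hashtable<>()));   // false
        System.out.println(acceptsNullValue(new Hashtable<>())); // false
        System.out.println(acceptsNullKey(new TreeMap<>()));     // false
        System.out.println(acceptsNullValue(new TreeMap<>()));   // true
    }
}
```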
Popular implementation
The most popular implementations of the List interface in Java are ArrayList, LinkedList and Vector. ArrayList is more general purpose and provides random access by index, while LinkedList is more suitable for frequently adding and removing elements. Vector is the synchronized counterpart of ArrayList. On the other hand, the most popular implementations of the Set interface are HashSet, LinkedHashSet and TreeSet. The first is a general purpose Set which is backed by a HashMap. It doesn't provide any ordering guarantee, but LinkedHashSet does provide ordering along with the uniqueness offered by the Set interface. The third, TreeSet, is also an implementation of the SortedSet interface, hence it keeps elements in a sorted order specified by the compare() or compareTo() method. Lastly, the most popular implementations of the Map interface are HashMap, LinkedHashMap, Hashtable and TreeMap. The first is the non-synchronized general purpose Map implementation, while Hashtable is its synchronized counterpart; neither provides any ordering guarantee, which is what LinkedHashMap adds. Just like TreeSet, TreeMap is a sorted data structure and keeps its keys in sorted order.
