I know the following things about LinkedHashSet:
it maintains insertion order
it uses a linked list to preserve that order
My question is: how does hashing come into the picture?
I understand that if hashing is used, then the concept of bucketing comes in.
However, from checking the code in the JDK, it seems that the LinkedHashSet implementation contains only constructors and no other logic, so I guess all the logic happens in HashSet?
So does HashSet use a linked list by default?
Let me put my question this way: if the objective is to write a collection that
maintains unique values
preserves insertion order using a linked list
then it can easily be done without hashing. Maybe we could call this collection LinkedSet.
I saw a similar question, "What's the difference between HashSet and LinkedHashSet", but it was not very helpful.
Let me know if I need to explain my question more.
False. The implementation of LinkedHashSet is really all in LinkedHashMap. (And the implementation of HashSet is really all in HashMap. Le gasp!)
HashSet has no linked list at all.
It's entirely possible to write a LinkedSet collection backed by a linked list, that keeps elements unique -- it's just that its performance will be pretty crappy.
It's an 'interesting' implementation. The constructors for LinkedHashSet defer to package-private constructors in HashSet, which set up the data structure (a LinkedHashMap) for maintaining iteration order.
HashSet(int initialCapacity, float loadFactor, boolean dummy) {
map = new LinkedHashMap<E,Object>(initialCapacity, loadFactor);
}
The API designers could simply have exposed this constructor as public, with appropriate documentation, but I guess they wanted the code to be more 'self-documenting'.
If you look closely, you will see it is actually using some package-private constructors on HashSet that are there just for it, not the regular ones, e.g.:
HashSet(int initialCapacity, float loadFactor, boolean dummy) {
map = new LinkedHashMap<E,Object>(initialCapacity, loadFactor);
}
So the keySet being used to back the LinkedHashSet is in fact coming from the implementation of LinkedHashMap, not a regular HashMap like a regular HashSet. It doesn't actually use java.util.LinkedList. It just maintains pointers that form a list within the implementation of the bucket contents (Map.Entry<K,V>)
private static class Entry<K,V> extends HashMap.Entry<K,V> {
    // These fields comprise the doubly linked list used for iteration.
    Entry<K,V> before, after;

    Entry(int hash, K key, V value, HashMap.Entry<K,V> next) {
        super(hash, key, value, next);
    }
    // (remaining members omitted)
}
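The delegation described above can be sketched in a few lines. This is only an illustration, not the JDK source: the class name MapBackedSet, the PRESENT marker and the inOrder helper are made up here, and the real HashSet and LinkedHashSet have many more methods.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class: a set that stores its elements as the keys of a LinkedHashMap.
final class MapBackedSet<E> {
    // Dummy value shared by every key; only the keys matter.
    private static final Object PRESENT = new Object();

    private final Map<E, Object> map = new LinkedHashMap<>();

    // put returns null only when the key was not present before.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(E e) {
        return map.containsKey(e);
    }

    boolean remove(E e) {
        return map.remove(e) == PRESENT;
    }

    // Keys come back in insertion order because the backing map is a LinkedHashMap.
    List<E> inOrder() {
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        MapBackedSet<String> set = new MapBackedSet<>();
        set.add("x");
        set.add("a");
        set.add("x"); // already present, ignored
        System.out.println(set.inOrder()); // prints [x, a]
    }
}
```

Swapping the LinkedHashMap for a plain HashMap would give the unordered behaviour of a regular HashSet, which is the whole difference between the two classes.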
Hashing comes into the picture because it's an easy way to create a collection that enforces uniqueness and offers constant-time performance for most operations. Sure, we could just use a linked list and add uniqueness checking, but the time for several operations would become O(N), because you'd have to iterate over the whole list to check for duplicates.
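To make that cost concrete, here is a minimal sketch of a hash-free set. The name NaiveLinkedSet and its methods are hypothetical; nothing like it exists in the JDK.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical "LinkedSet": unique elements in insertion order, with no hashing at all.
final class NaiveLinkedSet<E> {
    private final LinkedList<E> items = new LinkedList<>();

    // O(n): contains() has to walk the list to rule out a duplicate.
    boolean add(E e) {
        if (items.contains(e)) {
            return false;
        }
        items.addLast(e);
        return true;
    }

    // O(n) linear scan, where a hash-based set only inspects one bucket.
    boolean contains(E e) {
        return items.contains(e);
    }

    List<E> toList() {
        return new ArrayList<>(items);
    }

    public static void main(String[] args) {
        NaiveLinkedSet<String> set = new NaiveLinkedSet<>();
        set.add("b");
        set.add("a");
        set.add("b"); // rejected as a duplicate after a full scan
        System.out.println(set.toList()); // prints [b, a]
    }
}
```

It behaves like a LinkedHashSet from the outside, but every add and contains gets slower as the set grows.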
Code Sample
Set<Registeration> registerationSet = new LinkedHashSet<>();
registerationSet.add(new Registeration());
Explanation of line 2:
1. The hashCode of the Registeration object is computed.
2. The hash code is used to locate the bucket in registerationSet.
3. The bucket is checked for an object that equals the new one.
3.1. If an equal object is found, the set is left unchanged: add returns false and the existing reference is kept.
3.2. If none is found, the Registeration object's reference is added to the bucket.
In parallel,
a linked list records the insertion order of all elements.
A new reference is always added to the end.
If an equal element was already present (3.1 above), its position in the list does not change.
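The duplicate case is easy to check with a small program. This is a sketch using String elements instead of Registeration (so that equals and hashCode are already defined); the class and method names are made up for the example.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class DuplicateAddDemo {
    // Builds a set with two elements in a known insertion order.
    static Set<String> build() {
        Set<String> set = new LinkedHashSet<>();
        set.add("first");
        set.add("second");
        return set;
    }

    // Adds two equal but distinct String objects and reports which one the set kept.
    static boolean keepsOriginalReference() {
        String original = new String("x");
        String equalCopy = new String("x");
        Set<String> set = new LinkedHashSet<>();
        set.add(original);
        set.add(equalCopy); // equal to an existing element, so the set is unchanged
        return set.iterator().next() == original;
    }

    public static void main(String[] args) {
        Set<String> set = build();
        System.out.println(set.add("first"));          // prints false
        System.out.println(set);                        // prints [first, second]
        System.out.println(keepsOriginalReference());   // prints true
    }
}
```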
For a Specific answer to your question
how does hashing come into picture? (in a LinkedHashSet)
What the Java Docs say...
Like HashSet, it provides constant-time performance for the basic operations (add, contains and remove), assuming the hash function disperses elements properly among the buckets.
This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order).
The buckets, located via the hash code, are used to speed up lookups, and the linked list is there so that the iterator returns elements in insertion order.
Hope i have answered your question?
I cannot understand the use of HashFunction in LinkedHashMap.
In the HashMap implementation, the use of hashFunction is to find the index of the internal array, which can be justified, following the hashfunction contract (same key will must have same hashcode, but distinct key can have same hashcode).
My questions are:
1) What is the use of hashfunction in LinkedHashMap?
2) How does the put and get method works for LinkedHashMap?
3) Why does it maintains the doublylinkedlist internally?
Whats wrong in using the HashMap as internal implementation(just like HashSet) and maintain a separate Array/List of indexes of the Entry array in the sequence of insertion?
Appreciate useful response and references.
1) LinkedHashMap extends HashMap, so the hash function is the same as HashMap's (if you check the code, the hash function is inherited from HashMap). The function computes the hash of the key being inserted, and the hash is used to store the entry in a bucket together with the other entries whose keys hash to the same bucket. The hash function is also used by the get method to retrieve the value for the key passed as a parameter.
2) The put and get methods behave the same way as in HashMap, and in addition they track the insertion order of the elements, so when you iterate over the key set you get the keys in the order in which you inserted them into the map (see here for more details).
3) LinkedHashMap uses a doubly linked list rather than an array because an entry can be unlinked from the middle of a doubly linked list in constant time, whereas removing from the middle of an array means shifting every later element. If you mostly append elements, an array-based implementation may be better, but since removing elements from a map can be a frequent operation, a doubly linked list is the better fit. The internal implementation could have been built on java.util.LinkedList, but my opinion is that using a low-level data structure is more efficient and decouples LinkedHashMap from other classes.
A LinkedHashMap does use a HashMap (in fact it extends it), so the hashCode is used to identify the right hash bucket in the array of hash buckets, just as for HashMap. put and get work just as for HashMap (except that LinkedHashMap also updates the before and after references used for iterating over the entries, which a plain HashMap does not have).
The reason insertion order is not kept in an array or ArrayList is that addition or removal in the middle of an ArrayList is an O(n) operation, because you have to move all subsequent items along one place. You could do this with a linked list, because addition and removal in the middle of a linked list is O(1) once you hold a reference to the node (all you have to do is break a few links and make a few new ones). However, there's no point using a separate LinkedList, because you may as well make the Map.Entry objects reference the previous and next Entry objects, which is exactly how LinkedHashMap works.
LinkedHashMap is a good choice for a data structure where you want to be able to put and get entries with O(1) running time, but you also need the behavior of a LinkedList. The internal hashing function is what allows you put and get entries with constant-time.
Here is how you use LinkedHashMap:
Map<String, String> linkedHashMap = new LinkedHashMap<String, String>();
linkedHashMap.put("today", "Wednesday");
linkedHashMap.put("tomorrow", "Thursday");
String today = linkedHashMap.get("today"); // today is 'Wednesday'
There are several arguments against using a simple HashMap and maintaining a separate List for the insertion order. First, if you go this route it means you will have to maintain 2 data structures instead of one. This is error prone, and makes maintaining your code more difficult. Second, if you have to make your data structure Thread-safe, this would be complex for 2 data structures. On the other hand, if you use a LinkedHashMap you only would have to worry about making this single data structure thread-safe.
As for implementation details, when you do a put into a LinkedHashMap, the map calls your key's hashCode() method (an ordinary, non-cryptographic hash) and uses the result to pick a bucket in its internal array, where the entry holding your key and value is stored. Doing a get with a given key uses the same computation to find the bucket and then compares keys with equals() to find the entry. The entrySet() method returns a Set view of all the key-value pairs in the LinkedHashMap. Sets in general are not ordered, but iterating over this view follows the map's insertion order. Like the map itself, the entrySet() view is not thread-safe.
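The step from key to bucket can be sketched as follows. This is a simplified illustration modeled on the bit-spreading that OpenJDK 8's HashMap applies; the class BucketIndexDemo and its method names are made up, and the table length is assumed to be a power of two.

```java
final class BucketIndexDemo {
    // Mixes the high bits of hashCode() into the low bits, as OpenJDK 8's HashMap does.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Maps a key to a slot in a table whose length is a power of two.
    static int bucketIndex(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        // "a" and "q" have different hash codes (97 and 113) but share bucket 1 of 16.
        System.out.println(bucketIndex("a", 16)); // prints 1
        System.out.println(bucketIndex("q", 16)); // prints 1
    }
}
```

Two different keys landing in the same bucket, as "a" and "q" do here, is exactly the collision case that the per-bucket chain handles.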
Ans. 2)
When we call put(key, value) on a LinkedHashMap, it internally calls createEntry:
void createEntry(int hash, K key, V value, int bucketIndex) {
    HashMap.Entry<K,V> old = table[bucketIndex];
    Entry<K,V> e = new Entry<K,V>(hash, key, value, old); // new entry goes to the front of the bucket chain
    table[bucketIndex] = e;
    e.addBefore(header); // and to the end of the insertion-order list
    size++;
}
Ans 3)
To efficiently maintain a LinkedHashMap, you actually need a doubly linked list.
Consider three entries in order:
A ---> B ---> C
Suppose you want to remove B. Obviously A should now point to C. But unless you know the entry before B, you cannot efficiently say which entry should now point to C. To fix this, you need the entries to point in both directions, like this:
    --->   --->
  A      B      C
    <---   <---
This way, when you remove B you can look at the entries before and after B (A and C) and update them so that A and C point to each other.
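The removal described above can be sketched like this. The Node class and the append, unlink and walk helpers are hypothetical names for the illustration, not LinkedHashMap's internals.

```java
final class DoublyLinkedDemo {
    static final class Node {
        final String name;
        Node before, after;

        Node(String name) {
            this.name = name;
        }
    }

    // Links node directly after tail (tail is assumed to be the last node).
    static void append(Node tail, Node node) {
        tail.after = node;
        node.before = tail;
    }

    // O(1): the node already knows both neighbours, so no traversal is needed.
    static void unlink(Node node) {
        if (node.before != null) {
            node.before.after = node.after;
        }
        if (node.after != null) {
            node.after.before = node.before;
        }
        node.before = null;
        node.after = null;
    }

    // Follows the after references from head and joins the names.
    static String walk(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.after) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node a = new Node("A"), b = new Node("B"), c = new Node("C");
        append(a, b);
        append(b, c);
        unlink(b);
        System.out.println(walk(a)); // prints A -> C
    }
}
```

With only forward references, unlink would first have to search from the head to find the node before B, which is the O(n) step the second reference avoids.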
similar post in this link discussed earlier
why linkedhashmap maintains doubly linked list for iteration
I read that HashMap has the following implementation:
main array
↓
[Entry] → Entry → Entry ← linked-list implementation
[Entry]
[Entry] → Entry
[Entry]
[null ]
So, it has an array of Entry objects.
Questions:
I was wondering how can an index of this array store multiple Entry objects in case of same hashCode but different objects.
How is this different from LinkedHashMap implementation? Its doubly linked list implementation of map but does it maintain an array like the above and how does it store pointers to the next and previous element?
HashMap does not maintain insertion order, hence it does not maintain any doubly linked list.
Most salient feature of LinkedHashMap is that it maintains insertion order of key-value pairs. LinkedHashMap uses doubly Linked List for doing so.
Entry of LinkedHashMap looks like this-
static class Entry<K, V> {
    K key;
    V value;
    Entry<K,V> next;
    Entry<K,V> before, after; // For maintaining insertion order

    public Entry(K key, V value, Entry<K,V> next) {
        this.key = key;
        this.value = value;
        this.next = next;
    }
}
By using before and after, we keep track of each newly added entry in the LinkedHashMap, which helps us maintain insertion order.
before refers to the previously inserted entry and
after refers to the next inserted entry in the LinkedHashMap.
For diagrams and step by step explanation please refer http://www.javamadesoeasy.com/2015/02/linkedhashmap-custom-implementation.html
Thanks..!!
So, it has an array of Entry objects.
Not exactly. It has an array of Entry object chains. A HashMap.Entry object has a next field allowing the Entry objects to be chained as a linked list.
I was wondering how can an index of this array store multiple Entry objects in case of same hashCode but different objects.
Because (as the picture in your question shows) the Entry objects are chained.
How is this different from LinkedHashMap implementation? Its doubly linked list implementation of map but does it maintain an array like the above and how does it store pointers to the next and previous element?
In the LinkedHashMap implementation, the LinkedHashMap.Entry class extends the HashMap.Entry class, by adding before and after fields. These fields are used to assemble the LinkedHashMap.Entry objects into an independent doubly-linked list that records the insertion order. So, in the LinkedHashMap class, each entry object is in two distinct chains:
There are a number of singly linked hash chains that are accessed via the main hash array. These are used for (regular) hashmap lookups.
There is a separate doubly linked list that contains all of the entry objects. It is kept in entry insertion order, and is used when you iterate the entries, keys or values in the hashmap.
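To see both chains working together, here is a stripped-down teaching sketch. TinyLinkedHashMap is a made-up class, not the JDK implementation, and it makes simplifying assumptions: a fixed table of 16 buckets, no resizing, no removal and no null keys.

```java
import java.util.ArrayList;
import java.util.List;

final class TinyLinkedHashMap<K, V> {
    private static final class Entry<K, V> {
        final K key;
        V value;
        Entry<K, V> next;          // chain 1: next entry in the same bucket
        Entry<K, V> before, after; // chain 2: insertion-order list

        Entry(K key, V value, Entry<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Entry<K, V>[] table = (Entry<K, V>[]) new Entry[16];
    private Entry<K, V> head, tail;

    private int indexFor(K key) {
        return (table.length - 1) & key.hashCode();
    }

    V put(K key, V value) {
        int i = indexFor(key);
        // Walk the bucket chain; an existing key keeps its place in both chains.
        for (Entry<K, V> e = table[i]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        // New key: push onto the bucket chain and append to the order list.
        Entry<K, V> created = new Entry<>(key, value, table[i]);
        table[i] = created;
        if (tail == null) {
            head = created;
        } else {
            tail.after = created;
            created.before = tail;
        }
        tail = created;
        return null;
    }

    // Lookups use only the hash array and the bucket chain.
    V get(K key) {
        for (Entry<K, V> e = table[indexFor(key)]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }

    // Iteration uses only the doubly linked list and never touches the array.
    List<K> keysInInsertionOrder() {
        List<K> keys = new ArrayList<>();
        for (Entry<K, V> e = head; e != null; e = e.after) {
            keys.add(e.key);
        }
        return keys;
    }

    public static void main(String[] args) {
        TinyLinkedHashMap<String, Integer> map = new TinyLinkedHashMap<>();
        map.put("a", 1);
        map.put("q", 2); // same bucket as "a" in a 16-slot table
        map.put("b", 3);
        System.out.println(map.keysInInsertionOrder()); // prints [a, q, b]
    }
}
```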
Take a look for yourself. For future reference, you can just google:
java LinkedHashMap source
HashMap chains colliding entries in a linked list within each bucket, but the difference between HashMap and LinkedHashMap is that LinkedHashMap has a predictable iteration order, which is achieved through an additional doubly-linked list that usually maintains the insertion order of the keys. Re-inserting an existing key does not change this order: the key keeps its original position in the list.
For reference, iterating through a LinkedHashMap is more efficient than iterating through a HashMap, but LinkedHashMap is less memory efficient.
In case it wasn't clear from my explanation above: the hashing process is the same, so you get the benefits of a normal hash table, but you also get the iteration benefits stated above, since a doubly linked list maintains the ordering of your Entry objects. That list is independent of the linked lists used for collisions during hashing.
EDIT: (in response to OP's comment):
A HashMap is backed by an array, in which some slots contain chains of Entry objects to handle the collisions. To iterate through all of the (key,value) pairs, you would need to go through all of the slots in the array and then walk each slot's chain; hence, your overall time would be proportional to the capacity.
When using a LinkedHashMap, all you need to do is traverse through the doubly-linked list, so the overall time is proportional to the size.
Since none of the other answers actually explain how something like this could be implemented I'll give it a shot.
One way would be to keep some extra information in the value (of the key->value pair), not visible to the user, that holds a reference to the previous and next element inserted into the hash map. The benefit is that you can still delete elements in constant time: removing from a hash map is constant time, and removing from a linked list is constant time in this case because you already have a reference to the entry. You can still insert in constant time: hash map insertion is constant, and although linked list insertion normally isn't, here you have constant-time access to the spot in the list, so you can insert in constant time. Lastly, retrieval is constant time because it only involves the hash map part of the structure.
Keep in mind that a data structure like this does not come without costs. The size of the hash map will rise significantly because of all the extra references. Each of the main methods will be slightly slower (could matter if they are called repeatedly). And the indirection of the data structure (not sure if that's a real term :P) is increased, though this might not be as big a deal because the references are guaranteed to be pointing to stuff inside the hash map.
Since the only advantage of this type of structure is that it preserves order be careful when you use it. Also when reading the answer keep in mind I don't know that this is the way it's implemented but it is how I would do it if given the task.
On the oracle docs there is a quote confirming some of my guesses.
This implementation differs from HashMap in that it maintains a doubly-linked list running through all of its entries.
Another relevant quote from the same website.
This class provides all of the optional Map operations, and permits null elements. Like HashMap, it provides constant-time performance for the basic operations (add, contains and remove), assuming the hash function disperses elements properly among the buckets. Performance is likely to be just slightly below that of HashMap, due to the added expense of maintaining the linked list, with one exception: Iteration over the collection-views of a LinkedHashMap requires time proportional to the size of the map, regardless of its capacity. Iteration over a HashMap is likely to be more expensive, requiring time proportional to its capacity.
The hashCode is mapped to a bucket by the hash function. If two keys land in the same bucket, HashMap resolves the collision by chaining, i.e. it adds the entry to that bucket's linked list. Below is the part of put that walks the chain:
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
    Object k;
    if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
        V oldValue = e.value;
        e.value = value;
        e.recordAccess(this);
        return oldValue;
    }
}
You can clearly see that it traverses the linked list, and if it finds the key, it replaces the old value with the new one; otherwise a new entry is added to the chain.
But the difference between LinkedHashMap and HashMap is LinkedHashMap maintains the insertion order. From docs:
This linked list defines the iteration ordering, which is normally the order in which keys were inserted into the map (insertion-order). Note that insertion order is not affected if a key is re-inserted into the map. (A key k is reinserted into a map m if m.put(k, v) is invoked when m.containsKey(k) would return true immediately prior to the invocation).
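The re-insertion rule from the docs can be checked with a short program. This is a small sketch; the class name ReinsertDemo and its method are made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ReinsertDemo {
    // Inserts "a", then "b", then "a" again, and returns the resulting map.
    static Map<String, Integer> build() {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("a", 3); // "a" is re-inserted: value changes, position does not
        return map;
    }

    static List<String> keysAfterReinsert() {
        return new ArrayList<>(build().keySet());
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints {a=3, b=2}
    }
}
```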
Is LinkedHashMap LIFO or FIFO in nature?
If my map is of the form:
map.put(1,"one");
map.put(2,"two");
What would the order be if I were to iterate over the map using keySet()?
EDIT: I think I did actually confuse two different concepts. Let me rephrase the question: in what order would I encounter the entries using entrySet()? Thanks for pointing that out, btw. I do not intend to remove any entries.
In a linked hash map the elements in the backing doubly-linked list are added at the end (clearly: for preserving iteration order), but can be removed from any part in the list as the elements get removed from the map, it's incorrect to label the backing list (and by extension: the map) as LIFO or FIFO, it's neither - there's no concept of removal order in a map, and consequently no removal order can be assumed for the backing list in a linked hash map.
What a linked hash map does guarantee is that iterating over its contents (be it: the keys or the entries) will occur in the same order in which the elements were inserted in the map; from the documentation:
This implementation differs from HashMap in that it maintains a doubly-linked list running through all of its entries. This linked list defines the iteration ordering, which is normally the order in which keys were inserted into the map (insertion-order).
EDIT :
Regarding the last edit to the question, a LinkedHashMap guarantees that the iteration order of the keySet() will be the same order in which the elements were inserted: 1, 2 for the example in the question. This has nothing to do with FIFO/LIFO, those concepts deal with the order in which elements are removed from a data structure, and they're not related with the iteration order after inserting elements.
LinkedHashMap, to quote from the javadocs, is a "Hash table and linked list implementation of the Map interface, with predictable iteration order". So the keySet will return keys based on the order of insertion, essentially a FIFO.
When access order is not utilized (standard case) you can consider LHM as a linked list w/ very fast access O(1) by key.
In that aspect it is FIFO when access order is unused (look at the c-tors). When access order is used the insertion order doesn't matter if there are any get() operations as they reorder the Entries. Look at protected boolean removeEldestEntry(Map.Entry<K,V> eldest) eldest=FIFO.
Essentially the LHM is a good doubly linked list of Map.Entry<Key, Value> with a hash index over the keys.
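The access-order mode and removeEldestEntry mentioned above can be combined into a small bounded cache. This is a sketch, not a production cache: LruDemo and lruCache are hypothetical names, while the three-argument LinkedHashMap constructor and the protected removeEldestEntry hook are the real API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LruDemo {
    // Returns a map that evicts its least recently accessed entry once it exceeds maxEntries.
    static <K, V> Map<K, V> lruCache(final int maxEntries) {
        // The third constructor argument (true) switches from insertion order to access order.
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        Map<Integer, String> cache = lruCache(2);
        cache.put(1, "one");
        cache.put(2, "two");
        cache.get(1);          // touching 1 makes 2 the eldest entry
        cache.put(3, "three"); // exceeds the limit, so 2 is evicted
        System.out.println(cache.keySet()); // prints [1, 3]
    }
}
```

With the default constructor (access order off), the same sequence would evict key 1 instead, since eviction would then follow pure insertion order.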
I myself never use the vanilla HashMap as in its current impl. it has very little benefit over LHM - lower memory footprint but horrid iteration. Java8 (or 9) perhaps may finally fix HashMap, hopefully Doug Lea will push his impl.
According to the Java docs, if you iterate over the map, the key set will be in insertion order, so the first key you get is the first key that was inserted among the keys still in the map. Note that re-inserting a key-value pair does not change the key's original position.
I've been using HashMaps since I started programming again in Java without really understanding these Collections thing.
Honestly I am not really sure if using HashMaps all the way would be best for me or for production code. Up until now it didn't matter to me as long as I was able to get the data I need the way I called them in PHP (yes, I admit whatever negative thing you are thinking right now) where $this_is_array['this_is_a_string_index'] provides so much convenience to recall an array of variables.
So now, I have been working with java for more than 3 months and came across the Interfaces I specified above and wondered, why are there so many of these things (not to mention, vectors, abstractList {oh well the list goes on...})?
I mean how are they different from each other?
And more importantly, what is the best Interface to use in my case?
The API is pretty clear about the differences and/or relations between them:
Collection
The root interface in the collection hierarchy. A collection represents a group of objects, known as its elements. Some collections allow duplicate elements and others do not. Some are ordered and others unordered.
http://download.oracle.com/javase/6/docs/api/java/util/Collection.html
List
An ordered collection (also known as a sequence). The user of this interface has precise control over where in the list each element is inserted. The user can access elements by their integer index (position in the list), and search for elements in the list.
http://download.oracle.com/javase/6/docs/api/java/util/List.html
Set
A collection that contains no duplicate elements. More formally, sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element. As implied by its name, this interface models the mathematical set abstraction.
http://download.oracle.com/javase/6/docs/api/java/util/Set.html
Map
An object that maps keys to values. A map cannot contain duplicate keys; each key can map to at most one value.
http://download.oracle.com/javase/6/docs/api/java/util/Map.html
Is there anything in particular you find confusing about the above? If so, please edit your original question. Thanks.
A short summary of common java collections:
'Map': A 'Map' is a container that stores key=>value pairs. This enables fast searches using the key to get to its associated value. The two most commonly used implementations in the java.util package are 'HashMap' and 'TreeMap'. The former is implemented as a hashtable, while the latter is implemented as a balanced binary search tree (thus also having the property of keeping the keys sorted).
'Set': A 'Set' is a container that holds only unique elements. Inserting the same value multiple times will still result in the 'Set' holding only one instance of it. It also provides fast operations to search, remove, add, merge and compute the intersection of two sets. Like 'Map', it has two commonly used implementations, 'HashSet' and 'TreeSet'.
'List': The 'List' interface is implemented by the 'Vector', 'ArrayList' and 'LinkedList' classes. A 'List' is basically a collection of elements that preserve their relative order. You can add/remove elements to it and access individual elements at any given position. Unlike a 'Map', 'List' items are indexed by an int that is their position in the 'List' (the first element being at position 0 and the last at 'List.size()'-1). 'Vector' and 'ArrayList' are implemented using an array while 'LinkedList', as the name implies, uses a linked list. One thing to note is that, unlike PHP's associative arrays (which are more like a Map), an array in Java and many other languages actually represents a contiguous block of memory. The elements in an array are basically laid out side by side in adjacent "slots", so to speak. This gives very fast lookup and write times, much faster than associative arrays, which are implemented using more complex data structures. But arrays can't be indexed by anything other than the numeric positions within the array, unlike associative arrays.
To get a really good idea of what each collection is good for and their performance characteristics I would recommend getting a good idea about data structures like arrays, linked lists, binary search trees, hashtables, as well as stacks and queues. There is really no substitute to learning this if you want to be an effective programmer in any language.
You can also read the Java Collections trail to get you started.
In Brief (and only looking at interfaces):
List - a list of values, something like a "resizable array"
Set - a container that does not allow duplicates
Map - a collection of key/value pairs
A Map vs a List.
In a Map, you have key/value pairs. To access a value you need to know the key. There is a relationship that exists between the key and the value that persists and is not arbitrary. They are related somehow. Example: A persons DNA is unique (the key) and a persons name (the value) or a persons SSN (the key) and a persons name (the value) there is a strong relationship.
In a List, all you have are values (a persons name), and to access it you need to know its position in the list (index) to access it. But there is no permanent relationship between the position of the value in the list and its index, it is arbitrary.
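A few lines of Java show the three interfaces side by side. This is only a sketch; the class InterfacesDemo, its method names and the sample ids and names are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class InterfacesDemo {
    // List: values accessed by position, duplicates allowed.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        names.add("Ann");
        names.add("Bob");
        names.add("Ann");
        return names;
    }

    // Set: the same values with duplicates collapsed.
    static Set<String> uniqueNames() {
        return new HashSet<>(names());
    }

    // Map: values accessed by key, much like a PHP associative array.
    static Map<String, String> nameById() {
        Map<String, String> byId = new HashMap<>();
        byId.put("id-1", "Ann");
        byId.put("id-2", "Bob");
        return byId;
    }

    public static void main(String[] args) {
        System.out.println(names().get(2));         // prints Ann
        System.out.println(uniqueNames().size());   // prints 2
        System.out.println(nameById().get("id-2")); // prints Bob
    }
}
```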
■ List — An ordered collection of elements that allows duplicate entries
Concrete Classes:
ArrayList — Standard resizable list.
LinkedList — Can easily add/remove from beginning or end.
Vector — Older thread-safe version of ArrayList.
Stack — Older last-in, first-out class.
■ Set — Does not allow duplicates
Concrete Classes:
HashSet—Uses hashcode() to find unordered elements.
TreeSet—Sorted and navigable. Does not allow null values.
■ Queue — Orders elements for processing
Concrete Classes:
LinkedList — Can easily add/remove from beginning or end.
ArrayDeque—First-in, first-out or last-in, first-out. Does not allow null values.
■ Map — Maps unique keys to values
Concrete Classes:
HashMap — Uses hashcode() to find keys.
TreeMap — Sorted map. Does not allow null keys.
Hashtable — Older version of hashmap. Does not allow null keys or values.
That is a question that ultimately has a very complex answer--there are entire college classes dedicated to data structures. The short answer is that they all have trade-offs in memory usage and the speed of various operations.
What would be really healthy is some time with a nice book on data structures--I can almost guarantee that your code will improve significantly if you get a nice understanding of data structures.
That said, I can give you some quick, temporary advice from my experience with Java. For most simple internal things, ArrayList is generally preferred. For passing collections of data about, simple arrays are generally used. HashMap is only really used for cases when there is some logical reason to have special keys corresponding to values--I haven't seen anyone use them as a general data structure for everything. Other structures are more complicated and tend to be used in special cases.
As you already know, they are containers for objects. Reading their respective APIs will help you understand their differences.
Since others have described what are their differences about their usage, I will point you to this link which describes complexity of various data structures.
This list is programming language agnostic, and, as always, real world implementations will vary.
It is useful to understand complexity of various operations for each of these structures, since in the real world, it will matter if you're constantly searching for an object in your 1,000,000 element linked list that's not sorted. Performance will not be optimal.
List Vs Set Vs Map
1) Duplicates: List allows duplicate elements. Any number of duplicate elements can be inserted into the list without affecting the existing equal values or their indexes.
Set doesn't allow duplicates. Set and all of the classes that implement the Set interface hold only unique elements.
Map stores its elements as key-value pairs. Map doesn't allow duplicate keys, but it does allow duplicate values.
2) Null values: List allows any number of null values.
Set allows at most one null value (some implementations, such as TreeSet with natural ordering, reject null altogether).
Map can have at most one null key and any number of null values.
3) Order: List and all of its implementation classes maintain insertion order.
Set as an interface doesn't guarantee any order, but some of its implementations do: LinkedHashSet keeps the elements in insertion order and TreeSet keeps them sorted.
Similarly, Map as an interface doesn't guarantee any order, but some of its implementations do. For example, TreeMap keeps the map sorted in ascending order of keys, and LinkedHashMap keeps the entries in insertion order, the order in which they were added to the LinkedHashMap.
Difference between Set, List and Map in Java -
Set, List and Map are three important interfaces of the Java collection framework, and the difference between Set, List and Map in Java is one of the most frequently asked Java Collection interview questions. Sometimes this question is asked as "When to use List, Set and Map in Java". Clearly, the interviewer wants to know whether you are familiar with the fundamentals of the Java collection framework. In order to decide when to use List, Set or Map, you need to know what these interfaces are and what functionality they provide. List in Java provides an ordered and indexed collection which may contain duplicates. Set provides an unordered collection of unique objects, i.e. Set doesn't allow duplicates, while Map provides a data structure based on key-value pairs, often implemented with hashing. All three, List, Set and Map, are interfaces in Java, and many concrete implementations of them are available in the Collection API. ArrayList and LinkedList are the two most popular List implementations, while LinkedHashSet, TreeSet and HashSet are frequently used Set implementations. In this Java article we will see the difference between Map, Set and List in Java and learn when to use List, Set or Map.
Set vs List vs Map in Java
As I said, Set, List and Map are interfaces that define a core contract, e.g. the Set contract says that it cannot contain duplicates. Based on our knowledge of List, Set and Map, let's compare them on different metrics.
Duplicate Objects
The main difference between the List and Set interfaces in Java is that List allows duplicates while Set doesn't. All implementations of Set honor this contract. Map holds two objects per entry, a key and a value; it may contain duplicate values, but keys are always unique. See here for more differences between the List and Set data structures in Java.
Order
Another key difference between List and Set is that List is an ordered collection: the List contract preserves the order in which elements are inserted. Set is an unordered collection; you get no guarantee about the order in which elements will be stored or iterated. Some Set implementations do maintain an order, though, e.g. LinkedHashSet keeps insertion order. SortedSet and SortedMap implementations, e.g. TreeSet and TreeMap, maintain a sort order imposed by a Comparator or by Comparable.
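To make the ordering differences concrete, here is a minimal sketch. The class name SetOrderDemo, the helper method and the sample strings are hypothetical choices for this example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderDemo {
    // Adds the same three strings to the given set and returns them
    // in the order the set iterates over them.
    static List<String> iterationOrder(Set<String> target) {
        target.add("pear");
        target.add("apple");
        target.add("fig");
        return new ArrayList<>(target);
    }

    public static void main(String[] args) {
        // LinkedHashSet: insertion order.
        System.out.println(iterationOrder(new LinkedHashSet<>())); // [pear, apple, fig]
        // TreeSet: sorted by Comparable (natural String order here).
        System.out.println(iterationOrder(new TreeSet<>()));       // [apple, fig, pear]
        // HashSet: some order, but not one you should rely on.
        System.out.println(iterationOrder(new HashSet<>()));
    }
}
```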
Null elements
List allows null elements, and you can have many null objects in a List because it also allows duplicates. A Set that permits null, such as HashSet, can hold at most one null element, since duplicates are not permitted. In a Map such as HashMap you can have many null values and at most one null key. It is worth noting that Hashtable doesn't allow null keys or values, whereas HashMap allows null values and one null key. This is also one of the main differences between these two popular implementations of the Map interface, HashMap and Hashtable.
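The null handling described above can be sketched as follows. The class name NullDemo and its method names are invented for this example; the behaviour shown is for ArrayList, HashSet, HashMap and Hashtable only, and other implementations may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Hashtable;
import java.util.List;
import java.util.Map;
import java.util.Set;

class NullDemo {
    // An ArrayList stores every null it is given.
    static int nullsInList() {
        List<String> list = new ArrayList<>();
        list.add(null);
        list.add(null);
        return list.size();
    }

    // A HashSet keeps only one null, because the second is a duplicate.
    static int nullsInSet() {
        Set<String> set = new HashSet<>();
        set.add(null);
        set.add(null);
        return set.size();
    }

    // A HashMap accepts one null key and any number of null values.
    static boolean hashMapAcceptsNulls() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "value for null key");
        map.put("k", null);
        return map.containsKey(null) && map.containsKey("k") && map.get("k") == null;
    }

    // A Hashtable throws NullPointerException for a null key.
    static boolean hashtableRejectsNullKey() {
        try {
            new Hashtable<String, String>().put(null, "v");
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(nullsInList());             // 2
        System.out.println(nullsInSet());              // 1
        System.out.println(hashMapAcceptsNulls());     // true
        System.out.println(hashtableRejectsNullKey()); // true
    }
}
```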
Popular implementation
The most popular implementations of the List interface in Java are ArrayList, LinkedList and Vector. ArrayList is more general purpose and provides random access by index, while LinkedList is more suitable for frequently adding and removing elements. Vector is the synchronized counterpart of ArrayList. The most popular implementations of the Set interface are HashSet, LinkedHashSet and TreeSet. The first is a general-purpose Set backed by a HashMap. It doesn't provide any ordering guarantee, but LinkedHashSet does provide ordering along with the uniqueness offered by the Set interface. The third implementation, TreeSet, also implements the SortedSet interface, hence it keeps elements in a sorted order specified by the compare() or compareTo() method. Finally, the most popular implementations of the Map interface are HashMap, LinkedHashMap, Hashtable and TreeMap. The first is the non-synchronized general-purpose Map implementation, while Hashtable is its synchronized counterpart; neither provides any ordering guarantee, which is what LinkedHashMap adds. Just like TreeSet, TreeMap is a sorted data structure and keeps its keys in sorted order.
I'm using LinkedHashSet. I want to insert items at the 0th position, like:
Set<String> set = new LinkedHashSet<String>();
for (int i = 0; i < n; i++) {
set.add(0, "blah" + i);
}
I'm not sure how linked hash set is implemented, is inserting going to physically move all addresses of current items, or is it the same cost as inserting as in a linked-list implementation?
Thank you
------ Edit ---------------
Complete mess up by me, was referencing ArrayList docs. The Set interface has no add(index, object) method. Is there a way to iterate over the set backwards then? Right now to iterate I'm doing:
for (String it : set) {
}
can we do that in reverse?
Thanks
Sets are, by definition, independent of order. Thus, Set doesn't have add(int , Object) method available.
This is also true of LinkedHashSet http://download.oracle.com/javase/6/docs/api/java/util/LinkedHashSet.html
LinkedHashSet maintains insertion order and thus all elements are added at the end of the linked list. This is achieved using the LinkedHashMap. You can have a look at the method linkEntry in LinkedHashMap http://www.docjar.com/html/api/java/util/LinkedHashMap.java.html
Edit: in response to edited question
There is no API method available to do this. But you can do the following
Add Set to a List using new ArrayList(Set)
Use Collections.reverse(List)
Iterate this list
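The three steps above could look roughly like this. The class name ReverseIterationDemo, the helper reversedCopy and the sample elements are made up for this sketch; note that it copies the whole set, so it costs O(n) time and memory.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ReverseIterationDemo {
    // Copies the set into a list and reverses the copy in place.
    // The original set is not modified.
    static <T> List<T> reversedCopy(Set<T> set) {
        List<T> copy = new ArrayList<>(set); // step 1: add the Set to a List
        Collections.reverse(copy);           // step 2: reverse the List
        return copy;
    }

    public static void main(String[] args) {
        Set<String> set = new LinkedHashSet<>();
        set.add("blah0");
        set.add("blah1");
        set.add("blah2");

        // step 3: iterate the reversed list
        for (String s : reversedCopy(set)) {
            System.out.println(s); // blah2, blah1, blah0 (one per line)
        }
    }
}
```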
Judging by the source code of LinkedHashMap (which backs LinkedHashSet -- see http://www.docjar.com/html/api/java/util/LinkedHashMap.java.html ), inserts are cheap, like in a linked list.
To answer your latest question, there is no reverse iterator feature available from LinkedHashSet, even though internally the implementation uses a doubly linked list.
There is an open Request For Enhancement about this:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4848853
Mark Peters links to functionality available in Guava, though that approach actually copies the set into a list before reversing it.
As already mentioned, LinkedHashSet is built on LinkedHashMap, which is built on HashMap :) The Javadoc says that it takes constant time to add an element to a HashMap, assuming that your hash function is implemented properly. If your hash function is not implemented well, an insert may take up to O(n).
Iterating backwards is not supported at the moment.
You can't add elements to the front of a LinkedHashSet... it has no method such as add(int, Object) nor any other methods that make use of the concept of an "index" in the set (that's a List concept). It only provides consistent iteration order, based on the order in which elements were inserted. The most recently inserted element that was not already in the set will be the last element when you iterate over it.
And the Javadoc for LinkedHashSet explicitly states:
Like HashSet, it provides constant-time performance for the basic operations (add, contains and remove), assuming the hash function disperses elements properly among the buckets.
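The insertion-order behaviour described in this answer can be seen in a small sketch. The class name ReinsertDemo and its two helper methods are hypothetical. The remove-then-add trick in the second method is just one way to move an element to the end, not something the original answer proposes.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ReinsertDemo {
    // Adding an element that is already present leaves the order unchanged.
    static List<String> addExistingElement() {
        Set<String> set = new LinkedHashSet<>();
        set.add("a");
        set.add("b");
        set.add("c");
        set.add("a"); // already in the set: returns false, position of "a" is kept
        return new ArrayList<>(set);
    }

    // Removing and re-adding makes the element the most recently inserted one.
    static List<String> removeThenAdd() {
        Set<String> set = new LinkedHashSet<>();
        set.add("a");
        set.add("b");
        set.add("c");
        set.remove("a");
        set.add("a"); // now a fresh insertion, so "a" goes to the end
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(addExistingElement()); // [a, b, c]
        System.out.println(removeThenAdd());      // [b, c, a]
    }
}
```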
Edit: There is not any way to iterate over a LinkedHashSet in reverse short of something like copying it to a List and iterating over that in reverse. Using Guava you could do that like:
for (String s : Lists.reverse(ImmutableList.copyOf(set))) { ... }
Note that while creating the ImmutableList does require iterating over each element of the original set, the reverse method simply provides a reverse view and doesn't iterate at all itself.