What's the difference between HashSet and LinkedHashSet in Java?

I saw that LinkedHashSet extends HashSet and I know it preserves order.
However, from checking the code in the JDK, it seems that LinkedHashSet contains only constructors and no implementation, so I guess all the logic happens in HashSet?
If that is correct, why is it designed like that? It seems very confusing.
EDIT: There was an unfortunate mistake in the question: I wrote HashMap and LinkedHashMap instead of HashSet and LinkedHashSet. I have fixed the question, so please answer it if possible.
Also, I was interested why Java designers chose to implement it like that.

Yes, LinkedHashMap calls its super constructor. One thing it does is to override the init() method, which is called by the super constructor.
LinkedHashMap is a HashMap with a doubly-linked list implementation added.

As you said, the difference between the two data structures is that LinkedHashMap is a HashMap that preserves the insertion order of its entries.
So the linked one is intended to be used as a HashMap via the standard HashMap methods, and the only method added is removeEldestEntry(), which is useful if you want to deal with the "list" part of the data structure.
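A minimal sketch of both points, assuming a hypothetical class name BoundedMapDemo and an arbitrary capacity of 3 (neither comes from the answer above): the map iterates in insertion order, and overriding the protected removeEldestEntry() hook lets it drop its oldest entry after each put.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BoundedMapDemo {
    // Hypothetical capacity, chosen for illustration only.
    static final int MAX_ENTRIES = 3;

    // Builds a LinkedHashMap that evicts its eldest entry once it grows past MAX_ENTRIES.
    static Map<String, Integer> newBoundedMap() {
        return new LinkedHashMap<String, Integer>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
                // Called by put() after each insertion; returning true removes 'eldest'.
                return size() > MAX_ENTRIES;
            }
        };
    }

    // Inserts four keys and returns the surviving keys in iteration order.
    static List<String> keysAfterInserts() {
        Map<String, Integer> map = newBoundedMap();
        map.put("one", 1);
        map.put("two", 2);
        map.put("three", 3);
        map.put("four", 4); // pushes the size to 4, so "one" is evicted
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterInserts()); // [two, three, four]
    }
}
```

With a plain HashMap in place of the LinkedHashMap there would be no "eldest" entry to speak of, because a HashMap does not track insertion order.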

Related

LinkedHashSet implementations of HashTable and LinkedList?

I came across the following in the Java docs for LinkedHashSet:
Hash table and linked list implementation of the Set interface, with predictable iteration order.
But when I look at the source of LinkedHashSet, I cannot find any implements/extends clause related to Hashtable or LinkedList. How, then, does it inherit the features of both these data structures?
It does not inherit from those classes, or use them in any way.
But you could write your own linked list class, and it would still be a linked list, even if it had no relationship to java.util.LinkedList. That's how LinkedHashSet works: it does not use java.util.Hashtable, nor java.util.LinkedList, but it has an implementation of the data structures nonetheless.
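To illustrate the idea, here is a rough sketch of a set that combines a hash lookup with a hand-rolled doubly-linked list. The class TinyLinkedSet is hypothetical and is not how the JDK implements LinkedHashSet; it borrows java.util.HashMap for the hash table part to stay short, while the linked list part uses no java.util.LinkedList at all.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only, not the JDK's implementation.
class TinyLinkedSet<E> {
    // One node of the hand-written doubly-linked list.
    private static final class Node<T> {
        final T value;
        Node<T> prev;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    // Hash table part: maps each element to its list node.
    private final Map<E, Node<E>> index = new HashMap<>();
    // Linked list part: first and last node in insertion order.
    private Node<E> head;
    private Node<E> tail;

    boolean add(E e) {
        if (index.containsKey(e)) return false; // duplicates are ignored
        Node<E> node = new Node<>(e);
        if (tail == null) {
            head = node;
        } else {
            tail.next = node;
            node.prev = tail;
        }
        tail = node;
        index.put(e, node);
        return true;
    }

    boolean remove(E e) {
        Node<E> node = index.remove(e);
        if (node == null) return false;
        // Unlink the node from its neighbours.
        if (node.prev == null) head = node.next; else node.prev.next = node.next;
        if (node.next == null) tail = node.prev; else node.next.prev = node.prev;
        return true;
    }

    boolean contains(E e) { return index.containsKey(e); }

    // Walks the linked list, so the result is in insertion order.
    List<E> toList() {
        List<E> out = new ArrayList<>();
        for (Node<E> n = head; n != null; n = n.next) out.add(n.value);
        return out;
    }

    public static void main(String[] args) {
        TinyLinkedSet<String> set = new TinyLinkedSet<>();
        set.add("b");
        set.add("a");
        set.add("b"); // ignored, already present
        set.add("c");
        set.remove("a");
        System.out.println(set.toList()); // [b, c]
    }
}
```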

Why doesn't LinkedHashSet have addFirst method?

As the documentation of LinkedHashSet states, it is
Hash table and linked list implementation of the Set interface, with
predictable iteration order. This implementation differs from HashSet
in that it maintains a doubly-linked list running through all of its
entries.
So it's essentially a HashSet with a FIFO queue of keys implemented by a linked list. Considering that LinkedList is a Deque and permits, in particular, insertion at the beginning, I wonder why LinkedHashSet doesn't have an addFirst(E e) method in addition to the methods present in the Set interface. It doesn't seem hard to implement.
As Eliott Frisch said, the answer is in the next sentence of the paragraph you quoted:
… This linked list defines the iteration ordering, which is the order
in which elements were inserted into the set (insertion-order). …
An addFirst method would break the insertion order and thereby the design idea of LinkedHashSet.
If I may add a bit of guesswork too, other possible reasons might include:
It’s not as simple to implement as it appears, since a LinkedHashSet is really implemented as a LinkedHashMap where the values mapped to are not used. At the very least you would have to change that class too (which in turn would also break its insertion order and thereby its design idea).
As that other guy may have intended in a comment, they didn’t find it useful.
That said, you are asking the question the wrong way around. They designed a class with a functionality for which they saw a need. They moved on to implement it using a hash table and a linked list. You are starting out from the implementation and using it as a basis for a design discussion. While that may occasionally add something useful, generally it’s not the way to good designs.
While I can in theory follow your point that there might be a situation where you want a double-ended queue with set property (duplicates are ignored/eliminated), I have a hard time imagining when a Deque would not fulfil your needs in this case (Eliott Frisch mentioned the under-used ArrayDeque). You need pretty large amounts of data and/or pretty strict performance requirements before the linear complexity of contains and remove would be prohibitive. And in that case you may already be better off custom designing your own data structure.
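A minimal sketch of the Deque-based alternative, assuming a hypothetical helper addFirstIfAbsent that is not part of any JDK class: an ArrayDeque plus a contains check gives a double-ended queue that ignores duplicates, at the cost of a linear scan per insertion.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class UniqueDequeDemo {
    // Hypothetical helper: adds e at the front unless it is already present.
    // Deque.contains is a linear scan on ArrayDeque, so this is O(n) per call.
    static <E> boolean addFirstIfAbsent(Deque<E> deque, E e) {
        if (deque.contains(e)) return false;
        deque.addFirst(e);
        return true;
    }

    static List<String> demo() {
        Deque<String> deque = new ArrayDeque<>();
        deque.addLast("b");
        deque.addLast("c");
        addFirstIfAbsent(deque, "a"); // inserted at the front
        addFirstIfAbsent(deque, "c"); // ignored, already present
        return new ArrayList<>(deque);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [a, b, c]
    }
}
```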

Set interface and natural ordering [nonsensical interview test] [duplicate]

By chance, I have twice been given the same Java question in a job interview test. To me it seems like nonsense. It goes something like this:
Which of these collections would you use if you needed a collection with no
duplicates and with natural ordering?
java.util.List
java.util.Map
java.util.Set
java.util.Collection
The closest answer would be Set. But as far as I know, these interfaces, with the exception of List, do not define any ordering in their contract; it is up to the implementing classes whether or not they define an ordering.
Was I right in pointing out in the test that the question is wrong?
The first major clue is "no duplicates." A mathematical set contains only unique items, which means no duplicates, so you are correct here.
In terms of ordering, perhaps the interviewer was looking for you to expand upon your answer. Just as Set extends Collection (in Java), there are more specific kinds of sets available in Java. See: HashSet, TreeSet, LinkedHashSet. For example, TreeSet implements the SortedSet interface.
However, it is most definitely true that the Set interface itself does not guarantee any ordering. Frankly, I think this is a poorly worded question, and you were right to point out the lack of precision.
Yes, you're correct that none of the answers given matches the requirements. A correct answer might have been SortedSet or its subinterface NavigableSet.
A Set with natural ordering is a SortedSet (which extends Set so it is-a Set), and a concrete implementation of that interface is TreeSet (which implements SortedSet so it is-a Set).
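A short sketch of that point, using a hypothetical class name NaturalOrderDemo: a TreeSet, referenced through the SortedSet interface, drops duplicates and iterates in the natural ordering of its elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class NaturalOrderDemo {
    // Copies the input into a TreeSet, which removes duplicates and sorts
    // by the elements' natural ordering (Integer implements Comparable).
    static List<Integer> sortedUnique(List<Integer> input) {
        SortedSet<Integer> set = new TreeSet<>(input);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        SortedSet<Integer> set = new TreeSet<>(Arrays.asList(3, 1, 2, 3, 1));
        System.out.println(set);         // [1, 2, 3]
        System.out.println(set.first()); // 1
        System.out.println(set.last());  // 3
    }
}
```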
The correct answer for that test is Set. Let's remember that it's asking for an interface that could provide that; given the right implementation, the Set interface could provide it.
The Map interface doesn't make any guarantees about the order in which
things are stored, as that's implementation specific. However, if you
use the right implementation (that is, TreeMap, as spelled out
by the docs), then you're guaranteed a natural ordering and no
duplicate keys. However, the question states no requirement for
key-value pairs.
The Set interface also doesn't make any guarantees about the order in which
things are stored, as that's implementation specific. But, like
TreeMap, TreeSet is a set that can be used to store things in a
natural order with no duplicates. Here's how it'd look.
Set<String> values = new TreeSet<>();
The List interface will definitely allow duplicates, which
instantly rules it out.
The Collection interface doesn't have anything directly implementing
it, but it is the patriarch of the entire collections hierarchy.
So, in theory, code like this is legal:
Collection<String> values = new TreeSet<>();
...but you'd lose information about what
kind of collection it actually was, so I'd discourage its
usage.
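To make the "lost information" concrete, here is a small sketch with a hypothetical class name StaticTypeDemo. The same TreeSet is sorted either way, but only the SortedSet-typed variable exposes the order-aware methods such as first().

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.SortedSet;
import java.util.TreeSet;

class StaticTypeDemo {
    static String firstViaSortedSet() {
        SortedSet<String> values = new TreeSet<>(Arrays.asList("pear", "apple", "fig"));
        return values.first(); // first() is declared on SortedSet
    }

    static String firstViaCollection() {
        Collection<String> values = new TreeSet<>(Arrays.asList("pear", "apple", "fig"));
        // values.first() would not compile here: Collection declares no first().
        // The object is still a TreeSet, so iteration is still in sorted order.
        return values.iterator().next();
    }

    public static void main(String[] args) {
        System.out.println(firstViaSortedSet());  // apple
        System.out.println(firstViaCollection()); // apple
    }
}
```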

Interchanging HashSet and TreeSet [duplicate]

Can you use HashSet and TreeSet interchangeably? If I exchanged TreeSet for HashSet and vice versa in a program, what issues would there be? I'm aware you need to implement Comparable for a TreeSet.
If some API requires Set, it absolutely doesn't matter which implementation you pass. If it requires concrete type (unlikely), you can't mix them.
In general, the differences are in performance (HashSet is faster), which shouldn't affect how your program behaves, and in ordering. The order of items in a HashSet is unpredictable. If your program relies on any such order, it should use LinkedHashSet or TreeSet instead.
HashSet and TreeSet are both Sets. They are mostly interchangeable, but keep in mind that TreeSet is also a SortedSet, and thus its elements must implement Comparable unless you supply a Comparator.
If you want your set to be ordered, you should use TreeSet. If you use a HashSet instead, you'll get unpredictable results for operations that rely on the ordering.
On the other hand, a HashSet is faster than a TreeSet (expected constant-time versus logarithmic-time add, remove and contains) if order is not something you worry about.
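A sketch of where the swap can go wrong, assuming a hypothetical element type Point that deliberately does not implement Comparable: the same code works with a HashSet, fails at runtime with a plain TreeSet, and works again once the TreeSet is given a Comparator.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

class SwapSetsDemo {
    // Hypothetical element type that deliberately does NOT implement Comparable.
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
        @Override public boolean equals(Object o) {
            return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
        }
        @Override public int hashCode() { return Objects.hash(x, y); }
        @Override public String toString() { return "(" + x + "," + y + ")"; }
    }

    // Fills any Set implementation with the same points, including one duplicate.
    static void fill(Set<Point> set) {
        set.add(new Point(2, 0));
        set.add(new Point(1, 5));
        set.add(new Point(2, 0)); // duplicate
    }

    // HashSet only needs equals/hashCode, so this works and keeps two points.
    static int hashSetSize() {
        Set<Point> set = new HashSet<>();
        fill(set);
        return set.size();
    }

    // A TreeSet without a Comparator cannot compare non-Comparable elements.
    static boolean plainTreeSetFails() {
        try {
            fill(new TreeSet<>());
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    // With an explicit Comparator the TreeSet works and iterates in sorted order.
    static List<String> sortedWithComparator() {
        Set<Point> set = new TreeSet<>(
                Comparator.comparingInt((Point p) -> p.x).thenComparingInt(p -> p.y));
        fill(set);
        List<String> out = new ArrayList<>();
        for (Point p : set) out.add(p.toString());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(hashSetSize());          // 2
        System.out.println(plainTreeSetFails());    // true
        System.out.println(sortedWithComparator()); // [(1,5), (2,0)]
    }
}
```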

Is there a way to create a List/Set which keeps insertion order and does not allow duplicates in Java?

What is the most efficient way of maintaining a list that does not allow duplicates, but maintains insertion order and also allows the retrieval of the last inserted element in Java?
Try LinkedHashSet, which keeps the order of input.
Note that re-inserting an element does not change its position in the insertion order (add just returns false), so if you want an element to move to the end, you have to remove it first and then add it again.
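A minimal sketch of this behaviour; the helper names last and addOrMoveToEnd are hypothetical and not part of LinkedHashSet. Per the LinkedHashSet documentation, insertion order is not affected when an element is re-inserted, so moving an element to the end takes an explicit remove followed by add. Finding the last inserted element here walks the whole set, which is linear time.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class InsertionOrderDemo {
    // Hypothetical helper: returns the last element in iteration order, or null if empty. O(n).
    static <E> E last(Set<E> set) {
        E result = null;
        for (E e : set) result = e;
        return result;
    }

    // Hypothetical helper: moves e to the end of the insertion order by removing it first.
    static <E> void addOrMoveToEnd(Set<E> set, E e) {
        set.remove(e);
        set.add(e);
    }

    static List<String> demo() {
        Set<String> set = new LinkedHashSet<>();
        set.add("a");
        set.add("b");
        set.add("c");
        set.add("a");             // returns false; order stays a, b, c
        addOrMoveToEnd(set, "b"); // order becomes a, c, b
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> order = demo();
        System.out.println(order); // [a, c, b]
        System.out.println(last(new LinkedHashSet<>(order))); // b
    }
}
```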
Edit:
You could also try the Apache Commons Collections class ListOrderedSet, which according to the JavaDoc (if I didn't misread anything again :) ) decorates a set in order to keep insertion order and provides a get(index) method.
Thus, it seems you can get what you want by using new ListOrderedSet(new HashSet());
Unfortunately this class doesn't provide a generic parameter, but it might get you started.
Edit 2:
Here's a project that seems to represent commons collections with generics, i.e. it has a ListOrderedSet<E> and thus you could for example call new ListOrderedSet<String>(new HashSet<String>());
I don't think there's anything in the JDK which does this.
However, LinkedHashMap, which is used as the basis for LinkedHashSet, comes close: it maintains a circular doubly-linked list of the entries in the map. It only tracks the head of the list, not the tail, but because the list is circular, header.before is the tail (the most recently inserted element).
You could therefore implement what you need on top of this. LinkedHashMap has not been designed for extension, so this is somewhat awkward. You could copy the code into your own class and add a suitable last() method (be aware of licensing issues here), or you could extend the existing class, and add a method which uses reflection to get at the private header and before fields.
That would get you a Map, rather than a Set. However, HashSet is already a wrapper which makes a Map look like a Set. Again, it is not designed for general extension, but you could write a subclass whose constructor calls the super constructor, then uses more reflection to replace the superclass's value of map with an instance of your new map. From there on, the class should do exactly what you want.
As an aside, the library classes here were all written by Josh Bloch and Neal Gafter. Those guys are two of the giants of Java. And yet the code in there is largely horrible. Never meet your heroes.
Just use a TreeSet.
