LinkedHashSet implementations of HashTable and LinkedList? - java

I came across the following in the Java docs for LinkedHashSet:
Hash table and linked list implementation of the Set interface, with predictable iteration order.
But when I look at the source of LinkedHashSet, I cannot find any implements/extends clause related to Hashtable or LinkedList. So how does it inherit the features of both of these data structures?

It does not inherit from those classes, or use them in any way.
But you could write your own linked list class, and it would still be a linked list, even if it had no relationship to java.util.LinkedList. That's how LinkedHashSet works: it does not use java.util.Hashtable, nor java.util.LinkedList, but it has an implementation of the data structures nonetheless.
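To make the point concrete, here is a minimal sketch of a set that combines a hash lookup with a hand-written doubly-linked list. TinyLinkedHashSet and its methods are hypothetical names invented for this illustration; it is not how the JDK class is written. For brevity the hash table half borrows java.util.HashMap, while the linked list half has no relationship to java.util.LinkedList.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy class for illustration only; not the JDK implementation.
class TinyLinkedHashSet<E> {
    // Node of a hand-written doubly-linked list that records insertion order.
    private static final class Node<E> {
        final E value;
        Node<E> prev, next;
        Node(E value) { this.value = value; }
    }

    // Hash table part: maps each element to its list node for fast lookup.
    private final Map<E, Node<E>> table = new HashMap<>();
    private Node<E> head, tail;

    boolean add(E e) {
        if (table.containsKey(e)) return false; // duplicate: ignored, order unchanged
        Node<E> n = new Node<>(e);
        if (tail == null) {
            head = tail = n;
        } else {
            tail.next = n;
            n.prev = tail;
            tail = n;
        }
        table.put(e, n);
        return true;
    }

    boolean remove(E e) {
        Node<E> n = table.remove(e);
        if (n == null) return false;
        // Unlink the node from the list without walking the list.
        if (n.prev == null) head = n.next; else n.prev.next = n.next;
        if (n.next == null) tail = n.prev; else n.next.prev = n.prev;
        return true;
    }

    boolean contains(E e) { return table.containsKey(e); }

    // Walks the linked list, so elements come back in insertion order.
    List<E> toList() {
        List<E> out = new ArrayList<>();
        for (Node<E> n = head; n != null; n = n.next) out.add(n.value);
        return out;
    }
}
```

The class is "a hash table and linked list implementation" of a set in the same sense as the documentation quote: the data structures are there, even though no such class appears in an extends or implements clause.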

Related

Why doesn't LinkedHashSet have addFirst method?

As the documentation of LinkedHashSet states, it is
Hash table and linked list implementation of the Set interface, with
predictable iteration order. This implementation differs from HashSet
in that it maintains a doubly-linked list running through all of its
entries.
So it's essentially a HashSet with a FIFO queue of keys implemented by a linked list. Considering that LinkedList is a Deque and permits, in particular, insertion at the beginning, I wonder why LinkedHashSet doesn't have an addFirst(E e) method in addition to the methods present in the Set interface. It doesn't seem hard to implement.
As Eliott Frisch said, the answer is in the next sentence of the paragraph you quoted:
… This linked list defines the iteration ordering, which is the order
in which elements were inserted into the set (insertion-order). …
An addFirst method would break the insertion order and thereby the design idea of LinkedHashSet.
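A small sketch of what "insertion-order" means in practice, using only java.util.LinkedHashSet. The class name InsertionOrderDemo is made up for this example.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical demo class; only LinkedHashSet itself comes from the JDK.
class InsertionOrderDemo {
    static Set<String> build() {
        Set<String> set = new LinkedHashSet<>();
        set.add("c");
        set.add("a");
        set.add("b");
        set.add("c"); // re-inserting an existing element returns false and keeps its position
        return set;
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints [c, a, b]
    }
}
```

Iteration follows the order in which elements were first added, and nothing in the Set interface lets a caller place an element anywhere else.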
If I may add a bit of guesswork too, other possible reasons might include:
It’s not as simple to implement as it appears, since a LinkedHashSet is really implemented as a LinkedHashMap whose mapped values are not used. At the very least you would have to change that class too (which in turn would also break its insertion order and thereby its design idea).
As that other guy may have intended in a comment, they didn’t find it useful.
That said, you are asking the question the wrong way around. They designed a class with a functionality for which they saw a need. They moved on to implement it using a hash table and a linked list. You are starting out from the implementation and using it as a basis for a design discussion. While that may occasionally add something useful, generally it’s not the way to good designs.
While I can in theory follow your point that there might be a situation where you want a double-ended queue with set property (duplicates are ignored/eliminated), I have a hard time imagining when a Deque would not fulfil your needs in this case (Eliott Frisch mentioned the under-used ArrayDeque). You need pretty large amounts of data and/or pretty strict performance requirements before the linear complexity of contains and remove would be prohibitive. And in that case you may already be better off custom designing your own data structure.
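As a rough sketch of the Deque approach described above, the following helper gives an ArrayDeque the "set property" by checking for the element before adding it at the front. UniqueDeque and addFirstIfAbsent are hypothetical names; the contains call is the linear scan whose cost the paragraph mentions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper; only Deque and ArrayDeque come from the JDK.
class UniqueDeque {
    // Adds e at the front unless it is already present.
    // contains() scans the whole deque, so this is O(n) per call.
    static <E> boolean addFirstIfAbsent(Deque<E> deque, E e) {
        if (deque.contains(e)) return false;
        deque.addFirst(e);
        return true;
    }

    public static void main(String[] args) {
        Deque<String> d = new ArrayDeque<>();
        addFirstIfAbsent(d, "a");
        addFirstIfAbsent(d, "b");
        addFirstIfAbsent(d, "a"); // already present, ignored
        System.out.println(d); // prints [b, a]
    }
}
```

For small or moderately sized collections the linear scan is usually unnoticeable, which is the trade-off the answer is pointing at.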

Where can I find documentation for Vector and Hashtable in Java?

I know this question can be a little stupid but I just want to clear the doubt.
While going through the Java tutorial on Collections ( http://docs.oracle.com/javase/tutorial/collections/index.html ), I didn't find any relevant information about either Vector or Hashtable. Both belong to the Collections Framework, since Vector is an implementation of List and Hashtable is an implementation of Map. If so, why are they not in the Sun tutorial? Where can I find a Sun tutorial on Collections that documents both Vector and Hashtable well and explains in depth how elements are stored in List, Set and Map?
Because Vector and Hashtable are old, legacy collection classes. Don't use them.
Instead of Vector use ArrayList; instead of Hashtable use HashMap.
When Java 1.2 was released (very long ago), new collection classes were added to Java (the Collections Framework). Sun did not remove the old classes such as Vector and Hashtable because they wanted the new Java version to be backwards compatible. And now we still have those old classes.
One difference to be aware of is that Vector and Hashtable are synchronized, while ArrayList and HashMap are not. Most of the time you don't need synchronization; if you do, then you must take care to synchronize your ArrayList, and if you need a map, use ConcurrentHashMap instead of plain HashMap.
In general, Vector and Hashtable could be considered deprecated.
If you look at the online javadoc for Vector and Hashtable you'll see that they were the original implementation of ArrayList and HashMap, until the Collections framework came along, at which point they were retrofitted to implement interfaces from the Collections framework; this way, old classes that depended on those classes being there would not break. The only difference between them and their more common brethren is that they are synchronized.
In the vast majority of cases, synchronization isn't called for, so programmers will avoid the synchronization overhead and opt for regular ArrayLists and HashMaps. If a synchronized collection is desired there's always Collections.synchronized____() (or ConcurrentHashMap) that would work just fine too.
You probably don't need a tutorial for Vector and Hashtable because their behavior is already so similar to classes you're likely to be familiar with, and because they aren't used much any more. As for more info on List, Set, and Map, the online javadoc is a good place to start.
Here are some useful links:
http://javarevisited.blogspot.com/2010/10/difference-between-hashmap-and.html
http://javarevisited.blogspot.com/2011/09/difference-vector-vs-arraylist-in-java.html
As mentioned by the JavaDoc of Vector:
As of the Java 2 platform v1.2, this class was retrofitted to implement the List interface, making it a member of the Java Collections Framework. Unlike the new collection implementations, Vector is synchronized. If a thread-safe implementation is not needed, it is recommended to use ArrayList in place of Vector.
Vector is a kind of legacy implementation of the List interface. The collections framework as a whole is not thread-safe by default. If you need thread safety, you can wrap any non-thread-safe implementation using the appropriate Collections.synchronizedXXX() method, where XXX is List, Map or Set, for example. The same applies to Hashtable, which is also synchronized by default. You should use HashMap instead, wrapped with Collections.synchronizedMap() if you need synchronization.
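The replacements suggested in the answers above can be sketched as follows: an ArrayList wrapped with Collections.synchronizedList in place of Vector, and a ConcurrentHashMap in place of Hashtable. SyncCollectionsDemo, run and the "hits" key are made-up names for this example, and the two-thread workload is only an assumption chosen to exercise concurrent writes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; the collection calls are standard JDK APIs.
class SyncCollectionsDemo {
    // Returns { list size, counter value } after two threads write concurrently.
    static int[] run() {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        Map<String, Integer> counts = new ConcurrentHashMap<>();

        Runnable work = () -> {
            for (int i = 0; i < 1000; i++) {
                list.add(i);                           // each add is synchronized by the wrapper
                counts.merge("hits", 1, Integer::sum); // atomic read-modify-write
            }
        };

        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { list.size(), counts.get("hits") };
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println(result[0] + " " + result[1]); // prints 2000 2000
    }
}
```

Note that the synchronized wrapper only protects individual method calls; iterating over the wrapped list still requires synchronizing on it manually, as the Collections javadoc describes.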

Is there a good Java library for generation of order preserving O(1) hash codes, based on a set of attributes and a comparator?

Given a set of attributes and a comparator, I'd like to generate an order-preserving hash code that provides O(1) access. Is there a Java library for this sort of thing, or would I have to design the hash function myself?
Try:
java.util.LinkedHashMap()
There is no single collection that will do this. Depending on the detailed requirements, there are several options to choose from.
For simplicity, I would just use a HashMap for lookups and when I need the sorted data, I'd make a copy of the values and sort it:
List<Value> sorted = new ArrayList<>(hashMap.values()); // Value stands for your map's value type
Collections.sort(sorted, comparator); // comparator is your Comparator<Value>
This suffices for most real world use cases.
You could also write your own super-container that internally holds the elements in two collections, one HashMap and maybe a TreeSet. You can then easily provide access methods that each use whichever collection is better suited to the purpose of the method. Just make sure that additions and removals affect both of the contained collections.
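A minimal sketch of such a super-container, under some loud assumptions: SortedLookup and its method names are invented for this illustration, values are assumed to be distinct, and the comparator is assumed to be consistent with equals (a TreeSet treats two values that compare as equal as the same element).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical container: HashMap for lookups by key, TreeSet for sorted traversal.
// Assumes distinct values and a comparator consistent with equals.
class SortedLookup<K, V> {
    private final Map<K, V> byKey = new HashMap<>();
    private final TreeSet<V> sorted;

    SortedLookup(Comparator<? super V> order) {
        this.sorted = new TreeSet<>(order);
    }

    // Every mutation touches both collections so they never drift apart.
    void put(K key, V value) {
        V old = byKey.put(key, value);
        if (old != null) sorted.remove(old);
        sorted.add(value);
    }

    V remove(K key) {
        V old = byKey.remove(key);
        if (old != null) sorted.remove(old);
        return old;
    }

    // Expected O(1) lookup through the hash map.
    V get(K key) { return byKey.get(key); }

    // Sorted snapshot through the tree set.
    List<V> inOrder() { return new ArrayList<>(sorted); }
}
```

Lookups stay O(1) on average, while inserts and removals pay the O(log n) cost of the tree, which is the price for always having the sorted view ready.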

what's the difference between HashSet and LinkedHashSet

I saw that LinkedHashSet extends HashSet and I know it preserves order.
However, from checking the code in the JDK, it seems that LinkedHashSet contains only constructors and no implementation, so I guess all the logic happens in HashSet?
If that is correct, why is it designed like that? It seems very confusing.
EDIT: there was an unfortunate mistake in the question. I wrote HashMap and LinkedHashMap instead of HashSet and LinkedHashSet. I have fixed the question; please answer it if possible.
Also, I was interested why Java designers chose to implement it like that.
Yes, LinkedHashMap calls its super constructor. One thing it does is to override the init() method, which is called by the super constructor.
LinkedHashMap is a HashMap with a doubly-linked list implementation added.
As you said, the difference between the two data structures is that LinkedHashMap is a HashMap that preserves the insertion order of its pairs.
So the linked one is intended to be used as a HashMap via the standard HashMap methods, and the only method added is removeEldestEntry(), which is useful if you want to deal with the "list" part of the data structure.
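As an illustration of removeEldestEntry(), here is a sketch of a size-bounded map that evicts its oldest insertion once a limit is exceeded. BoundedCache and maxEntries are hypothetical names; the overridden method is the real protected hook on LinkedHashMap.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical subclass; removeEldestEntry is the real LinkedHashMap hook.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(); // default insertion-order mode
        this.maxEntries = maxEntries;
    }

    // Called by put/putAll after an insertion; returning true drops the
    // eldest entry, i.e. the head of the internal linked list.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<String, Integer> cache = new BoundedCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.put("c", 3); // "a" is the eldest entry and gets evicted
        System.out.println(cache.keySet()); // prints [b, c]
    }
}
```

This only works because the linked list gives the map a well-defined "eldest" entry, which a plain HashMap does not have.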

Data Structures for hashMap, List and Set

Can anyone please guide me to an in-depth look at the data structures used in the List, Set and Map implementations of the java.util collections, and how they are implemented?
In interviews most of the questions are about algorithms, but I have never seen the implementation details anywhere. Can anyone please share that information?
To learn how Java implements collections, the definitive place to go is the source code itself, freely available. Generally, Lists are implemented as either arrays (ArrayList) or linked lists (LinkedList); sets are either hashtables (HashSet) or trees (TreeSet); and maps are hashtables (HashMap).
Algorithms for manipulating arrays, linked lists, hashtables, and binary or n-ary trees (add, remove, search, sort) are complex enough in themselves that an entire course is necessary to cover them all. Anyone doing their own program design typically needs to understand these algorithms and their performance tradeoffs by heart. There's no substitute here for textbook study and/or practice.
The source code of the API is available: get a JDK and open up the src.zip file from the installation folder.
ArrayList: array
LinkedList: doubly linked list (Entry objects)
HashMap: array of Entry objects, each Entry being the head of a singly linked list
HashSet: internally uses HashMap, stores data as Key and dummy Object (of class Object) as Value in the map.
TreeMap: Red-Black tree implementation of Entry objects.
TreeSet: internally uses TreeMap. Key as data and dummy object as value.
*Entry: is an internal class in these collections and generally has Key, Value, references for other Entry objects etc.
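The "key plus dummy value" idea from the list above can be sketched in a few lines. MapBackedSet and PRESENT are names chosen for this illustration, and the sketch is a simplification rather than the JDK source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplified set that stores its elements as keys of a HashMap.
class MapBackedSet<E> {
    // One shared dummy object used as the value for every key.
    private static final Object PRESENT = new Object();

    private final Map<E, Object> map = new HashMap<>();

    // put returns the previous value, so null means the key was new.
    boolean add(E e) { return map.put(e, PRESENT) == null; }

    boolean contains(E e) { return map.containsKey(e); }

    // remove returns the dummy value only if the key was present.
    boolean remove(E e) { return map.remove(e) == PRESENT; }

    int size() { return map.size(); }
}
```

Swapping the HashMap for a TreeMap in the same pattern gives a sorted set, which mirrors the relationship between TreeSet and TreeMap described above.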
You can always open the source files, it's all there; however, I wouldn't recommend it, as they are usually quite hard to understand. Instead, I'd try identifying the underlying data structure and looking it up. Wikipedia contains most of the information you want on these subjects, and Google covers the rest.
A List is usually a dynamic array (ArrayList),
a Set is a... set,
and maps are usually hash tables indexed by the key's hash code, storing key-value pairs.
If you're going to dive into the source code, I'd recommend familiarizing yourself with "how-it-probably-works" first, because otherwise it will be hard to understand, especially the hash table.
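For the "how-it-probably-works" view of a hash table, here is a deliberately simplified sketch of separate chaining: an array of buckets, each holding a singly linked list of entries. TinyHashMap is a hypothetical class with a fixed bucket count of 16 and no resizing, both assumptions made to keep it short; the real java.util.HashMap is considerably more involved.

```java
import java.util.Objects;

// Hypothetical toy hash table with separate chaining; fixed size, no resizing.
class TinyHashMap<K, V> {
    // One link in a bucket's singly linked list.
    private static final class Entry<K, V> {
        final K key;
        V value;
        Entry<K, V> next;
        Entry(K key, V value, Entry<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Entry<K, V>[] buckets = (Entry<K, V>[]) new Entry[16];

    // Maps a key's hash code to a bucket index in [0, buckets.length).
    private int indexFor(Object key) {
        int h = (key == null) ? 0 : key.hashCode();
        return (h & 0x7fffffff) % buckets.length;
    }

    // Returns the previous value for the key, or null if there was none.
    V put(K key, V value) {
        int i = indexFor(key);
        for (Entry<K, V> e = buckets[i]; e != null; e = e.next) {
            if (Objects.equals(e.key, key)) {
                V old = e.value;
                e.value = value; // key already present: overwrite in place
                return old;
            }
        }
        buckets[i] = new Entry<>(key, value, buckets[i]); // prepend to the chain
        return null;
    }

    V get(Object key) {
        for (Entry<K, V> e = buckets[indexFor(key)]; e != null; e = e.next) {
            if (Objects.equals(e.key, key)) return e.value;
        }
        return null;
    }
}
```

Keys whose hash codes land in the same bucket (for example the Integers 1 and 17 here) simply share a chain, and equals decides which entry in the chain matches.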
