How to differentiate between the different Collections in Java?


Please explain how the different collections are used in different scenarios.
By this I mean: how can I tell when to use a List, a Set or a Map?
Please provide some links to examples that give a clear explanation.
Also, I have read that:
if insertion order should be preserved, then we should go for a List;
if insertion order need not be preserved, then we should go for a Set.
What does "insertion order is preserved" mean?

Insertion order
Insertion order means the collection keeps its elements in the order in which you inserted them.
For example, say you have inserted the data {1,2,3,4,5}.
A Set (a HashSet, for instance) may return something like {2,3,1,4,5}, because its iteration order is not guaranteed,
while a List returns {1,2,3,4,5}, i.e. it preserves the order of insertion.
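To make the difference concrete, here is a minimal sketch (the class and method names are made up for illustration). It copies the same input into an ArrayList, a LinkedHashSet and a HashSet and looks at what comes back out. The HashSet line carries no expected output on purpose, since that order is not guaranteed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;

final class InsertionOrderDemo {

    // Copies the input into an ArrayList and returns its iteration order.
    static List<Integer> throughList(List<Integer> input) {
        return new ArrayList<>(input);
    }

    // Copies the input into a LinkedHashSet: duplicates are dropped,
    // but the first-insertion order of the remaining elements is kept.
    static List<Integer> throughLinkedHashSet(List<Integer> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    // Copies the input into a HashSet: duplicates are dropped and the
    // iteration order is whatever the hash table happens to produce.
    static List<Integer> throughHashSet(List<Integer> input) {
        return new ArrayList<>(new HashSet<>(input));
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(40, 10, 30, 10, 20);
        System.out.println("List:          " + throughList(input));          // [40, 10, 30, 10, 20]
        System.out.println("LinkedHashSet: " + throughLinkedHashSet(input)); // [40, 10, 30, 20]
        System.out.println("HashSet:       " + throughHashSet(input));       // order not guaranteed
    }
}
```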
When to use List, Set and Map in Java
1) If you need to access elements frequently by index, then List is the way to go. An implementation such as ArrayList provides fast access if you know the index.
2) If you want to store elements in the order in which they were inserted into the collection, go for List again, as List is an ordered collection and maintains insertion order.
3) If you want a collection of unique elements without any duplicates, choose a Set implementation, e.g. HashSet, LinkedHashSet or TreeSet. All Set implementations follow the general contract (uniqueness) but also add features of their own: TreeSet is a SortedSet, so its elements are kept sorted according to a Comparator or Comparable, while LinkedHashSet maintains insertion order.
4) If you store data in the form of keys and values, then Map is the way to go. You can choose between Hashtable, HashMap and TreeMap based on your needs.
You will find some more useful info at http://java67.blogspot.com/2013/01/difference-between-set-list-and-map-in-java.html
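As a rough illustration of the four points above, here is a small sketch that runs the same word list through a List, a Set and a Map. The class name, method names and sample words are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

final class ChooseCollectionDemo {

    // Set use case: unique words, kept sorted by TreeSet's natural ordering.
    static List<String> sortedUnique(List<String> words) {
        return new ArrayList<>(new TreeSet<>(words));
    }

    // Map use case: word -> number of occurrences.
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum); // store 1, or add 1 to the old count
        }
        return counts;
    }

    public static void main(String[] args) {
        // List use case: duplicates allowed, order kept, access by index.
        List<String> words = Arrays.asList("pear", "apple", "pear", "fig");
        System.out.println(words.get(2));                  // pear
        System.out.println(sortedUnique(words));           // [apple, fig, pear]
        System.out.println(countWords(words).get("pear")); // 2
    }
}
```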

Related

Accessing an element in a TreeSet using an index

Suppose there is a TreeSet of strings (ts) with the elements 1,2,3,4,5,6,7,8,9,10.
Is there any built-in method in TreeSet that lets me access an element by position?
For example, to access 3 can I do ts[2], and to access 8 ts[7] (or something like that)?
I used this method:
Iterator<String> it = ts.iterator();
int i = 0;
while (it.hasNext()) {
    String ele = it.next();
    if (i == 2) {
        System.out.println(ele + "");
    }
    i++;
}
However, when I ran it, it didn't show any output, but if I did i=0 then it showed all the output, i.e. 1,2,3,4,5,6,7,8,9,10.
Secondly, can anyone tell me when it is best to use HashSet, TreeSet and LinkedHashSet?

If you want to access elements in your collection like ts[2], you are better off converting the collection into an array with the built-in toArray() method (or copying it into a List).
Otherwise, using an iterator is the standard way to access the elements of a collection.
As for the second question: HashSet is a plain hash table; LinkedHashSet is a hash table that also remembers the order in which elements were inserted; TreeSet keeps its elements sorted and supports navigation (first, last, next higher or lower element).
For the full details, check the Oracle documentation.
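Both options can be sketched as follows. This is only an illustration: elementAt and oneToTenAsStrings are hypothetical helpers, not part of the JDK. Note that a TreeSet of strings sorts lexicographically, so "10" comes right after "1", which may explain surprising results with the data from the question.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

final class TreeSetIndexDemo {

    // Hypothetical helper: walks the iterator to the zero-based position.
    // This costs O(index), since a TreeSet has no random access.
    static <T> T elementAt(SortedSet<T> set, int index) {
        if (index < 0 || index >= set.size()) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        Iterator<T> it = set.iterator();
        for (int i = 0; i < index; i++) {
            it.next(); // skip the elements before the requested position
        }
        return it.next();
    }

    // Builds the set from the question: the strings "1" to "10".
    static TreeSet<String> oneToTenAsStrings() {
        TreeSet<String> ts = new TreeSet<>();
        for (int i = 1; i <= 10; i++) {
            ts.add(String.valueOf(i));
        }
        return ts;
    }

    public static void main(String[] args) {
        TreeSet<String> ts = oneToTenAsStrings();
        // Strings sort lexicographically: 1, 10, 2, 3, 4, ...
        System.out.println(elementAt(ts, 1)); // 10
        System.out.println(elementAt(ts, 2)); // 2

        // Alternative: copy once into a List, then index as often as needed.
        List<String> snapshot = new ArrayList<>(ts);
        System.out.println(snapshot.get(7)); // 7
    }
}
```

If the elements are really numbers, a TreeSet<Integer> gives the numeric order 1, 2, 3, ... and elementAt(set, 2) returns 3.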

TreeSet is a NavigableSet, which means the items have an order (natural ordering by default, but you can define your own ordering with a Comparator or by implementing Comparable) and you can navigate through the items in this order. However, there is no index mechanism. A TreeSet is backed by a TreeMap, which is a red-black tree. In such a data structure, indexes (element positions, not indexes in the sense of efficient access) are not very meaningful.
HashSet, on the other hand, is backed by a HashMap, which is a classical hash table implementation. In this data structure there is no defined order. You can, however, look up each item in O(1) expected time thanks to the hash function.
LinkedHashSet is a subclass of HashSet. It defines no new methods beyond those of HashSet, so LinkedHashSet does not add capabilities such as natural ordering or indexes. However, it has an auxiliary linked list that keeps track of the order in which elements were inserted. So when you iterate over a LinkedHashSet with .iterator() or a for loop, you get the elements in the order you inserted them.
So basically a HashSet is more appropriate if you access elements individually; as the simplest Set implementation, it is also the one to use in generic cases. If you need to keep the order of insertion, use LinkedHashSet, and if you have to enforce a custom or natural ordering of the items, use TreeSet.

Why doesn't TreeMap.values() reflect the order in which elements were originally added?

I need a data structure that will serve as a lookup map by key and can also be converted into a sorted list. The data that goes in is a very simple code-description pair (e.g. M/Married, D/Divorced etc). The lookup requirement is in order to get the description once the user makes a selection in the UI, whose value is the code. The sorted list requirement is in order to feed the data into UI components (JSF) which take List as input and the values always need to be displayed in the same order (alphabetical order of description).
The first thing that came to mind was a TreeMap. So I retrieve the data from my DB in the order I want it to be shown in the UI and load it into my tree map, keyed by the code so that I can later look up descriptions for further display once the user makes selections. As for getting a sorted list out of that same map, as per this post, I am doing the following:
List<CodeObject> list = new ArrayList<CodeObject>(map.values());
However, the list is not sorted in the same order that they were put into the map. The map is declared as a SortedMap and implemented as a TreeMap:
SortedMap<String, CodeObject> map = new TreeMap<String, CodeObject>();
CodeObject is a simple POJO containing just the code and description with the corresponding getters (the values are set through the constructor), a list of which is fed to UI components, which use the code as the value and the description for display. I used to use just a List and that worked fine with respect to ordering, but a List does not provide an efficient interface for looking up a value by key, and I now have that requirement.
So, my questions are:
If TreeMap is supposed to be a map ordered by item addition, why isn't TreeMap.values() in the same order?
What should I do to fulfill my requirements explained above, i.e. have a data structure that will serve as both a lookup map AND a sorted collection of elements? Will TreeMap do it for me if I use it differently or do I need an altogether different approach?

TreeMap maintains the natural order of its keys. You can even order it (with a bit more manipulation and a custom Comparator) by the natural or reverse order of the values. But this is not the same as insertion order. To maintain insertion order you need to use LinkedHashMap. LinkedHashMap is a subclass of HashMap; the analogy is LinkedList, where each node keeps track of the next node. HashMap itself, however, says it cannot guarantee that any order is maintained, so don't rely on it even if you occasionally see insertion order preserved with a HashMap.

TreeMap's documentation says:
The map is sorted according to the natural ordering of its keys, or by a Comparator provided at map creation time, depending on which constructor is used.
So unless you're providing a Comparator and tracking the insertion order and using it in that Comparator, you'll get the natural order of the keys, not the order in which the keys were inserted.
If you want insertion order, as davide said, you can use LinkedHashMap:
Hash table and linked list implementation of the Map interface, with predictable iteration order...This linked list defines the iteration ordering, which is normally the order in which keys were inserted into the map (insertion-order). Note that insertion order is not affected if a key is re-inserted into the map.
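A small sketch of the difference, under some assumptions: plain String descriptions stand in for the CodeObject from the question, the codes and descriptions are made-up sample data, and load and descriptions are illustrative helpers. The entries are inserted in alphabetical order of description, as if they came from the database already sorted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CodeLookupDemo {

    // Fills the given map in "database order" (alphabetical by description).
    // The codes and descriptions below are invented sample data.
    static Map<String, String> load(Map<String, String> target) {
        target.put("D", "Divorced");
        target.put("P", "Domestic partnership");
        target.put("M", "Married");
        target.put("S", "Single");
        return target;
    }

    // Same conversion as in the question: values() copied into a List.
    static List<String> descriptions(Map<String, String> map) {
        return new ArrayList<>(map.values());
    }

    public static void main(String[] args) {
        Map<String, String> linked = load(new LinkedHashMap<String, String>());
        Map<String, String> tree = load(new TreeMap<String, String>());

        // LinkedHashMap iterates in insertion order.
        System.out.println(descriptions(linked)); // [Divorced, Domestic partnership, Married, Single]
        // TreeMap iterates in key order (D, M, P, S), not insertion order.
        System.out.println(descriptions(tree));   // [Divorced, Married, Domestic partnership, Single]
        // Both still work as lookup maps.
        System.out.println(linked.get("M"));      // Married
    }
}
```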

What you need is LinkedHashMap
See another question as well.

copy elements of arraylist in a set with the same order java

I've sorted an ArrayList of ints in ascending order, but when I copy it into a set, the elements are not sorted anymore.
I'm using this:
HashSet<Integer> set = new HashSet<Integer>(sortedArray);
Why is that?

LinkedHashSet will keep the order. TreeSet will sort based either on an external Comparator or natural ordering through Comparable.
A general point of a Set is that order is irrelevant. Hashing is intended to put the elements in as random an order as possible. LinkedHashSet maintains a linked-list between references to the elements, so can maintain an order.
BitSet (which is not a Set) may, or may not, provide a more efficient data structure.
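For the case in the question, the two alternatives might look like this (a sketch; the class and method names are made up). LinkedHashSet keeps whatever order the list already has, while TreeSet re-sorts on its own regardless of the input order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class SortedCopyDemo {

    // Keeps the order the list already has (duplicates are dropped).
    static Set<Integer> keepListOrder(List<Integer> sortedList) {
        return new LinkedHashSet<>(sortedList);
    }

    // Sorts by natural ordering, whatever order the input is in.
    static Set<Integer> alwaysSorted(Collection<Integer> values) {
        return new TreeSet<>(values);
    }

    public static void main(String[] args) {
        List<Integer> sortedArray = new ArrayList<>(Arrays.asList(64, 5, 128, 17, 33));
        Collections.sort(sortedArray); // [5, 17, 33, 64, 128]

        System.out.println(keepListOrder(sortedArray));                // [5, 17, 33, 64, 128]
        System.out.println(alwaysSorted(Arrays.asList(64, 5, 128)));   // [5, 64, 128]
    }
}
```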

HashSets don't sort or maintain order, and the API will tell you this:
it does not guarantee that the order will remain constant over time.
Consider using another type of Set such as a TreeSet.

If you just care about uniqueness, use the HashSet. If you're after sorting, then consider the TreeSet.

You need to use a TreeSet and either supply a Comparator or implement the Comparable interface for your data. You can read about object ordering here.
A HashSet is designed for quick access to unique data, not for maintaining a particular order.

does java sortedhashset type collection exist?

Does such a thing exist anywhere? Basically, I see Java has LinkedHashSet, but no kind of navigable hash set?

By its very nature, a hash-based data structure is not ordered. You can write wrappers which supplement it with an additional data structure (this is more or less what LinkedHashMap does). But while it makes some sense to keep a hash set and a list together, to maintain a sorted order you would need a tree or a similar data structure. The tree can work as a set by itself, so you would essentially be duplicating the information (more so than with a set plus a list, which differ from each other more than two set implementations do). So the best solution is to just use TreeSet or another SortedSet if you need order.

It's not a HashSet, but as a descendant of Set you have the TreeSet
This class implements the Set interface, backed by a TreeMap instance. This class guarantees that the sorted set will be in ascending element order
You can traverse the elements using the iterator
public Iterator iterator()
Returns an iterator over the elements in this set. The elements are returned in ascending order

You can use a TreeSet, but all its basic operations (add, remove, contains) are O(log n).
You can use a LinkedHashSet, which keeps a linked list on top of a HashSet, but it only maintains insertion order (the first element inserted is the first one returned by the iterator); you cannot have natural or custom ordering.
You could also combine a TreeSet with a HashSet, but then two references are kept for each element, and while add and remove would still be O(log n), contains would become expected O(1).
choose wisely :)
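For reference, this is roughly what the "navigable" part of TreeSet buys you. The sketch below uses only standard NavigableSet methods; the class name and sample values are made up.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

final class NavigableSetDemo {

    // Sample data, inserted out of order on purpose.
    static NavigableSet<Integer> sample() {
        return new TreeSet<>(Arrays.asList(40, 10, 30, 20));
    }

    public static void main(String[] args) {
        NavigableSet<Integer> set = sample();
        System.out.println(set);                 // [10, 20, 30, 40]
        System.out.println(set.first());         // 10
        System.out.println(set.ceiling(25));     // 30 (smallest element >= 25)
        System.out.println(set.floor(25));       // 20 (largest element <= 25)
        System.out.println(set.headSet(30));     // [10, 20] (elements strictly below 30)
        System.out.println(set.descendingSet()); // [40, 30, 20, 10]
    }
}
```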

I guess there's TreeMap which is...related but definitely not the same :)

What is the difference between Lists, ArrayLists, Maps, Hashmaps, Collections etc..?

I've been using HashMaps since I started programming again in Java without really understanding these Collections things.
Honestly I am not really sure if using HashMaps all the way would be best for me or for production code. Up until now it didn't matter to me as long as I was able to get the data I need the way I called them in PHP (yes, I admit whatever negative thing you are thinking right now) where $this_is_array['this_is_a_string_index'] provides so much convenience to recall an array of variables.
So now, I have been working with Java for more than 3 months and came across the interfaces I specified above and wondered: why are there so many of these things (not to mention Vector, AbstractList {oh well, the list goes on...})?
I mean how are they different from each other?
And more importantly, what is the best Interface to use in my case?

The API is pretty clear about the differences and/or relations between them:
Collection
The root interface in the collection hierarchy. A collection represents a group of objects, known as its elements. Some collections allow duplicate elements and others do not. Some are ordered and others unordered.
http://download.oracle.com/javase/6/docs/api/java/util/Collection.html
List
An ordered collection (also known as a sequence). The user of this interface has precise control over where in the list each element is inserted. The user can access elements by their integer index (position in the list), and search for elements in the list.
http://download.oracle.com/javase/6/docs/api/java/util/List.html
Set
A collection that contains no duplicate elements. More formally, sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element. As implied by its name, this interface models the mathematical set abstraction.
http://download.oracle.com/javase/6/docs/api/java/util/Set.html
Map
An object that maps keys to values. A map cannot contain duplicate keys; each key can map to at most one value.
http://download.oracle.com/javase/6/docs/api/java/util/Map.html
Is there anything in particular you find confusing about the above? If so, please edit your original question. Thanks.

A short summary of common java collections:
'Map': A 'Map' is a container that stores key=>value pairs. This enables fast searches that use the key to get to its associated value. The two most commonly used implementations in the java.util package are 'HashMap' and 'TreeMap'. The former is implemented as a hash table, while the latter is implemented as a balanced binary search tree (and thus also keeps its keys sorted).
'Set': A 'Set' is a container that holds only unique elements. Inserting the same value multiple times will still result in the 'Set' only holding one instance of it. It also provides fast operations to search, remove, add, merge and compute the intersection of two sets. Like 'Map', it has two commonly used implementations, 'HashSet' and 'TreeSet'.
'List': The 'List' interface is implemented by the 'Vector', 'ArrayList' and 'LinkedList' classes. A 'List' is basically a collection of elements that preserve their relative order. You can add/remove elements to it and access individual elements at any given position. Unlike a 'Map', 'List' items are indexed by an int that is their position in the 'List' (the first element being at position 0 and the last at 'List.size()'-1). 'Vector' and 'ArrayList' are implemented using an array while 'LinkedList', as the name implies, uses a linked list. One thing to note is that, unlike PHP's associative arrays (which are more like a Map), an array in Java and many other languages actually represents a contiguous block of memory. The elements in an array are basically laid out side by side in adjacent "slots", so to speak. This gives very fast lookup and write times, much faster than associative arrays, which are implemented using more complex data structures. But arrays can't be indexed by anything other than the numeric positions within the array, unlike associative arrays.
To get a really good idea of what each collection is good for and their performance characteristics I would recommend getting a good idea about data structures like arrays, linked lists, binary search trees, hashtables, as well as stacks and queues. There is really no substitute to learning this if you want to be an effective programmer in any language.
You can also read the Java Collections trail to get you started.
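For someone coming from PHP, the closest thing to an associative array is a Map with String keys. A minimal sketch follows; the keys and values are made up, and settings() is just an illustrative helper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class AssociativeLookupDemo {

    // Rough Java counterpart of a PHP array with string indexes.
    static Map<String, String> settings() {
        Map<String, String> map = new HashMap<>();
        map.put("this_is_a_string_index", "some value");
        map.put("host", "localhost");
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = settings();
        System.out.println(map.get("this_is_a_string_index"));  // some value
        System.out.println(map.get("missing"));                  // null (no such key)
        System.out.println(map.getOrDefault("missing", "n/a"));  // n/a

        // A List, by contrast, is indexed by position only.
        List<String> names = new ArrayList<>(Arrays.asList("first", "second"));
        System.out.println(names.get(0));                         // first
    }
}
```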

In Brief (and only looking at interfaces):
List - a list of values, something like a "resizable array"
Set - a container that does not allow duplicates
Map - a collection of key/value pairs

A Map vs a List.
In a Map, you have key/value pairs. To access a value you need to know the key. The relationship between the key and the value persists and is not arbitrary; they are related somehow. Example: a person's DNA (the key) and the person's name (the value), or a person's SSN (the key) and the person's name (the value). There is a strong relationship.
In a List, all you have are values (a person's name), and to access one you need to know its position in the list (its index). But there is no permanent relationship between a value and its index in the list; the index is arbitrary.
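The point about the index being arbitrary can be seen in a short sketch: removing one element shifts the indexes of everything after it, while the keys of a Map stay attached to their values. The names and the "id-N" keys are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class IndexVsKeyDemo {

    // Returns what sits at the given index after the head of the list is removed.
    static String atIndexAfterRemovingHead(List<String> names, int index) {
        List<String> copy = new ArrayList<>(names); // leave the caller's list alone
        copy.remove(0);                             // every later element shifts down by one
        return copy.get(index);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ann", "Bob", "Cid");
        System.out.println(names.get(1));                       // Bob
        System.out.println(atIndexAfterRemovingHead(names, 1)); // Cid (index 1 now means someone else)

        Map<String, String> byId = new HashMap<>();
        byId.put("id-1", "Ann");
        byId.put("id-2", "Bob");
        byId.put("id-3", "Cid");
        byId.remove("id-1");
        System.out.println(byId.get("id-2"));                   // Bob (the key still means Bob)
    }
}
```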

■ List — An ordered collection of elements that allows duplicate entries
Concrete Classes:
ArrayList — Standard resizable list.
LinkedList — Can easily add/remove from beginning or end.
Vector — Older thread-safe version of ArrayList.
Stack — Older last-in, first-out class.
■ Set — Does not allow duplicates
Concrete Classes:
HashSet—Uses hashcode() to find unordered elements.
TreeSet—Sorted and navigable. Does not allow null values.
■ Queue — Orders elements for processing
Concrete Classes:
LinkedList — Can easily add/remove from beginning or end.
ArrayDeque—First-in, first-out or last-in, first-out. Does not allow null values.
■ Map — Maps unique keys to values
Concrete Classes:
HashMap — Uses hashcode() to find keys.
TreeMap — Sorted map. Does not allow null keys.
Hashtable — Older version of hashmap. Does not allow null keys or values.
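As an illustration of the Queue entry in the list above, here is a sketch of ArrayDeque used both first-in, first-out and last-in, first-out. The class and method names are made up; offer/poll and push/pop are the standard Deque methods.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class DequeDemo {

    // Queue behaviour: elements leave in the order they arrived.
    static List<Integer> drainFifo(List<Integer> input) {
        Deque<Integer> queue = new ArrayDeque<>();
        for (Integer value : input) {
            queue.offer(value);      // add at the tail
        }
        List<Integer> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            out.add(queue.poll());   // remove from the head
        }
        return out;
    }

    // Stack behaviour: the most recently added element leaves first.
    static List<Integer> drainLifo(List<Integer> input) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Integer value : input) {
            stack.push(value);       // add at the head
        }
        List<Integer> out = new ArrayList<>();
        while (!stack.isEmpty()) {
            out.add(stack.pop());    // remove from the head
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3);
        System.out.println(drainFifo(input)); // [1, 2, 3]
        System.out.println(drainLifo(input)); // [3, 2, 1]
    }
}
```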

That is a question that ultimately has a very complex answer--there are entire college classes dedicated to data structures. The short answer is that they all have trade-offs in memory usage and the speed of various operations.
What would be really healthy is some time with a nice book on data structures--I can almost guarantee that your code will improve significantly if you get a nice understanding of data structures.
That said, I can give you some quick, temporary advice from my experience with Java. For most simple internal things, ArrayList is generally preferred. For passing collections of data about, simple arrays are generally used. HashMap is only really used for cases when there is some logical reason to have special keys corresponding to values--I haven't seen anyone use them as a general data structure for everything. Other structures are more complicated and tend to be used in special cases.

As you already know, they are containers for objects. Reading their respective APIs will help you understand their differences.
Since others have described what are their differences about their usage, I will point you to this link which describes complexity of various data structures.
This list is programming language agnostic, and, as always, real world implementations will vary.
It is useful to understand complexity of various operations for each of these structures, since in the real world, it will matter if you're constantly searching for an object in your 1,000,000 element linked list that's not sorted. Performance will not be optimal.

List Vs Set Vs Map
1) Duplicates: List allows duplicate elements. Any number of duplicate elements can be inserted into the list without affecting the existing equal values or their indexes.
Set doesn't allow duplicates. Set and all of the classes which implement the Set interface hold only unique elements.
Map stores its elements as key & value pairs. Map doesn't allow duplicate keys, while it does allow duplicate values.
2) Null values: List allows any number of null values.
Set allows at most one null value.
Map can have at most one null key and any number of null values.
3) Order: List and all of its implementation classes maintain insertion order.
Set doesn't maintain any order in general; still, a few of its classes keep the elements in a defined order, e.g. LinkedHashSet maintains the elements in insertion order.
Like Set, Map doesn't store its elements in any order in general; however, a few of its classes do. For example, TreeMap sorts the map in ascending order of its keys, and LinkedHashMap keeps the elements in insertion order, i.e. the order in which the elements were added to the LinkedHashMap.
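The duplicate and null rules can be checked with a short sketch. acceptsNull is a hypothetical helper written for this example. The sketch also shows that the rules describe the interfaces in general: a TreeSet with natural ordering rejects null altogether.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class NullAndDuplicateDemo {

    // Hypothetical helper: tries to add null and reports whether it was accepted.
    static boolean acceptsNull(Collection<String> collection) {
        try {
            collection.add(null);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("a");   // duplicate is kept
        list.add(null);
        list.add(null);  // second null is kept too
        System.out.println(list.size());  // 4

        Set<String> set = new HashSet<>(list);
        System.out.println(set.size());   // 2 ("a" and a single null)

        Map<String, String> map = new HashMap<>();
        map.put(null, "x");   // one null key
        map.put(null, "y");   // same key again: the value is replaced
        map.put("k1", null);  // null values are fine
        map.put("k2", null);
        System.out.println(map.size());   // 3

        System.out.println(acceptsNull(new HashSet<String>())); // true
        System.out.println(acceptsNull(new TreeSet<String>())); // false (natural ordering cannot compare null)
    }
}
```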


Difference between Set, List and Map in Java
Set, List and Map are three important interfaces of the Java collection framework, and the difference between Set, List and Map in Java is one of the most frequently asked Java Collection interview questions. Sometimes this question is asked as "When to use List, Set and Map in Java". Clearly, the interviewer wants to know whether you are familiar with the fundamentals of the Java collection framework. In order to decide when to use List, Set or Map, you need to know what these interfaces are and what functionality they provide. List in Java provides an ordered and indexed collection which may contain duplicates. Set provides an unordered collection of unique objects, i.e. Set doesn't allow duplicates, while Map provides a data structure based on key-value pairs (using hashing in implementations such as HashMap). All three, List, Set and Map, are interfaces in Java, and many concrete implementations of them are available in the Collection API. ArrayList and LinkedList are the two most popular List implementations, while LinkedHashSet, TreeSet and HashSet are frequently used Set implementations. In this Java article we will see the difference between Map, Set and List in Java and learn when to use List, Set or Map.
Set vs List vs Map in Java
As I said, Set, List and Map are interfaces which define a core contract, e.g. the Set contract says that it cannot contain duplicates. Based upon our knowledge of List, Set and Map, let's compare them on different metrics.
Duplicate Objects
The main difference between the List and Set interfaces in Java is that List allows duplicates while Set doesn't. All implementations of Set honor this contract. Map holds two objects per Entry, a key and a value; it may contain duplicate values, but keys are always unique. See here for more differences between the List and Set data structures in Java.
Order
Another key difference between List and Set is that List is an ordered collection: List's contract maintains the insertion order of elements. Set is an unordered collection; you get no guarantee about the order in which elements will be stored, though some Set implementations, e.g. LinkedHashSet, maintain insertion order. Also, SortedSet and SortedMap implementations, e.g. TreeSet and TreeMap, maintain a sorting order imposed by a Comparator or Comparable.
Null elements
List allows null elements, and you can have many null objects in a List because it also allows duplicates. Set allows just one null element, as no duplicates are permitted, while in a Map you can have null values and at most one null key. It is worth noting that Hashtable doesn't allow null keys or values, but HashMap allows null values and one null key. This is also one of the main differences between these two popular implementations of the Map interface, aka HashMap vs Hashtable.
Popular implementation
The most popular implementations of the List interface in Java are the ArrayList, LinkedList and Vector classes. ArrayList is more general purpose and provides random access by index, while LinkedList is more suitable for frequently adding and removing elements from a List. Vector is the synchronized counterpart of ArrayList. On the other hand, the most popular implementations of the Set interface are HashSet, LinkedHashSet and TreeSet. The first is a general purpose Set which is backed by a HashMap; see how HashSet works internally in Java for more details. It doesn't provide any ordering guarantee, but LinkedHashSet does provide ordering along with the uniqueness offered by the Set interface. The third implementation, TreeSet, is also an implementation of the SortedSet interface, hence it keeps elements in a sorted order specified by the compare() or compareTo() method. Finally, the most popular implementations of the Map interface are HashMap, LinkedHashMap, Hashtable and TreeMap. The first is the non-synchronized general purpose Map implementation, while Hashtable is its synchronized counterpart; neither provides any ordering guarantee, which is what LinkedHashMap adds. Just like TreeSet, TreeMap is also a sorted data structure and keeps its keys in sorted order.
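To show what "sorted order specified by compare() or compareTo()" means in practice, here is a sketch of a TreeSet built with a custom Comparator. The ordering rule (shorter strings first, ties broken alphabetically) and all names are invented for the example. Keep in mind that a TreeSet treats two elements as duplicates whenever the comparator returns 0 for them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

final class ComparatorOrderDemo {

    // Natural ordering: String.compareTo(), i.e. alphabetical.
    static List<String> natural(List<String> words) {
        return new ArrayList<>(new TreeSet<>(words));
    }

    // Custom ordering: shorter strings first, ties broken alphabetically.
    static List<String> byLengthThenAlpha(List<String> words) {
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        Comparator<String> order = byLength.thenComparing(Comparator.<String>naturalOrder());
        TreeSet<String> set = new TreeSet<>(order);
        set.addAll(words);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "apple", "kiwi");
        System.out.println(natural(words));           // [apple, fig, kiwi, pear]
        System.out.println(byLengthThenAlpha(words)); // [fig, kiwi, pear, apple]
    }
}
```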
