I wrote a simple program that takes an array of strings that get converted into a list, then into a Set, which is finally printed.
Here is the code:
public static void main(String[] args) {
String[] array = {"hello", "goodbye", "welcome", "thanks"};
List<String> list = Arrays.asList(array);
System.out.println(list);
Set<String> set = new HashSet<String>(list);
System.out.println(set);
}
The set returns
[hello, goodbye, welcome, thanks]
[hello, thanks, goodbye, welcome]
And no matter what order I put the array in, the Set is always printed in that particular order. So how does a Set determine the order in which its values are stored?
The order of the elements in a Set is determined by the order of the elements in its Iterator and, as specified in Set.iterator():
The elements are returned in no particular order (unless this set is an instance of some class that provides a guarantee).
So there is no inherent order to a Set.
However, Set is only an interface. There are various implementations of Set that do provide a guarantee.
There's a HashSet - which doesn't - i.e. it optimises itself to achieve O(1) add, remove and contains at the expense of a predictable order.
There's a TreeSet - which maintains the natural order of the objects - i.e. "ab" < "ac" and 1 < 10 or any order you define using a Comparator.
There's an EnumSet - which orders by the enum ordinal order - kind of like TreeSet.
There's a LinkedHashSet - which orders in the order the items were added.
There are other more obscure implementations of Set that also have their own character.
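To make the differences above concrete, here is a minimal sketch that feeds the question's four strings into three of these implementations. The class and method names (SetOrderDemo, unordered, insertionOrdered, naturallySorted) are made up for illustration. The HashSet output is deliberately not shown, since it is unspecified.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; the names are not from the original posts.
class SetOrderDemo {
    static final List<String> WORDS = Arrays.asList("hello", "goodbye", "welcome", "thanks");

    // No ordering guarantee at all.
    static Set<String> unordered() {
        return new HashSet<>(WORDS);
    }

    // Iterates in the order the elements were added.
    static Set<String> insertionOrdered() {
        return new LinkedHashSet<>(WORDS);
    }

    // Iterates in natural (String.compareTo) order.
    static Set<String> naturallySorted() {
        return new TreeSet<>(WORDS);
    }

    public static void main(String[] args) {
        System.out.println(unordered());        // order unspecified
        System.out.println(insertionOrdered()); // [hello, goodbye, welcome, thanks]
        System.out.println(naturallySorted());  // [goodbye, hello, thanks, welcome]
    }
}
```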
The iteration order of a HashSet is an implementation detail that may change from release to release. You should assume that the ordering is magic, inscrutable, and subject to change.
(In practice, it's affected by the hash codes of the elements, the smearing function HashSet uses internally, and the order the hash buckets appear generally.)
Related
My question is: why does an iterator work on a Set?
Here is my example code,
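import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class Staticex {
    public static void main(String[] args) {
        // Parameterized types avoid raw-type warnings.
        Set<Integer> set = new HashSet<>();
        set.add(1);
        set.add(2);
        set.add(3);
        set.add(4);
        set.add(5);
        Iterator<Integer> iter = set.iterator();
        // Visits each element exactly once, in an unspecified order.
        while (iter.hasNext()) {
            System.out.println(iter.next());
        }
    }
}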
I understand that a Set is unordered, in contrast to a List.
So how can I get the values one by one through an iterator?
Is the iterator turning the Set into something like a List, which is an ordered data structure?
How can an Iterator be used on a Set?
Like you are using it.
How can I get the values one by one through an iterator?
Your code is doing that.
Is the iterator turning the Set into something like a List, which is an ordered data structure?
No.
The thing that you are missing is what "unordered" means. It means that the order in which the (set's) elements are returned is not predictable [1], and not specified in the javadocs. However, each element will be returned once and (since the elements of a set are unique!) only once for the iteration.
[1] Actually, this is not strictly true. If you have enough information about the element class, the element values, how they were created and how / when they were added to the HashSet, AND you analyze the specific HashSet implementation ... it is possible that you CAN predict what the iteration order is going to be. For example, if you create a HashSet<Integer> and add 1, 2, 3, 4, ... to it, you will see a clear (and repeatable) pattern when you iterate the elements. This is in part due to the way that Integer.hashCode() is specified.
Referring to the documentation, we see that:
Iterator<E> iterator()
Returns an iterator over the elements in this collection. There are no guarantees concerning the order in which the elements are returned (unless this collection is an instance of some class that provides a guarantee).
Since there are no guarantees concerning the order in which the elements are returned for iterator, it is not a problem for iterator to apply to Set, which is unordered.
Further, it is not changing the Set into a List.
Set is unordered in a logical sense. When you have a bag of things, there isn't a sense of order when they are inside the bag. But when you take each thing out of the bag, one at a time, you end up with some order. And like the other answer has mentioned, you cannot rely on that order since it is purely accidental.
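The bag analogy can be checked with a small sketch: draining a HashSet's iterator yields every element exactly once, whatever order they come out in. The class and method names (IterateOnceDemo, drain) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; the names are not from the original posts.
class IterateOnceDemo {
    // Pulls every element out of the set, in whatever order the iterator yields them.
    static <T> List<T> drain(Set<T> set) {
        List<T> seen = new ArrayList<>();
        Iterator<T> it = set.iterator();
        while (it.hasNext()) {
            seen.add(it.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<Integer> set = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5));
        List<Integer> seen = drain(set);
        // Same number of elements, and the same elements: nothing skipped, nothing repeated.
        System.out.println(seen.size() == set.size());       // true
        System.out.println(new HashSet<>(seen).equals(set)); // true
    }
}
```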
I understand that a Set is unordered, in contrast to a List.
This is not necessarily true. SortedSet is a subinterface of Set. As the name implies, instances of this interface are ordered in some fashion. For example, TreeSets are ordered using their natural ordering, or by a Comparator provided at set creation time, depending on which constructor is used. Also, the main distinction between Set and List is that List allows for duplicate objects to be contained, whereas Set does not.
Now, if you are talking specifically about HashSet, then you are correct about being unordered.
I think your confusion comes from asking yourself "why does the printout show the numbers in numeric (insertion) order?" The answer is somewhat involved if you are new to this, but the numbers are printed in that order because you are inserting integers, and an Integer's hash code is simply its numeric value. And although there is no guarantee as to the order in which the elements of a HashSet are returned when iterating, the implementation of HashSet is backed by a hash table, so small integers typically land in buckets in ascending numeric order. In fact, if you change the insertion order of those same values, the numbers will most likely still be printed in the same numeric order. Remember that, despite all that, the order is not guaranteed. It may not hold, for instance, if you change the set elements to String objects.
If I create two lists from the same set, can I be sure that I get the same ordering in both lists? (I do not care about the ordering as long as both lists have the same order, and I am not performing any operations on the set between creating the two lists.)
List l = new ArrayList(set);
List l1 = new ArrayList(set);
I understand that there are guaranteed ways of creating these lists and getting the same order and that there isn't a good reason for me to create two lists this way, but I would like to know why the ordering of elements in a set would change if no modify operations are performed on it.
Edit: The set is an unordered HashSet
You will probably get the same ordering in the lists l and l1. But since most Sets are unordered, you have no guarantee that the order will be the same.
Technically you could write an implementation of the Set interface which changes its order every time any method is called. This would still fulfil the interface.
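As a rough sketch of such a contract-abiding but unhelpful implementation, the hypothetical ShufflingSet below reshuffles its elements on every call to iterator(). It is not a real library class; it only illustrates that the Set interface permits this behaviour.

```java
import java.util.AbstractSet;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical class, for illustration only: a legal Set whose iteration
// order is reshuffled on every call to iterator().
class ShufflingSet<E> extends AbstractSet<E> {
    private final Set<E> backing = new HashSet<>();

    @Override
    public boolean add(E e) {
        return backing.add(e);
    }

    @Override
    public int size() {
        return backing.size();
    }

    @Override
    public Iterator<E> iterator() {
        List<E> copy = new ArrayList<>(backing);
        Collections.shuffle(copy);
        // Read-only view: Iterator.remove() is optional and is not supported here.
        return Collections.unmodifiableList(copy).iterator();
    }

    public static void main(String[] args) {
        Set<Integer> set = new ShufflingSet<>();
        for (int i = 0; i < 10; i++) {
            set.add(i);
        }
        // Both copies hold the same elements but will usually differ in order.
        System.out.println(new ArrayList<>(set));
        System.out.println(new ArrayList<>(set));
    }
}
```

With a set like this, new ArrayList(set) called twice would usually produce two differently ordered lists, even though the set was never modified in between.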
Since in the constructor new ArrayList(Collection) the toArray method of the collection is called, we can have a look at the Javadoc of Set#toArray():
Returns an array containing all of the elements in this set. If this set makes any guarantees as to what order its elements are returned by its iterator, this method must return the elements in the same order.
While the Javadoc of Set#iterator() says there is no general guarantee:
Returns an iterator over the elements in this set. The elements are returned in no particular order (unless this set is an instance of some class that provides a guarantee).
Given this, I would strongly advise you not to rely on the ordering of the lists.
As per documentation
public ArrayList(Collection c) Constructs a list
containing the elements of the specified collection, in the order they
are returned by the collection's iterator
So it really depends on the Set interface implementation class, if the order is constant.
For example, if you use LinkedHashSet the iteration order is predictable.
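A minimal sketch of that case, assuming the set can be created as a LinkedHashSet in the first place (the class and method names are made up for illustration): both copies come out in insertion order, so they are equal as lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; the names are not from the original posts.
class TwoListsDemo {
    // ArrayList(Collection) keeps the order in which the collection hands over its elements.
    static List<String> copyOf(Set<String> set) {
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        Set<String> set = new LinkedHashSet<>(Arrays.asList("c", "a", "b"));
        List<String> l = copyOf(set);
        List<String> l1 = copyOf(set);
        System.out.println(l);            // [c, a, b]
        System.out.println(l.equals(l1)); // true
    }
}
```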
Some data structures guarantee an order and some do not. For the Set interface in general, Java gives no such guarantee. The ArrayList constructor most likely uses the Set's iterator, so both lists will certainly always contain the same elements, but not necessarily in the same order. That is also why Set offers contains, rather than a position-based find, to check whether an element exists.
Its sub-interface, SortedSet, represents a set that is sorted
according to some criterion. In Java 6, there are two standard
containers that implement SortedSet. They are TreeSet and
ConcurrentSkipListSet.
In addition to the SortedSet interface, there is also the
LinkedHashSet class. It remembers the order in which the
elements were inserted into the set, and returns its elements in that
order.
One way to impose a desired (natural, or otherwise) order on an unordered collection like a Set is to create an ordered Set (in other words, a SortedSet) from the given set. If your sets are not too large and all you care for is a predictable iteration order, you can do:
// set = ...
List<? extends Comparable> list = new TreeSet<>(set).stream().collect(Collectors.toList());
This assumes that the set consists of elements that are comparable. Alternatively, you could use your own comparator in the TreeSet constructor. There may be some issues in creating such a comparator however, if the elements themselves are not comparable.
There are some interesting and good answers here. I can propose a solution: copy the set into a list once, then copy that list.
List list = new ArrayList(set);
List secondList = new ArrayList(list);
I have a question. It is said that HashSet in Java doesn't maintain an order, but looking at my program
public static void main(String[] args) {
HashSet<Integer> io=new HashSet<Integer>();
Integer io1=Integer.valueOf(4); // the Integer(int) constructor is deprecated
Integer io2=Integer.valueOf(5);
Integer io3=Integer.valueOf(6);
io.add(io2);
io.add(io3);
io.add(io1);
System.out.println(io);
}
and executing it, I get an ordered set every time I run it. Why is this happening?
Another question: if I use a TreeSet (as in the previous program, but with TreeSet instead of HashSet and my own class instead of Integer), do I have to implement compareTo?
HashSet doesn't maintain order, but it has to iterate over the elements in some order when you print them. HashSet is backed by a HashMap, which iterates over the elements in the order of the bins in which they are stored. In your simple example, 4,5,6 are mapped to bins 4,5,6 (since the hashCode of an integer is the value of the integer) so they are printed in ascending order.
If you tried to add 40,50,60 you would see a different order ([50, 40, 60]), since the default initial number of bins is 16, so the hash codes 40,50,60 will be mapped to bins 40%16 (8), 50%16 (2), 60%16 (12), so 50 is the first element iterated, followed by 40 and 60.
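The bin arithmetic can be reproduced with a short sketch. The bucketIndex helper below is made up for illustration; it mirrors what OpenJDK 8 and later HashMap implementations do (spread the hash code, then mask by capacity minus one), which is an implementation detail and an assumption here, not part of the documented contract. For small non-negative integers the result matches the modulo calculation above.

```java
// Hypothetical demo class; bucketIndex is an illustration, not a JDK API.
class BucketDemo {
    // Assumption: mirrors OpenJDK 8+ HashMap, which XORs the high bits of the
    // hash code into the low bits and then masks by (capacity - 1).
    // The capacity must be a power of two for the mask to work.
    static int bucketIndex(Object key, int capacity) {
        int h = key.hashCode();
        return (h ^ (h >>> 16)) & (capacity - 1);
    }

    public static void main(String[] args) {
        for (int n : new int[] {4, 5, 6, 40, 50, 60}) {
            System.out.println(n + " -> bin " + bucketIndex(n, 16));
        }
        // 4 -> bin 4, 5 -> bin 5, 6 -> bin 6, 40 -> bin 8, 50 -> bin 2, 60 -> bin 12
    }
}
```

Iterating the bins from 0 upward then visits 50 (bin 2) before 40 (bin 8) and 60 (bin 12).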
As for TreeSet<SomeCustomClass>, you can either implement Comparable<SomeCustomClass> in SomeCustomClass, or pass a Comparator<SomeCustomClass> to the constructor.
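Both options might look roughly like this. Version is a hypothetical element class invented for the example: it implements Comparable for its natural order, and a separate Comparator sorts the same elements by a different key.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; Version is an invented element type.
class TreeSetCustomDemo {
    static final class Version implements Comparable<Version> {
        final int major;
        final int minor;

        Version(int major, int minor) {
            this.major = major;
            this.minor = minor;
        }

        // Natural order: by major number, then by minor number.
        @Override
        public int compareTo(Version other) {
            int c = Integer.compare(major, other.major);
            return c != 0 ? c : Integer.compare(minor, other.minor);
        }

        @Override
        public String toString() {
            return major + "." + minor;
        }
    }

    // Option 1: no Comparator, so the TreeSet uses Version.compareTo.
    static Set<Version> natural() {
        Set<Version> set = new TreeSet<>();
        set.add(new Version(1, 10));
        set.add(new Version(1, 2));
        set.add(new Version(0, 9));
        return set;
    }

    // Option 2: an explicit Comparator passed to the constructor.
    static Set<Version> byMinor() {
        Set<Version> set = new TreeSet<>(Comparator.comparingInt((Version v) -> v.minor));
        set.addAll(natural());
        return set;
    }

    public static void main(String[] args) {
        System.out.println(natural()); // [0.9, 1.2, 1.10]
        System.out.println(byMinor()); // [1.2, 0.9, 1.10]
    }
}
```

One caveat with the Comparator route: a TreeSet treats two elements that compare as equal as duplicates, so a comparator that looks at only part of an object can silently drop elements.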
As per oracle docs, there is no guarantee that you will get the same order all the time.
This class implements the Set interface, backed by a hash table
(actually a HashMap instance). It makes no guarantees as to the
iteration order of the set; in particular, it does not guarantee that
the order will remain constant over time.
A HashSet keeps an internal hash table (https://en.wikipedia.org/wiki/Hash_table), which is driven by the result of each object's hashCode(). For most objects, hashCode() is deterministic, so iterating a HashSet of the same elements will likely give the same result each time. That does not mean it will be ordered. However, for Integer specifically, hashCode() returns the integer value itself; therefore, for a single-level hash table, the elements come out in order as long as the values are non-negative and smaller than the number of buckets.
I have code in which for-each loops over a Set rely on the iterator always returning the elements in the same order, e.g.
for(ParameterObject parameter : parameters) {
/* ... */
}
The iterators returned by HashSet are not guaranteed to have this property, however it is documented that the iterators of LinkedHashSet do have this property. So my code uses a LinkedHashSet and everything works fine.
However, I am wondering if I could endow my code with a check that the set passed to it conforms to the requirement. It appears that this is not possible (except for a direct test for LinkedHashSet). There is no interface implemented by LinkedHashSet that I could test for, and there is no interface implemented by LinkedHashSet.iterator() that I could test for. It would be nice if there were an interface like OrderConsistentCollection or OrderConsistentIterator.
(I need this property here).
There isn't a way you can check for it -- but you can ensure it anyway, by simply copying the set into a collection that does have that property. A LinkedHashSet would do the trick, but if all you need is the iteration, an ArrayList would probably serve you better.
List<Foo> parameters = new ArrayList<>(parametersSet);
Now parameters will always return an iterator with the same ordering.
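A small sketch of that defensive copy, with a made-up helper named snapshot: the set is copied once on entry, and every later loop runs over the copy, so the order cannot change between iterations even if the caller passed a set with an unstable order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; the names are not from the original posts.
class SnapshotDemo {
    // Copies the set once; the returned list fixes whatever order that one pass produced.
    static <T> List<T> snapshot(Set<T> set) {
        return Collections.unmodifiableList(new ArrayList<>(set));
    }

    public static void main(String[] args) {
        Set<String> parametersSet = new HashSet<>(Arrays.asList("alpha", "beta", "gamma"));
        List<String> parameters = snapshot(parametersSet);

        StringBuilder first = new StringBuilder();
        for (String p : parameters) {
            first.append(p).append(',');
        }
        StringBuilder second = new StringBuilder();
        for (String p : parameters) {
            second.append(p).append(',');
        }
        // Both passes over the list see the same order.
        System.out.println(first.toString().equals(second.toString())); // true
    }
}
```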
That said, you'd probably be fine with Evgeniy Dorofeev's suggestion, which points out that even the sets that don't guarantee a particular ordering usually do have a stable ordering (even if they don't guarantee it). HashSet acts that way, for instance. You'd actually have to have a pretty funky set, or take active randomization measures, to not have a stable ordering.
HashSet's ordering is not guaranteed, but it depends on the hash codes of its elements as well as the order in which they were inserted; they don't want to guarantee anything because they don't want to lock themselves into any one strategy, and even this loose of a contract would make for essentially random order if the objects' hash codes came from Object.hashCode(). Rather than specifying an ordering with complex implications, and then saying it's subject to change, they just said there's no guarantees. But those are the two factors for ordering, and if the set isn't being modified, then those two factors are going to be stable from one iteration to the next.
'HashSet.iterator does not return in any particular order' means that the elements returned by iterator are not sorted or ordered like in List or LinkedHashSet. But the HashSet.iterator will always return the elements in one and the same order while the HashSet is the same.
HashSet iterator is actually predictable, see this
HashSet set = new HashSet();
set.add(9);
set.add(2);
set.add(5);
set.add(1);
System.out.println(set);
I can foretell the output: it will be [1, 2, 5, 9], because the elements end up more or less sorted by hashCode.
I need an example on how to use a comparable class on a HashSet to get an ascending order. Let’s say I have a HashSet like this one:
HashSet<String> hs = new HashSet<String>();
How can I get hs to be in ascending order?
Use a TreeSet instead. It has a constructor taking a Comparator. It will automatically sort the Set.
If you want to convert a HashSet to a TreeSet, then do so:
Set<YourObject> hashSet = getItSomehow();
Set<YourObject> treeSet = new TreeSet<YourObject>(new YourComparator());
treeSet.addAll(hashSet);
// Now it's sorted based on the logic as implemented in YourComparator.
If the items themselves already implement Comparable and their default ordering is already what you want, then you basically don't need to supply a Comparator. You could then construct the TreeSet directly from the HashSet. E.g.
Set<String> hashSet = getItSomehow();
Set<String> treeSet = new TreeSet<String>(hashSet);
// Now it's sorted based on the logic as implemented in String#compareTo().
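For the HashSet<String> from the question, a sketch along these lines would produce ascending order, plus a descending variant using a Comparator. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; the names are not from the original posts.
class SortHashSetDemo {
    // Ascending order as a list: copy the set, then sort the copy.
    static List<String> ascending(Set<String> hs) {
        List<String> list = new ArrayList<>(hs);
        Collections.sort(list);
        return list;
    }

    // Descending order as a set: a TreeSet with a reversed comparator.
    static Set<String> descending(Set<String> hs) {
        Comparator<String> reverse = Comparator.reverseOrder();
        Set<String> sorted = new TreeSet<>(reverse);
        sorted.addAll(hs);
        return sorted;
    }

    public static void main(String[] args) {
        HashSet<String> hs = new HashSet<String>(Arrays.asList("pear", "apple", "fig"));
        System.out.println(ascending(hs));  // [apple, fig, pear]
        System.out.println(descending(hs)); // [pear, fig, apple]
    }
}
```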
See also:
Object ordering tutorial
Collections tutorial - Set Implementations
HashSet "makes no guarantees as to the iteration order of the set." Use LinkedHashSet instead.
Addendum: I would second @BalusC's point about implementing Comparable and express
a slight preference for LinkedHashSet, which offers "predictable iteration order ... without incurring the increased cost associated with TreeSet."
Addendum: @Stephen raises an important point, which favors @BalusC's suggestion of TreeSet. LinkedHashSet is a more efficient alternative only if the data is (nearly) static and already sorted.
HashSets do not guarantee iteration order:
This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. This class permits the null element.
You probably need to choose a different data structure if you want to be able to control the iteration order (or indeed have one at all!).