Java hashset and treeset - java

I have a question. It is said that HashSet in Java doesn't maintain an order, but looking at my program
public static void main(String[] args) {
HashSet<Integer> io=new HashSet<Integer>();
Integer io1=new Integer(4);
Integer io2=new Integer(5);
Integer io3=new Integer(6);
io.add(io2);
io.add(io3);
io.add(io1);
System.out.println(io);
}
and executing it, it gives me an ordered set every time I run it. Why is this happening?
Another question: if I use a TreeSet (as in the previous program, but with TreeSet instead of HashSet and my own class instead of Integer), do I have to implement compareTo?

HashSet doesn't maintain order, but it has to iterate over the elements in some order when you print them. HashSet is backed by a HashMap, which iterates over the elements in the order of the bins in which they are stored. In your simple example, 4,5,6 are mapped to bins 4,5,6 (since the hashCode of an integer is the value of the integer) so they are printed in ascending order.
If you tried to add 40, 50, 60 you would see a different order ([50, 40, 60]): the default initial number of bins is 16, so the hash codes 40, 50, 60 are mapped to bins 40%16 (8), 50%16 (2), 60%16 (12). So 50 is the first element iterated, followed by 40 and 60.
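The bin arithmetic above can be checked with a small sketch. Note that bucketIndex is a hypothetical helper written for illustration. It assumes the spreading step used by OpenJDK 8 and later (hash XOR hash shifted right by 16, then masked by the table size), and other implementations are free to do this differently.

```java
import java.util.HashSet;
import java.util.Set;

class BucketIndexDemo {
    // Assumption: mirrors how OpenJDK 8+ HashMap picks a bin for a key.
    // This is an implementation detail, not part of the documented contract.
    static int bucketIndex(Object key, int tableSize) {
        int h = key.hashCode();
        int spread = h ^ (h >>> 16);      // fold the high bits into the low bits
        return spread & (tableSize - 1);  // tableSize must be a power of two
    }

    public static void main(String[] args) {
        int[] values = {4, 5, 6, 40, 50, 60};
        for (int v : values) {
            System.out.println(v + " -> bin " + bucketIndex(v, 16));
        }
        Set<Integer> set = new HashSet<>();
        set.add(40);
        set.add(50);
        set.add(60);
        // The printed order follows the bins, but it is implementation-dependent.
        System.out.println(set);
    }
}
```

With 16 bins, the helper puts 4, 5, 6 into bins 4, 5, 6 and 40, 50, 60 into bins 8, 2, 12, which matches the explanation above.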
As for TreeSet<SomeCustomClass>, you can either implement Comparable<SomeCustomClass> in SomeCustomClass, or pass a Comparator<SomeCustomClass> to the constructor.
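Both options can be sketched as follows. Person is a hypothetical class made up for this example; it is not from the question.

```java
import java.util.Comparator;
import java.util.TreeSet;

class TreeSetOrderingDemo {
    // Hypothetical element class, used only for illustration.
    static class Person implements Comparable<Person> {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        // Natural ordering: by age, then by name, so two different people
        // never compare as equal (TreeSet would drop one of them otherwise).
        @Override
        public int compareTo(Person other) {
            int byAge = Integer.compare(age, other.age);
            return byAge != 0 ? byAge : name.compareTo(other.name);
        }

        @Override
        public String toString() {
            return name + "(" + age + ")";
        }
    }

    private static void fill(TreeSet<Person> set) {
        set.add(new Person("Carol", 41));
        set.add(new Person("Alice", 30));
        set.add(new Person("Bob", 25));
    }

    // Option 1: no-argument constructor, relies on Comparable.
    static TreeSet<Person> byNaturalOrder() {
        TreeSet<Person> set = new TreeSet<>();
        fill(set);
        return set;
    }

    // Option 2: a Comparator passed to the constructor overrides compareTo.
    static TreeSet<Person> byName() {
        TreeSet<Person> set = new TreeSet<>(Comparator.comparing((Person p) -> p.name));
        fill(set);
        return set;
    }

    public static void main(String[] args) {
        System.out.println(byNaturalOrder()); // [Bob(25), Alice(30), Carol(41)]
        System.out.println(byName());         // [Alice(30), Bob(25), Carol(41)]
    }
}
```

If the class implements neither Comparable nor gets a Comparator, adding an element to the TreeSet fails at runtime with a ClassCastException.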

As per oracle docs, there is no guarantee that you will get the same order all the time.
This class implements the Set interface, backed by a hash table
(actually a HashMap instance). It makes no guarantees as to the
iteration order of the set; in particular, it does not guarantee that
the order will remain constant over time.

A HashSet keeps an internal hash table (https://en.wikipedia.org/wiki/Hash_table), which is driven by the result of the respective object's hashCode(). For most objects, the hashCode() function is deterministic, so the results of iterating a HashSet of the same elements will likely be the same. That does not mean it will be ordered. However, for Integer specifically, hashCode() returns the integer value itself, so for small values in a single-level hash table the elements will come out in ascending order.

Related

How can an Iterator be used on a Set (Java)?

My question is: why does an iterator work on a set?
Here is my example code,
public class Staticex {
public static void main(String[] args) {
HashSet set = new HashSet();
set.add(1);
set.add(2);
set.add(3);
set.add(4);
set.add(5);
Iterator iter = set.iterator();
while (iter.hasNext()) {
System.out.println(iter.next());
}
}
}
I understand that a set is unordered, in contrast to a List.
So how can I get the values one by one through an iterator?
Is the iterator changing the set into something like a list, which is an ordered data structure?
How can an Iterator be used on a Set?
Like you are using it.
How can I get the values one by one through an iterator?
Your code is doing that.
Is the iterator changing the set into something like a list, which is an ordered data structure?
No.
The thing that you are missing is what "unordered" means. It means that the order in which the (set's) elements are returned is not predictable [1], and not specified in the javadocs. However, each element will be returned once and (since the elements of a set are unique!) only once for the iteration.
[1] Actually, this is not strictly true. If you have enough information about the element class, the element values, how they were created and how / when they were added to the HashSet, AND you analyze the specific HashSet implementation ... it is possible that you CAN predict what the iteration order is going to be. For example, if you create a HashSet<Integer> and add 1, 2, 3, 4, ... to it, you will see a clear (and repeatable) pattern when you iterate the elements. This is in part due to the way that Integer.hashCode() is specified.
Referring to the documentation, we see that:
Iterator<E> iterator()
Returns an iterator over the elements in this collection. There are no guarantees concerning the order in which the elements are returned (unless this collection is an instance of some class that provides a guarantee).
Since there are no guarantees concerning the order in which the elements are returned for iterator, it is not a problem for iterator to apply to Set, which is unordered.
Further, iterating does not change the Set into a List.
Set is unordered in a logical sense. When you have a bag of things, there isn't a sense of order when they are inside the bag. But when you take each thing out of the bag, one at a time, you end up with some order. And like the other answer has mentioned, you cannot rely on that order since it is purely accidental.
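To make the "taking things out of the bag" picture concrete, here is a minimal sketch (the class and method names are made up for this example). It collects whatever the iterator hands out and only relies on what is guaranteed: every element shows up exactly once, in some order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

class SetIterationDemo {
    // Collects the elements in whatever order the set's iterator returns them.
    static List<Integer> drain(Set<Integer> set) {
        List<Integer> seen = new ArrayList<>();
        Iterator<Integer> it = set.iterator();
        while (it.hasNext()) {
            seen.add(it.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<Integer> set = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5));
        List<Integer> seen = drain(set);
        // Guaranteed: same number of elements, and the same elements.
        System.out.println(seen.size() == set.size());
        System.out.println(new HashSet<>(seen).equals(set));
        // Not guaranteed: the particular order of 'seen'.
        System.out.println(seen);
    }
}
```

The set itself is not modified or converted by iterating; the list here is a separate copy made only to inspect the order.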
I understand that a set is unordered, in contrast to a List.
This is not necessarily true. SortedSet is a subinterface of Set. As the name implies, instances of this interface are ordered in some fashion. For example, TreeSets are ordered using their natural ordering, or by a Comparator provided at set creation time, depending on which constructor is used. Also, the main distinction between Set and List is that List allows for duplicate objects to be contained, whereas Set does not.
Now, if you are talking specifically about HashSet, then you are correct about being unordered.
I think your confusion comes from asking yourself "why does the printout show the numbers in numeric (insertion) order?" The full answer is somewhat involved, but in short: you are inserting integers, and their hash codes are simply their numeric values. Although there is no guarantee as to the order in which the elements of a HashSet are returned when iterating, the implementation of HashSet is backed by a hash table, so small integers end up in buckets in ascending order. In fact, if you change the insertion order of those same values, the numbers will most likely still be printed in the same numeric order. Remember that, with all that, the order is still not guaranteed. It may well not hold, for instance, if you change the set elements to String objects.

How does HashTable Order Values?

I want to know how Hashtable orders its values after using the put method.
For example:
Key:   a       b        c     d                e
Value: Normal  2 weeks  Next  Save and Finish  Go to Cases
hashtable.put("a","Normal"); ...
The order of the values will be different, not the order in which we put them.
I think the order will be like this:
Key:   b        a       e            c     d
Value: 2 weeks  Normal  Go to Cases  Next  Save and Finish
Please suggest data structures that solve the problem.
Thanks.
As very often in these cases, the answer is in the documentation:
This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.
Similar to HashMap, Hashtable also does not guarantee insertion order for the elements.
Reason
Hashtable is optimized for fast lookup. This is achieved by calculating a hash for each key stored. With a well-distributed hash, looking up a key in a Hashtable is O(1) on average, i.e. it takes roughly the same time irrespective of the number of entries.
Thus the entries are stored based on the hash generated for the key. This is the reason why Hashtable does not guarantee the order in which the elements were inserted.
A hash value (or simply hash), also called a message digest, is a number generated from a string of text. The hash is substantially smaller than the text itself, and is generated by a formula in such a way that it is extremely unlikely that some other text will produce the same hash value.
http://www.webopedia.com/TERM/H/hashing.html
http://interactivepython.org/runestone/static/pythonds/SortSearch/Hashing.html
As previously explained, Hashtable iteration order is arbitrary. If you want to preserve insertion order, use LinkedHashMap. If you want the natural order or a predefined one, use TreeMap. By natural order I mean the order of the keys: classes such as String, Integer and Long implement the Comparable interface, so they are sorted automatically, as are keys of any other class that implements Comparable. A predefined order can also be supplied through a Comparator when creating the TreeMap.
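A short sketch of the two alternatives, using the values from the question. The keys are deliberately inserted out of alphabetical order here (an assumption of this example) so that insertion order and sorted order differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class MapOrderDemo {
    // Inserts the same entries into any map, in the order c, a, e, b, d.
    private static void fill(Map<String, String> map) {
        map.put("c", "Next");
        map.put("a", "Normal");
        map.put("e", "Go to Cases");
        map.put("b", "2 weeks");
        map.put("d", "Save and Finish");
    }

    // LinkedHashMap iterates in insertion order.
    static Map<String, String> insertionOrdered() {
        Map<String, String> map = new LinkedHashMap<>();
        fill(map);
        return map;
    }

    // TreeMap iterates in the natural order of its keys.
    static Map<String, String> keySorted() {
        Map<String, String> map = new TreeMap<>();
        fill(map);
        return map;
    }

    public static void main(String[] args) {
        System.out.println(insertionOrdered().keySet()); // [c, a, e, b, d]
        System.out.println(keySorted().keySet());        // [a, b, c, d, e]
    }
}
```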

Why does the Set in Java seem to insert the same way every time? I thought order doesn't matter in a Set

So I know that sets cannot take duplicates... more formally, sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element. I realize TreeSet sorts the set for me. Here is my set code:
import java.util.Set;
import java.util.HashSet;
import java.util.TreeSet;
public class SetExample {
public static void main(String args[]) {
int count[] = {11, 22, 33, 44, 55};
Set<Integer> hset = new HashSet<Integer>();
try{
for(int i = 0; i<4; i++){
hset.add(count[i]);
}
System.out.println(hset);
TreeSet<Integer> treeset = new TreeSet<Integer>(hset);
System.out.println("The sorted list is:");
System.out.println(treeset);
}
catch(Exception e){
e.printStackTrace();
}
}
}
This is my output:
ArrayList Elements:
[Chaitanya, Rahul, Ajeet]
LinkedList Elements: [Kevin, Peter, Kate]
[33, 22, 11, 44]
The sorted list is:
[11, 22, 33, 44]
Why is the set ALWAYS in the order of [33,22,11,44]?
Quoting javadoc of HashSet:
It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time.
It doesn't say "random". It doesn't say it's not consistent. It basically just says that order is undefined and may change over time, meaning that if you add more values, the order of the current values may change.
Note that although you get consistent, but undefined, order of your specific values, another version of the Java Runtime Library may give a different, but still consistent, order.
The actual order of the objects depends on the hash values of those objects, the current number of hash buckets in the HashSet, and the algorithm used to map a hash value to a hash bucket. The algorithm may change between different versions of the Java Runtime Library, and the number of hash buckets may change as values are added and removed from the HashSet.
To see this in effect, try creating the HashSet with a different initial capacity, using the HashSet(int initialCapacity) constructor. Different capacities will likely order the values differently, but will be consistent, meaning that same values, same capacity and same Runtime version will always be same order.
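A minimal sketch of that experiment, using the values from the question. The capacities 16 and 64 are arbitrary picks for illustration; no expected output is shown because the printed order depends on the runtime.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CapacityDemo {
    // Builds a HashSet with the given initial capacity and adds the values.
    static Set<Integer> withCapacity(int initialCapacity, List<Integer> values) {
        Set<Integer> set = new HashSet<>(initialCapacity);
        set.addAll(values);
        return set;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(11, 22, 33, 44);
        // Same elements, different number of buckets: the printed order may differ.
        System.out.println("capacity 16: " + withCapacity(16, values));
        System.out.println("capacity 64: " + withCapacity(64, values));
        // As sets they are still equal, because equality ignores iteration order.
        System.out.println(withCapacity(16, values).equals(withCapacity(64, values)));
    }
}
```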
Internally HashSet uses a map with the key as the element you're inserting and the value is just a new Object as a dummy placeholder (Look at the source code, you will see it just uses a map).
This means the keys(elements you add in set) could be placed anywhere in the map depending on the hashing. If you continue to place the same values in the set as an experiment they would probably be hashed to exactly the same location which is why the order wouldn't change when iterating over the entries.
Only when you change the values would you get some different ordering due to where they end up being placed in the backing map.
If you look at the implementation of HashSet, it turns out that under the hood it is a HashMap, with the set's values stored as keys in the HashMap (and a dummy empty object as the value), so the order in a HashSet is implied by the ordering in the HashMap. You can read more about entry ordering in HashMap here: https://stackoverflow.com/a/2144822/1935341
You may be wondering why the order is consistent when it is said that a set doesn't maintain any order.
The thing is, a HashSet doesn't maintain insertion order, but it does have an internal order of its own.
HashSet maintains a HashMap internally to store the entries.
public HashSet() {
map = new HashMap<>();
}
When an item is inserted, it is inserted into the map.
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
Now, as for HashMap: it is known that the bucket index of a key is determined by the hash value of the key. For types such as Integer and String, this hash value is computed from the content, so it is the same on every run.
So equal objects generate the same hash every time, and the same order is maintained.

finding a hash function for long integer array

I am looking for a hash function for integer arrays containing about 17 integers each. There are about 1000 items in the HashMap and I want the computation to be as fast as possible.
I am now kind of confused by so many hash functions to choose and I notice that most of them are designed for strings with different characters. So is there a hash function designed for strings with only numbers and quick to run?
Thanks for your patience!
You did not specify any requirements (except speed of calculation), but take a look at java.util.Arrays#hashCode. It should be fast, too, just iterating once over the array and combining the elements in an int calculation.
Returns a hash code based on the contents of the specified array. For any two non-null int arrays a and b such that Arrays.equals(a, b), it is also the case that Arrays.hashCode(a) == Arrays.hashCode(b).
The value returned by this method is the same value that would be obtained by invoking the hashCode method on a List containing a sequence of Integer instances representing the elements of a in the same order. If a is null, this method returns 0.
And the hashmap accepts an array of integer as the key.
Actually, no!
You could technically use int[] as a key in a HashMap in Java (you can use any kind of Object), but that won't work well, as arrays don't define a useful hashCode method (or a useful equals method). So the key will use object identity. Two arrays with identical content will be considered to be different from each-other.
You could use List<Integer>, which does implement hashCode and equals. But keep in mind that you must not mutate the list after setting it as a key. That would break the hashtable.
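A minimal sketch of the difference (the class and method names are made up for this example): an int[] key is looked up by object identity, a List<Integer> key by content, and Arrays.hashCode gives the same value as the equivalent list's hashCode, as the quoted documentation says.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ArrayKeyDemo {
    // int[] inherits equals/hashCode from Object, so a second array with the
    // same content is a different key.
    static boolean arrayKeyFound() {
        Map<int[], String> map = new HashMap<>();
        map.put(new int[] {1, 2, 3}, "value");
        return map.containsKey(new int[] {1, 2, 3});
    }

    // List<Integer> compares and hashes by content, so the lookup succeeds.
    static boolean listKeyFound() {
        Map<List<Integer>, String> map = new HashMap<>();
        map.put(Arrays.asList(1, 2, 3), "value");
        return map.containsKey(Arrays.asList(1, 2, 3));
    }

    // Arrays.hashCode(int[]) matches the hashCode of the equivalent list.
    static boolean hashesAgree() {
        return Arrays.hashCode(new int[] {1, 2, 3}) == Arrays.asList(1, 2, 3).hashCode();
    }

    public static void main(String[] args) {
        System.out.println(arrayKeyFound()); // false
        System.out.println(listKeyFound());  // true
        System.out.println(hashesAgree());   // true
    }
}
```

If boxing 17 integers per key turns out to be too slow, one alternative is a small wrapper class around the int[] that delegates to Arrays.equals and Arrays.hashCode; whether that is worth it for about 1000 entries would need measuring.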
The HashMap methods are documented at
https://docs.oracle.com/javase/7/docs/api/java/util/HashMap.html
Creating a HashMap is easy; it goes like this:
HashMap<Object, Integer> map = new HashMap<Object, Integer>();

How does a Set determine the order of its values?

I wrote a simple program that takes an array of strings that get converted into a list, then into a Set, which is finally printed.
Here is the code:
public static void main(String[] args) {
String[] array = {"hello", "goodbye", "welcome", "thanks"};
List<String> list = Arrays.asList(array);
System.out.println(list);
Set<String> set = new HashSet<String>(list);
System.out.println(set);
}
The set returns
[hello, goodbye, welcome, thanks]
[hello, thanks, goodbye, welcome]
And no matter what order I make the array it returns the Set in that particular order. So how does Set<> determine in what order the values should be put into?
The order of the elements in a Set is determined by the order of the elements in its Iterator and, as specified in Set.iterator()
The elements are returned in no particular order (unless this set is an instance of some class that provides a guarantee).
So there is no inherent order to a Set.
However, Set is only an interface. There are various implementations of Set that do provide a guarantee.
There's a HashSet - which doesn't - i.e. it optimises itself to achieve O(1) at the expense of a predictable order.
There's a TreeSet - which maintains the natural order of the objects - i.e. "ab" < "ac" and 1 < 10 or any order you define using a Comparator.
There's an EnumSet - which orders by the enum ordinal order - kind of like TreeSet.
There's a LinkedHashSet - which orders in the order the items were added.
There are other more obscure implementations of Set that also have their own character.
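The implementations listed above can be compared side by side with the strings from the question. This is only a sketch; the class and method names are made up, and the HashSet line carries no expected output because its order is not guaranteed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetFlavorsDemo {
    static final List<String> WORDS = Arrays.asList("hello", "goodbye", "welcome", "thanks");

    // TreeSet: natural (alphabetical) order of the strings.
    static Set<String> sorted() {
        return new TreeSet<>(WORDS);
    }

    // LinkedHashSet: the order in which the items were added.
    static Set<String> insertionOrdered() {
        return new LinkedHashSet<>(WORDS);
    }

    // HashSet: whatever order the hash buckets happen to give.
    static Set<String> hashed() {
        return new HashSet<>(WORDS);
    }

    public static void main(String[] args) {
        System.out.println(sorted());           // [goodbye, hello, thanks, welcome]
        System.out.println(insertionOrdered()); // [hello, goodbye, welcome, thanks]
        System.out.println(hashed());           // order not guaranteed
    }
}
```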
The iteration order of a HashSet is an implementation detail that may change from release to release. You should assume that the ordering is magic, inscrutable, and subject to change.
(In practice, it's affected by the hash codes of the elements, the smearing function HashSet uses internally, and the order the hash buckets appear generally.)
