How does the Set interface guarantee no duplicates? [closed] - java

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
Today an interviewer asked me: how does Set guarantee that it contains no duplicates?

The answer lies in the source code of the add method. For example, in the source code of TreeSet the add method is implemented as follows:
public boolean add(E e)
{
    return m.put(e, PRESENT)==null;
}
Here PRESENT is a plain instance of the Object class and m is a NavigableMap. The map m stores the element e as a key, with PRESENT as the value for that key. Consequently every key in m maps to the same PRESENT object. The put method of Map is defined in the Oracle documentation as:
Associates the specified value with the specified key in this map. If the map previously contained a mapping for the key, the old value is replaced.
...
...
Returns: the previous value associated with key, or null if there was no mapping for key. (A null return can also indicate that the map previously associated null with key.)
So, when you add an element to the set, it is put into the NavigableMap as a key with the value PRESENT. If the key was not already present in the NavigableMap, put returns null, hence
m.put(e,PRESENT)==null evaluates to true and we know that the element was added. If the key is already present in the NavigableMap, put overwrites the value for the key e with PRESENT and returns the old value (which is also PRESENT), hence
m.put(e,PRESENT)==null evaluates to false and we know that the element was not added.
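The behaviour described above can be tried out with a small sketch. The class name SetAddDemo, the helper addViaMap and the local PRESENT constant are made up for illustration; they only imitate the idea of the TreeSet source quoted above and are not the JDK code itself.

```java
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class SetAddDemo {
    // Stand-in for TreeSet's private PRESENT field (declared here for illustration only).
    private static final Object PRESENT = new Object();

    // Same idea as TreeSet.add: report true only if the key had no previous mapping.
    static boolean addViaMap(TreeMap<String, Object> m, String e) {
        return m.put(e, PRESENT) == null;
    }

    public static void main(String[] args) {
        TreeMap<String, Object> m = new TreeMap<>();
        System.out.println(addViaMap(m, "java")); // true: no previous mapping
        System.out.println(addViaMap(m, "java")); // false: put returned the old PRESENT
        System.out.println(m.size());             // 1

        Set<String> set = new TreeSet<>();
        System.out.println(set.add("java"));      // true
        System.out.println(set.add("java"));      // false
        System.out.println(set.size());           // 1
    }
}
```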

From a specification point of view, it achieves this by, for example, specifying what the add method must do if you try to add a duplicate. The documentation for the add method says:
Adds the specified element to this set if it is not already present
(optional operation). More formally, adds the specified element e to
this set if the set contains no element e2 such that (e==null ?
e2==null : e.equals(e2)). If this set already contains the element,
the call leaves the set unchanged and returns false. In combination
with the restriction on constructors, this ensures that sets never
contain duplicate elements.
From the same page (http://docs.oracle.com/javase/6/docs/api/java/util/Set.html):
The additional stipulation on constructors is, not surprisingly, that all constructors must create a set that contains no duplicate elements (as defined above).
(For completeness, there are also stipulations with regard to equals and hashCode that ensure that Set properly models the set abstraction.)

Set is an abstract data type that can be implemented in many ways. On its own it's a specification of a contract; as such it does not guarantee anything. It's up to the implementation of the interface to guarantee that the contract is fulfilled.
Therefore it's more interesting to look at how and why the implementations work. Some common implementations are:
Hash table, as implemented in Java by HashSet
Balanced tree, as implemented in Java by TreeSet
Bit set (for special types), as implemented in Java by EnumSet (BitSet is based on the same idea, but it does not implement the Set interface)
Skip lists, as implemented by ConcurrentSkipListSet
Naive arrays: scan the array for the element before adding it; not frequently used. Implemented in Java as CopyOnWriteArraySet
In a job interview you could have replied with the above and offered to explain the details of any one implementation. The interviewer probably already knows some of these, and it would not have been to your benefit to start rambling about them unless asked.
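As a rough illustration of the point that the contract is fulfilled by each implementation, the sketch below adds the same value twice to several of the Set implementations listed above and reports the size. The class SetImplementationsDemo, the helper sizeAfterAddingTwice and the Color enum are hypothetical names introduced only for this example.

```java
import java.util.EnumSet;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.CopyOnWriteArraySet;

class SetImplementationsDemo {
    // Hypothetical enum, needed only so that EnumSet has something to hold.
    enum Color { RED, GREEN }

    // Adds the same value twice and returns the resulting size.
    static <T> int sizeAfterAddingTwice(Set<T> set, T value) {
        set.add(value);
        set.add(value);
        return set.size();
    }

    public static void main(String[] args) {
        // Each implementation uses a different data structure, yet each line prints 1.
        System.out.println(sizeAfterAddingTwice(new HashSet<String>(), "x"));
        System.out.println(sizeAfterAddingTwice(new TreeSet<String>(), "x"));
        System.out.println(sizeAfterAddingTwice(new ConcurrentSkipListSet<String>(), "x"));
        System.out.println(sizeAfterAddingTwice(new CopyOnWriteArraySet<String>(), "x"));
        System.out.println(sizeAfterAddingTwice(EnumSet.noneOf(Color.class), Color.RED));
    }
}
```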

Related

Java List.add() and Map.put() [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
Working with Lists and Maps, I started wondering why the Map method for adding an object is called put, while for List it is called add.
Was it just the developers' whim, or is there a reason behind naming those methods differently?
Maybe the method names let the developer know, while adding to a Map or List, what kind of data structure they are working with?
Or do the names describe the way the methods work?
The difference is:
.add() means to insert something at the end or wherever you want to (you know where to add), whereas
.put() means to add an element wherever it needs to be placed, not necessarily at the end of the Map, because it all depends on the key being inserted (you don't know where it will go).
To me, there is a reason behind it.
After all, a List such as ArrayList is a dynamic array, which internally keeps a logical index at which we are adding.
And a Map internally holds buckets of key-value pairs. So in a sense we are putting something into a bucket.
It is phrased this way to give a clear understanding.
As Java is a high-level, human-readable language, the names can also be read as plain English for better understanding.
Collection#add() can be seen as adding your value to a pool of something (an implementation of Collection<E> defines what the pool actually is).
Whereas with Map#put() you associate your value with a key which potentially already had a value associated with it.
List.add(element) will always append an entry to the end of the list.
Put injects an entry into the map if the key does not already exist; if the key already exists, the value is updated.
Thus the operations are different. On some level, the authors of the API have to make decisions that balance out various concerns. Adding to a set has some aspects of adding to a list and of putting into a map, in that adding an "equal" entry has no effect.
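The three behaviours just described can be compared side by side in a minimal sketch. The class and method names below (AddVersusPutDemo and its helpers) are invented for this example; only the List, Map and Set calls are standard library API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class AddVersusPutDemo {
    // List.add(E) appends, so the same value can be stored twice.
    static int listSizeAfterTwoAdds() {
        List<String> list = new ArrayList<>();
        list.add("penguin");
        list.add("penguin");
        return list.size();
    }

    // Map.put returns the previous value for the key, or null if there was none.
    static Integer previousValueOnSecondPut() {
        Map<String, Integer> map = new HashMap<>();
        map.put("penguin", 1);
        return map.put("penguin", 3);
    }

    // Set.add reports false when an equal element is already present.
    static boolean secondSetAddChangedSet() {
        Set<String> set = new HashSet<>();
        set.add("penguin");
        return set.add("penguin");
    }

    public static void main(String[] args) {
        System.out.println(listSizeAfterTwoAdds());     // 2
        System.out.println(previousValueOnSecondPut()); // 1
        System.out.println(secondSetAddChangedSet());   // false
    }
}
```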
For this you should just read the Java docs for add and put.
They are two different functions that take completely incompatible inputs and return completely incompatible values. They are two completely separate and distinct functions that behave completely differently (other than that they are both for adding elements to a collection; the concept, not the interface, as Map doesn't implement that interface).
From the docs
PUT
Associates the specified value with the specified key in this map (optional operation). If the map previously contained a mapping for the key, the old value is replaced by the specified value. (A map m is said to contain a mapping for a key k if and only if m.containsKey(k) would return true.)
ADD
Appends the specified element to the end of this list (optional operation).
Lists that support this operation may place limitations on what elements may be added to this list. In particular, some lists will refuse to add null elements, and others will impose restrictions on the type of elements that may be added. List classes should clearly specify in their documentation any restrictions on what elements may be added.
List: if I say I'm adding some items to a container, I mean that the items have been added to what the container already holds. Here we concentrate on the addition of a new item to the existing container, or List in Java.
Map: if I want to put some things into a locker, or onto my computer, which already holds things I don't care about, I just have to put, not add.
Here we concentrate on placing new data into the locker, or Map in Java, regardless of what already exists there.
Real-life example: you add sugar to the tea keeping in mind the amount which is already there. You put your clothes into the clothes storage regardless of whether there are any clothes in it already.
On the Java side:
if your list is like this:
List<String> list = new ArrayList<String>();
list.add("java");
list.add("php");
list.add("python");
list.add("perl");
list.add("c");
list.add("lisp");
list.add("c#");
and you want to add something to the list, you have to care about what is already there, because a List will store the duplicate, whereas a Set will not.
If you create a Map:
Map<String, Object> foodData = new HashMap<String, Object>();
foodData.put("penguin", 1);
foodData.put("flamingo", 2);
and you then call foodData.put("penguin", 3); you don't have to worry about whether you are adding or updating; the map handles that internally.
I think if you get into etymology we can only guess: when you add a value to a list you always increase the list length, whereas when you put a value into a map you do not necessarily increase the number of map entries (if the key already exists).

How can we maintain unique object list without using set?

Suppose we have two Employee instances with the same attributes, such as id, name and address (all values are the same).
I want a list of unique objects without using a Set.
Please don't explain the logic with primitive data types; I want uniqueness for object types.
Simple: you create a "collection" class that uses the equals() method of "incoming" objects to compare them against already stored objects.
If that method returns false for every stored object, there is no duplicate and you add the object to the collection. If it returns true, the object is not unique and you don't add it.
In other words - you re-invent the wheel and create something that resembles a Java set. Of course, with all the implicit drawbacks - such as repeating implementation bugs that were fixed in the Java set implementations 15 to 20 years ago.
If you don't want to use a Set, use a List. All you need to know to implement the uniqueness checking logic is what the equals(Object other) method does:
Indicates whether some other object is "equal to" this one
Now you can test an incoming object against all objects currently on your list, and add it if a match is not found.
Obviously, performance of this method of maintaining a unique collection of objects is grossly inferior to both hash-based and ordering-based sets.
If you cannot use a Set for holding unique instances of your Employee class, you can use a List. This requires you to do two things:
Override equals() (and hashCode()) in Employee to contain the equality logic. This you would need even if you used a Set.
Each time you add items to the list, use List.contains() to check whether an equal object is already in the list. The method will internally use your Employee.equals() implementation. Add an item only if it's not already in the list. Note that this method is quite inefficient, as in the worst case (when an item is not already in the list) it needs to iterate through the whole list.
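The two steps above could look roughly like the following sketch. The field names of Employee and the wrapper class UniqueEmployeeList are assumptions made for this example, since the question does not show the real Employee class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Assumed shape of Employee: the fields are taken from the question's description.
final class Employee {
    private final int id;
    private final String name;
    private final String address;

    Employee(int id, String name, String address) {
        this.id = id;
        this.name = name;
        this.address = address;
    }

    // Two employees are equal when all three attributes match.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Employee)) return false;
        Employee other = (Employee) o;
        return id == other.id
                && Objects.equals(name, other.name)
                && Objects.equals(address, other.address);
    }

    // Kept consistent with equals, as the contract requires.
    @Override
    public int hashCode() {
        return Objects.hash(id, name, address);
    }
}

// Hypothetical wrapper that keeps a List free of duplicates.
final class UniqueEmployeeList {
    private final List<Employee> items = new ArrayList<>();

    // Linear scan through List.contains, which calls Employee.equals internally.
    boolean add(Employee e) {
        if (items.contains(e)) {
            return false;
        }
        return items.add(e);
    }

    int size() {
        return items.size();
    }

    public static void main(String[] args) {
        UniqueEmployeeList list = new UniqueEmployeeList();
        System.out.println(list.add(new Employee(1, "Ann", "Main St"))); // true
        System.out.println(list.add(new Employee(1, "Ann", "Main St"))); // false
        System.out.println(list.size());                                 // 1
    }
}
```

Each add costs a scan over the whole list, which is the inefficiency mentioned above.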

Java Collection HashSet vs HashMap [duplicate]

This question already has answers here:
Difference between HashSet and HashMap?
(20 answers)
Closed 8 years ago.
I have a question regarding the best practice for choosing a collection with respect to memory. I frequently need to call a method which returns (key, value) pairs, so which way is best: using a HashMap, or creating an object that contains the key and value and saving these objects in a HashSet?
Thanks & Regards.
It depends on whether you need to search the data structure based on the key alone or both the key and the value.
If you search by the key alone (i.e. map.containsKey(key)), you should use a HashMap.
If you search for the existence of a key-value pair (i.e. set.contains(new Pair(key,value))), you should use a HashSet that contains those pairs.
Another thing to consider is how you determine the uniqueness of the elements. If it is determined by the key alone, you should use a HashMap. If it is determined by both key and value (i.e. you can have the same key appear twice with different values), you must use a HashSet, since HashMap doesn't allow the same key to appear more than once.
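A small sketch of that difference follows. Instead of a custom Pair class it uses java.util.AbstractMap.SimpleImmutableEntry, whose equals and hashCode take both key and value into account; this is a substitution made for brevity, and the class name MapVersusPairSetDemo is invented for the example.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class MapVersusPairSetDemo {
    // In a HashMap uniqueness is decided by the key alone: the second put replaces the value.
    static int mapSize() {
        Map<String, Integer> map = new HashMap<>();
        map.put("k", 1);
        map.put("k", 2);
        return map.size();
    }

    // In a HashSet of pairs uniqueness is decided by key and value together.
    static int pairSetSize() {
        Set<Map.Entry<String, Integer>> pairs = new HashSet<>();
        pairs.add(new AbstractMap.SimpleImmutableEntry<>("k", 1));
        pairs.add(new AbstractMap.SimpleImmutableEntry<>("k", 2));
        pairs.add(new AbstractMap.SimpleImmutableEntry<>("k", 1)); // equal to the first pair, ignored
        return pairs.size();
    }

    public static void main(String[] args) {
        System.out.println(mapSize());     // 1
        System.out.println(pairSetSize()); // 2
    }
}
```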

What do HashMap and HashSet have in common? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
Everywhere you can find answers about what the differences are:
Map stores key-value pairs, is not synchronized (not thread safe), allows null values and only one null key, and is faster for getting a value because every value has a unique key, etc.
Set - not sorted, slower to get a value, stores only values, does not allow duplicates or null values, I guess.
BUT what does the word Hash mean (that is what they have in common)? Is it something about hashing values, or something else? I hope you can answer me clearly.
Both use the hash value of the object for storage, which internally comes from the hashCode() method of the Object class.
So if you are storing instances of your custom class then you need to override the hashCode() method (together with equals()).
HashSet and HashMap have a number of things in common:
The start of their name - which is a clue to the real similarity.
They use Hash Codes (from the hashCode method built into all Java objects) to quickly process and organize Objects.
They are both unordered collections - but both provide ordered variants (LinkedHashX to store objects in the order of addition)
There is also TreeSet/TreeMap to sort all objects present in the collection and keep them sorted. A comparison of TreeSet to TreeMap will find very similar differences and similarities to one between HashSet and HashMap.
They are also both impacted by the strengths and limitations of Hash algorithms in general.
Hashing is only effective if the objects have well behaved hash functions.
Hashing breaks entirely if equals and hashCode do not follow the correct contract.
Key objects in maps and objects in set should be immutable (or at least their hashCode and equals return values should never change) as otherwise behavior becomes undefined.
If you look at the Map API you can also see a number of other interesting connections - such as the fact that keySet and entrySet both return a Set.
None of the general-purpose collections in java.util, such as HashMap and HashSet, are thread safe. Some of the older classes (such as Vector and Hashtable) are synchronized, but they are mostly treated as legacy. For thread safety look at the java.util.concurrent package; for non-thread-safe collections look at java.util.
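The earlier point in this answer, that elements of a set should not change their hashCode after insertion, can be illustrated with a short sketch. The Box class and the MutableKeyDemo name are hypothetical and exist only for this example.

```java
import java.util.HashSet;
import java.util.Set;

class MutableKeyDemo {
    // Hypothetical element type whose hashCode depends on a mutable field.
    static final class Box {
        int value;

        Box(int value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Box && ((Box) o).value == value;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(value);
        }
    }

    // Returns whether the set still finds the element after it was mutated in place.
    static boolean foundAfterMutation() {
        Set<Box> set = new HashSet<>();
        Box box = new Box(1);
        set.add(box);
        box.value = 2; // the hash code now differs from the one recorded at insertion
        return set.contains(box);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // false: the element is still stored but cannot be found
    }
}
```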
Just look into HashSet source code and you will see that it uses HashMap. So they have the same properties of null-safety, synchronization etc:
public class HashSet<E>
    ...
    private transient HashMap<E,Object> map;

    // Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();

    /**
     * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
     * default initial capacity (16) and load factor (0.75).
     */
    public HashSet() {
        map = new HashMap<>();
    }
    ...
    public boolean contains(Object o) {
        return map.containsKey(o);
    }
    ...
    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }
    ...
}
HashSet is like a HashMap where you don't care about the values but about the keys only.
So you care only whether a given key K is in the set, not about the value V to which it is mapped (you can think of V as a constant, e.g. V=Boolean.TRUE for all keys in the HashSet). So HashSet has no values (no V set). This is the whole difference from a structural point of view. The hash part means that when putting elements into the structure Java first calls the hashCode method. See also http://en.wikipedia.org/wiki/Open_addressing to understand in general what happens under the hood of a hash table (java.util.HashMap itself resolves collisions by chaining entries within a bucket rather than by open addressing).
The hash value is used to check faster if two objects are the same. If two objects have same hash, they can be equal or not equal (so they are then compared for equality with the equals method). But if they have different hashes they are different for sure and the check for equality is not needed. This doesn't mean that if two objects have same hash values they overwrite each other when they are stored in the HashSet or in the HashMap.
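A quick way to see that equal hash values do not make elements overwrite each other is to use two different strings that happen to share a hash code. "Aa" and "BB" are such a pair under String.hashCode. The class name HashCollisionDemo and its helpers are made up for this sketch.

```java
import java.util.HashSet;
import java.util.Set;

class HashCollisionDemo {
    // True when both strings produce the same hash code.
    static boolean sameHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    // Adds both strings to a HashSet and returns how many elements it holds.
    static int sizeAfterAdding(String a, String b) {
        Set<String> set = new HashSet<>();
        set.add(a);
        set.add(b);
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sameHash("Aa", "BB"));        // true: both hash to 2112
        System.out.println("Aa".equals("BB"));           // false: equals settles the matter
        System.out.println(sizeAfterAdding("Aa", "BB")); // 2: both elements are kept
    }
}
```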
Both are not thread safe, and both store elements using hashCode(). Those are the common facts. Another one is that both are members of the Java Collections Framework. But there are lots of differences between the two.
Hash refers to the technique used to convert the key to an index. Back in data structures class we learned how to construct a hash table: to do that you take the strings being inserted and convert them to a number that indexes an array used internally as the storage.
One problem that was also much discussed was finding a hashing function that produces as few collisions as possible, so that we don't have two different objects, with different keys, sharing the same position.
So, the hash is about how the keys are processed for storage. If we think about it for a while, there isn't a (real) way to index memory with strings, only with numbers, so to have a 2D structure like a table that is indexed by a string (or an object, if you wish) you need to generate a number (a hash) for that string and store the value in an array at that index. However, if you also need the key itself, "name" for example, you need a second array that stores the key "name" at the same index.
Cheers
The word "Hash" is common because both use a hashing mechanism. HashSet is actually implemented using HashMap, with the same dummy object instance as the value of every entry of the Set. That costs one extra reference per entry (4 bytes on a JVM with 32-bit or compressed references).

addAll to Set is not adding the values in java

I have a property in an object (Obj1):
Set<AssignedService> serviceList;
public Set<AssignedService> getServiceList();
I am doing the below operation in certain instances
Obj1.getServiceList().clear();
Obj1.getServiceList().addAll(services);
where services is also a Set.
But the end result I see is that the services set has 4 objects/data elements,
while Obj1.getServiceList() returns an empty set after addAll.
What's the issue here? Is it a problem with the AssignedService object, since it doesn't implement Comparable?
You should first read this excellent piece on .equals()
Then, as others have pointed out, check your implementation of equals() and hashCode() on the AssignedService class. Most likely the root cause is found here.
You could also check the return value of the .addAll(...) call - false would indicate that the underlying Set isn't modified by the method call.
Cheers,
Check the implementation of equals() in AssignedService.
Set: A collection that contains no duplicate elements. More formally, sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element. As implied by its name, this interface models the mathematical set abstraction.
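The suggestions above (a consistent equals()/hashCode() pair and checking the return value of addAll) can be sketched as follows. The Service class is a hypothetical stand-in for AssignedService, which the question does not show, and the last case in main is only an assumption about one way the return value could come back false: both variables referring to the same Set instance.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class AddAllDemo {
    // Hypothetical stand-in for AssignedService; the real class is not shown in the question.
    static final class Service {
        final String code;

        Service(String code) {
            this.code = code;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Service && Objects.equals(code, ((Service) o).code);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(code);
        }
    }

    // Clears the target, copies the source into it, and reports whether the target changed.
    static boolean replaceContents(Set<Service> target, Set<Service> source) {
        target.clear();
        return target.addAll(source);
    }

    public static void main(String[] args) {
        Set<Service> services = new HashSet<>();
        services.add(new Service("a"));
        services.add(new Service("b"));

        Set<Service> target = new HashSet<>();
        System.out.println(replaceContents(target, services)); // true
        System.out.println(target.size());                     // 2

        // Assumed scenario: target and source are the same object, so clear() empties both.
        System.out.println(replaceContents(services, services)); // false
        System.out.println(services.size());                     // 0
    }
}
```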
