Java Collections -> HashSet

I want to handle a set of objects of class MyClass in a HashSet. When I try to add an object that already exists (according to the equals and hashCode of MyClass), the add method returns false. Is there a way to get back the actual object that already exists in the set?
Please give me any advice on handling this collection so that I can get the existing object when add returns false.

Just check whether the HashSet contains your object:
if (hashSet.contains(obj)) {
doWhateverWith(obj);
}

Short of iterating over the set, no, there is no way to get the existing member of the set that is equal to the value just added. The best way to do that would be to write a set wrapper around HashMap that maps each added value to itself.
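The wrapper idea above could be sketched roughly as follows. This is only a minimal illustration, not a full Set implementation: the class name CanonicalSet and the method addOrGetExisting are hypothetical, and Map.putIfAbsent requires Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper (not part of the JDK): a set-like class that can
// hand back the instance it already holds.
final class CanonicalSet<E> {
    // Each element is stored as both key and value.
    private final Map<E, E> map = new HashMap<>();

    /**
     * Adds the element if no equal element is present.
     * Returns the instance already stored, or null if the element was new.
     */
    E addOrGetExisting(E element) {
        return map.putIfAbsent(element, element);
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    int size() {
        return map.size();
    }
}

class CanonicalSetDemo {
    public static void main(String[] args) {
        CanonicalSet<String> set = new CanonicalSet<>();
        String first = new String("abc");
        String second = new String("abc"); // equal to first, but a different object

        System.out.println(set.addOrGetExisting(first));           // null: newly added
        System.out.println(set.addOrGetExisting(second) == first); // true: the stored instance
    }
}
```

Because putIfAbsent never overwrites an existing mapping, the first instance added stays the stored one.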

If equals(..) returns true, then the objects are the same, so you can use the one you are trying to add to the set.

Why would you let it return the object which you're trying to add? You already have it there!
Just do something like:
if (!set.add(item)) {
// It already contains the item.
doSomethingWith(item);
}
If that does not achieve the desired result, then it simply means that the item's equals() is poorly implemented.

One possible way would be:
MyClass[] myArray = mySet.toArray(new MyClass[mySet.size()]);
List<MyClass> myList = Arrays.asList(myArray);
MyClass existing = myList.get(myList.indexOf(myObject));
But of course, as some people have pointed out, if the add failed then the element you tried to insert is logically equal to the one you are looking for. The lookup above only matters if you want the instance that is actually stored in the set, i.e. the == object rather than merely an equal one as defined by equals and hashCode.

When using a HashSet, no, as far as I know you can't do that except by iterating over the whole thing and calling equals() on each one. You could, however, use a HashMap and just map every object to itself. Then call put(), which will return the previously mapped value, if any.
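A small sketch of the put() approach, with one caveat worth knowing: when an equal key is already present, HashMap.put keeps the old key and only overwrites the value. The helper name addAndGetPrevious is made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class PutReturnsPrevious {
    // Maps the object to itself; returns the previously stored equal object, or null.
    static <E> E addAndGetPrevious(Map<E, E> map, E element) {
        return map.put(element, element);
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        String first = new String("abc");
        String second = new String("abc"); // equal to first, but a distinct instance

        System.out.println(addAndGetPrevious(map, first) == null);   // true
        System.out.println(addAndGetPrevious(map, second) == first); // true

        // Caveat: put() keeps the old key but overwrites the value.
        System.out.println(map.keySet().iterator().next() == first); // true
        System.out.println(map.get("abc") == second);                // true
    }
}
```

If the originally stored instance should stay in place, putIfAbsent (Java 8+) is probably the better fit than put.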

http://download.oracle.com/javase/6/docs/api/java/util/HashSet.html#contains(java.lang.Object)
You can use that method to check if you like; but the thing you have to keep in mind is that you already have the object that exists.

Related

Check all values of object in list are unique [duplicate]

I want to remove duplicates from a list but what I am doing is not working:
List<Customer> listCustomer = new ArrayList<Customer>();
for (Customer customer: tmpListCustomer)
{
if (!listCustomer.contains(customer))
{
listCustomer.add(customer);
}
}
Assuming you want to keep the current order and don't want a Set, perhaps the easiest is:
List<Customer> dedupeCustomers =
new ArrayList<>(new LinkedHashSet<>(customers));
If you want to change the original list:
Set<Customer> dedupeCustomers = new LinkedHashSet<>(customers);
customers.clear();
customers.addAll(dedupeCustomers);
If the code in your question doesn't work, you probably have not implemented equals(Object) on the Customer class appropriately.
Presumably there is some key (let us call it customerId) that uniquely identifies a customer; e.g.
class Customer {
private String customerId;
...
An appropriate definition of equals(Object) would look like this:
public boolean equals(Object obj) {
if (obj == this) {
return true;
}
if (!(obj instanceof Customer)) {
return false;
}
Customer other = (Customer) obj;
return this.customerId.equals(other.customerId);
}
For completeness, you should also implement hashCode so that two Customer objects that are equal will return the same hash value. A matching hashCode for the above definition of equals would be:
public int hashCode() {
return customerId.hashCode();
}
It is also worth noting that this is not an efficient way to remove duplicates if the list is large. (For a list with N customers, you will need to perform N*(N-1)/2 comparisons in the worst case; i.e. when there are no duplicates.) For a more efficient solution you could use a HashSet to do the duplicate checking. Another option would be to use a LinkedHashSet as explained in Tom Hawtin's answer.
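Putting the pieces above together, a self-contained sketch might look like this. The name field, the constructor and the dedupe helper are assumptions added for illustration; only customerId takes part in equals and hashCode, as in the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;

final class Customer {
    private final String customerId; // assumed to identify a customer uniquely
    private final String name;       // hypothetical extra field, ignored by equals

    Customer(String customerId, String name) {
        this.customerId = Objects.requireNonNull(customerId);
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) return true;
        if (!(obj instanceof Customer)) return false;
        return customerId.equals(((Customer) obj).customerId);
    }

    @Override
    public int hashCode() {
        return customerId.hashCode();
    }

    @Override
    public String toString() {
        return customerId + ":" + name;
    }
}

class DedupeDemo {
    // Keeps the first occurrence of each customer, in the original order.
    static List<Customer> dedupe(List<Customer> customers) {
        return new ArrayList<>(new LinkedHashSet<>(customers));
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(
                new Customer("c1", "Ann"),
                new Customer("c2", "Bob"),
                new Customer("c1", "Ann (duplicate)"));
        System.out.println(dedupe(customers)); // [c1:Ann, c2:Bob]
    }
}
```

The hash-based version does roughly one lookup per element instead of comparing every pair.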
Java 8 update:
You can use a stream over the array, as below:
Arrays.stream(yourArray).distinct()
.collect(Collectors.toList());
Does Customer implement the equals() contract?
If it doesn't implement equals() and hashCode(), then listCustomer.contains(customer) will check whether the exact same instance already exists in the list (by instance I mean the exact same object: same memory address, etc.). If what you want is to test whether the same Customer (perhaps it's the same customer if they have the same customer name or customer number) is already in the list, then you need to override equals() so that it checks whether the relevant fields (e.g. customer names) match.
Note: Don't forget to override hashCode() if you are going to override equals()! Otherwise, you might run into trouble with your HashMaps and other data structures. For good coverage of why this is and what pitfalls to avoid, consider having a look at Josh Bloch's Effective Java chapters on equals() and hashCode() (the link only contains information about why you must implement hashCode() when you implement equals(), but there is good coverage of how to override equals() too).
By the way, is there an ordering restriction on your set? If there isn't, a slightly easier way to solve this problem is use a Set<Customer> like so:
Set<Customer> noDups = new HashSet<Customer>();
noDups.addAll(tmpListCustomer);
return new ArrayList<Customer>(noDups);
Which will nicely remove duplicates for you, since Sets don't allow duplicates. However, this will lose any ordering that was applied to tmpListCustomer, since HashSet has no defined iteration order (you can get around that by using a LinkedHashSet, which keeps insertion order, or a TreeSet, which sorts the elements, but that's not exactly related to your question). This can simplify your code a little bit.
List → Set → List (distinct)
Just add all your elements to a Set: it does not allow its elements to be repeated. If you need a list afterwards, use the new ArrayList(theSet) constructor (where theSet is your resulting set).
I suspect you might not have Customer.equals() implemented properly (or at all).
List.contains() uses equals() to verify whether any of its elements is equal to the object passed as a parameter. However, the default implementation of equals tests for physical identity, not value identity. So if you have not overridden it in Customer, it will return false for two distinct Customer objects having identical state.
Here are the nitty-gritty details of how to implement equals (and hashCode, which is its pair - you must practically always implement both if you need to implement either of them). Since you haven't shown us the Customer class, it is difficult to give more concrete advice.
As others have noted, you are better off using a Set rather than doing the job by hand, but even for that, you still need to implement those methods.
private void removeTheDuplicates(List<Customer> myList) {
for (ListIterator<Customer> iterator = myList.listIterator(); iterator.hasNext();) {
Customer customer = iterator.next();
if(Collections.frequency(myList, customer) > 1) {
iterator.remove();
}
}
System.out.println(myList.toString());
}
The "contains" method searches for an entry in the list for which Customer.equals(Object o) returns true. If you have not overridden equals(Object) in Customer or one of its parents, then it will only search for an existing occurrence of the same object. It may be that this is what you wanted, in which case your code should work. But if you want to avoid having two objects that both represent the same customer, then you need to override equals(Object) to return true when that is the case.
It is also true that using one of the implementations of Set instead of List would give you duplicate removal automatically, and faster (for anything other than very small Lists). You will still need to provide code for equals.
You should also override hashCode() when you override equals().
Nearly all of the above answers are right, but what I suggest is to use a Map or Set while creating the list in the first place, not afterwards, to gain performance, because converting a List to a Set or Map and then converting it back to a List is unnecessary extra work.
Sample Code:
Set<String> stringsSet = new LinkedHashSet<String>(); // a LinkedHashSet
// preserves the insertion order of the elements
for (String string: stringsList) {
stringsSet.add(string);
}
return new ArrayList<String>(stringsSet);
Two suggestions:
Use a HashSet instead of an ArrayList. This will speed up the contains() checks considerably if you have a long list
Make sure Customer.equals() and Customer.hashCode() are implemented properly, i.e. they should be based on the combined values of the underlying fields in the customer object.
As others have mentioned, you are probably not implementing equals() correctly.
However, you should also note that this code is considered quite inefficient, since the runtime could be the number of elements squared.
You might want to consider using a Set structure instead of a List, or building a Set first and then turning it into a List.
The cleanest way is:
List<XXX> lstConsultada = dao.findByPropertyList(YYY);
List<XXX> lstFinal = new ArrayList<XXX>(new LinkedHashSet<XXX>(lstConsultada));
and override hashCode and equals based on the ID properties of each entity.
IMHO, the best way to do it these days:
Suppose you have a Collection "dups" and you want to create another Collection containing the same elements but with all duplicates eliminated. The following one-liner does the trick.
Collection<collectionType> noDups = new HashSet<collectionType>(dups);
It works by creating a Set which, by definition, cannot contain duplicates.
Based on oracle doc.
The correct answer for Java is to use a Set. If you already have a List<Customer> and want to deduplicate it:
Set<Customer> s = new HashSet<Customer>(listCustomer);
Otherwise, just use a Set implementation (HashSet, TreeSet) directly and skip the List construction phase.
You will also need to override hashCode() and equals() on the domain classes that are put in the Set, to make sure that the behavior you want is actually what you get. equals() can be as simple as comparing the unique IDs of the objects or as complex as comparing every field. hashCode() can be as simple as returning the hashCode() of the unique ID's String representation.
Using java 8 stream api.
List<String> list = new ArrayList<>();
list.add("one");
list.add("one");
list.add("two");
System.out.println("Before values : " + list);
Collection<String> c = list.stream().collect(Collectors.toSet());
System.out.println("After Values : " + c);
Output:
Before values : [one, one, two]
After Values : [one, two]

How HashSet works with regards to hashCode()?

I'm trying to understand java.util.Collection and java.util.Map a little deeper, but I have some doubts about HashSet functionality:
In the documentation, it says: This class implements the Set interface, backed by a hash table (actually a HashMap instance). OK, so I can see that a HashSet always has a hash table working in the background. A hash table is a structure that asks for a key and a value every time you want to add a new element to it. Then the value and the key are stored in a bucket based on the key's hashCode. If the hash codes of two keys are the same, both entries are added to the same bucket, using a linked list. Please correct me if I said something wrong.
So, my question is: if a HashSet always has a hash table acting in the background, then every time we add a new element to the HashSet using the HashSet.add() method, the HashSet should add it to its internal hash table. But the hash table asks for a value and a key, so what key does it use? Does it just use the value we're trying to add as the key as well, and then take its hashCode? Please correct me if I said something wrong about the HashSet implementation.
Another question that I have is: in general, what classes can use the hashCode() method of a Java object? I'm asking this because the documentation says that every time we override the equals() method we need to override the hashCode() method. OK, it really makes sense, but my doubt is whether it's just a recommendation to keep everything 'nice and perfect' (so to speak), or whether it's really necessary, because maybe a lot of Java's standard classes constantly use the hashCode() method of your objects. As I see it, I can't think of classes using this method other than those related to Collections. Thank you very much, guys.
If you look at the actual Java source of HashSet, you can see what it does:
// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
...
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
So the element you are adding is the key in the backing HashMap, with a dummy value as the value. This dummy value is never actually used by the HashSet.
Your second question regarding overriding equals and hashCode:
It is really necessary to always override both if you want to override either one. This is because the contract for hashCode says equal objects must have the same hash code. The default implementation of hashCode will typically give different values for each instance.
Therefore, if you override equals() but not hashCode(), this could happen:
object1.equals(object2) //true
MySet.add(object1);
MySet.contains(object2); //false but should be true if we overrode hashcode()
Since contains uses the hash code to find the bucket to search in, we might get a different bucket back and not find the equal object.
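The effect described above can be reproduced with two small classes. This is just a sketch with made-up class names (EqualsOnly, EqualsAndHash); the first lookup is not strictly guaranteed to fail, because two identity hash codes could coincide by chance, so no fixed output is claimed for it.

```java
import java.util.HashSet;
import java.util.Set;

// equals() overridden, hashCode() inherited from Object: breaks the contract.
class EqualsOnly {
    final String id;
    EqualsOnly(String id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        return o instanceof EqualsOnly && id.equals(((EqualsOnly) o).id);
    }
}

// Both methods overridden consistently.
class EqualsAndHash {
    final String id;
    EqualsAndHash(String id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        return o instanceof EqualsAndHash && id.equals(((EqualsAndHash) o).id);
    }

    @Override
    public int hashCode() { return id.hashCode(); }
}

class HashContractDemo {
    public static void main(String[] args) {
        Set<EqualsOnly> broken = new HashSet<>();
        broken.add(new EqualsOnly("a"));
        // Almost certainly false: the two instances have unrelated identity hash codes.
        System.out.println(broken.contains(new EqualsOnly("a")));

        Set<EqualsAndHash> ok = new HashSet<>();
        ok.add(new EqualsAndHash("a"));
        System.out.println(ok.contains(new EqualsAndHash("a"))); // true
    }
}
```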
If you look at the source for HashSet (the source comes with the JDK and is very informative), you will see that it creates an object to use as the value:
// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
Each value that is added to the HashSet is used as a key to the backing HashMap with this PRESENT object as the value.
Regarding overriding equals() whenever you override hashCode() (and vice versa), it is very important that these two methods return consistent results. That is, they should agree with one another. For more details, see the book Effective Java by Josh Bloch.

How could a LinkedHashMap fail to find an entry produced by an iterator?

Under what circumstances given a correct implementation of hashCode and equals() can the following code return false?
myLinkedHashMap.containsKey(myLinkedHashMap.keySet().iterator().next())
Most likely scenario I can think of would be even though hashCode is "deterministic", it may be based on mutable fields. If you change the fields used to compute hashCode after it's put in the Map, then you won't be able to find it anymore.
Edit: should clarify you 'usually' won't be able to find it anymore. Occasionally it will still work since two numbers can still rehash into the same bucket. This, of course, only adds to the confusion when it happens!
Every hash algorithm I have seen is "deterministic", in that for a given set of input values, you get the same hash value.
If the hash code is computed based on mutable properties of the object, the hash code will change after it's in the hash map if any of those mutable properties are changed.
It's not clear what you mean by "deterministic", but any hash-changing mutation to the key after it's been inserted into the hash map could easily have that effect.
import java.util.*;
public class Test {
public static void main(String[] args) {
List<String> strings = new ArrayList<String>();
Map<List<String>, String> map = new LinkedHashMap<List<String>, String>();
map.put(strings, "");
System.out.println(map.containsKey(map.keySet().iterator().next())); // true
strings.add("Foo");
System.out.println(map.containsKey(map.keySet().iterator().next())); // false
}
}
The hash code of ArrayList<T> is deterministic, but that doesn't mean it won't change if the contents of the list changes.
If hashCode() is based on instance attributes that are mutable and those attributes are changed after the insertion, the hashCode() call during the lookup will return something different. And since equals() should be based on these same attributes, it can be expected to fail as well.
When another thread has removed all the remaining items from a Map in the middle of an iteration, there will be no more next().
I would not use the hashCode() values as keys; I would use the objects themselves.
If your hashCode and equals don't agree with one another, this could return false. For example, if the equals method always returns false, this will return false, since there isn't any object that would ever compare equal to the keys in the map.
Hope this helps!
You might want to check hasNext() first.
You could remove the first key in another thread between getting the first keys and calling containsKey.

Understanding contains method of Java HashSet

Newbie question about java HashSet
Set<User> s = new HashSet<User>();
User u = new User();
u.setName("name1");
s.add(u);
u.setName("name3");
System.out.println(s.contains(u));
Can someone explain why this code outputs false? Moreover, this code does not even call the equals method of User, although according to the sources of HashSet and HashMap it has to call it. The equals method of User simply calls equals on the user's name. The hashCode method returns the hashCode of the user's name.
If the hash code method is based on the name field, and you then change it after adding the object, then the second contains check will use the new hash value, and won't find the object you were looking for. This is because HashSets first search by hash code, so they won't bother calling equals if that search fails.
The only way this would work is if you hadn't overridden equals (and so the default reference equality was used) and you got lucky and the hash codes of the two objects were equal. But this is a really unlikely scenario, and you shouldn't rely on it.
In general, you should never update an object after you have added it to a HashSet if that change will also change its hashcode.
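A runnable version of the situation, plus one possible way to change such an object safely (remove it, modify it, add it back). The User class here is a guess at what the question describes, and the rename helper is hypothetical.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical User, as described in the question: equals and hashCode use the name.
class User {
    private String name;

    void setName(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        return o instanceof User && Objects.equals(name, ((User) o).name);
    }

    @Override
    public int hashCode() { return Objects.hashCode(name); }
}

class MutatedKeyDemo {
    // Safe way to rename: take the element out, change it, put it back.
    static void rename(Set<User> set, User user, String newName) {
        boolean wasPresent = set.remove(user);
        user.setName(newName);
        if (wasPresent) {
            set.add(user);
        }
    }

    public static void main(String[] args) {
        Set<User> s = new HashSet<>();
        User u = new User();
        u.setName("name1");
        s.add(u);

        u.setName("name3");                // hash code changes while u is in the set
        System.out.println(s.contains(u)); // false: the lookup uses the new hash code

        u.setName("name1");                // restore, so the set is consistent again
        rename(s, u, "name3");
        System.out.println(s.contains(u)); // true
    }
}
```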
Since your new User has a different hashcode, the HashSet knows that it isn't equal.
HashSets store their items according to their hashcodes.
The HashSet will only call equals if it finds an item with the same hashcode, to make sure that the two items are actually equal (as opposed to a hash collision)

Java List with Objects - find and replace (delete) entry if Object with certain attribute already exists

I've been working all day and I somehow can't get this probably easy task figured out - probably a lack of coffee...
I have a synchronizedList where some Objects are being stored. Those objects have a field which is something like an ID. These objects carry information about a user and his current state (simplified).
The point is, that I only want one object for each user. So when the state of this user changes, I'd like to remove the "old" entry and store a new one in the List.
protected static class Objects{
...
long time;
Object ID;
...
}
...
if (Objects.contains(ID)) {
Objects.remove(ID);
Objects.add(newObject);
} else {
Objects.add(newObject);
}
Obviously this is not the way to go but should illustrate what I'm looking for...
Maybe the data structure is not the best for this purpose but any help is welcome!
EDIT:
Added some information...
A Set does not really seem to fit my purpose. The Objects store some other fields besides the ID which change all the time. The purpose is, that the list will somehow represent the latest activities of a user. I only need to track the last state and only keep that object which describes this situation.
I think I will try out re-arranging my code with a Map and see if that works...
You could use a HashMap (or LinkedHashMap/TreeMap if order is important) with a key of ID and a value of Objects. With generics that would be HashMap<Object, Objects>();
Then you can use
if (map.containsKey(ID)) {
map.remove(ID);
}
map.put(newID, newObject);
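As a rough, self-contained sketch of the Map approach: put() on its own already replaces an existing mapping for the same key, so the explicit remove is mainly useful when the ID itself changes or when a LinkedHashMap should move the entry to the end. The UserState and LatestStateTracker classes are invented for this example, and the synchronized methods are an assumption based on the question's use of a synchronizedList.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the question's "Objects" class.
class UserState {
    final Object id;
    final long time;
    final String state;

    UserState(Object id, long time, String state) {
        this.id = id;
        this.time = time;
        this.state = state;
    }
}

class LatestStateTracker {
    private final Map<Object, UserState> latest = new HashMap<>();

    // put() replaces any existing mapping for the same key
    // and returns the old value (or null if there was none).
    synchronized UserState update(UserState newState) {
        return latest.put(newState.id, newState);
    }

    synchronized UserState get(Object id) { return latest.get(id); }

    synchronized int size() { return latest.size(); }
}

class LatestStateDemo {
    public static void main(String[] args) {
        LatestStateTracker tracker = new LatestStateTracker();
        tracker.update(new UserState("alice", 1L, "online"));
        tracker.update(new UserState("bob", 2L, "online"));
        UserState old = tracker.update(new UserState("alice", 3L, "away"));

        System.out.println(old.state);                  // online
        System.out.println(tracker.get("alice").state); // away
        System.out.println(tracker.size());             // 2
    }
}
```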
Alternatively, you could continue to use a List, but we can't just modify the collection while iterating, so instead we can use an iterator to remove the existing item, and then add the new item outside the loop (now that you're sure the old item is gone):
List<Objects> syncList = ...
for (Iterator<Objects> iterator = syncList.iterator(); iterator.hasNext();) {
Objects current = iterator.next();
if (current.getID().equals(ID)) {
iterator.remove();
}
}
syncList.add(newObject);
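With Java 8 or later, the same remove-then-add can be written with removeIf. Since the question uses a synchronizedList, the sketch below also holds the list's lock around both steps so that no other thread sees the list between the removal and the add. The Entry type and the replace helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SyncListReplace {
    // Hypothetical entry type with an ID and a state.
    static final class Entry {
        final String id;
        final String state;
        Entry(String id, String state) { this.id = id; this.state = state; }
    }

    // Removes any entry with the same ID, then appends the new one, as one atomic step.
    static void replace(List<Entry> syncList, Entry newEntry) {
        synchronized (syncList) { // the wrapper returned by synchronizedList locks on itself
            syncList.removeIf(e -> e.id.equals(newEntry.id));
            syncList.add(newEntry);
        }
    }

    public static void main(String[] args) {
        List<Entry> list = Collections.synchronizedList(new ArrayList<>());
        replace(list, new Entry("alice", "online"));
        replace(list, new Entry("bob", "online"));
        replace(list, new Entry("alice", "away"));

        System.out.println(list.size());       // 2
        System.out.println(list.get(1).state); // away
    }
}
```

This is still a linear scan per update, so for many users the Map approach is likely the better fit.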
And can't you use a Set, so that only the first one is stored? It is basically precisely what you require.
You could use a HashSet to store the objects and then override the hashCode method in the class that the HashSet will contain to return the hashcode of your identifying field.
A Map is easiest, but a Set reflects your logic better. In that case I'd advise a Set.
There are two ways to use a Set, depending on the equals and hashCode of your data object.
If YourObject already uses the ID object to determine equals (and hashCode obeys the contract), you can use any Set you want; a HashSet is probably best then.
If YourObject's business logic requires a different equals, taking into account multiple fields besides the ID field, then a custom comparator should be used. A TreeSet is a Set which can use such a Comparator.
An example:
Comparator<MyObject> comp = new Comparator<MyObject>() {
    @Override
    public int compare(MyObject o1, MyObject o2) {
        // NOTE this compare is not very good: it obeys the Comparator contract but
        // is not consistent with equals, because two different IDs can share a hash
        // code, so compare() == 0 does not imply equals() == true here.
        // Better to use some more fields.
        return Integer.compare(o1.getId().hashCode(), o2.getId().hashCode());
    }
    // Comparator.equals(Object) compares comparators, not elements,
    // so it is not overridden here.
};
Set<MyObject> myObjects = new TreeSet<MyObject>(comp);
EDIT
I have updated the code above to reflect that id is not an int, as suggested by the question.
My first option would be a HashSet. This would require that you override the hashCode and equals methods (don't forget: if you override one, override the other consistently!) so that objects with the same ID field are considered equal.
But this might break something if this assumption is NOT to be made in other parts of your application. In that case you might opt for using a HashMap (with the ID as key) or implement your own MyHashSet class (backed by such a HashMap).
