Java Set removes "complex object" - java

I'm always confused about how the Java Collections (Set, Map) remove a "complex object", by which I mean an instance of a self-defined class rather than just a primitive type.
I'm experimenting like:
public class Main {
public static void main(String[] args) {
// set
Set<Node> set = new HashSet<>();
set.add(new Node(1,2));
set.add(new Node(3,4));
System.out.println(set);
set.remove(new Node(1,2));
System.out.println(set + "\n");
// tree set
TreeSet<Node> tset = new TreeSet<>((a, b) -> a.name - b.name);
tset.add(new Node(1,2));
tset.add(new Node(3,4));
System.out.println(tset);
tset.remove(new Node(1,2));
System.out.println(tset);
}
}
class Node {
int name;
int price;
Node(int name, int price) {
this.name = name;
this.price = price;
}
}
In the example above, the printout would be:
Set:
[Node@5ba23b66, Node@2ff4f00f]
[Node@5ba23b66, Node@2ff4f00f]
TreeSet:
[Node@48140564, Node@58ceff1]
[Node@58ceff1]
Obviously, the general Set can't remove new Node(1, 2), which is treated as a different object. But interestingly the TreeSet can remove it, which I think is because the hashing is based on the lambda comparator I defined here?
And if I change the call to remove new Node(1, 6), interestingly the printout is the same, so the remove in TreeSet is apparently based only on the name value.
I think I still lack a deep understanding of how a Set builds up its hashing and how a comparator affects this.

For HashMap and HashSet, you need to override hashCode() and equals(Object) so that if two objects are equal, they have equal hash codes. E.g., in your case, you could implement it like this:
@Override
public boolean equals(Object o) {
if (o == null || getClass() != o.getClass()) {
return false;
}
Node node = (Node) o;
return name == node.name && price == node.price;
}
@Override
public int hashCode() {
return Objects.hash(name, price);
}
For TreeMap and TreeSet, the notion of equality is based on a comparison (whether the class implements Comparable, or you supply a custom Comparator). In the code you provided, you have a custom Comparator that only takes the name into consideration, so it considers any two Nodes with the same name to be equal, regardless of their price.
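Both behaviours can be seen side by side in a small sketch. It assumes a Node class like the one in the question with equals and hashCode added; the class name RemoveDemo and the two helper methods are only illustrative.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

public class RemoveDemo {
    // Node from the question, with equals/hashCode based on both fields.
    static class Node {
        final int name;
        final int price;

        Node(int name, int price) {
            this.name = name;
            this.price = price;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Node node = (Node) o;
            return name == node.name && price == node.price;
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, price);
        }
    }

    // HashSet lookup uses hashCode() and equals().
    static boolean removeFromHashSet(Node toRemove) {
        Set<Node> set = new HashSet<>();
        set.add(new Node(1, 2));
        set.add(new Node(3, 4));
        return set.remove(toRemove);
    }

    // TreeSet lookup uses only the comparator; this one looks at name alone.
    static boolean removeFromTreeSet(Node toRemove) {
        Comparator<Node> byName = Comparator.comparingInt(n -> n.name);
        Set<Node> set = new TreeSet<>(byName);
        set.add(new Node(1, 2));
        set.add(new Node(3, 4));
        return set.remove(toRemove);
    }

    public static void main(String[] args) {
        System.out.println(removeFromHashSet(new Node(1, 2))); // true
        System.out.println(removeFromHashSet(new Node(1, 6))); // false
        System.out.println(removeFromTreeSet(new Node(1, 6))); // true
    }
}
```

With equals and hashCode in place the HashSet removes an equal Node, while the TreeSet keeps ignoring price because its comparator never looks at it.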

The Javadoc comes to the rescue:
https://docs.oracle.com/javase/7/docs/api/java/util/Set.html#remove(java.lang.Object)
Removes the specified element from this set if it is present (optional
operation). More formally, removes an element e such that (o==null ?
e==null : o.equals(e)), if this set contains such an element. Returns
true if this set contained the element (or equivalently, if this set
changed as a result of the call). (This set will not contain the
element once the call returns.)
So just change your Node class to override the equals(Object o) method with your own (and, for a HashSet, hashCode() as well so that equal objects share a hash code).

Related

Distinguish objects by different fields for different contexts

Let's say there is an immutable class like this:
public class Foo {
private final Long id;
private final String name;
private final LocalDate date;
public Foo(Long id, String name, LocalDate date) {
this.id = id;
this.name = name;
this.date = date;
}
public Long getId() {
return id;
}
public String getName() {
return name;
}
public LocalDate getDate() {
return date;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Foo foo = (Foo) o;
return Objects.equals(getId(), foo.getId()) &&
Objects.equals(getName(), foo.getName()) &&
Objects.equals(getDate(), foo.getDate());
}
@Override
public int hashCode() {
return Objects.hash(getId(), getName(), getDate());
}
}
There is a collection of objects of this class. In some cases, it is required to distinguish only by name and in some cases by name and date.
So passing the collection to a java.util.Set<Foo>, or creating a Java 8 Stream<Foo> and calling .distinct(), does not work for this case.
I know it is possible to distinguish using TreeSet and Comparator. It looks like this:
private Set<Foo> distinct(List<Foo> foos, Comparator<Foo> comparator) {
TreeSet<Foo> treeSet = new TreeSet<>(comparator);
treeSet.addAll(foos);
return treeSet;
}
usage:
distinct(foos, Comparator.comparing(Foo::getName)); // distinct by name
distinct(foos, Comparator.comparing(Foo::getName).thenComparing(Foo::getDate)); // distinct by name and date
But I think that is not a good way to do it.
What's the most elegant way to solve this problem?
First, let's consider your current approach, then I'll show a better alternative.
Your current approach is succinct and already uses a TreeSet, which is all you need here. If you are OK with the O(n log n) complexity imposed by the red-black tree structure behind it, I would only change your current code to:
public static <T> Set<T> distinct(
Collection<? extends T> list,
Comparator<? super T> comparator) {
Set<T> set = new TreeSet<>(comparator);
set.addAll(list);
return set;
}
Note that I've made your method generic and static, so that it can be used in a generic way for any collection, no matter the type of its elements. I've also changed the first argument to Collection, so that it can be used with more data structures.
Also, TreeSet still has O(nlogn) time complexity because it uses a TreeMap as its backing structure.
The usage of TreeSet has 3 disadvantages: first, it sorts your elements according to the passed Comparator (maybe you don't need this); second, time complexity is O(nlogn) (which might be way too much if all you require is to have distinct elements); and third, it returns a Set (which might not be the type of collection the caller needs).
So, here's another approach that returns a Stream, which you can then collect to the data-structure you want:
public static <T> Stream<T> distinctBy(
Collection<? extends T> list,
Function<? super T, ?>... extractors) {
Map<List<Object>, T> map = new LinkedHashMap<>(); // preserves insertion order
list.forEach(e -> {
List<Object> key = new ArrayList<>();
Arrays.asList(extractors)
.forEach(f -> key.add(f.apply(e))); // builds key
map.merge(key, e, (oldVal, newVal) -> oldVal); // keeps old value
});
return map.values().stream();
}
This converts every element of the passed collection to a list of objects, according to the extractor functions passed as the varargs argument.
Then, each element is put into a LinkedHashMap with this key and merged by means of preserving the initially put value (change this as per your needs).
Finally, a stream is returned from the values of the map, so that the caller can do whatever she wants with it.
Note: this approach requires that all the objects returned by the extractor functions implement the equals and hashCode methods consistently, so that the list formed by them can be safely used as the key of the map.
Usage:
List<Foo> result1 = distinctBy(foos, Foo::getName)
.collect(Collectors.toList());
Set<Foo> result2 = distinctBy(foos, Foo::getName, Foo::getDate)
.collect(Collectors.toSet());
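If a plain stream filter is enough, a commonly used alternative is a stateful predicate that remembers the keys it has already seen. The sketch below is a different technique from the distinctBy method above, not a variant of it. The names distinctByKey and Item are illustrative (Item is a hypothetical stand-in for Foo), keys must not be null because of the concurrent key set, and first-wins ordering is only dependable on a sequential stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class DistinctByKeyDemo {
    // Hypothetical stand-in for Foo, reduced to two fields.
    static class Item {
        final String name;
        final int year;

        Item(String name, int year) {
            this.name = name;
            this.year = year;
        }

        String getName() { return name; }
        int getYear() { return year; }
    }

    // Returns a predicate that accepts an element only the first time its key is seen.
    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        return t -> seen.add(keyExtractor.apply(t));
    }

    static List<Item> distinctByName(List<Item> items) {
        return items.stream()
                .filter(distinctByKey(Item::getName))
                .collect(Collectors.toList());
    }

    // Composite key: a List of the fields has value-based equals/hashCode.
    static List<Item> distinctByNameAndYear(List<Item> items) {
        return items.stream()
                .filter(distinctByKey(i -> Arrays.asList(i.getName(), i.getYear())))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item("a", 2020), new Item("a", 2021),
                new Item("b", 2020), new Item("a", 2020));
        System.out.println(distinctByName(items).size());        // 2
        System.out.println(distinctByNameAndYear(items).size()); // 3
    }
}
```

A fresh predicate has to be created for every stream, because the set of seen keys lives inside it.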

Using PriorityQueue with any Comparator in Java

Common question: how can I use different Comparators of a custom class to order its objects in a PriorityQueue?
I tried to do that by using these comparators with matching pairs of priority queues and lists of the objects, expecting similar sorting results, in the following code:
class User{
private Integer id;
private String name;
public User(Integer i, String n){
this.id=i;
this.name=n;
}
public Integer getId() {return id;}
public String getName() {return name;}
@Override
public boolean equals(Object obj) {
if (this == obj)return true;
if (obj == null)return false;
if (getClass() != obj.getClass())return false;
User other = (User) obj;
if(id == null){
if (other.id != null)return false;
}else if(!id.equals(other.id))return false;
return true;
}
@Override
public String toString() {return "[id:" + id + ", name:" + name + "]";}
}
public class MyPriorityQueue {
public static Comparator<User> cmpId = Comparator.comparingInt(x -> x.getId());
public static Comparator<User> cmpNameLength = Comparator.comparingInt(x -> x.getName().length());
public static void main(String[] args) {
List<User> users = new ArrayList<User>(10);
users.add(new User(1,"11111"));
users.add(new User(3,"333"));
users.add(new User(5,"5"));
users.add(new User(4,"44"));
users.add(new User(2,"2222"));
Queue<User> ids = new PriorityQueue<User>(10, cmpId); //use first comparator
users.forEach(x-> ids.offer(x));
Queue<User> names = new PriorityQueue<User>(10, cmpNameLength); //use second comparator
names.addAll(users);
System.out.println("Variant_1.1:");
ids.forEach(System.out::println);
System.out.println("Variant_2.1:");
names.forEach(System.out::println);
System.out.println("Variant_1.2:");
users.sort(cmpId); //use first comparator
users.forEach(System.out::println);
System.out.println("Variant_2.2:");
users.sort(cmpNameLength); //use second comparator
users.forEach(System.out::println);
}
}
Output:
Variant_1.1: // Failed: queue not sorted by user.id using comparator cmpId
[id:1, name:11111]
[id:2, name:2222]
[id:5, name:5]
[id:4, name:44]
[id:3, name:333]
Variant_2.1: // Failed: queue not sorted by length of user.name using cmpNameLength
[id:5, name:5]
[id:4, name:44]
[id:3, name:333]
[id:1, name:11111]
[id:2, name:2222]
Variant_1.2: // OK: list correctly sorted by user.id with the cmpId comparator
[id:1, name:11111]
[id:2, name:2222]
[id:3, name:333]
[id:4, name:44]
[id:5, name:5]
Variant_2.2: //OK: for list by length of the user.name with cmpNameLength
[id:5, name:5]
[id:4, name:44]
[id:3, name:333]
[id:2, name:2222]
[id:1, name:11111]
I expected the queue results (variants 1.1 and 2.1) to match the corresponding list results (variants 1.2 and 2.2), but they were different.
My questions: what have I done wrong in ordering the PriorityQueue with a comparator, and how can I get the same sorted result for the priority queue as for the corresponding list in my example?
You haven't done anything wrong, it's just that PriorityQueue's iterator is:
not guaranteed to traverse the elements of the priority queue in any particular order (Javadoc)
The forEach method internally uses the iterator, so the same problem exists.
This is because the underlying data structure is such that it "sorts" as you dequeue items. If the implementor wanted to return items in sorted order, they would have had to first collect the items and then sort them before returning them. This incurs a performance hit, and so (I presume) it was decided to return them unordered, because PriorityQueue is primarily a queue rather than a sorted collection, and a user can always sort the items themselves (which is as efficient as it gets).
In order to obtain the elements ordered, do something like:
while(pq.peek() != null){
System.out.println(pq.poll());
}
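Note that the loop above empties the queue. If the queue is still needed afterwards, one option is to drain a copy instead. This is a minimal sketch; the helper name inPriorityOrder is made up for illustration, and it relies on the PriorityQueue copy constructor keeping the source queue's ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class DrainDemo {
    // Returns the queue's elements in priority order without emptying the original queue.
    static <T> List<T> inPriorityOrder(PriorityQueue<T> queue) {
        PriorityQueue<T> copy = new PriorityQueue<>(queue); // copy keeps the same comparator
        List<T> ordered = new ArrayList<>(copy.size());
        while (!copy.isEmpty()) {
            ordered.add(copy.poll()); // poll() always removes the current head
        }
        return ordered;
    }

    public static void main(String[] args) {
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        PriorityQueue<String> names = new PriorityQueue<>(byLength);
        names.addAll(Arrays.asList("11111", "333", "5", "44", "2222"));

        System.out.println(inPriorityOrder(names)); // [5, 44, 333, 2222, 11111]
        System.out.println(names.size());           // 5, the original is untouched
    }
}
```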
In your code sample, you are using Iterable#forEach to iterate through the queues.
ids.forEach(System.out::println);
names.forEach(System.out::println);
forEach ultimately delegates into Iterable#iterator. However, it's important to note that the subclass override in PriorityQueue#iterator has different JavaDocs with a special note about ordering.
Returns an iterator over the elements in this queue. The iterator does not return the elements in any particular order.
In other words, there is no guarantee that iterating over a PriorityQueue will use your Comparator. If instead you changed your code to drain the queue by repeatedly calling PriorityQueue#poll, then I expect you'd see results ordered according to your custom Comparator.
Digging into the OpenJDK source, we can see that the internal data structure inside PriorityQueue is a binary heap. This is backed by an array, and as callers add and remove elements of the queue, the code internally maintains the heap invariant.
/**
* Priority queue represented as a balanced binary heap: the two
* children of queue[n] are queue[2*n+1] and queue[2*(n+1)]. The
* priority queue is ordered by comparator, or by the elements'
* natural ordering, if comparator is null: For each node n in the
* heap and each descendant d of n, n <= d. The element with the
* lowest value is in queue[0], assuming the queue is nonempty.
*/
transient Object[] queue; // non-private to simplify nested class access
However, the internal Iterator implementation simply uses an integer cursor to scan forward through that array, with no consideration of element priorities or heap layout.
return (E) queue[lastRet = cursor++];
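To see why a plain array scan is not sorted, here is a deliberately simplified int-only min-heap that uses the same index layout as the comment above. It is an illustration only, not the OpenJDK implementation, and the class name MiniHeapDemo is made up. The values are inserted in the same order as the user ids in the question.

```java
import java.util.Arrays;

public class MiniHeapDemo {
    private int[] queue = new int[16];
    private int size = 0;

    // Appends the value, then sifts it up until its parent is not larger.
    void offer(int value) {
        if (size == queue.length) {
            queue = Arrays.copyOf(queue, size * 2);
        }
        int k = size++;
        while (k > 0) {
            int parent = (k - 1) / 2; // children of n live at 2*n+1 and 2*(n+1)
            if (queue[parent] <= value) {
                break;
            }
            queue[k] = queue[parent]; // move the larger parent down
            k = parent;
        }
        queue[k] = value;
    }

    // What a cursor-based iterator would see: the raw array order.
    int[] arrayOrder() {
        return Arrays.copyOf(queue, size);
    }

    public static void main(String[] args) {
        MiniHeapDemo heap = new MiniHeapDemo();
        for (int id : new int[] {1, 3, 5, 4, 2}) {
            heap.offer(id);
        }
        System.out.println(Arrays.toString(heap.arrayOrder())); // [1, 2, 5, 4, 3]
    }
}
```

Every parent is smaller than its children, so the heap invariant holds, yet the array itself is only partially ordered.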

Element is present but `Set.contains(element)` returns false

How can an element not be contained in the original set but in its unmodified copy?
The original set does not contain the element while its copy does. See image.
The following method returns true, although it should always return false. The implementation of c and clusters is in both cases HashSet.
public static boolean confumbled(Set<String> c, Set<Set<String>> clusters) {
return (!clusters.contains(c) && new HashSet<>(clusters).contains(c));
}
Debugging has shown that the element is contained in the original, but Set.contains(element) returns false for some reason. See image.
Could somebody please explain to me what's going on?
If you change an element in the Set (in your case the elements are Set<String>, so adding or removing a String will change them), Set.contains(element) may fail to locate it, since the hashCode of the element will be different than what it was when the element was first added to the HashSet.
When you create a new HashSet containing the elements of the original one, the elements are added based on their current hashCode, so Set.contains(element) will return true for the new HashSet.
You should avoid putting mutable instances in a HashSet (or using them as keys in a HashMap), and if you can't avoid it, make sure you remove the element before you mutate it and re-add it afterwards. Otherwise your HashSet will be broken.
An example:
Set<String> set = new HashSet<String>();
set.add("one");
set.add("two");
Set<Set<String>> setOfSets = new HashSet<Set<String>>();
setOfSets.add(set);
boolean found = setOfSets.contains(set); // returns true
set.add("three");
Set<Set<String>> newSetOfSets = new HashSet<Set<String>>(setOfSets);
found = setOfSets.contains(set); // returns false
found = newSetOfSets.contains(set); // returns true
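The advice to remove the element before mutating it and to re-add it afterwards could look roughly like this. The helper name addToMember is hypothetical; it is only a sketch of the pattern, not a general-purpose utility.

```java
import java.util.HashSet;
import java.util.Set;

public class MutateSafelyDemo {
    // Removes the element, applies the change, then re-adds it under its new hash code.
    static void addToMember(Set<Set<String>> setOfSets, Set<String> member, String value) {
        boolean wasPresent = setOfSets.remove(member); // located via the current hash code
        member.add(value);                             // the hash code may change here
        if (wasPresent) {
            setOfSets.add(member);                     // stored again under the new hash code
        }
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        set.add("one");
        set.add("two");
        Set<Set<String>> setOfSets = new HashSet<>();
        setOfSets.add(set);

        addToMember(setOfSets, set, "three");
        System.out.println(setOfSets.contains(set)); // true
    }
}
```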
The most common reason for this is that the element or key was altered after insertion resulting in a corruption of the underlying data structure.
Note: when you add a reference to a Set<String> to a Set<Set<String>>, you are adding a copy of the reference. The underlying Set<String> is not copied, so if you alter it, those changes affect the Set<Set<String>> you put it into.
e.g.
Set<String> s = new HashSet<>();
Set<Set<String>> ss = new HashSet<>();
ss.add(s);
assert ss.contains(s);
// altering the set after adding it corrupts the HashSet
s.add("Hi");
// there is a small chance it may still find it.
assert !ss.contains(s);
// build a correct structure by copying it.
Set<Set<String>> ss2 = new HashSet<>(ss);
assert ss2.contains(s);
s.add("There");
// not again.
assert !ss2.contains(s);
If the primary Set was a TreeSet (or perhaps some other NavigableSet) then it is possible, if your objects are imperfectly compared, for this to happen.
The critical point is that HashSet.contains looks like:
public boolean contains(Object o) {
return map.containsKey(o);
}
and map is a HashMap and HashMap.containsKey looks like:
public boolean containsKey(Object key) {
return getNode(hash(key), key) != null;
}
so it uses the hashCode of the key to check for presence.
A TreeSet, however, uses a TreeMap internally, and the lookup behind its containsKey (getEntry) looks like:
final Entry<K,V> getEntry(Object key) {
// Offload comparator-based version for sake of performance
if (comparator != null)
return getEntryUsingComparator(key);
...
So it uses a Comparator to find the key.
So, in summary, if your hashCode method does not agree with your Comparator's compare method (say compare always returns 1 while hashCode returns different values) then you will see this kind of obscure behaviour.
class BadThing {
final int hash;
public BadThing(int hash) {
this.hash = hash;
}
@Override
public int hashCode() {
return hash;
}
@Override
public String toString() {
return "BadThing{" + "hash=" + hash + '}';
}
}
public void test() {
Set<BadThing> primarySet = new TreeSet<>(new Comparator<BadThing>() {
@Override
public int compare(BadThing o1, BadThing o2) {
return 1;
}
});
// Make the things.
BadThing bt1 = new BadThing(1);
primarySet.add(bt1);
BadThing bt2 = new BadThing(2);
primarySet.add(bt2);
// Make the secondary set.
Set<BadThing> secondarySet = new HashSet<>(primarySet);
// Have a poke around.
test(primarySet, bt1);
test(primarySet, bt2);
test(secondarySet, bt1);
test(secondarySet, bt2);
}
private void test(Set<BadThing> set, BadThing thing) {
System.out.println(thing + " " + (set.contains(thing) ? "is" : "NOT") + " in <" + set.getClass().getSimpleName() + ">" + set);
}
prints
BadThing{hash=1} NOT in <TreeSet>[BadThing{hash=1}, BadThing{hash=2}]
BadThing{hash=2} NOT in <TreeSet>[BadThing{hash=1}, BadThing{hash=2}]
BadThing{hash=1} is in <HashSet>[BadThing{hash=1}, BadThing{hash=2}]
BadThing{hash=2} is in <HashSet>[BadThing{hash=1}, BadThing{hash=2}]
so even though the object is in the TreeSet it is not finding it because the comparator never returns 0. However, once it is in the HashSet all is fine because HashSet uses hashCode to find it and they behave in a valid way.
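For contrast, a comparator that returns 0 for matching elements makes the TreeSet lookup succeed. The sketch below uses a hypothetical Thing class and two illustrative comparators, one broken in the same way as above and one that compares by id.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

public class ComparatorContractDemo {
    static class Thing {
        final int id;

        Thing(int id) {
            this.id = id;
        }
    }

    // Never returns 0, so no lookup can ever report a match.
    static final Comparator<Thing> ALWAYS_ONE = (a, b) -> 1;

    // Returns 0 exactly when the ids are equal.
    static final Comparator<Thing> BY_ID = Comparator.comparingInt(t -> t.id);

    // Adds two things to a TreeSet and reports whether the first can be found again.
    static boolean foundWith(Comparator<Thing> comparator) {
        Set<Thing> set = new TreeSet<>(comparator);
        Thing first = new Thing(1);
        set.add(first);
        set.add(new Thing(2));
        return set.contains(first);
    }

    public static void main(String[] args) {
        System.out.println(foundWith(ALWAYS_ONE)); // false
        System.out.println(foundWith(BY_ID));      // true
    }
}
```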

Comparing two Lists of a class without iterating the lists

I have a class Abc as below
public class Abc {
int[] attributes;
Abc(int[] attributes){
this.attributes = attributes;
}
}
Overriding the Abc hash code as below
@Override
public int hashCode() {
int hashCode = 0;
int multiplier = 1;
for(int i = attributes.length-1 ; i >= 0 ; i--){
hashCode = hashCode+(attributes[i]*multiplier);
multiplier = multiplier*10;
}
return hashCode;
}
I am using above class to create a list of objects and I want to compare whether the two lists are equal i.e. lists having objects with same attributes.
List<Abc> list1 = new ArrayList<>();
list1.add(new Abc(new int[]{1,2,4}));
list1.add(new Abc(new int[]{5,8,9}));
list1.add(new Abc(new int[]{3,4,2}));
List<Abc> list2 = new ArrayList<>();
list2.add(new Abc(new int[]{5,8,9}));
list2.add(new Abc(new int[]{3,4,2}));
list2.add(new Abc(new int[]{1,2,4}));
How can I compare the above two lists, with or without iterating over each list? Also, is there any better way to override the hashCode, so that two classes having the same attributes(values and order) should be equal?
You have to override the equals method in your class Abc. If you are using an IDE, it can generate something good enough. For example, Eclipse produces the following:
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (obj == null) {
return false;
}
if (getClass() != obj.getClass()) {
return false;
}
Abc other = (Abc) obj;
if (!Arrays.equals(attributes, other.attributes)) {
return false;
}
return true;
}
With this equals method, you can now check whether two instances of Abc are equal.
If you want to compare your two lists list1 and list2, unfortunately you can not simply do
boolean listsAreEqual = list1.equals(list2); // will be false
because that would not only check if the elements in the lists are the same but also if they are in the same order. What you can do is to compare two sets, because in sets, the elements have no order.
boolean setAreEqual = new HashSet<Abc>(list1).equals(new HashSet<Abc>(list2)); // will be true.
Note that in that case, you should keep your implementation of hashCode() in Abc, for the HashSet to function well. As a general rule, a class that implements equals should also implement hashCode.
The problem with a Set (a HashSet is a Set) is that by design it will not contain several objects that are equal to each other. Objects are guaranteed to be unique in a set. For example, if you add another new Abc(new int[]{5,8,9}) to the second set, the two sets will still be equal.
If that bothers you, a possible solution is either to compare the two lists after sorting them (for that you have to provide a comparator or implement compareTo), or to use Guava's HashMultiset, which is an unordered container that can contain the same object multiple times.
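If pulling in Guava is not an option, the multiset idea can be approximated with the plain JDK by counting occurrences in a HashMap. This is only a sketch with made-up method names, and like HashSet it relies on the element type implementing equals and hashCode consistently.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UnorderedListEquality {
    // Counts how many times each element occurs, relying on equals/hashCode.
    static <T> Map<T, Integer> counts(List<T> list) {
        Map<T, Integer> result = new HashMap<>();
        for (T element : list) {
            result.merge(element, 1, Integer::sum);
        }
        return result;
    }

    // True when both lists hold the same elements the same number of times, in any order.
    static <T> boolean sameElementsIgnoringOrder(List<T> a, List<T> b) {
        return a.size() == b.size() && counts(a).equals(counts(b));
    }

    public static void main(String[] args) {
        List<Integer> first = Arrays.asList(1, 2, 2);
        List<Integer> reordered = Arrays.asList(2, 1, 2);
        List<Integer> otherCounts = Arrays.asList(1, 1, 2);

        System.out.println(sameElementsIgnoringOrder(first, reordered));   // true
        System.out.println(sameElementsIgnoringOrder(first, otherCounts)); // false
    }
}
```

Unlike the HashSet comparison, the second call returns false even though both lists contain the same distinct values.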
Override the equals method to compare objects. As the comments mention, you should override the hashCode method as well when overriding equals.
By this
so that two classes having the same attributes(values and order) should be equal.
I think you mean two objects having the same attributes.
You can try something like this:
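@Override
public boolean equals(Object o) {
if (!(o instanceof Abc)) {
return false;
}
Abc instance = (Abc) o;
int[] array = instance.attributes;
if (array.length != this.attributes.length) {
return false; // arrays of different lengths can never be equal
}
for (int i = 0; i < array.length; i++) {
if (array[i] != this.attributes[i]) {
return false;
}
}
return true; // every position matched
}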
Edit: As for hashCode, the contract is that when
object1.equals(object2)
is true, then
object1.hashCode()
and
object2.hashCode()
must return the same value. The hashCode() of an object should also stay the same for as long as the fields used in equals are unchanged. So if the hash code is computed from instance variables that can be modified, a different hash code may be generated once a value changes, which breaks any hash-based collection that already holds the object.
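One way to satisfy both requirements for a class like Abc is to make it immutable and delegate to Arrays.equals and Arrays.hashCode. The sketch below is an assumption about how Abc could be written, not the asker's code; the defensive copy in the constructor is an added design choice so that later changes to the caller's array cannot alter the hash code.

```java
import java.util.Arrays;

public class AbcDemo {
    static final class Abc {
        private final int[] attributes;

        Abc(int[] attributes) {
            // Defensive copy: the caller's array can change without affecting this object.
            this.attributes = attributes.clone();
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (obj == null || getClass() != obj.getClass()) return false;
            return Arrays.equals(attributes, ((Abc) obj).attributes);
        }

        @Override
        public int hashCode() {
            // Depends on values and order, just like Arrays.equals.
            return Arrays.hashCode(attributes);
        }
    }

    public static void main(String[] args) {
        Abc a = new Abc(new int[] {1, 2, 4});
        Abc b = new Abc(new int[] {1, 2, 4});
        Abc c = new Abc(new int[] {4, 2, 1});

        System.out.println(a.equals(b) && a.hashCode() == b.hashCode()); // true
        System.out.println(a.equals(c));                                 // false
    }
}
```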

How do I add an item to a linked list in Java?

Using a Comparator and Iterator, I am trying to add objects into a linked list in order. So far, I have the following:
public class ComparatorClass implements Comparator<Integer> {
public int compare(Integer int1, Integer int2) {
return int1.compareTo(int2);
}
}
and:
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
public class OrderedListInheritance implements LinkedList {
ArrayList<Object> myList = new ArrayList<Object>();
Comparator comp = new ComparatorClass();
OrderedListInheritance(Comparator c) {
this.comp = c;
}
@Override
public void add(Object o) {
addLast(o);
}
@Override
public void addAtIndex(int index, Object o) {
Iterator it = getIterator();
while (it.hasNext()) {
Object element = it.next();
if (comp.compare(element, o) < 0) {
}else if (comp.compare(element, o) == 0) {
}else{
myList.add(o);
}
}
}
@Override
public void addFirst(Object o) {
addAtIndex(0, o);
}
@Override
public void addLast(Object o) {
addAtIndex(myList.size(), o);
}
@Override
public Object get(int index) {
return myList.get(index);
}
@Override
public Iterator getIterator() {
Iterator iter = myList.iterator();
return iter;
}
@Override
public int indexOf(Object o) {
return myList.indexOf(o);
}
}
I am unsure how to use the Iterator in conjunction with the comparator to add each element to the Linked List in order. Can somebody help me with the logic?
Your comparator is wrong.
Part of the general contract for a comparator is that if compare(a, b) is positive, compare(b, a) is negative.
If you pass in a comparator that does not fulfil the comparator contract, you're going to get undefined behaviour.
If you implement the add method to insert the element in sorted order (or anywhere but the end of the list), you are violating the contract of the List interface. Semantically, it's not a List, and it isn't safe to pass it to any code that is expecting one. Pretending to implement the List interface will only lead to trouble.
How about using a TreeSet instead?
Set<Integer> list = new TreeSet<Integer>();
Of course, a Set will not permit duplicate elements.
If you want something that allows duplicates, but still allows efficient, in-order retrieval, try a heap-based collection, like PriorityQueue.
Is there any reason that you can't just use a normal ArrayList and then call Collections.sort(list) or Collections.sort(list, comparator)?
I would write it like this:
public class IntegerComparator
implements Comparator<Integer>
{
public int compare(final Integer a, final Integer b)
{
return (a.compareTo(b));
}
}
I was going to comment more on the code... but what you have given won't work... you declare an ArrayList but then want to use a Comparator and you are not casting... so that won't compile.
The other issue is that you should not use == 1 you should use < 0 and > 0 since the comparator may not return 1, 0, -1 but other numbers as well. Also you are not handling all cases a < b, a == b, and a > b.
Are we doing your homework? Your problem seems to be not so much with the comparator interface as a clear understanding of what you want it to do. This sounds to me like a perfect place to advocate a test driven development style. Start by writing tests to insert into an empty list, insert to the head of a list, insert to the tail, insert in the middle of a list of length 2. Then write tests to return the n'th element and the element with a given value. After you have these simple cases working it will be easy to find the element in an ordered list that is the first one larger than the element to be inserted and add the element in front of the larger one. Don't forget the edge cases of adding a duplicate value, a value smaller than any in the list and a value larger than any in the list.
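The last step described above (find the first element larger than the new one and insert in front of it) can be sketched with a ListIterator. This is one possible shape under the assumption that a standard java.util.List is acceptable; the method name insertSorted is made up and it does not implement the asker's own LinkedList interface.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

public class SortedInsertDemo {
    // Inserts the element in front of the first existing element that is larger.
    static <T> void insertSorted(List<T> list, T element, Comparator<? super T> comp) {
        ListIterator<T> it = list.listIterator();
        while (it.hasNext()) {
            if (comp.compare(it.next(), element) > 0) {
                it.previous(); // step back so add() lands before the larger element
                break;
            }
        }
        it.add(element); // appended at the end if no larger element was found
    }

    public static void main(String[] args) {
        Comparator<Integer> natural = Comparator.naturalOrder();
        List<Integer> list = new LinkedList<>();
        for (Integer value : Arrays.asList(3, 1, 2, 2, 5, 0)) {
            insertSorted(list, value, natural);
        }
        System.out.println(list); // [0, 1, 2, 2, 3, 5]
    }
}
```

The comparison uses > 0 rather than == 1, and a duplicate value ends up after the existing equal elements.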
