Remove duplicates by certain key values - java

Say I have my object
class MyObject{
private int id;
private int secondId;
private String name;
private String address;
}
And I'm adding lists of these objects to a list.
List<MyObject> finalList = new ArrayList<MyObject>();
while(someCondition) {
List<MyObject> l = getSomeMoreObjects();
finalList.addAll(l);
}
This is all well and good, except I only want to add the new records to the list if they have a distinct id and secondId.
What would be the best way to do this? I'm thinking it would involve using a HashMap.

You'll want to override the hashCode and equals methods in MyObject:
@Override
public int hashCode() {
int hash = 7;
hash = 97 * hash + this.id;
hash = 97 * hash + this.secondId;
return hash;
}
@Override
public boolean equals(Object obj) {
if (obj == null)
return false;
if (!(obj instanceof MyObject))
return false;
MyObject other = (MyObject) obj;
return this.id == other.id && this.secondId == other.secondId;
}
Then create the HashSet:
HashSet<MyObject> set = new HashSet<>();
Then just add objects to it:
set.add(new MyObject());
The HashSet will ignore your new object if you already have one with the same id and secondId in the set.
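To connect this back to the loop in the question, here is a minimal sketch. It assumes a four-argument constructor and a getName() accessor on MyObject (neither is shown in the question), and it replaces getSomeMoreObjects() with hard-coded batches. A LinkedHashSet is swapped in for the plain HashSet only so that the result keeps insertion order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DedupDemo {
    static class MyObject {
        private final int id;
        private final int secondId;
        private final String name;
        private final String address;

        // Hypothetical constructor; the question does not show one.
        MyObject(int id, int secondId, String name, String address) {
            this.id = id;
            this.secondId = secondId;
            this.name = name;
            this.address = address;
        }

        String getName() { return name; }

        @Override
        public int hashCode() {
            int hash = 7;
            hash = 97 * hash + id;
            hash = 97 * hash + secondId;
            return hash;
        }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof MyObject)) return false;
            MyObject other = (MyObject) obj;
            return id == other.id && secondId == other.secondId;
        }
    }

    // Merges batches, keeping the first object seen for each (id, secondId) pair.
    static List<MyObject> mergeDistinct(List<List<MyObject>> batches) {
        Set<MyObject> seen = new LinkedHashSet<>();
        for (List<MyObject> batch : batches) {
            seen.addAll(batch); // adding an element equal to one already present changes nothing
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<List<MyObject>> batches = Arrays.asList(
            Arrays.asList(new MyObject(1, 1, "a", "x"), new MyObject(1, 2, "b", "y")),
            Arrays.asList(new MyObject(1, 1, "c", "z"), new MyObject(2, 1, "d", "w")));
        List<MyObject> finalList = mergeDistinct(batches);
        System.out.println(finalList.size()); // 3
    }
}
```

The record named "c" is dropped because an object with the same id and secondId was added earlier.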

Override the equals method in MyObject (two objects are equal if and only if they have the same id and secondId), override hashCode along with it since HashSet relies on both, and use a HashSet for storing distinct values.
Here is how you override the method:
Override equals method

You may use a HashSet. To do so:
override the equals() and hashCode() methods of MyObject;
add l to your HashSet.

Related

How does containsKey really work in Java when working with Map? [duplicate]

How does containsKey really work? I know that if I do this:
Map<String, Integer> map = new HashMap<>();
map.put("user1", 1);
map.put("user2", 2);
map.put("user3", 3);
System.out.println(map.containsKey("user1")); // true
containsKey returns true.
But if I do this:
Map<Person, Integer> table = new HashMap<>();
table.put(new Person("Steve"), 33);
table.put(new Person("Mark"), 29);
System.out.println(table.containsKey(new Person("Steve"))); // false
So why am I getting false even though I have the correct key? How do I look up the value 33 by using its key?
Here you are using a String as the key:
Map<String, Integer> map = new HashMap<>();
map.put("user1", 1);
The String class implements the hashCode and equals methods, and hence it works as expected.
When you use a Person object, or an object of any custom class, as a key, you should make sure that you override the hashCode and equals methods.
The HashMap implementation uses hashCode to find the bucket and then uses equals to compare the key against the entries present in that bucket.
When working with maps, every class whose objects are used as keys should implement equals and hashCode.
Here is a sample Person class that behaves as you would expect:
class Person {
private String name;
public Person(String name) {
this.name = name;
}
public String getName() {
return name;
}
@Override
public boolean equals(Object obj) {
if (obj == null || !(obj instanceof Person)) {
return false;
}
Person other = (Person) obj;
return other.getName().equals(this.name);
}
@Override
public int hashCode() {
return this.name.hashCode();
}
}
HashMap uses equals() to decide whether two keys are equal.
hashCode() is used to calculate the index of the bucket into which the item is inserted.
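The difference can be seen side by side in a small sketch. PlainKey and ValueKey are hypothetical classes made up for this illustration: the first inherits identity-based equals/hashCode from Object, the second overrides both.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class ContainsKeyDemo {
    // Key without equals/hashCode: falls back to object identity from Object.
    static class PlainKey {
        final String name;
        PlainKey(String name) { this.name = name; }
    }

    // Key with value-based equals/hashCode.
    static class ValueKey {
        final String name;
        ValueKey(String name) { this.name = name; }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof ValueKey && Objects.equals(name, ((ValueKey) obj).name);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }
    }

    static boolean plainLookup() {
        Map<PlainKey, Integer> table = new HashMap<>();
        table.put(new PlainKey("Steve"), 33);
        // A second, distinct instance is never equal to the stored one.
        return table.containsKey(new PlainKey("Steve"));
    }

    static Integer valueLookup() {
        Map<ValueKey, Integer> table = new HashMap<>();
        table.put(new ValueKey("Steve"), 33);
        // Same name means same hash code and equals() returns true.
        return table.get(new ValueKey("Steve"));
    }

    public static void main(String[] args) {
        System.out.println(plainLookup()); // false
        System.out.println(valueLookup()); // 33
    }
}
```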

Do the retainAll() and removeAll() methods of ArrayList in Java work on the hash code of an object or on the values of the object?

I am trying to subtract two ArrayLists of custom objects, but the hash codes of the objects in the ArrayLists are different.
List<QuizObject> list1 = new ArrayList<QuizObject>();
List<QuizObject> list2 = new ArrayList<QuizObject>();
QuizObject obj1 = new QuizObject();
QuizObject obj2 = new QuizObject();
QuizObject obj3 = new QuizObject();
obj1.setName("piyush");
obj2.setName("stuti");
obj3.setName("ayush");
list1.add(obj1);
list1.add(obj2);
list1.add(obj3);
QuizObject obj4 = new QuizObject();
QuizObject obj5 = new QuizObject();
QuizObject obj6 = new QuizObject();
obj4.setName("piyush");
obj5.setName("stuti");
obj6.setName("teri");
list2.add(obj4);
list2.add(obj5);
list2.add(obj6);
list1.removeAll(list2);
Log.d("completezz", "List 1" + list1);
Log.d("completezz", "List 2" + list2);
System.out.println("Set A : " + list1);
System.out.println("Set B : " + list2);
ArrayList does not use the hashCode. It relies on object equality as defined by the equals() method. You need to override equals() suitably for your object, otherwise object identity is used via Object.equals() and ==.
In your QuizObject class, you need to override equals (and hashCode if you plan to use HashMap) and write suitable logic that compares any two QuizObject objects.
Below is just sample code that works in a Java 1.7 environment, but there are plenty of other ways to do this.
import java.util.Objects;
public class QuizObject {
private String name;
//getters and setters
@Override
public boolean equals(Object o) {
if (o == this) return true;
if (!(o instanceof QuizObject)) {
return false;
}
QuizObject quizObject= (QuizObject) o;
return Objects.equals(name, quizObject.name);
}
//hashCode() matters for hash-based collections such as HashMap and HashSet;
//an ArrayList itself never calls hashCode(), because it finds elements
//by walking the list and calling equals().
//Overriding it together with equals() keeps the two consistent.
@Override
public int hashCode() {
return Objects.hash(name);
}
}
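With equals() overridden this way, the subtraction from the question behaves as intended. A minimal sketch follows; it assumes a one-argument constructor instead of the setName() calls used in the question, and the subtract helper is made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class RemoveAllDemo {
    static class QuizObject {
        private final String name;

        // Hypothetical constructor standing in for setName() from the question.
        QuizObject(String name) { this.name = name; }

        String getName() { return name; }

        @Override
        public boolean equals(Object o) {
            if (o == this) return true;
            if (!(o instanceof QuizObject)) return false;
            return Objects.equals(name, ((QuizObject) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name);
        }
    }

    // Builds two lists of QuizObject from names and returns the names left after removeAll.
    static List<String> subtract(List<String> first, List<String> second) {
        List<QuizObject> list1 = new ArrayList<>();
        for (String n : first) list1.add(new QuizObject(n));
        List<QuizObject> list2 = new ArrayList<>();
        for (String n : second) list2.add(new QuizObject(n));

        list1.removeAll(list2); // compares elements with equals(), not with hashCode()

        List<String> names = new ArrayList<>();
        for (QuizObject q : list1) names.add(q.getName());
        return names;
    }

    public static void main(String[] args) {
        System.out.println(subtract(
            Arrays.asList("piyush", "stuti", "ayush"),
            Arrays.asList("piyush", "stuti", "teri"))); // [ayush]
    }
}
```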

Comparing two Lists of a class without iterating the lists

I have a class Abc as below
public class Abc {
int[] attributes;
Abc(int[] attributes){
this.attributes = attributes;
}
}
Overriding the Abc hash code as below
@Override
public int hashCode() {
int hashCode = 0;
int multiplier = 1;
for(int i = attributes.length-1 ; i >= 0 ; i--){
hashCode = hashCode+(attributes[i]*multiplier);
multiplier = multiplier*10;
}
return hashCode;
}
I am using above class to create a list of objects and I want to compare whether the two lists are equal i.e. lists having objects with same attributes.
List<Abc> list1 = new ArrayList<>();
list1.add(new Abc(new int[]{1,2,4}));
list1.add(new Abc(new int[]{5,8,9}));
list1.add(new Abc(new int[]{3,4,2}));
List<Abc> list2 = new ArrayList<>();
list2.add(new Abc(new int[]{5,8,9}));
list2.add(new Abc(new int[]{3,4,2}));
list2.add(new Abc(new int[]{1,2,4}));
How can I compare the above two lists with/without iterating over each list? Also, is there any better way to override the hashcode, so that two classes having the same attributes(values and order) should be equal.
You have to override the equals method in your class Abc. If you are using an IDE, it can generate something good enough. For example, Eclipse produces the following:
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (obj == null) {
return false;
}
if (getClass() != obj.getClass()) {
return false;
}
Abc other = (Abc) obj;
if (!Arrays.equals(attributes, other.attributes)) {
return false;
}
return true;
}
With this equals method, you can now check whether two instances of Abc are equal.
If you want to compare your two lists list1 and list2, unfortunately you cannot simply do
boolean listsAreEqual = list1.equals(list2); // will be false
because that would not only check if the elements in the lists are the same but also if they are in the same order. What you can do is to compare two sets, because in sets, the elements have no order.
boolean setAreEqual = new HashSet<Abc>(list1).equals(new HashSet<Abc>(list2)); // will be true.
Note that in that case, you should keep your implementation of hashCode() in Abc, for the HashSet to function well. As a general rule, a class that implements equals should also implement hashCode.
The problem with a Set (a HashSet is a Set) is that by design it will not contain several objects that are equal to each other. Objects are guaranteed to be unique in a set. For example, if you add another new Abc(new int[]{5,8,9}) to the second set, the two sets will still be equal.
If that bothers you, one possible solution is to compare the two lists after sorting them (for that you have to provide a Comparator or implement compareTo); another is to use Guava's HashMultiset, which is an unordered container that can contain the same object multiple times.
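If pulling in Guava is not an option, the multiset idea can be approximated with the standard library by counting occurrences in a HashMap. This is only a sketch of that substitute technique, not Guava's API: the counts and sameElements helpers are hypothetical names, the Abc constructor takes varargs purely for brevity, and hashCode here uses Arrays.hashCode rather than the question's implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MultisetCompareDemo {
    static class Abc {
        final int[] attributes;

        // Varargs used only to keep the example short.
        Abc(int... attributes) { this.attributes = attributes; }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof Abc && Arrays.equals(attributes, ((Abc) obj).attributes);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(attributes);
        }
    }

    // Counts how many times each distinct element occurs.
    static <T> Map<T, Integer> counts(List<T> items) {
        Map<T, Integer> result = new HashMap<>();
        for (T item : items) {
            result.merge(item, 1, Integer::sum);
        }
        return result;
    }

    // True if both lists hold the same elements the same number of times, in any order.
    static <T> boolean sameElements(List<T> a, List<T> b) {
        return a.size() == b.size() && counts(a).equals(counts(b));
    }

    public static void main(String[] args) {
        List<Abc> list1 = Arrays.asList(new Abc(1, 2, 4), new Abc(5, 8, 9), new Abc(3, 4, 2));
        List<Abc> list2 = Arrays.asList(new Abc(5, 8, 9), new Abc(3, 4, 2), new Abc(1, 2, 4));
        List<Abc> list3 = Arrays.asList(new Abc(5, 8, 9), new Abc(5, 8, 9), new Abc(1, 2, 4));
        System.out.println(sameElements(list1, list2)); // true
        System.out.println(sameElements(list1, list3)); // false
    }
}
```

Unlike the HashSet comparison, a repeated element changes the result here, because the counts differ.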
Override the equals method to compare objects. As the comments mention, you should override the hashCode method as well when overriding equals.
By this
so that two classes having the same attributes(values and order) should be equal.
I think you mean two objects having the same attributes.
You can try something like this:
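@Override
public boolean equals(Object o) {
    if (!(o instanceof Abc)) {
        return false;
    }
    Abc instance = (Abc) o;
    int[] array = instance.attributes;
    // Arrays of different lengths can never be equal.
    if (array.length != this.attributes.length) {
        return false;
    }
    for (int i = 0; i < array.length; i++) {
        if (array[i] != this.attributes[i]) {
            return false;
        }
    }
    return true;
}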
Edit: As for the hash code, the concept is that when
object1.equals(object2)
is true, then
object1.hashCode()
and
object2.hashCode()
must return the same value. The hashCode() of an object should also stay the same for as long as the object sits in a hash-based collection, so generating the hash code from instance variables whose values can change is risky: a different hash code may be generated once such a value changes.
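One way to satisfy both requirements is to derive the hash code from the array contents while making sure those contents cannot change after construction. The sketch below is an illustration under that assumption: it uses Arrays.hashCode and a defensive copy in the constructor, neither of which appears in the question, and foundAfterCallerMutation is a made-up helper name.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class HashStabilityDemo {
    static class Abc {
        private final int[] attributes;

        Abc(int[] attributes) {
            // Defensive copy: later changes to the caller's array do not reach this object.
            this.attributes = attributes.clone();
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof Abc && Arrays.equals(attributes, ((Abc) obj).attributes);
        }

        @Override
        public int hashCode() {
            // Same values in the same order always give the same hash code.
            return Arrays.hashCode(attributes);
        }
    }

    static boolean foundAfterCallerMutation() {
        int[] data = {1, 2, 4};
        Set<Abc> set = new HashSet<>();
        set.add(new Abc(data));
        data[0] = 9; // does not affect the stored object because of the copy
        return set.contains(new Abc(new int[]{1, 2, 4}));
    }

    public static void main(String[] args) {
        System.out.println(foundAfterCallerMutation()); // true
    }
}
```

Without the clone() call, the stored object's hash code would change when data[0] is overwritten, and the lookup would miss it.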

Is there a Java Class similar to ArrayList that can do this?

I have been running into this problem sometimes when programming.
Imagine I have a table of data with two columns. The first column has strings, the second column has integers.
I want to be able to store each row of the table into a dynamic array. So each element of the array needs to hold a string and an integer.
Previously, I have been accomplishing this by just splitting each column of the table into two separate ArrayLists and then when I want to add a row, I would call the add() method once on each ArrayList. To remove, I would call the remove(index) method once on each ArrayList at the same index.
But isn't there a better way? I know there are classes like HashMap but they don't allow duplicate keys. I am looking for something that allows duplicate entries.
I know that it's possible to do something like this:
ArrayList<Object[]> myArray = new ArrayList<Object[]>();
myArray.add(new Object[]{"string", 123});
I don't really want to have to cast into String and Integer every time I get an element out of the array but maybe this is the only way without creating my own? This looks more confusing to me and I'd prefer using two ArrayLists.
So is there any Java object like ArrayList where it would work like this:
ArrayList<String, Integer> myArray = new ArrayList<String, Integer>();
myArray.add("string", 123);
Just create a simple POJO class to hold the row data. Don't forget about equals and hashCode, and prefer an immutable solution (without setters):
public class Pair {
private String key;
private Integer value;
public Pair(String key, Integer value) {
this.key = key;
this.value = value;
}
public String getKey() {
return key;
}
public Integer getValue() {
return value;
}
// autogenerated
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (!(o instanceof Pair)) return false;
Pair pair = (Pair) o;
if (key != null ? !key.equals(pair.key) : pair.key != null) return false;
if (value != null ? !value.equals(pair.value) : pair.value != null) return false;
return true;
}
@Override
public int hashCode() {
int result = key != null ? key.hashCode() : 0;
result = 31 * result + (value != null ? value.hashCode() : 0);
return result;
}
}
Usage:
List<Pair> list = new ArrayList<Pair>();
list.add(new Pair("string", 123));
Note: other languages have built-in solutions for this, like case classes and tuples in Scala.
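The closest thing in the Java standard library is probably Map.Entry, used on its own rather than inside a map. The following is a sketch of that alternative, assuming that the generic names getKey()/getValue() are acceptable for the two columns; buildRows is a hypothetical helper. AbstractMap.SimpleImmutableEntry already provides equals and hashCode.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class EntryListDemo {
    // Each list element holds one String and one Integer, like a table row.
    static List<Map.Entry<String, Integer>> buildRows() {
        List<Map.Entry<String, Integer>> rows = new ArrayList<>();
        rows.add(new SimpleImmutableEntry<>("string", 123));
        rows.add(new SimpleImmutableEntry<>("string", 123)); // duplicates are allowed in a List
        rows.add(new SimpleImmutableEntry<>("other", 7));
        return rows;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> row : buildRows()) {
            // No casts needed: the types come from the generic parameters.
            String text = row.getKey();
            int number = row.getValue();
            System.out.println(text + " -> " + number);
        }
    }
}
```

A dedicated class such as Pair above still reads better once the columns have real names.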
Create a Row class that holds the data.
package com.stackoverflow;
import java.util.ArrayList;
import java.util.List;
/**
* @author maba, 2012-10-10
*/
public class Row {
private int intValue;
private String stringValue;
public Row(String stringValue, int intValue) {
this.intValue = intValue;
this.stringValue = stringValue;
}
public int getIntValue() {
return intValue;
}
public String getStringValue() {
return stringValue;
}
public static void main(String[] args) {
List<Row> rows = new ArrayList<Row>();
rows.add(new Row("string", 123));
}
}
You can create a very simple object, like:
public class Row{
private String strVal;
private Integer intVal;
public Row(String s, Integer i){
strVal = s;
intVal = i;
}
//getters and setters
}
Then use it as follows:
ArrayList<Row> myArray = new ArrayList<Row>();
myArray.add(new Row("string", 123));
A Map is an option if you are sure that either the integer or the string is unique across rows; you can then use that unique value as the key. If that is not true in your case, creating a simple POJO is the best option. In fact, if more values (columns) per row may be added in the future, a POJO will also take less effort to extend. You can define the POJO like this:
public class Data {
private int intValue;
private String strValue;
public int getIntValue() {
return intValue;
}
public void setIntValue(int newInt) {
this.intValue = newInt;
}
public String getStrValue() {
return strValue;
}
public void setStrValue(String newStr) {
this.strValue = newStr;
}
}
And in your class you can use it like this:
ArrayList<Data> dataList = new ArrayList<Data>();
Data data = new Data();
data.setIntValue(123);
data.setStrValue("string");
dataList.add(data);
You should create a class (e.g. Foo) that contains an int and a String.
Then you can create an ArrayList of Foo objects.
List<Foo> fooList = new ArrayList<Foo>();
This is called a map, my friend. It is similar to a dictionary in .NET.
http://docs.oracle.com/javase/6/docs/api/java/util/Map.html
HashMap may be the class you are looking for, assuming the "string" is going to be different for different values. Here is the documentation on HashMap.
Example:
HashMap<String, Integer> tempMap = new HashMap<String, Integer>();
tempMap.put("string", 124);
If you need to store more than one value per key, you may create a HashMap<String, ArrayList<Integer>> like that.
You can use Guava, the Google collections library, which has a collection called Multimap. It is similar to a Map, but may associate multiple values with a single key. If you call put(K, V) twice, with the same key but different values, the multimap contains mappings from the key to both values.
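The same "one key, several values" behaviour can be approximated without Guava by storing a list per key. This is a standard-library sketch of the idea, not Guava's Multimap API; the put helper is a made-up name, and LinkedHashMap is chosen only to make the printed order predictable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MultiValueMapDemo {
    // Appends value to the list stored under key, creating the list on first use.
    static void put(Map<String, List<Integer>> map, String key, int value) {
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> map = new LinkedHashMap<>();
        put(map, "string", 123);
        put(map, "string", 124); // same key, second value is kept alongside the first
        put(map, "other", 7);
        System.out.println(map); // {string=[123, 124], other=[7]}
    }
}
```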
Use Map to solve this problem:
Map<String, Integer> map = new HashMap<String, Integer>();
Eg:
map.put("string", 123);

How to remove duplicates from a list based on a custom java object not a primitive type?

Before posting this question, I found a somewhat similar question posted here, but the answer was based on a String. I have a different situation: I am not trying to remove Strings but objects of another class called AwardYearSource. This class has an int attribute called year, and I want to remove duplicates based on the year, i.e., if the year 2010 appears more than once, I want to remove the extra AwardYearSource objects. How can I do that?
The simplest way to remove elements based on a field is as follows (preserving order):
Map<Integer, AwardYearSource> map = new LinkedHashMap<>();
for (AwardYearSource ays : list) {
map.put(ays.getYear(), ays);
}
list.clear();
list.addAll(map.values());
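One detail worth knowing about this approach: put() overwrites, so for a repeated year the map keeps the last object seen, in the position of the first. If the first object should win instead, putIfAbsent() can be used. The sketch below shows both variants; the label field and the two helper method names are assumptions added to make the difference visible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class KeepFirstDemo {
    static class AwardYearSource {
        private final int year;
        private final String label; // hypothetical extra field, used to tell duplicates apart

        AwardYearSource(int year, String label) {
            this.year = year;
            this.label = label;
        }

        int getYear() { return year; }
        String getLabel() { return label; }
    }

    // Keeps the first object seen for each year.
    static List<AwardYearSource> keepFirstPerYear(List<AwardYearSource> list) {
        Map<Integer, AwardYearSource> byYear = new LinkedHashMap<>();
        for (AwardYearSource ays : list) {
            byYear.putIfAbsent(ays.getYear(), ays);
        }
        return new ArrayList<>(byYear.values());
    }

    // Keeps the last object seen for each year (same as the plain put() loop).
    static List<AwardYearSource> keepLastPerYear(List<AwardYearSource> list) {
        Map<Integer, AwardYearSource> byYear = new LinkedHashMap<>();
        for (AwardYearSource ays : list) {
            byYear.put(ays.getYear(), ays);
        }
        return new ArrayList<>(byYear.values());
    }

    public static void main(String[] args) {
        List<AwardYearSource> list = Arrays.asList(
            new AwardYearSource(2010, "first"),
            new AwardYearSource(2011, "only"),
            new AwardYearSource(2010, "second"));
        System.out.println(keepFirstPerYear(list).get(0).getLabel()); // first
        System.out.println(keepLastPerYear(list).get(0).getLabel());  // second
    }
}
```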
Another way would be to override hashCode() and equals(Object obj) for your object. Since it just has one field you want to use to determine equality, this is pretty straightforward. Something like:
public boolean equals(Object obj) {
if (obj == null || !(obj instanceof AwardYearSource)) {
return false;
}
return (this.year == ((AwardYearSource)obj).year);
}
public int hashCode() {
return this.year;
}
Then you can just stick all of the objects into a Set to remove duplicates:
Set<AwardYearSource> set = new HashSet<AwardYearSource>();
set.add(new AwardYearSource(2011));
set.add(new AwardYearSource(2012));
set.add(new AwardYearSource(2011));
for (AwardYearSource aws : set) {
System.out.println(aws.year);
}
Fairly simple. Although something bugs me about the map versions (not that I doubt they'd work, it just seems like overkill, somehow - although this version isn't necessarily any better in that regard).
Answer is functional, and threadsafe (assuming AwardYearSource is immutable).
public static List<AwardYearSource> removeDuplicateYears(
final Collection<AwardYearSource> awards) {
final ArrayList<AwardYearSource> input = new ArrayList<AwardYearSource>(awards);
// If there's only one element (or none), guaranteed unique.
if (input.size() <= 1) {
return input;
}
final HashSet<Integer> years = new HashSet<Integer>(input.size(), 1);
final Iterator<AwardYearSource> iter = input.iterator();
while(iter.hasNext()) {
final AwardYearSource award = iter.next();
final Integer year = award.getYear();
if (years.contains(year)) {
iter.remove();
} else {
years.add(year);
}
}
return input;
}
You could use a map and store your objects with the year as a key:
Map<Integer, AwardYearSource> map = new HashMap<Integer, AwardYearSource>();
map.put(someAwardYearSource1.getYear(), someAwardYearSource1);
map.put(someAwardYearSource2.getYear(), someAwardYearSource2);
etc.
At the end the map will contain unique values by year, which you can call with the values method:
Collection<AwardYearSource> noDups = map.values();
Create a HashMap with Integer as the key type and your class as the value type. Then iterate over the list and insert each element into the map using:
mymap.put(source.year, source);
Then remove all elements from the original list, iterate over the map, and insert each value back into the list.
If your AwardYearSource class overrides the equals and hashCode methods (Eclipse can generate both), then you can add the objects to a Set. The Set will not contain any duplicates.
public class AwardYearSource
{
private final int year;
public AwardYearSource(int year)
{
this.year = year;
}
@Override
public int hashCode()
{
final int prime = 31;
int result = 1;
result = prime * result + year;
return result;
}
@Override
public boolean equals(Object obj)
{
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
AwardYearSource other = (AwardYearSource) obj;
if (year != other.year)
return false;
return true;
}
@Override
public String toString()
{
return String.valueOf(year);
}
public static void main(String[] args)
{
Set<AwardYearSource> set = new HashSet<AwardYearSource>();
set.add(new AwardYearSource(2000));
set.add(new AwardYearSource(2000));
set.add(new AwardYearSource(2000));
set.add(new AwardYearSource(2000));
System.out.println(set);
}
}
The output is [2000]. Only one item in the set.
Set<Integer> set = new HashSet<>();
list.removeIf(i -> !set.add(i.getYear()));
This works whenever duplication is decided by a certain property (or combination of properties), year in this case: set.add returns false for a year that has already been seen, so that element is removed. Hope this helps.
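For the "combination of properties" case, the same idea can be wrapped in a small generic helper. This is a sketch only: removeDuplicatesBy, the Award class, and its category field are hypothetical names introduced here. A List of the chosen fields serves as the composite key because lists have value-based equals and hashCode.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class RemoveIfDemo {
    static class Award {
        final int year;
        final String category; // hypothetical second property

        Award(int year, String category) {
            this.year = year;
            this.category = category;
        }
    }

    // Removes every element whose key was already produced by an earlier element.
    static <T, K> void removeDuplicatesBy(List<T> list, Function<? super T, ? extends K> keyOf) {
        Set<K> seen = new HashSet<>();
        list.removeIf(item -> !seen.add(keyOf.apply(item)));
    }

    public static void main(String[] args) {
        List<Award> awards = new ArrayList<>(Arrays.asList(
            new Award(2010, "A"), new Award(2011, "A"),
            new Award(2010, "A"), new Award(2010, "B")));
        // Composite key built from both fields.
        removeDuplicatesBy(awards, a -> Arrays.asList(a.year, a.category));
        System.out.println(awards.size()); // 3
    }
}
```

The list passed in must be modifiable, which is why the example wraps Arrays.asList in a new ArrayList.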
