Java: Equalator? (removing duplicates from a collection of objects)

I have a bunch of objects of a class Puzzle. I have overridden equals() and hashCode(). When it comes time to present the solutions to the user, I'd like to filter out all the Puzzles that are "similar" (by the standard I have defined), so the user only sees one of each.
Similarity is transitive.
Example:
Result of computations:
A (similar to D)
B (similar to C)
C
D
In this case, only A or D and B or C would be presented to the user - but not two similar Puzzles. Two similar puzzles are equally valid. It is only important that they are not both shown to the user.
To accomplish this, I wanted to use an ADT that prohibits duplicates. However, I don't want to change the equals() and hashCode() methods to return a value about similarity instead. Is there some Equalator, like Comparator, that I can use in this case? Or is there another way I should be doing this?
The class I'm working on is a Puzzle that maintains a grid of letters. (Like Scrabble.) If a Puzzle contains the same words, but is in a different orientation, it is considered to be similar. So the following puzzle:
(2, 2): A
(2, 1): C
(2, 0): T
Would be similar to:
(1, 2): A
(1, 1): C
(1, 0): T

Okay you have a way of measuring similarity between objects. That means they form a Metric Space.
The question is, is your space also a Euclidean space like normal three dimensional space, or integers or something like that? If it is, then you could use a binary space partition in however many dimensions you've got.
(The question is, basically: is there a homomorphism between your objects and an n-dimensional real number vector? If so, then you can use techniques for measuring closeness of points in n-dimensional space.)
Now, if it's not a Euclidean space then you've got a bigger problem. An example of a non-Euclidean space that programmers might be most familiar with would be the Levenshtein distance between two strings.
If your problem is similar to seeing how similar a string is to a list of already existing strings then I don't know of any algorithms that would do that in less than O(n^2) time. Maybe there are some out there.
But another important question is: how much time do you have? How many objects? If you have time or if your data set is small enough that an O(n^2) algorithm is practical, then you just have to compare each new object against the objects you have already accepted and see whether its distance to any of them is below a certain threshold. If so, reject it.
Just extend AbstractCollection and override the add method. Use an ArrayList or whatever for storage. Your code would look kind of like this:
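import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

abstract class SimilarityRejector<T> extends AbstractCollection<T> {
    private final List<T> base = new ArrayList<T>();
    private final double threshold;

    public SimilarityRejector(double threshold) {
        this.threshold = threshold;
    }

    // Distance between two items (smaller means more similar); supplied by a subclass.
    protected abstract double similarityComparison(T a, T b);

    @Override
    public boolean add(T t) {
        for (T compare : base) {
            // Reject t if it is too close to an item we already hold.
            if (similarityComparison(t, compare) < threshold) return false;
        }
        return base.add(t);
    }

    @Override
    public Iterator<T> iterator() {
        return base.iterator();
    }

    @Override
    public int size() {
        return base.size();
    }
}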
etc. Obviously T would need to be a type that you can perform a comparison on. If you have a Euclidean metric, then you can use a space partition, rather than going through every other item.

I'd use a wrapper class that overrides equals and hashCode accordingly.
private static class Wrapper {
    public final Puzzle puzzle;
    public Wrapper(Puzzle puzzle) {
        this.puzzle = puzzle;
    }
    @Override
    public boolean equals(Object object) {
        // ...
    }
    @Override
    public int hashCode() {
        // ...
    }
}
and then you wrap all your puzzles, put them in a map, and get them out again…
public Collection<Collection<Puzzle>> method(Collection<Puzzle> puzzles) {
    Map<Wrapper, Collection<Puzzle>> map = new HashMap<Wrapper, Collection<Puzzle>>();
    for (Puzzle each : puzzles) {
        Wrapper wrapper = new Wrapper(each);
        Collection<Puzzle> coll = map.get(wrapper);
        if (coll == null) map.put(wrapper, coll = new ArrayList<Puzzle>());
        coll.add(each);
    }
    return map.values();
}

Create a TreeSet using your Comparator
Add all elements to the set
All duplicates are stripped out
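A minimal sketch of those three steps, using strings instead of Puzzle objects. The similarity rule here (two words are "similar" if they are anagrams) and the helper name sortedLetters are assumptions made purely for illustration. Note that a TreeSet needs a Comparator that defines a consistent ordering, not just a yes/no similarity test, so this only works when similar items can be mapped to a common sort key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class TreeSetDedupDemo {

    // Hypothetical similarity key: the letters of a word in sorted order,
    // so that anagrams map to the same key.
    static String sortedLetters(String word) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    static List<String> distinct(List<String> words) {
        // Two words compare as 0 (duplicates) when their keys are equal.
        Comparator<String> bySimilarity = Comparator.comparing(TreeSetDedupDemo::sortedLetters);
        Set<String> unique = new TreeSet<>(bySimilarity);
        unique.addAll(words); // later duplicates are silently ignored
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        // prints [act, bird, dog]
        System.out.println(distinct(Arrays.asList("act", "cat", "dog", "god", "bird")));
    }
}
```

The first item of each similarity group is the one that survives, and the result comes out in the comparator's order rather than insertion order.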

Normally "similarity" is not a transitive relationship. So the first step would be to think of this in terms of equivalence rather than similarity. Equivalence is reflexive, symmetric and transitive.
Easy approach here is to define a puzzle wrapper whose equals() and hashCode() methods are implemented according to the equivalence relation in question.
Once you have that, drop the wrapped objects into a java.util.Set and that filters out duplicates.
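As a rough illustration of the wrapper idea, the sketch below treats a puzzle as a map from (x, y) coordinates to letters and considers two puzzles equivalent when one is a shifted copy of the other, as in the question's example. The class names, the grid representation, and the choice to handle only translation (not rotation or mirroring) are all assumptions, not the asker's actual Puzzle class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class EquivalenceWrapperDemo {

    // Hypothetical wrapper: equality is defined on a normalised copy of the grid
    // whose smallest x and y coordinates are shifted to 0.
    static final class TranslationKey {
        final Map<List<Integer>, Character> original;
        private final Map<List<Integer>, Character> normalised = new HashMap<>();

        TranslationKey(Map<List<Integer>, Character> cells) {
            this.original = cells;
            int minX = Integer.MAX_VALUE;
            int minY = Integer.MAX_VALUE;
            for (List<Integer> p : cells.keySet()) {
                minX = Math.min(minX, p.get(0));
                minY = Math.min(minY, p.get(1));
            }
            for (Map.Entry<List<Integer>, Character> e : cells.entrySet()) {
                List<Integer> p = e.getKey();
                normalised.put(Arrays.asList(p.get(0) - minX, p.get(1) - minY), e.getValue());
            }
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof TranslationKey
                    && normalised.equals(((TranslationKey) o).normalised);
        }

        @Override
        public int hashCode() {
            return normalised.hashCode();
        }
    }

    // Builds a vertical word in column x, first letter at the top.
    static Map<List<Integer>, Character> vertical(int x, String word) {
        Map<List<Integer>, Character> cells = new HashMap<>();
        for (int i = 0; i < word.length(); i++) {
            cells.put(Arrays.asList(x, word.length() - 1 - i), word.charAt(i));
        }
        return cells;
    }

    // Keeps the first puzzle of every equivalence class.
    static List<Map<List<Integer>, Character>> distinct(List<Map<List<Integer>, Character>> puzzles) {
        Set<TranslationKey> seen = new LinkedHashSet<>();
        for (Map<List<Integer>, Character> p : puzzles) {
            seen.add(new TranslationKey(p));
        }
        List<Map<List<Integer>, Character>> result = new ArrayList<>();
        for (TranslationKey k : seen) {
            result.add(k.original);
        }
        return result;
    }
}
```

Because the wrapper carries the equivalence rule, the puzzle class itself can keep whatever equals() and hashCode() it already has.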

IMHO, the most elegant way was described by Gili (TreeSet with a custom Comparator).
But if you'd like to do it yourself, this seems the easiest and clearest solution:
/**
 * Returns the distinct values of the input list (removes duplicates).
 * @param items items to process
 * @param comparator comparator used to recognize equal items
 * @return new collection with unique values
 */
public static <T> Collection<T> distinctItems(List<T> items, Comparator<T> comparator) {
    List<T> result = new ArrayList<>();
    for (int i = 0; i < items.size(); i++) {
        T item = items.get(i);
        boolean exists = false;
        for (int j = 0; j < result.size(); j++) {
            if (comparator.compare(result.get(j), item) == 0) {
                exists = true;
                break;
            }
        }
        if (!exists) {
            result.add(item);
        }
    }
    return result;
}

Related

Add additional rules to the compare method of a Comparator

I currently have a code snippet which returns strings of a list in ascending order:
Collections.sort(myList, new Comparator<MyClass>() {
@Override
public int compare(MyClass o1, MyClass o2) {
return o1.aString.compareTo(o2.aString);
}
});
While it works, I would like to add some custom "rules" to the order to put certain strings to the front. For instance:
if(aString.equals("Hi")){
// put string first in the order
}
if(aString begins with a whitespace character, e.g. " ") {
// put string after Hi, but before the other strings
}
// So the order could be: Hi, _string, a_string, b_string, c_string
Is it possible to customize the sorting of a list with a Comparator like this?

The answer from MC Emperor is quite nice (+1) in that it fulfills the OP's requirement of not using Java 8 APIs. It also uses a neat internal function technique (the getOrder method) of mapping conditions to small integer values in order to effect a first-level comparison.
Here's an alternative that uses Java 8 constructs. It assumes that MyClass has a getString method that does the obvious thing.
Collections.sort(myList,
Comparator.comparing((MyClass mc) -> ! mc.getString().equals("Hi"))
.thenComparing(mc -> ! mc.getString().startsWith(" "))
.thenComparing(MyClass::getString));
This is pretty opaque until you get used to this style. The key insight is that the "extractor" function that's supplied to Comparator.comparing and Comparator.thenComparing often simply extracts a field, but it can be a general mapping to any other value. If that value is Comparable then an additional Comparator for it needn't be provided. In this case the extractor function is a boolean expression. This gets boxed to a Boolean which as it turns out is Comparable. Since false orders before true we need to negate the boolean expression.
Also note that I had to provide an explicit type declaration for the lambda parameter, as type inference often doesn't work for chained comparator cases such as this one.

That's possible.
Using Java 8 features
You could pass a function to the Comparator.comparing method to define your rules. Note that we simply return integers, the lowest integer for the elements which should come first.
Comparator<MyClass> myRules = Comparator.comparing(t -> {
if (t.aString.equals("Hi")) {
return 0;
}
else if (t.aString.startsWith(" ")) {
return 1;
}
else {
return 2;
}
});
If you want the remaining elements to be sorted alphabetically, you could use thenComparing(Comparator.naturalOrder()), if your class implements Comparable. Otherwise, you should extract the sort key first:
Collections.sort(myList, myRules.thenComparing(Comparator.comparing(t -> t.aString)));
Note that the actual specific numbers returned don't matter, what matters is that lower numbers come before higher numbers when sorting, so if one would always put the string "Hi" first, then the corresponding number should be the lowest returned (in my case 0).
Using Java <= 7 features (Android API level 21 compatible)
If Java 8 features are not available to you, then you could implement it like this:
Comparator<MyClass> myRules = new Comparator<MyClass>() {
@Override
public int compare(MyClass o1, MyClass o2) {
int order = Integer.compare(getOrder(o1), getOrder(o2));
return (order != 0 ? order : o1.aString.compareTo(o2.aString));
}
private int getOrder(MyClass m) {
if (m.aString.equals("Hi")) {
return 0;
}
else if (m.aString.startsWith(" ")) {
return 1;
}
else {
return 2;
}
}
};
And call it like this:
Collections.sort(list, myRules);
This works as follows: first, both objects are mapped to an integer by your custom ruleset, and those integers are compared. If the two differ, then Integer.compare(getOrder(o1), getOrder(o2)) [1] determines the comparison. Otherwise, if both are the same, the lexicographic order of the strings is used for comparison.
[1] Always use Integer::compare rather than subtracting one value from the other, because of the risk of erroneous results due to integer overflow.
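To make the overflow risk in that footnote concrete, here is a small sketch. The class and method names are made up for the example; the only library call is Integer.compare.

```java
class CompareOverflowDemo {

    // Naive comparison by subtraction: wraps around for operands that are far apart.
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    // Overflow-safe comparison.
    static int bySafeCompare(int a, int b) {
        return Integer.compare(a, b);
    }

    public static void main(String[] args) {
        int a = Integer.MIN_VALUE;
        int b = 1;
        // a is clearly smaller than b, so a correct comparison must be negative.
        System.out.println(bySubtraction(a, b)); // prints 2147483647 (wrong sign)
        System.out.println(bySafeCompare(a, b)); // prints -1
    }
}
```

With small rule numbers such as 0, 1 and 2 the subtraction would happen to work, but Integer.compare costs nothing extra and stays correct for any inputs.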

Yes, that is possible, you have complete control over the compareTo() method. Two things:
Use String#equals instead of == to compare strings
Make sure you check both arguments to compareTo for your exceptional cases.
A concrete way of implementing something where some words are always first and some words are always last, with ordering defined among the exceptions:
Map<String, Integer> exceptionMap = new HashMap<>();
exceptionMap.put("lowest", -2);
exceptionMap.put("second_lowest", -1);
exceptionMap.put("second_highest", 1);
exceptionMap.put("highest", 2);
public int compareToWithExceptionMap(String s1, String s2) {
int firstExceptional = exceptionMap.getOrDefault(s1, 0);
int secondExceptional = exceptionMap.getOrDefault(s2, 0);
if (firstExceptional == 0 && secondExceptional == 0) {
return s1.compareTo(s2);
}
return firstExceptional - secondExceptional;
}

Why does the HashSet contains multiple the same objects? [duplicate]

Let's say you have a class and you create a HashSet which can store instances of this class. If you try to add instances which are equal, only one instance is kept in the collection, and that is fine.
However if you have two different instances in the HashSet, and you take one and make it an exact copy of the other (by copying the fields), the HashSet will then contain two duplicate instances.
Here is the code which demonstrates this:
public static void main(String[] args)
{
HashSet<GraphEdge> set = new HashSet<>();
GraphEdge edge1 = new GraphEdge(1, "a");
GraphEdge edge2 = new GraphEdge(2, "b");
GraphEdge edge3 = new GraphEdge(3, "c");
set.add(edge1);
set.add(edge2);
set.add(edge3);
edge2.setId(1);
edge2.setName("a");
for(GraphEdge edge: set)
{
System.out.println(edge.toString());
}
if(edge2.equals(edge1))
{
System.out.println("Equals");
}
else
{
System.out.println("Not Equals");
}
}
public class GraphEdge
{
private int id;
private String name;
//Constructor ...
//Getters & Setters...
public int hashCode()
{
int hash = 7;
hash = 47 * hash + this.id;
hash = 47 * hash + Objects.hashCode(this.name);
return hash;
}
public boolean equals(Object o)
{
if(o == this)
{
return true;
}
if(o instanceof GraphEdge)
{
GraphEdge anotherGraphEdge = (GraphEdge) o;
if(anotherGraphEdge.getId() == this.id && anotherGraphEdge.getName().equals(this.name))
{
return true;
}
}
return false;
}
}
The output from the above code:
1 a
1 a
3 c
Equals
Is there a way to force the HashSet to validate its contents so that possible duplicate entries created as in the above scenario get removed?
A possible solution could be to create a new HashSet and copy the contents from one hashset to another so that the new hashset won't contain duplicates however I don't like this solution.

The situation you describe is invalid. See the Javadoc: "The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set."

To add to @EJP's answer, what will happen in practice if you mutate objects in a HashSet to make them duplicates (in the sense of the equals / hashCode contract) is that the hash table data structure will break.
Depending on the exact details of the mutation, and the state of the hash table, one or both of the instances will become invisible to lookup (e.g. contains and other operations), either because it is on the wrong hash chain, or because the other instance appears before it on the hash chain. And it is hard to predict which instance will be visible ... and whether it will remain visible.
If you iterate the set, both instances will still be present ... in violation of the Set contract.
Of course, this is very broken from the application perspective.
You can avoid this problem by either:
using an immutable type for your set elements,
making a copy of the objects as you put them into the set and / or pull them out of the set,
writing your code so that it "knows" not to change the objects for the duration ...
From the perspective of correctness and robustness, the first option is clearly best.
Incidentally, it would be really difficult to "fix" this in a general way. There is no pervasive mechanism in Java for knowing ... or being notified ... that some element has changed. You can implement such a mechanism on a class by class basis, but it has to be coded explicitly (and it won't be cheap). Even if you did have such a mechanism, what would you do? Clearly one of the objects should now be removed from the set ... but which one?

You are correct and I don't think there is any way to protect against the case you discuss. All collections that use hashing and equals are subject to this problem. The collection receives no notification that the object has changed since it was added to the collection. I think the solution you outline is good.
If you are so concerned with this issue, perhaps you need to rethink your data structures. You could use immutable objects for instance. With immutable objects you would not have this problem.

HashSet is not aware of its member's properties changing after the object has been added. If this is a problem for you, then you may want to consider making GraphEdge immutable. For example:
GraphEdge edge4 = edge2.changeName("new_name");
In the case where GraphEdge is immutable, changing a value results in returning a new instance rather than changing the existing instance.
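A minimal sketch of what such an immutable edge could look like. The class name ImmutableGraphEdge and the changeId method are assumptions added for the example; only changeName appears in the text above.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class ImmutableGraphEdge {
    private final int id;
    private final String name;

    ImmutableGraphEdge(int id, String name) {
        this.id = id;
        this.name = name;
    }

    // "Mutators" return a new instance and leave this one untouched.
    ImmutableGraphEdge changeId(int newId) {
        return new ImmutableGraphEdge(newId, name);
    }

    ImmutableGraphEdge changeName(String newName) {
        return new ImmutableGraphEdge(id, newName);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ImmutableGraphEdge)) return false;
        ImmutableGraphEdge other = (ImmutableGraphEdge) o;
        return id == other.id && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, name);
    }
}

class ImmutableEdgeDemo {
    static Set<ImmutableGraphEdge> demo() {
        Set<ImmutableGraphEdge> set = new HashSet<>();
        ImmutableGraphEdge edge1 = new ImmutableGraphEdge(1, "a");
        ImmutableGraphEdge edge2 = new ImmutableGraphEdge(2, "b");
        set.add(edge1);
        set.add(edge2);

        // The element inside the set cannot change, so the update is explicit:
        ImmutableGraphEdge changed = edge2.changeId(1).changeName("a");
        set.remove(edge2);
        set.add(changed); // equal to edge1, so the set rejects it
        return set;
    }
}
```

Since the stored element never changes, the set's internal hash table stays valid, and the duplicate is caught at the moment the replacement is added.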

The following method prints the elements of a LinkedList of String objects without any duplicate elements. The method takes a LinkedList object as an input, and then creates a new HashSet object. The method then iterates through the elements of the input LinkedList, and adds each element to the HashSet. Since a HashSet does not allow duplicate elements, this ensures that only unique elements are added to the HashSet.
Then, the method iterates through the HashSet and prints each element to the console, separated by a space. Unlike the printList method, this method does not add any newlines before or after the list of elements. It simply prints the string "Non-duplicates are: " followed by the elements of the HashSet.
public static void printSetList(LinkedList<String> list) {
Set<String> hashSet = new HashSet<>();
for (String v : list) {
hashSet.add(v);
}
System.out.print("Non-duplicates are: ");
for (String v : hashSet) {
System.out.print(v + " ");
}
}

java.util.Objects.hashCode takes a single object and returns its hash code (or 0 for null). To generate a hash code from several values at once, use Objects.hash instead.
Try replacing your implementation of hashCode with the following:
public int hashCode()
{
return Objects.hash(this.id, this.name);
}

You will need to do the unique detection at the time you iterate your list. Making a new HashSet might not seem the right way to go, but why not try this... And maybe not use a HashSet to start with...
public class TestIterator {
public static void main(String[] args) {
List<String> list = new ArrayList<String>();
list.add("1");
list.add("1");
list.add("2");
list.add("3");
for (String s : new UniqueIterator<String>(list)) {
System.out.println(s);
}
}
}
public class UniqueIterator<T> implements Iterable<T> {
private Set<T> hashSet = new HashSet<T>();
public UniqueIterator(Iterable<T> iterable) {
for (T t : iterable) {
hashSet.add(t);
}
}
public Iterator<T> iterator() {
return hashSet.iterator();
}
}

Java TreeMap with variable Keys

I attempted to implement Fortune's algorithm in Java, and to avoid writing an AVLTree I decided to use a TreeMap with BeachNode keys.
BeachNode has the following code:
public abstract class BeachNode implements Comparable<BeachNode>{
static int idcounter=0;
int id;
public BeachNode(){
//Allows for two nodes with the same coordinate
id=idcounter;
idcounter++;
}
@Override
public int compareTo(BeachNode s) {
if(s.getX()<this.getX())
return -1;
if(s.getX()>this.getX())
return 1;
if(s.id<this.id)
return 1;
if(s.id>this.id)
return -1;
return 0;
}
public abstract double getX();
}
The getX() method of Intersection returns a value dependent on the present height of the sweep line- so it changes partway through execution.
My first question:
(I'm fairly sure the answer is yes, but peace of mind would be nice)
If I ensure that for any two BeachNodes A and B, signum(A.compareTo(B)) remains constant while A and B are in the beach tree, will the TreeMap still function despite the changes underlying the compareTo?
My second question:
How can I ensure that such a contract is obeyed?
Note that in Fortune's algorithm, we track at what sweep line positions two intersections will meet- when the sweep line reaches one of these positions, one of the intersections is removed.
This means two intersections A and B in the beach tree will never cross positions, but during a CircleEvent A.compareTo(B) will return 0- interfering with processing the CircleEvent. The contract will be broken only briefly, during the CircleEvent that would remove one Intersection.
This is my first question on StackOverflow, if it is poorly posed or incomplete please inform me and I will do my best to rectify it.

According to the TreeMap documentation, the tree will be sorted according to the compareTo method, so any changes that are not reflected in the sign of a.compareTo(b) are allowed. However, you also need to implement an equals method with the same semantics as compareTo. This is really easy if you already have a compareTo method:
public boolean equals(Object object) {
if (!(object instanceof BeachNode)) {
return false;
}
BeachNode other = (BeachNode) object;
return this.compareTo(other) == 0;
}
And, since you're overriding equals, you should override hashCode. This is also pretty easy, since you are only using a couple fields to define equality.
public int hashCode() {
int hash = 1;
hash = (17 * hash) + Double.valueOf(getX()).hashCode();
hash = (31 * hash) + id;
return hash;
}
I'm not sure about your second question, but since the id of the two intersections should stay different, wouldn't they not be equal? If I'm wrong, hopefully someone who understands the algorithm can help you work that out.

Not sure how to sort an ArrayList based on parts of Objects in that ArrayList (Java)

I have a Sorts class that sorts (based on insertion sort, which was the assignment's direction) any ArrayList of any type passed through it, and uses insertion sort to sort the items in the list lexicographically:
public class Sorts
{
public static void sort(ArrayList objects)
{
for (int i=1; i<objects.size(); i++)
{
Comparable key = (Comparable)objects.get(i);
int position = i;
while (position>0 && key.compareTo(objects.get(position-1)) < 0)
{
objects.set(position, objects.get(position-1));
position--;
}
objects.set(position, key);
}
}
}
In one of my other files, I use a method (that is called in main later) that sorts objects of type Owner, and we have to sort them by last name (if they are the same, then first name):
Directions: "Sort the list of owners by last name from A to Z. If more than one owner have the same last name, compare their first names. This method calls the sort method defined in the Sorts class."
What I thought first was to get the last name of each owner in a for loop, add it to a temporary ArrayList of type string, call Sorts.sort(), and then re-add it back into the ArrayList ownerList:
public void sortOwners() {
ArrayList<String> temp = new ArrayList<String>();
for (int i=0; i<ownerList.size(); i++)
temp.add(((Owner)ownerList.get(i)).getLastName());
Sorts.sort(temp);
for (int i=0; i<temp.size(); i++)
ownerList.get(i).setLastName(temp.get(i));
}
I guess this was the wrong way to approach it, as it is not sorting when I run it.
What I now think I should do is create two ArrayLists (one is firstName, one is LastName) and say that, in a for loop, that if (lastName is the same) then compare firstName, but I'm not sure if I would need two ArrayLists for that, as it seems needlessly complicated.
So what do you think?
Edit: I am adding a version of compareTo(Object other):
public int compareTo(Object other)
{
int result = 0;
if (lastName.compareTo(((Owner)other).getLastName()) < 0)
result = -1;
else if (lastName.compareTo(((Owner)other).getLastName()) > 0)
result = 1;
else if (lastName.equals(((Owner)other).getLastName()))
{
if (firstName.compareTo(((Owner)other).getFirstName()) < 0)
result = -1;
else if (firstName.compareTo(((Owner)other).getFirstName()) > 0)
result = 1;
else if (firstName.equals(((Owner)other).getFirstName()))
result = 0;
}
return result;
}

I think the object should implement a compareTo method that follows the normal Comparable contract--search for sorting on multiple fields. You are correct that having two lists is unnecessary.

If you have control over the Owner code to begin with, then change the code so that it implements Comparable. Its compareTo() method performs the lastName / firstName test described in the assignment. Your sortOwners() method will pass a List<Owner> directly to Sorts.sort().
If you don't have control over Owner, then create a subclass of Owner that implements Comparable. Call it OwnerSortable or the like. It accepts a regular Owner object in its constructor and simply delegates all methods other than compareTo() to the wrapped object. Its compareTo() will function as above. Your sortOwners() method will create a new List<OwnerSortable> out of the Owner list. It can then pass this on to Sorts.sort().
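A compact sketch of the first suggestion, a class whose compareTo() does the last name / first name test. The class name SortableOwner and its fields are assumptions; the asker's real Owner class is not shown in full. The demo uses Collections.sort only to keep the example self-contained; any sort that relies on Comparable, such as the asker's Sorts.sort(), would use the same compareTo().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortableOwner implements Comparable<SortableOwner> {
    private final String firstName;
    private final String lastName;

    SortableOwner(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    @Override
    public int compareTo(SortableOwner other) {
        // Primary key: last name. Fall back to first name only on a tie.
        int byLastName = lastName.compareTo(other.lastName);
        return byLastName != 0 ? byLastName : firstName.compareTo(other.firstName);
    }

    @Override
    public String toString() {
        return lastName + ", " + firstName;
    }
}

class OwnerSortDemo {
    static List<String> sortedNames(List<SortableOwner> owners) {
        List<SortableOwner> copy = new ArrayList<>(owners);
        Collections.sort(copy); // uses compareTo()
        List<String> names = new ArrayList<>();
        for (SortableOwner o : copy) {
            names.add(o.toString());
        }
        return names;
    }

    public static void main(String[] args) {
        // prints [Jones, Bob, Smith, Al, Smith, Ann]
        System.out.println(sortedNames(Arrays.asList(
                new SortableOwner("Ann", "Smith"),
                new SortableOwner("Bob", "Jones"),
                new SortableOwner("Al", "Smith"))));
    }
}
```

The whole objects are sorted, so first and last names stay together, which avoids the problem of sorting the last names in a separate list.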

Since you have an ArrayList of objects, ordinarily we would use the Collections.sort() method to accomplish this task. Note the method signature:
public static <T extends Comparable<? super T>> void sort(List<T> list)
What's important here is that all the objects being sorted must implement the Comparable interface, which allows objects to be compared to another in numerical fashion. To clarify, a Comparable object has a method called compareTo with the following signature:
int compareTo(T o)
Now we're getting to the good part. When an object is Comparable, it can be compared numerically to another object. Let's look at a sample call.
String a = "bananas";
String b = "zebras";
System.out.println(a.compareTo(b));
The result will be -24. Semantically, since zebras is farther in the back of the dictionary compared to bananas, we say that bananas is comparatively less than zebras (not as far in the dictionary).
So the solution should be clear now. Use compareTo to compare your objects in such a way that they are sorted alphabetically. Since I've shown you how to compare strings, you should hopefully have a general idea of what needs to be written.
Once you have numerical comparisons, you would use the Collections class to sort your list. But since you have your own sorting ability, not having access to it is no great loss. You can still compare numerically, which was the goal all along! So this should make the necessary steps clearer, now that I have laid them out.

Since this is homework, here are some hints:
Assuming that the aim is to implement a sort algorithm yourself, you will find that it is much easier (and more performant) to extract the list elements into an array, sort the array and then rebuild the list (or create a new one).
If that's not the aim, then look at the Collections class.
Implement a custom Comparator, or change the object class to implement Comparable.

Most efficient way to see if an ArrayList contains an object in Java

I have an ArrayList of objects in Java. The objects have four fields, two of which I'd use to consider the object equal to another. I'm looking for the most efficient way, given those two fields, to see if the array contains that object.
The wrench is that these classes are generated based on XSD objects, so I can't modify the classes themselves to override equals().
Is there any better way than just looping through and manually comparing the two fields for each object and then breaking when found? That just seems so messy, looking for a better way.
Edit: the ArrayList comes from a SOAP response that is unmarshalled into objects.

It depends on how efficient you need things to be. Simply iterating over the list looking for the element which satisfies a certain condition is O(n), but so is ArrayList.contains() if you could implement the equals() method. If you're not doing this in loops or inner loops this approach is probably just fine.
If you really need very efficient look-up speeds at all cost, you'll need to do two things:
Work around the fact that the class is generated: write an adapter class which can wrap the generated class and which implements equals() based on those two fields (assuming they are public). Don't forget to also implement hashCode() (*).
Wrap each object with that adapter and put it in a HashSet. HashSet.contains() has constant access time, i.e. O(1) instead of O(n).
Of course, building this HashSet still has a O(n) cost. You are only going to gain anything if the cost of building the HashSet is negligible compared to the total cost of all the contains() checks that you need to do. Trying to build a list without duplicates is such a case.
(*) Implementing hashCode() is best done by XOR'ing (^ operator) the hashCodes of the same fields you are using for the equals implementation (but multiply by 31 to reduce the chance of the XOR yielding 0)
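The adapter approach above could look roughly like this. The Generated class stands in for the XSD-generated type and its field names a, b, c, d are invented for the example. For brevity the sketch combines the two hash codes with Objects.hash rather than the hand-written XOR described in the footnote.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

class AdapterLookupDemo {

    // Stand-in for a generated class that cannot be modified (hypothetical fields).
    static class Generated {
        String a, b, c, d;

        Generated(String a, String b, String c, String d) {
            this.a = a;
            this.b = b;
            this.c = c;
            this.d = d;
        }
    }

    // Adapter whose equality looks only at fields a and b.
    static final class Key {
        private final Generated item;

        Key(Generated item) {
            this.item = item;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Generated other = ((Key) o).item;
            return Objects.equals(item.a, other.a) && Objects.equals(item.b, other.b);
        }

        @Override
        public int hashCode() {
            return Objects.hash(item.a, item.b);
        }
    }

    // One O(n) pass to build the set; each later lookup is expected O(1).
    static Set<Key> index(List<Generated> items) {
        Set<Key> keys = new HashSet<>();
        for (Generated g : items) {
            keys.add(new Key(g));
        }
        return keys;
    }

    static boolean contains(Set<Key> keys, String a, String b) {
        // The probe object only needs the two fields that define equality.
        return keys.contains(new Key(new Generated(a, b, null, null)));
    }

    public static void main(String[] args) {
        Set<Key> keys = index(Arrays.asList(
                new Generated("x", "y", "1", "2"),
                new Generated("p", "q", "3", "4")));
        System.out.println(contains(keys, "x", "y")); // prints true
        System.out.println(contains(keys, "x", "q")); // prints false
    }
}
```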

You could use a Comparator with Java's built-in methods for sorting and binary search. Suppose you have a class like this, where a and b are the fields you want to use for sorting:
class Thing { String a, b, c, d; }
You would define your Comparator:
Comparator<Thing> comparator = new Comparator<Thing>() {
public int compare(Thing o1, Thing o2) {
if (o1.a.equals(o2.a)) {
return o1.b.compareTo(o2.b);
}
return o1.a.compareTo(o2.a);
}
};
Then sort your list:
Collections.sort(list, comparator);
And finally do the binary search:
int i = Collections.binarySearch(list, thingToFind, comparator);

Given your constraints, you're stuck with brute force search (or creating an index if the search will be repeated). Can you elaborate any on how the ArrayList is generated--perhaps there is some wiggle room there.
If all you're looking for is prettier code, consider using the Apache Commons Collections classes, in particular CollectionUtils.find(), for ready-made syntactic sugar:
ArrayList haystack = // ...
final Object needleField1 = // ...
final Object needleField2 = // ...
Object found = CollectionUtils.find(haystack, new Predicate() {
public boolean evaluate(Object input) {
return needleField1.equals(input.field1) &&
needleField2.equals(input.field2);
}
});

If the list is sorted, you can use a binary search. If not, then there is no better way.
If you're doing this a lot, it would almost certainly be worth your while to sort the list the first time. Since you can't modify the classes, you would have to use a Comparator to do the sorting and searching.

Even if the equals method were comparing those two fields, then logically, it would be just the same code as you doing it manually. OK, it might be "messy", but it's still the correct answer

If you are a user of my ForEach DSL, it can be done with a Detect query.
Foo foo = ...
Detect<Foo> query = Detect.from(list);
for (Detect<Foo> each: query)
each.yield = each.element.a == foo.a && each.element.b == foo.b;
return query.result();

Is there any better way than just looping through and manually comparing the two fields for each object and then breaking when found? That just seems so messy, looking for a better way.
If your concern is maintainability you could do what Fabian Steeg suggests (that's what I would do). It probably isn't the "most efficient" (because you have to sort the array first and then perform the binary search), but it is certainly the cleanest and better option.
If you're really concerned with efficiency, you can create a custom List implementation that uses the field in your object as the hash and use a HashMap as storage. But probably this would be too much.
Then you have to change the place where you fill the data from ArrayList to YourCustomList.
Like:
List list = new ArrayList();
fillFromSoap( list );
To:
List list = new MyCustomSpecialList();
fillFromSoap( list );
The implementation would be something like the following:
class MyCustomSpecialList extends AbstractList {
    private Map<Integer, YourObject> internalMap = new HashMap<Integer, YourObject>();
    public boolean add( YourObject o ) {
        internalMap.put( o.getThatFieldYouKnow(), o );
        return true;
    }
    public boolean contains( YourObject o ) {
        return internalMap.containsKey( o.getThatFieldYouKnow() );
    }
    // get(int) and size(), which AbstractList requires, are omitted here
}
Pretty much like a HashSet, the problem here is the HashSet relies on the good implementation of the hashCode method, which probably you don't have. Instead you use as the hash "that field you know" which is the one that makes one object equals to the other.
Of course, implementing a List from scratch is a lot more tricky than my snippet above; that's why I say the Fabian Steeg suggestion would be better and easier to implement (although something like this would be more efficient).
Tell us what you did at the end.

Maybe a List isn't what you need.
Maybe a TreeSet would be a better container. You get O(log N) insertion and retrieval, and ordered iteration (but won't allow duplicates).
LinkedHashMap might be even better for your use case, check that out too.

Building a HashMap of these objects based on the field value as a key could be worthwhile from the performance perspective, e.g. populate Maps once and find objects very efficiently

If you need to search many times in the same list, it may pay off to build an index.
Iterate through the list once, and build a HashMap with the equals value you are looking for as the key and the appropriate node as the value. If you need all nodes with a given equals value rather than any one of them, then let the map have a value type of list and build the whole list in the initial iteration.
Please note that you should measure before doing this as the overhead of building the index may overshadow just traversing until the expected node is found.
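A small sketch of such an index, keeping every match per key. The Thing class and its fields are placeholders, and using a two-element List as the composite map key is just one convenient choice, since List already has value-based equals() and hashCode().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FieldIndexDemo {

    // Placeholder element type; a and b are the fields that define a match.
    static class Thing {
        final String a, b, c, d;

        Thing(String a, String b, String c, String d) {
            this.a = a;
            this.b = b;
            this.c = c;
            this.d = d;
        }
    }

    // Single pass over the list: group elements by the pair (a, b).
    static Map<List<String>, List<Thing>> buildIndex(List<Thing> things) {
        Map<List<String>, List<Thing>> index = new HashMap<>();
        for (Thing t : things) {
            index.computeIfAbsent(Arrays.asList(t.a, t.b), k -> new ArrayList<>()).add(t);
        }
        return index;
    }

    // Returns all elements whose (a, b) match, or an empty list if there are none.
    static List<Thing> find(Map<List<String>, List<Thing>> index, String a, String b) {
        return index.getOrDefault(Arrays.asList(a, b), Collections.emptyList());
    }
}
```

The index has to be rebuilt or updated whenever the underlying list changes, which is part of the overhead worth measuring.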

There are three basic options:
1) If retrieval performance is paramount and it is practical to do so, use a form of hash table built once (and altered as/if the List changes).
2) If the List is conveniently sorted or it is practical to sort it and O(log n) retrieval is sufficient, sort and search.
3) If O(n) retrieval is fast enough or if it is impractical to manipulate/maintain the data structure or an alternate, iterate over the List.
Before writing code more complex than a simple iteration over the List, it is worth thinking through some questions.
Why is something different needed? (Time) performance? Elegance? Maintainability? Reuse? All of these are okay reasons, apart or together, but they influence the solution.
How much control do you have over the data structure in question? Can you influence how it is built? Managed later?
What is the life cycle of the data structure (and underlying objects)? Is it built up all at once and never changed, or highly dynamic? Can your code monitor (or even alter) its life cycle?
Are there other important constraints, such as memory footprint? Does information about duplicates matter? Etc.

I would say the simplest solution would be to wrap the object and delegate the contains call to a collection of the wrapped class. This is similar to the comparator but doesn't force you to sort the resulting collection, you can simply use ArrayList.contains().
public class Widget {
private String name;
private String desc;
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getDesc() {
return desc;
}
public void setDesc(String desc) {
this.desc = desc;
}
}
public abstract class EqualsHashcodeEnforcer<T> {
    protected final T wrapped;
    protected EqualsHashcodeEnforcer(T wrapped) {
        this.wrapped = wrapped;
    }
    public T getWrappedObject() {
        return wrapped;
    }
    @Override
    public boolean equals(Object obj) {
        return equalsDelegate(obj);
    }
    @Override
    public int hashCode() {
        return hashCodeDelegate();
    }
    protected abstract boolean equalsDelegate(Object obj);
    protected abstract int hashCodeDelegate();
}
public class WrappedWidget extends EqualsHashcodeEnforcer<Widget> {
    public WrappedWidget(Widget widget) {
        super(widget);
    }
    // Compares the wrapped Widget against a plain Widget, so that
    // list.contains(new WrappedWidget(w)) works on a List<Widget>.
    @Override
    protected boolean equalsDelegate(Object obj) {
        if (obj == null) {
            return false;
        }
        if (obj == getWrappedObject()) {
            return true;
        }
        if (obj.getClass() != getWrappedObject().getClass()) {
            return false;
        }
        Widget rhs = (Widget) obj;
        // EqualsBuilder and HashCodeBuilder come from Apache Commons Lang.
        return new EqualsBuilder().append(getWrappedObject().getName(),
                rhs.getName()).append(getWrappedObject().getDesc(),
                rhs.getDesc()).isEquals();
    }
    @Override
    protected int hashCodeDelegate() {
        return new HashCodeBuilder(121, 991).append(
                getWrappedObject().getName()).append(
                getWrappedObject().getDesc()).toHashCode();
    }
}
