Customizing equals method as per the use case - java

I have a class defined as
class Book{
String author;
String title;
int id;
public boolean equals(Object o){
return id == ((Book)o).id;
}
public int hashCode() {...}
}
In most cases the uniqueness of a Book is determined by its id, so this works properly. In one particular case, I want to merge two lists based on the author and title values. I cannot simply add the second list to a Set, because the comparison would be made on ids and not on author/title. The only way I can see is to use two nested loops that compare each object's values.
List<Book> list1=...;
List<Book> list2 = ...;
Iterator<Book> iterator = list1.iterator();
while(iterator.hasNext()){
Book b1 = iterator.next();
for(Book b2:list2){
if(b1.getAuthor().equals(b2.getAuthor()) && b1.getTitle().equals(b2.getTitle())){
iterator.remove();
}
}
}
list2.addAll(list1);
Is there any way to override the equals method per use case (similar to a Comparator, which lets us change the sort order)?
Ideally I would just supply a customized equals method that checks the author and title values, so that the following somehow works:
set.addAll(list2);

You can do something like that with closures, but it is not overriding as such. The problem with your loop is its O(N*M) time complexity, which is not ideal. A better, O(N+M) approach is
Map<String, Book> books = new LinkedHashMap<>();
for (Book book : list1) books.put(book.author+"/"+book.title, book);
for (Book book : list2) books.remove(book.author+"/"+book.title);
list2.addAll(books.values());
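One caveat with the string key above: it can collide if an author or title itself contains "/". A minimal sketch of the same idea with a structured key follows. MergeSketch and its simplified Book are hypothetical names for illustration, and unlike the snippet above it returns a new list instead of modifying list2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: merge two lists, treating books with the same
// author and title as duplicates regardless of their id.
final class MergeSketch {
    static final class Book {
        final int id;
        final String author;
        final String title;

        Book(int id, String author, String title) {
            this.id = id;
            this.author = author;
            this.title = title;
        }
    }

    // A List of the two fields has value-based equals/hashCode, so it can
    // serve as a map key without any separator character.
    static List<String> key(Book b) {
        return Arrays.asList(b.author, b.title);
    }

    // Returns list2 followed by the books of list1 whose (author, title)
    // does not occur in list2. Neither input list is modified.
    static List<Book> merge(List<Book> list1, List<Book> list2) {
        Map<List<String>, Book> extra = new LinkedHashMap<>();
        for (Book b : list1) extra.put(key(b), b);
        for (Book b : list2) extra.remove(key(b));
        List<Book> merged = new ArrayList<>(list2);
        merged.addAll(extra.values());
        return merged;
    }

    public static void main(String[] args) {
        List<Book> a = Arrays.asList(new Book(1, "Bloch", "Effective Java"),
                                     new Book(2, "Goetz", "JCIP"));
        List<Book> b = Arrays.asList(new Book(7, "Bloch", "Effective Java"));
        System.out.println(merge(a, b).size()); // prints 2
    }
}
```

As in the snippet above, books within list1 that share an author and title collapse into one entry.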
For closures, you need a few helper functions that I couldn't find ready-made.
static class MapStream<K, V> {
final Map<K, V> map;
final Function<V, K> func;
MapStream(Iterable<V> values, Function<V, K> func) {
map = new LinkedHashMap<>();
this.func = func;
addAll(values);
}
private void addAll(Iterable<V> values) {
for (V value : values)
map.put(func.apply(value), value);
}
public MapStream<K, V> removeAll(Iterable<V> values) {
for (V value : values) {
map.remove(func.apply(value));
}
return this;
}
public Collection<V> values() {
return map.values();
}
}
public static <T> Function<T, String> and(Function<T, String> func1, Function<T, String> func2) {
return (T t) -> func1.apply(t) + "\uffff" + func2.apply(t);
}
public static void main(String... ignored) {
List<Book> list1 = new ArrayList<>();
List<Book> list2 = new ArrayList<>();
Function<Book, String> commonKey = and((Book b) -> b.author, (Book b) -> b.title);
list2.addAll(new MapStream<>(list1, commonKey).removeAll(list2).values());
}
You can see that, with some supporting code, you can do something similar with closures.

In general you need to externalize equals and hashCode methods. Therefore you could have something like:
class MyModelClass {
private EqualsImpl<MyModelClass> equalsImpl;
public MyModelClass(EqualsImpl<MyModelClass> equalsImpl) {
super();
this.equalsImpl = equalsImpl;
}
@Override
public boolean equals(Object obj) {
return equalsImpl.equals(this, obj);
}
@Override
public int hashCode() {
return equalsImpl.hashCode(this);
}
}
interface EqualsImpl<C> {
public boolean equals(C obj1, Object obj2);
public int hashCode(C obj);
}

If I understand your need correctly, you can use a TreeSet with a custom comparator:
TreeSet<Book> set = new TreeSet<Book>(new Comparator<Book>() {
public int compare(Book o1, Book o2) {
int r = o1.author.compareTo(o2.author);
if (r != 0)
return r;
return o1.title.compareTo(o2.title);
}
});
set.add(...);
SortedSet guarantees uniqueness, but it doesn't make use of equals() or hashCode(). Instead, it uses the comparator (or natural ordering) to determine equality.

Create a custom List which decorates List.add() (or List.addAll()) using a Comparator, as TreeSet does.

Just in addition to the answer of @Aniket Thakur (+1).
Yes, I also recommend using a Comparator. Define a Comparator per use case and use collections that work with comparators, e.g. TreeSet or TreeMap. This is the clearest way to achieve what you need: it separates the comparison logic from the model class itself.
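As a rough sketch of that separation, assuming Java 8+ and a simplified Book with getters (ComparatorSetSketch, BY_AUTHOR_AND_TITLE and mergeUnique are names made up for illustration):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class ComparatorSetSketch {
    static final class Book {
        final int id;
        final String author;
        final String title;

        Book(int id, String author, String title) {
            this.id = id;
            this.author = author;
            this.title = title;
        }

        String getAuthor() { return author; }
        String getTitle() { return title; }
    }

    // The use-case-specific notion of "same book" lives outside the model class.
    // Assumes author and title are never null.
    static final Comparator<Book> BY_AUTHOR_AND_TITLE =
            Comparator.comparing(Book::getAuthor).thenComparing(Book::getTitle);

    // The TreeSet keeps the first book seen for each (author, title) pair.
    static Set<Book> mergeUnique(List<Book> list1, List<Book> list2) {
        Set<Book> set = new TreeSet<>(BY_AUTHOR_AND_TITLE);
        set.addAll(list1);
        set.addAll(list2);
        return set;
    }

    public static void main(String[] args) {
        List<Book> a = Arrays.asList(new Book(1, "Bloch", "Effective Java"));
        List<Book> b = Arrays.asList(new Book(2, "Bloch", "Effective Java"),
                                     new Book(3, "Goetz", "JCIP"));
        System.out.println(mergeUnique(a, b).size()); // prints 2
    }
}
```

Book.equals() is untouched here, so other code that relies on id-based equality keeps working.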

public boolean equals(Object o){
return (author.equals(((Book)o).author) && title.equals(((Book)o).title));
}
Why can't you override the equals method as above?

You can do
import java.util.Comparator;
public class Book {
String author;
String title;
int id;
public boolean equals(Object o) {
return id == ((Book) o).id;
}
//getters and setters
//other methods
}
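class BookComparator implements Comparator<Book>{
@Override
public int compare(Book o1, Book o2) {
// Compare strings by value (not with ==) and return a consistent
// negative/zero/positive result, as TreeSet requires
int r = o1.getAuthor().compareTo(o2.getAuthor());
if (r != 0)
return r;
return o1.getTitle().compareTo(o2.getTitle());
}
}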
and then you can do something like
Set<Book> lSet = new TreeSet<Book>(new BookComparator());
lSet.addAll(list1);
lSet.addAll(list2);

Related

How to override equals(), hashcode() and compareTo() for a HashSet

I am trying to override the mentioned methods for my HashSet:
Set<MyObject> myObjectSet = new HashSet<MyObject>();
MyObject:
public class MyObject implements Serializable {
private static final long serialVersionUID = 1L;
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
Long id;
String name;
int number;
Map<String,String> myMap;
public MyObject(String name, int number, Map<String,String> myMap) {
this.name = name;
this.number = number;
this.myMap = myMap;
}
[...]
}
How do I override the hashcode(), equals() and compareTo() method?
Currently I have the following:
public int hashCode () {
return id.hashCode();
}
// override the equals method.
public boolean equals(MyObject s) {
return id.equals(s.id);
}
// override compareTo
public int compareTo(MyObject s) {
return id.compareTo(s.id);
}
I read that comparing by id is not enough if the object is a persistent entity for the DB (see here).
The name and number aren't unique across all objects of this type.
So how should I override it?
Do I also need to compare the hashMap inside it?
I am confused. The only unique thing about the object is the map myMap, which gets populated later in the lifecycle.
How do I check for its equality?
Based on all the responses I have changed the methods to the following
@Override
public boolean equals(final Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
final MyComplexObj myComplexObj = (MyComplexObj) o;
return myMap != null ? myMap.equals(myComplexObj.myMap) : myComplexObj.myMap == null;
}
@Override
public int hashCode() {
return myMap != null ? myMap.hashCode() : 0;
}
public int compareTo(MyComplexObj o) {
return myMap.compareTo(o.getMyMap());
}
This fails at the compareTo method: "this method is undefined for the type Map".
The basic question here is "How can you determine if two objects are equal to each other?"
This is a simple question for simple objects. However, it becomes increasingly difficult with even slightly more complex objects.
As stated in the original question:
The only unique thing about the object is the map myMap which gets populated later in the lifecycle.
Given two instances of the type MyObject, the member variables myMap must be compared with each other. This map is of type Map<String, String>. A few questions immediately come to mind:
How do the keys & values define equality?
(does a key=value pair need to be compared as a unit?)
(or should only the values be compared to each other?)
How does the order of the keys in the map affect equality?
(should keys in the list be sorted, so that A-B-C is equivalent to B-C-A?)
(or does 1-2-3 mean something different than 3-2-1?)
Does upper/lower case make any difference to the equality of the values?
Will these objects ever be stored in some kind of Java HashSet or Java TreeSet?
(do you need to store the same object several times in the same collection?)
(or should objects with equal hashcodes only be stored once?)
Will these objects ever require sorting as part of a list or Java Collection?
How should the comparison function arrange non-equal objects in a list?
(how should key order determine if an object will come earlier or later in a list?)
(how should values determine order, especially if several values are different?)
Answers to each of these questions will vary between applications. In order to keep this applicable to a general audience, the following assumptions are being made:
To maintain a deterministic comparison, keys will be sorted
Values will be considered to be case-sensitive
Keys and values are inseparable, and will be compared as a unit
The Map will be flattened into a single String, so results can be compared easily
The beauty of using equals(), hashCode(), and compareTo() is that once hashCode() is implemented properly, the other functions can be defined based on hashCode().
Considering all of that, we have the following implementation:
@Override
public boolean equals(final Object o)
{
if (o instanceof MyObject)
{
return (0 == this.compareTo(((MyObject) o)));
}
return false;
}
@Override
public int hashCode()
{
return getKeyValuePairs(this.myMap).hashCode();
}
// Return a negative integer, zero, or a positive integer
// if this object is less than, equal to, or greater than the other object
public int compareTo(final MyObject o)
{
// Integer.compare avoids the overflow that plain subtraction can cause
return Integer.compare(this.hashCode(), o.hashCode());
}
// The Map is flattened into a single String for comparison
private static String getKeyValuePairs(final Map<String, String> m)
{
final StringBuilder kvPairs = new StringBuilder();
final String kvSeparator = "=";
final String liSeparator = "^";
if (null != m)
{
final List<String> keys = new ArrayList<>(m.keySet());
Collections.sort(keys);
for (final String key : keys)
{
final String value = m.get(key);
kvPairs.append(liSeparator);
kvPairs.append(key);
kvPairs.append(kvSeparator);
kvPairs.append(null == value ? "" : value);
}
}
return 0 == kvPairs.length() ? "" : kvPairs.substring(liSeparator.length());
}
All the critical work is being done inside of hashCode(). For sorting, the compareTo() function only needs to return a negative/zero/positive number: a simple hashCode() comparison. And the equals() function only needs to return true/false: a simple check that compareTo() equals zero. Be aware that two different maps can produce the same hash code, in which case this implementation treats them as equal.
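If hash collisions are a concern, the flattened strings themselves can be compared instead of their hash codes. A minimal sketch of that variation, under the same assumptions as above (sorted keys, case-sensitive values, key and value compared as a unit); FlattenedMapKey is a hypothetical class name, and it additionally assumes the map has no null keys:

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

final class FlattenedMapKey implements Comparable<FlattenedMapKey> {
    // TreeMap keeps keys sorted, so the flattened form is deterministic.
    private final Map<String, String> myMap;

    FlattenedMapKey(Map<String, String> myMap) {
        this.myMap = new TreeMap<>(myMap);
    }

    // Flattens the map into "k1=v1^k2=v2", treating null values as "".
    private String flatten() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : myMap.entrySet()) {
            if (sb.length() > 0) sb.append('^');
            sb.append(e.getKey()).append('=')
              .append(e.getValue() == null ? "" : e.getValue());
        }
        return sb.toString();
    }

    @Override
    public int compareTo(FlattenedMapKey o) {
        // Compare the strings, not their hash codes
        return flatten().compareTo(o.flatten());
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof FlattenedMapKey && compareTo((FlattenedMapKey) o) == 0;
    }

    @Override
    public int hashCode() {
        return flatten().hashCode();
    }

    public static void main(String[] args) {
        // "Aa" and "BB" are a well-known pair with identical String hash codes
        FlattenedMapKey x = new FlattenedMapKey(Collections.singletonMap("k", "Aa"));
        FlattenedMapKey y = new FlattenedMapKey(Collections.singletonMap("k", "BB"));
        System.out.println(x.hashCode() == y.hashCode()); // true
        System.out.println(x.equals(y));                  // false
    }
}
```

Like the original flattening, this is ambiguous if keys or values themselves contain "=" or "^".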
For further reading, there is a famous dialogue by Lewis Carroll on the foundations of logic, which touches on the basic question of equality:
https://en.wikipedia.org/wiki/What_the_Tortoise_Said_to_Achilles
And, in regard to even simple grammatical constructs, there is a fine example of two "equal" sentences at the start of chapter 6, "Pig and Pepper", from Alice in Wonderland:
The Fish-Footman began by producing from under his arm a great letter, and this he handed over to the other, saying, in a solemn tone, "For the Duchess. An invitation from the Queen to play croquet." The Frog-Footman repeated, in the same solemn tone, "From the Queen. An invitation for the Duchess to play croquet." Then they both bowed low and their curls got entangled together.
compareTo() is relevant to sorting. It has no relevance to a HashSet or HashMap.
A properly working equals() and hashCode() are vital for members of hash-based collections. Read their specifications in the Javadoc for Object.
Possibly the definitive recommendations for implementing these are in Joshua Bloch's Effective Java. I recommend reading the relevant chapter -- it's easily Google-able. There's no point in trying to paraphrase it all here.
One thing that may have escaped your notice, is that your field myMap has a working equals() and hashCode() of its own, so you don't have to do anything special with it. If you can guarantee that none of the fields are null, a reasonable hashCode() would be (following Bloch's system):
public int hashCode() {
int result = 44; // arbitrarily chosen
result = 31 * result + (int) (id ^ (id >>> 32));
result = 31 * result + name.hashCode();
result = 31 * result + number;
result = 31 * result + myMap.hashCode();
return result;
}
(You'll need more code if any of these could be null)
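When fields may be null, java.util.Objects (Java 7+) covers the extra checks. A minimal sketch, assuming the same four fields as above; NullSafeEntity is a hypothetical stand-in for MyObject:

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for MyObject with the same four fields.
final class NullSafeEntity {
    private final Long id;
    private final String name;
    private final int number;
    private final Map<String, String> myMap;

    NullSafeEntity(Long id, String name, int number, Map<String, String> myMap) {
        this.id = id;
        this.name = name;
        this.number = number;
        this.myMap = myMap;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof NullSafeEntity)) return false;
        NullSafeEntity other = (NullSafeEntity) o;
        // Objects.equals treats two nulls as equal and never throws
        return number == other.number
                && Objects.equals(id, other.id)
                && Objects.equals(name, other.name)
                && Objects.equals(myMap, other.myMap);
    }

    @Override
    public int hashCode() {
        // Objects.hash combines the fields with the usual 31-based scheme
        return Objects.hash(id, name, number, myMap);
    }
}
```

Which fields belong in the comparison is still a design decision; this sketch simply uses all of them.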
Pretty much all IDEs will automatically generate both equals() and hashCode(), using all the fields in the class. They'll use something very similar to Bloch's recommendations. Hunt around the UI. You'll find it.
Another alternative is to use HashCodeBuilder and EqualsBuilder from Apache Commons Lang, which allow you to simply use:
@Override
public int hashCode() {
return HashCodeBuilder.reflectionHashCode(this);
}
@Override
public boolean equals(final Object obj) {
return EqualsBuilder.reflectionEquals(this, obj);
}
This works out which fields to use at runtime, and applies Bloch's methods.
This is what IntelliJ's default option generates:
import java.util.Map;
public class MyObject {
String name;
int number;
Map<String,String> myMap;
@Override
public boolean equals(final Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
final MyObject myObject = (MyObject) o;
if (number != myObject.number) return false;
if (name != null ? !name.equals(myObject.name) : myObject.name != null) return false;
return myMap != null ? myMap.equals(myObject.myMap) : myObject.myMap == null;
}
@Override
public int hashCode() {
int result = name != null ? name.hashCode() : 0;
result = 31 * result + number;
result = 31 * result + (myMap != null ? myMap.hashCode() : 0);
return result;
}
}
But, since you said
The only unique thing about the object is the map myMap which gets populated later in the lifecycle.
I would just keep myMap and skip both name and number. (But this raises the question: why would you include redundant data, name and number, in all the elements of your collection?)
Then it becomes
import java.util.Map;
public class MyObject {
String name;
int number;
Map<String,String> myMap;
@Override
public boolean equals(final Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
final MyObject myObject = (MyObject) o;
return myMap != null ? myMap.equals(myObject.myMap) : myObject.myMap == null;
}
@Override
public int hashCode() {
return myMap != null ? myMap.hashCode() : 0;
}
}
Keep in mind that there are other ways to implement the equals and hashCode methods as well. For example, IntelliJ offers several templates for code generation.
To answer the further question about compareTo:
Unlike equals and hashCode, there is no strict contract between compareTo and other methods (although keeping it consistent with equals is strongly recommended). You don't need to do anything with compareTo until you want to use it for, say, sorting. To read more about compareTo, see: Why should a Java class implement Comparable?
If you want myMap to implement Comparable, along with any other methods you need, create a decorator that implements the Comparable interface and delegates all other methods to the enclosed myMap instance.
public class ComparableMap implements Map<String, String>, Comparable<Map<String, String>> {
private final Map<String, String> map;
public ComparableMap(Map<String, String> map) {
this.map = map;
}
@Override
public int compareTo(Map<String, String> o) {
int result = 0;
// your implementation: decide from the map contents whether this map is greater than, less than, or equal to the other
return result;
}
@Override
public boolean equals(Object obj) {
return map.equals(obj);
}
@Override
public int hashCode() {
return map.hashCode();
}
// map implementation methods
@Override
public int size() {
return map.size();
}
@Override
public boolean isEmpty() {
return map.isEmpty();
}
@Override
public boolean containsKey(Object key) {
return map.containsKey(key);
}
@Override
public boolean containsValue(Object value) {
return map.containsValue(value);
}
@Override
public String get(Object key) {
return map.get(key);
}
@Override
public String put(String key, String value) {
return map.put(key, value);
}
@Override
public String remove(Object key) {
return map.remove(key);
}
@Override
public void putAll(Map<? extends String, ? extends String> m) {
map.putAll(m);
}
@Override
public void clear() {
map.clear();
}
@Override
public Set<String> keySet() {
return map.keySet();
}
@Override
public Collection<String> values() {
return map.values();
}
@Override
public Set<Entry<String, String>> entrySet() {
return map.entrySet();
}
}
You may use this map anywhere you use myMap:
public class MyObject implements Serializable, Comparable<MyObject> {
private static final long serialVersionUID = 1L;
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
Long id;
String name;
int number;
ComparableMap myMap;
public MyObject(String name, int number, Map<String, String> myMap) {
this.name = name;
this.number = number;
// wrap the plain map in the Comparable decorator
this.myMap = new ComparableMap(myMap);
}
@Override
public boolean equals(final Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
final MyObject other = (MyObject) o;
return myMap != null ? myMap.equals(other.myMap) : other.myMap == null;
}
@Override
public int hashCode() {
return myMap != null ? myMap.hashCode() : 0;
}
@Override
public int compareTo(MyObject o) {
return myMap.compareTo(o.myMap); //now it works
}
}

How to avoid having different methods looping the same list?

I have a class that is mapped from a xml. To make it simple, let's imagine this class is something like:
class Employee implements EmployeeIF {
Map<AttributeIF,Object> attribute = new HashMap<>();
@Override
public Map<AttributeIF,Object> getAttributes() { return attribute; }
}
This is something I cannot change.
Now, the existing code is full of methods like:
public int getSalary(EmployeeIF employee) {
for(Entry<AttributeIF,Object> entry : employee.getAttributes().entrySet()) {
if(entry.getKey().getName().equals("salary")) return (Integer)entry.getValue();
}
return 0;
}
public String getAddress(EmployeeIF employee) {
for(Entry<AttributeIF,Object> entry : employee.getAttributes().entrySet()) {
if(entry.getKey().getName().equals("address")) return (String)entry.getValue();
}
return "";
}
... and so on. Surely you got the idea.
I need to include a new method to return a new attribute from the employee, but as I feel this is horrible to maintain, I refuse to just add a new method there.
I am thinking of using the action pattern to at least avoid repeating the for loop again and again, but I have to say that I cannot find a smart solution for this.
What would be your choices?
Thanks,
Dani.
P.S. Yes, I tried something like
private Object getAttribute(EmployeeIF employee, String attribute)
Here is a tiny example of how you could look up a value using a key object that you don't have at hand.
public class TestObject {
public String val;
public TestObject(String val) {
this.val = val;
}
public static TestObject createDummy(String val) {
return new TestObject(val);
}
@Override
public boolean equals(Object obj) {
if (this == obj) return true;
if (!(obj instanceof TestObject)) return false;
return ((TestObject)obj).val.equals(this.val);
}
@Override
public int hashCode() {
System.out.println("THIS ONE IS IMPORTANT");
return val.hashCode();
}
}
public class TestMap {
public Map<TestObject, String> map = new HashMap<>();
public String get(String keyVal) {
return map.get(TestObject.createDummy(keyVal));
}
public static void main(String[] args) {
TestMap map = new TestMap();
TestObject o1 = new TestObject("A");
map.map.put(o1,"B");
TestObject o2 = new TestObject("B");
map.map.put(o2,"C");
TestObject o3 = new TestObject("C");
map.map.put(o3,"D");
System.out.println(map.get("B"));
}
}
The key is to override equals and hashCode in your AttributeIF class. Then, when you pass in a dummy AttributeIF object, the map treats it as equal to the logically "equal" key instance already stored in the map and returns that key's value.
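If equals and hashCode on AttributeIF cannot be changed, the repeated loop can at least be written once. A minimal sketch, assuming AttributeIF exposes getName() as in the question. The nested interfaces below are simplified stand-ins for the real ones, and getAttribute with its type and default-value parameters is a hypothetical helper:

```java
import java.util.Map;

final class AttributeLookupSketch {
    // Simplified stand-ins for the interfaces from the question.
    interface AttributeIF { String getName(); }
    interface EmployeeIF { Map<AttributeIF, Object> getAttributes(); }

    // The single place that loops over the attributes. Returns defaultValue
    // when the attribute is missing or has an unexpected type.
    static <T> T getAttribute(EmployeeIF employee, String name,
                              Class<T> type, T defaultValue) {
        for (Map.Entry<AttributeIF, Object> entry : employee.getAttributes().entrySet()) {
            if (entry.getKey().getName().equals(name) && type.isInstance(entry.getValue())) {
                return type.cast(entry.getValue());
            }
        }
        return defaultValue;
    }

    // Existing accessors become one-liners; new attributes need no new loop.
    static int getSalary(EmployeeIF employee) {
        return getAttribute(employee, "salary", Integer.class, 0);
    }

    static String getAddress(EmployeeIF employee) {
        return getAttribute(employee, "address", String.class, "");
    }
}
```

This is still a linear scan per lookup, just like the original methods.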

Set ordered by "add() count"

I'm trying to implement a Set which is ordered by the count of additions like this:
public class App {
public static void main(String args[]) {
FrequencyOrderedTreeSet<String> set = new FrequencyOrderedTreeSet<String>();
set.add("bar");
set.add("foo");
set.add("foo");
Iterator<String> i = set.iterator();
while (i.hasNext()) {
System.out.print(i.next());
}
// prints "foobar"
}
}
I've created a protected class FrequencyOrderedTreeSet.Element, which implements Comparable and has a T entry and an int frequency property. FrequencyOrderedTreeSet<T> extends TreeSet<FrequencyOrderedTreeSet.Element>, and I overrode the compareTo and equals methods on Element.
One problem is that I can't override the add() method because of type erasure problems and also I can't call instanceof Element in the equals method, because in case object given to it is an Element, I have to compare their entries, but if it's not, I have to compare the object itself to this.entry.
In the add method I create a new element, find the element with the same entry in the set, set the frequency on the new element to "old+1", remove the old one and add the new one. I'm not even sure this is the best way to do this, or whether it would even work, because of the other problems I described.
The question is: what's the best way to implement such a data structure? In case I'm somehow on the right track, how can I circumvent the problems I've mentioned above?
Here's a basic implementation. It's not the most optimal and will take some more work if you want to implement the full Set interface.
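import java.util.Comparator;
import java.util.HashMap;
import java.util.Iterator;
import java.util.TreeSet;
public class FrequencySet<T extends Comparable<T>> implements Iterable<T>
{
private TreeSet<T> set;
private HashMap<T, Integer> elements = new HashMap<T, Integer>();
public FrequencySet()
{
set = new TreeSet<T>(new Comparator<T>()
{
public int compare(T o1, T o2)
{
// Higher count first; break ties by natural order so that distinct
// elements with equal counts are not treated as duplicates
int byCount = Integer.compare(elements.get(o2), elements.get(o1));
return byCount != 0 ? byCount : o1.compareTo(o2);
}
});
}
public void add(T t)
{
Integer i = elements.get(t);
// Remove under the old count, before the ordering changes
if (i != null) set.remove(t);
elements.put(t, i == null ? 1 : i+1);
set.add(t);
}
public Iterator<T> iterator() {return set.iterator();}
public static void main(String [] args)
{
FrequencySet<String> fset = new FrequencySet<String>();
fset.add("foo");
fset.add("bar");
fset.add("foo");
for (String s : fset)
System.out.print(s);
System.out.println();
fset.add("bar");
fset.add("bar");
for (String s : fset)
System.out.print(s);
}
}
The key is in the add method. We remove the object from the backing set while its old count is still in effect, change its counter (which changes the relative order), and put it back in. The comparator breaks ties by natural order, so distinct elements with the same count are not mistaken for duplicates.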
This works the other way (count is increased when you use GET)
@SuppressWarnings("rawtypes")
final class Cache implements Comparable {
private String key;
private String value;
private int counter;
public String getValue() {
counter++;
return value;
}
private void setValue(String value) { this.value = value; }
public String getKey() { return key; }
private void setKey(String key) { this.key = key; }
public int getCounter() { return counter; }
public void setCounter(int counter) { this.counter = counter; }
public Cache(String key, String value) {
this.setKey(key);
this.setValue(value);
setCounter(0);
}
@Override
public int compareTo(Object arg0) {
if(!(arg0 instanceof Cache)) {
throw new ClassCastException();
}
return this.getCounter() - ((Cache) arg0).getCounter();
}
}

Case insensitive string as HashMap key

I would like to use a case-insensitive string as a HashMap key, for the following reasons:
During initialization, my program creates a HashMap with user-defined Strings.
While processing an event (network traffic in my case), I might receive a String in a different case, but I should still be able to locate the <key, value> entry in the HashMap, ignoring the case of what I received from the traffic.
I've followed this approach:
CaseInsensitiveString.java
public final class CaseInsensitiveString {
private String s;
public CaseInsensitiveString(String s) {
if (s == null)
throw new NullPointerException();
this.s = s;
}
public boolean equals(Object o) {
return o instanceof CaseInsensitiveString &&
((CaseInsensitiveString)o).s.equalsIgnoreCase(s);
}
private volatile int hashCode = 0;
public int hashCode() {
if (hashCode == 0)
hashCode = s.toUpperCase().hashCode();
return hashCode;
}
public String toString() {
return s;
}
}
LookupCode.java
node = nodeMap.get(new CaseInsensitiveString(stringFromEvent.toString()));
Because of this, I'm creating a new CaseInsensitiveString object for every event, which might hurt performance.
Is there any other way to solve this issue?
Map<String, String> nodeMap =
new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
That's really all you need.
As suggested by Guido García in their answer here:
import java.util.HashMap;
public class CaseInsensitiveMap extends HashMap<String, String> {
@Override
public String put(String key, String value) {
return super.put(key.toLowerCase(), value);
}
// not @Override because that would require the key parameter to be of type Object
public String get(String key) {
return super.get(key.toLowerCase());
}
}
Or
https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/map/CaseInsensitiveMap.html
One approach is to create a custom subclass of the Apache Commons AbstractHashedMap class, overriding the hash and isEqualKeys methods to perform case insensitive hashing and comparison of keys. (Note - I've never tried this myself ...)
This avoids the overhead of creating new objects each time you need to do a map lookup or update. And the common Map operations should be O(1) ... just like a regular HashMap.
And if you are prepared to accept the implementation choices they have made, the Apache Commons CaseInsensitiveMap does the work of customizing / specializing AbstractHashedMap for you.
But if O(logN) get and put operations are acceptable, a TreeMap with a case insensitive string comparator is an option; e.g. using String.CASE_INSENSITIVE_ORDER.
And if you don't mind creating a new temporary String object each time you do a put or get, then Vishal's answer is just fine. (Though, I note that you wouldn't be preserving the original case of the keys if you did that ...)
Subclass HashMap and create a version that lower-cases the key on put and get (and probably the other key-oriented methods).
Or compose a HashMap into the new class and delegate everything to the map, but translate the keys.
If you need to keep the original key you could either maintain dual maps, or store the original key along with the value.
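The second option (storing the original key next to the value) could look roughly like this. It is only a sketch: CaseInsensitiveLookup is a hypothetical name, it implements just put/get/originalKey rather than the full Map interface, and normalizing with Locale.ROOT is an assumption on my part.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class CaseInsensitiveLookup<V> {
    // normalized key -> (original key, value)
    private final Map<String, Map.Entry<String, V>> entries = new HashMap<>();

    private static String normalize(String key) {
        return key.toLowerCase(Locale.ROOT);
    }

    // Stores the value under the lower-cased key and remembers the
    // key exactly as the caller spelled it. Returns the previous value, if any.
    V put(String key, V value) {
        Map.Entry<String, V> old = entries.put(
                normalize(key), new AbstractMap.SimpleImmutableEntry<>(key, value));
        return old == null ? null : old.getValue();
    }

    V get(String key) {
        Map.Entry<String, V> e = entries.get(normalize(key));
        return e == null ? null : e.getValue();
    }

    // Returns the spelling used in the most recent put for this key, or null.
    String originalKey(String key) {
        Map.Entry<String, V> e = entries.get(normalize(key));
        return e == null ? null : e.getKey();
    }
}
```

Note that each lookup still creates a temporary lower-cased String, so this addresses key preservation rather than allocation.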
Two choices come to my mind:
You could directly use s.toUpperCase().hashCode() as the key of the Map (bearing in mind that two different strings can share a hash code).
You could use a TreeMap<String, V> with a custom Comparator that ignores case.
Otherwise, if you prefer your solution, instead of defining a new kind of String, I would rather implement a new Map with the required case-insensitivity.
Wouldn't it be better to "wrap" the String in order to memoize the hashCode? In the normal String class, hashCode() is O(N) the first time and O(1) afterwards, since it is cached for future use.
public class HashWrap {
private final String value;
private final int hash;
public String get() {
return value;
}
public HashWrap(String value) {
this.value = value;
String lc = value.toLowerCase();
this.hash = lc.hashCode();
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o instanceof HashWrap) {
HashWrap that = (HashWrap) o;
return value.equalsIgnoreCase(that.value);
} else {
return false;
}
}
@Override
public int hashCode() {
return this.hash;
}
//might want to implement compare too if you want to use with SortedMaps/Sets.
}
This would allow you to use any hash table implementation in Java and to have an O(1) hashCode().
You can use a HashingStrategy based Map from Eclipse Collections
HashingStrategy<String> hashingStrategy =
HashingStrategies.fromFunction(String::toUpperCase);
MutableMap<String, String> node = HashingStrategyMaps.mutable.of(hashingStrategy);
Note: I am a contributor to Eclipse Collections.
Based on other answers, there are basically two approaches: subclassing HashMap or wrapping String. The first one requires a little more work. In fact, if you want to do it correctly, you must override almost all methods (containsKey, entrySet, get, put, putAll and remove).
Anyway, it has a problem. If you want to avoid future problems, you must specify a Locale in String case operations. So you would create new methods (get(String, Locale), ...). Everything is easier and clearer wrapping String:
public final class CaseInsensitiveString {
private final String s;
public CaseInsensitiveString(String s, Locale locale) {
this.s = s.toUpperCase(locale);
}
// equals, hashCode & toString, no need for memoizing hashCode
}
And well, about your worries on performance: premature optimization is the root of all evil :)
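To make the locale point concrete, here is one way the elided methods might be filled in. This is a sketch under my own assumptions, not the code the answer above had in mind: the class is renamed LocaleFoldedString to avoid confusion, and it keeps the original text for toString().

```java
import java.util.Locale;
import java.util.Objects;

final class LocaleFoldedString {
    private final String original;
    private final String folded;

    LocaleFoldedString(String s, Locale locale) {
        this.original = Objects.requireNonNull(s, "s");
        // Fold once, with an explicit locale, instead of on every comparison
        this.folded = s.toUpperCase(Objects.requireNonNull(locale, "locale"));
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof LocaleFoldedString
                && ((LocaleFoldedString) o).folded.equals(folded);
    }

    @Override
    public int hashCode() {
        // String caches its own hash, so no extra memoization is needed
        return folded.hashCode();
    }

    @Override
    public String toString() {
        return original;
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // In Turkish, "i" upper-cases to dotted capital I (U+0130), not "I"
        System.out.println(new LocaleFoldedString("i", Locale.ROOT)
                .equals(new LocaleFoldedString("I", Locale.ROOT))); // true
        System.out.println(new LocaleFoldedString("i", turkish)
                .equals(new LocaleFoldedString("I", turkish)));     // false
    }
}
```

The Turkish example is why relying on the default locale can make the same keys match on one machine and not on another.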
Instead of creating your own class to validate and store case insensitive string as a HashMap key, you can use:
LinkedCaseInsensitiveMap wraps a LinkedHashMap, which is a Map based on a hash table and a linked list. Unlike LinkedHashMap, it doesn't allow null key inserting. LinkedCaseInsensitiveMap preserves the original order as well as the original casing of keys while allowing calling functions like get and remove with any case.
Eg:
Map<String, Integer> linkedHashMap = new LinkedCaseInsensitiveMap<>();
linkedHashMap.put("abc", 1);
linkedHashMap.put("AbC", 2);
System.out.println(linkedHashMap);
Output: {AbC=2}
Mvn Dependency:
Spring Core is a Spring Framework module that also provides utility classes, including LinkedCaseInsensitiveMap.
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-core</artifactId>
<version>5.2.5.RELEASE</version>
</dependency>
CaseInsensitiveMap is a hash-based Map, which converts keys to lower case before they are being added or retrieved. Unlike TreeMap, CaseInsensitiveMap allows null key inserting.
Eg:
Map<String, Integer> commonsHashMap = new CaseInsensitiveMap<>();
commonsHashMap.put("ABC", 1);
commonsHashMap.put("abc", 2);
commonsHashMap.put("aBc", 3);
System.out.println(commonsHashMap);
Output: {abc=3}
Dependency:
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-collections4</artifactId>
<version>4.4</version>
</dependency>
TreeMap is an implementation of NavigableMap, which means that it always sorts the entries after inserting, based on a given Comparator. Also, TreeMap uses a Comparator to find if an inserted key is a duplicate or a new one.
Therefore, if we provide a case-insensitive String Comparator, we'll get a case-insensitive TreeMap.
Eg:
Map<String, Integer> treeMap = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
treeMap.put("ABC", 1);
treeMap.put("ABc", 2);
treeMap.put("cde", 1);
System.out.println(treeMap);
Output: {ABC=2, cde=1}
You can use CollationKey objects instead of strings:
Locale locale = ...;
Collator collator = Collator.getInstance(locale);
collator.setStrength(Collator.SECONDARY); // Case-insensitive.
collator.setDecomposition(Collator.FULL_DECOMPOSITION);
CollationKey collationKey = collator.getCollationKey(stringKey);
hashMap.put(collationKey, value);
hashMap.get(collationKey);
Use Collator.PRIMARY to ignore accent differences.
The CollationKey API does not guarantee that hashCode() and equals() are implemented, but in practice you'll be using RuleBasedCollationKey, which does implement these. If you're paranoid, you can use a TreeMap instead, which is guaranteed to work at the cost of O(log n) time instead of O(1).
This is an adapter for HashMaps which I implemented for a recent project. It works in a way similar to what @SandyR does, but encapsulates the conversion logic so you don't manually convert strings to a wrapper object.
I used Java 8 features but with a few changes, you can adapt it to previous versions. I tested it for most common scenarios, except new Java 8 stream functions.
Basically it wraps a HashMap, directs all functions to it while converting strings to/from a wrapper object. But I had to also adapt KeySet and EntrySet because they forward some functions to the map itself. So I return two new Sets for keys and entries which actually wrap the original keySet() and entrySet().
One note: Java 8 has changed the implementation of putAll method which I could not find an easy way to override. So current implementation may have degraded performance especially if you use putAll() for a large data set.
Please let me know if you find a bug or have suggestions to improve the code.
package webbit.collections;
import java.util.*;
import java.util.function.*;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;
public class CaseInsensitiveMapAdapter<T> implements Map<String,T>
{
private Map<CaseInsensitiveMapKey,T> map;
private KeySet keySet;
private EntrySet entrySet;
public CaseInsensitiveMapAdapter()
{
}
public CaseInsensitiveMapAdapter(Map<String, T> map)
{
this.map = getMapImplementation();
this.putAll(map);
}
@Override
public int size()
{
return getMap().size();
}
@Override
public boolean isEmpty()
{
return getMap().isEmpty();
}
@Override
public boolean containsKey(Object key)
{
return getMap().containsKey(lookupKey(key));
}
@Override
public boolean containsValue(Object value)
{
return getMap().containsValue(value);
}
@Override
public T get(Object key)
{
return getMap().get(lookupKey(key));
}
@Override
public T put(String key, T value)
{
return getMap().put(lookupKey(key), value);
}
@Override
public T remove(Object key)
{
return getMap().remove(lookupKey(key));
}
/**
* Ignores the Java 8 implementation entirely and puts entries one by one. This will be slower.
*/
@Override
public void putAll(Map<? extends String, ? extends T> m)
{
for (String key : m.keySet()) {
getMap().put(lookupKey(key),m.get(key));
}
}
@Override
public void clear()
{
getMap().clear();
}
@Override
public Set<String> keySet()
{
if (keySet == null)
keySet = new KeySet(getMap().keySet());
return keySet;
}
@Override
public Collection<T> values()
{
return getMap().values();
}
@Override
public Set<Entry<String, T>> entrySet()
{
if (entrySet == null)
entrySet = new EntrySet(getMap().entrySet());
return entrySet;
}
@Override
public boolean equals(Object o)
{
return getMap().equals(o);
}
@Override
public int hashCode()
{
return getMap().hashCode();
}
@Override
public T getOrDefault(Object key, T defaultValue)
{
return getMap().getOrDefault(lookupKey(key), defaultValue);
}
@Override
public void forEach(final BiConsumer<? super String, ? super T> action)
{
getMap().forEach(new BiConsumer<CaseInsensitiveMapKey, T>()
{
@Override
public void accept(CaseInsensitiveMapKey lookupKey, T t)
{
action.accept(lookupKey.key,t);
}
});
}
@Override
public void replaceAll(final BiFunction<? super String, ? super T, ? extends T> function)
{
getMap().replaceAll(new BiFunction<CaseInsensitiveMapKey, T, T>()
{
@Override
public T apply(CaseInsensitiveMapKey lookupKey, T t)
{
return function.apply(lookupKey.key,t);
}
});
}
@Override
public T putIfAbsent(String key, T value)
{
return getMap().putIfAbsent(lookupKey(key), value);
}
@Override
public boolean remove(Object key, Object value)
{
return getMap().remove(lookupKey(key), value);
}
@Override
public boolean replace(String key, T oldValue, T newValue)
{
return getMap().replace(lookupKey(key), oldValue, newValue);
}
@Override
public T replace(String key, T value)
{
return getMap().replace(lookupKey(key), value);
}
@Override
public T computeIfAbsent(String key, final Function<? super String, ? extends T> mappingFunction)
{
return getMap().computeIfAbsent(lookupKey(key), new Function<CaseInsensitiveMapKey, T>()
{
@Override
public T apply(CaseInsensitiveMapKey lookupKey)
{
return mappingFunction.apply(lookupKey.key);
}
});
}
@Override
public T computeIfPresent(String key, final BiFunction<? super String, ? super T, ? extends T> remappingFunction)
{
return getMap().computeIfPresent(lookupKey(key), new BiFunction<CaseInsensitiveMapKey, T, T>()
{
@Override
public T apply(CaseInsensitiveMapKey lookupKey, T t)
{
return remappingFunction.apply(lookupKey.key, t);
}
});
}
@Override
public T compute(String key, final BiFunction<? super String, ? super T, ? extends T> remappingFunction)
{
return getMap().compute(lookupKey(key), new BiFunction<CaseInsensitiveMapKey, T, T>()
{
@Override
public T apply(CaseInsensitiveMapKey lookupKey, T t)
{
return remappingFunction.apply(lookupKey.key,t);
}
});
}
@Override
public T merge(String key, T value, BiFunction<? super T, ? super T, ? extends T> remappingFunction)
{
return getMap().merge(lookupKey(key), value, remappingFunction);
}
protected Map<CaseInsensitiveMapKey,T> getMapImplementation() {
return new HashMap<>();
}
private Map<CaseInsensitiveMapKey,T> getMap() {
if (map == null)
map = getMapImplementation();
return map;
}
private CaseInsensitiveMapKey lookupKey(Object key)
{
return new CaseInsensitiveMapKey((String)key);
}
public class CaseInsensitiveMapKey {
private String key;
private String lookupKey;
public CaseInsensitiveMapKey(String key)
{
this.key = key;
this.lookupKey = key.toUpperCase();
}
@Override
public boolean equals(Object o)
{
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
CaseInsensitiveMapKey that = (CaseInsensitiveMapKey) o;
return lookupKey.equals(that.lookupKey);
}
@Override
public int hashCode()
{
return lookupKey.hashCode();
}
}
private class KeySet implements Set<String> {
private Set<CaseInsensitiveMapKey> wrapped;
public KeySet(Set<CaseInsensitiveMapKey> wrapped)
{
this.wrapped = wrapped;
}
private List<String> keyList() {
return stream().collect(Collectors.toList());
}
private Collection<CaseInsensitiveMapKey> mapCollection(Collection<?> c) {
return c.stream().map(it -> lookupKey(it)).collect(Collectors.toList());
}
@Override
public int size()
{
return wrapped.size();
}
@Override
public boolean isEmpty()
{
return wrapped.isEmpty();
}
@Override
public boolean contains(Object o)
{
return wrapped.contains(lookupKey(o));
}
@Override
public Iterator<String> iterator()
{
return keyList().iterator();
}
@Override
public Object[] toArray()
{
return keyList().toArray();
}
@Override
public <T> T[] toArray(T[] a)
{
return keyList().toArray(a);
}
@Override
public boolean add(String s)
{
return wrapped.add(lookupKey(s));
}
@Override
public boolean remove(Object o)
{
return wrapped.remove(lookupKey(o));
}
@Override
public boolean containsAll(Collection<?> c)
{
return keyList().containsAll(c);
}
@Override
public boolean addAll(Collection<? extends String> c)
{
return wrapped.addAll(mapCollection(c));
}
@Override
public boolean retainAll(Collection<?> c)
{
return wrapped.retainAll(mapCollection(c));
}
@Override
public boolean removeAll(Collection<?> c)
{
return wrapped.removeAll(mapCollection(c));
}
@Override
public void clear()
{
wrapped.clear();
}
@Override
public boolean equals(Object o)
{
return wrapped.equals(lookupKey(o));
}
@Override
public int hashCode()
{
return wrapped.hashCode();
}
@Override
public Spliterator<String> spliterator()
{
return keyList().spliterator();
}
@Override
public boolean removeIf(Predicate<? super String> filter)
{
return wrapped.removeIf(new Predicate<CaseInsensitiveMapKey>()
{
@Override
public boolean test(CaseInsensitiveMapKey lookupKey)
{
return filter.test(lookupKey.key);
}
});
}
@Override
public Stream<String> stream()
{
return wrapped.stream().map(it -> it.key);
}
@Override
public Stream<String> parallelStream()
{
return wrapped.stream().map(it -> it.key).parallel();
}
@Override
public void forEach(Consumer<? super String> action)
{
wrapped.forEach(new Consumer<CaseInsensitiveMapKey>()
{
@Override
public void accept(CaseInsensitiveMapKey lookupKey)
{
action.accept(lookupKey.key);
}
});
}
}
private class EntrySet implements Set<Map.Entry<String,T>> {
private Set<Entry<CaseInsensitiveMapKey,T>> wrapped;
public EntrySet(Set<Entry<CaseInsensitiveMapKey,T>> wrapped)
{
this.wrapped = wrapped;
}
private List<Map.Entry<String,T>> keyList() {
return stream().collect(Collectors.toList());
}
private Collection<Entry<CaseInsensitiveMapKey,T>> mapCollection(Collection<?> c) {
return c.stream().map(it -> new CaseInsensitiveEntryAdapter((Entry<String,T>)it)).collect(Collectors.toList());
}
@Override
public int size()
{
return wrapped.size();
}
@Override
public boolean isEmpty()
{
return wrapped.isEmpty();
}
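@Override
public boolean contains(Object o)
{
// lookupKey(o) would cast the entry itself to String; adapt the entry's key instead.
return o instanceof Entry && wrapped.contains(new CaseInsensitiveEntryAdapter((Entry<String,T>) o));
}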
@Override
public Iterator<Map.Entry<String,T>> iterator()
{
return keyList().iterator();
}
@Override
public Object[] toArray()
{
return keyList().toArray();
}
@Override
public <T> T[] toArray(T[] a)
{
return keyList().toArray(a);
}
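@Override
public boolean add(Entry<String,T> s)
{
// Adapt the entry rather than passing null; a HashMap-backed entry set still rejects add().
return wrapped.add(new CaseInsensitiveEntryAdapter(s));
}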
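@Override
public boolean remove(Object o)
{
// lookupKey(o) would cast the entry itself to String; adapt the entry's key instead.
return o instanceof Entry && wrapped.remove(new CaseInsensitiveEntryAdapter((Entry<String,T>) o));
}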
@Override
public boolean containsAll(Collection<?> c)
{
return keyList().containsAll(c);
}
@Override
public boolean addAll(Collection<? extends Entry<String,T>> c)
{
return wrapped.addAll(mapCollection(c));
}
@Override
public boolean retainAll(Collection<?> c)
{
return wrapped.retainAll(mapCollection(c));
}
@Override
public boolean removeAll(Collection<?> c)
{
return wrapped.removeAll(mapCollection(c));
}
@Override
public void clear()
{
wrapped.clear();
}
@Override
public boolean equals(Object o)
{
return wrapped.equals(lookupKey(o));
}
@Override
public int hashCode()
{
return wrapped.hashCode();
}
@Override
public Spliterator<Entry<String,T>> spliterator()
{
return keyList().spliterator();
}
@Override
public boolean removeIf(Predicate<? super Entry<String, T>> filter)
{
return wrapped.removeIf(new Predicate<Entry<CaseInsensitiveMapKey, T>>()
{
@Override
public boolean test(Entry<CaseInsensitiveMapKey, T> entry)
{
return filter.test(new FromCaseInsensitiveEntryAdapter(entry));
}
});
}
@Override
public Stream<Entry<String,T>> stream()
{
return wrapped.stream().map(it -> new Entry<String, T>()
{
@Override
public String getKey()
{
return it.getKey().key;
}
@Override
public T getValue()
{
return it.getValue();
}
@Override
public T setValue(T value)
{
return it.setValue(value);
}
});
}
@Override
public Stream<Map.Entry<String,T>> parallelStream()
{
return StreamSupport.stream(spliterator(), true);
}
@Override
public void forEach(Consumer<? super Entry<String, T>> action)
{
wrapped.forEach(new Consumer<Entry<CaseInsensitiveMapKey, T>>()
{
@Override
public void accept(Entry<CaseInsensitiveMapKey, T> entry)
{
action.accept(new FromCaseInsensitiveEntryAdapter(entry));
}
});
}
}
private class EntryAdapter implements Map.Entry<String,T> {
private Entry<String,T> wrapped;
public EntryAdapter(Entry<String, T> wrapped)
{
this.wrapped = wrapped;
}
@Override
public String getKey()
{
return wrapped.getKey();
}
@Override
public T getValue()
{
return wrapped.getValue();
}
@Override
public T setValue(T value)
{
return wrapped.setValue(value);
}
@Override
public boolean equals(Object o)
{
return wrapped.equals(o);
}
@Override
public int hashCode()
{
return wrapped.hashCode();
}
}
private class CaseInsensitiveEntryAdapter implements Map.Entry<CaseInsensitiveMapKey,T> {
private Entry<String,T> wrapped;
public CaseInsensitiveEntryAdapter(Entry<String, T> wrapped)
{
this.wrapped = wrapped;
}
@Override
public CaseInsensitiveMapKey getKey()
{
return lookupKey(wrapped.getKey());
}
@Override
public T getValue()
{
return wrapped.getValue();
}
@Override
public T setValue(T value)
{
return wrapped.setValue(value);
}
}
private class FromCaseInsensitiveEntryAdapter implements Map.Entry<String,T> {
private Entry<CaseInsensitiveMapKey,T> wrapped;
public FromCaseInsensitiveEntryAdapter(Entry<CaseInsensitiveMapKey, T> wrapped)
{
this.wrapped = wrapped;
}
@Override
public String getKey()
{
return wrapped.getKey().key;
}
@Override
public T getValue()
{
return wrapped.getValue();
}
@Override
public T setValue(T value)
{
return wrapped.setValue(value);
}
}
}
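Stripped of the Map plumbing, the core of the adapter above is a key wrapper whose equals() and hashCode() work on a case-folded copy of the string. A minimal sketch of just that part, assuming non-null keys and Locale.ROOT folding (the names CaseInsensitiveKey and CaseInsensitiveKeyDemo are illustrative, not part of the adapter):

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative wrapper: keeps the original spelling but compares a folded copy.
final class CaseInsensitiveKey {
    private final String original;
    private final String folded;

    CaseInsensitiveKey(String original) {
        this.original = original;
        // Locale.ROOT avoids locale-specific surprises such as the Turkish dotless i.
        this.folded = original.toUpperCase(Locale.ROOT);
    }

    String original() {
        return original;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CaseInsensitiveKey
                && folded.equals(((CaseInsensitiveKey) o).folded);
    }

    @Override
    public int hashCode() {
        return folded.hashCode();
    }
}

class CaseInsensitiveKeyDemo {
    public static void main(String[] args) {
        Map<CaseInsensitiveKey, Integer> map = new HashMap<>();
        map.put(new CaseInsensitiveKey("Content-Type"), 1);
        // A differently cased key finds the same entry.
        System.out.println(map.get(new CaseInsensitiveKey("content-type")));
    }
}
```

Everything else in the adapter exists so that callers can keep passing plain Strings while the wrapping happens behind the Map interface.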
Because of this, I'm creating a new object of CaseInsensitiveString for every event. So, it might hit performance.
Creating wrappers or converting the key to lower case before lookup both create new objects. Writing your own java.util.Map implementation is the only way to avoid this. It's not too hard, and IMO is worth it. I found the following hash function to work pretty well, up to a few hundred keys.
static int ciHashCode(String string)
{
// length and the low 5 bits of hashCode() are case insensitive
return (string.hashCode() & 0x1f)*33 + string.length();
}
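A quick way to convince yourself of the comment in that function, at least for ASCII letters: upper and lower case differ by 0x20, and String.hashCode() multiplies each character by an odd power of 31, so a case flip changes the hash by a multiple of 32 and leaves the low 5 bits alone. The sketch below (the class name CiHashDemo is made up) just checks this on one sample key; it does not claim anything about characters whose case mapping changes the length or differs by something other than 0x20.

```java
class CiHashDemo {
    // Same function as above, repeated so the sketch is self-contained.
    static int ciHashCode(String string) {
        // length and the low 5 bits of hashCode() are case insensitive
        return (string.hashCode() & 0x1f) * 33 + string.length();
    }

    public static void main(String[] args) {
        String a = "Content-Type";
        String b = "CONTENT-TYPE";
        // The full hash codes differ, the case-insensitive ones do not.
        System.out.println(a.hashCode() == b.hashCode());
        System.out.println(ciHashCode(a) == ciHashCode(b));
    }
}
```

Because only 5 bits of the hash survive, the number of distinct values per key length is at most 32, which fits the "few hundred keys" limit mentioned above.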
I like using ICU4J's CaseInsensitiveString to wrap the Map key because it takes care of the hashCode/equals issue and it works for Unicode/i18n.
HashMap<CaseInsensitiveString, String> caseInsensitiveMap = new HashMap<>();
caseInsensitiveMap.put(new CaseInsensitiveString("tschüß"), "bye");
caseInsensitiveMap.containsKey(new CaseInsensitiveString("TSCHÜSS")); // true
I find solutions which require you to change the key (e.g., toLowerCase) very unwelcome and solutions which require TreeMap also unwelcome.
Since TreeMap changes the time complexity (compared to HashMap), I think it's more viable to simply go with a utility method that is O(n):
public static <T> T getIgnoreCase(Map<String, T> map, String key) {
for(Entry<String, T> entry : map.entrySet()) {
if(entry.getKey().equalsIgnoreCase(key))
return entry.getValue();
}
return null;
}
Since a sacrifice in performance (time complexity) looks inevitable, at least this method doesn't require you to change the underlying map to suit the lookup.

Keeping mutable objects sorted in TreeSets at all times

It came to my notice that a TreeSet doesn't keep the mutable objects in sorted order if object attribute values are changed later on. For example,
public class Wrap {
static TreeSet<Student> ts = new TreeSet<Student>(new Comparator<Student>(){
@Override
public int compare(Student o1, Student o2) {
return o1.age - o2.age;
}
});
public static void main(String []args){
Student s = new Student(10);
ts.add(s);
ts.add(new Student(50));
ts.add(new Student(30));
ts.add(new Student(15));
System.out.println(ts);
s.age = 24; //Here I change the age of a student in the TreeSet
System.out.println(ts);
}
}
class Student{
int age;
Student(int age){
this.age = age;
}
@Override
public String toString() {
return "Student [age=" + age + "]";
}
}
The output is :
[Student [age=10], Student [age=15], Student [age=30], Student [age=50]]
[Student [age=24], Student [age=15], Student [age=30], Student [age=50]]
After I change the age of a particular student, and then print the TreeSet, the Set seems no longer in sorted order. Why does this happen? and how to keep it sorted always?
Why does this happen?
Because the set cannot monitor all its objects for changes... How would it be able to do that?!
The same problem arises for HashSets: you can't change values that affect an object's hash code while a HashSet holds the object.
and how to keep it sorted always?
You typically remove the element from the set, modify it, and then reinsert it. In other words, change
s.age = 24; //Here I change the age of a student in the TreeSet
to
ts.remove(s);
s.age = 24; //Here I change the age of a student in the TreeSet
ts.add(s);
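If this pattern shows up in several places, it can be wrapped in a small helper so the order of the three steps cannot be gotten wrong. This is only a sketch; the names ReinsertDemo and update are made up, and the Student class is a pared-down copy of the one from the question.

```java
import java.util.Comparator;
import java.util.TreeSet;
import java.util.function.Consumer;

class ReinsertDemo {
    static class Student {
        int age;
        Student(int age) { this.age = age; }
        @Override public String toString() { return "Student [age=" + age + "]"; }
    }

    // Hypothetical helper: remove, mutate, re-add, always in that order.
    static <T> void update(TreeSet<T> set, T element, Consumer<? super T> change) {
        // The removal has to happen while the element still sorts at its old position.
        boolean wasPresent = set.remove(element);
        change.accept(element);
        if (wasPresent) {
            set.add(element);
        }
    }

    public static void main(String[] args) {
        TreeSet<Student> ts = new TreeSet<>(Comparator.comparingInt((Student st) -> st.age));
        Student s = new Student(10);
        ts.add(s);
        ts.add(new Student(50));
        ts.add(new Student(30));
        ts.add(new Student(15));
        update(ts, s, st -> st.age = 24);
        System.out.println(ts); // ages now come out as 15, 24, 30, 50
    }
}
```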
You can also use for example a list, and call Collections.sort on the list each time you've modified an object.
You could make use of the observer pattern. Let your TreeSet implement Observer and let your Student extend Observable. The only change you need to make is to hide the age field by encapsulation so that you have more internal control over the change.
Here's a kickoff example:
public class ObservableTreeSet<O extends Observable> extends TreeSet<O> implements Observer {
public ObservableTreeSet(Comparator<O> comparator) {
super(comparator);
}
@Override
public boolean add(O element) {
element.addObserver(this);
return super.add(element);
}
@Override
@SuppressWarnings("unchecked")
public void update(Observable element, Object arg) {
remove(element);
add((O) element);
}
}
and
public class Student extends Observable {
private int age;
Student(int age) {
this.age = age;
}
public int getAge() {
return age;
}
public void setAge(int age) {
if (this.age != age) {
setChanged();
}
this.age = age;
if (hasChanged()) {
notifyObservers();
}
}
@Override
public String toString() {
return "Student [age=" + age + "]";
}
}
Now create a new ObservableTreeSet instead of a new TreeSet.
static TreeSet<Student> ts = new ObservableTreeSet<Student>(new Comparator<Student>() {
@Override
public int compare(Student o1, Student o2) {
return o1.getAge() - o2.getAge();
}
});
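It's ugly at first sight, but you end up with no changes in the main code. Just call s.setAge(24) and the TreeSet will "reorder" itself. One caveat: update() runs after the age has already changed, so the remove(element) lookup compares with the new age and is not guaranteed to find the stale entry; removing the element before the change is the safer order.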
This is a generic problem with Maps and Sets. The values are inserted using the hashCode/equals/compare at the moment of insertion, and if the values on which these methods are based change, then the structures can screw up.
One way would be to remove the item from the set and re-add it after the value has been changed. Then it would be correct.
Glazed Lists can help: http://www.glazedlists.com/
I use it for its EventList and haven't tried sorting. But on their home page they list the main features:
Live Sorting means your table stays sorted as your data changes.
Generally, it is best to manually keep your sorted Set/Map continuously consistent (see the strategy mentioned by @aioobe).
However, sometimes this is not an option. In these cases we can try this:
if (treeSet.contains(item)) {
treeSet.remove(item);
treeSet.add(item);
}
or with a map:
if (treeMap.containsKey(key)) {
Value value = treeMap.get(key);
treeMap.remove(key);
treeMap.put(key, value);
}
But this will not work correctly, because even containsKey can return an incorrect result.
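To see the lookup fail, here is a small sketch using the data from the question (the class name DirtyTreeSetDemo and the new age 40 are chosen for illustration). After the mutation the search for the element walks toward where an age of 40 would sit, while the element itself is still linked at the position it had with age 10.

```java
import java.util.Comparator;
import java.util.TreeSet;

class DirtyTreeSetDemo {
    static class Student {
        int age;
        Student(int age) { this.age = age; }
    }

    // Builds the set from the question, mutates one element in place,
    // and reports whether the set can still find it.
    static boolean containsAfterMutation() {
        TreeSet<Student> ts = new TreeSet<>(Comparator.comparingInt((Student st) -> st.age));
        Student s = new Student(10);
        ts.add(s);
        ts.add(new Student(50));
        ts.add(new Student(30));
        ts.add(new Student(15));
        s.age = 40; // the set is not told about this change
        return ts.contains(s);
    }

    public static void main(String[] args) {
        // The element is still in the set, yet the lookup misses it.
        System.out.println(containsAfterMutation()); // false
    }
}
```

The same miss would make the guarded remove/add above silently do nothing.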
So what can we do with a dirty map? How can we refresh a single key without having to rebuild the entire map? Here is a utility class to solve this problem (can be easily converted to handle sets):
public class MapUtil {
/**
* Rearranges a mutable key in a (potentially sorted) map
*
* @param map
* @param key
*/
public static <K, V> void refreshItem(Map<K, V> map, K key) {
SearchResult<K, V> result = MapUtil.searchMutableKey(map, key);
if (result.found) {
result.iterator.remove();
map.put(key, result.value);
}
}
/**
* Searches a mutable key in a (potentially sorted) map
*
* Warning: currently this method uses equals() to check equality.
* The returned object contains three fields:
* - `found`: true iff the key was found
* - `value`: the value under the key, or null if `key` was not found
* - `iterator`: an iterator positioned at the key, or null if `key` was not found
*
* @param map
* @param key
* @return
*/
public static <K, V> SearchResult<K, V> searchMutableKey(Map<K, V> map, K key) {
Iterator<Map.Entry<K, V>> entryIterator = map.entrySet().iterator();
while (entryIterator.hasNext()) {
Map.Entry<K, V> entry = entryIterator.next();
if (key.equals(entry.getKey())) {
return new SearchResult<K, V>(true, entry.getValue(), entryIterator);
}
}
return new SearchResult<K, V>(false, null, null);
}
public static class SearchResult<K, V> {
final public boolean found;
final public V value;
final public Iterator<Map.Entry<K, V>> iterator;
public SearchResult(boolean found, V value, Iterator<Map.Entry<K, V>> iterator) {
this.found = found;
this.value = value;
this.iterator = iterator;
}
}
}
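The set version mentioned above might look roughly like this. It is a sketch under the same assumption as MapUtil (equality is checked with equals(), which for the Student class here is plain identity); the names RefreshSetDemo and refreshItem are illustrative. For a TreeSet, the iterator removes the entry it just returned without searching by comparator, which is why this works on a set that is already out of order.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.Set;
import java.util.TreeSet;

class RefreshSetDemo {
    static class Student {
        int age;
        Student(int age) { this.age = age; }
        @Override public String toString() { return "Student [age=" + age + "]"; }
    }

    // Finds the item by a linear scan, removes it through the iterator,
    // and re-adds it so it lands at the position matching its current state.
    static <E> boolean refreshItem(Set<E> set, E item) {
        Iterator<E> it = set.iterator();
        while (it.hasNext()) {
            if (item.equals(it.next())) {
                it.remove();
                return set.add(item);
            }
        }
        return false; // item was not in the set
    }

    public static void main(String[] args) {
        TreeSet<Student> ts = new TreeSet<>(Comparator.comparingInt((Student st) -> st.age));
        Student s = new Student(10);
        ts.add(s);
        ts.add(new Student(50));
        ts.add(new Student(30));
        ts.add(new Student(15));
        s.age = 40;          // set is now "dirty"
        refreshItem(ts, s);  // O(n) scan, then a normal O(log n) insert
        System.out.println(ts);
    }
}
```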
If your problem is the iteration order, and you don't need the extra functionality of TreeSet (headSet() etc.), then use a HashSet with a custom iterator. Also, there is a major problem with your example: two students of the same age (which happens often) conflict, because the comparator treats them as equal and the TreeSet keeps only one of them.
A possible solution:
public class Main {
public static void main(final String[] args) {
MagicSet<Student> ts = new MagicSet<Student>(new Comparator<Student>() {
@Override
public int compare(Student student1, Student student2) {
return student1.age - student2.age;
}
});
Student s = new Student(10);
ts.add(s);
ts.add(new Student(50));
ts.add(new Student(30));
ts.add(new Student(15));
System.out.println(ts); // 10, 15, 30, 50
s.age = 24;
System.out.println(ts); // 15, 24, 30, 50
}
public static class Student {
public int age;
public Student(int age) {
this.age = age;
}
@Override
public String toString() {
return "Student [age=" + age + "]";
}
}
public static class MagicSet<T> extends HashSet<T> {
private static final long serialVersionUID = -2736789057225925894L;
private final Comparator<T> comparator;
public MagicSet(Comparator<T> comparator) {
this.comparator = comparator;
}
@Override
public Iterator<T> iterator() {
List<T> sortedList = new ArrayList<T>();
Iterator<T> superIterator = super.iterator();
while (superIterator.hasNext()) {
sortedList.add(superIterator.next());
}
Collections.sort(sortedList, comparator);
return sortedList.iterator();
}
}
}
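As for the same-age conflict mentioned above when a TreeSet is used with an age-only comparator: one common way around it is to add a tie-breaker so that compare() returns 0 only for the same student. The sketch below assumes each student has some unique id field, which is not part of the original example; the names TieBreakDemo, BY_AGE and BY_AGE_THEN_ID are likewise made up.

```java
import java.util.Comparator;
import java.util.TreeSet;

class TieBreakDemo {
    static class Student {
        final int id;  // hypothetical unique identifier, not in the original Student
        final int age;
        Student(int id, int age) {
            this.id = id;
            this.age = age;
        }
    }

    // Age only: two students of the same age compare as equal.
    static final Comparator<Student> BY_AGE = Comparator.comparingInt(s -> s.age);
    // Age first, then id: equal only when it really is the same student.
    static final Comparator<Student> BY_AGE_THEN_ID = BY_AGE.thenComparingInt(s -> s.id);

    static int sizeAfterAddingTwoTwentyYearOlds(Comparator<Student> comparator) {
        TreeSet<Student> set = new TreeSet<>(comparator);
        set.add(new Student(1, 20));
        set.add(new Student(2, 20));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAddingTwoTwentyYearOlds(BY_AGE));         // 1
        System.out.println(sizeAfterAddingTwoTwentyYearOlds(BY_AGE_THEN_ID)); // 2
    }
}
```

Comparator.comparingInt also avoids the overflow that a subtraction-based compare can hit with extreme values.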
