TreeSet ignoring values - java

I am trying to create a class that is given pairs of strings and told that one is "greater" than the other, and that then keeps a running order of all known strings.
To do this I keep a Map<String, Set<String>> that maps a given String to all the values it's greater than, then I created a TreeSet with a comparator that uses that data to compare two strings.
Here's my class:
public class StringSorter {
private Map<String, Set<String>> greaterThan = new HashMap<>();
private SortedSet<String> order;
public StringSorter() {
order = new TreeSet<>((o1, o2) -> {
if (greaterThan.getOrDefault(o1, Collections.emptySet()).contains(o2))
return 1;
else if (greaterThan.getOrDefault(o2, Collections.emptySet()).contains(o1))
return -1;
return 0;
});
}
public void addRule(String bigger, String smaller) {
if (!greaterThan.containsKey(bigger))
greaterThan.put(bigger, new HashSet<>());
greaterThan.get(bigger).add(smaller);
order.add(bigger);
order.add(smaller);
}
public SortedSet<String> getOrder() {
return order;
}
}
However, for some reason, the TreeSet seems to just ignore many of the values added to it.
Example:
StringSorter sorter = new StringSorter();
sorter.addRule("one", "two");
sorter.addRule("two", "three");
sorter.addRule("three", "four");
System.out.println(sorter.getOrder());
Output:
[two, one]
What happened to the strings three & four?

The problem is that set collections keep unique values.
After debugging your comparator you will see that "three" was evaluated as equal to "one", thus forbidding it to be added to the set.
Consider this modification:
public StringSorter() {
order = new TreeSet<>((o1, o2) -> {
if (greaterThan.getOrDefault(o1, Collections.emptySet()).contains(o2))
return 1;
else if (greaterThan.getOrDefault(o2, Collections.emptySet()).contains(o1))
return -1;
else if(o1.equals(o2)) return 0;
else return -1; //or 1, or o1.compareTo(o2)
});
}
Instead of just returning 0, we first check whether the objects are equal; if they are not, then the comparison itself is irrelevant and the result may be arbitrary.
Here is the output when using updated comparator:
[four, three, two, one]
[Edit]
I would consider changing internal representation of rules to custom oriented tree data structure, represented by sparse adjacency matrix.
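One way to read this suggestion is to store the rules as a graph and derive the order on demand instead of through a comparator. The sketch below is only an assumption about what that could look like, not the answerer's code: it uses an adjacency map rather than a matrix, a hypothetical class name RuleGraph, and Kahn's topological sort.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical replacement for StringSorter that keeps the rules as a graph.
class RuleGraph {
    // Maps each string to the strings that a rule declared directly greater than it.
    private final Map<String, Set<String>> greaterOf = new LinkedHashMap<>();

    void addRule(String bigger, String smaller) {
        greaterOf.computeIfAbsent(bigger, k -> new LinkedHashSet<>());
        greaterOf.computeIfAbsent(smaller, k -> new LinkedHashSet<>()).add(bigger);
    }

    // Kahn's algorithm: repeatedly emit a string that has no unemitted smaller string.
    List<String> getOrder() {
        // pending.get(s) counts the strings directly smaller than s that are not emitted yet.
        Map<String, Integer> pending = new HashMap<>();
        for (String node : greaterOf.keySet()) {
            pending.put(node, 0);
        }
        for (Set<String> biggers : greaterOf.values()) {
            for (String bigger : biggers) {
                pending.merge(bigger, 1, Integer::sum);
            }
        }

        Deque<String> ready = new ArrayDeque<>();
        for (String node : greaterOf.keySet()) {
            if (pending.get(node) == 0) {
                ready.add(node);
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.remove();
            order.add(node);
            for (String bigger : greaterOf.get(node)) {
                if (pending.merge(bigger, -1, Integer::sum) == 0) {
                    ready.add(bigger);
                }
            }
        }
        if (order.size() != greaterOf.size()) {
            throw new IllegalStateException("rules contain a cycle");
        }
        return order;
    }
}
```

With the three rules from the question, getOrder() returns [four, three, two, one]. Strings that no rule relates to each other still appear in the result, in an order that depends on when they were first added.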

You can answer this yourself by adding this line to your Comparator.compare lambda:
System.out.printf("(%s, %s)%n", o1, o2);
As you’ll see, there is no guarantee that adjacent values are passed to the Comparator. When o1 is "three" and o2 is "one", the comparison falls back to returning zero, which tells the TreeSet that the two values are equal, and obviously it won’t add a value it perceives as equal to a value that’s already in the set.
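The effect can be reproduced in isolation. The following minimal sketch (the class name ZeroMeansDuplicate is made up for illustration) orders strings by length only, so two different strings of the same length compare as 0 and the second one is silently dropped:

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

class ZeroMeansDuplicate {
    // Adds three distinct strings to a TreeSet that compares by length only.
    static Set<String> byLength() {
        Set<String> set = new TreeSet<>(Comparator.comparingInt(String::length));
        set.add("one");
        set.add("two");   // same length as "one": compare returns 0, so it is not added
        set.add("three");
        return set;
    }

    public static void main(String[] args) {
        System.out.println(byLength()); // prints [one, three]
    }
}
```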
You would need to make your traversal of greaterThan transitive. I’m pretty sure it will require recursion:
private boolean isGreater(String o1,
String o2,
Set<String> keysTried) {
Set<String> greaterSet = greaterThan.get(o1);
if (greaterSet == null) {
return false;
}
if (greaterSet.contains(o2)) {
return true;
}
for (String g : greaterSet) {
if (keysTried.add(g) && isGreater(g, o2, keysTried)) {
return true;
}
}
return false;
}
public StringSorter() {
order = new TreeSet<>((o1, o2) -> {
if (isGreater(o1, o2, new HashSet<>())) {
return 1;
} else if (isGreater(o2, o1, new HashSet<>())) {
return -1;
} else {
return 0;
}
});
}
The purpose of keysTried is to prevent infinite recursion. (In theory, that should never happen anyway, if the rules in greaterThan form a directed acyclic graph.)

Related

Java TreeMap custom comparator weird behaviour

I am trying to create a Map with sorted keys, sorted according to alphabetically first, and numerical last. For this I am using a TreeMap with a custom Comparator:
public static Comparator<String> ALPHA_THEN_NUMERIC_COMPARATOR =
new Comparator<String> () {
@Override
public int compare(String first, String second) {
if (firstLetterIsDigit(first)) {
return 1;
} else if (firstLetterIsDigit(second)) {
return -1;
}
return first.compareTo(second);
}
};
private static boolean firstLetterIsDigit(String string) {
return (string == null) ? false : Character.isDigit(string.charAt(0));
}
I've written the following unit test to illustrate what goes wrong:
@Test
public void testNumbericallyKeyedEntriesCanBeStored() {
Map<String, String> map = new HashMap<>();
map.put("a", "some");
map.put("0", "thing");
TreeMap<String, String> treeMap = new TreeMap<>(ALPHA_THEN_NUMERIC_COMPARATOR);
treeMap.putAll(map);
assertEquals("some", treeMap.get("a"));
assertEquals("thing", treeMap.get("0"));
}
With result:
java.lang.AssertionError:
Expected :thing
Actual :null
Check your comparator code. Does comparing "0" and "0" return 0, as it should? No it doesn't, since you don't check for equality if your string starts with a digit. You also don't return proper ordering if two strings both start with digits.
There are some requirements for a valid implementation of a Comparator. Quoting from the documentation:
The ordering imposed by a comparator c on a set of elements S is said to be consistent with equals if and only if c.compare(e1, e2)==0 has the same boolean value as e1.equals(e2) for every e1 and e2 in S.
This is not the case for your comparator: comparator.compare("0","0") will return 1 in your case.
And further:
Caution should be exercised when using a comparator capable of imposing an ordering inconsistent with equals to order a sorted set (or sorted map). Suppose a sorted set (or sorted map) with an explicit comparator c is used with elements (or keys) drawn from a set S. If the ordering imposed by c on S is inconsistent with equals, the sorted set (or sorted map) will behave "strangely." In particular the sorted set (or sorted map) will violate the general contract for set (or map), which is defined in terms of equals.
(emphasis by me - you may replace "strangely" with "weird", for your case ;-))
There are some degrees of freedom regarding the details of how such a comparator could be implemented. E.g. what should happen for keys like "123isNotNumeric"? Should the "numbers" always be single digits? Should they always be integers?
However, one possible implementation may look like this:
public class SpacialTreeSetComparator
{
public static void main(String[] args)
{
TreeMap<String, String> map = new TreeMap<String, String>(
ALPHA_THEN_NUMERIC_COMPARATOR);
map.put("b", "x");
map.put("a", "x");
map.put("1", "x");
map.put("0", "x");
System.out.println(map.keySet());
}
public static Comparator<String> ALPHA_THEN_NUMERIC_COMPARATOR =
new Comparator<String> () {
@Override
public int compare(String first, String second) {
Double firstNumber = asNumber(first);
Double secondNumber = asNumber(second);
if (firstNumber != null && secondNumber != null)
{
return firstNumber.compareTo(secondNumber);
}
if (firstNumber != null)
{
return 1;
}
if (secondNumber != null)
{
return -1;
}
return first.compareTo(second);
}
private Double asNumber(String string)
{
try
{
return Double.parseDouble(string);
}
catch (NumberFormatException e)
{
return null;
}
}
};
}
Printing the keySet() of the map prints the keys in the desired order:
[a, b, 0, 1]
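If keys only need to be grouped by whether they start with a digit, a shorter comparator can be built from the Comparator factory methods. This is a sketch under that assumption, with a hypothetical helper startsWithDigit; unlike the version above, it compares digit-leading keys as plain strings, so "10" sorts before "9".

```java
import java.util.Comparator;
import java.util.TreeMap;

class AlphaThenNumeric {
    // Hypothetical helper: true when the string starts with a digit.
    static boolean startsWithDigit(String s) {
        return !s.isEmpty() && Character.isDigit(s.charAt(0));
    }

    // Boolean false sorts before true, so non-digit keys come first.
    // Ties fall back to natural String order, so compare(x, x) is always 0.
    static final Comparator<String> ALPHA_THEN_NUMERIC =
            Comparator.comparing((String s) -> startsWithDigit(s))
                      .thenComparing(Comparator.<String>naturalOrder());

    static TreeMap<String, String> sample() {
        TreeMap<String, String> map = new TreeMap<>(ALPHA_THEN_NUMERIC);
        map.put("b", "x");
        map.put("a", "some");
        map.put("1", "x");
        map.put("0", "thing");
        return map;
    }

    public static void main(String[] args) {
        System.out.println(sample().keySet()); // prints [a, b, 0, 1]
        System.out.println(sample().get("0")); // prints thing
    }
}
```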
The comparator code is not correct. In the case of treeMap.get("0"), equality is never satisfied.
The following code in the comparator is causing the issue. The comparator is also called when you fetch an element from the map (to find the matching key). For "0", firstLetterIsDigit(first) returns true, so the first if branch returns 1. The comparator therefore never reports "0" as equal to "0", which is why get returns null.
if (firstLetterIsDigit(first)) {
return 1;
} else if (firstLetterIsDigit(second)) {
return -1;
}

Best way to prioritize certain Strings while sorting a Map

I'd like to sort a map with strings, so that certain strings are prioritized and the rest is ordered as usual.
Like this:
"Dan", "value" //priority 1
"Eric", "value" //priority 2
"Ann", "value" //priority 3
"Bella", "value" //no priority
"Chris", "value" //no priority
Just like in this question.
I'm using a TreeMap and my current compare method looks like this:
public int compare(String o1, String o2) {
if (o1.equals(o2)) return 0;
if (o1.equals("Dan")) return -1;
if (o2.equals("Dan")) return 1;
if (o1.equals("Eric")) return -1;
if (o2.equals("Eric")) return 1;
if (o1.equals("Ann")) return -1;
if (o2.equals("Ann")) return 1;
else return o1.compareTo(o2);
}
As you can see, this gets quite cumbersome with more prioritized Strings.
Is there a better way to do this?
Solution (thanks to amit for the idea):
Use a 2nd map to store the priorities:
TreeMap<String, Integer> prio = new TreeMap<>();
prio.put("Dan", 1);
prio.put("Eric", 2);
prio.put("Ann", 3);
comparator = new Comparator<String>() {
@Override
public int compare(String o1, String o2) {
if (prio.containsKey(o1)) {
if (prio.containsKey(o2)) {
return prio.get(o1).compareTo(prio.get(o2));
} else return -1;
} else if (prio.containsKey(o2)) {
return 1;
} else return o1.compareTo(o2);
}
};
Use a 2nd map:
Map<String,Integer> prio where the value is the priority of each string.
In your comparator, first compare according to prio.get(o1).compareTo(prio.get(o2)) (1), and only if the result is 0, fall back to regular String's compareTo().
It is important that prio does not change after your map is created, otherwise your map will be a complete chaos without an ability to find and insert elements properly.
(1) Make sure both elements exist first in prio, and if one does not - resolve.
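The same idea can be written compactly with the Comparator factory methods. The sketch below is one possible way to resolve the missing-priority case, assuming that strings without a priority should sort after all prioritized ones; the class name PriorityOrder and the method byPriority are made up for illustration.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class PriorityOrder {
    // Builds a comparator that orders by priority first, then alphabetically.
    static Comparator<String> byPriority(Map<String, Integer> prio) {
        // Defensive copy: the priorities must not change once a TreeMap uses the comparator.
        Map<String, Integer> fixed = new HashMap<>(prio);
        return Comparator
                // Strings without a priority get Integer.MAX_VALUE, so they sort last.
                .comparingInt((String s) -> fixed.getOrDefault(s, Integer.MAX_VALUE))
                .thenComparing(Comparator.<String>naturalOrder());
    }

    static TreeMap<String, String> sample() {
        Map<String, Integer> prio = new HashMap<>();
        prio.put("Dan", 1);
        prio.put("Eric", 2);
        prio.put("Ann", 3);
        TreeMap<String, String> map = new TreeMap<>(byPriority(prio));
        for (String name : new String[] {"Chris", "Ann", "Bella", "Dan", "Eric"}) {
            map.put(name, "value");
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(sample().keySet()); // prints [Dan, Eric, Ann, Bella, Chris]
    }
}
```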

While sorting the map based on value, some values are missing. What causes this weird behaviour?

I am trying to sort a map based on word frequency (i.e., based on value). For that I have overridden comparator and passed to TreeMap, but I am getting this weird output.
public class WordFrequency {
public static String sentence = "one three two two three three four four four";
public static Map<String, Integer> map;
public static void main(String[] args) {
map = new HashMap<>();
String[] words = sentence.split("\\s");
for (String word : words) {
Integer count = map.get(word);
if (count == null) {
count = 1;
} else {
++count;
}
map.put(word, count);
}
Comparator<String> myComparator = new Comparator<String>() {
@Override
public int compare(String s1, String s2) {
if (map.get(s1) < map.get(s2)) {
return -1;
} else if (map.get(s1) > map.get(s2)) {
return 1;
} else {
return 0;
}
}
};
SortedMap<String, Integer> sortedMap = new TreeMap<String, Integer>(myComparator);
System.out.println("Before sorting: " + map);
sortedMap.putAll(map);
System.out.println("After Sorting based on value:" + sortedMap);
}
}
Output:
Before sorting: {two=2, one=1, three=3, four=3}
After sorting based on value:{one=1, two=2, three=3}
Expected Output:
{one=1, two=2, four=3,three=3}
Your compare method fails to obey the contract of the Map interface, since it compares values instead of keys. Your implementation causes two keys with the same value to be considered the same key. Therefore your sortedMap doesn't contain the "four" key, which has the same value as the "three" key.
Note that the ordering maintained by a tree map, like any sorted map, and whether or not an explicit comparator is provided, must be consistent with equals if this sorted map is to correctly implement the Map interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Map interface is defined in terms of the equals operation, but a sorted map performs all key comparisons using its compareTo (or compare) method, so two keys that are deemed equal by this method are, from the standpoint of the sorted map, equal. The behavior of a sorted map is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Map interface.
TreeMap reference
You can fix this problem by comparing the keys when the values are equal :
Comparator<String> myComparator = new Comparator<String>() {
@Override
public int compare(String s1, String s2) {
if (map.get(s1) < map.get(s2)) {
return -1;
} else if (map.get(s1) > map.get(s2)) {
return 1;
} else {
return s1.compareTo(s2);
}
}
};
This should give you an output of :
After sorting based on value:{one=1, two=2, four=3, three=3}
Since four<three based on the natural ordering of Strings.
This happens because your compare() considers only the values in the map. three=3 and four=3 have the same value, 3, so they are treated as duplicates when they are added to the TreeMap.
That's because your implementation is telling TreeMap that map[three] and map[four] are essentially the same element, because they are "equal" to each other according to your comparator.
Change "return 0" in Comparator to "return s1.compareTo(s2)", and you'll have
Before sorting: {two=2, one=1, three=3, four=3}
After Sorting based on value:{one=1, two=2, four=3, three=3}
(I believe you can figure out why "four" comes before "three" in this case)
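A different approach, not taken from the answers above, avoids a comparator that looks values up in another map: sort the entries themselves and copy them into a LinkedHashMap, which preserves insertion order. This is a minimal sketch; the class name SortByValue and its method names are assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SortByValue {
    // Counts how often each whitespace-separated word occurs.
    static Map<String, Integer> count(String sentence) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : sentence.split("\\s+")) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Returns a copy whose iteration order is by value, then by key for equal values.
    static Map<String, Integer> sortedByValue(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        Comparator<Map.Entry<String, Integer>> byValueThenKey =
                Map.Entry.<String, Integer>comparingByValue()
                         .thenComparing(Map.Entry.<String, Integer>comparingByKey());
        entries.sort(byValueThenKey);

        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : entries) {
            result.put(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count("one three two two three three four four four");
        System.out.println(sortedByValue(counts)); // prints {one=1, two=2, four=3, three=3}
    }
}
```

Because the result is a LinkedHashMap, lookups still use equals and hashCode on the keys, so no key is lost when two words share a count.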

Finding if Multiple Keys Map to the Same Value

In this problem, I have a map with string keys and string values, and I need to check whether multiple keys map to the same value. In other words, my method should return true if no two keys map to the same value and false otherwise. My attempt was to put all the values into a collection and examine each element to see if there are two copies of the same value; that doesn't seem to be working for me, however. Any suggestions will be appreciated, thanks.
The prompt:
Write a method isUnique that accepts a Map from strings to strings as a parameter and returns true if no two keys map to the same value (and false if any two or more keys do map to the same value). For example, calling your method on the following map would return true:
{Marty=Stepp, Stuart=Reges, Jessica=Miller, Amanda=Camp, Hal=Perkins}
Calling it on the following map would return false, because of two mappings for Perkins and Reges:
{Kendrick=Perkins, Stuart=Reges, Jessica=Miller, Bruce=Reges, Hal=Perkins}
The empty map is considered to be unique, so your method should return true if passed an empty map.
My attempt:
public static boolean isUnique(Map<String, String> input) {
Collection<String> values = input.values(); // stores all the values into a collection
for (String names: values) { // goes through each string to see if any duplicates
Iterator<String> wordList = values.iterator(); // iterates words in values collection
int repeat = 0; // counts number of repeats
// goes through each elem to compare to names
if (wordList.hasNext()) {
if (wordList.next().equals(names)) {
repeat++;
}
}
if (repeat > 1) { // if more than one copy of the value exists = multiple keys to same value
return false; // If multiple copies of same value exists
}
}
return true; // all unique values
}
If I understand your question, then I would implement your method generically like so -
public static <K, V> boolean isUnique(Map<K, V> input) {
if (input == null || input.isEmpty()) {
return true;
}
Set<V> set = new HashSet<V>();
for (V value : input.values()) {
set.add(value);
}
return set.size() == input.size();
}
One solution is to store the values in a Set of Strings while iterating through the Map. If the original Map and the Set end up with the same size, then no value is mapped to by two or more keys of the Map.
As far as implementation goes, it can be done as follows:
public boolean checkMap(Map<String, String> map) {
Set<String> set = new HashSet<String>();
for (Map.Entry<String, String> entry : map.entrySet()) {
set.add(entry.getValue()); // getValue is a method, so it needs parentheses
}
// size() is a method on both Map and Set
return map.size() == set.size();
}
The shortest way that I can think of to do this is
public static <K, V> boolean valuesAreUnique(Map<K, V> input) {
Collection<V> values = input.values();
return (new HashSet<V>(values)).size() == values.size();
}
However, it's not the most performant way of doing this, because as it builds the set, it will keep adding elements even after a duplicate has been found. So it would most likely perform better if you do the following, which takes advantage of the return value from the add method of the Set interface.
public static <K, V> boolean valuesAreUnique(Map<K, V> input) {
Set<V> target = new HashSet<V>();
for (V value: input.values()) {
boolean added = target.add(value);
if (! added) {
return false;
}
}
return true;
}
Shrikant Kakani's and Elliott Frisch's approaches are correct. But we can make it more efficient by stopping the iteration once we have found a duplicate:
public static boolean isUnique(Map<String, String> input) {
Set<String> uniqueValues = new HashSet<String>();
for (String value : input.values()) {
if (uniqueValues.contains(value)) {
return false;
}
uniqueValues.add(value);
}
return true;
}
The exercises from the book are specific to the chapter, and as far as I understand, the solution is expected to use the topic covered. It's understandable that there are multiple and better solutions, which have been submitted above, but the given exercise covers Map, keys, values, and the methods related to them. The method below stops as soon as a value is seen for the second time.
public static boolean isUnique(Map<String, String> map){
Map<String, Integer> check = new HashMap<String, Integer>();
for (String v : map.values()){
if (check.containsKey(v)){
return false;
} else {
check.put(v, 1);
}
}
return true;
}

Sorting of Map based on keys

This is not simply a question of how to sort a HashMap by its keys. For that I could use a TreeMap directly :)
What I have at the moment is
Map<String, Object> favoritesMap = new HashMap<String, Object>();
and its contents can be
["Wednesdays" : "abcd"]
["Mondays" : "1234"]
["Not Categorized" : "pqrs"]
["Tuesdays" : "5678"]
I want to sort the HashMap based on keys and, in addition, I need "Not Categorized" to be the last key retrieved.
So the expected order while iterating over keySet is
["Mondays", "Tuesdays", "Wednesdays", "Not Categorized"] i.e. sorted on keys and "Not Categorized" is the last one
Thought of going for HashMap while creating and at the end add ["Not Categorized" : "pqrs"] but HashMap does not guarantee the order :)
Any other pointers for the solution?
Are you specifically excluding TreeMap for some external reason? If not you could obviously use TreeMap with a specially made Comparator.
Have you considered any of the other SortedMaps?
If TreeMap is definitely out I would extend HashMap and make it look like there is always one more entry but that is certainly not a trivial piece of work. You should have a very good reason not to use a SortedMap before going down this road.
Added
Here is an example of how you can make a particular entry always sort to the end using a TreeMap:
// This key should always appear at the end of the list.
public static final String AtEnd = "Always at the end";
// A sample map.
SortedMap<String, String> myMap =
new TreeMap<>(
new Comparator<String>() {
@Override
public int compare(String o1, String o2) {
return o1.equals(AtEnd) ? 1 : o2.equals(AtEnd) ? -1 : o1.compareTo(o2);
}
});
private void test() {
myMap.put("Monday", "abc");
myMap.put("Tuesday", "def");
myMap.put("Wednesday", "ghi");
myMap.put(AtEnd, "XYZ");
System.out.println("myMap: "+myMap);
// {Monday=abc, Tuesday=def, Wednesday=ghi, Always at the end=XYZ}
}
I wonder if you are looking for some variant of that?
You can achieve this by using LinkedHashMap as it guarantees to return results in the order of insertion.
Also check the following post to understand difference between map types.
Difference between HashMap, LinkedHashMap and TreeMap
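Since a LinkedHashMap only remembers insertion order, the entries have to be inserted in the desired order. A minimal sketch of that, assuming the data starts in an unordered map (the class name SpecialKeyLast and the method withKeyLast are made up for illustration):

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class SpecialKeyLast {
    // Copies source into a map that iterates in sorted key order, with lastKey at the end.
    static <V> Map<String, V> withKeyLast(Map<String, V> source, String lastKey) {
        Map<String, V> sorted = new TreeMap<>(source);
        Map<String, V> result = new LinkedHashMap<>();
        for (Map.Entry<String, V> entry : sorted.entrySet()) {
            if (!entry.getKey().equals(lastKey)) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        // Insert the special key last so that it is also iterated last.
        if (source.containsKey(lastKey)) {
            result.put(lastKey, source.get(lastKey));
        }
        return result;
    }

    static Map<String, Object> sample() {
        Map<String, Object> favoritesMap = new HashMap<>();
        favoritesMap.put("Wednesdays", "abcd");
        favoritesMap.put("Mondays", "1234");
        favoritesMap.put("Not Categorized", "pqrs");
        favoritesMap.put("Tuesdays", "5678");
        return withKeyLast(favoritesMap, "Not Categorized");
    }

    public static void main(String[] args) {
        System.out.println(sample().keySet());
        // prints [Mondays, Tuesdays, Wednesdays, Not Categorized]
    }
}
```

The order is fixed at copy time: entries added to the result afterwards are appended after "Not Categorized".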
Or just create a custom class which holds a key that differs from the value, and sort according to the key of that class. For your case, make the key the same as the day, and for the "Not Categorized" case ensure that its key sorts later than any of the other keys, for example by making it "Z_Not Categorized".
public class ComplexKey
{
String key;
String value;
// Constructor used by the examples below.
ComplexKey(String key, String value) { this.key = key; this.value = value; }
}
ComplexKey monday = new ComplexKey("monday", "monday");
ComplexKey notCategorized = new ComplexKey("Z_Not Categorized", "Not Categorized");
Then you can write a custom comparator which sorts the values according to the key of the ComplexKey class.
In your case I would use a TreeMap:
Map<DayOfWeek, Object> favoritesMap = new TreeMap<>();
where DayOfWeek is a class you declare like:
class DayOfWeek implements Comparable<DayOfWeek> {
as it's not convenient to sort days of the week as strings.
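The class declaration above is left incomplete in the answer. One way to get the same effect, swapped in here as an assumption and not the answerer's code, is an enum: enum constants are Comparable by declaration order, so a TreeMap keyed by the hypothetical Category enum below keeps NOT_CATEGORIZED last without a custom comparator.

```java
import java.util.Map;
import java.util.TreeMap;

class EnumKeyedFavorites {
    // Hypothetical enum: constants compare in declaration order.
    enum Category { MONDAYS, TUESDAYS, WEDNESDAYS, NOT_CATEGORIZED }

    static Map<Category, Object> sample() {
        Map<Category, Object> favoritesMap = new TreeMap<>();
        favoritesMap.put(Category.WEDNESDAYS, "abcd");
        favoritesMap.put(Category.MONDAYS, "1234");
        favoritesMap.put(Category.NOT_CATEGORIZED, "pqrs");
        favoritesMap.put(Category.TUESDAYS, "5678");
        return favoritesMap;
    }

    public static void main(String[] args) {
        System.out.println(sample().keySet());
        // prints [MONDAYS, TUESDAYS, WEDNESDAYS, NOT_CATEGORIZED]
    }
}
```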
In fact, the keys are always sorted. If you output the map a couple of times, you will find that the result remains the same.
First I'll gossip again on hashing:
The reason is hashing. Each object has hashCode() method. The hash space is like a large array which contains all the possible hash values as indices. When a new element is inserted into a HashSet or a new pair is put into a HashMap, it is placed in the hash space according to its hash code. If two elements have the same hash code, they will be compared with equals() method, if unequal, then the new element will be placed next to it.
Then if you know what happens there, you can implement some code like below:
import java.util.*;
class MyString {
private String str;
public MyString (String str) {
this.str = str;
}
public String toString () {
return str;
}
public boolean equals (Object obj) {
if (obj.getClass().equals(MyString.class)) {
return obj.toString().equals(str);
}
return false;
}
public int hashCode () {
if (str.equalsIgnoreCase("Not Categorized")) {
return Integer.MAX_VALUE;
} else if (str.hashCode() == Integer.MAX_VALUE) {
return 0;
}
return str.hashCode();
}
}
public class Test {
public static void main (String args[]) {
Map<MyString, String> m = new HashMap<MyString, String>();
m.put(new MyString("a"), "a");
m.put(new MyString("c"), "c");
m.put(new MyString("Not Categorized"), "NC");
m.put(new MyString("b"), "b");
Set<MyString> keys = m.keySet();
for (MyString k : keys) {
System.out.println(m.get(k));
}
}
}
The result is that "Not Categorized" always comes last. The reason is simple: its hash code is always the maximum integer value. Note that HashMap makes no guarantee about iteration order, so this relies on implementation details.
The reason I created a String wrapper class is that the String class is final and can't be extended. This way your class structure changes a little, but not much.
It is possible to use TreeMap, though it would be less efficient:
public static void main (String args[]) {
Map<String, String> m = new TreeMap<String, String>(new Comparator<String>() {
public int compare (String s1, String s2) {
if (s1.equals(s2)) {
return 0;
}
if (s1.equalsIgnoreCase("Not Categorized")) {
return 1;
}
if (s2.equalsIgnoreCase("Not Categorized")) {
return -1;
}
if (s1.hashCode() > s2.hashCode()) {
return 1;
} else if (s1.hashCode() < s2.hashCode()) {
return -1;
} else {
return 0;
}
}
public boolean equals (Object obj) {
return false;
}
});
m.put("a", "a");
m.put("c", "c");
m.put("Not Categorized", "NC");
m.put("b", "b");
Set<String> keys = m.keySet();
for (String k : keys) {
System.out.println(m.get(k));
}
}
The result is the same. It will sort all the elements, but it won't change the hashing order of other strings, it only ensures "Not Categorized" always comes to be the largest one.
