Get value from a HashMap using a byte[] key [duplicate] - java

This question already has answers here:
Using a byte array as Map key
(13 answers)
Closed 4 years ago.
I'm trying to add some values to a map as
Map<byte[], byte[]> samplerMap = new HashMap<>();
samplerMap.put(Bytes.toBytes("key"), Bytes.toBytes("value"));
To fetch the value from map,
samplerMap.get(Bytes.toBytes("key"))
When I debug this, I get a null value. Is there a special case when using a byte[] as the key of a map? How can I fix this?

The problem is that byte[] inherits Object.hashCode and Object.equals, which are based on object identity rather than on the array's contents. Two array instances created separately (new byte[...]) will almost always have different hash codes, so the lookup misses and null is returned.
Even if the hash codes happened to collide, equals would still compare identities and fail, so a byte[] key can only be found again with the very same array instance.
You could use the string itself as the key, since you are effectively calling "key".getBytes(StandardCharsets.UTF_8) anyway.
Or create a wrapper class:
import java.util.Arrays;
public class ByteArray {
    public final byte[] bytes;
    public ByteArray(byte[] bytes) {
        this.bytes = bytes;
    }
    // Two wrappers are equal when their arrays have the same contents.
    @Override
    public boolean equals(Object rhs) {
        return rhs instanceof ByteArray
                && Arrays.equals(bytes, ((ByteArray) rhs).bytes);
    }
    // Content-based hash, consistent with equals.
    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }
}

You can't use an array as a key of a HashMap since arrays don't override the default implementation of equals and hashCode. Therefore two different array instances that contain the exact same elements will be considered as different keys.
You can use a List<Byte> as key instead.
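The difference between the two key types can be seen in a minimal sketch. The class name ByteListKeyDemo and the helper toList are hypothetical names introduced here for illustration, and "key".getBytes(StandardCharsets.UTF_8) stands in for the question's Bytes.toBytes("key").

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ByteListKeyDemo {
    // Hypothetical helper: boxes each byte so the key compares by content.
    static List<Byte> toList(byte[] bytes) {
        List<Byte> list = new ArrayList<>(bytes.length);
        for (byte b : bytes) {
            list.add(b);
        }
        return list;
    }

    // Looks up a freshly created array in a map keyed by byte[].
    static String lookupWithArrayKey() {
        Map<byte[], String> map = new HashMap<>();
        map.put("key".getBytes(StandardCharsets.UTF_8), "value");
        return map.get("key".getBytes(StandardCharsets.UTF_8));
    }

    // Same lookup, but keyed by List<Byte>, which compares element by element.
    static String lookupWithListKey() {
        Map<List<Byte>, String> map = new HashMap<>();
        map.put(toList("key".getBytes(StandardCharsets.UTF_8)), "value");
        return map.get(toList("key".getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        System.out.println(lookupWithArrayKey()); // null
        System.out.println(lookupWithListKey());  // value
    }
}
```

A List<Byte> boxes every element, so for large keys a thin wrapper around the array is probably the lighter choice.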

Related

Custom key generation and collision in a hashMap

I have a method that is expected to save an object in a HashMap (used as a cache) whose key is a String.
The objects passed to the method have fields that are either “volatile” (they can change on the next refresh from the data store) or identical across all objects, except for 3 fields.
Of those 3 fields, 2 are of type double and 1 is of type String.
What I am doing now is I use:
Objects.hash(doubleField1, doubleField2, theString)
and convert the result to String.
Basically I am generating a key for that object based on the immutable state.
This seems to work but I have the following question:
If we exclude the case of 2 objects with exactly the same fields (which is impossible), how likely is it that I end up with a collision that cannot be detected properly?
I mean that in a HashMap with ordinary keys, e.g. strings, a collision on the hashCode is resolved by comparing the actual key values to verify whether it is the same object or just a collision.
Would using keys the way I have described create problems in such verification?
Update:
If I have e.g. a hashmap with key a Person and the hashCode is generated using fullName and ssn or dateOfBirth if there is a collision then the hashmap implementation uses equals to verify if it is the actual object being searched for.
I was wondering if the approach I describe could have some issue in that part because I generate the actual key directly
Here is a simple demo for a hashMap key implementation. When retrieving the object I construct the fields piecemeal to avoid any possibility of using cached Strings or Integers. It makes a more convincing demo.
Map<MyKey, Long> map = new HashMap<>();
map.put(new MyKey(10,"abc"), 1234556L);
map.put(new MyKey(400,"aefbc"), 548282L);
int n = 380;
long v = map.get(new MyKey(n + 20, "ae" + "fbc")); // Should get 548282
System.out.println(v);
prints
548282
The key class
import java.util.Objects;
class MyKey {
    private final int v;
    private final String s;
    private final int hashcode;
    public MyKey(int v, String s) {
        Objects.requireNonNull(s, "String must be provided");
        this.v = v;
        this.s = s;
        // this class is immutable so no need to keep
        // computing hashCode
        hashcode = Objects.hash(s, v);
    }
    @Override
    public int hashCode() {
        return hashcode;
    }
    @Override
    public boolean equals(Object o) {
        if (o == this) {
            return true;
        }
        if (o == null) {
            return false;
        }
        if (o instanceof MyKey) {
            MyKey mk = (MyKey) o;
            return v == mk.v && s.equals(mk.s);
        }
        return false;
    }
}
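For contrast, here is a small sketch of what can go wrong when the hash itself, converted to a String, is used as the map key, as the question describes. The class name HashAsKeyDemo and the method names are hypothetical. It relies on the fact that "Aa" and "BB" are different strings with the same String.hashCode(), so Objects.hash produces the same number for both objects and the map has no way to tell them apart.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class HashAsKeyDemo {
    // Builds the cache key the way the question describes: the hash, as a String.
    static String hashKey(double d1, double d2, String s) {
        return String.valueOf(Objects.hash(d1, d2, s));
    }

    // Caches two different objects and reports how many entries survive.
    static int cacheSizeAfterTwoPuts() {
        Map<String, String> cache = new HashMap<>();
        // "Aa" and "BB" are distinct strings with the same String.hashCode().
        cache.put(hashKey(1.0, 2.0, "Aa"), "first object");
        cache.put(hashKey(1.0, 2.0, "BB"), "second object"); // overwrites the first entry
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(hashKey(1.0, 2.0, "Aa").equals(hashKey(1.0, 2.0, "BB"))); // true
        System.out.println(cacheSizeAfterTwoPuts()); // 1
    }
}
```

With a key class like MyKey above, the same two objects would share a hash code but still be distinguished by equals, so both entries would be kept.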

Restricting the Object Type for the get method in a Java HashMap [duplicate]

This question already has answers here:
What are the reasons why Map.get(Object key) is not (fully) generic
(11 answers)
Closed 3 years ago.
I have instantiated my HashMap like this:
Map<String, Integer> myHashMap = new HashMap<String, Integer>();
The key type is String, so when I try to insert a key-value pair with a key of a different type, I get a compilation error.
myHashMap.put(1L, "value");
That means in the put method they have restricted the datatype of the Key. But while fetching the value from the map using the get method it is not checking for the datatype of the Key. So if I write something like this, it doesn't give a compilation error.
myHashMap.get(1L);
I checked the get method in the Java Map interface and its parameter type is Object, which is why it allows any Object as the get method argument.
V get(Object key)
Is there a way I can restrict the datatype which I pass as an argument in the get method?
The argument that I pass should have the same datatype as the datatype of the Key which I use while instantiating my hashmap.
It is designed that way: during the get operation only equals and hashCode are used to find the entry to return. The get implementation does not check the type of the object passed as the key.
In your example you call myHashMap.get(1L). First, the hash code of the Long object with value 1L is used to pick the bucket to search. Next, the equals method of the key is used to find the exact map entry whose value should be returned. A well-defined equals method always checks the type:
public boolean equals(Object obj) {
if (obj instanceof Long) { //here type is checked
return value == ((Long)obj).longValue();
}
return false;
}
So if the types are not equal, the equals method returns false and hence get also will return null.
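A minimal sketch of that behaviour, using a hypothetical class GetTypeDemo: the map is keyed by Long, and a lookup with an Integer of the same numeric value compiles but finds nothing, because equals rejects the other type.

```java
import java.util.HashMap;
import java.util.Map;

class GetTypeDemo {
    // Builds a small map keyed by Long.
    static Map<Long, String> sampleMap() {
        Map<Long, String> map = new HashMap<>();
        map.put(1L, "one");
        return map;
    }

    public static void main(String[] args) {
        Map<Long, String> map = sampleMap();
        System.out.println(map.get(1L)); // one
        // 1 is boxed to an Integer; equals between Integer and Long is false.
        System.out.println(map.get(1));  // null
    }
}
```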
In some cases, such as when using a List as a key, you may put an item in the map using an instance of, say, ArrayList and still successfully retrieve the same value with an instance of LinkedList, because both implement the List interface.
Map<List<String>, String> myHashMap = new HashMap<>();
List<String> arrayList = new ArrayList<>();
List<String> linkedList = new LinkedList<>();
myHashMap.put(arrayList, "foo");
System.out.println(myHashMap.get(linkedList));
The above code prints foo to the console.
Although the implementations differ, the equals method of ArrayList only checks whether the argument is a List:
public boolean equals(Object o) {
if (o == this) {
return true;
}
if (!(o instanceof List)) { //checking type of super interface
return false;
}
...
}
The same is true for LinkedList.
If controlling the key type is really important in a project, we could extend HashMap and require callers to use this subclass instead of HashMap, as in the code below.
It keeps all HashMap capabilities; callers just use the getValue method instead of the get method.
import java.util.HashMap;
public class MyHashMap<K,V> extends HashMap<K,V> {
public V getValue(K key) {
return super.get(key);
}
}
Test class:
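public class Test {
    public static void main(String[] args) {
        MyHashMap<String, Integer> map = new MyHashMap<>();
        map.put("one", 1);
        // getValue only accepts the key type, so getValue(1L) would not compile
        Integer value = map.getValue("one");
        System.out.println(value);
    }
}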

Why doesn't HashSet maintain uniqueness? [duplicate]

This question already has answers here:
Java HashSet contains duplicates if contained element is modified
(7 answers)
Mutable objects and hashCode
(6 answers)
What happens to the lookup in a Hashmap or Hashset when the objects Hashcode changes
(4 answers)
Java: Modify id that changes hashcode
(1 answer)
mutable fields for objects in a Java Set
(4 answers)
Closed 5 years ago.
Consider the Employer class:
import java.io.Serializable;
public class Employer implements Serializable {
    private Long id;
    private String name;
    // Setter used by the snippets that follow.
    public void setId(Long id) {
        this.id = id;
    }
    @Override
    public boolean equals(Object obj) {
        if (obj == null)
            return false;
        if (obj instanceof Employer) {
            Employer employer = (Employer) obj;
            // Long is a reference type: compare with equals, not ==.
            if (this.id.equals(employer.id)) {
                return true;
            }
        }
        return false;
    }
    // Idea from Effective Java: Item 9
    @Override
    public int hashCode() {
        int result = 17;
        result = 31 * result + id.hashCode();
        //result = 31 * result + name.hashCode();
        return result;
    }
}
With 2 Employer objects created:
Employer employer1 = new Employer();
employer1.setId(10L);
Employer employer2 = new Employer();
employer2.setId(11L);
After adding them to the hashset, the size will be 2.
HashSet internally uses a hashmap to maintain the uniqueness-
private transient HashMap<E,Object> map;
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
Now, if I set the id of the second employer to be the same as that of the first, i.e.
employer2.setId(10L);
the size still remains 2.
Why is it not 1? Are the invariants broken?
All hash-based containers, including HashSet<T>, make a very important assumption about hash code of their keys: they assume that hash code never changes while the object is inside the container.
Your code violates this assumption by modifying the instance while it is still in the hash set. There is no practical way for HashSet<T> to react to this change, so you must pick one of two ways to deal with this issue:
1. Never modify keys of hash-based containers. This is by far the most common approach, often achieved by making hash keys immutable.
2. Keep track of modifications and re-hash objects manually. Essentially, your code makes sure that all modifications to hash keys happen while they are outside containers: you remove the object from the container, make the modifications, and then put it back.
The second approach often becomes a source of maintenance headaches. When you need to keep mutable data in a hash-based container, a good approach is to use only final fields in the computation of your hash code and equality checks. In your example this would mean making id field final, and removing setId method from the class.
the size still remains 2. Why is it not 1? Are the invariants broken?
If you modify any of the properties used to compute hashCode and equals for an instance already in the HashSet, the HashSet implementation is not aware of that change.
Therefore it will keep the two instances, even though they are now equal to each other.
You shouldn't make such updates to instances that are members of HashSets (or keys in HashMaps). If you must make such changes, remove the instance from the Set before mutating it and re-add it later.
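Both behaviours can be reproduced with a short sketch. The class MutableKeyDemo and its nested Item class are hypothetical stand-ins for the Employer class in the question, reduced to the id field that drives equals and hashCode.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class MutableKeyDemo {
    // Hypothetical stand-in for Employer: equality and hash depend on a mutable id.
    static final class Item {
        private Long id;
        Item(Long id) { this.id = id; }
        void setId(Long id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof Item && Objects.equals(id, ((Item) o).id);
        }
        @Override public int hashCode() { return Objects.hashCode(id); }
    }

    // Mutates an element while it is still inside the set.
    static int sizeAfterMutatingInPlace() {
        Set<Item> set = new HashSet<>();
        Item first = new Item(10L);
        Item second = new Item(11L);
        set.add(first);
        set.add(second);
        second.setId(10L); // the set is not notified and keeps both entries
        return set.size();
    }

    // Removes the element, mutates it, then adds it back.
    static int sizeAfterRemoveMutateReAdd() {
        Set<Item> set = new HashSet<>();
        Item first = new Item(10L);
        Item second = new Item(11L);
        set.add(first);
        set.add(second);
        set.remove(second);
        second.setId(10L);
        set.add(second); // rejected: now equal to first
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterMutatingInPlace());   // 2
        System.out.println(sizeAfterRemoveMutateReAdd()); // 1
    }
}
```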

Java Map<> with intrinsic types possible?

How is this possible:
HashMap<byte[], byte[]> and what is hash() of byte[]?
Yes, it is possible (with a big caveat, see below), but byte[] is not an "intrinsic type". First, there's no such thing, you probably mean a "primitive type". Second: byte[] is not a primitive type, byte is. An array is always a reference type.
Arrays don't have specific hashCode implementations, so they just use the hashCode of Object. That means the hashCode is the identity hash code, which is independent of the actual content.
In other words: a byte[] is a very bad Map key, because you can only retrieve the value with the exact same instance.
If you need a content-based hashCode() based on an array, you can use Arrays.hashCode(), but that won't help you (directly) with the Map. There's also Arrays.equals() to check for content equality.
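As a quick illustration of the difference between the inherited methods and the Arrays helpers, here is a minimal sketch (the class name ArrayContentDemo and the method contentMatches are hypothetical):

```java
import java.util.Arrays;

class ArrayContentDemo {
    // True when both arrays hold the same bytes and hash to the same content-based value.
    static boolean contentMatches(byte[] a, byte[] b) {
        return Arrays.equals(a, b) && Arrays.hashCode(a) == Arrays.hashCode(b);
    }

    public static void main(String[] args) {
        byte[] a = { 2, 3 };
        byte[] b = { 2, 3 };
        System.out.println(a.equals(b));          // false: identity comparison
        System.out.println(contentMatches(a, b)); // true: content comparison
    }
}
```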
You could wrap your byte[] in a thin wrapper object that implements hashCode() and equals() (using the methods mentioned above):
import java.util.Arrays;
public final class ArrayWrapper {
    private final byte[] data;
    private final int hash;
    public ArrayWrapper(final byte[] data) {
        // strictly speaking we should make a defensive copy here,
        // but I *assume* (and should document) that the argument
        // passed in here should not be changed
        this.data = data;
        this.hash = Arrays.hashCode(data);
    }
    @Override
    public int hashCode() {
        return hash;
    }
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ArrayWrapper)) {
            return false;
        }
        ArrayWrapper other = (ArrayWrapper) o;
        return this.hash == other.hash && Arrays.equals(this.data, other.data);
    }
    // don't add getData to prevent having to do a defensive copy of data
}
Using this class you can then use a Map<ArrayWrapper,byte[]>.
For arrays, hashCode() uses the default implementation from Object, typically some form of internal object address. As a result, a key in this HashMap is considered unique if it is a different array instance, regardless of whether the array contents are equal.
byte[] a = { 2, 3 };
byte[] b = { 2, 3 };
System.out.println(a.equals(b)); // false
Map<byte[], String> map = new HashMap<byte[], String>();
map.put(a, "A");
map.put(b, "B");
System.out.println(map); // {[B@37d2068d=B, [B@7ecec0c5=A}

How to check for key in a Map irrespective of the case? [duplicate]

This question already has answers here:
Is there a good way to have a Map<String, ?> get and put ignoring case? [duplicate]
(8 answers)
Closed 7 years ago.
I want to know whether a particular key is present in a HashMap, so I am using the containsKey(key) method. But it is case-sensitive, i.e. it does not return true if there is a key Name and I search for name. Is there any way to check this without regard to the case of the key?
thanks
Not with conventional maps.
"abc" is a distinct string from "ABC", their hashcodes are different and their equals() methods will return false with respect to each other.
The simplest solution is to simply convert all inputs to uppercase (or lowercase) before inserting/checking. You could even write your own Map wrapper that would do this to ensure consistency.
If you want to maintain the case of the key as provided, but with case-insensitive comparison, you could look into using a TreeMap and supplying your own Comparator that will compare case-insensitively. However, think hard before going down this route as you will end up with some irreconcilable inconsistencies - if someone calls map.put("abc", 1) then map.put("ABC", 2), what case is the key stored in the map? Can you even make this make sense? Are you comfortable with the fact that if someone wraps your map in a standard e.g. HashMap you'll lose functionality? Or that if someone happens to be iterating through your keyset anyway, and does their own quick "contains" check by using equals() you'll get inconsistent results? There will be lots of other cases like this too. Note that you're violating the contract of Map by doing this (as key equality is defined in terms of the equals() method on the keys) so it's really not workable in any sense.
Maintaining a strict uppercase map is much easier to work with and maintain, and has the advantage of actually being a legal Map implementation.
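The questions raised above about the TreeMap route can be checked with a small sketch. It assumes the map is built with String.CASE_INSENSITIVE_ORDER as the comparator; the class name CaseInsensitiveTreeMapDemo is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class CaseInsensitiveTreeMapDemo {
    // Inserts the same key twice with different casing.
    static Map<String, Integer> build() {
        Map<String, Integer> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        map.put("abc", 1);
        map.put("ABC", 2); // same key under the comparator: value replaced, first key kept
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = build();
        System.out.println(map);                    // {abc=2}
        System.out.println(map.containsKey("ABC")); // true
        // Copying into a HashMap falls back to equals/hashCode on the stored key.
        System.out.println(new HashMap<>(map).containsKey("ABC")); // false
    }
}
```

So the stored key keeps the casing of the first insertion, and the case-insensitive behaviour is lost as soon as the entries are copied into a map that relies on equals().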
Use a TreeMap which is constructed with String#CASE_INSENSITIVE_ORDER.
Map<String, String> map = new TreeMap<String, String>(String.CASE_INSENSITIVE_ORDER);
map.put("FOO", "FOO");
System.out.println(map.get("foo")); // FOO
System.out.println(map.get("Foo")); // FOO
System.out.println(map.get("FOO")); // FOO
You can use a TreeMap with a custom, case-insensitive Comparator (that uses String.compareToIgnoreCase())
For example:
Map<String, Something> map =
new TreeMap<String, Something>(CaseInsensitiveComparator.INSTANCE);
class CaseInsensitiveComparator implements Comparator<String> {
public static final CaseInsensitiveComparator INSTANCE =
new CaseInsensitiveComparator();
public int compare(String first, String second) {
// some null checks
return first.compareToIgnoreCase(second);
}
}
Update: it seems that String has already defined this Comparator as a constant.
There's a CaseInsensitiveMap class in Apache commons
http://commons.apache.org/collections/
To preserve the Map invariants, you could just make your own keys. Implement sensible hashCode/equals and you're good to go:
import java.util.Locale;
final class CaseInsensitive {
    private final String s;
    private final Locale lc;
    public CaseInsensitive(String s, Locale lc) {
        if (lc == null) throw new NullPointerException();
        this.s = s;
        this.lc = lc;
    }
    // Upper-cased form used for both hashing and comparison.
    private String s() { return s == null ? null : s.toUpperCase(lc); }
    @Override
    public int hashCode() {
        String u = s();
        return (u == null) ? 0 : u.hashCode();
    }
    @Override
    public boolean equals(Object o) {
        if (!getClass().isInstance(o)) return false;
        String ts = s(), os = ((CaseInsensitive) o).s();
        if (ts == null) return os == null;
        return ts.equals(os);
    }
}
// Usage:
Map<CaseInsensitive, Integer> map = ...;
map.put(new CaseInsensitive("hax", Locale.ROOT), 1337);
assert map.get(new CaseInsensitive("HAX", Locale.ROOT)) == 1337;
Note: Not everyone in the whole world agrees about what is uppercase of what - a famous example is that the upper-case version of "i" in Turkish is "İ", not "I".
Map uses equals and hashCode to test for key equality, and you can't override these for String. What you could do is define your own Key class which contains a string value, but implements equals and hashCode in a case-insensitive way.
The easiest way is to fold the keys yourself when inserting them and looking them up. I.e.
map.put(key.toLowerCase(), value);
and
map.get(key.toLowerCase());
You could subclass e.g. HashMap to get your own class with these, if you want this automatically done.
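One possible shape for such a subclass is sketched below. The class name LowerCaseKeyMap and the helper fold are hypothetical, Locale.ROOT is an assumption made to keep the folding locale-independent, and only put, get and containsKey are covered.

```java
import java.util.HashMap;
import java.util.Locale;

class LowerCaseKeyMap<V> extends HashMap<String, V> {
    // Hypothetical helper: folds String keys to lower case; other objects pass through.
    private static Object fold(Object key) {
        return (key instanceof String) ? ((String) key).toLowerCase(Locale.ROOT) : key;
    }

    @Override
    public V put(String key, V value) {
        // Store every key in its folded form.
        return super.put(key == null ? null : key.toLowerCase(Locale.ROOT), value);
    }

    @Override
    public V get(Object key) {
        return super.get(fold(key));
    }

    @Override
    public boolean containsKey(Object key) {
        return super.containsKey(fold(key));
    }

    public static void main(String[] args) {
        LowerCaseKeyMap<Integer> map = new LowerCaseKeyMap<>();
        map.put("Name", 1);
        System.out.println(map.containsKey("name")); // true
        System.out.println(map.get("NAME"));         // 1
    }
}
```

Other entry points such as putAll, remove or computeIfAbsent are not overridden in this sketch and would bypass the folding, so a real implementation would need to cover them as well.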
Create your own wrapper around String, implement equals and hashCode, and use it as the key in the HashMap:
class MyStringKey
{
    private String string;
    public String getString()
    {
        return string;
    }
    public void setString(String string)
    {
        this.string = string;
    }
    @Override
    public boolean equals(Object o)
    {
        return o instanceof MyStringKey && this.string.equalsIgnoreCase(((MyStringKey)o).getString());
    }
    @Override
    public int hashCode()
    {
        return string.toLowerCase().hashCode(); // "STRING" and "string" would otherwise hash differently
    }
}
In an attempt to present an answer that matches your question's requirement "without bothering the case of the key"...
This answer may be tedious if you add into your map in many, many places. In my example it only happens when a user creates a new character (in my game). Here is how I handled this:
boolean caseInsensitiveMatch = false;
for (Map.Entry<String, Character> entry : MyServer.allCharacterMap.entrySet()) {
if (entry.getKey().toLowerCase().equals(charNameToCreate.toLowerCase())){
caseInsensitiveMatch = true;
break;
}
}
Of course this requires looping through my large ConcurrentHashMap, but works for me.
