java.util.Map values() method performance - java

I have a map like this with several million entries:
private final Map<String, SomeItem> tops = new HashMap<>();
I need to get the list of values, which can be done by calling the java.util.Map values() method.
Is the Collection of values created every time I call values(), or is it pre-computed?
As my Map has several million elements, I do not want to create a new list object every time values() is called.

Below is the copied implementation of Map.values() in java.util.HashMap:
public Collection<V> values() {
Collection<V> vs = values;
if (vs == null) {
vs = new Values();
values = vs;
}
return vs;
}
This shows that the values collection is created only on the first call and then cached in a field, so later calls return the same instance. There should be no additional overhead caused by calls to values().

One important point here may be: It does not matter!
But first, referring to the other answers so far: The collection that is returned there is usually "cached", in that it is lazily created, and afterwards, the same instance will be returned. For example, considering the implementation in the HashMap class:
public Collection<V> values() {
Collection<V> vs;
return (vs = values) == null ? (values = new Values()) : vs;
}
This is even specified (as part of the contract, as an implementation specification) in the documentation of the AbstractMap class (which most Map implementations are based on) :
The collection is created the first time this method is called, and returned in response to all subsequent calls. No synchronization is performed, so there is a slight chance that multiple calls to this method will not all return the same collection.
But now, one could argue that the implementation might change later. The implementation of the HashMap class could change, or one might switch to another Map implementation that does not extend AbstractMap, and which is implemented differently. The fact that it is currently implemented like this is (for itself) no guarantee that it will always be implemented like this.
So the more important point (and the reason why it does not matter) is that the values() method is indeed supposed to return a collection view. As stated in the documentation of the Map interface :
The Map interface provides three collection views, which allow a map's contents to be viewed as a set of keys, collection of values, or set of key-value mappings.
and specifically, the documentation of the Map#values() method :
Returns a Collection view of the values contained in this map. The collection is backed by the map, so changes to the map are reflected in the collection, and vice-versa.
I cannot imagine a reasonable way of implementing such a view that involves processing all values of the Map.
So for example, imagine the implementation in HashMap was like this:
public Collection<V> values() {
return new Values();
}
Then it would return a new collection each time that it was called. But creating this collection does not involve processing the values at all.
Or to put it that way: The cost of calling this method is independent of the size of the map. It basically has the cost of a single object creation, regardless of whether the map contains 10 or 10000 elements.
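To make the "view" point concrete, here is a small sketch (the class and method names are made up for illustration). It obtains values() before the map is filled and shows that the collection still reflects later changes, and that removing through the view writes through to the map:

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

class ValuesViewDemo {
    // Returns the size seen through a values() view obtained BEFORE the map was filled.
    static int sizeSeenThroughEarlyView(int entries) {
        Map<String, Integer> map = new HashMap<>();
        Collection<Integer> view = map.values(); // nothing is copied here
        for (int i = 0; i < entries; i++) {
            map.put("key" + i, i);
        }
        return view.size(); // the view reflects the later puts
    }

    // Removing an element through the view removes the mapping from the backing map.
    static boolean removalWritesThrough() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.values().remove(Integer.valueOf(1));
        return map.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(sizeSeenThroughEarlyView(1000)); // prints 1000
        System.out.println(removalWritesThrough());         // prints true
    }
}
```

If you really need an independent snapshot of the values, that is where the O(n) cost shows up, for example with new ArrayList<>(map.values()).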

As others have mentioned you can see this by looking at the code. You can also code up a quick example to prove it to yourself. The code below will print true 10 times as the object identity will always be the same for values.
public static void main(String[] args) {
Map<String, String> myMap = new HashMap<>();
Collection<String> lastValues = myMap.values();
for (int i=0; i < 10; i++) {
System.out.println(lastValues == myMap.values());
lastValues = myMap.values();
}
}
The following code will print true the first time and then false the next 9 times.
public static void main(String[] args) {
Map<String, String> myMap = new HashMap<>();
Collection<String> lastValues = myMap.values();
for (int i=0; i < 10; i++) {
System.out.println(lastValues == myMap.values());
lastValues = myMap.values();
myMap = new HashMap<>();
}
}

One more suggestion after reading this thread: if the contents of the Map tops never change after it is built, you could use a Google Guava ImmutableMap. For more info, see: UnmodifiableMap (Java Collections) vs ImmutableMap (Google)

Related

What does the HashMap(Map m) constructor do?

I have come across a piece of code where I found
public class MapImpl {
private static MapImpl mpl = new MapImpl();
Map<String,String> hm;
private MapImpl() {
hm = new HashMap<>();
}
public void addContentsToMap(Map<String,String> m){
this.hm=m;
}
public Map<String,String> returnMap(){
// returns a new HashMap holding the same mappings as hm
return new HashMap<>(hm);
}
}
As I understand it, when the default constructor is called the map is initialized to an empty HashMap, and when addContentsToMap is called the field is replaced with a map that has values.
I see that returnMap uses the HashMap(Map m) constructor. I have gone through the source code of HashMap but could not work out what it does.
It takes any implementation of Map interface and constructs a HashMap which also is an implementation of Map interface.
Developers like Hash-Collections (HashSet, HashMap etc.) including HashMap because they provide expected O(1) get and contains time.
This is useful when you have a Map that isn't a HashMap (e.g. Properties), you know that it will be large, and you will read from it many times: copying it into a HashMap lets you switch to a different Map implementation.
Documentation:
public HashMap(Map<? extends K,? extends V> m)
Constructs a new HashMap with the same mappings as the specified Map. The HashMap is created with default load factor (0.75) and an initial capacity sufficient to hold the mappings in the specified Map.
Parameters:
m - the map whose mappings are to be placed in this map
Throws:
NullPointerException - if the specified map is null
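A small sketch of what that means in practice (the class and method names here are made up for illustration): the new HashMap gets its own set of mappings, so adding or removing entries in one map does not affect the other. The keys and values themselves are not cloned, only the mappings are copied.

```java
import java.util.HashMap;
import java.util.Map;

class CopyConstructorDemo {
    // Copies the mappings of source into a fresh HashMap.
    static Map<String, String> copyOf(Map<String, String> source) {
        return new HashMap<>(source);
    }

    public static void main(String[] args) {
        Map<String, String> original = new HashMap<>();
        original.put("k", "v");

        Map<String, String> copy = copyOf(original);
        copy.put("extra", "x");  // only the copy changes
        original.remove("k");    // only the original changes

        System.out.println(original);    // prints {}
        System.out.println(copy.size()); // prints 2
    }
}
```

This is presumably why returnMap() in the question hands out a copy: callers can modify the returned map without touching the internal one.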

Comparison Error when Storing values in a List, Boolean Map

I have a fully working version of MineSweeper implemented in Java. However, I am trying to add an additional feature that updates a Map to store the indexes of the locations of the mines within a 2D array. For example, if location [x][y] holds a mine, I am storing a linked list containing x and y, which maps to a boolean that is true to indicate that the space holds a mine. (This feature is seemingly trivial, but I am just doing this to practice with Collections in Java.)
My relevant private instance variables include:
public class World { ...
private LinkedList<Integer> index;
private Map<LinkedList<Integer>, Boolean> revealed;
"index" is the list to be stored in the map as the key for each boolean.
In my constructor I have:
public World(){ ...
tileArr = new Tile[worldWidth][worldHeight];
revealed = new TreeMap<LinkedList<Integer>, Boolean>();
index = new LinkedList<Integer>();
... }
Now, in the method in which I place the mines, I have the following:
private void placeBomb(){
int x = ran.nextInt(worldWidth); //Random stream
int y = ran.nextInt(worldHeight); //Random stream
if (!tileArr[x][y].isBomb()){
tileArr[x][y].setBomb(true);
index.add(x); //ADDED COMPONENT
index.add(y);
revealed.put(index, true);
index.remove(x);
index.remove(y); //END OF ADDED COMPONENT
} else placeBomb();
}
Without the marked added component my program runs fine, and I have a fully working game. However, this addition gives me the following error.
Exception in thread "main" java.lang.ClassCastException: java.util.LinkedList
cannot be cast to java.lang.Comparable
If anyone could help point out where this error might be, it would be very helpful! This is solely for additional practice with collections and is not required to run the game.
There are actually about 3 issues here. One that you know about, one that you don't and a third which is just that using LinkedList as a key for a map is clunky.
The ClassCastException happens because TreeMap is a sorted map and requires that every key in it implement the Comparable interface, unless you provide a custom Comparator. LinkedList doesn't implement Comparable, so you get an exception. The solution here could be to use a different map, like HashMap, or you could write a custom Comparator.
A custom Comparator could be like this:
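revealed = new TreeMap<List<Integer>, Boolean>(
// sort by x value first (the explicit parameter type is needed
// so that the chained call can infer the element type)
Comparator.comparing( (List<Integer> list) -> list.get(0) )
// then sort by y if both x values are the same
.thenComparing( list -> list.get(1) )
);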
(And I felt compelled to include this, which is a more robust example that isn't dependent on specific elements at specific indexes):
revealed = new TreeMap<>(new Comparator<List<Integer>>() {
@Override
public int compare(List<Integer> lhs, List<Integer> rhs) {
int sizeComp = Integer.compare(lhs.size(), rhs.size());
if (sizeComp != 0) {
return sizeComp;
}
Iterator<Integer> lhsIter = lhs.iterator();
Iterator<Integer> rhsIter = rhs.iterator();
while ( lhsIter.hasNext() && rhsIter.hasNext() ) {
int intComp = lhsIter.next().compareTo( rhsIter.next() );
if (intComp != 0) {
return intComp;
}
}
return 0;
}
});
The issue that you don't know about is that you're only ever adding one LinkedList to the map:
index.add(x);
index.add(y);
// putting index in to the map
// without making a copy
revealed.put(index, true);
// modifying index immediately
// afterwards
index.remove(x);
index.remove(y);
This is unspecified behavior, because you put the key in, then modify it. The documentation for Map says the following about this:
Note: great care must be exercised if mutable objects are used as map keys. The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map.
What will actually happen (for TreeMap) is that you are always erasing the previous mapping. (For example, the first time you call put, let's say x=0 and y=0. Then the next time around, you set the list so that x=1 and y=1. This also modifies the list inside the map, so that when put is called, it finds there was already a key with x=1 and y=1 and replaces the mapping.)
So you could fix this by saying something like either of the following:
// copying the List called index
revealed.put(new LinkedList<>(index), true);
// this makes more sense to me
revealed.put(Arrays.asList(x, y), true);
However, this leads me to the 3rd point.
There are better ways to do this, if you want practice with collections. One way would be to use a Map<Integer, Map<Integer, Boolean>>, like this:
Map<Integer, Map<Integer, Boolean>> revealed = new HashMap<>();
{
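// note: HashMap::new would bind to HashMap(int initialCapacity) here
// and pass x as the capacity, so a lambda is used instead
revealed.computeIfAbsent(x, k -> new HashMap<>()).put(y, true);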
// the preceding line is the same as saying
// Map<Integer, Boolean> yMap = revealed.get(x);
// if (yMap == null) {
// yMap = new HashMap<>();
// revealed.put(x, yMap);
// }
// yMap.put(y, true);
}
That is basically like a 2D array, but with a HashMap. (It could make sense if you had a very, very large game board.)
And judging by your description, it sounds like you already know that you could just make a boolean isRevealed; variable in the Tile class.
The documentation of TreeMap says this:
The map is sorted according to the natural ordering of its keys, or by a Comparator provided at map creation time, depending on which constructor is used.
A Java LinkedList has no natural ordering, so it cannot be compared just like that. You have to give the map a way to compare the lists, or use another type of map that does not need sorting.
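Putting the suggestions from this thread together, here is a minimal sketch that uses a HashMap with a fresh, immutable key per coordinate pair. The class and method names are hypothetical, and List.of assumes Java 9 or later (Arrays.asList(x, y) would work on older versions):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MineIndexDemo {
    // HashMap only needs equals/hashCode on the keys, which List provides.
    private final Map<List<Integer>, Boolean> mines = new HashMap<>();

    void markMine(int x, int y) {
        // A new immutable list per call, so a stored key can never be modified later.
        mines.put(List.of(x, y), true);
    }

    boolean isMine(int x, int y) {
        return mines.getOrDefault(List.of(x, y), false);
    }

    int mineCount() {
        return mines.size();
    }

    public static void main(String[] args) {
        MineIndexDemo world = new MineIndexDemo();
        world.markMine(0, 0);
        world.markMine(1, 1);
        world.markMine(1, 1); // same coordinates, replaces the existing mapping
        System.out.println(world.mineCount());  // prints 2
        System.out.println(world.isMine(1, 1)); // prints true
        System.out.println(world.isMine(2, 0)); // prints false
    }
}
```

Because each key is created at the call site and never shared, the "modified key inside the map" problem described above cannot occur.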

Reset all values in hashmap without iterating?

I am trying to reset all values in a HashMap to some default value if a condition fails.
Currently I am doing this by iterating over all the keys and individually resetting the values. Is there any way to set the same value for all the keys without iterating?
Something like:
hm.putAll("some val") //hm is hashmap object
You can't avoid iterating but if you're using java-8, you could use the replaceAll method which will do that for you.
Apply the specified function to each entry in this map, replacing each
entry's value with the result of calling the function's Function#map
method with the current entry's key and value.
m.replaceAll((k,v) -> yourDefaultValue);
Basically it iterates through each node of the table the map holds and assigns the return value of the function to each entry's value.
@Override
public void replaceAll(BiFunction<? super K, ? super V, ? extends V> function) {
Node<K,V>[] tab;
if (function == null)
throw new NullPointerException();
if (size > 0 && (tab = table) != null) {
int mc = modCount;
for (int i = 0; i < tab.length; ++i) {
for (Node<K,V> e = tab[i]; e != null; e = e.next) {
e.value = function.apply(e.key, e.value); //<-- here
}
}
if (modCount != mc)
throw new ConcurrentModificationException();
}
}
Example:
public static void main (String[] args){
Map<String, Integer> m = new HashMap<>();
m.put("1",1);
m.put("2",2);
System.out.println(m);
m.replaceAll((k,v) -> null);
System.out.println(m);
}
Output:
{1=1, 2=2}
{1=null, 2=null}
You can't avoid iterating in some fashion.
You could get the values via Map.values() and iterate over those. You'll bypass the lookup by key and it's probably the most efficient solution (although I suspect generally that would save you relatively little, and perhaps it's not the most obvious to a casual reader of your code)
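One caveat about iterating the values: the Collection returned by values() has no method for replacing an element in place, so in practice the write goes through the entry set. A minimal sketch, with a hypothetical helper name:

```java
import java.util.HashMap;
import java.util.Map;

class ResetValuesDemo {
    // Overwrites every value in the map with defaultValue, without any lookup by key.
    static <K, V> void resetAll(Map<K, V> map, V defaultValue) {
        for (Map.Entry<K, V> entry : map.entrySet()) {
            entry.setValue(defaultValue); // writes through to the backing map
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("1", 1);
        m.put("2", 2);
        resetAll(m, 0);
        System.out.println(m.get("1")); // prints 0
        System.out.println(m.get("2")); // prints 0
    }
}
```

This is still O(n); it only avoids the per-key hash lookup that a keySet() loop with put() would do.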
IMHO you could create your own data structure that extends a Map implementation. Then you can write your own resetAll() method that takes the default value. A HashMap is a hash table rather than a balanced tree (that would be TreeMap), but either way you don't need to worry about the structure: resetting only changes the values stored in the entries, so the map has the same structure before and after the reset.
Just be careful with concurrent threads. Maybe you should use ConcurrentHashMap (note that it does not permit null values, so the default cannot be null).
public class MyMap<K,V> extends ConcurrentHashMap<K, V>{
public void resetAll(V value){
// typed entries, so no raw Map.Entry casts are needed
for (Map.Entry<K, V> entry : this.entrySet()) {
entry.setValue( value );
}
}
}
Regards
If you're willing to keep a copy of it (a HashMap holding the default values),
you can first clear your HashMap and then copy the default values over:
hm.clear();
hm.putAll(defaultMap);
It is not possible to apply an operation to all values in a collection in less than O(n) time, however if your objection is truly with iteration itself, there are some possible alternatives, notably functional programming.
This is easiest with the Guava library's functional programming utilities (or natively in Java 8). Guava's Maps.transformValues() provides a view of the map with the provided function applied. This means that the call returns in O(1) time, unlike your iteration, but the computation is done on the fly whenever you .get() from the returned map. This is obviously a tradeoff: if you only need to .get() certain elements from the transformed map, you save time by avoiding computing unnecessary values. On the other hand, if you know you'll later hit every element at least once, using this behavior means you'll actually waste time. In essence, this approach is O(k) where k is the number of lookups you plan to do. If k is always less than n, then using the transformation approach is optimal.
Read carefully however the caveat at the top of the page; iteration is a simple, easy, and generally ideally efficient way to work with the members of a map. You should only try to optimize past that when absolutely necessary.
Assuming that your problem is not with doing the iteration yourself, but with the fact that O(n) is going on at some point, I would suggest a couple of alternative approaches. Bear in mind I have no idea what you are using this for, so it might not make any sense to you.
Case A: If your set of keys is known and fixed beforehand, keep a copy (not a reference, an actual clone) somewhere with the values reset to the one you want. Then on that condition you mention, simply switch the references to use the default one.
Case B: If the keys change over time, use the idea from case A but add a new entry with the default value to the copy for every new key added (or remove entries accordingly). Your updates should hardly notice the extra work, and you can still switch back to the default in O(1).

Most efficient way to clear a Java HashMap [duplicate]

Using Java, I have a Map which contains some items. I want to clear all the data in it so I can use it again. Which method is more efficient?
params.clear()
or
params = new HashMap();
I would prefer clear() because you can have the Map as final member.
class Foo {
private final Map<String, String> map = new HashMap<String, String>();
void add(String string) {
map.put(string, "a value");
}
void clear() {
map.clear();
}
}
If you assign a new Map every time you can run into multithreading issues.
Below is an almost thread-safe example that uses a Map wrapped in Collections.synchronizedMap, but assigns a new map every time you clear it.
class MapPrinter {
private static Map<String, String> createNewMap() {
return Collections.synchronizedMap(new HashMap<String, String>());
}
private Map<String, String> map = createNewMap();
void add(String key, String value) {
// put is atomic due to synchronizedMap
map.put(key, value);
}
void printKeys() {
// to iterate, we need to synchronize on the map
synchronized (map) {
for (String key : map.keySet()) {
System.out.println("Key:" + key);
}
}
}
void clear() {
// hmmm.. this does not look right
synchronized(map) {
map = createNewMap();
}
}
}
The clear method is responsible for a big problem: synchronized(map) will no longer work as intended, since the map object can change and two threads can now be inside those synchronized blocks simultaneously because they don't lock the same object. To make that actually thread-safe we would either have to synchronize completely externally (and .synchronizedMap would be useless), or we could simply make the field final and use Map.clear().
void clear() {
// atomic via synchronizedMap
map.clear();
}
Other advantages of a final Map (or anything final)
No extra logic to check for null or to create a new one. The overhead in code you may have to write to change the map can be quite a lot.
No accidentally forgetting to assign a Map
"Effective Java #13: Favor Immutability" - while the map is mutable, our reference is not.
In general:
If you don't know how clear() is implemented, you can't guess which one would be more performant. I can come up with synthetic use cases where one or the other would definitely win.
If your map does not hold millions and millions of records, you can go either way. Performance would be about the same.
Specifically:
HashMap clears by wiping the content of its inner array, making the old map content available for GC immediately. When you create a new HashMap, the old map content also becomes available for GC, plus the old HashMap object itself. You are trading a few CPU cycles for slightly less memory to GC.
You need to consider other issues:
Do you pass this reference to some other code/component? You might want to use clear() so that this other code sees your changes; the reverse is also true.
Do you want a no-hassle, no-side-effect new map? I'd go with creating a new one.
etc
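The point about passing the reference to other code can be seen in a small sketch (the class and method names are made up for illustration). A second reference to the same map sees the effect of clear(), but not the effect of assigning a new map to the original variable:

```java
import java.util.HashMap;
import java.util.Map;

class ClearVsNewDemo {
    // Size seen through a second reference after the owner calls clear().
    static int aliasSizeAfterClear() {
        Map<String, String> params = new HashMap<>();
        params.put("a", "1");
        Map<String, String> alias = params; // e.g. handed to another component
        params.clear();                     // same object, now empty
        return alias.size();
    }

    // Size seen through a second reference after the owner assigns a new map.
    static int aliasSizeAfterReassign() {
        Map<String, String> params = new HashMap<>();
        params.put("a", "1");
        Map<String, String> alias = params;
        params = new HashMap<>();           // alias still points at the old map
        return alias.size();
    }

    public static void main(String[] args) {
        System.out.println(aliasSizeAfterClear());    // prints 0
        System.out.println(aliasSizeAfterReassign()); // prints 1
    }
}
```

Which behavior you want depends on whether the other holder of the map is supposed to notice the reset.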

Java - HashMap confusion about collision handling and the get() method

I'm using a HashMap and I haven't been able to get a straight answer on how the get() method works in the case of collisions.
Let's say n > 1 objects get placed in the same key. Are they stored in a LinkedList? Are they overwritten so that only the last object placed in that key exists there anymore? Are they using some other collision method?
If they are placed in a LinkedList, is there a way to retrieve that entire list? If not, is there some other built in map for Java in which I can do this?
For my purposes, separate chaining would be ideal, as if there are collisions, I need to be able to look through the list and get information about all the objects in it. What would be the best way to do this in Java?
Thanks for all your help!
The documentation for HashMap.put() clearly states, "Associates the specified value with the specified key in this map. If the map previously contained a mapping for the key, the old value is replaced."
If you would like to have a list of objects associated with a key, then store a list as the value.
Note that 'collision' generally refers to the internal working of the HashMap, where two keys have the same hash value, not the use of the same key for two different values.
Are they overwritten so that only the last object placed in that key exists there anymore?
Yes, assuming you're putting multiple values with the same key (according to Object.equals, not Object.hashCode.) That's specified in the Map.put javadoc:
If the map previously contained a mapping for the key, the old value is replaced by the specified value.
If you want to map a key to multiple values, you're probably better off using something like Guava's ListMultimap, ArrayListMultimap in specific, which maps keys to lists of values. (Disclosure: I contribute to Guava.) If you can't tolerate a third-party library, then really you have to have a Map<Key, List<Value>>, though that can get a bit unwieldy.
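For the plain-JDK route, a minimal sketch of the Map<Key, List<Value>> approach could look like this. The class and helper names are hypothetical, and computeIfAbsent assumes Java 8 or later:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ListValuedMapDemo {
    // Appends value to the list stored under key, creating the list on first use.
    static <K, V> void add(Map<K, List<V>> map, K key, V value) {
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> map = new HashMap<>();
        add(map, "a", 1);
        add(map, "a", 2); // kept alongside 1 instead of replacing it
        add(map, "b", 3);
        System.out.println(map.get("a")); // prints [1, 2]
        System.out.println(map.get("b")); // prints [3]
    }
}
```

map.get(key) then returns the whole list for that key, which covers the "retrieve the entire list" part of the question; a key that was never added returns null rather than an empty list.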
Let's say n > 1 objects get placed in the same key. Are they stored in a linked list? Are they overwritten so that only the last object placed in that key exists there anymore? Are they using some other collision method?
There can only be a single value per key, so the last one put replaces the prior one:
Map<String, Integer> map = new HashMap<String, Integer>();
map.put("a", 1);
map.put("a", 2);// it overrides 1 and puts 2 there
Chaining comes into play when different keys turn out to have the same hash.
See
Java papers hash table working
Cite: "Let's say n > 1 objects get placed in the same key. Are they stored in a linked list? Are they overwritten so that only the last object placed in that key exists there anymore? Are they using some other collision method?"
Yes, if the HashMap already contained something under this key, the new value will replace it.
You can implement your own class to handle that or, more simply, use a HashMap<K, List<V>>, where K is your key type and V is the value type.
Keep in mind that with the latter solution, map.get(key) returns a List (or whichever implementation you chose, e.g. ArrayList), so all the methods of that type are available to you, which may be enough for your requirements. For example, with an ArrayList you have size, contains, remove, etc.
Collision resolution in java.util.HashMap is based on separate chaining, not on open addressing: entries whose hashes map to the same bucket are linked together in that bucket (since Java 8, a long chain is converted into a balanced tree). That chain is an internal detail, though, and the API gives you no way to retrieve it.
If you want all the objects that share a key, put them in a list yourself and store that list as the value in the map.
package hashing;
import java.util.HashMap;
import java.util.Map;
public class MainAnimal {
/**
* @param args
*/
public static void main(String[] args) {
Animal a1 = new Animal(1);
Animal a2 = new Animal(2);
Map<Animal, String> animalsMap = new HashMap<Animal, String>();
animalsMap.put(a1,"1");
animalsMap.put(a2,"2");
System.out.println(animalsMap.get(a1));
Map<String, Integer> map = new HashMap<String, Integer>();
map.put("a", 1);
map.put("a", 2);// it overrides 1 and puts 2 there
System.out.println(map.get("a"));
}
}
class Animal {
private int index = 0;
Animal(int index){
this.index = index;
}
public boolean equals(Object obj){
if(obj instanceof Animal) {
Animal animal = (Animal) obj;
if(animal.getIndex()==this.getIndex())
return true;
else
return false;
}
return false;
}
public int hashCode() {
return 0;
}
public int getIndex() {
return index;
}
public void setIndex(int index) {
this.index = index;
}
}
In the above code, I am showing two different things.
case 1 - two different instances that have the same hash code
case 2 - two equal keys used in two put calls.
The Animal instances a1 and a2 have the same hash code, so they land in the same bucket. But they are not overwritten: because equals returns false for them, they are kept as two separate entries in that bucket.
In the second case, the keys have the same hash code and equals also returns true. Hence the second put overwrites the first value.
Now if in the animal class I override the equals method this way -
public boolean equals(Object obj){
// if(obj instanceof Animal) {
// Animal animal = (Animal) obj;
// if(animal.getIndex()==this.getIndex())
// return true;
// else
// return false;
// }
// return false;
return true;
}
Now the value is overwritten. The behavior is the same as using the same instance twice, since a1 and a2 are in the same bucket and equals returns true as well.
