Concurrent Reads from Unmodifiable Map - java

If I statically initialize a map and set the reference to a Collections.unmodifiableMap(Map m), do I need to synchronize reads?
private static final Map<String,String> staticMap;
static{
Map<String,String> tempMap = new HashMap<String,String>();
tempMap.put("key 1","value 1");
tempMap.put("key 2","value 2");
tempMap.put("key 3","value 3");
staticMap = Collections.unmodifiableMap(tempMap);
}

No, the map you're creating there is effectively immutable (since nothing has a reference to the mutable backing map) and safe for concurrent access. If you want a clearer guarantee of that along with making it easier to create the map, Guava's ImmutableMap type is designed for just this sort of use (among other things):
private static final ImmutableMap<String, String> staticMap = ImmutableMap.of(
"key1", "value1",
"key2", "value2",
"key3", "value3");

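For the statically initialized map discussed above, a similar effect can be sketched with the JDK alone when Guava is not an option, assuming Java 9 or later where Map.of is available. The class name StaticMapExample and the helper rejectsWrites are illustrative only.

```java
import java.util.Map;

final class StaticMapExample {
    // Map.of (Java 9+) returns an unmodifiable map; there is no separate
    // mutable backing map that other code could hold on to.
    static final Map<String, String> STATIC_MAP = Map.of(
            "key 1", "value 1",
            "key 2", "value 2",
            "key 3", "value 3");

    // Returns true if the map rejects a write attempt.
    static boolean rejectsWrites() {
        try {
            STATIC_MAP.put("key 4", "value 4");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(STATIC_MAP.get("key 2")); // value 2
        System.out.println(rejectsWrites());         // true
    }
}
```

Like Guava's ImmutableMap, Map.of rejects null keys and values, so it is not a drop-in replacement for a HashMap that contains nulls.
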
Nope, reads don't modify the map, so I wouldn't worry about it at all. It's only write+write or write+read that requires synchronization.

It depends on the Map implementation. HashMap, TreeMap and the like all have reads that are modification free and so are fine, but implementations that track usage may perform updates internally.
An example is a LinkedHashMap with access ordering:
new LinkedHashMap(int initialCapacity, float loadFactor, boolean accessOrder)
This will actually reorder the elements on each read, so that iteration over the keys, values or entries runs from least recently accessed to most recently accessed. Another map that may modify itself on reads is WeakHashMap.
An excellent alternative would be the ImmutableMap found in Google's Guava library.
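A minimal sketch of a read that changes an access-ordered LinkedHashMap follows. The class and method names are illustrative; the only API used is the three-argument LinkedHashMap constructor quoted above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AccessOrderExample {
    // Builds an access-ordered map, performs one read, and returns the key order afterwards.
    static List<String> keyOrderAfterRead() {
        Map<String, String> map = new LinkedHashMap<>(16, 0.75f, true); // accessOrder = true
        map.put("a", "1");
        map.put("b", "2");
        map.put("c", "3");
        map.get("a"); // a plain read, yet it moves "a" to the end of the iteration order
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keyOrderAfterRead()); // [b, c, a]
    }
}
```

Because get() relinks entries internally, two threads that only read such a map can still corrupt it unless access is synchronized.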

I disagree with the above answers. The map implementations contain non-volatile fields (such as HashMap.entrySet and, in the unmodifiable case, UnmodifiableMap.keySet, UnmodifiableMap.entrySet and UnmodifiableMap.values). These fields are lazily initialized, so they are still null after the static initializer has run. If one thread then calls entrySet(), it initializes the entrySet field. Access to this field from other threads is not synchronized, so another thread may see the field in an inconsistent state or not see the write at all.

The short answer is: no. You don't need to lock if there is no read-write contention. You only need a lock if whatever you're sharing might change; if it doesn't change, it's effectively immutable, and immutable objects are considered thread safe.

I think others have covered the answer already (yes in the case of the HashMap implementation). If you don't necessarily always need the map to be created, you can make it lazy using the Initialize-On-Demand Holder idiom:
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
public class YourClass {
// created only if needed, thread-safety still preserved
private static final class MapHolder {
private static final Map<String,String> staticMap;
static{
System.out.println("Constructing staticMap");
Map<String,String> tempMap = new HashMap<String,String>();
tempMap.put("key 1","value 1");
tempMap.put("key 2","value 2");
tempMap.put("key 3","value 3");
staticMap = Collections.unmodifiableMap(tempMap);
}
}
// use this to actually access the instance
public static Map<String,String> mapGetter() {
return MapHolder.staticMap;
}
public static void main(String[] arg) {
System.out.println("Started, note that staticMap not yet created until...");
Map<String,String> m = mapGetter();
System.out.println("we get it: " + m);
}
}
which will print:
Started, note that staticMap not yet created until...
Constructing staticMap
we get it: {key 1=value 1, key 2=value 2, key 3=value 3}

Related

java.util.Map values() method performance

I have a map like this with several million entries:
private final Map<String, SomeItem> tops = new HashMap<>();
I need to get list of values, which could be done by calling java.util.Map values() method.
Is Collection of values created every time I call values() method or is it pre-computed from performance perspective?
As my Map has several millions elements, I do not want to create new list object every time values() is called.
Below is the copied implementation of Map.values() in java.util.HashMap:
public Collection<V> values() {
Collection<V> vs = values;
if (vs == null) {
vs = new Values();
values = vs;
}
return vs;
}
This clearly shows that the value collection isn't created unless necessary. So there should not be additional overhead caused by calls to values().
One important point here may be: It does not matter!
But first, referring to the other answers so far: The collection that is returned there is usually "cached", in that it is lazily created, and afterwards, the same instance will be returned. For example, considering the implementation in the HashMap class:
public Collection<V> values() {
Collection<V> vs;
return (vs = values) == null ? (values = new Values()) : vs;
}
This is even specified (as part of the contract, as an implementation specification) in the documentation of the AbstractMap class (which most Map implementations are based on) :
The collection is created the first time this method is called, and returned in response to all subsequent calls. No synchronization is performed, so there is a slight chance that multiple calls to this method will not all return the same collection.
But now, one could argue that the implementation might change later. The implementation of the HashMap class could change, or one might switch to another Map implementation that does not extend AbstractMap, and which is implemented differently. The fact that it is currently implemented like this is (for itself) no guarantee that it will always be implemented like this.
So the more important point (and the reason why it does not matter) is that the values() method is indeed supposed to return a collection view. As stated in the documentation of the Map interface :
The Map interface provides three collection views, which allow a map's contents to be viewed as a set of keys, collection of values, or set of key-value mappings.
and specifically, the documentation of the Map#values() method :
Returns a Collection view of the values contained in this map. The collection is backed by the map, so changes to the map are reflected in the collection, and vice-versa.
I cannot imagine a reasonable way of implementing such a view that involves processing all values of the Map.
So for example, imagine the implementation in HashMap was like this:
public Collection<V> values() {
return new Values();
}
Then it would return a new collection each time that it was called. But creating this collection does not involve processing the values at all.
Or to put it another way: the cost of calling this method is independent of the size of the map. It basically has the cost of a single object creation, regardless of whether the map contains 10 or 10000 elements.
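The view behaviour described in the Map#values() documentation can be checked with a short sketch like the one below. The class and method names are illustrative.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

final class ValuesViewExample {
    // Returns {size of the view after a later put, size of the map after removing via the view}.
    static int[] demo() {
        Map<String, String> map = new HashMap<>();
        map.put("k1", "v1");
        Collection<String> values = map.values(); // a view onto the map, not a copy
        map.put("k2", "v2");                      // shows up in the view obtained earlier
        int viewSizeAfterPut = values.size();
        values.remove("v1");                      // removes the k1 mapping from the map itself
        return new int[] { viewSizeAfterPut, map.size() };
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println(result[0]); // 2
        System.out.println(result[1]); // 1
    }
}
```

Nothing is copied at any point, which is why the cost of values() does not grow with the number of entries.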
As others have mentioned you can see this by looking at the code. You can also code up a quick example to prove it to yourself. The code below will print true 10 times as the object identity will always be the same for values.
public static void main(String[] args) {
Map<String, String> myMap = new HashMap<>();
Collection<String> lastValues = myMap.values();
for (int i=0; i < 10; i++) {
System.out.println(lastValues == myMap.values());
lastValues = myMap.values();
}
}
The following code will print true the first time and then false the next 9 times.
public static void main(String[] args) {
Map<String, String> myMap = new HashMap<>();
Collection<String> lastValues = myMap.values();
for (int i=0; i < 10; i++) {
System.out.println(lastValues == myMap.values());
lastValues = myMap.values();
myMap = new HashMap<>();
}
}
One more suggestion after reading this thread: if the contents of the tops map never change after it is populated, you could use Guava's ImmutableMap. For more information, see "UnmodifiableMap (Java Collections) vs ImmutableMap (Google)".

What does the HashMap(Map m) constructor do?

I have come across a piece of code where I found
public class MapImpl {
private static MapImpl mpl = new MapImpl();
Map<String,String> hm;
private MapImpl() {
hm = new HashMap<>();
}
public void addContentsToMap(Map<String,String> m){
this.hm=m;
}
public Map<String,String> returnMap(){
return new HashMap<>(hm);
}
}
I understand that when the default constructor is called, hm is initialized to an empty HashMap, and that when addContentsToMap is called, hm is replaced by the map that was passed in.
I see that returnMap uses the HashMap(Map m) constructor. I have gone through the source code of HashMap but could not work out what it does.
It takes any implementation of Map interface and constructs a HashMap which also is an implementation of Map interface.
Developers like hash-based collections (HashSet, HashMap, etc.) because they provide expected O(1) time for get and contains.
It can be useful when you have a Map that isn't a HashMap (e.g. Properties), you know it will be large, and you will read from it many times: copying it into a different Map implementation can then pay off.
Documentation:
public HashMap(Map<? extends K,? extends V> m)
Constructs a new HashMap with the same mappings as the specified Map. The HashMap is created with default load factor (0.75) and an initial capacity sufficient to hold the mappings in the specified Map.
Parameters:
m - the map whose mappings are to be placed in this map
Throws:
NullPointerException - if the specified map is null
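As a small sketch of what the documented constructor does, the example below copies a TreeMap into a new HashMap and shows that the copy is independent of the source. The class name and the copyOf helper are illustrative and not part of the code in the question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

final class CopyConstructorExample {
    // Copies the mappings of any Map implementation into a fresh HashMap.
    static Map<String, String> copyOf(Map<String, String> source) {
        return new HashMap<>(source);
    }

    public static void main(String[] args) {
        Map<String, String> original = new TreeMap<>();
        original.put("a", "1");

        Map<String, String> copy = copyOf(original);
        copy.put("b", "2"); // changes the copy only

        System.out.println(original.size()); // 1
        System.out.println(copy.size());     // 2
    }
}
```

This is presumably why returnMap in the question wraps hm in a new HashMap: callers receive their own map and cannot change the one held by MapImpl. The copy is shallow, so the keys and values themselves are shared.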

Most efficient way to clear a Java HashMap [duplicate]

Using Java, I have a Map interface which has some items. I want to clear all the data in it to use it again. Which method is more efficient?
params.clear()
or
params = new HashMap();
I would prefer clear() because you can have the Map as final member.
class Foo {
private final Map<String, String> map = new HashMap<String, String>();
void add(String string) {
map.put(string, "a value");
}
void clear() {
map.clear();
}
}
If you assign a new Map every time you can run into multithreading issues.
Below is an almost threadsafe example for using a Map wrapped in Collections.synchronizedMap but it assigns a new map every time you clear it.
class MapPrinter {
private static Map<String, String> createNewMap() {
return Collections.synchronizedMap(new HashMap<String, String>());
}
private Map<String, String> map = createNewMap();
void add(String key, String value) {
// put is atomic due to synchronizedMap
map.put(key, value);
}
void printKeys() {
// to iterate, we need to synchronize on the map
synchronized (map) {
for (String key : map.keySet()) {
System.out.println("Key:" + key);
}
}
}
void clear() {
// hmmm.. this does not look right
synchronized(map) {
map = createNewMap();
}
}
}
The clear method is responsible for a big problem: synchronized(map) will no longer work as intended since the map object can change, and now two threads can simultaneously be within those synchronized blocks since they don't lock the same object. To make that actually threadsafe we would either have to synchronize completely externally (and .synchronizedMap would be useless) or we could simply make it final and use Map.clear().
void clear() {
// atomic via synchronizedMap
map.clear();
}
Other advantages of a final Map (or anything final)
No extra logic to check for null or to create a new one. The overhead in code you may have to write to change the map can be quite a lot.
No accidentally forgetting to assign a Map
"Effective Java #13: Favor Immutability" - while the map is mutable, our reference is not.
In general:
If you don't know how clear() is implemented, you can't guess which one would be more performant. I can come up with synthetic use cases where one or the other would definitely win.
If your map does not hold millions and millions of records, you can go either way. Performance would be practically the same.
Specifically:
HashMap clears by wiping the content of its internal array, which makes the old map content available for GC immediately. Creating a new HashMap also makes the old map content available for GC, plus the old HashMap object itself. You are trading a few CPU cycles for slightly less memory to GC.
You need to consider other issues:
Do you pass this reference to some other code/component? You might want to use clear() so that this other code sees your changes, reverse is also true
Do you want no-hassle, no side-effect new map? I'd go with creating a new one.
etc
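The point about other code holding the same reference can be sketched as follows. The class name and the alias variable are illustrative; alias stands in for any other component that was handed the map.

```java
import java.util.HashMap;
import java.util.Map;

final class ClearVersusNewExample {
    // Returns {size seen through the alias after clear(), size seen after reassignment}.
    static int[] demo() {
        Map<String, String> params = new HashMap<>();
        Map<String, String> alias = params; // another holder of the same map object

        params.put("k", "v");
        params.clear();                     // empties the shared object
        int afterClear = alias.size();

        params.put("k", "v");
        params = new HashMap<>();           // only rebinds the local reference
        int afterReassign = alias.size();   // alias still points at the old, non-empty map

        return new int[] { afterClear, afterReassign };
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println(result[0]); // 0
        System.out.println(result[1]); // 1
    }
}
```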

Custom map entry set - Is it possible?

I am making a class that maps Strings to Integers. I want to be able to get the Integer associated with a particular String and iterate through the entries, which are defined as another class that implements Map.Entry<String, Integer>.
Currently I have this:
public class MyMap implements Iterable<MyEntry> {
private final Map<String, Integer> wrappedMap =
new HashMap<String, Integer>();
@Override
public Iterator<MyEntry> iterator() {
return wrappedMap.entrySet().iterator();
}
//more methods
}
It's not compiling because of a type mismatch even though MyEntry implements Map.Entry<String, Integer>.
Is there a way to make a custom implementation of Map.Entry? Is there an easier way to do this that I'm overlooking? Thanks in advance!
It's not compiling because MyEntry is not a part of the hashmap at all. If you want to return a list of MyEntry then you need to copy the data elements into a MyEntry instance and load that into a collection. Which is going to be slow and consume a considerable amount of memory.
It should be:
@Override
public Iterator<Map.Entry<String,Integer>> iterator() {
return wrappedMap.entrySet().iterator();
}
The call to entrySet() returns a Set which contains the mappings in the HashMap, so the iterator needs to iterate over Map.Entry objects.
Why not use just a regular map?
Map<String, MyEntry> map = new HashMap<String, MyEntry>();
Then your iterator will be simply this:
Iterator<MyEntry> iter = map.values().iterator();
Even though MyEntry implements Map.Entry<K,V>, it is not the case that an Iterator<MyEntry> implements Iterator<Map.Entry<K,V>>. For a class like Iterator, that distinction doesn't make intuitive sense to a human being, so let's instead think of a Box<E> class, which has .put(E) and .contains(E) methods. Is a Box<Dinosaur> a subclass of Box<Animal>? You might think so, but that's not the case: in a Box<Animal> it's legal to call .put(someMammal), but in a Box<Dinosaur> that is clearly illegal. Since Box<Dinosaur> can't support all actions that are legal on a Box<Animal>, it is definitely not a subclass and cannot be substituted in at will.
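From the compiler's point of view, the same concern applies to iterators: generic types are invariant, so you can't implement .iterator() to return an iterator whose type argument differs from the declared one (here, an Iterator<Map.Entry<String,Integer>> where an Iterator<MyEntry> is declared).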
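If iterating over MyEntry objects is still the goal, one possible approach is an adapter iterator that wraps each Map.Entry as it is visited. This is only a sketch: MyEntry's definition is not shown in the question, so the version below (built on AbstractMap.SimpleImmutableEntry) is a hypothetical stand-in, as is the put method.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical entry type; the question does not show how MyEntry is defined.
final class MyEntry extends AbstractMap.SimpleImmutableEntry<String, Integer> {
    MyEntry(Map.Entry<String, Integer> entry) {
        super(entry); // copies the key and value of the wrapped entry
    }
}

final class MyMap implements Iterable<MyEntry> {
    private final Map<String, Integer> wrappedMap = new HashMap<>();

    // Illustrative helper so the sketch has something to iterate over.
    void put(String key, Integer value) {
        wrappedMap.put(key, value);
    }

    @Override
    public Iterator<MyEntry> iterator() {
        final Iterator<Map.Entry<String, Integer>> it = wrappedMap.entrySet().iterator();
        // Adapter: converts each Map.Entry to a MyEntry on demand.
        return new Iterator<MyEntry>() {
            @Override
            public boolean hasNext() {
                return it.hasNext();
            }

            @Override
            public MyEntry next() {
                return new MyEntry(it.next());
            }
        };
    }

    public static void main(String[] args) {
        MyMap map = new MyMap();
        map.put("x", 1);
        for (MyEntry e : map) {
            System.out.println(e.getKey() + "=" + e.getValue()); // x=1
        }
    }
}
```

This allocates one small object per element during iteration instead of copying the whole entry set up front. The entries are snapshots, so calling setValue on them is not supported.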

Collections.unmodifiableMap() and collections where reads also modify

This is more a curiosity question than anything. Say I supply a LinkedHashMap with access ordering set to true to Collections.unmodifiableMap(). Since reads actually modify the map, does that mean there are cases where the view returned by unmodifiableMap() is actually modifiable?
public class MyApp {
/**
* @param args
*/
public static void main(String[] args) {
Map<String, String> m = new LinkedHashMap<String,
String>(16,.75f,true);
Collections.unmodifiableMap(m);
}
}
The Map is modifying itself. Collections.unmodifiableMap() only provides a decorator for the Map which disallows modifications, it does not make the Map itself unmodifiable.
Collections.unmodifiableMap returns a new Map that throws exceptions when you try to modify it, using the existing Map that you passed in as a backing collection. It doesn't change the semantics of the existing Map.
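A short sketch of this behaviour, with illustrative class and method names: a read through the unmodifiable view still reorders the access-ordered backing map, while a write through the view is rejected.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class UnmodifiableViewExample {
    // Creates an access-ordered backing map with two entries.
    private static Map<String, String> newBackingMap() {
        Map<String, String> backing = new LinkedHashMap<>(16, 0.75f, true);
        backing.put("a", "1");
        backing.put("b", "2");
        return backing;
    }

    // Reads through the unmodifiable view and returns the backing map's key order afterwards.
    static List<String> backingOrderAfterViewRead() {
        Map<String, String> backing = newBackingMap();
        Map<String, String> view = Collections.unmodifiableMap(backing);
        view.get("a"); // delegates to backing.get, which moves "a" to the end
        return new ArrayList<>(backing.keySet());
    }

    // Returns true if a write through the view is rejected.
    static boolean viewRejectsPut() {
        Map<String, String> view = Collections.unmodifiableMap(newBackingMap());
        try {
            view.put("c", "3");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(backingOrderAfterViewRead()); // [b, a]
        System.out.println(viewRejectsPut());            // true
    }
}
```

So the view blocks the mutating methods of the Map interface, but it cannot stop the backing map from changing its own internal state on a read.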
