Most efficient way to clear a Java HashMap [duplicate] - java

This question already has answers here:
Map.clear() vs new Map : Which one will be better? [duplicate]
(7 answers)
Fastest way to recreate the ArrayList in a for loop
(4 answers)
Closed 8 years ago.
Using Java, I have a Map which holds some items. I want to clear all the data in it so that I can use it again. Which method is more efficient?
params.clear()
or
params = new HashMap();

I would prefer clear() because you can have the Map as a final member.
class Foo {
private final Map<String, String> map = new HashMap<String, String>();
void add(String string) {
map.put(string, "a value");
}
void clear() {
map.clear();
}
}
If you assign a new Map every time, you can run into multithreading issues.
Below is an almost thread-safe example that uses a Map wrapped in Collections.synchronizedMap but assigns a new map every time you clear it.
class MapPrinter {
private static Map<String, String> createNewMap() {
return Collections.synchronizedMap(new HashMap<String, String>());
}
private Map<String, String> map = createNewMap();
void add(String key, String value) {
// put is atomic due to synchronizedMap
map.put(key, value);
}
void printKeys() {
// to iterate, we need to synchronize on the map
synchronized (map) {
for (String key : map.keySet()) {
System.out.println("Key:" + key);
}
}
}
void clear() {
// hmmm.. this does not look right
synchronized(map) {
map = createNewMap();
}
}
}
The clear method is responsible for a big problem: synchronized(map) will no longer work as intended, since the map object can change and two threads can then be inside those synchronized blocks simultaneously because they don't lock the same object. To make that actually thread-safe we would either have to synchronize completely externally (and .synchronizedMap would be useless) or we could simply make the field final and use Map.clear().
void clear() {
// atomic via synchronizedMap
map.clear();
}
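Putting the pieces of this answer together, a version with a final field might look like the sketch below. The class name FinalMapPrinter and the keys() helper are made up for illustration; the point is only that every synchronized (map) block now locks the same object for the lifetime of the instance.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch; FinalMapPrinter and keys() are hypothetical names.
class FinalMapPrinter {
    // final: the lock object used by synchronized (map) can never change
    private final Map<String, String> map =
            Collections.synchronizedMap(new HashMap<String, String>());

    void add(String key, String value) {
        // put is atomic due to synchronizedMap
        map.put(key, value);
    }

    // Copies the keys while holding the map's lock, as iteration requires.
    List<String> keys() {
        synchronized (map) {
            return new ArrayList<>(map.keySet());
        }
    }

    void clear() {
        // atomic via synchronizedMap; the reference itself is untouched
        map.clear();
    }

    public static void main(String[] args) {
        FinalMapPrinter printer = new FinalMapPrinter();
        printer.add("a", "1");
        System.out.println(printer.keys());
        printer.clear();
        System.out.println(printer.keys());
    }
}
```

Returning a copy from keys() is a design choice for the sketch: callers can iterate the result without holding any lock.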
Other advantages of a final Map (or anything final):
No extra logic to check for null or to create a new one. The code you may have to write to swap the map safely can add up to quite a lot.
No accidentally forgetting to assign a Map.
"Effective Java #13: Favor Immutability" - while the map is mutable, our reference is not.

In general:
If you don't know how clear() is implemented, you can't guess which one would be more performant. I can come up with synthetic use cases where one or the other would definitely win.
If your map does not hold millions and millions of records, you can go either way. The difference in performance will be negligible.
Specifically:
HashMap clears by wiping the content of its inner array, which makes the old map content available for GC immediately. When you create a new HashMap, the old map content also becomes available for GC, along with the old HashMap object itself. You are trading a few CPU cycles for slightly less memory to GC.
You need to consider other issues:
Do you pass this reference to some other code/component? You might want to use clear() so that this other code sees your changes; the reverse is also true.
Do you want a no-hassle, no-side-effect new map? I'd go with creating a new one.
etc
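The point about passing the reference to other code can be seen in a small sketch. The class and method names here are invented for the demonstration; "alias" stands in for whatever other component holds the same map.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; all names are hypothetical.
class ClearVersusNewDemo {
    // clear() empties the one shared object, so the alias sees it too.
    static int sizeSeenByAliasAfterClear() {
        Map<String, String> params = new HashMap<>();
        Map<String, String> alias = params; // e.g. handed to another component
        params.put("k", "v");
        params.clear();
        return alias.size();
    }

    // Reassigning only rebinds the local variable; the alias keeps the old map.
    static int sizeSeenByAliasAfterReassign() {
        Map<String, String> params = new HashMap<>();
        Map<String, String> alias = params;
        params.put("k", "v");
        params = new HashMap<>();
        return alias.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeSeenByAliasAfterClear());    // prints 0
        System.out.println(sizeSeenByAliasAfterReassign()); // prints 1
    }
}
```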

Related

java.util.Map values() method performance

I have a map like this with several million entries:
private final Map<String, SomeItem> tops = new HashMap<>();
I need to get the list of values, which can be done by calling the java.util.Map values() method.
Is the Collection of values created every time I call values(), or is it pre-computed, from a performance perspective?
As my Map has several million elements, I do not want to create a new list object every time values() is called.
Below is the copied implementation of Map.values() in java.util.HashMap:
public Collection<V> values() {
Collection<V> vs = values;
if (vs == null) {
vs = new Values();
values = vs;
}
return vs;
}
This clearly shows that the value collection isn't created unless necessary. So, there should not be additional overhead caused by calls to values()
One important point here may be: It does not matter!
But first, referring to the other answers so far: The collection that is returned there is usually "cached", in that it is lazily created, and afterwards, the same instance will be returned. For example, considering the implementation in the HashMap class:
public Collection<V> values() {
Collection<V> vs;
return (vs = values) == null ? (values = new Values()) : vs;
}
This is even specified (as part of the contract, as an implementation specification) in the documentation of the AbstractMap class (which most Map implementations are based on) :
The collection is created the first time this method is called, and returned in response to all subsequent calls. No synchronization is performed, so there is a slight chance that multiple calls to this method will not all return the same collection.
But now, one could argue that the implementation might change later. The implementation of the HashMap class could change, or one might switch to another Map implementation that does not extend AbstractMap, and which is implemented differently. The fact that it is currently implemented like this is (for itself) no guarantee that it will always be implemented like this.
So the more important point (and the reason why it does not matter) is that the values() method is indeed supposed to return a collection view. As stated in the documentation of the Map interface :
The Map interface provides three collection views, which allow a map's contents to be viewed as a set of keys, collection of values, or set of key-value mappings.
and specifically, the documentation of the Map#values() method :
Returns a Collection view of the values contained in this map. The collection is backed by the map, so changes to the map are reflected in the collection, and vice-versa.
I cannot imagine a reasonable way of implementing such a view that involves processing all values of the Map.
So for example, imagine the implementation in HashMap was like this:
public Collection<V> values() {
return new Values();
}
Then it would return a new collection each time that it was called. But creating this collection does not involve processing the values at all.
Or to put it that way: The cost of calling this method is independent of the size of the map. It basically has the cost of a single object creation, regardless of whether the map contains 10 or 10000 elements.
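To see the "view" behaviour described by the Map documentation, here is a small sketch (class and method names are made up). The collection is obtained while the map is still empty, yet it reflects later puts, and removing through the view removes the mapping from the map.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; names are hypothetical.
class ValuesViewDemo {
    // Returns { view size after two puts, map size after removing via the view }.
    static int[] sizes() {
        Map<String, Integer> map = new HashMap<>();
        Collection<Integer> view = map.values(); // no values are copied here
        map.put("a", 1);
        map.put("b", 2);
        int viewSizeAfterPuts = view.size();     // the view sees both puts
        view.remove(Integer.valueOf(1));         // removes the mapping "a" -> 1
        int mapSizeAfterViewRemove = map.size();
        return new int[] { viewSizeAfterPuts, mapSizeAfterViewRemove };
    }

    public static void main(String[] args) {
        int[] s = sizes();
        System.out.println(s[0] + " " + s[1]); // prints "2 1"
    }
}
```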
As others have mentioned you can see this by looking at the code. You can also code up a quick example to prove it to yourself. The code below will print true 10 times as the object identity will always be the same for values.
public static void main(String[] args) {
Map<String, String> myMap = new HashMap<>();
Collection<String> lastValues = myMap.values();
for (int i=0; i < 10; i++) {
System.out.println(lastValues == myMap.values());
lastValues = myMap.values();
}
}
The following code will print true the first time and then false the next 9 times.
public static void main(String[] args) {
Map<String, String> myMap = new HashMap<>();
Collection<String> lastValues = myMap.values();
for (int i=0; i < 10; i++) {
System.out.println(lastValues == myMap.values());
lastValues = myMap.values();
myMap = new HashMap<>();
}
}
One more suggestion after reading this thread: if the contents of the Map tops do not change once it is populated, you could use Google Guava's ImmutableMap. For more info, see UnmodifiableMap (Java Collections) vs ImmutableMap (Google).

Bug in HashMap / ArrayList or wrong code? [duplicate]

This question already has answers here:
ArrayList as key in HashMap
(9 answers)
Closed 6 years ago.
I'm tired of trying to resolve the problem with this code:
public class MapTest {
static class T{
static class K{}
}
static Map<List<T.K>, List<String>> map = new HashMap<>();
static List<String> test(List<T.K> list, String s){
List<String> l = map.get(list);
if (l == null){
l = new ArrayList<String>();
System.out.println("New value");
map.put(list, l);
}
l.add(s);
return l;
}
public static void main(String s[]){
ArrayList<T.K> list = new ArrayList<T.K>();
test(list, "TEST");
list.add(new T.K());
List<String> l = test(list, "TEST1");
System.out.println(l.size());
}
}
It should create a new list value for the map only once, but the output is as follows:
New value
New value
1
Something seems to go wrong with the hashcode of the list after I insert a value into it.
I expect "New value" to show up only once, and the size to be 2, not 1.
Is it just a JVM problem or something more general?
Mine is Oracle JVM 1.8.0_65.
The hashcode of the list object changes when you put an item in it. You can see how the hashcode is calculated in the ArrayList.hashCode documentation.
In general, using a mutable object as the key for a map isn't going to work well. Per the Map documentation:
Note: great care must be exercised if mutable objects are used as map keys. The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map.
Thus, when you add the list to the map a second time, the map doesn't see it as being "equal" to the first list (since it isn't according to .equals), so it adds it again.
If you want a map where keys are looked up by identity rather than by value, you can use the IdentityHashMap class.
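A minimal sketch of the effect and of two ways around it follows. The class and method names are invented; the snapshot-key variant is one possible workaround I am suggesting, not something from the question, while the IdentityHashMap variant is the one mentioned above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch; names are hypothetical.
class MutableKeyDemo {
    // The key's hashCode changes after insertion, so the lookup misses.
    static boolean foundAfterMutation() {
        Map<List<String>, String> map = new HashMap<>();
        List<String> key = new ArrayList<>();
        map.put(key, "value");
        key.add("x"); // changes equals/hashCode of the key already in the map
        return map.containsKey(key);
    }

    // Using an unmodifiable copy as the key keeps its hashCode stable.
    static boolean foundWithSnapshotKey() {
        Map<List<String>, String> map = new HashMap<>();
        List<String> working = new ArrayList<>();
        List<String> key = Collections.unmodifiableList(new ArrayList<>(working));
        map.put(key, "value");
        working.add("x"); // the snapshot used as key is unaffected
        return map.containsKey(key);
    }

    // IdentityHashMap compares keys with == and ignores the list contents.
    static boolean foundByIdentity() {
        Map<List<String>, String> map = new IdentityHashMap<>();
        List<String> key = new ArrayList<>();
        map.put(key, "value");
        key.add("x");
        return map.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation());   // prints false
        System.out.println(foundWithSnapshotKey()); // prints true
        System.out.println(foundByIdentity());      // prints true
    }
}
```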

Why does this throw Concurrent Modification Exception

The following code works fine when there is more than one modification to a particular map. But when there is only one modification, it throws a ConcurrentModificationException.
for(Map.Entry<String, List<String>> mapEntry : beanMap.entrySet()) {
for(String dateSet : dateList) {
String mName = mapEntry.getKey();
boolean dateFound = false;
if(beanMap.containsKey(dateSet)) {
dateFound = true;
System.out.println(" Found : "+mapEntry.getKey());
}
if(!dateFound)
{
Map<String, List<String>> modifiedMap = beanMap;
List<String> newBeanList = new ArrayList<String>();
dBean beanData = new Bean(dateSet+"NA","NA","NA",0,0,0);
newBeanList.add(beanData);
System.out.println(" Adding : "+dateSet+" "+"NA");
modifiedMap.put(mName, newBeanList);
}
}
}
In the above code, a ConcurrentModificationException is thrown when "modifiedMap" is modified only once. Maybe there is more to it, but I couldn't find out why.
When you use an enhanced for loop, there is an implicit Iterator working behind the scenes. You attempt to make a copy of beanMap with this line:
Map<String, List<String>> modifiedMap = beanMap;
However, this only creates another reference variable that also refers to the same map object. There is still only one map, and you are modifying it:
modifiedMap.put(mName, newBeanList);
The Iterator then detects that the map is modified when it attempts to iterate to the next entry, resulting in the ConcurrentModificationException.
You can create another Map with new, and put all your modifications into that map while you're iterating the original map.
After you're done iterating the original map, you can call the putAll method on it, passing your new map, to apply all of the modifications you want.
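The "collect first, apply afterwards" approach could be sketched as follows. The method name addPlaceholders, the "-NA" key suffix and the rule for when to add an entry are all assumptions made up for the example; only the pattern (a second map plus putAll after the loop) comes from the answer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch; names and the placeholder rule are hypothetical.
class DeferredPutDemo {
    // For every key with an empty list, schedules a new placeholder entry,
    // then applies all scheduled entries once iteration has finished.
    static void addPlaceholders(Map<String, List<String>> beanMap) {
        Map<String, List<String>> pending = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : beanMap.entrySet()) {
            if (entry.getValue().isEmpty()) {
                List<String> placeholder = new ArrayList<>();
                placeholder.add("NA");
                // goes into the separate map, so beanMap is not touched here
                pending.put(entry.getKey() + "-NA", placeholder);
            }
        }
        beanMap.putAll(pending); // safe: no iterator over beanMap is open now
    }

    public static void main(String[] args) {
        Map<String, List<String>> beanMap = new HashMap<>();
        beanMap.put("a", new ArrayList<String>());
        addPlaceholders(beanMap);
        System.out.println(beanMap.containsKey("a-NA")); // prints true
    }
}
```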
You are not allowed to change the underlying collection while iterating over it using this syntax. The collections are implemented in a fail-fast way, so even a single change will raise the exception.
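If you need to change the collection while visiting the elements, use an explicit Iterator: its remove() method (and Map.Entry.setValue for replacing a value) is safe during iteration, whereas a put with a new key is not.
modifiedMap is a reference to the same Map, beanMap, that you are iterating over. You are modifying the collection through modifiedMap during iteration, hence the exception.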

Getting ConcurrentModificationException while modifying a HashMap in a thread class

Hi, I am running a thread service whose job is to check the age of the list items in a HashMap. When an item is older than, say, 5 seconds, I have to delete it from the HashMap. Below is the simplified code. But when the code attempts to delete the item from the HashMap, I get a java.util.ConcurrentModificationException.
I am populating the HashMap in the main() method in the original program.
Can somebody please help me out with this? PS: deleteFromTrackList() is called by different clients across a network through RMI.
import java.util.*;
public class NotifierThread extends Thread {
private HashMap<Integer, ArrayList> NotificationTrackList = new HashMap<Integer, ArrayList>();
@Override
public void run() {
while (true) { // this process should run continuously
checkNotifierList(getNotificationTrackList());
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
public HashMap<Integer, ArrayList> getNotificationTrackList() {
return NotificationTrackList;
}
public void deleteFromTrackList(Integer messageID) {
NotificationTrackList.remove(messageID);
}
public synchronized void checkNotifierList(HashMap list) {
Set entries = list.entrySet();
for (Iterator iterator = entries.iterator(); iterator.hasNext();) {
Map.Entry<Integer, ArrayList> entry = (Map.Entry) iterator.next();
ArrayList messageInfo = entry.getValue();
Integer messageID = entry.getKey();
messageInfo = new ArrayList((ArrayList) list.get(messageID));
Long curTime = new Date().getTime();
Long refTime = (Long) messageInfo.get(1);
Long timeDiff = curTime - refTime;
if (timeDiff > 5000) {
// delete the entry if it is older than 5 seconds (5000 ms) and update
// internal entry list
deleteFromTrackList(messageID);
}
}
}
public static void main(String[] args) {
new NotifierThread().start();
}
}
This is the stacktrace I am getting at the console
Exception in thread "tracker" java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextEntry(Unknown Source)
at java.util.HashMap$EntryIterator.next(Unknown Source)
at java.util.HashMap$EntryIterator.next(Unknown Source)
at NotifierThread.checkNotifierList(NotifierThread.java:32)
at NotifierThread.run(NotifierThread.java:10)
The only way to remove an entry from a map while iterating over it is to remove it using the iterator. Use
iterator.remove();
instead of
deleteFromTrackList(messageID);
Note that the same applies to all the collections (List, Set, etc.)
Also, note that your design is not thread-safe, because you let other threads access the map in an unsynchronized way.
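For illustration, the expiry check rewritten around iterator.remove() might look like the sketch below. It is simplified relative to the question: the map holds a timestamp per message ID instead of an ArrayList, and the names ExpiryDemo and removeExpired are made up. It also does nothing about the thread-safety problem mentioned above.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Illustrative sketch; names and the simplified value type are hypothetical.
class ExpiryDemo {
    // Removes every entry whose timestamp is more than maxAgeMillis before now.
    static void removeExpired(Map<Integer, Long> timestamps, long now, long maxAgeMillis) {
        Iterator<Map.Entry<Integer, Long>> iterator = timestamps.entrySet().iterator();
        while (iterator.hasNext()) {
            Map.Entry<Integer, Long> entry = iterator.next();
            if (now - entry.getValue() > maxAgeMillis) {
                iterator.remove(); // removes the entry last returned by next()
            }
        }
    }

    public static void main(String[] args) {
        Map<Integer, Long> timestamps = new HashMap<>();
        timestamps.put(1, 0L);
        timestamps.put(2, 9000L);
        removeExpired(timestamps, 10000L, 5000L);
        System.out.println(timestamps.keySet()); // prints [2]
    }
}
```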
Your code isn't completely synchronized. Try to change
public void deleteFromTrackList(Integer messageID) {
to
public synchronized void deleteFromTrackList(Integer messageID) {
Correct. You cannot modify a Map while iterating over it without using the iterator directly. There are a couple of main options.
Create a List of elements that should be removed. Add each expired element to the List in the loop. After the loop, remove the elements in the List from the Map.
Use Guava's filter capability.
Maps.filterEntries. This creates a new Map, however, which may not work for what you are trying to do.
Since you have a multi-threaded system, you may want to consider immutability as your friend. Rather than blocking threads for the entire stale-check loop, you could use an ImmutableMap, which would be more thread-safe with better performance.
Thanks for your answers guys... I have found the solution for my question, instead of using HashMap, I am using ConcurrentHashMap. This solved my problem. Thanks again !
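As a rough sketch of why the switch helps: ConcurrentHashMap iterators are weakly consistent and do not throw ConcurrentModificationException, so removing through the map while looping over it is tolerated. The class and method names below are invented. Note that this only addresses the exception; compound check-then-act sequences still need their own care.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch; names are hypothetical.
class ConcurrentRemoveDemo {
    // Removes odd keys through the map itself while iterating its key set.
    static int removeOddKeysWhileIterating() {
        Map<Integer, String> map = new ConcurrentHashMap<>();
        for (int i = 0; i < 10; i++) {
            map.put(i, "v" + i);
        }
        for (Integer key : map.keySet()) {
            if (key % 2 != 0) {
                map.remove(key); // no ConcurrentModificationException here
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(removeOddKeysWhileIterating()); // prints 5
    }
}
```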
In reality you do not even need concurrent access to a hash map to get a ConcurrentModificationException; a single thread is quite enough.
For example, you may create a loop based on map.keySet().iterator() and, while you are within this loop, your (single) thread removes an element from the hash map (not a good idea while the iterator is open).
On the next call to the iterator's next() you will get the exception.
So be careful with that.
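A single-threaded reproduction can be as small as this sketch (names are made up). The map is modified directly inside a for-each loop over its key set, and the exception surfaces when the hidden iterator is advanced.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; names are hypothetical.
class SingleThreadCmeDemo {
    // Returns true if the loop was aborted by a ConcurrentModificationException.
    static boolean throwsCme() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                map.remove(key); // structural change behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsCme()); // prints true
    }
}
```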

Concurrent Reads from Unmodifiable Map

If I statically initialize a map and set the reference to a Collections.unmodifiableMap(Map m), do I need to synchronize reads?
private static final Map<String,String> staticMap;
static{
Map<String,String> tempMap = new HashMap<String,String>();
tempMap.put("key 1","value 1");
tempMap.put("key 2","value 2");
tempMap.put("key 3","value 3");
staticMap = Collections.unmodifiableMap(tempMap);
}
No, the map you're creating there is effectively immutable (since nothing has a reference to the mutable backing map) and safe for concurrent access. If you want a clearer guarantee of that along with making it easier to create the map, Guava's ImmutableMap type is designed for just this sort of use (among other things):
private static final ImmutableMap<String, String> staticMap = ImmutableMap.of(
"key1", "value1",
"key2", "value2",
"key3", "value3");
Nope, reads don't modify the map, so I wouldn't worry about it at all. It's only write+write or write+read that requires synchronization around it.
It depends on the Map implementation. HashMap, TreeMap and the like all have reads that are modification free and so are fine, but implementations that track usage may perform updates internally.
An example is a LinkedHashMap with access ordering:
new LinkedHashMap(int initialCapacity, float loadFactor, boolean accessOrder)
With accessOrder set to true, this will actually reorder the elements on each read, so that iteration over the keys, values or entries runs from least recently accessed to most recently accessed. Another map that may modify itself on reads is WeakHashMap.
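A short sketch of a read that changes iteration order in an access-ordered LinkedHashMap (class and method names are made up):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch; names are hypothetical.
class AccessOrderDemo {
    static List<String> keysAfterGet() {
        // third constructor argument true = access order instead of insertion order
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, true);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a"); // a plain read, yet it moves "a" to the end of the order
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterGet()); // prints [b, c, a]
    }
}
```

Because get() mutates internal links here, unsynchronized concurrent reads of such a map are not safe even if nobody ever calls put.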
An excellent alternative would be the ImmutableMap found in Google's Guava library.
I disagree with the above answers. The map implementations contain non-volatile fields (like HashMap.entrySet; in the unmodifiable case: UnmodifiableMap.keySet, UnmodifiableMap.entrySet and UnmodifiableMap.values). These fields are lazily initialized, so they are null after the static initializer. If one thread then calls entrySet(), this initializes the entrySet field. Access to this field is unsafe from all other threads. The field may be seen from another thread in an inconsistent state or not at all.
The short answer is: no. You don't need to lock if there is no read-write contention. You only lock if whatever you're sharing might change, if it doesn't change then it's basically immutable and immutables are considered thread safe.
I think others have covered the answer already (yes in the case of the HashMap implementation). If you don't necessarily always need the map to be created, you can make it lazy using the Initialize-On-Demand Holder idiom:
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
public class YourClass {
// created only if needed, thread-safety still preserved
private static final class MapHolder {
private static final Map<String,String> staticMap;
static{
System.out.println("Constructing staticMap");
Map<String,String> tempMap = new HashMap<String,String>();
tempMap.put("key 1","value 1");
tempMap.put("key 2","value 2");
tempMap.put("key 3","value 3");
staticMap = Collections.unmodifiableMap(tempMap);
}
}
// use this to actually access the instance
public static Map<String,String> mapGetter() {
return MapHolder.staticMap;
}
public static void main(String[] arg) {
System.out.println("Started, note that staticMap not yet created until...");
Map<String,String> m = mapGetter();
System.out.println("we get it: " + m);
}
}
which will print:
Started, note that staticMap not yet created until...
Constructing staticMap
we get it: {key 1=value 1, key 2=value 2, key 3=value 3}
