Set a HashMap to null or to a new HashMap? - java

In my project I have a few HashMaps that I reuse frequently. I have been told that HashMaps can cause memory leaks, that Map#clear is not very effective, and that I should set my HashMap to null instead. However, I think putting a null check before every use of the HashMap looks ugly. Would setting the HashMap to Maps#newHashMap accomplish the same goal, or should I set it to null and perform a null check before every use?

If you want to throw away all the entries in the map at a certain point in time, calling clear will remove them. This is O(n) because it has to remove every entry from the map.
The only reason (as far as I can imagine) to set it to null is to drop the references to the contents of the map. This lets the garbage collector reclaim the objects in the map if there are no other active references to them.
But if the map is still being used, reassigning it to a new, empty map is the best thing to do, as it achieves the same purpose.
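As a minimal sketch of the two options discussed above (the `ScoreCache` class and its method names are hypothetical, used only for illustration), either approach leaves the field usable without null checks:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder class, used only for illustration.
class ScoreCache {
    private Map<String, Integer> scores = new HashMap<>();

    void record(String name, int score) {
        scores.put(name, score);
    }

    int size() {
        return scores.size();
    }

    // Option 1: empty the existing map in place.
    void resetByClear() {
        scores.clear();
    }

    // Option 2: drop the old map and start with a fresh, empty one.
    // The field is never null, so callers need no null checks.
    void resetByReplace() {
        scores = new HashMap<>();
    }
}
```

In both cases the old entries become collectable once nothing else references them; the difference is only whether the old map object itself is kept.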

Related

HashMap have old value around till they stop being referenced

I have a situation where I load some data at application level inside a HashMap in my Android application. I use one of the entries (with a particular keyA in the HashMap) in this map to initialise some data inside my Activity, and this Activity hangs around for a while. While the user is on this Activity, the HashMap from which I referenced the object for keyA might change. The code to update the HashMap is written by me. When I want to update the HashMap, I want to clear the entire HashMap (so that its size() returns 0) and populate everything again. If I call HashMap.clear(), would it cause the old objects to be garbage collected?
If yes, what is the best way to clear the entire HashMap so that I don't lose the old objects if they are being referred to anywhere else in the code? What would be the best way to reassign values to the HashMap in this case?
PS: I am using ReentrantReadWriteLock for maintaining the access if that might help.
If I call HashMap.clear(), would it make the old objects to be garbage collected?
No. No Java object will ever be garbage collected if it is reachable. If there is any reference (or chain of references) that allows your code to access an object, then the object is safe from garbage collection.
If I call HashMap.clear(), would it make the old objects to be garbage collected?
A HashMap is just storing a single reference to an object. When you call clear(), then the object can be gc'd if (and only if) the only reference to the object was the HashMap. If other parts of the code have references to the object then it won't be. That's how gc works. HashMap isn't magic.
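A small sketch of that point, with a hypothetical class name and a StringBuilder standing in for the real value type: an object fetched from the map before clear() remains fully usable afterwards, because the caller's own reference keeps it reachable.

```java
import java.util.HashMap;
import java.util.Map;

class ClearKeepsReferencedObjects {
    // Returns the value that was taken out of the map before clear() was called.
    static StringBuilder demo() {
        Map<String, StringBuilder> map = new HashMap<>();
        map.put("keyA", new StringBuilder("old value"));

        // Some other part of the code (e.g. an Activity) keeps its own reference.
        StringBuilder held = map.get("keyA");

        map.clear();                                      // the map forgets the object...
        map.put("keyA", new StringBuilder("new value"));  // ...and is repopulated

        return held;                                      // 'held' still points at the old object
    }
}
```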
What would be the best to reassign values to the HashMap in this case?
If you are updating the values in a shared map then your code is going to have to re-get the values from the map whenever they use them – or maybe every so often. I'd first re-get the value each time you use it and then prove to yourself that it's too expensive before doing anything else.
PS: I am using ReentrantReadWriteLock for maintaining the access if that might help.
I'd switch to using a ConcurrentHashMap and not bother with the expense and code maintenance of doing the locking yourself.
If you need to update specifically keyA alone then you can use HashMap.put("keyA","value");
This will replace the value of keyA in the HashMap object with the value you specify, also it will not affect the other values saved in the HashMap object.

Does calling HashMap.entrySet() instantiate and populate a new set every time?

I have need of iterating over some HashMaps on each frame of an OpenGL loop. I do it like this:
for (Map.Entry<MyKey, MyValue> entry : myMap.entrySet()) {...}
My concern is about whether this call to entrySet() actually instantiates and populates a brand new Map.Entry object every time it's called, because if it is, the GC will be more busy than I'd like when animating in OpenGL. My gut says no, because the HashMap documentation says that you can directly modify the HashMap using the returned entry set, but I don't know how to tell for sure.
And I'd also like to know about other Map implementations as well, like Hashtable, TreeMap, and LinkedHashMap.
After reviewing the source, the answer is no. It lazily instantiates the entry set on the first call to entrySet() and then returns a reference to the same object on each subsequent call.
The same is true for LinkedHashMap, Hashtable, and TreeMap.
No, it does not create a new Set, but rather a lightweight wrapper, much like Arrays.asList does. The API says: "The set is backed by the map, so changes to the map are reflected in the set, and vice-versa."
The implementation of HashMap.entrySet is simply a cheap view, and takes O(1) time to create every time you call it. The entry objects you get by iterating over it are, in fact, the objects used by the map to implement its internal data structures.
(That, and there's really no other way to do the things you would want to do with it.)
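The "backed by the map" behaviour quoted from the API can be seen in a short sketch (the class and method names here are made up for illustration): writing through an entry obtained from entrySet() changes the map itself, which would not be possible if the set were a populated copy.

```java
import java.util.HashMap;
import java.util.Map;

class EntrySetViewDemo {
    // Doubles every value by writing through the entry set view.
    static Map<String, Integer> doubleAll(Map<String, Integer> map) {
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            entry.setValue(entry.getValue() * 2);   // writes through to the map
        }
        return map;
    }
}
```

Note that the for-each loop still obtains an iterator object from the set on each pass, so iteration is not entirely allocation-free even though the set is a view.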

Map clear vs null

I have a map that I use to store dynamic data that are discarded as soon as they are created (i.e. used; they are consumed quickly). It responds to user interaction in the sense that when user clicks a button the map is filled and then the data is used to do some work and then the map is no longer needed.
So my question is: what's a better approach for emptying the map? Should I set it to null each time, or should I call clear()? I know clear is linear in time, but I don't know how to compare that cost with that of creating the map each time. The size of the map is not constant, though it may run from n to 3n elements between creations.
If a map is not referenced from other objects where it may be hard to set a new one, simply null-ing out an old map and starting from scratch is probably lighter-weight than calling a clear(), because no linear-time cleanup needs to happen. With the garbage collection costs being tiny on modern systems, there is a good chance that you would save some CPU cycles this way. You can avoid resizing the map multiple times by specifying the initial capacity.
One situation where clear() is preferred would be when the map object is shared among multiple objects in your system. For example, if you create a map, give it to several objects, and then keep some shared information in it, setting the map to a new one in all these objects may require keeping references to objects that have the map. In situations like that it's easier to keep calling clear() on the same shared map object.
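To illustrate the shared-map case, here is a sketch with a hypothetical `Consumer` class that was handed the map at construction time. Pointing a variable at a new map leaves the consumer with the old one, whereas clear() on the shared instance is seen by every holder:

```java
import java.util.HashMap;
import java.util.Map;

class SharedMapDemo {
    // Hypothetical consumer that was handed the map at construction time.
    static class Consumer {
        private final Map<String, String> shared;

        Consumer(Map<String, String> shared) {
            this.shared = shared;
        }

        int visibleEntries() {
            return shared.size();
        }
    }

    // Returns { entries the consumer sees after reassignment,
    //           entries the consumer sees after clear() }.
    static int[] demo() {
        Map<String, String> map = new HashMap<>();
        Consumer consumer = new Consumer(map);
        map.put("k", "v");

        // Reassigning the local variable does not affect the consumer:
        // it still holds the old map with its one entry.
        Map<String, String> old = map;
        map = new HashMap<>();
        int afterReassign = consumer.visibleEntries();

        // Clearing the shared instance is seen by every holder.
        old.clear();
        int afterClear = consumer.visibleEntries();

        return new int[] { afterReassign, afterClear };
    }
}
```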
Well, it depends on how much memory you can throw at it. If you have a lot, then it doesn't matter. However, setting the map itself to null means that you have freed up the garbage collector - if only the map has references to the instances inside of it, the garbage collector can collect not only the map but also any instances inside of it. Clear does empty the map but it has to iterate over everything in the map to set each reference to null, and this takes place during your execution time that you can control - the garbage collector essentially has to do this work anyways, so let it do its thing. Just note that setting it to null doesn't let you reuse it. A typical pattern to reuse a map variable may be:
Map<String, String> whatever = new HashMap<String, String>();
// .. do something with map
whatever = new HashMap<String, String>();
This allows you to reuse the variable without setting it to null at all; you silently discard the reference to the old map. This is atrocious practice in non-memory-managed languages, since they need the old pointer in order to free the memory (discarding it is a memory leak in those languages), but in Java, since nothing references the old map any more, the GC treats it as eligible for collection.
I feel nulling out the existing map is cheaper than clear(), as object creation is very cheap in modern JVMs.
Short answer: use Collection.clear() unless it is too complicated to keep the collection around.
Detailed answer: In Java, the allocation of memory is almost instantaneous. It is little more than a pointer that gets moved inside the VM. However, the initialization of those objects might add up to something significant. Also, all objects that use an internal buffer are sensitive to resizing and copying of their content. Using clear() makes sure that buffers eventually stabilize at some size, so that reallocating memory and copying the old buffer to a new one is no longer necessary.
Another important issue is that allocating and then releasing a lot of objects will require more frequent runs of the garbage collector, which might cause sudden lag.
If you always hold on to the map, it will be promoted to the old generation. If each user has one corresponding map, the number of maps in the old generation is proportional to the number of users. This may trigger full GCs more frequently as the number of users increases.
You can use both with similar results.
One prior answer notes that clear is expected to take constant time in a mature map implementation. Without checking the source code of the likes of HashMap, TreeMap, ConcurrentHashMap, I would expect their clear method to take constant time, plus amortized garbage collection costs.
Another poster notes that a shared map cannot be nulled. Well, it can if you want it, but you do it by using a proxy object which encapsulates a proper map and nulls it out when needed. Of course, you'd have to implement the proxy map class yourself.
Map<Foo, Bar> myMap = new ProxyMap<Foo, Bar>();
// Internally, the above object holds a reference to a proper map,
// for example, a hash map. Furthermore, this delegates all calls
// to the underlying map. A true proxy.
myMap.clear();
// The clear method simply reinitializes the underlying map.
Unless you did something like the above, clear and nulling out are equivalent in the ways that matter, but I think it's more mature to assume your map, even if not currently shared, may become shared at a later time due to forces you can't foresee.
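One possible way to write such a proxy is sketched below. `ProxyMap` is not a library class; this version is an assumption of how it might look, built on java.util.AbstractMap and taking a Supplier so that the code calling clear() does not need to know how the underlying map is created.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical ProxyMap: not a library class, just one way to write the idea.
class ProxyMap<K, V> extends AbstractMap<K, V> {
    private final Supplier<Map<K, V>> factory;
    private Map<K, V> delegate;

    // Defaults to a plain HashMap as the underlying map.
    ProxyMap() {
        this(HashMap::new);
    }

    ProxyMap(Supplier<Map<K, V>> factory) {
        this.factory = factory;
        this.delegate = factory.get();
    }

    // Delegate the core operations to the underlying map.
    @Override public V put(K key, V value) { return delegate.put(key, value); }
    @Override public V get(Object key) { return delegate.get(key); }
    @Override public boolean containsKey(Object key) { return delegate.containsKey(key); }
    @Override public V remove(Object key) { return delegate.remove(key); }
    @Override public int size() { return delegate.size(); }
    @Override public Set<Map.Entry<K, V>> entrySet() { return delegate.entrySet(); }

    // "Clearing" swaps in a fresh underlying map instead of emptying the old one.
    @Override public void clear() { delegate = factory.get(); }
}
```

This sketch is not thread-safe, and any entry set or iterator obtained before clear() keeps pointing at the old underlying map.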
There is another reason to clear instead of nulling out, even if the map is not shared. Your map may be instantiated by an external client, like a factory, so if you clear your map by nulling it out, you might end up coupling yourself to the factory unnecessarily. Why should the object that clears the map have to know that you instantiate your maps using Guava's Maps.newHashMap() with God knows what parameters? Even if this is not a realistic concern in your project, it still pays off to align yourself to mature practices.
For the above reasons, and all else being equal, I would vote for clear.
HTH.

Replace a big hashmap in AS

I have a hashmap which stores around 1 GB of data in terms of key-value pairs. This hashmap changes every 15 days. It will be loaded into memory and used from there.
When a new hashmap has to be loaded into memory, there will be several transactions already accessing the hashmap in memory. How can I replace the old hashmap with the new one without affecting the current transactions accessing the old hashmap? Is there a way to hot swap the hashmap in memory?
Use an AtomicReference<Map<Foo, Bar>> rather than exposing a direct (hard) reference to the map. Consumers of the map will use #get(), and when you're ready to swap out the map, your "internal" code will use #set() or #getAndSet().
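A minimal sketch of that suggestion follows; the `SwappableLookup` class and the String key and value types are placeholders for the real ones. Readers call get() once per transaction and keep working with that snapshot, so a swap does not disturb transactions already in flight.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder; String keys and values stand in for the real types.
class SwappableLookup {
    private final AtomicReference<Map<String, String>> current =
            new AtomicReference<>(Collections.<String, String>emptyMap());

    // Readers fetch the map once per transaction and keep using that snapshot.
    Map<String, String> snapshot() {
        return current.get();
    }

    // The loader builds the replacement off to the side, then publishes it
    // in one step. Returns the previously published map.
    Map<String, String> swap(Map<String, String> freshlyLoaded) {
        return current.getAndSet(Collections.unmodifiableMap(freshlyLoaded));
    }
}
```

The old map becomes eligible for garbage collection once the last in-flight transaction drops its snapshot.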
Provide a getter to the map
Mark the map private and volatile
When updating the map, create a new one, populate it and when it is ready, assign it to your private map variable.
Reference assignments are atomic in Java and volatile ensures visibility.
Caveats:
you will have two maps in memory at some stage
if some code keeps a reference to the old map it will access stale data. If that is an issue you can completely hide the map and provide a get(K key) instead so that users always access the latest map.
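The steps and the second caveat above could be combined roughly as follows. `HotSwapStore` is a hypothetical name and String stands in for the real key and value types; the map is hidden behind get(key), so callers cannot hold on to a stale map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper that hides the map behind get(key).
class HotSwapStore {
    // volatile: a reader that runs after reload() sees the new map
    private volatile Map<String, String> data = new HashMap<>();

    String get(String key) {
        return data.get(key);   // always reads through the latest reference
    }

    void reload(Map<String, String> source) {
        Map<String, String> next = new HashMap<>(source); // populate off to the side
        data = next;                                      // publish only when fully built
    }
}
```

The map is never modified after it is assigned to the field, which is what makes unsynchronized reads safe in this sketch.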
I would suggest using a caching tool like memcached if the data size is as large as yours. This way you can invalidate individual items or the entire cache as your requirements dictate.

When should I use weakValues() of the MapMaker class?

When an entry in a map has a weak key reference, the entry will be removed at the next garbage collection, right?
I can understand why the MapMaker class provides the weakKeys method, but I am confused by weakValues(). When should I use weakValues or softValues in MapMaker?
You'd use weakValues() when you want entries whose values are weakly reachable to be garbage collected. For an example of when this might be useful... say you have a class that allows users to add objects to it and stores them as values in a Map for whatever reason. This class is typically used as a singleton, so it'll stick around the whole time your application is running. However, the objects the user adds to it aren't necessarily so long-lived. The application will be done with them long before it finishes. You don't want the user to have to manually remove these objects from your class when it is finished with them, but you don't want a memory leak by keeping references to them in your class forever (in other words garbage collection should just work like normal, ignoring your class). The solution is to give the map weakValues() and everything will work as you want.
softValues() is good for caching... if you have a Map<Integer, Foo> and you want entries to be removable in response to memory demand, you'd want to use it. You wouldn't want to use weakKeys() or softKeys() because they both use == identity, which would cause problems for you (you wouldn't be able to get a value with key 300 out, because the key you pass in probably wouldn't == the key in the map).
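The == identity problem can be illustrated without MapMaker itself. The sketch below uses java.util.IdentityHashMap purely as a stand-in for a map that compares keys by identity, and uses String keys so that the two key objects are guaranteed to be distinct; it is not MapMaker code.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

class IdentityKeyDemo {
    static boolean[] demo() {
        // Two distinct objects that are equals() but not ==.
        String stored = new String("300");
        String lookup = new String("300");

        Map<String, String> byEquals = new HashMap<>();
        Map<String, String> byIdentity = new IdentityHashMap<>();
        byEquals.put(stored, "value");
        byIdentity.put(stored, "value");

        return new boolean[] {
            byEquals.containsKey(lookup),    // found: keys compared with equals()
            byIdentity.containsKey(lookup),  // not found: keys compared with ==
            byIdentity.containsKey(stored)   // found: same object
        };
    }
}
```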
