Is it a good idea to use WeakHashMap in ThreadLocal

For my use case, I have to pass quite a lot of context information between different layers/components of the application. Since some of the components are decoupled from each other, I am thinking of using ThreadLocal to store this context information. I have an interceptor/filter in place to clean it up before the response is written back to the user. Now, my question is: is it a good idea to use a WeakHashMap inside the ThreadLocal (see the code snippet below)?
private static final ThreadLocal<Map<String, Object>> context = ThreadLocal.withInitial(WeakHashMap::new);
My doubt (given my limited knowledge of weak references in Java) is that weak references can return null, because the GC clears them whenever it sees fit.
Please help me understand this. Should I use a strongly referencing map like HashMap or ConcurrentHashMap, or is my implementation good to go?

The Javadoc for WeakHashMap states:
This class is intended primarily for use with key objects whose equals methods test for object identity using the == operator. Once such a key is discarded it can never be recreated, so it is impossible to do a lookup of that key in a WeakHashMap at some later time and be surprised that its entry has been removed. This class will work perfectly well with key objects whose equals methods are not based upon object identity, such as String instances. With such recreatable key objects, however, the automatic removal of WeakHashMap entries whose keys have been discarded may prove to be confusing.
So if you can't tolerate entries randomly disappearing, then you really shouldn't be using WeakHashMap with String keys.
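As a rough sketch of the alternative the question asks about, the context could be held in a plain HashMap, so entries only go away when the filter removes them. The class name `RequestContext` and its methods are hypothetical, chosen only for illustration, and `ThreadLocal.withInitial` assumes Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-request context holder; names are illustrative only.
final class RequestContext {
    // Each thread lazily gets its own strongly referenced HashMap.
    private static final ThreadLocal<Map<String, Object>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    private RequestContext() {}

    static void put(String key, Object value) {
        CONTEXT.get().put(key, value);
    }

    static Object get(String key) {
        return CONTEXT.get().get(key);
    }

    // Intended to be called by the interceptor/filter, ideally in a finally block,
    // so a pooled thread never carries one request's data into the next.
    static void clear() {
        CONTEXT.remove();
    }
}
```

Because each thread sees only its own map, a ConcurrentHashMap should not be needed here unless the map itself is handed to other threads.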

Related

Does a HashMap keep old values around until they stop being referenced?

I have a situation where I load some data at application level into a HashMap in my Android application. I use one of the entries in this map (with a particular key, keyA) to initialise some data inside my Activity, and this Activity hangs around for a while. While the user is on this Activity, the HashMap from which I obtained the object for keyA might change. The code that updates the HashMap is written by me. When I update the HashMap, I want to clear it entirely (so that its size() returns 0) and populate everything again. If I call HashMap.clear(), would it make the old objects eligible for garbage collection?
If yes, what is the best way to clear the entire HashMap so that I don't lose the old objects if they are still referenced elsewhere in the code? What would be the best way to reassign values to the HashMap in this case?
PS: I am using ReentrantReadWriteLock for maintaining the access if that might help.

If I call HashMap.clear(), would it make the old objects to be garbage collected?
No. No Java object will ever be garbage collected if it is reachable. If there is any reference (or chain of references) that allows your code to access an object, then the object is safe from garbage collection.

If I call HashMap.clear(), would it make the old objects to be garbage collected?
A HashMap just stores a reference to each object. When you call clear(), an object can be gc'd if (and only if) the HashMap held the only reference to it. If other parts of the code hold references to the object, then it won't be. That's how gc works. HashMap isn't magic.
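A small sketch of that point: the caller keeps its own reference to the value for keyA, so clearing the map does not affect it. The class and variable names here are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

final class ClearDemo {
    // Returns the object that was looked up before the map was cleared.
    static StringBuilder keepAcrossClear() {
        Map<String, StringBuilder> appData = new HashMap<>();
        appData.put("keyA", new StringBuilder("old data"));

        // Second strong reference, e.g. a field of the Activity.
        StringBuilder heldByActivity = appData.get("keyA");

        // The map drops its own reference; the object itself is untouched.
        appData.clear();
        if (!appData.isEmpty()) {
            throw new IllegalStateException("map should be empty after clear()");
        }

        // Still reachable through heldByActivity, so it cannot be collected yet.
        return heldByActivity;
    }
}
```

The object only becomes eligible for collection once `heldByActivity` (and any other reference to it) is gone as well.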
What would be the best to reassign values to the HashMap in this case?
If you are updating the values in a shared map, then your code is going to have to re-get the values from the map whenever it uses them, or maybe every so often. I'd first re-get the value each time you use it and then prove to yourself that it's too expensive before doing anything else.
PS: I am using ReentrantReadWriteLock for maintaining the access if that might help.
I'd switch to using a ConcurrentHashMap and not bother with the expense and code maintenance of doing the locking yourself.

If you only need to update keyA, you can use HashMap.put("keyA", "value");
This replaces the value of keyA in the HashMap with the value you specify and does not affect the other entries stored in the map.
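For illustration, a minimal sketch of replacing a single entry; the keys and values are placeholders. `put` returns the previous value, which the caller can keep if it still needs the old object.

```java
import java.util.HashMap;
import java.util.Map;

final class PutDemo {
    // Builds a two-entry map and then replaces only the value stored under "keyA".
    static Map<String, String> replaceOne() {
        Map<String, String> map = new HashMap<>();
        map.put("keyA", "old");
        map.put("keyB", "other");

        // put() on an existing key overwrites the value and returns the old one.
        String previous = map.put("keyA", "new");
        if (!"old".equals(previous)) {
            throw new IllegalStateException("expected the previous value back");
        }
        return map;
    }
}
```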

javax.cache store by reference vs. store by value

I am new to Java caching and am trying to understand the difference between store by value and store by reference.
The javax.cache documentation contains the following paragraph:
"The purpose of copying entries as they are stored in a Cache and again when they are returned from a Cache is to allow applications to continue mutating the state of the keys and values without causing side-effects to entries held by a Cache."
What are the "side-effects" mentioned above? And how do we choose which way to store in practice?

The question is great, since the answer isn't an easy one. The real semantics vary slightly across cache implementations.
store by reference:
The cache stores and returns the identical object references.
Object key = ...
Object value = ...
cache.put(key, value);
assert cache.get(key) == value;
assert cache.iterator().next().getKey() == key;
If you mutate the key after storing the value, you have an ambiguous situation. The effect is the same when using a HashMap or ConcurrentHashMap.
Use store by reference, to:
Maximize performance / minimize processing overhead
When the data fits into the Java heap
If you want to mutate a value after storing it. This can be useful for performance, but isn't a recommended practice, since you have to take care of concurrency issues and the usage relies on the store by reference semantics.
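The remark about mutating a key after storing it can be reproduced with a plain HashMap. This is only a sketch; the `Key` class is a hypothetical mutable key invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

final class MutableKeyDemo {
    // Hypothetical key whose equals/hashCode depend on mutable state.
    static final class Key {
        int id;

        Key(int id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }

        @Override
        public int hashCode() { return Integer.hashCode(id); }
    }

    // Returns what the map reports for the key after the key has been mutated.
    static String lookupAfterMutation() {
        Map<Key, String> map = new HashMap<>();
        Key key = new Key(1);
        map.put(key, "value");

        key.id = 2; // mutate the key the map still references

        // The entry is still in the map, but it was filed under the old hash,
        // so a lookup with the mutated key does not find it.
        return map.get(key);
    }
}
```

Here the lookup comes back empty even though `map.size()` is still 1, which is the kind of ambiguity store by reference leaves to the application.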
store by value:
Although it seems obvious, it is not so clear what store by value really means. According to the JCache spec leads, Brian Oliver said it is protection against cache data corruption, and Greg Luck said it is everything that is not store by reference.
For that reason I analyzed several compliant JCache implementations (compliant meaning they pass the TCK). Key and value objects are copied when passed to the cache, but you cannot rely on an object in the cache being copied when it is returned to the application.
So this assumption isn't true for all JCache implementations:
assert cache.get(key) != cache.get(key);
JCache implementations may even vary more, when it gets into detail. An example:
Map map = cache.getAll(...);
assert map.get(key) != map.get(key);
Here is a contradiction in the expected semantics. We would expect that the map contents are stable, OTOH the cache would need to return a copy of the value on every access. The JCache spec doesn't enforce concrete semantics for this. The devil is in the details.
Since the key is copied upon storage by every cache implementation you will get additional safety that the cache internal data structures are sane, but applications still have the chance to break because of shared value references.
My personal conclusion (I am open for discussion):
Since store by reference is an optional JCache feature, requesting it would limit the number of cache implementations your application works with. Always use store by value if you don't rely on store by reference semantics.
However, don't make your application depend on the semantics you think you might get with store by value. Never mutate any object after handing its reference to the cache or after retrieving its reference from the cache.
If there is still doubt, ask your cache vendor. IMHO it's good practice to document implementation details. A good example (since I put a lot of thought into it...) is the JCache chapter in the cache2k user guide.

It is to prevent concurrent modification of mutable objects. The side effect is on other threads that are using that object for something.
An example would be a bank program in which multiple threads share a cache of mutable objects representing bank accounts. Suppose thread 1 retrieves an account from the cache and starts to perform an operation on it. While thread 1 is manipulating the object, thread 2 retrieves the same object and starts to manipulate it as well. Since they are simultaneously manipulating the same object in an uncoordinated way, the result is unpredictable. The object itself can even become corrupted.
Storing by value eliminates this common problem in concurrent programming by storing a copy of the object when it is saved to the cache and handing out a copy of the object when it is retrieved from the cache.
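A minimal sketch of that copy-in/copy-out idea, assuming values are lists of strings so that a shallow copy is enough. `CopyingCache` is a hypothetical class, not the JCache API, and the map inside it is not itself thread-safe; it only shows where the copies happen.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical store-by-value cache; illustrative only, not javax.cache.
final class CopyingCache {
    private final Map<String, List<String>> store = new HashMap<>();

    void put(String key, List<String> value) {
        // Copy on the way in: later changes to the caller's list stay outside.
        store.put(key, new ArrayList<>(value));
    }

    List<String> get(String key) {
        List<String> value = store.get(key);
        // Copy on the way out: callers never share the cached instance.
        return value == null ? null : new ArrayList<>(value);
    }
}
```

With this shape, two threads that each call `get` work on separate copies, so neither can corrupt the entry held by the cache.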

How to cache with weak references when values refer back to keys?

I'm using Guava's Cache<Key, Value>. Whenever Key is no longer strongly reachable, the cache entry should be garbage collected (someday...). Using CacheBuilder.weakKeys() would do exactly that, if there weren't a reference from Value back to Key.
I could make this reference weak, but this could anytime make my Value quite invalid. I could handle it, but I'd prefer not to.
I could use weakValues(), but this could lead to very early evictions, as my values are only referenced for a short time.
Maybe I could use softValues(), but SoftReferences are quite broken.
I'm probably getting something wrong... what is the right solution?
Update
What I need could be achieved simply by putting a reference to Value into each Key, but this is not possible as Key is not under my control. If it was, then I'd need no cache, no weak references, nothing.
This way, each Key would keep its corresponding Value reachable, which is fine [1]. Also each Value would keep its Key reachable, but this is no problem as there are no long-lived references to Value.
[1] Some expiration would be better, but it's not necessary.

Unfortunately, this is unsolvable without ephemerons.

The pointer from Value -> Key doesn't matter as long as nothing else is holding on to Value.
When the Cache dumps Key, it will be collected.
If you have System->Cache->Key<-Value, when Cache drops key you get System->Cache Key<-Value. The link from Key back up to System (the memory root for this example) is broken, and Key will be recovered.

If you really want weakKeys, then having a weak reference from the value to the key is the right thing to do.
If that doesn't feel right to you, then please provide more info about what you're trying to accomplish.
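A rough sketch of what the weak back-reference could look like. It uses java.util.WeakHashMap as a stand-in for a Guava cache built with weakKeys(), and `Key` and `Value` are placeholder classes; the real types from the question are not shown there.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

final class BackReferenceDemo {
    // Stand-in for the Key type that is not under the asker's control.
    static final class Key {}

    // Hypothetical Value that points back to its key only weakly,
    // so the value alone never keeps the key alive.
    static final class Value {
        private final WeakReference<Key> keyRef;

        Value(Key key) { this.keyRef = new WeakReference<>(key); }

        // Returns null once the key has been garbage collected.
        Key key() { return keyRef.get(); }
    }

    // Keys are held weakly; values are held strongly by the map.
    static final Map<Key, Value> CACHE = new WeakHashMap<>();

    static Value valueFor(Key key) {
        return CACHE.computeIfAbsent(key, Value::new);
    }
}
```

As long as a caller obtained the Value through a Key it still holds, `key()` returns that key; the null case only arises for a Value that outlives its Key.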

Do you think it might be possible to create a copy of key, and use that as the key in the map? I am thinking you might have something like
Value v = SomeLibrary.giveMeSomething();
String k = v.getName();
String k1 = new String(k);
cache.put(k1,v);
This will work b/c k.equals(k1) and k != k1. Hopefully you can create a copy or clone of the type used for Key (which probably isn't String in your case).
However, this changes the lifecycle of the key -- since it is no longer the one in Value. If you have control over the lifecycle of the particular object you've put in the map, then you're OK.
Do you think that might work?

When should I use weakValues() of the MapMaker class?

When an entry in a map has a weak key reference, the entry will be removed at the next garbage collection, right?
I can understand why the MapMaker class provides the weakKeys() method, but I am confused about weakValues(). When should I use weakValues() or softValues() in MapMaker?

You'd use weakValues() when you want entries whose values are weakly reachable to be garbage collected. For an example of when this might be useful... say you have a class that allows users to add objects to it and stores them as values in a Map for whatever reason. This class is typically used as a singleton, so it'll stick around the whole time your application is running. However, the objects the user adds to it aren't necessarily so long-lived. The application will be done with them long before it finishes. You don't want the user to have to manually remove these objects from your class when it is finished with them, but you don't want a memory leak by keeping references to them in your class forever (in other words garbage collection should just work like normal, ignoring your class). The solution is to give the map weakValues() and everything will work as you want.
softValues() is good for caching... if you have a Map<Integer, Foo> and you want entries to be removable in response to memory demand, you'd want to use it. You wouldn't want to use weakKeys() or softKeys() because they both use == identity, which would cause problems for you (you wouldn't be able to get a value with key 300 out because the key you pass in probably wouldn't == the key in the map).
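The identity problem can be shown without Guava. The sketch below uses java.util.IdentityHashMap as a stand-in for a map that compares keys with ==, and String keys instead of Integer because `new String(...)` is guaranteed to create a distinct object.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

final class IdentityLookupDemo {
    // Returns {lookup via equals(), lookup via ==} for an equal but distinct key.
    static String[] lookups() {
        String stored = new String("300");
        String probe = new String("300"); // equals(stored) is true, probe == stored is false

        Map<String, String> byEquals = new HashMap<>();
        Map<String, String> byIdentity = new IdentityHashMap<>();
        byEquals.put(stored, "foo");
        byIdentity.put(stored, "foo");

        // The equals-based map finds the entry; the identity-based map does not.
        return new String[] { byEquals.get(probe), byIdentity.get(probe) };
    }
}
```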

Is this scenario suitable for WeakReferences?

I am working on querying the address book via J2ME and returning a custom Hashtable which I will call pimList. The keys in pimList {firstname, lastname} map to an object (we'll call this object ContactInfo) holding (key, value) pairs, e.g. work1 -> 44232454545, home1 -> 44876887787
Next I take firstName and add it into a tree.
The nodes of the tree contains the characters from the firstName.
e.g. "Tom" would create a tree with nodes:
"T"->"o"->"m"-> ContactInfo{ "work1" -> "44232454545", "home1" -> "44876887787" }
So the child of the last character m points to the same object instance in pimList.
As I understand it, the purpose of a WeakReference is that its pointer is weak, so the object it points to can easily be GC'ed. On a memory-constrained device like a mobile phone, I would like to ensure I don't leak or waste memory. Thus, is it appropriate for me to make:
pimList's values to be a WeakReference
The child of node "m" to point to WeakReference
?

It should work. You will need to handle the case where you are using the returned Hashtable and the items are collected however... which might mean you want to rethink the whole thing.
If the Hashtable is short lived then there likely isn't a need for the weak references.
You can remove items from the Hashtable when you are done with them if you want them to become collectable while the rest of the Hashtable is still being used.
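To make the "handle the case where the items are collected" point concrete, here is a hedged sketch of a lookup that tolerates cleared references. It is written for standard Java with generics, which a J2ME profile may not support, and `PimListDemo` and `ContactInfo` are placeholder names for the asker's types.

```java
import java.lang.ref.WeakReference;
import java.util.Hashtable;

final class PimListDemo {
    // Hypothetical stand-in for the asker's ContactInfo object.
    static final class ContactInfo {
        final Hashtable<String, String> numbers = new Hashtable<>();
    }

    private final Hashtable<String, WeakReference<ContactInfo>> pimList = new Hashtable<>();

    void add(String firstName, ContactInfo info) {
        pimList.put(firstName, new WeakReference<>(info));
    }

    // Returns null if the name is unknown or the contact was already collected.
    ContactInfo find(String firstName) {
        WeakReference<ContactInfo> ref = pimList.get(firstName);
        if (ref == null) {
            return null;
        }
        ContactInfo info = ref.get();
        if (info == null) {
            pimList.remove(firstName); // drop the stale entry
        }
        return info;
    }
}
```

Every caller of `find` has to cope with a null result, which is the extra handling the answer warns about.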

I'm not sure I understood exactly what you are trying to do, but an object's reachability is determined by the strongest reference to it (a hard reference is stronger than a soft reference, which is stronger than a weak reference, which is stronger than a phantom reference).
Hard-referenced objects won't be garbage collected. Soft-referenced objects will be garbage collected only when the JVM is running out of memory; weak-referenced objects will be garbage collected as soon as possible (that is the theory; it depends on the JVM and GC implementation).
So usually you use a SoftReference to build a cache (you want to keep the information as long as possible). You use a WeakReference to associate information with an object that is hard referenced somewhere, so that once the hard-referenced object is no longer referenced, the associated information can be garbage collected. Use WeakHashMap for that.
hope this helps...

I am not sure if the WeakMap is the right thing here. If you do not hold strong references anywhere in your application, the data in the map will disappear nearly immediately, because nobody is referencing it.
A weak map is a nice thing, if you want to find things again, that are still in use elsewhere and you only want to have one instance of it.
But I might not get your data setup right... to be honest.
