Java: How can I populate a map if I use callables?

I want to use a Map as a form of small database "cache" in my application.
I thought that it would be better to use something like:
ConcurrentHashMap<K,Callable<V>>
So that I have a single cache for many kinds of database objects (and not one for each kind, i.e. `ConcurrentHashMap<K,V>` where V would be some specific object).
My problem now (assuming all the above thoughts are reasonable) is: how would I pre-load this cache from the DB on start-up?
I mean, with callables, if I need something that is not yet in the cache, the callable would fetch it the first time and have it ready on the next get.
But how can I pre-load the cache if I use callables?
Note: I am not interested in using a library, since my needs are small.

You might have better luck with ConcurrentHashMap<K, Future<V>>, since Future better matches the concept of "something in the process of being computed, or possibly already computed." You could just initialize some elements of the cache with a Future that's already computed.
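For illustration, a minimal sketch of that idea, assuming a cache keyed by String and a placeholder Widget value type and loadFromDb helper (none of these names are from the question):
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Future;

public class FutureCache {
    private final ConcurrentHashMap<String, Future<Widget>> cache = new ConcurrentHashMap<>();

    // Pre-load at start-up: wrap each value already read from the DB
    // in a Future that is complete from the outset.
    void preload(Map<String, Widget> rowsFromDb) {
        rowsFromDb.forEach((key, value) ->
                cache.put(key, CompletableFuture.completedFuture(value)));
    }

    // Later lookups either find a finished Future or start computing a new one.
    Widget get(String key) throws Exception {
        Future<Widget> f = cache.computeIfAbsent(key,
                k -> CompletableFuture.supplyAsync(() -> loadFromDb(k)));
        return f.get(); // blocks only if the value is still being computed
    }

    private Widget loadFromDb(String key) { return new Widget(); } // placeholder DB call

    static class Widget {} // placeholder value type
}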

Couldn't you just do something simple like this?
for (Callable<V> c : map.values()) {
    try {
        c.call(); // warm the cache; Callable.call() declares a checked Exception
    } catch (Exception e) { /* handle or log the failed pre-load */ }
}

You probably should use interfaces on your objects:
public interface Cacheable {}
public class MyObject implements Cacheable { ... }
ConcurrentHashMap<K, Cacheable> cache = ...;

Related

Is it safe to use HashSet as a very simple cache in a multi-threaded environment?

I realize there are similar questions/answers to this, but I can't seem to find exactly what I'm looking for.
I'm looking to implement an in-memory cache that caches the result of a call to the DB, which is doing an existence check. This existence check is quite expensive and once the object exists in the DB it will never be removed, so a very simple cache in-memory (in-process even) is all I need. Once the DB call is made the process remembers the result of the existence check for that ID and shouldn't call the DB again.
(Maybe another thing to mention is if the object doesn't exist in the DB, it will be created).
I'm using a HashSet (Java) for this and just adding the ID to the set when the check/create is done; however, this is a highly concurrent environment and I am unsure about the implications of HashSet not being thread-safe.
The code only ever uses the add() and contains() methods (no iteration).
I don't really care about a cache miss (and a resulting extra DB call) here and there, but what I'm wondering is if this pattern of add() and contains() being called on the set in concurrent threads could lead to more catastrophic errors.
You can use ConcurrentHashMap for multi-threaded read and write access. If you only want a Set, you can use a concurrent set view backed by a ConcurrentHashMap, like this:
Set<String> myConcurrentSet = ConcurrentHashMap.newKeySet();
No. If you want to use a Set in a multithreaded environment, use Collections.synchronizedSet(...) or ConcurrentHashMap.newKeySet():
Set<String> concurrentSet = Collections.synchronizedSet(new HashSet<>());
or...
Set<String> concurrentSet = ConcurrentHashMap.newKeySet();

Java streams: map with side effect and collect, or forEach and populate a result list

I have a piece of code that looks something like this.
I have read two contradicting(?) "rules" regarding this:
1. That .map should not have side effects.
2. That .forEach should not update a mutable variable (so if I refactor to use forEach and populate a result list, that breaks this rule), as mentioned in http://files.zeroturnaround.com/pdf/zt_java8_streams_cheat_sheet.pdf
How can I solve this so that I use streams and still return a list, or should I simply skip streams?
@Transactional
public Collection<Thing> save(Collection<Thing> things) {
    return things.stream().map(this::save).collect(Collectors.toList());
}

@Transactional
public Thing save(Thing thing) {
    // org.springframework.data.repository.CrudRepository.save:
    // saves the given entity; use the returned instance for further operations,
    // as the save operation might have changed the entity instance completely.
    Thing saved = thingRepo.save(thing);
    return saved;
}
Doesn't that paper say shared mutable state? In your case if you declare the list inside the method and then use forEach, everything is fine. The second answer here mentions exactly what you are trying to do.
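For what it's worth, a minimal sketch of that approach, reusing the question's save(Thing) method and keeping the mutable list local to the method:
@Transactional
public Collection<Thing> save(Collection<Thing> things) {
    List<Thing> result = new ArrayList<>();            // local list, not shared state
    things.forEach(thing -> result.add(save(thing)));  // forEach only mutates the local list
    return result;
}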
There is little to no reason to collect an entirely new List if you don't mutate it at all. Besides that, your use case is basically iterating over every element in a collection and saving each one, which could simply be achieved with a for-each loop.
If for some reason thingRepo.save(thing) mutates the object, you can still return the same collection, but at that point it's a hidden mutation that is not clearly visible, since the call thingRepo.save(thing) does not suggest it.
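A sketch of the plain for-each variant this answer suggests, again with the question's thingRepo (assuming callers do not need the instances returned by save):
@Transactional
public Collection<Thing> save(Collection<Thing> things) {
    for (Thing thing : things) {
        thingRepo.save(thing); // persist each element; the returned instance is ignored here
    }
    return things; // hand the original collection back to the caller
}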

Should I use a ConcurrentHashMap?

A quick question about ConcurrentHashMap:
public Map<String, String> getA() {
    // get something from the DB into a HashMap, let's call it x
    // ...
    // do some operations on x
    // ...
    // put the result in a ConcurrentHashMap, let's call it A
    // ...
    return A;
}
Does it make sense to have a ConcurrentHashMap or should I go with a HashMap?
1. HashMap
2. ConcurrentHashMap
If you are on different threads, or the data will otherwise be operated on at the same time (a multithreaded delegate or the like), then yes, use ConcurrentHashMap. Otherwise, HashMap should do (given the information you've provided).
Based on reading your pseudo code, I get the impression that you are not working on different threads and therefore HashMap should suffice.
You might do better wrapping it in Collections.unmodifiableMap() if you don't want to worry about the clients of this method getting into race conditions when modifying/reading the map.
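For instance, a minimal sketch of that wrapping, with the DB work from the pseudocode reduced to a placeholder:
public Map<String, String> getA() {
    Map<String, String> a = new HashMap<>();
    a.put("key", "value");                 // placeholder for the DB work described above
    return Collections.unmodifiableMap(a); // callers receive a read-only view
}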

Initialize entry in Java Map with single look up (like in C++)

In C++, I can look up a key in a map and insert it if it's not there, all for the cost of a single lookup. Can I do the same in Java?
Update:
(For those of you who must see code.)
long id = 0xabba;
int version = 0xb00b;
for (List<Object> key : keys) {
    if (!index.containsKey(key)) {
        index.put(key, Maps.<Long, Integer>newHashMap());
    }
    index.get(key).put(id, version);
}
There are two lookups when the key is first inserted into the map. In C++, I could do it with a single lookup.
Concurrent maps have an atomic putIfAbsent method, if this is what you mean.
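A sketch of both options, with the index, key, id and version taken from the question's snippet (the putIfAbsent variant assumes index is a ConcurrentMap):
// putIfAbsent atomically inserts only when the key is absent and returns the previous value (or null)
index.putIfAbsent(key, new HashMap<Long, Integer>());
index.get(key).put(id, version);

// On Java 8+, computeIfAbsent gives a single lookup and also works on a plain HashMap
index.computeIfAbsent(key, k -> new HashMap<>()).put(id, version);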
I am not entirely familiar with the internals of the C++ implementation, but I have some doubts about it being a single operation in terms of performance/efficiency.
Even if it was, why would you necessarily need one in Java? Or even want one?
Assuming that it looks something like:
lookup(object) // side effect of object insertion
I wouldn't want something like this in Java for anything other than concurrency.

my ideal cache using guava

Off and on for the past few weeks I've been trying to find my ideal cache implementation using guava's MapMaker. See my previous two questions here and here to follow my thought process.
Taking what I've learned, my next attempt is going to ditch soft values in favor of maximumSize and expireAfterAccess:
ConcurrentMap<String, MyObject> cache = new MapMaker()
        .maximumSize(MAXIMUM_SIZE)
        .expireAfterAccess(MINUTES_TO_EXPIRY, TimeUnit.MINUTES)
        .makeComputingMap(loadFunction);
where
Function<String, MyObject> loadFunction = new Function<String, MyObject>() {
    @Override
    public MyObject apply(String uidKey) {
        return getFromDataBase(uidKey);
    }
};
However, the one remaining issue I'm still grappling with is that this implementation will evict objects even if they are strongly reachable, once their time is up. This could result in multiple objects with the same UID floating around in the environment, which I don't want (I believe what I'm trying to achieve is known as canonicalization).
So as far as I can tell the only answer is to have an additional map which functions as an interner that I can check to see if a data object is still in memory:
ConcurrentMap<String, MyObject> interner = new MapMaker()
        .weakValues()
        .makeMap();
and the load function would be revised:
Function<String, MyObject> loadFunction = new Function<String, MyObject>() {
    @Override
    public MyObject apply(String uidKey) {
        MyObject dataObject = interner.get(uidKey);
        if (dataObject == null) {
            dataObject = getFromDataBase(uidKey);
            interner.put(uidKey, dataObject);
        }
        return dataObject;
    }
};
However, using two maps instead of one for the cache seems inefficient. Is there a more sophisticated way to approach this? In general, am I going about this the right way, or should I rethink my caching strategy?
Whether two maps is efficient depends entirely on how expensive getFromDatabase() is, and how big your objects are. It does not seem unreasonable to do something like this.
As for the implementation, it looks like you can probably layer your maps in a slightly different way to get the behavior you want, and still have good concurrency properties.
1. Create your first map with weak values, and put the computing function getFromDatabase() on this map.
2. The second map is the expiring one, also computing, but its function just gets from the first map.
3. Do all your access through the second map.
In other words, the expiring map acts to pin a most-recently-used subset of your objects in memory, while the weak-reference map is the real cache.
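As a sketch only: with current Guava the same layering can be expressed with CacheBuilder/LoadingCache (the successor of MapMaker's computing maps); getFromDataBase, MyObject, MAXIMUM_SIZE and MINUTES_TO_EXPIRY are from the question, the rest is illustrative:
// First layer: the canonical cache; weak values keep an entry alive only while
// the object is still strongly reachable somewhere in the application.
LoadingCache<String, MyObject> canonicalCache = CacheBuilder.newBuilder()
        .weakValues()
        .build(CacheLoader.from(uidKey -> getFromDataBase(uidKey)));

// Second layer: pins a most-recently-used subset in memory; its loader only
// delegates to the first layer, so both layers hand out the same instance.
LoadingCache<String, MyObject> recentCache = CacheBuilder.newBuilder()
        .maximumSize(MAXIMUM_SIZE)
        .expireAfterAccess(MINUTES_TO_EXPIRY, TimeUnit.MINUTES)
        .build(CacheLoader.from(canonicalCache::getUnchecked));

// Do all access through the second layer:
MyObject obj = recentCache.getUnchecked("some-uid");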
I don't understand the full picture here, but two things.
Given this statement: "this implementation will evict objects even if they are strongly reachable, once their time is up. This could result in multiple objects with the same UID floating around in the environment, which I don't want." -- it sounds like you just need to use weakKeys() and NOT use either timed or size-based eviction.
Or if you do want to bring an "interner" into this, I'd use a real Interners.newWeakInterner.
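For reference, a minimal sketch of what that interner looks like (it relies on MyObject having a proper equals/hashCode; getFromDataBase and uidKey are from the question):
Interner<MyObject> interner = Interners.newWeakInterner();

// intern() returns the canonical instance: a previously interned equal object
// that is still strongly reachable, or, failing that, the argument itself.
MyObject canonical = interner.intern(getFromDataBase(uidKey));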
