Guava loading cache - keep fresh until expiry - java

I have a specific use case with a LoadingCache in Guava:
- Expire keys that have not been accessed in 30m
- As long as a key is in the cache, keep it fresh irrespective of access
I've been able to get to these semantics only through the use of some external kludge.
https://gist.github.com/kashyapp/5309855
Posting here to see if folks can give better ideas.
Problems
- refreshAfterWrite() is triggered only on access
- cache.refresh() -> CacheLoader.reload()
  - updates timers for accessed/written even if we return oldValue
  - returning an immediateCancelledFuture() causes ugly logging
  - basically no way for reload() to say that nothing changed
Solution
- set expireAfterAccess on the cache
- schedule a refreshJob for every key using an external executor service
- refreshJob.run() checks if the cache still has the key
  - (asMap().containsKey()) doesn't update access times
  - queries upstream, and does a cache.put() only if there is a changed value
Almost there
This is not exactly what I set out to do, but it is close enough. If upstream is not changing, un-accessed keys expire away. Keys which keep changing upstream do not get expired in the cache, because each cache.put() counts as a write and resets their timers.
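For comparison, the semantics asked for at the top (idle expiry, plus refresh that does not count as access) can be sketched without Guava. This is only an illustration under assumptions: the class name, the `upstream` loader function, and the injectable `clock` are all hypothetical, and `refreshAll()` is meant to be called from a scheduler of your choice.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical, library-free sketch: idle expiry plus access-independent refresh. */
class IdleExpiringRefreshingCache<K, V> {
    /** Immutable entry: the cached value and the time of the last read. */
    private static final class Entry<V> {
        final V value;
        final long lastAccess;
        Entry(V value, long lastAccess) {
            this.value = value;
            this.lastAccess = lastAccess;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> upstream; // assumed loader for fresh values
    private final long idleLimit;          // e.g. 30 minutes, in the clock's unit
    private final LongSupplier clock;      // e.g. System::nanoTime in production

    IdleExpiringRefreshingCache(Function<K, V> upstream, long idleLimit, LongSupplier clock) {
        this.upstream = upstream;
        this.idleLimit = idleLimit;
        this.clock = clock;
    }

    /** Reads count as access: they load on a miss and reset the idle timer. */
    V get(K key) {
        long now = clock.getAsLong();
        return entries.compute(key, (k, old) -> {
            boolean live = old != null && now - old.lastAccess <= idleLimit;
            V value = live ? old.value : upstream.apply(k);
            return new Entry<V>(value, now);
        }).value;
    }

    /** Run periodically: refreshes live keys without touching their access time. */
    void refreshAll() {
        long now = clock.getAsLong();
        for (K key : entries.keySet()) {
            entries.computeIfPresent(key, (k, old) -> {
                if (now - old.lastAccess > idleLimit) {
                    return null; // idle for too long: drop instead of refreshing
                }
                V fresh = upstream.apply(k);
                if (Objects.equals(fresh, old.value)) {
                    return old; // nothing changed upstream
                }
                return new Entry<V>(fresh, old.lastAccess); // new value, old access time
            });
        }
    }

    /** True while an entry is stored; idle entries disappear on the next refreshAll(). */
    boolean containsKey(K key) {
        return entries.containsKey(key);
    }
}
```

Because the refresh keeps the old access time, a key that changes upstream still expires once nobody reads it, which is the part the Guava-based workaround does not achieve. The loader runs while the map entry is locked, so a slow upstream would need a different design.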

Related

Collection Object lost due to redis TTL using Redisson client

We are using the Redisson client for Java to get data from Redis, but the object gets deleted from the collection due to its TTL.
Example
We are trying the below approach to get the data from Redis with TTL.
final RList<String> rList = client.getList(getEnvCacheKey(cacheKey));
rList.expire(7L, TimeUnit.SECONDS);
rList.add("Value1");
rList.add("Value2");
assertThat(rList).containsOnly("Value1", "Value2"); // This condition is true until 7 seconds have passed
Now, after 7 seconds, the condition rList.size() == 2 becomes false, since the object references are deleted due to the TTL.
Due to this we ran into a production issue. Is there any way we can retain the objects even after the TTL? Any sort of help will be appreciated.
The TTL (time-to-live) sets the expiration of a particular key, after which the key can no longer be retrieved. If you want to keep the key in memory, you could simply skip calling rList.expire(7L, TimeUnit.SECONDS); altogether (no expiration).
In case you want to extend the expiration, you can do so by repeating the expire command. It is also possible to remove the TTL altogether, although I could not tell you how to do it specifically with Redisson.
As for the expired keys, Redis clears them 10 times a second (according to this documentation), so I don't think that you can (consistently) recover the values within the expired keys.
My general advice would be to take a step back and look at your system design. If you are missing the expired keys, maybe this part of the product should get an extended TTL, no TTL, or a periodic TTL refresh.

Is there any third party java cache which provides control over the expiration event?

My use case is that I need to implement a cache on top of a service. The cache should expire entries after a certain amount of time (from their time of creation).
If an entry has expired, a service lookup should be done to get the latest entry. Let's call this a service refresh.
But if the service refresh fails, I should still be able to use the stale data in the cache.
Since the entry has already expired, though, I no longer have it.
So I am thinking of controlling the expiration myself: a cache entry would be expired only if the service is available to provide the latest data; otherwise the entry is not removed.
I was looking into Google Guava cache, but it only provides a removalListener which would just notify me with the event but I am not able to control the expiration event with this.
Is there any third party cache implementation which can serve my purpose?
These resilience and robustness semantics are implemented in cache2k. We have used it in production for quite some time. The setup looks like this:
CacheBuilder.newCache(Key.class, Value.class)
.name("myCache")
.source(new YourSourceImplementation())
.backgroundRefresh(true)
.suppressExceptions(true)
.expiryDuration(60, TimeUnit.SECONDS)
.build();
With exceptionExpiryDuration you can set a shorter interval for the retry. There was a similar question on SO, which I also answered, see: Is it possible to configure Guava Cache (or other library) behaviour to be: If time to reload, return previous entry, reload in background (see specs). There you will find some more details about it.
Regardless of which cache you use, you will run into a lot of issues, since exception handling in general, and building robust and resilient applications in particular, requires careful thought about the details.
That said, I am not totally happy with the solution yet, since I think we need more control, e.g. how long stale data should be served. Furthermore, the cache needs to have some alerting if there is stale data in it. I put some thoughts on how to improve this here: https://github.com/cache2k/cache2k/issues/26
Feedback is very welcome.
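To make the requirement from the question concrete (expire by age, but fall back to the stale value when the service refresh fails), here is a minimal library-free sketch. It is not how cache2k implements it; the class name, the `service` function, and the injectable `clock` are assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical sketch: entries expire by age, but stale data survives a failed refresh. */
class StaleOnErrorCache<K, V> {
    /** Immutable entry: the value and the time it was loaded. */
    private static final class Entry<V> {
        final V value;
        final long loadedAt;
        Entry(V value, long loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> service; // assumed lookup; throws when the service is down
    private final long ttl;               // maximum age, in the clock's unit
    private final LongSupplier clock;     // e.g. System::currentTimeMillis

    StaleOnErrorCache(Function<K, V> service, long ttl, LongSupplier clock) {
        this.service = service;
        this.ttl = ttl;
        this.clock = clock;
    }

    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> current = entries.get(key);
        if (current != null && now - current.loadedAt < ttl) {
            return current.value; // still fresh
        }
        try {
            V fresh = service.apply(key); // the "service refresh"
            entries.put(key, new Entry<V>(fresh, now));
            return fresh;
        } catch (RuntimeException e) {
            if (current == null) {
                throw e; // nothing stale to fall back on
            }
            return current.value; // refresh failed: keep serving the stale entry
        }
    }
}
```

Note that the expired entry is never removed up front; it is only replaced after a successful lookup. Concurrent callers may each hit the service for the same expired key, and every read retries while the service is down, so a real implementation would add a retry interval and some limit on how long stale data is served.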

Eagerly repopulate EhCache instead of waiting for a read

In my scenario, getting the fresh (non-cached) values is a very expensive operation so it is imperative pre-calculated cached values exist at all times instead of refreshing them on read, like EhCache seems to do.
For this it sounds reasonable to have a thread firing on TTL expiration repopulating the cache with fresh values, so no reads are ever waiting.
Is there a way to achieve this using Ehcache? Listening for OnElementExpired/Evicted events to repopulate the cache seems like a no-go (by the time I receive the event, a read would already be waiting).
I guess I could make the cache itself eternal and have my own scheduled task that repopulates, but then I get nothing from EhCache over dumb maps that I have now. Is this really how it is? Is there no way to have EhCache help me in this situation?
Ehcache provides a way of doing what you want with scheduled refresh.
You will need two things in order to make this work with Ehcache:
Use a cache loader - that is move to a cache read-through pattern. This is required as otherwise Ehcache has no idea how to get to the data mapped to a key.
Configure scheduled refresh - this works by launching a quartz scheduler instance.
Take a look at RefreshAheadCache, provided by EHCache.
However, I cannot find any examples of its use or any indication that it is mature.
The comment of the class says:
A cache decorator which implements read ahead refreshing. Read ahead occurs when a cache entry is accessed prior to its expiration, and triggers a reload of the value in the background.
This does not directly solve the problem as you mention below:
My problem is how to repopulate the cache without waiting for a read to trigger it
As far as I know, there is no standard way to do it. The reason is that the expiry is not timer-based.
(Shameless) hint: Since I think this is quite useful, I implemented this in cache2k. The feature is called background refresh, enabled by CacheBuilder.backgroundRefresh(true).
Maybe this code could help:
https://github.com/jsr107/jsr107spec/issues/328
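The fallback mentioned in the question (an eternal cache plus your own scheduled task) could look roughly like the sketch below. It uses only the JDK rather than Ehcache or cache2k; the class name and the `expensiveLoader` supplier are hypothetical, and it assumes the full set of values can be recomputed in one go.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch: values are recomputed on a timer, so reads never wait. */
class EagerlyRepopulatedCache<K, V> implements AutoCloseable {
    private final Supplier<Map<K, V>> expensiveLoader; // assumed: computes all values
    private volatile Map<K, V> snapshot = Collections.emptyMap();
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "cache-repopulate");
            t.setDaemon(true); // do not keep the JVM alive for the refresher
            return t;
        });

    EagerlyRepopulatedCache(Supplier<Map<K, V>> expensiveLoader) {
        this.expensiveLoader = expensiveLoader;
    }

    /** Recompute everything, then swap the snapshot in a single volatile write. */
    void repopulate() {
        Map<K, V> fresh = new HashMap<>(expensiveLoader.get());
        snapshot = Collections.unmodifiableMap(fresh);
    }

    /** Fill the cache once, then keep repopulating it in the background. */
    void start(long period, TimeUnit unit) {
        repopulate();
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                repopulate();
            } catch (RuntimeException e) {
                // Keep the previous snapshot; a real task would log the failure.
            }
        }, period, period, unit);
    }

    /** Reads only ever see a complete snapshot and never trigger a load. */
    V get(K key) {
        return snapshot.get(key);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Calling start(5, TimeUnit.MINUTES) once at startup would keep the values at most about five minutes plus one load time old. This gives up per-entry TTLs and size limits, which is exactly the trade-off the question worries about.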

Guava Cache CacheLoader.refreshAfterWrite() and .expireAfterAccess() in combination

We are using a Guava LoadingCache which is built by a CacheLoader.
What we are looking for is a cache which will refresh its content regularly, but also expire keys after a given (longer) timeframe if the key is not accessed anymore.
Is it possible to use .refreshAfterWrite(30, TimeUnit.SECONDS) and also .expireAfterAccess(10, TimeUnit.MINUTES) on the same CacheLoader?
My experience is that the keys are never evicted because of the regular reload through refreshAfterWrite. The documentation leaves me a little uncertain about this point.
This should behave as you desired. From the CacheBuilder docs:
Currently automatic refreshes are performed when the first stale request for an entry occurs. The request triggering refresh will make a blocking call to CacheLoader.reload(K, V) and immediately return the new value if the returned future is complete, and the old value otherwise.
So if a key is queried 30 seconds after its last write, it will be refreshed; if it is not queried for 10 minutes after its last access, it will become eligible for expiration without being refreshed in the meantime.

How to reliably use a Guava LoadingCache when requiring persistence?

I store references in my application, e.g. (key) number => (value) name, usually for a short time (suitable to keep in memory), but sometimes for a longer time which requires me to persist it to DB (to survive application restarts and memory constraints). Earlier we always persisted to DB, but now I want to use LoadingCache as a performance enhancement as references usually are short-lived (no need to persist).
In my application I store references, e.g. key 123 => value "Paul".
I use a LoadingCache with expiry set to 45 seconds with the reasoning that if value has not been used for 45 seconds I want to persist it to DB (using a RemovalListener) so that a request for key 123 a day later from cache should return "Paul" using .load() from DB (as it's not in cache).
My question concerns the fact that maintenance isn't guaranteed to happen immediately. I don't understand when onRemoval() is actually called or how to solve my problem reliably.
Example flow:
1. Input is received saying that key 123 corresponds to value "Paul" which I .put() to my LoadingCache
2. Whenever data is used (often within 10-20 seconds) it's fetched via .get() and thereafter removed with .invalidate()
The LoadingCache has a RemovalListener whose onRemoval() persists expired entries (checked with .wasEvicted() to skip explicitly invalidated records) to a DB table.
My problem is with how the cache's cleanup happens.
Example:
I add a value which should expire 45 seconds later.
If I try to .get() it 50 seconds later, it is not returned to me as it has expired. But as I understand it from my tests, there is no guarantee that onRemoval() has been called for that value before my .get() returns. This means I have not persisted it to DB yet, so my load() method cannot find the value either (until later, when onRemoval() has actually happened). So I've lost my value when I need it, which is not acceptable.
Am I trying to use LoadingCache for something it was not intended for or is there any way I can make it suit my needs? Or perhaps there is an alternative suggestion / solution?
Thanks a lot in advance!
