When to use Java Cache and how it differs from HashMap? - java

I've gone through javax.cache.Cache to understand it's usage and behavior. It's stated that,
JCache is a Map-like data structure that provides temporary storage of
application data.
JCache and HashMap stores the elements in the local Heap memory and don't have persistence behavior by default. By implementing custom CacheLoader and CacheWriter we can achieve persistence. Other than that, When to use it?

Caches usually have more management logic than a map, which are nothing else but a more or less simple datastructure.
Some concepts, JCaches may implement
Expiration: Entries may expire and get removed from the cache after a certain period of time or since last use
Eviction: elements get removed from the cache if space is limited. There can be different eviction strategies .e. LRU, FIFO, ...
Distribution: i.e. in a cluster, while Maps are local to a JVM
Persistence: Elements in the cache can be persistent and present after restart, contents of a Map are just lost
More Memory: Cache implementations may use more memory than the JVM Heap provides, using a technique called BigMemory where objects are serialized into a separately allocated bytebuffer. This JVM-external memory is managed by the OS (paging) and not the JVM
option to store keys and values either by value or by reference (in maps you to handle this yourself)
option to apply security
Some of these some are more general concepts of JCache, some are specific implementation details of cache providers

Here are the five main differences between both objects.
Unlike java.util.Map, Cache :
do not allow null keys or values. Attempts to use null will result in a java.lang.NullPointerException
provide the ability to read values from a javax.cache.integration.CacheLoader (read-through-caching) when a
value being requested is not in a cache
provide the ability to write values to a javax.cache.integration.CacheWriter (write-through-caching) when a
value being created/updated/removed from a cache
provide the ability to observe cache entry changes
may capture and measure operational statistics
Source : GrepCode.com

Mostly, caching implementations keep those cached objects off heap (outside the reach of GC). GC keeps track of each and every object allocated in java. Imagine you have millions of objects in memory. If those objects are not off heap, believe me, GC will make your application performance horrible.

Related

Local Java Cache library supporting multiple map tiers?

I am looking for a Java based caching library that supports multiple standard Map interfaces as tiers (that I for instance could use for on-heap, off-heap and flash based maps) i.e. instead of layering multiple independent caches that each keep their own eviction mechanism I want a SINGLE logic that will move entries between the tiers as they becomes more or less frequently used.
My use-case involves a huge number of relatively small entries so holding separate caches where each lower level also holds the entries of the previous tiers as well as resulting in duplication of usage meta data for each key would be very storage inefficient.
The access time must be as low and consistent as possible so not considering distributed/remote cache tiers (Redis, Memcached...) in this case.

Java: use concurrent hashmap or memcached cache

I have a simple country states hashmap, which is a simple static final unmodifiable concurrent hashmap.
Now we have implemented memcached cache in our application.
My question is, Is it beneficial to get the values from cache instead of such a simple map?
What benefits I will get or not get if I move this map to cache?
This really depends on the size of the data and how much memory is you've allocated for your JVM.
For simple data like states of a country which are within a few hundred entries, a simple HashMap would suffice and using memcache is an overkill and in fact slower.
If it's large amount of data which grow (typically 10s/100s MBs or larger) and require frequent access, memcache (or any other persistent cache) would be better than an in-memory storage.
It will be much faster as a HashMap because it is stored in memory and the lookup can be done via the jvm by it's reference. The lookup from memcache would require extra work for the processor to look up the map.
If your application is hosted on only one server then you don't need distributed feature of memcache and HashMap will be damn fast. Stats
But this is not case of web applications. ~99% cases for web applications you host it on multiple servers and want to use distributed caching, memcache is best in such cases.

android cache > internal storage vs. object cache

i need to cache images (only 5 or up to 100) from the web and displayed in a listview. if the user selects a row of the listview the cache can be cleared. i had a look on some examples. some use external storage. some use internal and external. some objects..
so what are the advantages/disadvantages of internal storage ( http://developer.android.com/guide/topics/data/data-storage.html#filesInternal via getCacheDir()) and object cache (something like WeakHashMap or HashMap < String, SoftReference < Drawable > )?
a problem with softreferences seems to be that they may get gc'ed too fast ( SoftReference gets garbage collected too early) . what about the android internal storage? the reference sais "These files will be ones that get deleted first when the device runs low on storage.".
does it make any difference to use a object cache or the temporary internal storage? except for the object cache should be a bit faster
Here are the few differences between the two:
Object cache is faster than internal storage, but has lower capacity.
Object cache is transient in nature while internal storage has longer life span
Object cache takes the actual space in the heap. Internal storage doesn't. This is an important point, as making your object cache too large could cause the OutOfMemoryException even with SoftReference
Now given those differences, they are not totally mutually exclusive. A lot of we implemented is using multi layer caching especially related to the image loading. Here are the steps that we use:
If the image hasn't been cached, fetch from the URL and cache it in first level cache which is the SoftReference/WeakHashMap or even hard cache with limited size using LinkedHashMap
We then implement removeEldestEntry() in LinkedHashMap. Upon hitting the hard cache capacity, we move things to secondary cache which is the internal storage. Using this method, you don't have to refetch the image from the URL + it's still be faster and it frees up your memory
We ran a cleanup on timely basis on the background for the internal storage using LRU algorithm. You shouldn't rely on Android to clean this up for you.
We have made the multi layers caching a common component and have used it many of our projects for our clients. This technique is pretty much following to what L1, L2 cache in Computer Architecture.
You should look for the question "How do I lazy download images in ListView" and related.

Is it safe to cache DataSource lookups in Java EE?

I'm developing a simple Java EE 5 "routing" application. Different messages from a MQ queue are first transformed and then, according to the value of a certain field, stored in different datasources (stored procedures in different ds need to be called).
For example valueX -> dataSource1, valueY -> dataSource2. All datasources are setup in the application server with different jndi entries. Since the routing info usually won't change while the app is running, is it save to cache the datasource lookups? For example I would implement a singleton, which holds a hashmap where I store valueX->DataSource1. When a certain entry is not in the list, I would do the resource lookup and store the result in the map. Do I gain any performance with the cache or are these resource lookups fast enough?
In general, what's the best way to build this kind of cache? I could use a cache for some other db lookups too. For example the mapping valueX -> resource name is defined in a simple table in a DB. Is it better too lookup the values on demand and save the result in a map, do a lookup all the time or even read and save all entries on startup? Do I need to synchronize the access? Can I just create a "enum" singleton implementation?
It is safe from operational/change management point of view, but not safe from programmer's one.
From programmer's PoV, DataSource configuration can be changed at runtime, and therefore one should always repeat the lookup.
But this is not how things are happening in real life.
When a change to a Datasource is to be implemented, this is done via a Change Management procedure. There is a c/r record, and that record states that the application will have a downtime. In other words, operational folks executing the c/r will bring the application down, do the change and bring it back up. Nobody does the changes like this on a live AS -- for safety reasons. As the result, you shouldn't take into account a possibility that DS changes at runtime.
So any permanent synchronized shared cache is good in the case.
Will you get a performance boost? This depends on the AS implementation. It likely to have a cache of its own, but that cache may be more generic and so slower and in fact you cannot count on its presence at all.
Do you need to build a cache? The answer usually comes from performance tests. If there is no problem, why waste time and introduce risks?
Resume: yes, build a simple cache and use it -- if it is justified by the performance increase.
Specifics of implementation depend on your preferences. I usually have a cache that does lookups on demand, and has a synchronized map of jndi->object inside. For high-concurrency cache I'd use Read/Write locks instead of naive synchronized -- i.e. many reads can go in parallel, while adding a new entry gets an exclusive access. But those are details much depending on the application details.

Java based memcached client, optimization of putting data inside memcache

I have say list of 1000 beans which I need to share among different projects. I use memcache for this purpose. Currently, loop is run over complete list and each bean is stored in memcache with some unique memcache id. I was wondering, instead of putting each and every bean in memcache independently. Put all the beans in hashmap with the same key which is used for storing beans in memcache, and then put this hashmap in memcache.
Will this give me any significant improvement over putting each and every bean individually in memcached. Or will this cause me any trouble because of large size of the object.
Any help is appreciated.
It won't get you any particular benefit -- it'll actually probably be slower on the load -- serialization is serialization, and adding a hashmap wrapper around it just increases the amount of data that needs to be deserialized and populated. for retrievals, assuming that most lookups are desecrate by the key you want to use for your hashmap you'll have a much much slower retrieval time because you'll be pulling down the whole graph just to get to one of it's discreet member info.
Of course if the data is entirely static and you're only using memcached to populate values in various JVM's you can do it that way and just hold onto the hashmap in a static... but then you're multiplying your memory consumption by the number of nodes in the cluster...
I did some optimization work in spymemcached that helps it do the right thing when doing the wire encoding.
This may, or may not help you with your application. In general, just measure when you have performance questions about your app.

Categories