We are trying to develop a distributed caching system. Right now we have 12 applications, and each JVM loads the same cache into its own memory. The problem with this setup is redundant data: all 12 applications hold a copy of the same cache.
We want a system where we add one or two (for failover) JVMs that load the cache, and the other 12 applications call these new cache JVMs.
Can someone suggest any technologies or frameworks that fit these needs?
Thanks
Have a look at Memcached. It may offer a solution to your distributed cache needs.
Also, as @Guy Bouallet mentioned, Ehcache is a viable solution.
Ehcache is a good alternative. It can be used to cache data loaded from a database, web pages, or other key/value elements in a distributed environment.
I have personally used it in several professional applications, and it has proved to be an efficient solution.
I am implementing a memory caching system for a web application. It will have to handle objects ranging from small to large, and a growing number of cache hits (reads and writes). It will also have to support multiple caching services, such as JCS, Ehcache, Memcached, and SQL caching, selected by configuration.
For learning purposes, and to design a better architecture for my system, could anyone point me to some resources (for example, sample class diagrams with project source files)?
The question is totally unspecific! The best thing you can do is to work through the tutorials, examples and manuals of the caching solutions.
You should also consider distributed caching solutions like Infinispan and Hazelcast.
For in-memory-only caching, Guava Cache and cache2k (I work on cache2k) might be sufficient.
If you start a new architecture around caching, I strongly suggest that you look into the JSR 107/JCache spec, because this is the new standard way to access caching services.
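The "multiple caching services based on the configuration" requirement from the question can be sketched as a small backend interface plus a registry keyed by a configured name. This is only an illustration in plain Java: `CacheBackend`, `InMemoryBackend`, and `CacheBackends` are hypothetical names, not part of JCache or of any library mentioned here. With JCache, the `javax.cache.Cache` interface would play the role of `CacheBackend`.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical provider-agnostic cache contract (not the javax.cache API). */
interface CacheBackend {
    Optional<String> get(String key);

    void put(String key, String value);
}

/** Simplest possible backend: a thread-safe map inside this JVM. */
class InMemoryBackend implements CacheBackend {
    private final Map<String, String> store = new ConcurrentHashMap<>();

    @Override
    public Optional<String> get(String key) {
        return Optional.ofNullable(store.get(key));
    }

    @Override
    public void put(String key, String value) {
        store.put(key, value);
    }
}

/** Maps a configured backend name to a factory for that backend. */
class CacheBackends {
    private static final Map<String, Supplier<CacheBackend>> REGISTRY = new ConcurrentHashMap<>();

    static {
        REGISTRY.put("memory", InMemoryBackend::new);
    }

    /** Adapters for Ehcache, Memcached, etc. would be registered here. */
    static void register(String name, Supplier<CacheBackend> factory) {
        REGISTRY.put(name, factory);
    }

    static CacheBackend create(String configuredName) {
        Supplier<CacheBackend> factory = REGISTRY.get(configuredName);
        if (factory == null) {
            throw new IllegalArgumentException("Unknown cache backend: " + configuredName);
        }
        return factory.get();
    }
}

class CacheBackendDemo {
    public static void main(String[] args) {
        // In a real application the name would come from a config file.
        CacheBackend cache = CacheBackends.create("memory");
        cache.put("user:1", "Ada");
        System.out.println(cache.get("user:1").orElse("<miss>"));
    }
}
```

Application code depends only on `CacheBackend`, so switching providers becomes a configuration change plus one adapter class per provider.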
I've been researching this for a week now, but I'd like some thoughts on my particular situation...
2 physical servers:
Server A - public WAR, admin WAR
Server B - public WAR
Requirements:
Both WARs need to view the same data.
admin WAR modifies / adds data to the cache.
public WARs modify other parts of the cache / add data to it.
The entire cache needs to reside in memory on each physical server (if I add something through the admin WAR or public WAR on Server A, it needs to show up in the public WAR on Server B), so that in the event of a failure we aren't waiting for half the cache to be repopulated.
1,500 active users/server, vast majority of traffic is read, very little write
Additional hardware is out of the question.
Is there a good third party caching solution for this scenario? It seems most distributed caching systems want to leave half the data on Server A and half on Server B, which wouldn't meet our failover performance needs.
Thanks for any ideas!
You should look at Redis
http://www.gigaspaces.com/ has a solution for that: it lets you create a "Space" that serves as a cache in replicated mode, so each node has an exact copy of the data.
They also have a solution for failover or hot standby.
Edit:
GigaSpaces is far more than just a shared cache, but you can use just the caching solution. It's called an in-memory data grid. They have dramatically changed their web pages, so I can't find the exact page, but if you search through the documentation you'll find it.
You can start here
http://www.gigaspaces.com/datagrid
But the technology is not free.
Take a look at the replicated options for EhCache.
Sounds like you've been searching for information on "distributed caches", which has a different definition than "replicated cache". A distributed cache is a larger cache system spread out among many machines, so that the loss of any one machine in the cluster does not bring down the entire cache, just a portion of it. In this scenario the total size of your cache can reach (number of machines) times (memory of each machine).
In a replicated cache, the cached data is copied to every machine, which limits the total cache size to what fits in the memory of a single machine (in practice, the smallest one).
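The difference can be made concrete with a toy model. The sketch below is an illustration only: `ToyCluster` is a hypothetical class in which each "node" is just an in-process map, and partitioning is a plain hash modulo the node count, which is simpler than what real products do.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy cluster: each "node" is just an in-process map (hypothetical, for illustration). */
class ToyCluster {
    private final List<Map<String, String>> nodes = new ArrayList<>();
    private final boolean replicated;

    ToyCluster(int nodeCount, boolean replicated) {
        this.replicated = replicated;
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    /** In partitioned mode, exactly one node owns each key. */
    private int ownerOf(String key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    void put(String key, String value) {
        if (replicated) {
            // Replicated: every node stores every entry.
            for (Map<String, String> node : nodes) {
                node.put(key, value);
            }
        } else {
            // Partitioned ("distributed"): only the owning node stores the entry.
            nodes.get(ownerOf(key)).put(key, value);
        }
    }

    /** What a given node can serve from its own memory. */
    String getLocal(int nodeIndex, String key) {
        return nodes.get(nodeIndex).get(key);
    }

    /** Simulates a node failure by discarding that node's data. */
    void failNode(int nodeIndex) {
        nodes.get(nodeIndex).clear();
    }

    /** Total stored entries across all nodes, i.e. the memory cost. */
    int storedEntries() {
        int total = 0;
        for (Map<String, String> node : nodes) {
            total += node.size();
        }
        return total;
    }
}

class ToyClusterDemo {
    public static void main(String[] args) {
        ToyCluster replicated = new ToyCluster(2, true);
        ToyCluster partitioned = new ToyCluster(2, false);
        for (String key : new String[] {"a", "b", "c", "d"}) {
            replicated.put(key, "v-" + key);
            partitioned.put(key, "v-" + key);
        }
        System.out.println("replicated stores " + replicated.storedEntries() + " entries");
        System.out.println("partitioned stores " + partitioned.storedEntries() + " entries");
        replicated.failNode(0);
        partitioned.failNode(0);
        System.out.println("after losing node 0, replicated keeps " + replicated.storedEntries());
        System.out.println("after losing node 0, partitioned keeps " + partitioned.storedEntries());
    }
}
```

With two nodes, the replicated cluster pays for every entry twice, but the surviving node can still serve every key after a failure. The partitioned cluster stores each entry once and loses whatever the failed node owned, which is the failover concern raised in the question.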
"It seems most distributed caching systems want to leave half the data on Server A and half on Server B, which wouldn't meet our failover performance needs."
No, you can tweak that easily. Otherwise you need sticky sessions (you have to know exactly which cache stores your data). You can choose any solution on the market: EhCache, GigaSpaces, GridGain, etc. I would recommend JBoss Cache; IMHO it is the simplest and exactly what you need.
There are many solutions in this space.
Memcached
EhCache
Infinispan
All of them can be configured as distributed caches. AFAIK Infinispan works best when left as an embedded cache in JBoss AS; last I checked, it was difficult to integrate into other app servers. If you have money, I would recommend BigMemory from Terracotta. It's the commercial derivative of EhCache and provides a lot of additional nice-to-have features.
We use Apache Commons JCS and have been very pleased with it. It claims to be almost twice as fast as EHCache. For the situation you have described, you would probably configure a Lateral TCP Cache.
I am trying to find out how to implement distributed caching for my applications.
Ehcache is already used for caching in my project, so I am looking for a way to solve this with it.
Unfortunately, it seems that Terracotta Enterprise Suite is needed for this, and it is commercial. Is that right?
Is there another way to use Ehcache for distributed caching (RMI or anything else)?
You don't need Terracotta Enterprise Suite to cluster your Ehcache instances. You can use clustering with Ehcache and Terracotta today, with pure OSS:
http://www.ehcache.org/documentation/configuration/distributed-cache-configuration
Edit: This link has expired. Below is the new link related to clustered cache
http://www.ehcache.org/documentation/3.4/clustered-cache.html
If you need replication, you can indeed use other mechanisms such as RMI:
http://www.ehcache.org/documentation/replication/index
However, only Terracotta clustering would give you HA and features such as consistency guarantees.
You may want to try Hazelcast as well. It is an open source distributed cache and is super easy to use.
PS: I work for Hazelcast
There are different ways to implement a distributed cache using Ehcache's own mechanisms, for example RMI or JGroups.
In one of my projects I ran into the same situation, and after some research I found that using a Redis server for cache management is an easy and effective solution.
I am suggesting this as an answer because using Ehcache will take time and add complexity, and you can end up with multiple cache managers in your workspace.
I'm running Jersey applications backed by a DB server, and I need to start thinking about in-memory caching.
The data behind the majority of the DB queries is updated only once a day, so there is an opportunity to cache these queries at the server level.
What options do I have? I know quite a few large applications use memcached. Are there others?
Any of the Java memcached libraries is probably your best bet.
Spymemcached
Memcached-Java-Client
Xmemcached
Memcached is a good default. Redis can be used too. It offers richer functionality should you choose to use it, but if your use case will always be what memcached offers, then there's no particular advantage.
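Whichever store is chosen, the access pattern for data that changes once a day is the same: look up the cached result, and re-run the query only when the entry is missing or has expired. Here is a minimal in-process sketch of that pattern, assuming a hypothetical `ExpiringQueryCache` class and a loader function that stands in for the real database call. With Memcached or Redis, the expiry would instead be set on the stored entry itself.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical read-through cache whose entries expire after a fixed time-to-live. */
class ExpiringQueryCache<K, V> {
    private static final class Entry<T> {
        final T value;
        final Instant expiresAt;

        Entry(T value, Instant expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Duration ttl;
    private final Clock clock;
    private final Function<K, V> loader;

    ExpiringQueryCache(Duration ttl, Clock clock, Function<K, V> loader) {
        this.ttl = ttl;
        this.clock = clock;
        this.loader = loader;
    }

    /** Returns the cached value, calling the loader if the entry is absent or expired. */
    V get(K key) {
        Instant now = clock.instant();
        // compute() runs atomically per key, so concurrent callers load a key only once.
        // The loader must not call back into this cache from inside compute().
        Entry<V> entry = entries.compute(key, (k, old) ->
                (old == null || !now.isBefore(old.expiresAt))
                        ? new Entry<>(loader.apply(k), now.plus(ttl))
                        : old);
        return entry.value;
    }
}

class ExpiringQueryCacheDemo {
    public static void main(String[] args) {
        // The loader is a placeholder for the real daily-updated query.
        ExpiringQueryCache<String, String> cache = new ExpiringQueryCache<>(
                Duration.ofHours(24),
                Clock.systemUTC(),
                sql -> {
                    System.out.println("running query: " + sql);
                    return "rows for " + sql;
                });
        System.out.println(cache.get("SELECT * FROM prices"));
        System.out.println(cache.get("SELECT * FROM prices")); // served from cache
    }
}
```

Passing the `Clock` in makes the expiry logic testable without waiting for real time to pass. A 24-hour time-to-live is an assumption based on the once-a-day update described in the question.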
Note that PostgreSQL has an internal cache (the buffer cache) and uses the kernel's disk cache. So tuning the PostgreSQL config for your needs may be a good idea.
In addition to this you could use materialized views for some queries.
I'm looking for a solution to share a cache between two Tomcat web apps running on different hosts. The cache is being used for data synchronization, so it must be guaranteed to be up to date at all times across the two Tomcat instances. (Sorry, I'm not 100% sure whether the correct term for this requirement is "consistency" or something more specific, like having ACID properties.) Another requirement, of course, is that access to the cache should be fast, with about equal numbers of writes and reads. I do have access to a shared filesystem, so that is a consideration.
I've looked at something like Ehcache, but to get a shared cache between the web apps I would need to either build on top of a Terracotta environment or use the new Ehcache cache server. The former (Terracotta) seems like overkill for this, while the cache web server seems like it wouldn't provide the fast performance that I want.
Another solution I've looked at is building something simple on top of a fast key-value store like Redis or memcachedb. Redis is in-memory but can easily be configured to be a centralized cache, while memcachedb is a disk-based persistent cache which could work because I have a shared filesystem.
I'm looking for suggestions on how to best solve this problem. The solution needs to be a relatively mature technology as it will be used in a production environment.
Thanks in advance!
I'm quite sure that you don't need Terracotta or the Ehcache server if all you need is a distributed cache. Ehcache with one of its four replication mechanisms would do.
However, based on what you've written I guess that you're looking for more than just a cache. Memcached/Ehcache are examples of what you might call a caching layer for your application - nothing more.
If you find yourself using words like 'guaranteed', 'up-to-date', and 'ACID', you're better off using an in-memory DB like Oracle TimesTen, MySQL Cluster, or Redis with disk-based persistent storage.
You can use memcached (not memcachedb) for fast and efficient caching. Redis or memcachedb could be overkill unless you want persistent caching. Memcached can be clustered very easily, and you can use the spymemcached Java client to access it. Memcached is very mature and is running on several hundred thousand, if not millions, of production servers. It can be monitored through Nagios and Munin when in production.