I'm working on a Java Spring project with a microservices architecture. Each service has an internal Hibernate second-level cache backed by the Ehcache region factory. To ensure data consistency between the different caches we use the Ehcache integration with JMS, with ActiveMQ as the JMS provider.
This is working so far, but how can I manually send a signal to the topic the caches are subscribed to, so that it evicts a specific cache region?
What is the nature and structure of such a message in Java?
Any help is appreciated!
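My current guess is something along these lines: a minimal sketch based on the Ehcache JMS replication docs, where the message is an ObjectMessage carrying string properties such as `action` and `cacheName`. The property names, the `REMOVE_ALL` action value, the topic name and the broker URL below are assumptions to verify against your Ehcache/ActiveMQ versions.

```java
import javax.jms.Connection;
import javax.jms.ObjectMessage;
import javax.jms.Session;
import javax.jms.Topic;

import org.apache.activemq.ActiveMQConnectionFactory;

public class CacheEvictionSignal {

    public static void main(String[] args) throws Exception {
        // Placeholder broker URL; the topic name must match the one the
        // cache replicators are subscribed to in ehcache.xml.
        ActiveMQConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://localhost:61616");

        Connection connection = factory.createConnection();
        connection.start();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Topic topic = session.createTopic("EhcacheTopic");

            // Empty ObjectMessage: the replicators act on the string properties.
            ObjectMessage message = session.createObjectMessage();
            message.setStringProperty("action", "REMOVE_ALL");                 // evict the whole region (assumed action name)
            message.setStringProperty("cacheName", "com.example.SomeEntity");  // region to evict (placeholder)

            session.createProducer(topic).send(message);
        } finally {
            connection.close();
        }
    }
}
```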
I have a standalone Spring application that reads messages from a WebLogic cluster. It is not an MDP; rather, it runs multiple threads, each of which uses a JmsTemplate to browse the queue and retrieve messages based on specific criteria.
I would like to cache the JMS connections, while also ensuring that I open enough connections to always be retrieving messages from each server in the cluster. My issue is that the default ConnectionFactory does not cache at all, while the Spring wrappers SingleConnectionFactory and CachingConnectionFactory do not allow multiple connections to be open at once.
Should I implement my own ConnectionFactory that caches on a limited basis? Or is there a recommended approach?
The use case turned out to be quite clumsy to do in Spring, so I resolved the issue by managing the JMS resources directly.
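A minimal sketch of what managing the resources directly can look like, assuming one plain JMS connection per cluster member and a QueueBrowser with a message selector (the factory, queue name and selector below are whatever your setup provides):

```java
import java.util.Enumeration;

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Message;
import javax.jms.Queue;
import javax.jms.QueueBrowser;
import javax.jms.Session;

public class DirectQueueBrowsing {

    // One connection per cluster member, created from a factory looked up
    // against that member's URL (placeholder approach).
    public void browse(ConnectionFactory factory, String queueName, String selector) throws Exception {
        Connection connection = factory.createConnection();
        connection.start();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Queue queue = session.createQueue(queueName);

            // Browse without consuming; the selector narrows down the messages.
            QueueBrowser browser = session.createBrowser(queue, selector);
            Enumeration<?> messages = browser.getEnumeration();
            while (messages.hasMoreElements()) {
                Message message = (Message) messages.nextElement();
                // inspect the message here, then consume it selectively if needed
            }
        } finally {
            connection.close();
        }
    }
}
```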
We're trying to horizontally scale a JPA-based application, but have run into issues with the JPA second-level cache. We've looked at several solutions (EhCache, Terracotta, Hazelcast) but couldn't seem to find the right one. Basically, what we want to achieve is to have multiple application servers all pointing to a single cache server that serves as JPA's second-level cache.
From a non-Java perspective, it would look like several PHP servers all pointing to one centralised memcache server as its cache service. Is this currently possible with Java?
Thanks
This is in response to the comment above.
Terracotta will be deployed on its own server.
Each of the app servers will have the Terracotta drivers, which store/retrieve data to and from the Terracotta server.
The Ehcache API present in the application WAR will invoke the Terracotta drivers to store data in the Terracotta server.
The Hibernate API will maintain the L1 cache; in addition, it will use the Ehcache API to save/retrieve data to and from the L2 cache, blissfully unaware of how the Ehcache API performs the task.
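As a rough illustration of how the Hibernate side of that stack is wired up, here is a sketch assuming Hibernate 4.x with the hibernate-ehcache module; the property values, the resource name and the Terracotta config element mentioned in the comment are placeholders to verify against your versions:

```java
import java.util.Properties;

public class SecondLevelCacheConfig {

    // Hibernate properties enabling the Ehcache-backed L2 cache. The clustering
    // itself is configured in ehcache.xml (e.g. a <terracottaConfig> element
    // pointing at the Terracotta server and <terracotta/> on each clustered
    // cache) - those element names are assumptions to check against the docs.
    public static Properties hibernateCacheProperties() {
        Properties props = new Properties();
        props.setProperty("hibernate.cache.use_second_level_cache", "true");
        props.setProperty("hibernate.cache.use_query_cache", "true");
        props.setProperty("hibernate.cache.region.factory_class",
                "org.hibernate.cache.ehcache.EhCacheRegionFactory");
        props.setProperty("net.sf.ehcache.configurationResourceName", "/ehcache.xml");
        return props;
    }
}
```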
Is it reliable to use Ehcache as a datasource instead of a database?
My business functionality is to periodically collect information from a running application, store it in the Ehcache cache, and then retrieve and display statistics about the collected information by querying the cache with the Ehcache Search API. The cache only needs to keep the last 30-45 days of data.
What do you think about this approach?
Ehcache could be an acceptable solution, assuming TTI, TTL and other parameters are set according to your business needs. There shouldn't be any reliability issue per se. A SQL database affords options for transactional commits, complex queries and relational support, which of course aren't provided by Ehcache itself.
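To make the Search API side concrete, here is a minimal sketch of querying such a cache, assuming Ehcache 2.x with a searchable attribute declared for a timestamp field; the cache name and attribute name are made up for illustration:

```java
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.search.Attribute;
import net.sf.ehcache.search.Query;
import net.sf.ehcache.search.Result;
import net.sf.ehcache.search.Results;

public class StatsQuery {

    public void printRecentEntries(CacheManager cacheManager, long cutoffMillis) {
        Cache statsCache = cacheManager.getCache("statsCache");

        // "timestamp" must be declared as a searchable attribute on the cache
        // (hypothetical attribute name used for illustration).
        Attribute<Long> timestamp = statsCache.getSearchAttribute("timestamp");

        Query query = statsCache.createQuery()
                .addCriteria(timestamp.ge(cutoffMillis))
                .includeValues();

        Results results = query.execute();
        for (Result result : results.all()) {
            System.out.println(result.getValue());
        }
        results.discard(); // release resources held by the result set
    }
}
```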
I'm trying to figure out which cache concurrency strategy I should use for my application (for entity updates, in particular). The application is a web service developed using Hibernate, is deployed on an Amazon EC2 cluster, and runs on Tomcat, so there is no application server.
I know that there are nonstrict-read-write / read-write and transactional cache concurrency strategies for data that can be updated, and that there are mature, popular, production-ready second-level cache providers for Hibernate: Infinispan, Ehcache, Hazelcast.
But I don't completely understand the difference between the transactional and read-write caches from the Hibernate documentation. I thought that the transactional cache was the only choice for a clustered application, but now (after reading some topics) I'm not so sure about that.
So my question is about the read-write cache. Is it cluster-safe? Does it guarantee data synchronization between the database and the cache, as well as synchronization between all the connected servers? Or is it only suitable for single-server applications, so that I should always prefer the transactional cache?
For example, if a database transaction that is updating an entity field (first name, etc.) fails and is rolled back, will the read-write cache discard the changes, or will it just propagate the bad data (the updated first name) to all the other nodes?
Does it require a JTA transaction for this?
The "Concurrency strategy configuration for JBoss TreeCache as 2nd level Hibernate cache" topic says:
`READ_WRITE` is an interesting combination. In this mode Hibernate itself works as a lightweight XA coordinator, so it doesn't require a full-blown external XA. Short description of how it works:
In this mode Hibernate manages the transactions itself. All DB actions must be inside a transaction; autocommit mode won't work.
During the flush() (which might occur multiple times during the transaction lifetime, but usually happens just before the commit) Hibernate goes through the session and searches for updated/inserted/deleted objects. These objects are then first saved to the database, and then locked and updated in the cache, so concurrent transactions can neither update nor read them.
If the transaction is then rolled back (explicitly or because of some error), the locked objects are simply released and evicted from the cache, so other transactions can read/update them.
If the transaction is committed successfully, then the locked objects are simply released and other threads can read/write them.
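For reference, the strategy in question is declared per entity roughly like this (a minimal sketch; the entity, field and region name are made up):

```java
import javax.persistence.Entity;
import javax.persistence.Id;

import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

// Hypothetical entity used only to show where the concurrency strategy is declared.
@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE, region = "userRegion")
public class User {

    @Id
    private Long id;

    private String firstName;

    // getters and setters omitted for brevity
}
```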
Is there any documentation on how the read-write strategy works in a clustered environment?
It seems that the transactional cache works correctly for this, but it requires a JTA environment with a standalone transaction manager (such as JBossTM, Atomikos, or Bitronix), an XA datasource, and a lot of configuration changes and testing. I managed to deploy this, but still have some issues with my frameworks. For instance, Google Guice IoC does not support JTA transactions, and I would have to replace it with Spring or move the service to an application server and use EJB.
So which way is better?
Thanks in advance!
Summary of differences
Nonstrict read-write and read-write are both asynchronous strategies, meaning the cache is updated after the transaction is completed. Transactional is obviously synchronous and the cache is updated within the transaction.
Nonstrict read-write never locks an entity, so there's always the chance of a dirty read. Read-write always soft-locks an entity, so any simultaneous access is sent to the database. However, there is a remote chance that read-write might not produce repeatable-read isolation.
The best way to understand the differences between these strategies is to see how they behave during the course of insert, update or delete operations.
You can check out my post here, which describes the differences in further detail.
Feel free to comment.
So far I've only seen clustered 2LC working with transactional cache modes. That's precisely what Infinispan does, and in fact Infinispan has so far stayed away from implementing the other cache concurrency modes. To lighten the transactional burden, Infinispan integrates with Hibernate via transaction synchronizations as opposed to XA.
Reading the ActiveMQ documentation (we are using the 5.3 release), I found a section about the possibility of using a JDBC persistence adapter with ActiveMQ.
What are the benefits? Does it provide any gain in performance or reliability? When should I use it?
In my opinion, you would use JDBC persistence if you wanted to have a failover broker and you could not use the file system. The JDBC persistence was significantly slower (during our tests) than journaling to the file system. For a single broker, the journaled file system is best.
If you are running two brokers in an active/passive failover, the two brokers must have access to the same journal / data store so that the passive broker can detect and take over if the primary fails. If you are using the journaled file system, then the files must be on a shared network drive of some sort, using NFS, WinShare, iSCSI, etc. This usually requires a higher-end NAS device if you want to eliminate the file share as a single point of failure.
The other option is that you can point both brokers to the database, which most applications already have access to. The tradeoff is usually simplicity at the price of performance, as the journaled JDBC persistence was slower in our tests.
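To make that second option concrete, here is a minimal sketch of pointing an embedded broker at a JDBC store, assuming ActiveMQ 5.x's BrokerService and JDBCPersistenceAdapter; the broker name and connector URL are placeholders, and in practice this is usually configured in activemq.xml rather than in code:

```java
import javax.sql.DataSource;

import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.store.jdbc.JDBCPersistenceAdapter;

public class JdbcPersistedBroker {

    public BrokerService start(DataSource sharedDataSource) throws Exception {
        BrokerService broker = new BrokerService();
        broker.setBrokerName("brokerA"); // placeholder name

        // Both brokers in an active/passive pair point at the same database;
        // the passive one waits on the database lock until the active one fails.
        JDBCPersistenceAdapter jdbc = new JDBCPersistenceAdapter();
        jdbc.setDataSource(sharedDataSource);
        broker.setPersistenceAdapter(jdbc);

        broker.addConnector("tcp://0.0.0.0:61616"); // placeholder transport connector
        broker.start();
        return broker;
    }
}
```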
We run ActiveMQ in an active/passive broker pair with journaled persistence via an NFS mount to a dedicated NAS device, and it works very well for us. We are able to process over 600 msgs/sec through our system with no issues.
The use of journaled JDBC seems to be better than plain JDBC persistence, since journaling is much faster than JDBC persistence. It is also better than journaled persistence alone because you have an additional backup of the messages in the database. Journaled JDBC has the further advantage that the data in the journal is later persisted to the database, where it can be accessed by developers when needed.
However, when you are using a master/slave ActiveMQ topology with journaled JDBC, you might end up losing messages, since there may be messages in the journal that have not yet made it into the DB.
If you have a redelivery plugin policy in place and use a master/slave setup, the scheduler is used for the redelivery.
As of today, the scheduler can only be set up on a file-based store, not on JDBC. If you do not pay attention to that, all messages that are in redelivery will be taken out of the HA scenario and kept local to the broker.
https://issues.apache.org/jira/browse/AMQ-5238 is an issue in the Apache issue tracker that asks for a JDBC persistence adapter for the scheduler DB. You can vote for it to help make it happen.
Actually, even with the top AMQ HA solution, LevelDB + ZooKeeper, the scheduler is taken out of the game and is documented to cause issues (see http://activemq.apache.org/replicated-leveldb-store.html, at the end of the page).
In a JDBC scenario, therefore, how to set up the datastore for the redelivery policy can be considered unsafe and unsupported, or at the very least not clearly documented.