Apache Ignite JDBC and write-behind caching strategy with 3rd party persistence - java

I currently use MySQL as a persistent data store and I'd like to introduce a data grid layer between my application and MySQL to handle database outages. I'd like to do this as non-invasively as possible with respect to the current application structure.
Apache Ignite is shipped with two features related to my problem: write-behind caching strategy with 3rd party persistence and custom JDBC driver.
I would like to combine these two features as follows:
the application will use the Ignite JDBC driver to persist data.
Ignite will query/update data in memory and will asynchronously flush data to the MySQL database (the write-behind caching strategy; a configuration sketch follows below).
when MySQL becomes unavailable, Ignite will batch the updates until MySQL is restored, and will keep serving queries/updates without affecting the client app.
Is this setup possible with only configuration changes like replacing the DataSource implementation and configuring the Ignite cache?
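For reference, the write-behind part of the setup I have in mind looks roughly like this (a sketch; Person, personCache and the mysqlDataSource bean name are placeholders):

    // Sketch of the cache configuration in question: write-behind on top
    // of the Ignite-provided CacheJdbcPojoStore.
    CacheJdbcPojoStoreFactory<Long, Person> storeFactory = new CacheJdbcPojoStoreFactory<>();
    storeFactory.setDataSourceBean("mysqlDataSource");

    CacheConfiguration<Long, Person> ccfg = new CacheConfiguration<>("personCache");
    ccfg.setCacheStoreFactory(storeFactory);
    ccfg.setWriteThrough(true);               // required for write-behind
    ccfg.setWriteBehindEnabled(true);         // flush to MySQL asynchronously
    ccfg.setWriteBehindFlushFrequency(5_000); // flush every 5 seconds
    ccfg.setWriteBehindBatchSize(512);        // updates per flush batch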

I don't think the third point is available out of the box. The CacheStore implementation (for example, CacheJdbcPojoStore) assumes that the connection to the underlying database is reliable and can be established at any time. The write-behind mechanism works the same way, i.e. it may try to establish a connection whenever the internal buffer overflows, a timeout occurs, or the back-pressure mechanism is triggered.
Thus, you have to implement your own CacheStore, which takes care of accumulating data while the MySQL database is unavailable; a rough sketch follows below.
Perhaps the following links will be helpful:
https://apacheignite.readme.io/docs/3rd-party-store#section-cachestore
https://apacheignite.readme.io/docs/3rd-party-store#section-cachestoresession
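Very roughly, such a store could wrap the JDBC store like this (a sketch only, assuming the delegate is something like CacheJdbcPojoStore; a real implementation would also have to buffer deletes, preserve write ordering, bound the backlog and survive node restarts):

    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;

    import javax.cache.Cache;
    import javax.cache.integration.CacheWriterException;

    import org.apache.ignite.cache.store.CacheStore;
    import org.apache.ignite.cache.store.CacheStoreAdapter;

    // Decorator that buffers failed writes in memory while MySQL is down
    // and replays them once writes start succeeding again.
    public class BufferingCacheStore<K, V> extends CacheStoreAdapter<K, V> {
        private final CacheStore<K, V> delegate; // e.g. a CacheJdbcPojoStore
        private final Queue<Cache.Entry<? extends K, ? extends V>> backlog =
                new ConcurrentLinkedQueue<>();

        public BufferingCacheStore(CacheStore<K, V> delegate) {
            this.delegate = delegate;
        }

        @Override public V load(K key) {
            return delegate.load(key); // reads still go to MySQL
        }

        @Override public void write(Cache.Entry<? extends K, ? extends V> entry) {
            flushBacklog(); // try to drain accumulated updates first
            try {
                delegate.write(entry);
            } catch (CacheWriterException e) {
                backlog.add(entry); // MySQL unreachable: accumulate the update
            }
        }

        @Override public void delete(Object key) {
            delegate.delete(key); // deletes would need the same buffering
        }

        private void flushBacklog() {
            Cache.Entry<? extends K, ? extends V> e;
            while ((e = backlog.peek()) != null) {
                try {
                    delegate.write(e);
                    backlog.poll();
                } catch (CacheWriterException ex) {
                    return; // still down; keep the backlog
                }
            }
        }
    }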

Related

Cache MySQL DB with Apache Ignite

I have an application written in Java.
We are using a MySQL DB.
Is it possible to integrate that MySQL DB with Apache Ignite as an in-memory cache and use that configuration without any changes to the Java application (of course, some DB connection details would have to change)?
So my application would do the same stuff, with the only difference being that it connects to Apache Ignite instead of MySQL?
Is this kind of configuration possible?
I suppose you are looking for the write-through feature. I'm not sure what your use case is, but you should be aware of some limitations, e.g. your data has to be preloaded into Ignite before running SELECT queries. From a very abstract perspective, you need to define POJOs and implement a custom CacheStore interface. Though GridGain Control Center can do the latter for you automatically, check this demo as a reference.
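For illustration only (assuming a PERSON table with ID and NAME columns and a matching Person POJO; the bean and cache names are made up), the POJO store mapping could look something like this:

    import java.sql.Types;

    import org.apache.ignite.cache.store.jdbc.CacheJdbcPojoStoreFactory;
    import org.apache.ignite.cache.store.jdbc.JdbcType;
    import org.apache.ignite.cache.store.jdbc.JdbcTypeField;
    import org.apache.ignite.configuration.CacheConfiguration;

    public class PersonCacheConfig {
        static CacheConfiguration<Long, Person> personCache() {
            // Map the PERSON table to the Person POJO via the built-in POJO store.
            JdbcType personType = new JdbcType();
            personType.setCacheName("personCache");
            personType.setDatabaseTable("PERSON");
            personType.setKeyType(Long.class);
            personType.setKeyFields(new JdbcTypeField(Types.BIGINT, "ID", Long.class, "id"));
            personType.setValueType(Person.class);
            personType.setValueFields(new JdbcTypeField(Types.VARCHAR, "NAME", String.class, "name"));

            CacheJdbcPojoStoreFactory<Long, Person> storeFactory = new CacheJdbcPojoStoreFactory<>();
            storeFactory.setDataSourceBean("mysqlDataSource"); // DataSource bean holding the MySQL details
            storeFactory.setTypes(personType);

            CacheConfiguration<Long, Person> ccfg = new CacheConfiguration<>("personCache");
            ccfg.setCacheStoreFactory(storeFactory);
            ccfg.setReadThrough(true);  // load missing keys from MySQL on cache misses
            ccfg.setWriteThrough(true); // propagate cache updates back to MySQL
            return ccfg;
        }
    }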

Apache Ignite as transparent cache for Postgresql tables

Is it possible to use Apache Ignite as transparent cache for several tables in PostgreSQL RDBMS and to query that cache using Ignite SQL?
For example like this:
Application (via SQL) ---> Apache Ignite (if data is not loaded) ---> Postgresql
I'm new to Ignite and cannot figure out how to do this, or whether it's even possible.
Ignite's SQL works over in-memory data only, so you need to load data into caches beforehand. In other words, read-through doesn't work for SQL queries.
Starting with version 2.1, Ignite provides its own persistent store that allows running SQL queries against data both in memory and on disk.
It will work if you preload the data into the cache before querying.
You can do that by configuring a CacheStore and calling IgniteCache#loadCache(). Here is the documentation: https://apacheignite.readme.io/v2.3/docs/3rd-party-store#cachestore
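For example (a sketch; it assumes a cache named personCache configured with a CacheStore over the PostgreSQL table and with a hypothetical Person class registered as a query type):

    import java.util.List;

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.query.SqlFieldsQuery;

    public class PreloadAndQuery {
        public static void main(String[] args) {
            try (Ignite ignite = Ignition.start("ignite-config.xml")) { // hypothetical config file
                IgniteCache<Long, Person> cache = ignite.cache("personCache");

                // Preload all rows from PostgreSQL into memory through the CacheStore.
                cache.loadCache(null);

                // SQL now runs against the in-memory copy.
                List<List<?>> rows = cache.query(
                        new SqlFieldsQuery("SELECT name FROM Person WHERE id = ?").setArgs(42L))
                        .getAll();
                System.out.println(rows);
            }
        }
    }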
Another option is to enable the readThrough parameter and use the cache API. Unfortunately, this option has no effect on Ignite SQL and works for the cache API only.

Bidirectional Replication on Postgresql 9.3

I am using Postgres 9.3 on my production servers. I would like to achieve high availability of the Postgres DB using a Master-Master configuration where each master runs in Active-Active mode with bidirectional replication.
I have 2 Java Spring REST web services pointed at 2 separate database engines, each having its own storage. Each web service points to its own database plus the other one in an HA configuration.
Now if either of the databases fails, I want the active database server to keep working, and when the failed one recovers, the data should be synced back to it.
I tried doing bidirectional replication using Bucardo 5.3.1, but the recovered database does not get updated with the new data and the Bucardo syncs need to be kicked off again (see bug: https://github.com/bucardo/bucardo/issues/88).
Is there any way I can achieve this with some other bidirectional replication tool?
Or is there any other way I can have 2 Postgres engines pointing to shared storage running in an Active-Active configuration?
2ndQuadrant released Postgres BDR, which is a patched version of PostgreSQL that can do multi-master replication using logical WAL decoding. You will find more information here: https://www.2ndquadrant.com/fr/resources/bdr/
I have finally decided to move to EnterpriseDB's Postgres (a paid licence), which provides replication tools via a GUI that are easy to use and configure.

Using EhCache as the main Datasource instead of a Database

Is it reliable to use Ehcache as a datasource instead of a database?
My business functionality will be to periodically collect information from a running application, store it in the Ehcache cache, and then retrieve and display statistics about the collected information by querying the cache with the Ehcache Search API. The cache will only need to keep the last 30-45 days of data.
What do you think about this approach?
Ehcache could be an acceptable solution, assuming TTI, TTL and other parameters are set according to your business needs. There shouldn't be any reliability issue per se. A SQL database affords options for transactional commits, complex queries and relational support which, of course, aren't provided by Ehcache by itself.
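To make that concrete, here is a sketch using the Ehcache 2.x Search API (the cache name, attribute name and sizes are placeholders, and the cached value class is assumed to expose a matching timestamp getter or field):

    import net.sf.ehcache.Cache;
    import net.sf.ehcache.CacheManager;
    import net.sf.ehcache.config.CacheConfiguration;
    import net.sf.ehcache.config.SearchAttribute;
    import net.sf.ehcache.config.Searchable;
    import net.sf.ehcache.search.Attribute;
    import net.sf.ehcache.search.Results;

    public class MetricsCacheDemo {
        public static void main(String[] args) {
            // Entries expire after ~45 days; the "timestamp" attribute is indexed
            // for the Search API and is extracted from the cached value class.
            CacheConfiguration config = new CacheConfiguration("metrics", 100_000)
                    .timeToLiveSeconds(45L * 24 * 60 * 60);
            Searchable searchable = new Searchable();
            searchable.addSearchAttribute(new SearchAttribute().name("timestamp"));
            config.addSearchable(searchable);

            CacheManager manager = CacheManager.create();
            manager.addCache(new Cache(config));
            Cache metrics = manager.getCache("metrics");

            // Query the last 30 days of collected data.
            Attribute<Long> ts = metrics.getSearchAttribute("timestamp");
            Results last30Days = metrics.createQuery()
                    .addCriteria(ts.ge(System.currentTimeMillis() - 30L * 24 * 60 * 60 * 1000))
                    .includeValues()
                    .execute();
            System.out.println(last30Days.size());
        }
    }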

When should I use the JDBC Persistence Adapter in ActiveMQ?

Reading the ActiveMQ documentation (we are using the 5.3 release), I find a section about the possibility of using a JDBC persistence adapter with ActiveMQ.
What are the benefits? Does it provide any gain in performance or reliability? When should I use it?
In my opinion, you would use JDBC persistence if you wanted to have a failover broker and you could not use the file system. The JDBC persistence was significantly slower (during our tests) than journaling to the file system. For a single broker, the journaled file system is best.
If you are running two brokers in an active/passive failover, the two brokers must have access to the same journal / data store so that the passive broker can detect and take over if the primary fails. If you are using the journaled file system, then the files must be on a shared network drive of some sort, using NFS, WinShare, iSCSI, etc. This usually requires a higher-end NAS device if you want to eliminate the file share as a single point of failure.
The other option is that you can point both brokers to the database, which most applications already have access to. The tradeoff is usually simplicity at the price of performance, as the journaled JDBC persistence was slower in our tests.
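For reference, pointing an embedded broker at a database looks roughly like this in Java (a sketch; the connection details are made up, and production brokers are usually configured via activemq.xml instead):

    import org.apache.activemq.broker.BrokerService;
    import org.apache.activemq.store.jdbc.JDBCPersistenceAdapter;
    import org.apache.commons.dbcp.BasicDataSource;

    public class JdbcBroker {
        public static void main(String[] args) throws Exception {
            // DataSource for the shared message store.
            BasicDataSource ds = new BasicDataSource();
            ds.setDriverClassName("com.mysql.jdbc.Driver");
            ds.setUrl("jdbc:mysql://localhost/activemq");
            ds.setUsername("amq");
            ds.setPassword("secret");

            // Embedded broker using the JDBC persistence adapter instead of the
            // journaled file store; a second broker pointed at the same database
            // waits on the database lock and takes over if this one dies.
            BrokerService broker = new BrokerService();
            broker.setBrokerName("jdbcBroker");
            JDBCPersistenceAdapter jdbc = new JDBCPersistenceAdapter();
            jdbc.setDataSource(ds);
            broker.setPersistenceAdapter(jdbc);
            broker.start();
        }
    }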
We run ActiveMQ in an active/passive broker pair with journaled persistence via an NFS mount to a dedicated NAS device, and it works very well for us. We are able to process over 600 msgs/sec through our system with no issues.
The use of journaled JDBC seems better than plain JDBC persistence, since journaling is much faster than JDBC persistence alone. It is also better than journaled persistence alone because you have an additional backup of the messages in the DB. Journaled JDBC has the further advantage that the same data in the journal is later persisted to the DB, where it can be accessed by developers when needed!
However, when you are using a master/slave ActiveMQ topology with journaled JDBC, you might end up losing messages, since there might be messages in the journal that have not yet made it into the DB!
If you have a redelivery plugin policy in place and use a master/slave setup, the scheduler is used for the redelivery.
As of today, the scheduler can only be set up on a file database, not on JDBC. If you do not pay attention to that, you will take all messages that are in redelivery out of the HA scenario and leave them local to the broker.
https://issues.apache.org/jira/browse/AMQ-5238 is an issue in the Apache issue tracker asking for a JDBC persistence adapter for the scheduler DB. You can place a vote for it to make it happen.
Actually, even with the top AMQ HA solution, LevelDB+ZooKeeper, the scheduler is taken out of the game and is documented to create issues (see http://activemq.apache.org/replicated-leveldb-store.html at the end of the page).
Therefore, in a JDBC scenario, how to set up the datastore for the redelivery policy can be considered unsafe and unsupported, or at the very least not clearly documented.
