Loading and persisting Hazelcast distributed map - java

I am using Hazelcast (3.8.1) as a cache service in my application. After going through the Hazelcast documentation, I have a few questions:
If we use write-behind persistence, writes are asynchronous: each update goes into a local queue, from which it is eventually persisted to the database.
My question is: if all the nodes go down, will data be lost in this scenario?
Note: I understand that a copy of the queue is also maintained on a backup node. But in the scenario where all the nodes go down, can we lose data?
Does Hazelcast persist data offline when it goes down and load it back when it is started [for all the nodes]?
Appreciate responses.

The answer to 1 is obvious, and is applicable to any in-memory system with asynchronous writes. If all nodes in your cluster go down, then yes, there's potential for data loss as your system is only eventually consistent.
For question 2: Hazelcast is an in-memory cache, and therein lie its primary benefits. Writing to or loading from persistent storage should be secondary, because it conflicts with the main attribute of a caching system: speed.
With that said, it does allow you to load from and write to persistent storage, either synchronously (write-through) or asynchronously (write-behind).
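For reference, write-behind is enabled per map via a map-store element in hazelcast.xml; a sketch (the store class name is a placeholder for your own MapStore implementation that writes to the database):

```xml
<hazelcast>
  <map name="ordersMap">
    <map-store enabled="true">
      <!-- your MapStore implementation that persists entries to the DB -->
      <class-name>com.example.OrderMapStore</class-name>
      <!-- 0 = write-through (synchronous); > 0 = write-behind with this delay -->
      <write-delay-seconds>5</write-delay-seconds>
    </map-store>
  </map>
</hazelcast>
```

With `write-delay-seconds` greater than zero, entries sit in the in-memory write-behind queue for up to that long before reaching the database, which is exactly the window in which a full-cluster crash can lose data.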
If your main reason for using Hazelcast is replication and partitioning (of persistent, consistent data), then you'd be better off using a NoSQL database such as MongoDB. This depends a lot on your usage patterns, because it may still make sense if you expect far more reads than writes.
If, on the other hand, your main reason for using it is speed, then what you need is to better manage fault-tolerance, which has more to do with your cluster topology (maybe you should have cross-datacenter replication) than with persistence. It's atypical to be concerned with "all nodes dying" in your DC unless you have strong consistency or transaction requirements.

Yes, you would lose the data in memory if it is not persisted to the database yet.
OTOH, Hazelcast has Hot Restart for persistence to disk in the Enterprise version. This helps in case of a planned shutdown of the whole cluster or a sudden cluster-wide crash, e.g., a power outage.

Related

Parallel and Transactional Processing in Java (Java EE)

I have an architectural question about how to handle big tasks both transactional and scalable in Java/Java EE.
The general challenge
I have a web application (Tomcat right now, but that should not limit the solution space, so just take this to illustrate what I'd like to achieve). This web application is distributed over several (virtual and physical) nodes, connected to a central DBMS (MySQL in this case, but again, this should not limit the solution...) and able to handle some 1000s of users, serving pages, doing stuff, just as you'd expect from your average web-based information system.
Now, there are some tasks which affect a larger portion of data and the system should be optimized to carry out these tasks reasonably fast. (Faster than processing everything sequentially, that is). So I'd make the task parallel and distribute it over several (or all) nodes:
(Note: the data portions which are processed are independent, so there are no database or locking conflicts here).
The problem is, I'd like the (whole) task to be transactional. So if one of the parallel subtasks fails, I'd like to have all other tasks rolled back as a result. Otherwise the system would be in a potentially inconsistent state from a domain perspective.
Current implementation
As I said, the current implementation uses Tomcat and MySQL. The nodes use JMS to communicate: there is a JMS server to which a dispatcher sends a message for each subtask; executors take tasks from the message queue, execute them, and post the results to a result queue, from which the dispatcher collects them. The dispatcher blocks and waits for all results to come in, and if everything is fine, it terminates with an OK status.
The problem here is that all the executors have their own local transaction context, so the picture would look like this:
If for some reason one of the subtasks fails, the local transaction is rolled back and the dispatcher gets an error result. (There is some failsafe mechanism here, which tries to repeat the failed transaction, but let's assume for some reason, the one task cannot be completed).
The problem is that the system is now in a state where all transactions but one are already committed and completed. And because I cannot get that one final transaction to finish successfully, I cannot get out of this state.
Possible solutions
These are the thoughts which I have followed so far:
I could somehow implement a domain-specific rollback mechanism myself. Because the dispatcher knows which tasks have been carried out, it could revert their effects explicitly (e.g. storing old values somewhere and reverting already committed values back to the previous ones). Of course, in this case I must guarantee that no other process changes anything in between, so I'd also have to put the system into a read-only state for as long as the big operation is running.
More or less, I'd need to simulate a transaction in business logic ...
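That domain-level rollback idea can be sketched roughly like this (all names here are hypothetical, not from any framework): each subtask commits its own local transaction and hands back a compensating action, and on failure the dispatcher replays the compensations in reverse order.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of a "simulated transaction" in business logic:
// every completed subtask registers an undo action; on failure we
// replay the undo log in reverse (LIFO) order.
public class CompensatingDispatcher {

    /** A subtask commits its own local transaction and returns an undo action. */
    public interface Subtask {
        Runnable executeAndCommit();
    }

    private final Deque<Runnable> undoLog = new ArrayDeque<>();

    /** Runs the subtasks; returns true if all succeeded, otherwise
     *  compensates the already-committed ones and returns false. */
    public boolean runAll(List<Subtask> subtasks) {
        for (Subtask task : subtasks) {
            try {
                Runnable undo = task.executeAndCommit(); // local transaction commits here
                undoLog.push(undo);                      // push for reverse replay
            } catch (RuntimeException e) {
                while (!undoLog.isEmpty()) {
                    undoLog.pop().run();                 // revert committed work
                }
                return false;
            }
        }
        return true;
    }
}
```

As the text says, this only yields consistency if nothing else modifies the affected data while the big operation runs, which is why the read-only window is needed.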
I could choose not to parallelize and do everything on a single node in one big transaction (but as stated at the beginning, I need to speed up processing, so this is not an option...)
I have tried to find out about XATransactions or distributed transactions in general, but this seems to be an advanced Java EE feature, which is not implemented in all Java EE servers, and which would not really solve that basic problem, because there does not seem to be a way to transfer a transaction context over to a remote node in an asynchronous call. (e.g. section 4.5.3 of EJB Specification 3.1: "Client transaction context does not propagate with an asynchronous method invocation. From the Bean Developer’s view, there is never a transaction context flowing in from the client.")
The Question
Am I overlooking something? Is it not possible to distribute a task asynchronously over several nodes and at the same time have a (shared) transactional state which can be rolled back as a whole?
Thanks for any pointers, hints, propositions ...
If you want to distribute your application as described, JTA is your friend in a Java EE context. Since it's part of the Java EE spec, you should be able to use it in any compliant container. As with all implementations of the spec, there are differences in the details and configuration, as with JPA for example, but in real life it's uncommon to change application servers often.
But without knowing the details and complexity of your problem, my advice is to rethink whether you really need to distribute the task execution for a single use case, or whether it's better to keep everything belonging to that one use case within one node, even though you might need several nodes for the overall application. In case you really have to use several nodes to fulfill your requirements, then I'd go for distributed tasks which do not write directly to the database, but give back results, and then commit/rollback them in the one component which initiated the tasks.
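The "compute remotely, commit centrally" suggestion could look roughly like this (a sketch with hypothetical names; the real system would dispatch the tasks over JMS rather than a local thread pool):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Subtasks only *compute* results; the initiating component then writes
// all of them to the database in one local transaction, so the overall
// operation is all-or-nothing without distributed transactions.
public class ResultCollectingDispatcher {

    public static List<String> computeAll(List<Callable<String>> tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<String> results = new ArrayList<>();
            for (Future<String> f : pool.invokeAll(tasks)) {
                results.add(f.get()); // throws if any subtask failed -> nothing gets committed
            }
            // here: open ONE database transaction, write all results, commit
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

If any subtask throws, `Future.get()` propagates the failure before the single commit phase is reached, so no partial writes ever hit the database.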
And don't forget to measure first, before over-engineering the architecture. Try to keep it simple at first, assuming that one node could handle it, and then write a stress test which tries to break your system, to learn about the maximum load it can handle with the given architecture.

Running Neo4j purely in memory without any persistence

I don't want to persist any data but still want to use Neo4j for its graph traversal and algorithm capabilities. In an embedded database, I've configured cache_type = strong, and after all the writes I set the transaction to failure. But my write speeds (node and relationship creation speeds) are slow, and this is becoming a big bottleneck in my process.
So, the question is, can Neo4j be run without any persistence aspects to it at all and just as a pure API? I tried others like JGraphT but those don't have traversal mechanisms like the ones Neo4j provides.
As far as I know, Neo4j data storage and Lucene indexes are always written to files. On Linux, at least, you could set up a ramfs file system to hold the files in memory.
See also:
Loading all Neo4J db to RAM
How many changes do you group in each transaction? You should try to group up to thousands of changes in each transaction since committing a transaction forces the logical log to disk.
However, in your case you could instead begin your transactions with:
db.tx().unforced().begin();
Instead of:
db.beginTx();
This makes the transaction not wait for the logical log to be forced to disk, which makes small transactions much faster, but a power outage could potentially lose the last couple of seconds of data.
The tx() method sits on GraphDatabaseAPI, which for example EmbeddedGraphDatabase implements.
You can try a virtual drive. It would make Neo4j persist to the drive, but it would all happen in memory.
https://thelinuxexperiment.com/create-a-virtual-hard-drive-volume-within-a-file-in-linux/

Why does Hibernate attempt to "cache" and how does this work in a clustered environment?

Say you have a 4-node J2EE application server cluster, all running instances of a Hibernate application. How does caching work in this situation? Does it do any good at all? Should it simply be turned off?
It seems to me that data on one particular node would quickly become stale, as other users hitting other nodes make changes to database data. In such a situation, how could Hibernate ever trust that its cache is up to date?
First of all, you should clarify what cache you're talking about, Hibernate has 3 of them (the first-level cache aka session cache, the second-level cache aka global cache and the query cache that relies on the second-level cache). I guess the question is about the second-level cache so this is what I'm going to cover.
How does caching work in this situation?
If you want to cache read only data, there is no particular problem.
If you want to cache read/write data, you need a cluster-safe cache implementation (via invalidation or replication).
Does it do any good at all?
It depends on a lot of things: the cache implementation, the frequency of updates, the granularity of cache regions, etc.
Should it simply be turned off?
Second-level caching is actually disabled by default. Turn it on explicitly if you want to use it.
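For reference, turning it on involves something like the following in hibernate.cfg.xml (property names are from the Hibernate 3.x/4.x era and the region factory class name varies by version, so treat this as a sketch and check your documentation):

```xml
<!-- enable the second-level cache and plug in a provider (e.g. Ehcache) -->
<property name="hibernate.cache.use_second_level_cache">true</property>
<property name="hibernate.cache.region.factory_class">
    org.hibernate.cache.ehcache.EhCacheRegionFactory
</property>
<!-- entities must also opt in, e.g. with a <cache usage="read-write"/> mapping -->
```

Note that enabling the cache globally is not enough; each entity or collection you want cached must be mapped with a cache concurrency strategy.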
It seems to me that data on one particular node would become stale quickly as other users hitting other nodes make changes to database data.
Which is why you need a cluster-safe cache implementation.
In such a situation, how could Hibernate ever trust that its cache is up to date?
Simple: Hibernate trusts the cache implementation which has to offer a mechanism to guarantee that the cache of a given node is not out of date. The most common mechanism is synchronous invalidation: when an entity is updated, the updated cache sends a notification to the other members of the cluster telling them that the entity has been modified. Upon receipt of this message, the other nodes will remove this data from their local cache, if it is stored there.
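That invalidation scheme can be illustrated with a toy, in-process model (nothing Hibernate-specific; real providers such as Ehcache or Infinispan send these messages over the network):

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of synchronous cache invalidation across cluster nodes:
// an update on one node evicts the stale entry from every peer's cache,
// forcing the peers to re-read from the database on the next access.
public class InvalidatingCache {
    private final Map<String, Object> local = new ConcurrentHashMap<>();
    private List<InvalidatingCache> peers = List.of();

    public void setPeers(List<InvalidatingCache> peers) { this.peers = peers; }

    public Object get(String key) { return local.get(key); }

    public void put(String key, Object value) {
        local.put(key, value);
        for (InvalidatingCache peer : peers) {
            peer.local.remove(key); // the "invalidation message" to other nodes
        }
    }
}
```

A miss after eviction is the signal to go back to the database, which is how a node's cache never serves data it knows another node has modified.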
First of all, there are 2 caches in Hibernate.
There is the first-level cache, which you cannot remove, and which is called the Hibernate session. Then there is the second-level cache, which is optional and pluggable (e.g. Ehcache). It works across many requests and, most probably, it's the cache you are referring to.
If you work in a clustered environment, then you need a 2nd level cache which can replicate changes across the members of the cluster. Ehcache can do that. Caching is a hard topic and you need a deep understanding in order to use it without introducing other problems. Caching in a clustered environment is slightly more difficult.

Sharing nHibernate and hibernate 2nd level cache

Is it possible to share the 2nd level cache between a hibernate and nhibernate solution? I have an environment where there are servers running .net and servers running java who both access the same database.
there is some overlap in the data they access, so sharing a 2nd level cache would be desirable. Is it possible?
If this is not possible, what are some of the solutions other have come up with?
There is some overlap in the data they access, so sharing a 2nd level cache would be desirable. Is it possible?
This would require (and this is very likely oversimplified):
Being able to access a cache from Java and .Net.
Having cache provider implementations for both (N)Hibernate.
Being able to read/write data in a format compatible with both languages (or there is no point at mutualizing the cache).
This sounds feasible but:
I'm not aware of an existing ready-to-use solution implementing this (my first idea was Memcache but AFAIK Memcache stores a serialized version of the data so this doesn't meet the requirement #3 which is the most important).
I wonder if using a language neutral format to store data would not generate too much overhead (and somehow defeat the purpose of using a cache).
If this is not possible, what are some of the solutions other have come up with?
I never had to do this but if we're talking about a read-write cache and if you use two separate caches, you'll have to invalidate a given Java cache region from the .Net side and inversely. You'll have to write the code to handle that.
As Pascal said, it's unlikely that sharing the 2nd level cache between Hibernate and NHibernate is technically feasible.
However, you can think about this from a different perspective.
It's unlikely that both applications read and write the same data. So, instead of sharing the cache, what you could implement is a cache invalidation service (using the communications stack of your choice).
Example:
Application A mostly reads Customer data and writes Invoice data
Application B mostly reads Invoice data and writes Customer data
Therefore, Application A caches Customer data and Application B caches Invoice data
When Application A, for example, modifies an invoice, it sends a message to Application B and tells it to evict the invoice from the cache.
You can also evict whole entity types, collections and regions.
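A sketch of what such an invalidation message might look like (a hypothetical format; any transport, such as a message queue, would work as long as both sides agree on entity names and ids):

```java
// A minimal, language-neutral eviction message that both the Java and
// .NET applications can produce and parse.
public class EvictionMessage {

    public static String format(String entity, long id) {
        return "EVICT|" + entity + "|" + id;
    }

    // On receipt, the Hibernate side would evict the entity from its
    // second-level cache (and the NHibernate side would do the equivalent).
    public static String[] parse(String message) {
        String[] parts = message.split("\\|");
        if (parts.length != 3 || !parts[0].equals("EVICT")) {
            throw new IllegalArgumentException("not an eviction message: " + message);
        }
        return parts;
    }
}
```

Keeping the format to plain delimited text (or JSON) sidesteps requirement #3 from the first answer: nothing language-specific is ever serialized, only instructions about what to evict.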

What architecture? Distribute content building across a cluster

I am building a content-serving application composed of a cluster of two types of nodes: ContentServers and ContentBuilders.
The idea is to always serve fresh content. Content is fresh if it was built recently, i.e. Content.buildTime < MAX_AGE.
Requirements:
*ContentServers will only have to lookup content and serve it up (e.g. from a distributed cache or similar), no waiting for anything to be built except on first request for each item of Content.
*ContentBuilders should be load balanced, should rebuild content just before it expires, and should only build content that is actually being requested. The built content should be quickly retrievable by all ContentServers.
What architecture should I use? I'm currently thinking a distributed cache (EhCache maybe) to hold the built content and a messaging queue (JMS/ActiveMQ maybe) to relay the Content requests to builders though I would consider any other options/suggestions. How can I be sure that the ContentBuilders will not build the same thing at the same time and will only build content when it nears expiry?
Thanks.
Honestly I would rethink your approach and I'll tell you why.
I've done a lot of work on distributed high-volume systems (financial transactions specifically), and if the volume is sufficiently high (I'll assume it is, or you wouldn't be contemplating a clustered solution; you can get an awful lot of power out of one off-the-shelf box these days), your solution will kill you with remote calls (i.e. calls for data from another node).
I will speak about Tangosol/Oracle Coherence here because it's what I've got the most experience with, although Terracotta will support some or most of these features and is free.
In Coherence terms, what you have is a partitioned cache where, if you have n nodes, each node owns 1/n of the total data. Typically you have redundancy of at least one level, and that redundancy is spread as evenly as possible, so each of the other n-1 nodes holds 1/(n-1) of the backup data.
The idea in such a solution is to try to make sure that as many cache hits as possible are local (to the same cluster node). Also, with partitioned caches in particular, writes are relatively expensive (and get more expensive the more backup copies you keep of each cache entry), although write-behind caching can minimize this, and reads are fairly cheap (which is what you want given your requirements).
So your solution is going to ensure that every cache hit will be to a remote node.
Also consider that generating content is undoubtedly much more expensive than serving it, which I'll assume is why you came up with this idea because then you can have more content generators than servers. It's the more tiered approach and one I'd characterize as horizontal slicing.
You will achieve much better scalability if you can vertically slice your application. By that I mean that each node is responsible for storing, generating and serving a subset of all the content. This effectively eliminates internode communication (excluding backups) and allows you to adjust the solution by simply giving each node a different sized subset of the content.
Ideally, whatever scheme you choose for partitioning your data should be reproducible by your Web server so it knows exactly which node to hit for the relevant data.
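That vertical-slicing scheme can be sketched as follows (hypothetical code; any deterministic hash works, as long as the web tier uses the same function as the cluster):

```java
// Deterministic routing: every content key maps to exactly one node, so
// that content is stored, generated and served by the same node, and the
// web tier can compute the owner without any cluster lookup.
public class Partitioner {
    private final int nodeCount;

    public Partitioner(int nodeCount) { this.nodeCount = nodeCount; }

    public int ownerOf(String contentKey) {
        // Math.floorMod avoids negative results for negative hash codes
        return Math.floorMod(contentKey.hashCode(), nodeCount);
    }
}
```

One caveat with this naive modulo scheme: changing the node count reshuffles almost every key to a new owner, which is why production systems usually use consistent hashing instead.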
Now you might have other reasons for doing it the way you're proposing but I can only answer this in the context of available information.
I'll also point you to a summary of grid/cluster technologies for Java I wrote in response to another question.
You may want to try Hazelcast. It is open source, peer2peer, distributed/partitioned map and queue with eviction support. Import one single jar, you are good to go! Super simple.
If the content building can be parallelized (builder 1 does 1..1000, builder 2 does 1001..2000) then you could create a configuration file to pass this information. A ContentBuilder will be responsible for monitoring its area for expiration.
If this is not possible, then you need some sort of manager to orchestrate the content building. This manager can also play the role of the load balancer. The manager can be bundled together with a ContentBuilder or be a node of its own.
I think that the ideas of the distributed cache and the JMS messaging are good ones.
It sounds like you need some form of distributed cache, distributed locking and messaging.
Terracotta gives you all three - a distributed cache, distributed locking and messaging, and your programming model is just Java (no JMS required).
I wrote a blog about how to ensure that a cache only ever populates its contents once and only once here: What is a memoizer and why you should care about it.
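The populate-once guarantee referenced there is commonly implemented with a map of Futures (the memoizer pattern from Goetz's Java Concurrency in Practice); a single-JVM sketch, with a cluster needing distributed locking on top:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.function.Function;

// Even if several threads ask for the same key at the same moment, only
// one FutureTask is installed in the map, so the expensive content build
// runs exactly once; everyone else blocks on the same result.
public class Memoizer<K, V> {
    private final ConcurrentMap<K, FutureTask<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> builder;

    public Memoizer(Function<K, V> builder) { this.builder = builder; }

    public V get(K key) throws InterruptedException, ExecutionException {
        FutureTask<V> task = cache.computeIfAbsent(key,
                k -> new FutureTask<>(() -> builder.apply(k)));
        task.run(); // no-op if the task already ran, or is running elsewhere
        return task.get();
    }
}
```

Expiry would be handled by removing the entry (so the next request installs a fresh task), which keeps the "build just before expiry" logic in one place.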
I am in agreement with Cletus: if you need high performance, you will need to consider partitioning. However, unlike most solutions, Terracotta will work just fine without partitioning until you need it, and then when you apply partitioning it will just divvy up the work according to your partitioning algorithm.
