Sharing NHibernate and Hibernate 2nd level cache - java

Is it possible to share the 2nd level cache between a Hibernate and NHibernate solution? I have an environment where there are servers running .NET and servers running Java that both access the same database.
There is some overlap in the data they access, so sharing a 2nd level cache would be desirable. Is it possible?
If this is not possible, what are some of the solutions others have come up with?

There is some overlap in the data they access, so sharing a 2nd level cache would be desirable. Is it possible?
This would require (and this is very likely oversimplified):
1. Being able to access the cache from both Java and .NET.
2. Having cache provider implementations for both Hibernate and NHibernate.
3. Being able to read/write data in a format compatible with both languages (otherwise there is no point in sharing the cache).
This sounds feasible but:
I'm not aware of an existing ready-to-use solution implementing this (my first idea was Memcache, but AFAIK Memcache stores a serialized version of the data, so this doesn't meet requirement #3, which is the most important).
I wonder whether using a language-neutral format to store the data would generate too much overhead (and somehow defeat the purpose of using a cache).
If this is not possible, what are some of the solutions others have come up with?
I never had to do this, but if we're talking about a read-write cache and you use two separate caches, you'll have to invalidate a given Java cache region from the .NET side and vice versa. You'll have to write the code to handle that.

As Pascal said, it's improbable that sharing the 2nd level cache is technically possible.
However, you can think about this from a different perspective.
It's unlikely that both applications read and write the same data. So, instead of sharing the cache, what you could implement is a cache invalidation service (using the communications stack of your choice).
Example:
Application A mostly reads Customer data and writes Invoice data
Application B mostly reads Invoice data and writes Customer data
Therefore, Application A caches Customer data and Application B caches Invoice data
When Application A, for example, modifies an invoice, it sends a message to Application B and tells it to evict the invoice from the cache.
You can also evict whole entity types, collections and regions.
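To make the eviction side concrete, here is a minimal sketch of the Java half of such an invalidation service. It assumes Hibernate's org.hibernate.Cache API (method names vary a bit across versions); the listener class itself and the message transport that would call it (JMS, HTTP, whatever you use between the two platforms) are hypothetical:

    import java.io.Serializable;

    import org.hibernate.Cache;
    import org.hibernate.SessionFactory;

    // Hypothetical listener: wire it to whatever transport carries
    // invalidation messages from the .NET side.
    public class CacheInvalidationListener {

        private final SessionFactory sessionFactory;

        public CacheInvalidationListener(SessionFactory sessionFactory) {
            this.sessionFactory = sessionFactory;
        }

        // Evict a single entity instance, e.g. one modified Invoice.
        public void onEntityInvalidated(Class<?> entityType, Serializable id) {
            Cache cache = sessionFactory.getCache();
            cache.evictEntity(entityType, id);
        }

        // Evict every cached instance of an entity type.
        public void onEntityTypeInvalidated(Class<?> entityType) {
            sessionFactory.getCache().evictEntityRegion(entityType);
        }

        // Evict a cached collection role, e.g. "Customer.invoices".
        public void onCollectionInvalidated(String role) {
            sessionFactory.getCache().evictCollectionRegion(role);
        }
    }

The .NET side would do the mirror image with NHibernate's cache API when Application A modifies data that Application B caches.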

Related

Keep data in memory instead of saving in database - java

I want to run a Java function that generates a JSON document of about 1 MB. I need that JSON as input for the next call, which happens 2 minutes later. I can save the JSON in a database, but the time spent saving it to the database is not acceptable for me. Can I keep this data in memory and use it for the next call? I also need to read this data from Node.js. How can I do this?
Why don't you use a persistent asynchronous queue between your application and your database? This way you just fire and forget the persist operation and serve the result as fast as possible.
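A minimal in-process sketch of the fire-and-forget idea (a truly persistent queue would use something like JMS instead of an in-memory BlockingQueue; saveToDatabase is a placeholder, not a real API):

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    public class AsyncPersister {

        private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        public AsyncPersister() {
            Thread writer = new Thread(() -> {
                try {
                    while (true) {
                        // The slow database write happens off the request path.
                        saveToDatabase(queue.take());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            writer.setDaemon(true);
            writer.start();
        }

        // Fire and forget: returns immediately, persistence happens later.
        public void persistAsync(String json) {
            queue.add(json);
        }

        private void saveToDatabase(String json) {
            // placeholder for the actual (slow) database write
        }
    }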
If you also want to keep the object in memory, your best bet would be something like Infinispan or Hazelcast. Infinispan offers its own persistent store for the cache and good database integration. Hazelcast, on the other hand, works more as an in-memory key-value store, but some persistence can easily be implemented with it as well. Hazelcast is very easy to start with and the learning curve is not that steep.
The good thing about this infrastructure is the assurance that your data is in sync with the database. For example, you can configure how many backups of a particular object are kept, and these backups are created asynchronously or synchronously depending on how you configure them. You can also write the data through to the database. If persistence is a strong requirement, Infinispan is probably better in this regard.
On reading your post a second time, I realized that maybe you need something significantly simpler when it comes to caching. If you just need a local cache with no backup capabilities, just go for EHCache.
Hazelcast is a Java library that provides APIs to solve the caching use case.
It extends Java collections with capabilities suited to caching: eviction, TTL for entries, read-through and write-through caching, etc.
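A minimal sketch of the hand-off described in the question, using a Hazelcast distributed map with a TTL (package names are from Hazelcast 3.x and may differ in later versions; the map and key names are made up):

    import java.util.concurrent.TimeUnit;

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.IMap;

    public class JsonHandoff {

        public static void main(String[] args) {
            // Start (or join) an in-memory cluster; no database involved.
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();
            IMap<String, String> jsonCache = hz.getMap("json-cache");

            // Keep the ~1 MB JSON around long enough for the next call
            // two minutes later; the TTL makes cleanup automatic.
            jsonCache.put("latest-result", "{\"example\":true}", 5, TimeUnit.MINUTES);

            // The next call (possibly another JVM, or the Hazelcast
            // Node.js client) reads it back by key.
            String json = jsonCache.get("latest-result");
            System.out.println(json);
        }
    }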
For the Node.js use case, please see my answer here: https://stackoverflow.com/a/36704734/27563
Let me know if you have any questions.
Thanks

Are transactions on top of a "normal file system" possible?

It seems to be possible to implement transactions on top of normal file systems using techniques like write-ahead logging, two-phase commit, and shadow paging.
Indeed, it must have been possible because a transactional database engine like InnoDB can be deployed on top of a normal file system. There are also libraries like XADisk.
However, the Apache Commons Transaction documentation states:
...we are convinced that the main advertised feature transactional file access can not be implemented reliably. We are convinced that no such implementation can be possible on top of an ordinary file system. ...
Why did Apache Commons Transaction claim that implementing transactions on top of normal file systems is impossible?
Is it impossible to do transactions on top of normal file systems?
Windows offers transactions on top of NTFS. See the description here: http://msdn.microsoft.com/en-us/library/windows/desktop/bb968806%28v=vs.85%29.aspx
It's not recommended for use at the moment and there's an extensive discussion of alternative scenarios right in MSDN: http://msdn.microsoft.com/en-us/library/windows/desktop/hh802690%28v=vs.85%29.aspx .
Also, if you take a definition of a file system, a DBMS is itself a kind of file system, and a file system (like NTFS or ext3) can be implemented on top of (or inside) a DBMS as well. So Apache's statement is a bit, hmm, incorrect.
This answer is pure speculation, but you may be comparing apples and oranges. Or perhaps more accurately, milk and dairy products.
When a database uses a file system, it is only using a small handful of predefined files (per database). These include data files and log files. The one operation that is absolutely necessary for ACID-compliant transactions is the ability to force a write to permanent storage (either disk or static RAM). And I think most file systems provide this capability.
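In Java, for example, that forced write is exposed through FileChannel.force (a thin wrapper over fsync). A minimal sketch of appending a commit record to a log and forcing it to stable storage (the file and record contents are made up):

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Paths;

    import static java.nio.file.StandardOpenOption.APPEND;
    import static java.nio.file.StandardOpenOption.CREATE;
    import static java.nio.file.StandardOpenOption.WRITE;

    public class ForcedWrite {

        public static void main(String[] args) throws IOException {
            try (FileChannel log = FileChannel.open(
                    Paths.get("redo.log"), CREATE, WRITE, APPEND)) {
                log.write(ByteBuffer.wrap(
                        "COMMIT txn=42\n".getBytes(StandardCharsets.UTF_8)));
                // Block until data and metadata reach the device: the
                // primitive a write-ahead log depends on.
                log.force(true);
            }
        }
    }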
With this mechanism, the database can maintain locks on objects in the database as well as control access to all objects. Happily, the database has layers of memory/page management built on top of the file system. The "database" itself is written in terms of things like pages, tables, and indexes, not files, directories, and disk blocks.
A more generic transactional system has other challenges. It would need, for instance, atomic actions across more kinds of operations. E.g., if you "transactionally" delete 10 files, all of them would have to disappear at the same time. I don't think "traditional" file systems have this capability.
In the database world, the equivalent would be deleting 10 tables. Well, you essentially create new versions of the system tables without the tables — within a transaction, while the old tables are being used. Then you put a full lock on the system tables (preventing reads and writes), waiting until they are available. Then you swap in the new table definitions (i.e. without the tables), unlock the tables, and clean up the data. (This is intended as an intuitive view of the locking mechanism in this case, not a 100% accurate description.)
So, notice that locking and transactions are deeply embedded in the actions the database is doing. I suspect that the authors of this module came to realize that they would basically have to re-implement all existing file system functionality to support their transactions, and that was a bit too much scope to take on.

How is Serializable related to caching a Java command class?

I am working on IBM WebSphere Commerce (WCS). In this framework we have an option to cache our command classes; basically they are just Java classes. While adding a new cache entry, I learned that these Java classes must be serializable (implement the java.io.Serializable interface). Why is that?
Is caching basically saving the output of some execution? In that case, will it save the sequence of bytes generated by serialization, and whenever a request for that cached object comes in, just deserialize and return the object without executing the actual program? Can anyone please share knowledge about this?
Thanks in advance,
Santosh
For caching the result of a method execution and returning it for subsequent calls, serialization is not needed.
The most likely reason it needs to be Serializable is that when you cache some data in a clustered environment, changes made to the cached data on one node have to be replicated to the other nodes of the cluster. For this replication, the data needs to be serialized and sent across to another node using some remoting API.
The other reason for requiring the class to be serializable is that the cache implementation might overflow the data to disk. In this case too, the objects in the cache need to be converted to a form that can be stored on disk and recreated.
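To illustrate, this is essentially the round trip a replicating or disk-overflowing cache performs internally (the CachedResult class is a made-up example of a cached value, using plain Java serialization):

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;

    public class SerializationRoundTrip {

        // The cached class must implement Serializable so the cache can
        // turn instances into bytes for replication or disk overflow.
        static class CachedResult implements Serializable {
            private static final long serialVersionUID = 1L;
            final String payload;
            CachedResult(String payload) { this.payload = payload; }
        }

        public static void main(String[] args) throws Exception {
            // Serialize: what the cache does before replicating/spooling.
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new CachedResult("hello"));
            }
            // Deserialize: what the cache does when the entry is read back.
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                CachedResult copy = (CachedResult) in.readObject();
                System.out.println(copy.payload); // prints "hello"
            }
        }
    }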
The following is a passage from ehcache documentation that explains the overflow scenario in more detail.
When an element is added to a cache and it goes beyond its maximum memory size, an existing element is either deleted, if overflowToDisk is false, or evaluated for spooling to disk, if overflowToDisk is true.
In the latter case, a check for expiry is carried out. If it is expired it is deleted; if not it is spooled. The eviction of an item from the memory store is based on the MemoryStoreEvictionPolicy setting specified in the configuration file.
Serialization saves the actual object itself, in its current state.
The reason why is due to WebSphere Commerce's use of WebSphere Application Server's Dynacache feature. WAS Dynacache is an in-memory Java cache that is very similar to a built-in memcached. Out of the box, the starter store uses Dynacache to cache JSPs, servlets, controllers, commands, command tasks and other Java objects. There is also caching done on the DB side. This is why, in performance tests, IBM scales much better at high volumes than other software.

Is it possible to save state between requests in GAE/Java?

I plan to implement a GAE app only for my own usage.
The application will get its data using the URL Fetch service, updating it every x minutes (using scheduled tasks). Then it will serve that information to me when I request it.
I have barely started to look into GAE, but there is one main question I haven't been able to clear up. Can state be maintained in GAE between different requests without using JDO/JPA and the datastore?
As I am the only user, I guess I could keep the info in a servlet subclass and so avoid having to deal with the datastore... but my concern is that, as this app will get very few requests, if the instance is evicted from memory or whatever (I don't know yet if this has a specific name), it will lose its state.
I am not concerned about having to restart the whole app and start collecting data from scratch from time to time, that is ok.
If this is an app for your own use, and you're double-extra sure that you won't be making it multi-user, and you're not concerned about the possibility that you might be using it from two browsers at once, you can skip using sessions and use a known key for storing information in memcache.
If your reason for avoiding the datastore is concern over performance, then I strongly recommend testing that assumption. You may be pleasantly surprised.
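A minimal sketch of the known-key approach using App Engine's low-level memcache API (the class, key name and expiration are made up for illustration; remember the value can be evicted at any time):

    import com.google.appengine.api.memcache.Expiration;
    import com.google.appengine.api.memcache.MemcacheService;
    import com.google.appengine.api.memcache.MemcacheServiceFactory;

    public class FetchStateStore {

        private static final String KEY = "latest-fetch-result";

        private final MemcacheService cache =
                MemcacheServiceFactory.getMemcacheService();

        // Called from the scheduled task after the URL fetch.
        public void save(String fetchedData) {
            // Single user, so a fixed key is fine; the expiration is a
            // safety net in case the scheduled task stops refreshing it.
            cache.put(KEY, fetchedData, Expiration.byDeltaSeconds(3600));
        }

        // Called from the request handler; may return null if evicted.
        public String load() {
            return (String) cache.get(KEY);
        }
    }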
You could use the http session to maintain state between requests, but that will use the datastore itself (although you won't have to write any code to get this behaviour).
You might also consider using the Cache API (i.e. memcache). It's JSR 107, I think, which Google provides an implementation of. The cache is shared between instances, but it can be emptied at any time. If you're happy with that behaviour, this may be an option. Looking at your requirements, this may be the most feasible option if you don't want to write your own persistence code.
You could store data in a static field on your class or in an application-scoped object, but doing that means that when your instance spins down, or requests switch to another instance, the data would be lost, as your classes would be loaded fresh into the new instance.
Or you could serialize the state to the client and send it back in with each request.
The most robust option is persistence to the datastore - the JPA code is trivial. Perhaps you should reconsider?

Strategy for Offline/Online data synchronization

My requirement is: I have a server J2EE web application and a client J2EE web application. Sometimes the client can go offline. When the client comes back online, it should be able to synchronize changes to and fro. I should also be able to control which rows/tables need to be synchronized based on some filters/rules. Are there any existing Java frameworks for doing this? If I need to implement it on my own, what strategies can you suggest?
One solution I have in mind is maintaining SQL logs and executing the same statements on the other side during synchronization. Do you see any problems with this strategy?
There are a number of Java libraries for data synchronization/replication. Two that I'm aware of are Daffodil and SymmetricDS. In a previous life I foolishly implemented (in Java) my own data replication process. It seems like the sort of thing that should be fairly straightforward, but if the data can be updated in multiple places simultaneously, it's hellishly complicated. I strongly recommend you use one of the aforementioned projects to try and bypass dealing with this complexity yourself.
The biggest issue with synchronization is when the user edits something offline and it is edited online at the same time. You need to merge the two changed pieces of data, or deal with the UI to allow the user to say which version is correct. If you eliminate the possibility of both being edited at the same time, then you don't have to solve this sticky problem.
The usual method is to add a 'modified' field to all tables and to compare the client's modified field for a given record against the server's. If they don't match, you replace the server's data.
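A sketch of that comparison, extended with the conflict case from the previous answer (the decide method and the per-record lastSync timestamp it relies on are assumptions for illustration, not a standard API):

    import java.sql.Timestamp;

    public class RecordSync {

        public enum Action { PUSH_TO_SERVER, PULL_TO_CLIENT, CONFLICT, NO_OP }

        // Compare 'modified' timestamps against the time of the last
        // successful sync to decide what to do with one record.
        public static Action decide(Timestamp clientModified,
                                    Timestamp serverModified,
                                    Timestamp lastSync) {
            boolean clientChanged = clientModified.after(lastSync);
            boolean serverChanged = serverModified.after(lastSync);
            if (clientChanged && serverChanged) return Action.CONFLICT; // human mediates
            if (clientChanged) return Action.PUSH_TO_SERVER;
            if (serverChanged) return Action.PULL_TO_CLIENT;
            return Action.NO_OP;
        }
    }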
Be careful with autogenerated keys - you need to make sure your data integrity is maintained when you copy from the client to the server. Strictly running the SQL statements again on the server could put you in a situation where the autogenerated key has changed, and suddenly your foreign keys are pointing to different records than you intended.
Often when importing data from another source, you keep track of the primary key from the foreign source as well as your own personal primary key. This makes determining the changes and differences between the data sets easier for difficult synchronization situations.
Your synchronizer needs to identify when data can just be updated and when a human being needs to mediate a potential conflict. I have written a paper that explains how to do this using logging and algebraic laws.
What is best suited as the client-side data store in your application? You can choose from an embedded database like SQLite, a message queue, or some object store, or (if none of these can be used, since it is a web application) files/documents saved on the client using HTML5 storage such as Web SQL Database, IndexedDB, or localStorage.
Check out the paper Gold Rush: Mobile Transaction Middleware with Java-Object Replication. Microsoft's documentation on occasionally connected systems describes two approaches: service-oriented or message-oriented, and data-oriented. Gold Rush takes the former approach. The latter approach uses database merge replication.
