I need to keep a client in sync with a PostgreSQL database (only the data that is loaded from the database, not the entire database; there are 50+ tables and a lot of collections inside entities). Since I recently added a Spring REST API server to my application, I could perhaps manage those changes differently or more efficiently, with less work. Until now my approach has been to add a PostgreSQL notification trigger:
CREATE TRIGGER extChangesOccured
AFTER INSERT OR UPDATE OR DELETE ON xxx_table
FOR EACH ROW EXECUTE PROCEDURE notifyUsers();
The client then receives a JSON payload built as:
json_build_object(
    'table', TG_TABLE_NAME,
    'action', TG_OP,
    'id', data,
    'session', session_app_name);
The client compares whether the change was made by itself or by another client and, if it came from elsewhere, fetches the new data from the database.
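On the Java side, the listening end could look roughly like the sketch below, using the pgjdbc driver. The channel name "entity_changes", the connection details and the SESSION_NAME value are assumptions; they must match whatever pg_notify() and the session variable use inside notifyUsers().

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

import org.postgresql.PGConnection;
import org.postgresql.PGNotification;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class ChangeListener {

    // Assumed identifier that this client also sets as session_app_name on its own connection.
    private static final String SESSION_NAME = "client-42";

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/mydb", "user", "password")) {

            try (Statement st = conn.createStatement()) {
                st.execute("LISTEN entity_changes");   // channel used by pg_notify() in notifyUsers()
            }
            PGConnection pgConn = conn.unwrap(PGConnection.class);

            while (true) {
                // Wait up to 5 seconds for notifications on this connection.
                PGNotification[] notifications = pgConn.getNotifications(5000);
                if (notifications == null) {
                    continue;
                }
                for (PGNotification n : notifications) {
                    JsonNode payload = mapper.readTree(n.getParameter());
                    if (SESSION_NAME.equals(payload.path("session").asText())) {
                        continue;   // this client caused the change, nothing to refresh
                    }
                    String table = payload.path("table").asText();
                    String id = payload.path("id").asText();
                    // refetch the changed entity for (table, id) and merge it into the local model
                }
            }
        }
    }
}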
On the client side the new object is then manually "rewritten" by something like a copyFromObject(new_entity) method, and its variables are overridden (including collections, skipping transient fields, etc.).
This approach requires maintaining a copyFromObject method for each entity (which could still be optimized with reflection).
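A generic, reflection-based variant of copyFromObject might look roughly like the sketch below. It honours Java's transient keyword; if the entities use JPA's @Transient annotation instead, an extra check would be needed.

import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public final class EntityCopier {

    // Copies all non-static, non-transient field values from source onto target.
    public static <T> void copyFields(T target, T source) {
        Class<?> type = source.getClass();
        while (type != null && type != Object.class) {
            for (Field field : type.getDeclaredFields()) {
                int mods = field.getModifiers();
                if (Modifier.isStatic(mods) || Modifier.isTransient(mods)) {
                    continue;   // skip statics and fields marked with the transient keyword
                }
                field.setAccessible(true);
                try {
                    field.set(target, field.get(source));   // note: collections are copied by reference
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("Cannot copy field " + field.getName(), e);
                }
            }
            type = type.getSuperclass();   // also copy fields declared in superclasses
        }
    }
}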
The problems with my approach are:
it requires some work whenever entity fields are modified (this can be reduced using reflection)
the entire entity is reloaded whenever it is changed by another client
I am curious about your solutions for keeping clients in sync with the database. In general I have a desktop client here, and it loads a lot of data from the database that must stay in sync; loading the data can take up to a minute at application start, depending on the chosen data period to fetch.
The perfect solution would be an engine that fetches/overrides only those fields in the entities that actually changed, and does it automatically.
A simple solution is to implement optimistic locking. It will prevent a user from persisting data if the entity was changed after the user fetched it.
Or
You can use third-party services for DB synchronization. I played with Pusher some time ago, and you can find an extensive tutorial about client synchronization here: React client synchronization
Of course, Pusher is not the only solution, and I'm not related to the dev team of that product at all.
For my purposes I implemented an AVL-tree-based synchronization engine for the loaded entities: it creates repositories from the entities loaded through Hibernate, asynchronously walks all the fields of those entities and rewrites/merges matching ones (so if some field's primary key identifies the same entity as one already in the repository, it is replaced).
This way synchronization with the database boils down to finding the externally changed entity in the repository (basically in the AVL tree, which is O(log n)) and rewriting its fields.
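A rough sketch of that idea, using java.util.TreeMap (a red-black tree, also O(log n)) in place of a hand-rolled AVL tree and the reflection-based copier sketched earlier:

import java.util.Map;
import java.util.TreeMap;

public class EntityRepository<T> {

    // Sorted map keyed by primary key; lookups are O(log n), like the AVL tree.
    private final Map<Long, T> byId = new TreeMap<Long, T>();

    public void register(Long id, T entity) {
        byId.put(id, entity);
    }

    // Called when a notification reports an external change to the entity with this id.
    public void applyExternalChange(Long id, T freshEntity) {
        T loaded = byId.get(id);
        if (loaded != null) {
            EntityCopier.copyFields(loaded, freshEntity);   // rewrite fields in place, keep object identity
        } else {
            byId.put(id, freshEntity);                      // entity was not loaded before
        }
    }
}

The key point is that the already-loaded instance keeps its identity; only its field values are rewritten, so anything bound to that object keeps pointing at the same instance.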
I have an application that stores data in a cloud instance of MongoDB. To explain the requirement further, I currently have data organized at the collection level like below.
collection_1 : [{doc_1}, {doc_2}, ... , {doc_n}]
collection_2 : [{doc_1}, {doc_2}, ... , {doc_n}]
...
...
collection_n : [{doc_1}, {doc_2}, ... , {doc_n}]
Note: my collection name is a unique ID that represents the collection; in this explanation I'm using collection_1, collection_2, ... to represent those IDs.
I want to change this data model to a single-collection model as below. The collection ID will be embedded into each document to uniquely identify the data.
global_collection: [{doc_x, collection_id : collection_1}, {doc_y, collection_id : collection_1}, ...]
The data access layer (insert, delete, update and read operations) for this application is written in a Java backend.
Additionally, the entire application is deployed on a k8s cluster.
My requirement is to do this migration (both the data access layer change and the existing data migration) with zero downtime and without impacting any operation in the application. Assume that the application is heavily used and has high concurrent traffic.
What is the proper way to handle this?
For example, for the backend (data access layer) change, I might use temporary code in Java to support both models and do the migration with an external client. If so, what is the proper way to implement the change, and are there any specific design patterns for this?
A complete explanation would be highly appreciated.
I think you have honestly already hinted at the simplest answer.
First, update your data access layer to handle both the new and old schema: inserts and updates should write to both in order to keep things in sync, while queries should only look at the old schema, as it is the source of record at this point (a sketch of this dual-write phase follows after these steps).
Then copy all data from the old to the new schema.
Then update the data access layer to query the new data. This will keep the old data updated, but will allow full testing of the new data before making any changes that would result in the two sets of data being out of sync. It will also help facilitate rolling updates (i.e. applications with both new and old data access code will still function at the same time).
Finally, update the data access layer to only access the new schema and then delete the old data.
Except for this final stage, you can always roll back to the previous version should you encounter problems.
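To illustrate the dual-write phase, here is a sketch of a data access class using the MongoDB Java driver. Collection and field names (global_collection, collection_id, _id) follow the question; the readFromNew flag is an assumed toggle you flip when you move to the read-from-new phase.

import org.bson.Document;

import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.model.Filters;

public class DocumentDao {

    private final MongoDatabase db;
    private final MongoCollection<Document> globalCollection;
    private volatile boolean readFromNew = false;   // flipped to true in the later phase

    public DocumentDao(MongoDatabase db) {
        this.db = db;
        this.globalCollection = db.getCollection("global_collection");
    }

    public void insert(String collectionId, Document doc) {
        // Dual-write phase: write to both schemas so they stay in sync.
        db.getCollection(collectionId).insertOne(doc);
        globalCollection.insertOne(new Document(doc).append("collection_id", collectionId));
    }

    public Document findById(String collectionId, Object id) {
        if (readFromNew) {
            return globalCollection.find(
                    Filters.and(Filters.eq("collection_id", collectionId), Filters.eq("_id", id))).first();
        }
        // Old schema remains the source of record until the cut-over.
        return db.getCollection(collectionId).find(Filters.eq("_id", id)).first();
    }
}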
I have a Java XPages application with a REST service that functions as an API for a rooms & resources database (getting appointments for a specific room, creating them, etc.).
The basic workflow is that an HTTP request is made to a specific REST action with the room's mail address in the search query. Then, in the Java code, I iterate over all documents from the rooms & resources database until I find a document whose InternetAddress field matches the searched mail address.
This isn't as fast as I would like it to be, and there are multiple queries like this being made all the time.
I'd like to do some sort of caching in my application, so that once a room has been found, its document UNID is stored in a server-wide cache; the next time a request is made for that mail address I can go directly to the document using getDocumentByUNID(), which should be much faster than searching the entire database.
Is it possible to have such a persistent lookup table in Java XPages without any additional applications, while keeping it as fast as possible? A hash table would be perfect for this.
To clarify: I don't want caching within a single request, because I don't do more than one database lookup per request; I want the cache to be server-wide, so it is kept between multiple requests.
Yes, it is possible to store persistent data. What you are looking for is called an application scoped managed bean.
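A sketch of such a bean follows; registered in faces-config.xml with <managed-bean-scope>application</managed-bean-scope>, there is one instance per application, shared by all requests on the server (class and property names are illustrative).

import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RoomCache implements Serializable {

    private static final long serialVersionUID = 1L;

    // mail address (lower-cased) -> document UNID
    private final Map<String, String> unidByMail = new ConcurrentHashMap<String, String>();

    public String getUnid(String mailAddress) {
        return unidByMail.get(mailAddress.toLowerCase());
    }

    public void putUnid(String mailAddress, String unid) {
        unidByMail.put(mailAddress.toLowerCase(), unid);
    }
}

In the REST code you would first ask the bean for a cached UNID and call getDocumentByUNID() on a hit, and only fall back to scanning the database (caching the result afterwards) on a miss.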
Currently, I'm developing an application based on Spring Boot. One of the requirements is that the application should be real-time, and I need a somewhat unusual data structure based on an InvertedRadixTree (not exactly that, but the data structure uses the tree to answer queries). I developed an admin UI for CRUD operations; the number of CRUDs is not large and they are basically done by ops employees. The data structure I developed is thread-safe and is synchronized with the database (which is MongoDB), and since this is the only app using this database, I'm not worried about other apps messing with the data. The only problem I have is that if we run multiple instances of this app and one of them performs CRUD operations on MongoDB, the data structure of that instance gets updated but the other instances do not. I created a scheduler to rebuild the data structure from the database every 12 hours, but I'm looking for a better solution, such as sharing the data structure between all instances. I really appreciate any suggestions.
EDIT: After searching around, I found that rebuilding the whole data structure doesn't take too much time. I wrote some test cases, put around a million records of my class into MongoDB and fetched the whole collection; fetching plus building the data structure took less than a second. So I ended up using this method instead of some sophisticated scheme for synchronizing memory and the database.
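For reference, a sketch of that approach with Spring's scheduling support: periodically rebuild the whole tree from MongoDB and swap it in atomically. It assumes @EnableScheduling is active; MyRecord and RadixIndex are hypothetical stand-ins for the real document class and the InvertedRadixTree-based structure.

import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class IndexRefresher {

    private final MongoTemplate mongoTemplate;
    private final AtomicReference<RadixIndex> current = new AtomicReference<>(RadixIndex.empty());

    public IndexRefresher(MongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    // Rebuilding ~1M records took under a second in the tests above, so a short interval is affordable.
    @Scheduled(fixedDelay = 60_000)
    public void rebuild() {
        List<MyRecord> records = mongoTemplate.findAll(MyRecord.class);
        current.set(RadixIndex.build(records));   // readers always see a complete, consistent tree
    }

    public RadixIndex index() {
        return current.get();
    }
}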
One suggestion is to use a shared database. Every time there is an update by any of the app instances, it should be written to the database, and every time you need the data you load it fresh from the database. This is the easiest way, as far as I can tell.
I would use something like Redis pub/sub (http://redis.io/topics/pubsub): listen for an event fired by the instance that made the change, and use a local cache on every instance if the data is not updated frequently.
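A bare-bones sketch of that idea with the Jedis client; the channel name "tree-updates" is an assumption, and each instance would run the subscriber on its own thread because subscribe() blocks.

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;

public class RedisSync {

    private static final String CHANNEL = "tree-updates";

    // Call after every local CRUD operation so the other instances hear about it.
    public static void publishChange(String host, String documentId) {
        try (Jedis jedis = new Jedis(host, 6379)) {
            jedis.publish(CHANNEL, documentId);
        }
    }

    // Runs on each instance, typically on its own thread, because subscribe() blocks.
    public static void listen(String host, Runnable refreshTree) {
        try (Jedis jedis = new Jedis(host, 6379)) {
            jedis.subscribe(new JedisPubSub() {
                @Override
                public void onMessage(String channel, String message) {
                    refreshTree.run();   // e.g. re-read the changed document and update the local tree
                }
            }, CHANNEL);
        }
    }
}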
My question is this: Is there ever a role for JPA merge in a stateless web application?
There is a lot of discussion on SO about the merge operation in JPA. There is also a great article on the subject which contrasts JPA merge with a more manual do-it-yourself process (where you find the entity via the entity manager and make your changes).
My application has a rich domain model (à la domain-driven design) that uses the @Version annotation in order to make use of optimistic locking. We have also created DTOs to send over the wire as part of our RESTful web services. The creation of this DTO layer also allows us to send the client everything it needs and nothing it doesn't.
So far, I understand this is a fairly typical architecture. My question is about the service methods that need to UPDATE (i.e. HTTP PUT) existing objects. In this case we have two approaches: 1) JPA merge, and 2) DIY.
What I don't understand is how JPA merge can even be considered an option for handling updates. Here's my thinking and I am wondering if there is something I don't understand:
1) In order to properly create a detached JPA entity from a wire DTO, the version number must be set correctly...else an OptimisticLockException is thrown. But the JPA spec says:
An entity may access the state of its version field or property or export a method for use by the application to access the version, but must not modify the version value [30]. Only the persistence provider is permitted to set or update the value of the version attribute in the object.
2) Merge doesn't handle bi-directional relationships ... the back-pointing fields always end up as null.
3) If any fields or data are missing from the DTO (due to a partial update), then JPA merge will delete those relationships or null out those fields. Hibernate can handle partial updates, but JPA merge cannot. DIY can handle partial updates.
4) The first thing the merge method will do is query the database for the entity ID, so there is no performance benefit over DIY to be had.
5) In a DIY update, we load the entity and make the changes according to the DTO -- there is no call to merge, or to persist for that matter, because the JPA context implements the unit-of-work pattern out of the box.
Do I have this straight?
Edit:
6) Merge behavior with regards to lazy loaded relationships can differ amongst providers.
Using merge does require you either to send and receive a complete representation of the entity or to maintain server-side state. For trivial CRUD-y operations it is easy and convenient. I have used it plenty in stateless web apps where there is no meaningful security hazard in letting the client see the entire entity.
However, if you've already reduced operations to only passing the immediately relevant information, then you need to also manually write the corresponding services.
Just remember that when doing your 'DIY' update you still need to pass a version number around on the DTO and manually compare it to the one that comes out of the database. Otherwise you don't get the optimistic locking that spans 'user think time' that you would get with the simpler merge approach.
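For illustration, here is a sketch of such a DIY update with the manual version comparison; Customer, CustomerDto and their fields are made up for the example.

import java.util.Objects;

import javax.persistence.EntityManager;
import javax.persistence.EntityNotFoundException;
import javax.persistence.OptimisticLockException;
import javax.persistence.PersistenceContext;

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class CustomerService {

    @PersistenceContext
    private EntityManager em;

    @Transactional
    public void update(CustomerDto dto) {
        Customer entity = em.find(Customer.class, dto.getId());
        if (entity == null) {
            throw new EntityNotFoundException("Customer " + dto.getId() + " not found");
        }
        if (!Objects.equals(entity.getVersion(), dto.getVersion())) {
            // The row changed after the client fetched it: fail instead of silently overwriting.
            throw new OptimisticLockException("Customer " + dto.getId() + " was modified concurrently");
        }
        // Apply only the fields the DTO actually carries; anything untouched keeps its database value.
        entity.setName(dto.getName());
        entity.setEmail(dto.getEmail());
        // No merge()/persist() call: the entity is managed, so the unit of work flushes it at commit.
    }
}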
You can't change the version on an entity created by the provider, but when you have made your own instance of the entity class with the new keyword it is fine and expected to set the version on it.
It will make the persistent representation match the in-memory representation you provide; this can include making things null. Remember that when an object is merged, that object is supposed to be discarded and replaced with the one returned by merge. You are not supposed to merge an object and then continue using it; its state is not defined by the spec.
True.
Most likely, as long as your DIY solution is also using the entity ID and not an arbitrary query. (There are other benefits to using the 'find' method over a query.)
True.
I would add:
7) Merge translates to an insert or an update depending on whether the record exists in the DB, hence it does not deal correctly with update-vs-delete optimistic concurrency. That is, if another user concurrently deletes the record and you update it, it should (1) throw a concurrency exception... but it does not; it just inserts the record as a new one.
(1) At least in most cases, in my opinion, it should. I can imagine some cases where I would want this scenario to trigger a new insert, but they are far from usual. At the very least I would like the developer to think twice about it, not just assume that merge() == updateWithConcurrencyControl(), because it is not.
I am trying to create a desktop application using Eclipse RCP. In that application I use an ORM framework to load objects from the database and JFace data binding to bind these objects to the user interface, so that users can modify the data those objects contain.
Once the objects are loaded, other users or other clients may also work with the same data, so by the time a user wants to save the objects back into the database, the data these objects contain may differ from the data in the database; the difference may have been caused by my application or by others.
Should I check against the real data in the database when I need to save an object that may no longer be fresh?
Maybe this is a common problem with ORMs, but this is the first time I have needed to deal with one.
Yes - it's not a bad idea to check against the "real" data before saving. You may have a special field for this - a last-update timestamp or an increment counter.
Such an approach is called optimistic locking and, since it is very common, it is supported by most ORMs.
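With JPA/Hibernate that typically looks like the sketch below: the @Version field is the "increment count", and the provider throws an OptimisticLockException if the row changed between loading and committing (names are illustrative).

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;

@Entity
public class Room {

    @Id
    private Long id;

    private String name;

    @Version
    private long version;   // bumped automatically by the provider on every update

    // getters and setters omitted
}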