How to synchronize data within programs using a database?


I have a very small program that will be used by two or more computers at the same time. It can add a string to a list and remove a string from it. I save all these strings in a remote Postgres database. When I delete a string from the list, it is deleted from the database; however, if the program is running on another computer, that computer still shows the string. Currently I have only one option in mind, which is refreshing the data in the program every x amount of time. Are there better options? The database is very small: only one column, and it shouldn't have more than 100 rows.

You should never allow client programs to interact with a remote database directly. Exposing your database to the clients is a huge security problem. You should always have a server program in between that communicates with the clients, validates their input, communicates with the database, and then tells the clients what they want to know.
This would also give you the ability to add push updates to your network protocol (when one client makes a change, update the database and also inform the other clients about the change).
But if you really want to take the risk and you consider a server program too much complexity, you could add a timestamp to every row that is updated whenever the row changes. That way you could refresh the clients at regular intervals by querying only for the rows that changed since the last refresh.
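A minimal sketch of that timestamp-based refresh, under stated assumptions: the names `ChangedRow`, `IncrementalRefresher`, and the `items` table in the comment are hypothetical, and deletions are assumed to be recorded as flagged rows (otherwise a removed row would simply vanish and never show up as "changed"). To keep the example runnable without a database, the table is an in-memory list; with JDBC the filter would correspond to a query with `WHERE changed_at > ?`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical row shape: one string value, a change timestamp, and a soft-delete flag.
final class ChangedRow {
    final String value;
    final long changedAt;   // e.g. epoch milliseconds, set on every change
    final boolean deleted;  // assumption: deletes are flagged instead of physically removed

    ChangedRow(String value, long changedAt, boolean deleted) {
        this.value = value;
        this.changedAt = changedAt;
        this.deleted = deleted;
    }
}

final class IncrementalRefresher {
    private final Set<String> localList = new TreeSet<>();
    private long lastRefresh = Long.MIN_VALUE;

    // Stand-in for: SELECT value, changed_at, deleted FROM items WHERE changed_at > ?
    static List<ChangedRow> changedSince(List<ChangedRow> table, long since) {
        List<ChangedRow> result = new ArrayList<>();
        for (ChangedRow row : table) {
            if (row.changedAt > since) {
                result.add(row);
            }
        }
        return result;
    }

    // Applies only the rows changed since the previous refresh to the local copy.
    void refresh(List<ChangedRow> table) {
        for (ChangedRow row : changedSince(table, lastRefresh)) {
            if (row.deleted) {
                localList.remove(row.value);
            } else {
                localList.add(row.value);
            }
            lastRefresh = Math.max(lastRefresh, row.changedAt);
        }
    }

    // Returns a copy of the client's current view of the list.
    Set<String> snapshot() {
        return new TreeSet<>(localList);
    }
}
```

Each client would call `refresh` on a timer; because only rows newer than `lastRefresh` are transferred, the cost of a poll stays small even if the table grows.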
Another option would be to let the clients communicate with each other in a peer-to-peer manner. When a client makes a change, it doesn't just notify the database, it also notifies the other clients via separate network connections. To do that, the clients need to know each other's IP addresses or at least each other's hostnames. When these aren't known, you could have the clients write their IPs to another table of the database when they connect, so the other clients can query for them. Just make sure the entries get deleted when a client disconnects, so you don't keep contacting every IP address some client ever had.

Related

How to identify unsynced entities? Is the client the one who generates the Ids?

Let's say I want to synchronize (over HTTP) an entity called Person. The persons on the client (mobile/desktop/whatever) are a mirror of the persons that exist in the server's database. Obviously, the server owns all persons and the client owns only the specific user's persons.
Consider the following case.
The client is offline. While offline, it creates a Person and, because it can't connect to the server, keeps this person in local storage, say a local database (SQLite or whatever). At that moment, how is/should this person be identified?
Before I started implementing the whole thing, I thought the server should be the one that generates the IDs of the persons sent to it. However, once I started implementing, I ran into this problem.
If the server generates the IDs, then since the server has never seen the person, the client must give it an ID in order to be able to find the person, and obviously uses this ID to store the person in its local storage. When the client comes online, it sends the person to the server. The server gives it an ID and stores it in its own database. After that, the client requests any person changes that happened after its last sync, and the server returns this specific person.
Let's take an example. The client is offline, creates 100 persons, and stores them in its local storage: Person 1, Person 2, Person 3, etc. Now it connects to the server and sends all 100 persons. Since the connection happens over HTTP, the client makes a POST request to a post-persons endpoint. The server then generates IDs (either incremental or UUIDs) and probably changes some other properties as well. Next, the client calls the get-persons endpoint and sees 100 updated persons, each with a new ID that it could not have known about. How does the client know which of these persons correspond to persons it already has? Removing the client's 100 old persons and inserting 100 new ones with server IDs seems unorthodox. In other words, Person 1 as known by the client is stored as Person [uuid] on the server, and the server returns it as Person [uuid]. How does the client know that its Person 1 corresponds to Person [uuid]? One solution might be to send the client's IDs to the server and have the server respond with something like Person [uuid] 1. Now the client knows that its 1 is this one. To me, this seems even more unorthodox.
The second option is to have the clients generate UUIDs whether they are offline or online. This seems the "simplest" approach when it comes to implementation on my side. The client creates Person [uuid]. When it comes online, it sends it to the server. After that, the client calls get-persons and gets an updated Person [uuid] in response. It easily identifies it and stores it in its local storage. The server does not generate any kind of ID for persons.
Is there anything I am missing? Until now, I thought servers are the ones that generate the IDs of syncable entities, but I think the second approach is easier to implement and more comprehensive. But does it introduce any kind of "danger" for later?
There are no explicit requirements about what kind of ID I use on the client or the server. However, I am aware of the trade-offs of using UUIDs over simple incrementing numbers.
The stack (even though I consider it irrelevant):
Spring Boot on the server, along with Hibernate and MySQL
Client with Hibernate and standalone H2 as local storage
Everything Java 8

There are other possible solutions, and the best one depends on your particular situation - whether you need to maintain local relations, etc.
The first solution keeps server-generated IDs and needs no extra ID logic on the server.
Don't generate an ID on the client at all and store persons without one (if possible); when they are synced, the server returns the IDs, and you set them locally.
Thus, it's simple to know which persons are already in sync with the server (those with ID) and which are not yet known to the server.
It's also simple to implement INSERT/UPDATE logic.
However, this solution may be a problem if you need to maintain local relations.
When you need local relations, you can generate temporary IDs. Let's say that you want a numeric ID, and so all positive IDs are those that are already synced, and negative IDs are those that are temporary.
When synced with the server, you obtain new IDs from the server (positive ones), and you simply rewrite all IDs in your local database. That's, however, a bit error-prone as you have to be sure to update all relations.
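The temporary-ID scheme above can be sketched as follows. The class `TempIdRemapper` and its methods are hypothetical names for illustration; the mapping from temporary to server IDs is assumed to come back from the server in whatever form your sync protocol uses.

```java
import java.util.Map;

final class TempIdRemapper {
    private long nextTempId = -1;

    // Negative IDs mark entities the server has not seen yet.
    long newTempId() {
        return nextTempId--;
    }

    // Positive IDs are assumed to have been assigned by the server.
    static boolean isSynced(long id) {
        return id > 0;
    }

    // Rewrites a column of IDs or foreign keys after a sync. IDs that are not
    // in the mapping (for example, already-synced ones) are left unchanged.
    static long[] remap(long[] references, Map<Long, Long> tempToServer) {
        long[] out = new long[references.length];
        for (int i = 0; i < references.length; i++) {
            out[i] = tempToServer.getOrDefault(references[i], references[i]);
        }
        return out;
    }
}
```

The same `remap` call would have to be applied to the primary-key column and to every foreign-key column that refers to it, ideally inside one local transaction, which is where the error-proneness mentioned above comes from.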
You can also have localId and serverId. Technically, you only need serverId to be globally unique to store persons on your server. You keep serverId empty and fill it once the entity is synchronized with the server (the server returns it). And you can generate localId as you want and use it for local relations. However, this can be a problematic approach if you need to have several clients in sync, and each of them would like to generate its own local IDs.
The second approach is to use ID pools. It needs a bit of work on the server and storing assigned pools.
You can allocate a pool of IDs for the given client. The client must be online at least once, and the server sends its unique offset.
Depending on your needs, you can have, for example, an 8-byte identifier (64 bits). The first 36 bits can be used for the offset, so your server can manage 2^36 clients. And the last 28 bits are free to use by the client.
So the client just increments the internal counter whenever it needs a new ID and adds the unique offset => gets a globally unique identifier.
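As a rough sketch of the 36-bit offset / 28-bit counter layout described above (the class name `PooledIdGenerator` is hypothetical, and how the client obtains and persists its offset and counter is left out):

```java
final class PooledIdGenerator {
    static final int COUNTER_BITS = 28;
    static final long MAX_COUNTER = (1L << COUNTER_BITS) - 1; // 2^28 - 1 IDs per client
    static final long MAX_OFFSET = (1L << 36) - 1;            // 2^36 - 1 as the largest offset

    private final long offset; // unique per client, assigned once by the server
    private long counter;      // a real client would persist this between runs

    PooledIdGenerator(long offset) {
        if (offset < 0 || offset > MAX_OFFSET) {
            throw new IllegalArgumentException("offset must fit in 36 bits");
        }
        this.offset = offset;
    }

    // Upper 36 bits: client offset; lower 28 bits: local counter.
    // Offsets with the highest bit set yield negative longs, which are still unique.
    synchronized long nextId() {
        if (counter > MAX_COUNTER) {
            throw new IllegalStateException("ID pool exhausted");
        }
        return (offset << COUNTER_BITS) | counter++;
    }

    static long offsetOf(long id) {
        return id >>> COUNTER_BITS;
    }

    static long counterOf(long id) {
        return id & MAX_COUNTER;
    }
}
```

With this split each client can create 2^28 (about 268 million) IDs before it needs a new offset; shifting the boundary trades the number of clients against IDs per client.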

Multiple programs accessing the same MySQL DB

I have 4 different Raspberry Pis, each running the same program. The program sends information to a MySQL DB to be inserted into a table.
Is it possible for this to happen, or what problems will occur?
e.g.
Rpi:1 accessed -> sends info to DB
Rpi:2 accessed -> sends info to DB
Rpi:3 accessed -> sends info to DB
Can these happen simultaneously?
I don't have 4 devices at the minute, which is why I haven't tried it, but I'm just wondering how this would work or if it is possible.
Revised: Cheers for the responses, guys. Each of the RPis is connected to an RFID module, so when a fob gets read, it sends the timestamp to the DB, and that's the same for all 4 devices. Each device will be used at a random time when someone wants to access the system. Will this cause problems?
Thanks :)

As @mastah indicates, it really depends on what you want to do.
The answer is yes, it can be done, but some things are more complex than others. E.g., if you want the devices to record temperature in different places, then each device will simply create a new record every few minutes along with the location name. The name and time of the record would be the unique key. No problem.
If, say, you want to be able to change any record in the database on any device, you need to think about how two people changing the same record will be reconciled.
It also depends on what you mean by "simultaneously". In general, database writes are done sequentially in "transactions". So you may need to consider whether "simultaneously" means "very quickly one after another" or not. Does the order of the writes matter?

Pagination in highly dynamic and frequently changing data in Java

I am a Java developer and my application runs on iOS and Android. I have created a web service for it using the Restlet framework, with JDBC for database connectivity.
My problem is that I have three types of data, called intersections: current, past, and future. Each intersection contains a list of users as its data. There is a single web service that returns all users to the device as his/her intersection. I have implemented pagination, but the server has to process all of his/her intersections and then return only the requested (start-end) range to the device. I did this because a past user may also appear in current. That is the whole logic.
But as the intersections in his/her profile grow, the server has to process all users, so it becomes slow, which is to be expected. Also, the device calls this web service every 5 minutes.
Please suggest a better way to handle this scenario.
Thanks in advance.
Ketul Rathod

It's a little hard to follow your logic, but it sounds like you can probably benefit from caching your results on the server.
If it makes sense, every time you process the user's data on the server, save the results (to a file, to a database table, whatever). Then, 5 minutes later, if there are no changes, simply return the same results. If there were changes, retrieve the results from the cache (optionally invalidating the cache in the process), append those changes to what is cached, re-save the results in the cache, and return them.
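A minimal in-memory sketch of that caching idea, assuming the expensive processing can be expressed as a function from a user ID to a list of results. `IntersectionCache`, `append`, and `page` are hypothetical names; a file or database table could replace the map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class IntersectionCache {
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    private final Function<String, List<String>> fullCompute; // the slow "process everything" step

    IntersectionCache(Function<String, List<String>> fullCompute) {
        this.fullCompute = fullCompute;
    }

    // Runs the expensive computation only on the first request for a user.
    List<String> get(String userId) {
        return new ArrayList<>(
                cache.computeIfAbsent(userId, id -> new ArrayList<>(fullCompute.apply(id))));
    }

    // When changes arrive, append them to the cached result instead of reprocessing.
    void append(String userId, List<String> changes) {
        cache.computeIfPresent(userId, (id, cached) -> {
            List<String> merged = new ArrayList<>(cached);
            merged.addAll(changes);
            return merged;
        });
    }

    // Drops the cached result so the next request recomputes it.
    void invalidate(String userId) {
        cache.remove(userId);
    }

    // Serves one page (start inclusive, end exclusive) from the cached result.
    List<String> page(String userId, int start, int end) {
        List<String> all = get(userId);
        int from = Math.min(Math.max(start, 0), all.size());
        int to = Math.min(Math.max(end, from), all.size());
        return all.subList(from, to);
    }
}
```

With this shape, the device's request every 5 minutes becomes a cheap `page` lookup, and the full computation runs only after `invalidate` or for a user who has no cached entry yet.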
If this is applicable to your workflow, your server-side processing time will be significantly less.

Sync data between two JPA applications

I wrote an application that uses JPA (and hibernate as persistence provider).
It works on a database with several tables.
I need to create an "offline mode", where a copy of the program, which acts as a client, offers the same functionality while keeping its data synchronized with the server whenever the server is reachable.
The aim is to get a client that you can "detach" from the server, make changes to the data, and then merge the changes back. A bit like a revision control system.
Managing conflicts is not important; in case of a conflict, the user will decide which version to keep.
My idea, which can't fully work, was to assign each row in the database a last-edit timestamp. The client initially downloads a copy of the entire database and also records a second timestamp when it modifies a row while not connected to the server. In this way, it knows what data has changed and the last time it was synchronized with the server. When it reconnects to the server, it asks the server for the data that has changed since the last synchronization and sends the data it has changed itself. (A bit simplified, but the management of conflicts should not be a big problem.)
This, of course, does not work when a row is deleted. If either the server or the client deletes a row, the other side will not notice it and will never know.
The solution would be to maintain a table with the list of deleted rows, but it seems too expensive.
Does anyone know a method that works? Is there already something similar?

Enver:
If you would like a simple solution, you can create version fields that act like your "timestamp".
Audit:
If you would like a complex, powerful solution, you should use the Hibernate plugin

Continuously fetch data from database using Java

I have a scenario where my Java program has to continuously communicate with the database table, for example my Java program has to get the data of my table when new rows are added to it at runtime. There should be continuous communication between my program and database.
If the table has 10 rows initially and 2 rows are added by the user, it must detect this and return the rows.
My program shouldn't use AJAX and timers.

If the database you are using is Oracle, consider using triggers that call a Java stored procedure, which notifies your client of changes in the DB (using JMS, RMI, or whatever you want).

Without Ajax and timers, it does not seem possible to do this task.
I have also faced the same issue, where I needed to push some data from the server to the client when it changes.
For this, you can use server push, AKA "Comet" programming.
In Comet,
we create a channel between client and server, and the client subscribes to a particular channel.
The server puts its data in the channel when it has some.
When the client reads the channel, it gets all the data in the channel and the channel is emptied.
So every time the client reads from the channel, it gets new data only.
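The channel semantics described above (the server publishes, a client read returns everything and empties the channel) can be sketched in plain Java. This only models the channel itself; the class name `UpdateChannel` is hypothetical, and a real Comet setup would put an HTTP long-polling or streaming endpoint in front of it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class UpdateChannel<T> {
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();

    // Server side: put data into the channel whenever there is something new.
    void publish(T item) {
        queue.add(item);
    }

    // Client side: wait up to timeoutMillis for data, then take everything that
    // is queued. The channel is empty afterwards, so the next read only sees
    // items published after this call.
    List<T> readAll(long timeoutMillis) {
        List<T> batch = new ArrayList<>();
        try {
            T first = queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (first != null) {
                batch.add(first);
                queue.drainTo(batch);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return batch;
    }
}
```

Waiting inside `readAll` is what makes this behave like a long poll: the client's request stays open until data arrives or the timeout expires, instead of the client asking repeatedly on a timer.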
Also, to monitor DB changes, you have two options:
Some trigger/timer (check out Quartz Scheduler)
An event-based mechanism, which pushes data into the channel on particular events.
Basically, the client can't know anything that is happening on the server side, so you must push some data or an event to tell the client, "I have some new data, please call some method." It's a kind of notification. So please look into Comet/server push with event notification.
Hope this helps.
Thanks.

Not the simplest problem, really.
Let's divide it into 2 smaller problems:
1) how to enable reloading without timers and ajax
2) how to implement server side
There is no way to notify clients from the server. So, you need to use flash or silverlight or JavaFX or Applets to create a thick client. If the problem with Ajax is that you don't know how to use it for this problem then you can investigate some ready-to-use libraries of jsp tags or jsf components with ajax support.
If you have only 1 server then just add a cache. If there are several servers then consider using distributed caches.

If you have a low-traffic database, you could implement a thread that rapidly checks for updates to the DB (polling).
If you have a high-traffic DB, I wouldn't recommend that, because polling creates a lot of additional traffic.

Having the server notify clients is not a good idea (consider a scenario with 1,000 clients). Do you use some persistence layer, or do you have to stick to pure JDBC?

If you have binary logs turned on in MySQL, you can see all of the transactions that occur in the database.

A portable way to do this is to add a timestamp column (create_date) that indicates when the row was added to the table. After the initial load of the content, you simply poll for new content with a WHERE clause such as create_date >= last_poll_time, where last_poll_time is the latest create_date you have already loaded. Because rows can have identical timestamps, you need to filter out duplicates before adding them.
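A sketch of that polling loop with the duplicate filter, using an in-memory list in place of the table so it runs without a database. `NewRowPoller`, `Row`, and the assumption that each row has a unique numeric `id` are illustrative choices, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class NewRowPoller {
    // Hypothetical row: a primary key plus the creation timestamp column.
    static final class Row {
        final long id;
        final long createDate;

        Row(long id, long createDate) {
            this.id = id;
            this.createDate = createDate;
        }
    }

    private long lastSeen = Long.MIN_VALUE;                  // newest create_date loaded so far
    private final Set<Long> idsAtLastSeen = new HashSet<>(); // rows already loaded at that timestamp

    // Stand-in for: SELECT id, create_date FROM t WHERE create_date >= ?
    List<Row> poll(List<Row> table) {
        List<Row> fresh = new ArrayList<>();
        for (Row row : table) {
            if (row.createDate < lastSeen) {
                continue; // older than anything we still need
            }
            if (row.createDate == lastSeen && idsAtLastSeen.contains(row.id)) {
                continue; // same timestamp, already delivered: the duplicate case
            }
            fresh.add(row);
        }
        // Remember the newest timestamp and which rows were seen at exactly that time.
        for (Row row : fresh) {
            if (row.createDate > lastSeen) {
                lastSeen = row.createDate;
                idsAtLastSeen.clear();
            }
            if (row.createDate == lastSeen) {
                idsAtLastSeen.add(row.id);
            }
        }
        return fresh;
    }
}
```

Using `>=` rather than `>` is what keeps a row from being missed when it is inserted with the same timestamp as the last row already loaded; the small ID set is the price for that.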
