monitoring mysql for changes - java

I have a Java app using a MySQL database through hibernate. The database is really used as persistence layer: The database is read at the initial load of the program, and the records are then maintained in memory.
However, we are adding extra complexity, where another process may change the database as well, and it would be nice for the changes to be reflected in the Java app. Yet, I don't particularly like polling mechanisms that query the database every few seconds, especially since the database is rarely updated.
Is there a way to have a callback to listen to database changes? Would triggers help?

Or change both applications so the Java app is truly the owner of the MySQL database and exposes it as a service. You're coupling the two apps at the database level by doing what you're proposing.
If you have one owner of the data you can hide schema changes and such behind the service interface. You can also make it possible to have a publish/subscribe mechanism to alert interested parties about database changes. If those things are important to you, I'd reconsider letting another application access MySQL directly.
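A minimal sketch of the single-owner idea with a publish/subscribe hook, assuming every write goes through one service class. `RecordService`, `subscribe`, and `save` are hypothetical names, and the in-memory map stands in for the real MySQL persistence:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical owner of the data: all writes go through save(), so every
// subscriber hears about every change.
final class RecordService {
    private final Map<Long, String> records = new ConcurrentHashMap<>();
    private final List<Consumer<Long>> subscribers = new CopyOnWriteArrayList<>();

    // Register a callback that receives the id of each changed record.
    void subscribe(Consumer<Long> listener) {
        subscribers.add(listener);
    }

    // The single write path. A real service would also persist to MySQL here.
    void save(long id, String value) {
        records.put(id, value);
        for (Consumer<Long> subscriber : subscribers) {
            subscriber.accept(id);
        }
    }

    String find(long id) {
        return records.get(id);
    }
}
```

The other application would then call the service instead of writing to the tables, and anything that caches records subscribes to hear about changes.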

Is there a way to have a callback to listen to database changes? Would triggers help?
To my knowledge, such a thing doesn't exist and I don't think a trigger would help. You might want to check this similar question here on SO.
So, I'd expose a hook at the Java application level (it could be a simple servlet in the case of a webapp) to notify it after an update of the database and have it invalidate its cache.
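As a rough sketch of that hook, assuming the in-memory records are loaded through a single loader call: `ReloadableCache` and its methods are hypothetical names, and the `Supplier` stands in for whatever Hibernate query does the initial load. The servlet (or any other entry point) would only need to call `invalidate()`.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical in-memory cache that reloads lazily after an invalidation.
final class ReloadableCache<T> {
    private final Supplier<List<T>> loader; // stands in for the Hibernate query
    private List<T> snapshot;               // null means "stale, reload on next read"

    ReloadableCache(Supplier<List<T>> loader) {
        this.loader = loader;
    }

    // Called by the notification hook after the other process has committed.
    synchronized void invalidate() {
        snapshot = null;
    }

    // Returns the cached records, calling the loader only when stale.
    synchronized List<T> get() {
        if (snapshot == null) {
            snapshot = loader.get();
        }
        return snapshot;
    }
}
```

Reloading lazily keeps the hook itself cheap: the notifying process gets an immediate response, and the database is only read again when somebody actually asks for the data.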

Another way would be to use a self compiled MySQL server with the patches from this project
ProjectPage External Language Stored Procedures
Check this blog post for a more detailed introduction
Calling Java code in MySQL

One option would be to tail the binary logs (or set up a replication slave) and look for changes relevant to your application. This is likely to be quite an involved solution.
Another would be to add a "last_updated" indexed column to the relevant tables (you can even have mysql update this automatically) and poll for changes since the last time you checked. The queries should be very cheap.
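The polling variant could look roughly like this. It is only a sketch: `ChangedRow` and `LastUpdatedPoller` are made-up names, timestamps are plain epoch milliseconds, and the `LongFunction` stands in for the actual JDBC or Hibernate query so the logic can be shown without a database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongFunction;

// One row as seen by the poller; lastUpdated is epoch milliseconds.
final class ChangedRow {
    final long id;
    final long lastUpdated;

    ChangedRow(long id, long lastUpdated) {
        this.id = id;
        this.lastUpdated = lastUpdated;
    }
}

// Remembers the newest last_updated value seen so far (the "watermark")
// and asks only for rows newer than that on each poll.
final class LastUpdatedPoller {
    // Assumed to run something like:
    //   SELECT id, last_updated FROM some_table WHERE last_updated > ?
    private final LongFunction<List<ChangedRow>> rowsUpdatedAfter;
    private long watermark;

    LastUpdatedPoller(LongFunction<List<ChangedRow>> rowsUpdatedAfter, long startWatermark) {
        this.rowsUpdatedAfter = rowsUpdatedAfter;
        this.watermark = startWatermark;
    }

    // Returns the ids changed since the previous poll and advances the watermark.
    List<Long> poll() {
        List<Long> changedIds = new ArrayList<>();
        for (ChangedRow row : rowsUpdatedAfter.apply(watermark)) {
            changedIds.add(row.id);
            watermark = Math.max(watermark, row.lastUpdated);
        }
        return changedIds;
    }
}
```

One caveat with a strict `>` comparison: a row committed later with the same timestamp as the current watermark would be missed, so you may want `>=` plus de-duplication by id. Deleted rows also never show up in such a query, so deletes need a soft-delete flag or a separate mechanism.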

Instead of caching the database contents within the memory space of the Java app, you could use an external cache like memcached or Ehcache. When either process updates (or reads) from the database, have it update memcached as well.
This way whenever either process updates the DB, its updates will be in the cache that the other process reads from.

Related

Oracle Incremental Checksum Crypto for Security

I have a unique problem to solve.
I have a legacy Java application which connects to an Oracle RDBMS. There are all sorts of queries and DMLs scattered throughout the application: inserts, updates, deletes and of course selects. It uses JDBC (PreparedStatement), though one recently added module uses JPA.
I have a requirement to add a protection layer / logic to the application / Database whereby if any user (could even be A DBA or an OS root user) tries to modify the data (updates, inserts or deletes) bypassing the app, we are able to identify the operation as part of an audit.
Audit trail seemed to be the go to thing here, except that we cannot even trust the OS root user and thus a guy having DBA and root access can easily modify the data and remove the trace of it in the audit trails.
I was thinking to implement a cyclic crypto kind of algorithm on the sensitive tables so that on every DML executed by the application, a crypto / hash is introduced and it is incremental so that any change is easily caught by doing an audit using the application.
In theory, it seems feasible except that it might get tricky because after every DML we would potentially need to recalculate the hash / checksum of a number of subsequent records and this might overburden the application / database.
Is this a feasible solution?
You are right that computing a hash of every updated row of data will impose a burden on the system. Are you going to also validate that hash before changes are submitted to the database to ensure nothing has been changed outside the application? That's even more overhead, and a lot more custom code for your application. It also wouldn't help you identify who modified the data, or when, only that it had been updated outside of the app. Using a database trigger wouldn't work, as they are easily disabled and aren't capable of modifying the same table that calls them (you'd need a separate hash table with an entry for every row of data in every table you wanted to monitor). Auditing is still your best way to go, as it wouldn't require any modification to your app or your data schemas.
You have a couple of options in regards to auditing, depending on the version of Oracle you're using. If you're using 12c or later, you can use Unified Auditing, which has its own set of permissions and roles to allow separation of duties (i.e. normal DBA from security admin). Even in older versions you can put an update/delete audit on the actual audit trail table, so that any attempt to modify the data will itself leave a fingerprint.
Lastly, you can use a tool like Splunk, Elastic Search, syslog, or Oracle's Database Audit Vault or some other file monitoring solution to centralize your audit records to another system as they are created by the database - making them inaccessible to the DBA or local sys admin. This will take some work by your DBA and/or sysadmin to configure in the first place, but can go a long way to securing your audit data.
All that said, sooner or later you're going to have to trust two people: the sys admin and the DBA. If you can't trust them then you are in deep, deep trouble.
Oracle 20c has blockchain tables. Version 20c is currently only available in Oracle's cloud, but it will probably be available on-premise in a few months.
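For reference, the incremental checksum described in the question amounts to a hash chain, which is the same general idea behind blockchain tables. A minimal application-level sketch, assuming each row can be serialized to a canonical string and that the hashes are stored alongside the rows (`HashChain` and its methods are hypothetical names):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

final class HashChain {
    // hash_i = SHA-256(hash_{i-1} || row_i), so each hash covers all earlier rows.
    static byte[] link(byte[] previousHash, String rowData) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            sha.update(previousHash);
            sha.update(rowData.getBytes(StandardCharsets.UTF_8));
            return sha.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    // Computes the chain for all rows, starting from an all-zero seed.
    static List<byte[]> build(List<String> rows) {
        List<byte[]> hashes = new ArrayList<>();
        byte[] previous = new byte[32];
        for (String row : rows) {
            previous = link(previous, row);
            hashes.add(previous);
        }
        return hashes;
    }

    // Returns the index of the first row that no longer matches its stored
    // hash (or where rows are missing), or -1 if the chain is intact.
    static int firstMismatch(List<String> rows, List<byte[]> storedHashes) {
        byte[] previous = new byte[32];
        for (int i = 0; i < rows.size(); i++) {
            byte[] expected = link(previous, rows.get(i));
            if (i >= storedHashes.size() || !MessageDigest.isEqual(expected, storedHashes.get(i))) {
                return i;
            }
            previous = expected;
        }
        return storedHashes.size() > rows.size() ? rows.size() : -1;
    }
}
```

This also shows the cost raised in the question: updating or deleting row i means recomputing every hash from i onward. And a plain hash only helps against someone who does not know the scheme. A DBA who does can recompute the chain after tampering, so in practice you would want a keyed MAC (for example HMAC) with the key kept outside the database host.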

Notify Java application of changes in Apache Derby database

I have a Java application that can save and retrieve data from an Apache Derby database using JDBC. I would like to update the view of every user when changes are made in the database.
I tried using a for-loop that polls the database every few seconds, but that uses loads of processor time, as expected. I've also heard about TimerTask and ScheduledExecutorService. I'm not sure how they work, but I imagine they are a better alternative to the for-loop. However, they would also have to check the database, which I find less ideal than having the database notify the application of changes.
I've read about database triggers, which I think might be the best solution. However, all the examples I find for Apache Derby only seem to trigger other changes in the database and not the Java application.
Is it possible to use a trigger to execute a method in the Java application? If so, how? Or perhaps there is another approach to solving the problem that I don't know of?
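For what it's worth, here is roughly how ScheduledExecutorService would replace the for-loop. This is only a sketch: `ScheduledCheck` is a made-up wrapper, and the `Runnable` passed in is assumed to contain the existing database check. The scheduler thread sleeps between runs, so it should not burn CPU the way a spinning loop does.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper that runs a check repeatedly on one background thread.
final class ScheduledCheck {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // Runs the check immediately, then again `period` after each run finishes.
    ScheduledFuture<?> start(Runnable check, long period, TimeUnit unit) {
        return scheduler.scheduleWithFixedDelay(() -> {
            try {
                check.run();
            } catch (RuntimeException e) {
                // An uncaught exception would suppress all later runs, so log and continue.
                e.printStackTrace();
            }
        }, 0, period, unit);
    }

    // Stops the background thread, e.g. when the application shuts down.
    void stop() {
        scheduler.shutdownNow();
    }
}
```

In a GUI application the check runs on the scheduler's thread, so any view update it triggers still has to be handed over to the UI thread.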

Copying a database JPA

I have a working code that basically copies records from one database to another one using JPA. It works fine but it takes a while, so I wonder if there's any faster way to do this.
I thought about using threads, but I run into race conditions, and synchronizing those pieces of the code ends up taking as long as the one-by-one process.
Any ideas?
Update
Here's the scenario:
Application (Core) has a database.
Plugins have default data (same structure as Core, but with different data)
When the plugin is enabled, it checks the Core database and, if the data is not found, copies its default data into the Core database.
Most databases provide native tools to support this. Unless you need to write additional custom logic to transform the data in some way, I would recommend looking at the export/import tools provided by your database vendor.

Update two identical database schemas at the same time

I've got an Oracle database that has two schemas in it which are identical. One is essentially the "on" schema, and the other is the "off" schema. We update data in the off schema and then switch the schemas behind an alias which our production servers use. Not a great solution, but it's what I've been given to work with.
My problem is that there is a separate application that will now be streaming data to the database (also handed to me) which is currently only updating the alias, which means it is only updating the "on" schema at any given time. That means that when the schemas get switched, all the data from this separate application vanishes from production (the schema it is in is now the "off" schema).
This application is using Hibernate 3.3.2 to update the database. There's Spring 3.0.6 in the mix as well, but not for the database updates. Finally, we're running on Java 1.6.
Can anyone point me in a direction to updating both "on" and "off" schemas simultaneously that does not involve rewriting the whole DAO layer using Spring JDBC to load two separate connection pools? I have not been able to find anything about getting hibernate to do this. Thanks in advance!
You shouldn't be updating two separate databases this way, especially from the application's point of view. All it should know or care about is whether or not the data is there; it shouldn't have to deal with two separate databases.
Frankly, this sounds like you may need to purchase an ETL tool. Even if you can't get it to update the 'on' schema from the 'off' one (fast enough to be practical), you will likely be able to use it to keep the two in sync (mirror changes from 'on' to 'off').
HA-JDBC is a replicating JDBC Driver we investigated for a short while. It will automatically replicate all inserts and updates, and distribute all selects. There are other database specific master-slave solutions as well.
On the other hand, I wouldn't recommend doing this for 4-8 hour procedures. Better lock the database before, update one database, and then backup-restore a copy, and then unlock again.

Strategy for Offline/Online data synchronization

My requirement is that I have a server J2EE web application and a client J2EE web application. Sometimes the client can go offline. When the client comes back online, it should be able to synchronize changes in both directions. I should also be able to control which rows and tables are synchronized based on some filters or rules. Are there any existing Java frameworks for doing this? If I need to implement it on my own, what strategies can you suggest?
One solution in my mind is maintaining sql logs and executing same statements at other side during synchronization. Do you see any problems with this strategy?
There are a number of Java libraries for data synchronizing/replication. Two that I'm aware of are daffodil and SymmetricDS. In a previous life I foolishly implemented (in Java) my own data replication process. It seems like the sort of thing that should be fairly straightforward, but if the data can be updated in multiple places simultaneously, it's hellishly complicated. I strongly recommend you use one of the aforementioned projects to try and bypass dealing with this complexity yourself.
The biggest issue with synchronization is when the user edits something offline, and it is edited online at the same time. You need to merge the two changed pieces of data, or deal with the UI to allow the user to say which version is correct. If you eliminate the possibility of both being edited at the same time, then you don't have to solve this sticky problem.
The usual method is to add a 'modified' field to all tables and compare the client's modified value for a given record against the server's. If they don't match, you replace the server's data.
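A small sketch of that comparison, with one addition that is my assumption rather than part of the method above: the client also remembers when it last synchronized, so the "edited on both sides" case can be told apart from a plain one-sided change. `SyncAction` and `SyncDecider` are hypothetical names, and timestamps are epoch milliseconds.

```java
enum SyncAction { NONE, PUSH_TO_SERVER, PULL_FROM_SERVER, CONFLICT }

final class SyncDecider {
    // Decides what to do with one record, given both 'modified' values and
    // the time of the last successful sync (an assumed extra piece of state).
    static SyncAction decide(long clientModified, long serverModified, long lastSync) {
        boolean clientChanged = clientModified > lastSync;
        boolean serverChanged = serverModified > lastSync;
        if (clientChanged && serverChanged) {
            return SyncAction.CONFLICT;          // needs a merge or a human decision
        }
        if (clientChanged) {
            return SyncAction.PUSH_TO_SERVER;    // replace the server's data
        }
        if (serverChanged) {
            return SyncAction.PULL_FROM_SERVER;  // refresh the client's copy
        }
        return SyncAction.NONE;
    }
}
```

Comparing timestamps taken on different machines is sensitive to clock skew, so a per-record version counter may be a safer choice than wall-clock time.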
Be careful with autogenerated keys - you need to make sure your data integrity is maintained when you copy from the client to the server. Strictly running the SQL statements again on the server could put you in a situation where the autogenerated key has changed, and suddenly your foreign keys are pointing to different records than you intended.
Often when importing data from another source, you keep track of the primary key from the foreign source as well as your own personal primary key. This makes determining the changes and differences between the data sets easier for difficult synchronization situations.
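To illustrate the two-key bookkeeping, here is a minimal sketch in which the server hands out its own keys and remembers which client key each one came from. `KeyRemapper` and its methods are hypothetical names, and the in-memory map stands in for a mapping column or table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper that maps client-side primary keys to server-side ones.
final class KeyRemapper {
    private final Map<Long, Long> clientToServer = new HashMap<>();
    private long nextServerId;

    KeyRemapper(long firstFreeServerId) {
        this.nextServerId = firstFreeServerId;
    }

    // Imports a row: assigns a server key once and remembers the client's key.
    long importRow(long clientId) {
        Long existing = clientToServer.get(clientId);
        if (existing != null) {
            return existing; // already imported, so re-running the sync is harmless
        }
        long serverId = nextServerId++;
        clientToServer.put(clientId, serverId);
        return serverId;
    }

    // Rewrites a foreign key from the client's numbering to the server's.
    long translateForeignKey(long clientParentId) {
        Long serverId = clientToServer.get(clientParentId);
        if (serverId == null) {
            throw new IllegalStateException(
                    "Parent row " + clientParentId + " has not been imported yet");
        }
        return serverId;
    }
}
```

Child rows are imported after their parents, with each foreign key passed through `translateForeignKey`, so they keep pointing at the intended record even though the generated keys differ.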
Your synchronizer needs to identify when data can just be updated and when a human being needs to mediate a potential conflict. I have written a paper that explains how to do this using logging and algebraic laws.
What is best suited as the client-side data store in your application? You can choose from an embedded database like SQLite, a message queue, some object store, or (if none of these can be used since it is a web application) files or documents saved on the client using HTML5 browser storage such as Web SQL Database, IndexedDB, or localStorage.
Check the paper Gold Rush: Mobile Transaction Middleware with Java-Object Replication. Microsoft's documentation of occasionally connected systems describes two approaches: service-oriented (or message-oriented) and data-oriented. Gold Rush takes the former approach. The latter approach uses database merge replication.
