Loading data from Cassandra at application startup - java

I am working on a Java Spring Boot application. Some application-specific constant values are stored in a Cassandra database, and currently I hit the database on every request to fetch them, which is bad practice. So I thought of fetching these values at application startup, storing them in static variables, and accessing those variables across the application. The problem I am facing is that once the data has been fetched at startup and stored in static variables, any later change to the values in the database is not reflected in the static variables until I restart the application. Is there a good approach to update these static variables whenever the values change in the database, without restarting the application? Kindly help me out. Thanks in advance.

One approach could be to use some kind of message broker like Kafka (http://kafka.apache.org/) to publish database changes to subscribers, or (without Kafka) to use some kind of push notification.
A possible setup could be something like this:
A second service (call it Service2) is used to change your "constants". So you should not change these values directly using cqlsh, but only through this second service.
After a successful change, Service2 sends an HTTP request to a "refresh" endpoint of your regular Spring Boot application. This refresh notification can then trigger an update in your application. This is basically a callback approach.
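A minimal sketch of that refresh endpoint, assuming the constants live in a cache component; ConstantsCache, ConstantsRepository, and the Constant entity are all hypothetical names standing in for your own classes:

```java
import java.util.Map;
import java.util.stream.Collectors;

import jakarta.annotation.PostConstruct; // javax.annotation on Spring Boot 2

import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Component;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical cache holding the "constants" loaded from Cassandra.
@Component
class ConstantsCache {

    private final ConstantsRepository repository; // assumed Cassandra repository
    private volatile Map<String, String> constants = Map.of();

    ConstantsCache(ConstantsRepository repository) {
        this.repository = repository;
    }

    @PostConstruct
    public void refresh() {
        // Re-read all constants in one query and swap the reference atomically.
        constants = repository.findAll().stream()
                .collect(Collectors.toUnmodifiableMap(Constant::getKey, Constant::getValue));
    }

    public String get(String key) {
        return constants.get(key);
    }
}

@RestController
class RefreshController {

    private final ConstantsCache cache;

    RefreshController(ConstantsCache cache) {
        this.cache = cache;
    }

    // Service2 calls this endpoint after it has written the new values.
    @PostMapping("/internal/refresh-constants")
    public ResponseEntity<Void> refreshConstants() {
        cache.refresh();
        return ResponseEntity.noContent().build();
    }
}
```

Because the map reference is volatile and swapped in one assignment, readers always see either the old or the new complete set of values, never a half-updated one.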
The second and more flexible way is to use a message broker like Kafka.
With Kafka you would still create a service that you use to change your "constants". The difference is that after you successfully change your data, you send a message to Kafka. Kafka then dispatches your message to every service that has registered itself as a subscriber. With this option you are free to add more than one subscriber, in case more services depend on your "constants"; see the sketch below.
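A rough sketch of such a subscriber with Spring Kafka, reusing the hypothetical ConstantsCache from above; the topic and group names are assumptions:

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

// Each service depending on the "constants" registers a listener like this one.
@Component
public class ConstantsChangedListener {

    private final ConstantsCache cache; // hypothetical cache from the earlier sketch

    public ConstantsChangedListener(ConstantsCache cache) {
        this.cache = cache;
    }

    // "constants-changed" is an assumed topic name; the editing service
    // publishes a message here after every successful write.
    @KafkaListener(topics = "constants-changed", groupId = "my-app")
    public void onConstantsChanged(String message) {
        cache.refresh(); // re-read the values from Cassandra
    }
}
```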
Personally I'm not a fan of a polling solution where you check every x seconds whether a change is present.

Related

Call a function on db update

I want to create a Spring Boot app with an API that clients query to get data. I'd also like any change made in the DB to appear on the front end in real time, similar to what Firebase does when a document updates. I'm using an in-memory H2 database at the moment, as it is not something I need to persist between runs (for now, I guess...).
The API is not a problem, as that part is already done, but the real-time updates are.
I thought about implementing a pub-sub strategy or something like that, but I'm a bit lost about it, actually.
I also know about WebFlux and have read up on it, but I'm not sure it fulfills my needs.
I implemented CDC (Change Data Capture) using Debezium.
To set it up, you create the relevant connector for your DB (example); it uses Kafka Connect and keeps track of your DB operations.
When any DB operation (insert, update, delete) occurs, it publishes a CDC message to a Kafka topic.
You can set up a Spring Kafka consumer application to listen to this topic, consume the CDC events, and react based on the type of operation [op = c (create), u (update), d (delete)]; a consumer sketch follows below.
Here is a sample project that I have created. You can use it as a reference.
This is what the Kafka messages look like - link
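A rough sketch of such a consumer; the topic name is a placeholder (Debezium derives it from the server/database/table configuration), and depending on your converter settings the envelope may be nested under a "payload" field, so check your connector's actual output:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class CdcEventListener {

    private final ObjectMapper mapper = new ObjectMapper();

    // Assumed topic name; adjust to what your Debezium connector publishes.
    @KafkaListener(topics = "dbserver1.inventory.customers", groupId = "cdc-consumer")
    public void onCdcEvent(String payload) throws Exception {
        JsonNode envelope = mapper.readTree(payload);
        String op = envelope.path("op").asText();
        switch (op) {
            case "c" -> handleCreate(envelope.path("after"));
            case "u" -> handleUpdate(envelope.path("before"), envelope.path("after"));
            case "d" -> handleDelete(envelope.path("before"));
            default -> { /* snapshot reads ("r") etc. can be ignored or handled here */ }
        }
    }

    private void handleCreate(JsonNode after) { /* react to inserts */ }
    private void handleUpdate(JsonNode before, JsonNode after) { /* react to updates */ }
    private void handleDelete(JsonNode before) { /* react to deletes */ }
}
```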

Kafka Producer - Just need some direction

I'm working with Kafka for the first time. I've set up Confluent locally and created topics along with some JDBC connectors to populate the topics with data. Now I want to trigger a connector from my Java application. The gist is that we have a large data feed that we want to run only once a day, triggered by an existing Java application. I already have the topic, the database tables, and a JDBC connector with a custom query. This all works fine and produces the data I want, and I can see it coming in via the CLI, but now I need to trigger the pull from Java - is this scenario possible with Kafka?
The JDBC Kafka source connector is meant to run continuously, but if you want to "trigger" it, you would use an HTTP client to POST the connector configuration to the Kafka Connect REST API: use mode=bulk (or mode=incrementing / mode=timestamp to pick up only the data you added) and, if using bulk, a large poll.interval.ms to prevent the table from being read multiple times. You'd add your query there too.
You would then somehow need to know when the connector's tasks have finished reading the data, at which point you would issue an HTTP DELETE to stop sourcing the database.
Or, rather than deleting the connector, you can set the poll interval to a day, leave it alone, and just have your database client insert the data as needed. You will still want to monitor whether the connector actually succeeds each day. A sketch of the create/delete requests follows below.
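A hedged sketch of that trigger using Java's built-in HTTP client against the Kafka Connect REST API; the Connect URL, connector name, JDBC URL, and query are all placeholders:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class DailyFeedTrigger {

    private static final String CONNECT_URL = "http://localhost:8083"; // assumed Connect worker

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Create a bulk-mode JDBC source connector; it reads the whole query result
        // once per poll.interval.ms, so the huge interval stops it from re-reading.
        String body = """
                {
                  "name": "daily-feed",
                  "config": {
                    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
                    "connection.url": "jdbc:postgresql://db:5432/mydb",
                    "mode": "bulk",
                    "query": "SELECT * FROM daily_feed_view",
                    "topic.prefix": "daily-feed",
                    "poll.interval.ms": "86400000"
                  }
                }""";

        HttpRequest create = HttpRequest.newBuilder(URI.create(CONNECT_URL + "/connectors"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        client.send(create, HttpResponse.BodyHandlers.ofString());

        // ...once you know the tasks have finished reading, tear the connector down:
        HttpRequest delete = HttpRequest.newBuilder(URI.create(CONNECT_URL + "/connectors/daily-feed"))
                .DELETE()
                .build();
        client.send(delete, HttpResponse.BodyHandlers.ofString());
    }
}
```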
If I understand your question correctly, you are trying to do the following.
On a trigger, you want to pull data from a database using JDBC and push that data to Kafka. If this is your requirement, the following is one solution I can propose.
A Kafka producer is something you can easily create on your own. In fact, the Java code that pulls data from the database can itself act as the Kafka producer.
You are already connecting to the database and pulling data using a library. Similarly, there are libraries that let you connect to Kafka directly from Java code. You can use such a library alongside your existing Java code to push data to Kafka topics, and once your Java code pushes data to a Kafka topic, that in itself makes it a Kafka producer.
Following is one example that you can refer for creating kafka producer: https://dzone.com/articles/kafka-producer-and-consumer-example
The solution I proposed is depicted in a simple diagram below.
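For reference, a bare-bones producer with the plain Kafka client looks roughly like this; the broker address, topic name, and key/value are placeholders:

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class FeedProducer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // In your case the value would be a row pulled from the database via JDBC.
            producer.send(new ProducerRecord<>("daily-feed", "some-key", "some-value"));
        } // close() flushes any pending records
    }
}
```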

Find out who's using Redis

We have one Redis instance for our company and multiple teams are using it. We are getting a surge of requests and nobody seems to know which application is causing it. There is only one password that goes around the whole company, and our Redis is secured behind a VPN, so we know it's not coming from outside.
Is there a way to know who's using Redis? Maybe we can pass some identifying metadata with the connection from every app to see who makes the most requests, etc.
We use Spring Data Redis for our communication.
This question is quite broad, since different strategies can be used here:
Use the Redis MONITOR command. This is basically a built-in debugging tool that streams back every command executed by Redis.
Use some kind of intermediate proxy. Instead of routing all commands directly to Redis, route everything through a proxy that does some processing, such as counting commands per calling host or per command type, depending on what you want.
This is still only a configuration-level solution, so you won't need any changes at the application level.
Since you have Spring Boot, you can use the Micrometer metering integration. This way you could create a counter or gauge that gets updated on each request to Redis. If you also ship the metering data to a tool like Prometheus, you'll be able to build a dashboard, say in Grafana, to see the whole picture. Micrometer integrates with other products too; Prometheus/Grafana is only an example, and you can choose any other solution (maybe your organization already has something like that). A sketch follows below.
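A minimal sketch of the Micrometer idea, assuming the apps funnel their Redis reads through a wrapper; the wrapper class, metric name, and app tag are all hypothetical:

```java
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;

import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Service;

// Hypothetical wrapper: every Redis read goes through here and bumps a counter
// tagged with the application name, so a dashboard can break usage down per app.
@Service
public class MeteredRedisClient {

    private final StringRedisTemplate redis;
    private final Counter redisCalls;

    public MeteredRedisClient(StringRedisTemplate redis, MeterRegistry registry) {
        this.redis = redis;
        this.redisCalls = Counter.builder("redis.calls")
                .tag("app", "billing-service") // assumed application name
                .register(registry);
    }

    public String get(String key) {
        redisCalls.increment();
        return redis.opsForValue().get(key);
    }
}
```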

Auto refresh from database across all instances

In my Java/Spring application, a database record is fetched at server init and stored in a static field. Currently we do an MBean refresh to update the database values across all instances. Is there any other way to programmatically refresh the database value across all instances of the server? I am reading about EntityManager refresh. Will that work across all instances? Any help would be greatly appreciated.
You could schedule a reload every 5 minutes, for example (see the sketch below).
Or you could send events and have every instance react to that event.
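The scheduled variant is the simplest. A sketch with Spring's @Scheduled, where ConfigCache and its refresh method are hypothetical stand-ins for whatever holds your static field:

```java
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

// Requires @EnableScheduling on a @Configuration class.
@Component
public class ConfigReloader {

    private final ConfigCache cache; // hypothetical holder of the cached record

    public ConfigReloader(ConfigCache cache) {
        this.cache = cache;
    }

    // Runs on every instance independently, so all nodes converge
    // on the new value within 5 minutes of a database change.
    @Scheduled(fixedDelay = 300_000)
    public void reload() {
        cache.refreshFromDatabase();
    }
}
```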
Traditionally, communication between databases and app servers is one-sided: the app server requests data from the database. This generally leads to the problem you mention, that application servers cannot know about a database change when the application runs in cluster mode.
The usual solution is to refresh the fields from time to time (a poll-based technique).
To make this a push-based model, we can create wrapper APIs over the database and let those wrapper APIs pass the change on to all the application servers.
By this I mean: do not update database values directly from one application server; instead, on an update request, send the change to another application that keeps track of your application servers and pushes an event (via API call or queue) telling them to refresh the affected table.
Luckily, some newer databases (like MongoDB) now provide this update push to app servers out of the box.
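For instance, MongoDB exposes this as change streams. A rough sketch with the MongoDB Java driver; the connection string, database, and collection names are placeholders, and change streams require a replica set:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.changestream.ChangeStreamDocument;

import org.bson.Document;

public class ConfigWatcher {

    public static void main(String[] args) {
        // Placeholder URI; the server must run as a replica set for change streams.
        MongoClient client = MongoClients.create("mongodb://localhost:27017");
        MongoCollection<Document> config = client.getDatabase("app").getCollection("config");

        // Blocks and receives one event per insert/update/delete on the collection.
        // Note: getFullDocument() may be null for updates unless configured otherwise.
        for (ChangeStreamDocument<Document> change : config.watch()) {
            System.out.println(change.getOperationType() + ": " + change.getFullDocument());
            // refresh the locally cached record here
        }
    }
}
```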

Best approach for avoiding database polling using Spring

I have a web service for reading and updating data, using Spring and Spring JDBC for DB access. My controller can be accessed through many channels, like desktop, mobile, etc. If data is updated from desktop, the same change should be reflected on mobile immediately. The current approach is to call the service continuously to get updated data. I feel that is the worst approach, and it is causing DB performance issues as well.
Is there a way to have the GET service called only when there is a DB update from another channel, instead of continuous polling? What is the best approach for this, and how do I implement it?
Continuously calling the service seems like a really bad idea. I think you need a database trigger that fires when rows are inserted/updated/deleted. It could POST something to a web service or put something on a message queue.
Good luck.
I can think of an architectural answer to the problem: use a messaging solution between the Spring controller and the database. In fact, you will need two queues.
EventSink queue -
Publish all data change requests originating from any of the channels to this queue. The subscriber will be the service managing the database update (call it dbservice).
EventBroadcast queue -
Publish the changed data to this queue after the DB update. Ideally dbservice should handle this publish within the same transaction as the DB update. All channels can subscribe to this queue to receive the update. (A sketch of the dbservice side follows after the list.)
The merits of this approach:
Pros - it involves no database polling, so you gain both performance and decoupling from database changes.
Cons - increased complexity.
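A compressed sketch of the dbservice side, using Spring Kafka as the messaging layer; the queue/topic names, payload type, and DataRepository are assumptions, and any broker would do:

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class DbService {

    private final DataRepository repository;          // hypothetical JDBC/JPA repository
    private final KafkaTemplate<String, String> kafka;

    public DbService(DataRepository repository, KafkaTemplate<String, String> kafka) {
        this.repository = repository;
        this.kafka = kafka;
    }

    // Consumes change requests from any channel via the EventSink queue...
    @KafkaListener(topics = "event-sink", groupId = "dbservice")
    @Transactional
    public void onChangeRequest(String changeRequest) {
        repository.apply(changeRequest);              // hypothetical update method
        // ...and broadcasts the result so every channel can refresh.
        // Note: making this send truly atomic with the DB commit needs extra
        // machinery (e.g. a transactional outbox); this sketch glosses over that.
        kafka.send("event-broadcast", changeRequest);
    }
}
```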
Continuous polling is not as bad as you might imagine. Pushing messages to clients without them making a request requires web-sockets or the like. If the response from the server is not large and polling is not too frequent, as in many millions and millions of requests, then I would leave it for now.
If, however, a large amount of bandwidth is involved, then you wouldn't want to be polling. You would probably want to look at a subscriber-type pattern, whereby clients subscribe to be notified when a specific event occurs. When this event occurs, the server then sends a message to the clients.
Detecting this event shouldn't require polling the database. The modification to the database should trigger the event. You might do this with point-cuts in Spring, if you are into that sort of thing.
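The point-cut idea could look roughly like this, assuming all writes go through a DataRepository bean (the package, bean name, and event type are hypothetical, and spring-boot-starter-aop is on the classpath):

```java
import org.aspectj.lang.annotation.AfterReturning;
import org.aspectj.lang.annotation.Aspect;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.stereotype.Component;

// Fires after any save/delete method on the repository completes,
// without the repository itself knowing anything about notifications.
@Aspect
@Component
public class DataChangedAspect {

    private final ApplicationEventPublisher publisher;

    public DataChangedAspect(ApplicationEventPublisher publisher) {
        this.publisher = publisher;
    }

    @AfterReturning("execution(* com.example.DataRepository.save*(..)) || " +
                    "execution(* com.example.DataRepository.delete*(..))")
    public void afterModification() {
        // From here, notify subscribers (web-socket push, message queue, ...).
        publisher.publishEvent(new DataChangedEvent());
    }

    // Minimal marker event; a real one would carry details of what changed.
    public record DataChangedEvent() {}
}
```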
