I want to create a Spring Boot app with an API that clients query to get data. I'd also like any change made in the DB to appear on the frontend in real time, similar to what Firebase does when a document is updated. I'm using an in-memory H2 database at the moment, as the data is not something I need to persist between runs (for now, I guess...).
The API is not a problem, as that part is already done; the real-time updates are what I'm stuck on.
I thought about implementing a pub-sub strategy or something like that, but I'm actually a bit lost about it.
I also know about the existence of WebFlux and have read up on it, but I'm not sure it fulfills my needs.
I have implemented CDC (Change Data Capture) using Debezium.
To set it up, you create a connector relevant to your DB (example); it uses Kafka Connect and keeps track of your DB operations.
Whenever a DB operation (insert, update, delete) occurs, it publishes a CDC message to a Kafka topic.
You can set up a Spring Kafka consumer application to listen to this Kafka topic, consume the CDC events, and react based on the type of operation [ op = c (create), u (update), d (delete) ].
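For illustration, a minimal sketch of such a consumer could look roughly like this (the topic name is a placeholder, and I'm assuming Debezium's default JSON envelope with a "payload" wrapper):

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class CdcEventListener {

    private final ObjectMapper mapper = new ObjectMapper();

    // Debezium derives topic names from server/schema/table; this one is made up.
    @KafkaListener(topics = "dbserver1.public.customers", groupId = "cdc-consumer")
    public void onCdcEvent(String message) throws Exception {
        JsonNode payload = mapper.readTree(message).path("payload");
        String op = payload.path("op").asText();
        if ("c".equals(op) || "u".equals(op)) {
            JsonNode after = payload.path("after");   // the new state of the row
            // e.g. push "after" to connected clients via WebSocket/SSE here
        } else if ("d".equals(op)) {
            JsonNode before = payload.path("before"); // the deleted row
            // e.g. notify clients that this row is gone
        }
        // "r" (snapshot read) events can usually be ignored for live updates
    }
}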
Here is a sample project that I have created. You can use it as a reference.
This is what the Kafka messages look like - link
I have a Spring Boot application which persists data to MongoDB and sends this data to Kafka. I want these two processes to run atomically: if data is persisted to Mongo, then it should be sent to Kafka. How can I do that?
With Kafka itself you can't.
Kafka offers transactions, but they are restricted to writing to multiple Kafka partitions atomically. They are designed with stream processing in mind, i.e. consuming from one topic and producing to another in one go, but a Kafka transaction cannot know whether a write to Mongo succeeded.
The use case you have comes up regularly, though. Usually you would use the outbox pattern: only modify one of the two resources (the database or Apache Kafka) and drive the update of the second one from that, in an eventually consistent manner.
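As a rough sketch of the outbox variant with Spring Data MongoDB (Order and OutboxEvent are made-up document classes here, and the transaction requires a MongoTransactionManager bean and a replica set):

import java.time.Instant;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class OrderService {

    private final MongoTemplate mongoTemplate;

    public OrderService(MongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    // Both documents are written in one Mongo transaction, so they succeed or fail together.
    @Transactional
    public void saveOrder(Order order) {
        mongoTemplate.insert(order);
        mongoTemplate.insert(new OutboxEvent("order-created", order.getId(), Instant.now()));
    }
}

A separate relay (a scheduled poller, or Debezium tailing the outbox collection) then reads the OutboxEvent documents, publishes them to Kafka and marks them as sent, which gives you the eventual consistency described above.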
If you really need atomic writes, I believe it is possible to do so by relying on the ACID guarantees Mongo >= 4.2 gives you instead of Kafka's transactional guarantees. But this would mean you need to manage the Kafka offsets in Mongo.
If you have "Kafka: The Definitive Guide" (2nd edition), there is a small chapter with more details about what exactly Kafka transactions can and cannot do, and possible workarounds.
I'm working with Kafka for the first time. I've set up Confluent Cloud locally and created topics along with some JDBC connectors to populate the topics with data. Now I want to try to trigger a connector from my Java application. The gist is that we have a large data feed that we want to run only once a day, triggered by an existing Java application. I already have the topic, database tables, and a JDBC connector with a custom query. This all works fine, produces the data I want, and I can see it coming in via the CLI, but now I need to trigger the pull from Java - is this scenario possible with Kafka?
The JDBC Kafka source connector is meant to be run continuously, but if you want to "trigger" it, that means using an HTTP client to make a POST request against the Kafka Connect REST API with mode=bulk (or incrementing/timestamp if you only want the data you added), plus a large poll.interval.ms when using bulk to prevent the table from being read multiple times. You'd add your query there too.
You would then somehow need to know when the connector's tasks have finished reading the data, at which point you would issue an HTTP DELETE to stop sourcing the database.
Or, rather than deleting the connector, you can set the poll interval to a day, leave it alone, and just have your database client insert the data as needed. You will still want to monitor whether the connector is actually successful each day.
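Either way, the "trigger" itself is just a call to the Kafka Connect REST API. Here is a rough sketch of the create-then-delete variant using Java's built-in HTTP client (the Connect URL, connector name, and connection settings are placeholders, not something specific to your setup):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ConnectorTrigger {

    private static final String CONNECT_URL = "http://localhost:8083/connectors";

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Create the connector: mode=bulk plus a huge poll.interval.ms so the query effectively runs once.
        String body = "{ \"name\": \"daily-feed-connector\", \"config\": {"
                + "\"connector.class\": \"io.confluent.connect.jdbc.JdbcSourceConnector\","
                + "\"connection.url\": \"jdbc:postgresql://localhost:5432/mydb\","
                + "\"mode\": \"bulk\","
                + "\"query\": \"SELECT * FROM my_feed_view\","
                + "\"topic.prefix\": \"daily-feed\","
                + "\"poll.interval.ms\": \"86400000\" } }";
        HttpRequest create = HttpRequest.newBuilder(URI.create(CONNECT_URL))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        System.out.println(client.send(create, HttpResponse.BodyHandlers.ofString()).body());

        // ...once you know the task has finished reading, remove the connector again:
        HttpRequest delete = HttpRequest.newBuilder(URI.create(CONNECT_URL + "/daily-feed-connector"))
                .DELETE()
                .build();
        client.send(delete, HttpResponse.BodyHandlers.ofString());
    }
}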
If I understand your question correctly, you are trying to do the following.
On a trigger, you want to pull data from a database using JDBC and push that data to Kafka. If this is your requirement, the following is one solution I can propose.
A Kafka producer is something you can easily create on your own. In fact, the Java code that pulls data from the database can itself also act as a Kafka producer.
You are connecting to the database and pulling data using a library. Similarly, there are libraries that allow you to connect to Kafka directly from Java code. You can use such a library together with your Java code to push data to Kafka topics. And when your Java code is able to push data to a Kafka topic, that in itself makes your Java code a Kafka producer.
Following is one example that you can refer for creating kafka producer: https://dzone.com/articles/kafka-producer-and-consumer-example
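As a minimal sketch of such a producer (the broker address and topic name are placeholders; the value would really come from your JDBC query results):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class FeedProducer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // In your case this record would be built from the rows you pulled via JDBC.
            producer.send(new ProducerRecord<>("my-feed-topic", "row-key", "row-as-json"));
        }
    }
}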
The solution I proposed is depicted in a simple diagram below.
I have created a Spring Boot + WebSocket + STOMP application with RabbitMQ configured. Now I need to get updates from an Oracle table into the queue upon a table INSERT (which is done by an external service), so that I can process them and send them to the exchange, which the broker will then deliver to the client application.
I would like to know if it is possible to get the table inserts into the RabbitMQ queue, so that I could write a queue listener and process the data. Can this be done via RabbitMQ configuration? If yes, could you please share any references so that I can dig deeper? I could not find any references online.
Now, if it is not possible to somehow configure an Oracle datasource in RabbitMQ, please let me know if there are any other ways I could achieve this.
My goal is to develop a repository that provides a Flux of live time series data starting from a certain time in the past. The repository should provide an API as follows:
public interface TimeSeriesRepository {

    // returns a Flux with incoming live data, without considering past data
    public Flux<TimeSeriesData> getLiveData();

    // returns a Flux with incoming live data starting at startTime
    public Flux<TimeSeriesData> getLiveData(Instant startTime);
}
The assumptions and constraints are:
the application is using Java 11, Spring Boot 2/Spring 5
the data is stored in a relational database such as PostgreSQL and is timestamped.
the data is regularly updated with new data from an external actor
a RabbitMQ broker is available and could be used (if appropriate)
should not include components that require a ZooKeeper cluster or similar, e.g. event logs such as Apache Kafka or Apache Pulsar, or stream processing engines such as Apache Storm or Apache Flink, because this is not a large-scale cloud application but should run on a regular PC (e.g. with 8 GB RAM)
My first idea was to use Debezium to forward incoming data to RabbitMQ and use Reactor RabbitMQ to create a Flux. That was my initial plan before I understood that the second repository method, which considers historical data, is also required; this solution would not provide historical data.
Thus, I considered using an event log such as Kafka so that I could replay data from the past, but found the operational overhead too high. So I dismissed this idea and did not even bother figuring out the details of how it could have worked or its potential drawbacks.
Now I have considered using Spring Data R2DBC, but I could not figure out what a query that fulfills my goal should look like.
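The rough shape I was imagining is something like the following, although I'm not sure it is the right direction (the derived query method, the timestamp field, and the sink-based live stream from Reactor 3.4's Sinks API are just assumptions on my part, not something I have working):

import java.time.Instant;
import org.springframework.data.repository.reactive.ReactiveCrudRepository;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Sinks;

interface TimeSeriesDataRepository extends ReactiveCrudRepository<TimeSeriesData, Long> {
    Flux<TimeSeriesData> findByTimestampGreaterThanEqualOrderByTimestamp(Instant startTime);
}

class R2dbcTimeSeriesRepository implements TimeSeriesRepository {

    private final TimeSeriesDataRepository repository;
    // Live data (e.g. consumed from RabbitMQ) would be emitted into this sink.
    private final Sinks.Many<TimeSeriesData> liveSink =
            Sinks.many().multicast().onBackpressureBuffer();

    R2dbcTimeSeriesRepository(TimeSeriesDataRepository repository) {
        this.repository = repository;
    }

    @Override
    public Flux<TimeSeriesData> getLiveData() {
        return liveSink.asFlux();
    }

    @Override
    public Flux<TimeSeriesData> getLiveData(Instant startTime) {
        // Replay history first, then continue with the live stream
        // (ignoring possible gaps or duplicates at the boundary for now).
        return Flux.concat(
                repository.findByTimestampGreaterThanEqualOrderByTimestamp(startTime),
                liveSink.asFlux());
    }
}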
How could I implement the interface using any of the mentioned tools, or maybe even with plain Java/Spring Data?
I will accept any answer that seems like a feasible approach.
I am working on a Java Spring Boot application. I have some application-specific constant values stored in the Cassandra database, and I need to hit the database on every request to fetch these constant values, which is bad practice. So I thought of fetching these values at application startup and storing them in static variables, so they can be accessed across the application. The problem I am facing is that after the application starts up and the data is fetched from the DB and stored in static variables, if we change the values in the database, these static variables are not updated with the new values until I restart the application. Is there a good approach to update these static variables whenever the values change in the database, without restarting the application? Kindly help me out. Thanks in advance.
One approach could be to use some kind of message broker like Kafka (http://kafka.apache.org/) to publish database changes to subscribers, or (without Kafka) to use some kind of push notification.
A possible setup could be something like this:
Service2 is used to change your "constants". So you should not change these values directly using cqlsh but only using this second service.
After a successful change, you could then send an HTTP request to a "refresh" endpoint of your regular Spring Boot application. This refresh notification can then trigger an update in your application. This is basically a callback approach.
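A rough sketch of what that refresh endpoint could look like (the cache class and endpoint path are made up; the actual reload would re-read the values from Cassandra):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.springframework.stereotype.Component;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RestController;

// Holds the "constants" in memory instead of static variables, so they can be reloaded at runtime.
@Component
class ConstantsCache {

    private final Map<String, String> values = new ConcurrentHashMap<>();

    String get(String key) {
        return values.get(key);
    }

    void reload() {
        // re-read the values from Cassandra here (e.g. via CassandraTemplate) and replace the map contents
    }
}

@RestController
class RefreshController {

    private final ConstantsCache cache;

    RefreshController(ConstantsCache cache) {
        this.cache = cache;
    }

    // Service2 calls this endpoint after it has successfully changed the values in Cassandra.
    @PostMapping("/refresh")
    public void refresh() {
        cache.reload();
    }
}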
The second and more flexible way would be to use a message broker like Kafka.
With Kafka you would still create a service which you use to change your "constants". The difference here is that after you successfully change your data, you send a message to Kafka. Kafka can then dispatch the message to each service that has registered itself as a subscriber. Using this option, you are free to add more than one subscriber, in case you have more services depending on your "constants".
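With the Kafka option, the subscriber side in your Spring Boot application could be as small as this (reusing the ConstantsCache idea from the sketch above; the topic name is a placeholder):

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class ConstantsRefreshListener {

    private final ConstantsCache cache;

    public ConstantsRefreshListener(ConstantsCache cache) {
        this.cache = cache;
    }

    // The service that changes the "constants" publishes to this topic after each successful update.
    @KafkaListener(topics = "constants-changed", groupId = "my-spring-app")
    public void onConstantsChanged(String message) {
        cache.reload();
    }
}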
Personally I'm not a fan of a polling solution where you check every x seconds if a change is present.