What is the elegant way of halting consumption of messages when an exception happens in the consumer or listener, so that the messages can be re-queued? The listener process consumes messages from the queue and calls a different API. If that API is not available, we don't want to consume messages from the queue. Is there any way to stop consuming messages from the queue for a finite time and resume once the API is available again?
Any sample code snippet of how this can be done would also help.
When asking questions like this, it's generally best to show a snippet of your configuration so we can answer appropriately, depending on how you are using the framework.
You can simply call stop() (and start()) on the listener container bean.
If you are using @RabbitListener, the containers are not beans, but they are available via the RabbitListenerEndpointRegistry bean.
Calling stop() on the registry stops all the containers.
Or, you can call registry.getListenerContainer(id).stop(), where id is the value of the @RabbitListener's id property.
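For example, here is a minimal sketch (the listener id "myListener" and the surrounding class are assumptions) of stopping and restarting a single container through the registry:

import org.springframework.amqp.rabbit.listener.RabbitListenerEndpointRegistry;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

@Component
public class ConsumptionController {

    @Autowired
    private RabbitListenerEndpointRegistry registry;

    // Stop the container while the downstream API is unavailable;
    // "myListener" matches @RabbitListener(id = "myListener", ...).
    public void pauseConsumption() {
        registry.getListenerContainer("myListener").stop();
    }

    // Resume consumption once the API is reachable again.
    public void resumeConsumption() {
        registry.getListenerContainer("myListener").start();
    }
}

You could call pauseConsumption() from your error handler and resumeConsumption() from a scheduled health check against the API.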
I have an application which reads from a Kafka queue and goes on like this:
validate->convert->enrich->persist->notify
In the flow, I'm gathering some performance and other data points into a ThreadLocal container.
In the happy scenario I'm sending this information to a service to be used later in reporting. But the pipeline can stop at any step if that step fails due to a known error (e.g., conversion failed, so the flow should stop there). I do not want each of these processors to contain code that sends the information in the ThreadLocal to the reporting service when the execution results in an error, as that would couple those services to information not related to their task.
It would be nice to have a way to execute a service at the end of the flow to send this information out, no matter at which step the pipeline stops moving forward. There could also be scenarios where some code threw an unexpected exception or another issue broke the flow.
Is there a way to have a final operation executed regardless of the result of the pipeline, similar to a finally block in Java, so that it can be used to send this information?
An integration flow is not like a simple Java try...catch...finally. It is really more about distributed computation and the loose-coupling principle between components. So, even if you tie endpoints together with channels in between, they really have nothing to know about the previous and next steps: everything is done in the current endpoint with its input and output channels. Therefore your request for something like finally in the flow does not fit the EIP concepts and cannot be implemented as a primitive in the framework.
You are just lucky in your use case that you can rely on a ThreadLocal for your flow logic, but keep in mind that this is not the way to deal with messaging. Messaging really has to be stateless, with state scoped only to the message traveling from one endpoint to another. Therefore it might be better to revise your logic in favor of storing such tracing information in the headers of the message at each step. That way you can later make the flow fully async, or even distributed across the network.
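For example, a rough sketch (the "perfData" header name and the doConvert() helper are made up for illustration) of carrying such data in a message header instead of a ThreadLocal:

import org.springframework.integration.support.MessageBuilder;
import org.springframework.messaging.Message;

// Inside one step (e.g. the "convert" handler): copy the incoming headers and
// append this step's timing to an assumed "perfData" header.
public Message<Object> convert(Message<Object> in) {
    long start = System.currentTimeMillis();
    Object converted = doConvert(in.getPayload()); // assumed conversion logic
    String perf = (String) in.getHeaders().getOrDefault("perfData", "");
    return MessageBuilder.withPayload(converted)
            .copyHeaders(in.getHeaders())
            .setHeader("perfData", perf + "convert=" + (System.currentTimeMillis() - start) + "ms;")
            .build();
}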
This is just my concern about the design you have so far.
For the current error-handling problem, consider putting that "final" step behind some well-known channel, so you are free to send a message to that endpoint from wherever you need. For example, you can wrap problematic endpoints in an ExpressionEvaluatingRequestHandlerAdvice, handle the error there, and send it to the mentioned channel. This way your business method stays free of error-handling code. See more in the docs: https://docs.spring.io/spring-integration/docs/current/reference/html/messaging-endpoints.html#expression-advice
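A minimal sketch of that idea (channel and bean names are assumptions, and a reasonably recent Spring Integration version is assumed), applying the advice to one problematic step and routing failures to a dedicated reporting channel:

import org.springframework.context.annotation.Bean;
import org.springframework.integration.annotation.ServiceActivator;
import org.springframework.integration.handler.advice.ExpressionEvaluatingRequestHandlerAdvice;

// In a @Configuration class:
@Bean
public ExpressionEvaluatingRequestHandlerAdvice reportingAdvice() {
    ExpressionEvaluatingRequestHandlerAdvice advice = new ExpressionEvaluatingRequestHandlerAdvice();
    // Evaluated when the handler throws; its result travels with the ErrorMessage.
    advice.setOnFailureExpressionString("payload");
    // The ErrorMessage goes to "reportingChannel" (assumed name), where the "final" reporting step lives.
    advice.setFailureChannelName("reportingChannel");
    advice.setTrapException(true); // don't re-throw to the caller
    return advice;
}

// On the problematic handler bean, keeping the business method clean:
@ServiceActivator(inputChannel = "convertChannel", adviceChain = "reportingAdvice")
public Object convert(Object payload) {
    // conversion logic that may fail
    return payload;
}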
If your flow starts from some gateway or inbound channel adapter, you can configure an errorChannel there to catch all the downstream errors in one central place. And again: send the handling result to the mentioned channel.
But no: there is no finally in the framework at the moment, and I doubt it would even be suitable in the future, for the messaging and async reasons I explained above.
I am using spring-kafka to implement a consumer that reads messages from a certain topic. All of these messages are processed by exporting them into another system via a REST API. For that, the code uses the WebClient from the Spring WebFlux project, which results in reactive code:
@KafkaListener(topics = "${some.topic}", groupId = "my-group-id")
public void listenToTopic(final ConsumerRecord<String, String> record) {
    // minimal, non-reactive code here (logging, serializing the string)
    webClient.get().uri(...).retrieve().bodyToMono(String.class)
        // long, reactive chain here
        .subscribe();
}
Now I am wondering if this setup is reasonable or if this could cause a lot of issues because the KafkaListener logic from spring-kafka isn't inherently reactive. I wonder if it is necessary to use reactor-kafka instead.
My understanding of the whole reactive world and also the Kafka world is very limited, but here is what I am currently assuming the above setup would entail:
1) The listenToTopic function will return almost immediately, because the bulk of the work is done in a reactive chain, which will not block the function from returning. From what I understand, this means the KafkaListener logic will assume that the message has been properly processed right there and then, so it will probably acknowledge it and at some point also commit it.
2) If I understand correctly, that also means the processing of the messages could get out of order. Work could still be going on in the previous reactive chain while the KafkaListener already fetches the next record. So if the application relies on the messages being fully processed in strict order, the above setup would be bad; but if it does not, then the above setup would be okay?
3) Another issue with the above setup is that the application could overload itself with work if a lot of messages come in. Because the listener function returns almost immediately, a large number of messages could be processing inside reactive chains at the same time.
4) The retry logic that comes built in with the @KafkaListener logic would not really work here, because exceptions inside the reactive chain would not trigger it. Any retry logic would have to be handled by the reactive code inside the listener function itself.
5) When using reactor-kafka instead of the @KafkaListener annotation, one could change the behaviour described in point 1. Because the listener would be integrated into the reactive chain, it would be possible to acknowledge a message only when the reactive chain has actually finished. That way, from what I understand, the next message is only fetched after one message has been fully processed via the reactive chain. This would probably solve the issues described in points 2-4 as well.
The question: Is my understanding of the situation correct? Are there other issues that could be caused by this setup that I have missed?
Your understanding is correct; either switch to a non-reactive REST client (e.g. RestTemplate) or use reactor-kafka for the consumer.
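To illustrate the reactor-kafka option, here is a rough sketch (the topic name, consumer properties, and the WebClient call are placeholders), where each record is only acknowledged after its REST call completes and concatMap keeps processing sequential:

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.kafka.receiver.KafkaReceiver;
import reactor.kafka.receiver.ReceiverOptions;

public class ReactiveConsumerSketch {

    public static void main(String[] args) {
        WebClient webClient = WebClient.create("http://example.org"); // assumed target system

        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group-id");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        ReceiverOptions<String, String> options = ReceiverOptions.<String, String>create(props)
                .subscription(Collections.singleton("some-topic")); // assumed topic

        KafkaReceiver.create(options)
                .receive()
                // one record at a time: the next record is only requested after the
                // REST call for the current one has completed
                .concatMap(record -> webClient.get()
                        .uri("/export") // placeholder URI
                        .retrieve()
                        .bodyToMono(String.class)
                        .thenReturn(record))
                // acknowledge (and eventually commit) only after successful processing
                .doOnNext(record -> record.receiverOffset().acknowledge())
                .subscribe();
    }
}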
Let me try explaining the situation:
There is a messaging system that we are going to incorporate which could either be a Queue or Topic (JMS terms).
1) Producer/Publisher: There is a service A. A produces messages and writes them to a Queue/Topic.
2) Consumer/Subscriber: There is a service B. B asynchronously reads messages from the Queue/Topic. B then calls a web service and passes the message to it. The web service takes a significant amount of time to process the message. (This action need not be processed in real time.)
The message broker is TIBCO EMS.
My intention is: not to miss processing any message from A, and to re-process a message at a later point in time in case the processing failed the first time (perhaps as a batch).
Question:
I was thinking of writing the message to a DB before making the web service call. If the call succeeds, I would mark the message as processed; otherwise as failed. Later, a cron job would re-process all the requests that initially failed.
Is writing to a DB a typical way of doing this?
Since you have a fail callback, you can just requeue your Message and have your Consumer/Subscriber pick it up and try again. If it failed because of some problem in the web service and you want to wait X time before trying again, you can either schedule the web service call for a later time for that specific Message (look into ScheduledExecutorService) or do as you described and use a cron job with some database entries.
If you only want it to try again once per message, then keep an internal counter either with the Message or within a Map<Message, Integer> as a counter for each Message.
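A rough in-memory sketch of that idea (class names, the web service call, and the delay value are made up for illustration), retrying a failed message after a pause and giving up after a couple of attempts:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class RetryingConsumer {

    private static final int MAX_ATTEMPTS = 2;

    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final Map<String, Integer> attempts = new ConcurrentHashMap<>(); // keyed by message id

    public void onMessage(String messageId, String payload) {
        try {
            callWebService(payload);
            attempts.remove(messageId); // success: forget the counter
        } catch (Exception e) {
            int count = attempts.merge(messageId, 1, Integer::sum);
            if (count < MAX_ATTEMPTS) {
                // wait before trying again so the web service has time to recover
                scheduler.schedule(() -> onMessage(messageId, payload), 5, TimeUnit.MINUTES);
            } else {
                // give up; hand the message to a failed table for the cron job to pick up later
                storeAsFailed(messageId, payload);
            }
        }
    }

    private void callWebService(String payload) { /* assumed web service call */ }

    private void storeAsFailed(String messageId, String payload) { /* assumed persistence */ }
}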
Crudely put, that is the technique, although there could be out-of-the-box solutions available that you can use. Typical ESB solutions support reliable messaging. Have a look at Mule ESB or Apache ActiveMQ as well.
It might be interesting to take advantage of the EMS platform you already have (example 1) instead of building a custom solution (example 2).
But it all depends on the implementation language:
Example 1 - EMS is the "keeper": If I were to solve this problem with TIBCO BusinessWorks, I would use the "JMS transaction" feature of BW. By encompassing the EMS read and the WS call within the same "group", you ask for them to be either both applied or not at all. If the call fails for some reason, the message is returned to EMS.
Two problems with this solution: you might not have BW, and the first failed operation would block the rest of the batch process (that may be the desired behavior).
FYI, I understand it is possible to use this feature in "pure Java", but I have never tried it: http://www.javaworld.com/javaworld/jw-02-2002/jw-0315-jms.html
Example 2 - A DB is the "keeper": If you go with your "DB" method, your queue/topic consumer continuously inserts data into a DB, and each record represents a task to be executed. This feels an awful lot like the simple "mapping engine" problem every integration middleware aims to make easier. You could solve this with anything from custom Java code and multiple threads (DB inserter, WS job handlers, etc.) to an EAI middleware (like BW) or even a BPM engine (TIBCO has many solutions for that).
Of course, there are also other vendors... EMS is a JMS standard implementation, as you know.
I would recommend using the built-in EMS (and JMS) features, as "guaranteed delivery" is what it's built for ;) - no DB needed at all...
You need to be aware that the first decisions will be:
do you need to deliver in order? (then only one JMS session and client acknowledge mode should be used)
how often, and at what recurring intervals, do you want to retry? (so as not to create an infinite loop for a message that cannot be processed by that web service)
This is independent of whatever kind of client you use (TIBCO BW or e.g. Java onMessage() in an MDB).
For "in order" delivery: make shure only 1 JMS Session processes the messages and it uses Client acknolwedge mode. After you process the message sucessfully, you need to acknowledge the message with either calling the JMS API "acknowledge()" method or in TIBCO BW by executing the "commit" activity.
In case of an error you don't execute the acknowledge for the method, so the message will be put back in the Queue for redelivery (you can see how many times it was redelivered in the JMS header).
EMS's Explicit Client Acknolwedge mode also enables you to do the same if order is not important and you need a few client threads to process the message.
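A plain JMS sketch of that pattern (the connection factory, queue name, and web service call are assumptions, and text messages are assumed), acknowledging only after the call succeeds:

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;

public class ClientAckConsumer {

    public void consume(ConnectionFactory connectionFactory) throws Exception {
        Connection connection = connectionFactory.createConnection();
        // CLIENT_ACKNOWLEDGE: the message stays on the broker until we explicitly acknowledge it
        Session session = connection.createSession(false, Session.CLIENT_ACKNOWLEDGE);
        Queue queue = session.createQueue("service.a.queue"); // assumed queue name
        MessageConsumer consumer = session.createConsumer(queue);
        connection.start();

        Message message;
        while ((message = consumer.receive()) != null) {
            try {
                callWebService(((TextMessage) message).getText()); // assumed web service call
                message.acknowledge(); // only acknowledge after successful processing
            } catch (Exception e) {
                // no acknowledge: recover the session so the broker redelivers, subject to
                // the queue's max-redelivery and redelivery-delay settings
                session.recover();
            }
        }
    }

    private void callWebService(String payload) { /* assumed call */ }
}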
For controlling how often the message gets processed, use:
the max redelivery property of the EMS queue (e.g. you could move the message to the dead letter queue after x redeliveries so it does not hold up other messages)
the redelivery delay, to put a "pause" between redeliveries. This is useful in case the web service needs to recover after a crash and should not be stormed by the same message again and again in quick succession through redelivery.
Hope that helps
Cheers
Seb
I am trying to implement a custom inbound channel adapter in Spring Integration to consume messages from Apache Kafka. Based on the Spring Integration examples, I found that I need to create a class that implements the MessageSource interface and implement the receive() method, which returns the consumed Message from Kafka. But based on the consumer example in Kafka, the message iterator in KafkaStream is backed by a BlockingQueue. So if there are no messages in the queue, the thread will be blocked.
So what is the best way to implement the receive() method, given that it can potentially block until there is something to consume?
More generally, how do we implement a custom inbound channel adapter for streaming message sources that block until there is something ready to consume?
The receive() method can block (as long as the underlying operation responds properly to an interrupted thread), and from an inbound-channel-adapter perspective, depending on the expectations of the underlying source, it might be preferable to use a fixed-delay trigger. For example, "long polling" can simulate event-driven behavior when a very small delay value is provided.
We have a similar situation in our JMS polling MessageSource implementation. There, the underlying behavior is handled by one of the JmsTemplate's receive() methods. The JmsTemplate itself allows configuration of a timeout value. That means, as an example, you may choose to block for 5-seconds max but then have a very short delay trigger between each blocking receive call. Alternatively, you can specify an indefinite receive timeout. The decision ultimately depends on the expectations of the underlying resource, message throughput, etc.
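For illustration, a minimal sketch (class and queue names are made up, and current package names are assumed) of a polled MessageSource whose receive() blocks for a bounded time, in the spirit of the JMS receive-with-timeout approach described above:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import org.springframework.integration.core.MessageSource;
import org.springframework.integration.support.MessageBuilder;
import org.springframework.messaging.Message;

public class BlockingQueueMessageSource implements MessageSource<String> {

    private final BlockingQueue<String> queue; // stands in for the underlying consumer stream

    public BlockingQueueMessageSource(BlockingQueue<String> queue) {
        this.queue = queue;
    }

    @Override
    public Message<String> receive() {
        try {
            // block for at most 5 seconds; a short fixed-delay trigger between polls
            // then gives "long polling" behavior without blocking forever
            String payload = queue.poll(5, TimeUnit.SECONDS);
            return (payload != null) ? MessageBuilder.withPayload(payload).build() : null;
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}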
Also, I wanted to let you know that we are exploring Kafka adapters ourselves. Perhaps you would like to collaborate on this within the spring-integration-extensions repository?
Regards,
Mark
I have a Spring bean with 4 blocking queues. Each queue is assigned a method (named processQueueX()) which calls take() on that queue and processes the object taken from it.
I want to call each of those methods in a separate thread on app startup.
I tried a task scheduler with a fixed-delay setting, but that somehow blocks Tomcat and it stops responding to requests. Each method needs to be called only once, so scheduling was a bad idea, I guess.
An init method does not work either, since it runs in a single thread and each method has an endless loop to process its queue forever.
Is there a way to call these methods declaratively from the Spring config file, in a manner similar to the task namespace? Or programmatically?
Tnx
I think using a scheduler is not a bad idea. Use the Quartz scheduler with a simple trigger; that way Quartz will do the threading for you and Tomcat is not affected. Configure Quartz with just enough threads.
Would 23.4 The Spring TaskExecutor abstraction help?
Where the example has a MessagePrinterTask class, you would have something similar, but your run() method would access one of the queues. You would set up your Spring config to inject one of the queues into the task, so depending on how similar your queues are, you might be able to use the same Runnable task class.
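A small sketch along those lines (the queue element type, bean names, and the processing step are assumptions), starting one long-running Runnable per queue on a TaskExecutor at startup:

import java.util.List;
import java.util.concurrent.BlockingQueue;
import javax.annotation.PostConstruct;
import org.springframework.core.task.TaskExecutor;
import org.springframework.stereotype.Component;

@Component
public class QueueProcessorStarter {

    private final TaskExecutor taskExecutor;
    private final List<BlockingQueue<String>> queues; // the 4 queues, injected

    public QueueProcessorStarter(TaskExecutor taskExecutor, List<BlockingQueue<String>> queues) {
        this.taskExecutor = taskExecutor;
        this.queues = queues;
    }

    @PostConstruct
    public void startProcessing() {
        // one long-running task per queue, so Tomcat's request threads are never blocked
        for (BlockingQueue<String> queue : queues) {
            taskExecutor.execute(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    try {
                        String item = queue.take(); // blocks until something arrives
                        process(item);
                    }
                    catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
        }
    }

    private void process(String item) { /* assumed per-queue processing */ }
}

The TaskExecutor bean would need at least as many threads as there are queues, e.g. a ThreadPoolTaskExecutor with a core pool size of 4 or a SimpleAsyncTaskExecutor.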