I have a Storm topology that processes messages from Kafka and makes HTTP calls / saves to Cassandra depending on the task at hand. I process the messages as soon as they come in. However, a few messages are not processed completely because of the responses from external sources, such as an HTTP server. I would like to implement an exponential backoff mechanism for retries, so that when the HTTP server does not respond or returns an error, the message is retried after some time. I could think of a few ways to achieve this. I would like to know which of them would be the better solution, and also whether there is any other fault-tolerant solution I could use. Since this implements an exponential backoff, each message will have a different delay time.
Send it to another Kafka topic which is consumed later (my preferred solution). I know we can use Kafka offsets to consume a message at a later stage; however, I could not find documentation or sample code to do this. It would be really helpful if anyone could help me out with this.
Write the message to Cassandra/Redis and have a scheduler fetch the messages that are not yet processed and are ready to be consumed, then send them to Kafka so that my Storm topology can consume them. (Existing solution in another legacy, non-Storm project.)
Send it to Beanstalk with a delay. (Also an existing solution in another legacy, non-Storm project; however, I would like to avoid this solution and use it only if I run out of options.)
While this is pretty much what I would like to do, I am not able to find documentation on implementing delayProcessingUntil as mentioned in Kafka - Delayed Queue implementation using high level consumer.
I have done scheduled jobs from a data store and delays using Beanstalk in the past, but I would prefer to use Kafka.
The Kafka spout has exponential backoff message retry built in. You can configure the initial delay, delay multiplier and maximum delay through the spout configuration. If there is an error in the bolt, you can call collector.fail(input). After that, just leave it to the spout to do the retry.
https://github.com/apache/storm/blob/v0.10.0/external/storm-kafka/src/jvm/storm/kafka/ExponentialBackoffMsgRetryManager.java
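As a rough sketch of how those settings look, assuming the storm-kafka module from the same 0.10.0 release; the topic, ZooKeeper root, spout id and delay values are placeholders:

```java
import backtype.storm.topology.TopologyBuilder;
import storm.kafka.BrokerHosts;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.ZkHosts;

public class RetryTopology {
    public static void main(String[] args) {
        // Placeholder connection details -- adjust for your cluster.
        BrokerHosts hosts = new ZkHosts("zookeeper:2181");
        SpoutConfig spoutConfig = new SpoutConfig(hosts, "my-topic", "/kafka-spout", "my-spout-id");

        // Settings consumed by ExponentialBackoffMsgRetryManager (see the linked source).
        spoutConfig.retryInitialDelayMs = 1000;        // first retry after 1 s
        spoutConfig.retryDelayMultiplier = 2.0;        // double the delay on each further failure
        spoutConfig.retryDelayMaxMs = 10 * 60 * 1000;  // never wait longer than 10 minutes

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig));
        // ... attach the HTTP/Cassandra bolts here; on an external error they call collector.fail(input).
    }
}
```

In the bolt, calling collector.fail(input) instead of acking hands the tuple back to the spout, which re-emits it after the delay computed by the retry manager.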
I think your use case describes the need for a database rather than a queue. You want to temporarily store records until their time and then remove them so they don't show up in future searches. Trying to do that in a queue would be awkward at best, as your analysis shows.
I suggest you create another column family in Cassandra to hold these delayed requests. You'd store the request itself along with a time to retry. Whether you'd want to also have a time series of failed HTTP attempts and related data is up to you. As a delayed request is finally fulfilled, you'd delete the corresponding row from the CF. The search for delayed requests is straightforward, too.
Of course, any database, even a file on the local drive or in HDFS could work, too.
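A minimal sketch of what that column family and the "what is due now" query could look like, using the DataStax Java driver (3.x assumed); the keyspace, table and column names are made up for illustration:

```java
import java.util.Date;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class DelayedRequestStore {
    public static void main(String[] args) {
        // Placeholder contact point; adjust for your cluster.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        session.execute("CREATE KEYSPACE IF NOT EXISTS retries WITH replication = "
                + "{'class': 'SimpleStrategy', 'replication_factor': 1}");
        // Requests are bucketed by the minute they become due, so the scheduler's
        // "what is ready now" query always hits a single partition.
        session.execute("CREATE TABLE IF NOT EXISTS retries.delayed_requests ("
                + " retry_bucket timestamp, retry_at timestamp, request_id uuid,"
                + " payload text, attempts int,"
                + " PRIMARY KEY (retry_bucket, retry_at, request_id))");

        // What the scheduler would run every minute before pushing rows back to Kafka.
        Date currentMinute = new Date((System.currentTimeMillis() / 60000L) * 60000L);
        ResultSet due = session.execute(
                "SELECT request_id, payload FROM retries.delayed_requests WHERE retry_bucket = ?",
                currentMinute);
        for (Row row : due) {
            // Send row.getString("payload") to Kafka here, then delete the row
            // so it never shows up in a later search.
            System.out.println(row.getUUID("request_id") + " -> " + row.getString("payload"));
        }

        cluster.close();
    }
}
```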
You might be interested in the Kafka Retry project https://github.com/IBM/kafka-retry. It provides a delayed retry queue using a single retry topic.
Related
I am using a managed RabbitMQ cluster through AWS Amazon MQ. If the consumers finish their work quickly, everything works fine. However, in a few scenarios some consumers take more than 30 minutes to complete their processing.
In those scenarios, RabbitMQ removes the consumer and makes the same messages visible again in the queue. Because of this, another consumer picks them up and starts processing, and this keeps happening in a loop. Therefore the same transaction gets executed again, and I am losing the consumer as well.
I am not setting any AcknowledgeMode, so I believe it is AUTO by default and it has a 30-minute limit.
Is there any way to increase the Delivery Acknowledgement Timeout for AUTO mode?
Or please let me know if anyone has any other solutions for this.
Reply From AWS Support:
The consumer timeout is now configurable, but the change can be made only by the service team. The change will be permanent irrespective of the version.
So you may update RabbitMQ to the latest version; there is no need to stick with 3.8.11. Provide your broker details and desired timeout, and they should be able to do it for you.
This is the response from AWS support.
From my understanding, I see that your workload is currently affected by the consumer_timeout parameter that was introduced in v3.8.15.
We have had a number of reach-outs due to this. Unfortunately, the service team has confirmed that while they can manually edit rabbitmq.conf, the change will be overwritten on the next reboot or failover and thus is not a recommended solution. It would also mean that all security patching on the brokers where a manual change is applied would have to be paused. Currently, the service does not support custom user configurations for RabbitMQ from this configuration file, but they have confirmed they are looking to address this in the future; however, they are not able to give an ETA on when this will be available.
From the RabbitMQ github, it seems this was added for quorum queues in v3.8.15 (https://github.com/rabbitmq/rabbitmq-server/releases/tag/v3.8.15 ), but seems to apply to all consumers (https://github.com/rabbitmq/rabbitmq-server/pull/2990 ).
Unfortunately, RabbitMQ itself does not support downgrades (https://www.rabbitmq.com/upgrade.html )
Thus the recommended workaround and safest action from the service team, as of now, is to create a new broker on an older version (3.8.11) and set auto minor version upgrade to false, so that it won't be upgraded.
Then export the configuration from the existing RabbitMQ instance, import it into the new instance, and use that instance going forward.
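For reference only: on a self-managed broker (not on Amazon MQ, where the file is managed by the service as described above), the timeout introduced in 3.8.15 is raised in rabbitmq.conf. The value is in milliseconds, and 7200000 (2 hours) is just an example:

```
# rabbitmq.conf -- example for a self-managed broker, not applicable to Amazon MQ
consumer_timeout = 7200000
```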
Application1: the source system, sending 10 requests per second to Apache Camel (ActiveMQ).
Application2: Apache Camel, which receives the requests from Application1 and sends them to the downstream system Application3 (10 requests/sec).
Application3: the downstream system (POST API), which gets 10 requests per second from Apache Camel and processes them.
Problem statement: Application3 has DB updates and processing tasks to handle; because the 10 requests hit Application3 at the same time, duplicates are being generated during processing and the DB updates.
Please suggest a way to add a delay of 1 second between requests, either in Apache Camel or in the downstream system.
Thanks in advance.
There is a Delay EIP in Camel you can use to delay every message (probably in Application 2) for a fixed or even calculated amount of time. Make sure that you only have 1 Consumer in Application 2.
There is also a Throttle EIP in Camel, but it keeps all messages that are held back in memory. In your case it is better to slow down the consumption in Application 2 to avoid overloading Application 3.
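A minimal sketch of the Delay EIP in a Java DSL route, assuming Application 2 reads from an ActiveMQ queue and forwards over HTTP; the endpoint URIs and queue name are placeholders:

```java
import org.apache.camel.builder.RouteBuilder;

public class DelayedForwardRoute extends RouteBuilder {
    @Override
    public void configure() {
        // Single consumer, so messages leave roughly one per second at most.
        from("activemq:queue:app1.requests?concurrentConsumers=1")
            .delay(1000)                      // hold each message for 1 s before forwarding
            .to("http://application3/api");   // placeholder endpoint for Application 3
    }
}
```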
I guess that Application 3 gets duplicates because it can't keep up with the requests, so Application 2 gets timeouts and its error handling sends the requests again.
However, when you use messaging, you have to prepare for duplicates. In error cases you can easily receive duplicates and therefore you have to make Application 3 idempotent.
Of course, Camel can also help you with this thanks to the Idempotent Consumer.
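A hedged sketch of the Idempotent Consumer EIP, assuming Camel 2.x package names and that each message carries a unique id in a header called requestId (both are assumptions, not part of the original setup):

```java
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.processor.idempotent.MemoryIdempotentRepository;

public class IdempotentForwardRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("activemq:queue:app1.requests")
            // Drop any message whose requestId header has already been seen
            // (cache of the last 10,000 ids, kept in memory).
            .idempotentConsumer(header("requestId"),
                    MemoryIdempotentRepository.memoryIdempotentRepository(10000))
            .to("http://application3/api");
    }
}
```

For a setup that must survive restarts, the in-memory repository can be swapped for a persistent IdempotentRepository implementation.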
I "monitor" the number of consecutive failures in my Camel processing pipeline with a Camel RoutePolicy.
When a threshold of failures is reached, I want to pause the processing for a configured amount of time because it probably means that the data from another system is not yet ready and therefore every message fails.
Since the source of my pipeline is a Kafka topic, I should not just stop the whole route because the broker would assume my consumer died and rebalance.
The best way to "pause" topic consumption seems to be to pause the [KafkaConsumer][3] (the native, not the one of Camel). Like this, the consumer continues to poll the broker, but it does not fetch any messages. Exactly what I need.
But can I access the native [KafkaConsumer][3] from the RoutePolicy context to call the pause and resume methods?
The spring-kafka listener containers expose these methods; it would be nice to be able to use them from Camel too.
This is not yet supported; the two methods would first have to be added to the camel-kafka consumer.
There is also an existing issue for it: https://issues.apache.org/jira/browse/CAMEL-15106
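For comparison, this is roughly how the spring-kafka containers mentioned in the question expose pause/resume; a sketch assuming spring-kafka 2.1.3 or later and a listener declared with the id "myListener":

```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.kafka.config.KafkaListenerEndpointRegistry;
import org.springframework.kafka.listener.MessageListenerContainer;
import org.springframework.stereotype.Component;

@Component
public class ListenerPauser {

    @Autowired
    private KafkaListenerEndpointRegistry registry;

    // Pausing keeps the container polling (so no rebalance is triggered)
    // but stops it from fetching further records.
    public void pause() {
        MessageListenerContainer container = registry.getListenerContainer("myListener");
        container.pause();
    }

    public void resume() {
        registry.getListenerContainer("myListener").resume();
    }
}
```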
Does ActiveMQ support Idempotent producer? I know Camel has an idempotent consumer pattern to detect and handle duplicate messages, but I'm wondering if this can be prevented at the source (producer).
Here is a little background. I have applications that are horizontally scaled and access the same database. There is one particular table that maintains the status of a particular process. These horizontally scaled applications should be able to read the status and invoke another process; however, only one of them should invoke it. Each application periodically polls the database and posts a message to a messaging broker once the required condition is met, but only one of the load-balanced applications should post the message.
One crude approach I'm thinking of is...
On Machine 1:
1. Read the database to check whether the necessary condition is met.
2. Before posting the message to the broker, write a record to another status table with a unique key that identifies the process, and commit. If this operation fails due to a unique key constraint violation, it means the process on another machine succeeded in posting the message.
3. Post the message to the broker.
4. If posting the message fails for some reason, delete the row from the status table based on the unique/primary key.
The same operations are performed by the same application running on machines 2, 3, 4, etc.
Below is one pitfall I quickly noticed with this approach.
Assume Machine 1 is able to complete step 2 but fails at step 3 and continues with step 4. Meanwhile Machine 2, having failed at step 2, moves on without attempting to read the status again and post the message.
To address this, I need to add a retry on step 3 until the message is successfully posted to the broker.
Another option is to use the https://camel.apache.org/components/latest/eips/idempotentConsumer-eip.html pattern, but this is essentially a filter on the consumer side. Though it would serve my purpose, is there a similar approach available out of the box on the message publishing side?
I wonder whether this approach is even correct, whether there is a better alternative, or whether there are existing libraries that can be used to implement a locking mechanism across JVMs, either local or remote.
It's not clear what version of ActiveMQ you're using (i.e. ActiveMQ 5.x or ActiveMQ Artemis) so I'll try to address this issue for both.
ActiveMQ 5.x doesn't have any built-in support for detecting duplicates sent from clients. However, you could potentially implement this feature using a broker plugin. The only challenge I see here is configuring, managing, and monitoring the cache of duplicate IDs.
ActiveMQ Artemis does have built-in support for detecting duplicates sent from clients. You can read more about duplicate detection in the documentation. Since the broker supports this behavior natively, it provides clean configuration, management, and monitoring.
In either case you'll need to set a special header on each message with "a unique key that identifies the process" just like you would for your potential database solution. Furthermore, using the broker as the duplicate detector is much simpler overall.
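A rough JMS sketch of setting such a header for Artemis duplicate detection; the _AMQ_DUPL_ID property name comes from the Artemis duplicate-detection documentation, while the broker URL, queue name and key value are placeholders:

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory;

public class DuplicateSafeProducer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Queue queue = session.createQueue("process.status");
            MessageProducer producer = session.createProducer(queue);

            TextMessage message = session.createTextMessage("process completed");
            // The unique key identifying the process. The broker silently drops any
            // later message carrying the same _AMQ_DUPL_ID value while that value is
            // still in the duplicate-id cache.
            message.setStringProperty("_AMQ_DUPL_ID", "process-42");
            producer.send(message);
        } finally {
            connection.close();
        }
    }
}
```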
If you're currently using ActiveMQ 5.x but want to move to ActiveMQ Artemis in order to use the duplicate detection feature you don't necessarily need to update your clients as ActiveMQ Artemis fully supports the OpenWire protocol used by 5.x clients. You should just be able to point them to the new instance of ActiveMQ Artemis and have everything work.
In our system we're using an older version of Kafka (0.9.0.1) and the old Scala consumer API in a Tomcat application.
Everything works fine most of the time, but sometimes, when the servers where the consumers run are heavily utilised by other tasks in the app, the consumers become unresponsive. As expected, this triggers a rebalance: the consumer is removed from its partitions and other consumers take over.
My question is whether there is an easy way for the consumer to re-register itself when it comes back up.
I know that the old consumers store the partition consumer details in ZooKeeper, and I was thinking we could have a task that periodically checks whether our consumer is registered there and restarts it if not, but I'm not sure what exactly we should check. Can anyone point me to some documentation about the data stored in ZooKeeper by Kafka (I haven't found anything in the official documentation, sadly)?
Basically, what you want is fixed assignments, so that consumer groups never rebalance. If there were a way to disable automatic consumer rebalancing in the old Scala client, or maybe even to increase the rebalance timeout to a much higher value, that could also work, but I couldn't find how to do that with the old Scala consumer.
However, it is possible to assign fixed topics/partitions when using the newer Java consumer, which is also available in that same 0.9 Kafka version. Look for "Subscribing To Specific Partitions" in the Javadocs:
https://kafka.apache.org/090/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
Subscribing To Specific Partitions
In the previous examples we subscribed to the topics we were interested in and let Kafka give our particular process a fair share of the partitions for those topics. This provides a simple load balancing mechanism so multiple instances of our program can divide up the work of processing records. In this mode the consumer will just get the partitions it subscribes to, and if the consumer instance fails no attempt will be made to rebalance partitions to other instances.
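A minimal sketch of such a fixed assignment with the 0.9 Java consumer; the bootstrap servers, topic name and partition numbers are placeholders:

```java
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class FixedAssignmentConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "my-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

        // Fixed assignment: this instance always owns partitions 0 and 1 of "my-topic".
        // No group rebalancing will ever move these partitions to another instance.
        consumer.assign(Arrays.asList(
                new TopicPartition("my-topic", 0),
                new TopicPartition("my-topic", 1)));

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("partition=%d offset=%d value=%s%n",
                        record.partition(), record.offset(), record.value());
            }
        }
    }
}
```

Each Tomcat instance would be given its own fixed set of partitions, so a temporarily unresponsive instance simply resumes where it left off instead of being rebalanced away.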