Repeated receipt of uncommitted messages - java

Is it possible to set up a Kafka consumer so that the reader receives an uncommitted message again and again, instead of taking the next offset?
enable.auto.commit is set to false.
I am using Java and Kafka without Spring.

You can set max.poll.records=1 and simply cache and return the single consumed record "again and again", rather than polling in a loop.
Otherwise, continuing to poll from the same client will seek the consumer's position forward, even if you never commit.
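If you would rather have the broker redeliver the record on every attempt instead of caching it, you can also seek back to the record's offset whenever processing fails, so the next poll() returns the same record again. A rough sketch with the plain Java client; the broker address, topic, and group id are placeholders, and process() stands in for your own handling:

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class RetryingReader {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "retry-group");             // placeholder group
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "1");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("my-topic")); // placeholder topic
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    if (process(record)) {
                        consumer.commitSync(); // advance only on success
                    } else {
                        // rewind, so the next poll() returns the same record again
                        consumer.seek(new TopicPartition(record.topic(), record.partition()),
                                record.offset());
                    }
                }
            }
        }
    }

    static boolean process(ConsumerRecord<String, String> record) {
        return false; // your handling logic here; false means "retry this record"
    }
}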

Related

When can a Flink job consume from Kafka?

We have a Flink job which has the following topology:
source -> filter -> map -> sink
We set a live (ready) status in the sink operator's overridden open() method. After we get that status, we send events, but sometimes the job fails to consume events that were sent early.
We want to know the exact time/step after which we can send data without it being missed.
It looks like you want to ensure that no message is missed for processing. Kafka will retain your messages, so there is no need to send messages only when the Flink consumer is ready. You can simplify your design by avoiding the status message.
Any Kafka consumer (not just the Flink connector) has an offset associated with it on the Kafka server that tracks the ID of the last message it consumed.
From kafka docs:
Kafka maintains a numerical offset for each record in a partition. This
offset acts as a unique identifier of a record within that partition,
and also denotes the position of the consumer in the partition. For
example, a consumer which is at position 5 has consumed records with
offsets 0 through 4 and will next receive the record with offset 5
In your Flink Kafka Connector, specify the offset as the committed offset.
OffsetsInitializer.committedOffsets(OffsetResetStrategy.EARLIEST)
This will ensure that if your Flink connector is restarted, it will consume from the position where it left off before the restart.
If for some reason the offset is lost, this will read from the beginning (the earliest message) of your Kafka topic. Note that this approach will cause you to reprocess messages.
There are many more offset strategies you can explore to choose the right one for you.
Refer - https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/kafka/#starting-offset
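For illustration, here is what that looks like with the KafkaSource builder from the docs linked above; the broker address, topic, and group id are placeholders:

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.kafka.clients.consumer.OffsetResetStrategy;

public class CommittedOffsetJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092") // placeholder broker
                .setTopics("input-topic")              // placeholder topic
                .setGroupId("my-flink-job")            // placeholder group id
                // resume from the committed offset; fall back to earliest if none exists yet
                .setStartingOffsets(OffsetsInitializer.committedOffsets(OffsetResetStrategy.EARLIEST))
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source").print();
        env.execute("committed-offset-job");
    }
}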

Spring Kafka: How does kafkaTemplate executeInTransaction method play with Consumer's read_committed isolation level

I am seeing that the isolation.level=read_committed consumer property ensures that only committed messages will be read by the consumer. I am trying to understand what exactly is meant by a committed message in this context. When can we say a producer's message is committed to a topic?
My Specific scenario
I am using Spring Kafka's kafkaTemplate.executeInTransaction method to send messages to Kafka asynchronously.
I see that executeInTransaction internally calls producer.commitTransaction(), which in turn throws an exception if it cannot complete within max.block.ms.
My Confusion is
if producer.commitTransaction() completes within max.block.ms, does that mean the message has been stored in the topic successfully, ready for a consumer with isolation.level=read_committed to consume?
I ask this because I see there is another property delivery.timeout.ms which corresponds to processes that start after send()/max.block.ms is complete.
So, does this mean that even after producer.commitTransaction() returns, we still need to wait up to delivery.timeout.ms to be certain that the message has been written to the topic?
No; when the commit is successful, the record is secure in the log.
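To make that concrete, a small sketch of both sides; it assumes a KafkaTemplate built on a transactional producer factory, and the topic name is a placeholder:

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.springframework.kafka.core.KafkaTemplate;

public class TransactionalSend {
    private final KafkaTemplate<String, String> kafkaTemplate;

    public TransactionalSend(KafkaTemplate<String, String> kafkaTemplate) {
        // assumes the template's producer factory has a transaction id prefix configured
        this.kafkaTemplate = kafkaTemplate;
    }

    public void send() {
        // executeInTransaction = beginTransaction + send + commitTransaction;
        // when it returns normally, the commit marker is already in the log
        kafkaTemplate.executeInTransaction(t -> t.send("orders", "payload")); // placeholder topic
    }

    // consumer side: read_committed hides records of open or aborted transactions
    public static Properties readCommittedProps() {
        Properties props = new Properties();
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
        return props;
    }
}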

How can I consume a message in a topic many times

I have a producer which I call to post a record to Kafka. Then I call a consumer, which returns the record, but when I call the consumer again, it doesn't return any records. (I need to get the record which I posted to Kafka again.) How can I do this? (Any code would be appreciated.)
Kafka doesn't delete a message after it has been consumed, but it does keep a read offset for each consumer. So after you read a message, the offset moves forward. The second read returns nothing because the offset now points past your only message and there is nothing after it. You should try resetting the offset before you read again. See this post:
Reset consumer offset to the beginning from Kafka Streams
If you don't want to reset the offset locally or globally, you can create another consumer group. Since every consumer group has its own offset, a second read by the new consumer can achieve what you want. See this link:
kafka-tutorial-kafka-consumer
Hope this helps.
You can manually reset the offset to the desired offset, or if you need to consume from the start (whatever is still available in Kafka), you can set the consumer property auto.offset.reset=earliest.
You can also provide a new group.id value in the consumer properties each time; just generate a random string value. The property auto.offset.reset must still be set to earliest.
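Putting those two suggestions together, a rereadable consumer could look roughly like this (broker address and topic are placeholders):

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import java.util.UUID;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ReplayConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        // fresh group id every run: no stored offset exists for it...
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "replay-" + UUID.randomUUID());
        // ...so auto.offset.reset=earliest makes the consumer start from the beginning
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("my-topic")); // placeholder topic
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(5))) {
                System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
            }
        }
    }
}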

Reading messages offset in Apache Kafka

I am very new to Kafka, and we are using Kafka 0.8.1.
What I need to do is consume a message from a topic. For that, I will have to write a consumer in Java which consumes a message from the topic and then saves it to a database. After a message is saved, an acknowledgement is sent to the Java consumer. If the acknowledgement is true, the next message should be consumed from the topic. If the acknowledgement is false (which means that, due to some error, the message read from the topic couldn't be saved to the database), that message should be read again.
I think I need to use Simple Consumer,to have control over message offset and have gone through the Simple Consumer example as given in this link https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example.
In this example, the offset is evaluated in the run method as 'readOffset'. Do I need to play with that? For example, I could use LatestTime() instead of EarliestTime(), and in case of a false acknowledgement, reset the offset to the previous one using offset - 1.
Is this how I should proceed?
I think you can get along with the high-level consumer (http://kafka.apache.org/documentation.html#highlevelconsumerapi), which should be easier to use than the SimpleConsumer. I don't think the consumer needs to reread messages from Kafka on database failure, as the consumer already has those messages and can resend them to the DB or do anything else it sees fit.
High-level consumers store the last offset read from a specific partition in ZooKeeper (based on the consumer group name), so that when a consumer process dies and is later restarted (potentially on another host), it can continue processing messages where it left off. It's possible to autosave this offset to ZooKeeper periodically (see the consumer properties auto.commit.enable and auto.commit.interval.ms), or to have it saved by application logic by calling ConsumerConnector.commitOffsets. See also https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example.
I suggest you turn auto-commit off and commit your offsets yourself once you receive the DB acknowledgement. That way you can make sure unprocessed messages are reread from Kafka in case of consumer failure, and all messages committed to Kafka will eventually reach the DB at least once (but not 'exactly once').
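As a rough sketch of that flow with the old 0.8 high-level consumer API (the ZooKeeper address, group, and topic are placeholders, and saveToDatabase() stands in for your DB call):

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class DbAckConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // placeholder ZooKeeper address
        props.put("group.id", "db-writer");               // placeholder group
        props.put("auto.commit.enable", "false");         // commit only after the DB ack

        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put("my-topic", 1);                 // placeholder topic, one stream
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(topicCountMap);

        ConsumerIterator<byte[], byte[]> it = streams.get("my-topic").get(0).iterator();
        while (it.hasNext()) {
            MessageAndMetadata<byte[], byte[]> msg = it.next();
            if (saveToDatabase(msg.message())) {
                connector.commitOffsets(); // checkpoint only once the row is safely stored
            }
            // on failure: don't commit; retry the save, or restart to reread from the last commit
        }
    }

    static boolean saveToDatabase(byte[] payload) {
        return true; // your DB logic here
    }
}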

RabbitMQ: messages remain "Unacknowledged"

My Java application sends messages to a RabbitMQ exchange, and the exchange then routes the messages to the bound queue.
I use the Spring Framework AMQP Java library with RabbitMQ.
The problem: a message arrives in the queue, but it stays in the "Unacknowledged" state and never becomes "Ready".
What could be the reason?
An Unacknowledged message implies that it has been read by your consumer, but the consumer has never sent back an ACK to the RabbitMQ broker to say that it has finished processing it.
I'm not overly familiar with the Spring Framework plugin, but somewhere (for your consumer) you will be declaring your queue; it might look something like this (taken from http://www.rabbitmq.com/tutorials/tutorial-two-java.html):
channel.queueDeclare(queueName, false, false, false, null); // durable, exclusive, autoDelete, arguments
then you will setup your consumer
boolean autoAck = false;
QueueingConsumer consumer = new QueueingConsumer(channel);
channel.basicConsume(queueName, autoAck, consumer);
autoAck above is a boolean; by setting it to false, we're explicitly telling RabbitMQ that the consumer will acknowledge each message it is given. If this flag were set to true, you wouldn't see the Unacknowledged count in RabbitMQ; instead, the broker would consider a message acknowledged, and remove it from the queue, as soon as it has been delivered to the consumer.
To acknowledge a message you would do something like this:
QueueingConsumer.Delivery delivery = consumer.nextDelivery();
//...do something with the message...
channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false); //the false flag is to do with multiple message acknowledgement
If you can post some of your consumer code, I might be able to help further. In the meantime, take a look at BlockingQueueConsumer: in its constructor you will see that you can set the AcknowledgeMode, and its nextMessage() method returns a Message object whose getDeliveryTag() method returns a Long, the ID that you would send back in the basicAck.
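Since you're on Spring AMQP, the same manual-ack pattern with a listener container might look roughly like this (host and queue name are placeholders):

import com.rabbitmq.client.Channel;
import org.springframework.amqp.core.AcknowledgeMode;
import org.springframework.amqp.core.Message;
import org.springframework.amqp.rabbit.connection.CachingConnectionFactory;
import org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer;
import org.springframework.amqp.rabbit.listener.api.ChannelAwareMessageListener;

public class ManualAckConsumer {
    public static void main(String[] args) {
        CachingConnectionFactory cf = new CachingConnectionFactory("localhost"); // placeholder host
        SimpleMessageListenerContainer container = new SimpleMessageListenerContainer(cf);
        container.setQueueNames("myQueue"); // placeholder queue
        container.setAcknowledgeMode(AcknowledgeMode.MANUAL);
        container.setMessageListener((ChannelAwareMessageListener) (Message message, Channel channel) -> {
            // ...process the message...
            // without this ack the message stays Unacknowledged until the channel closes
            channel.basicAck(message.getMessageProperties().getDeliveryTag(), false);
        });
        container.start(); // keeps consuming until the container is stopped
    }
}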
Just to add my two cents on another possible reason for messages staying in an unacknowledged state, even when the consumer makes sure to call the basicAck method:
Sometimes multiple instances of a process with an open RabbitMQ connection stay running, one of which may cause a message to get stuck in an unacknowledged state, preventing any other instance of the consumer from ever refetching that message.
You can access the RabbitMQ management console (on a local machine this should be available at localhost:15672) and check whether multiple instances are holding the channel, or whether only a single instance is currently active.
Find the redundant running process (in this case, java) and terminate it. After removing the rogue process, you should see the message jump back to the Ready state.
