I have a Spring Boot application (v2.2.10.RELEASE) that subscribes to multiple Pub/Sub topics, asynchronously pulls messages, and sends them elsewhere. I am not using Spring Cloud GCP, just the native Google client libraries.
This is my subscriber setup:
// Instantiate an asynchronous message receiver.
MessageReceiver receiver =
    (PubsubMessage message, AckReplyConsumer consumer) -> {
      messages.add(message);
      consumer.ack();
    };

Subscriber subscriber =
    Subscriber.newBuilder(subscriptionName, receiver)
        .setParallelPullCount(2)
        .setFlowControlSettings(flowControlSettings)
        .setCredentialsProvider(credentialsProvider)
        .setExecutorProvider(executorProvider)
        //.setChannelProvider()
        .build();
With high traffic and large messages (2-4 KB) I see this INFO message:
[grpc-default-worker-ELG-1-1] INFO i.grpc.internal.AbstractClientStream - Received data on closed stream
First of all, I don't fully understand what that means. All I noticed is that when this happens, the number of duplicate messages delivered increases. So I assumed it means that Pub/Sub tried to reach the subscriber with some messages, but the subscriber for some reason was not ready, so Pub/Sub delivers the messages again, hence more duplicates. Is that right?
Would this problem be solved by setting a TransportChannelProvider on the subscriber? My understanding of the (poorly written) documentation is that this would create a new channel for delivery when the current in-use channel is closed, and hence get rid of the log message above.
If yes, how do I define the channel target string? And where can I find a NameResolver-compliant URI for the ManagedChannel? The snippet I mean is this:
private TransportChannelProvider getChannelProvider() {
    ManagedChannel channel = ManagedChannelBuilder.forTarget(target).usePlaintext().build();
    return FixedTransportChannelProvider.create(GrpcTransportChannel.create(channel));
}
I am pretty new to GCP, so sorry if my question is not coherent enough.
Using a custom TransportChannelProvider won't solve this type of issue. This is more likely an issue deeper down in the stack, e.g., at the gRPC level. There have been some open issues for this type of error [1, 2].
With regard to why it is causing duplicates, it is possible that the messages are getting delivered via a stream that is already closed (which aligns with the error message) because they were trapped in a lower-level buffer at the gRPC layer and therefore ended up being duplicates of messages that were subsequently delivered and processed via another stream. This could be a version of the issue discussed in the documentation around large backlogs of small messages. There was a fix for this issue in v1.109.0 of the Java client library, so if you are using a version older than that, it is worth updating.
If duplicates continue to be an issue, it would be best to reach out to support with the name of your subscription and the message IDs of some of the duplicate messages so that they can look at the delivery patterns for those messages and further diagnose if these redeliveries are unexpected.
We have a streams application that consumes messages from a source topic, does some processing, and forwards the results to a destination topic.
The structure of the messages is controlled by some Avro schemas.
When the application starts consuming messages, if the schema is not cached yet, it will try to retrieve it from the Schema Registry. If for whatever reason the Schema Registry is not available (say, a network glitch), then the message currently being processed is lost, because the default handler is something called LogAndContinueExceptionHandler.
o.a.k.s.e.LogAndContinueExceptionHandler : Exception caught during Deserialization, taskId: 1_5, topic: my.topic.v1, partition: 5, offset: 142768
org.apache.kafka.common.errors.SerializationException: Error retrieving Avro schema for id 62
Caused by: java.net.SocketTimeoutException: connect timed out
at java.base/java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:na]
...
o.a.k.s.p.internals.RecordDeserializer : stream-thread [my-app-StreamThread-3] task [1_5] Skipping record due to deserialization error. topic=[my.topic.v1] partition=[5] offset=[142768]
...
So my question is: what would be the proper way of dealing with situations like the one described above, making sure you don't lose messages no matter what? Is there an out-of-the-box LogAndRollbackExceptionHandler, or a way of implementing your own?
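In case it helps frame the question: I saw that Kafka Streams exposes a default.deserialization.exception.handler config and ships a LogAndFailExceptionHandler, so I imagine a custom handler would look roughly like this (untested sketch; the class name is mine):

import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.streams.errors.DeserializationExceptionHandler;
import org.apache.kafka.streams.processor.ProcessorContext;

public class LogAndHaltExceptionHandler implements DeserializationExceptionHandler {

    @Override
    public DeserializationHandlerResponse handle(ProcessorContext context,
                                                 ConsumerRecord<byte[], byte[]> record,
                                                 Exception exception) {
        // Returning FAIL stops the stream thread, so the offset is not committed
        // and the record is re-read after a restart, instead of being skipped.
        System.err.printf("Deserialization failed for %s-%d at offset %d: %s%n",
                record.topic(), record.partition(), record.offset(), exception);
        return DeserializationHandlerResponse.FAIL;
    }

    @Override
    public void configure(Map<String, ?> configs) {
        // no configuration needed
    }
}

which, if I understand correctly, would be registered with something like props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG, LogAndHaltExceptionHandler.class).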
Thank you in advance for your inputs.
I've not worked a lot with Kafka, but when I did, I remember having issues such as the one you are describing in our system.
Let me tell you how we took care of our scenarios; maybe it will help you out too:
Scenario 1: If your messages are being lost on the publishing side (publisher --> Kafka), you can configure Kafka's acknowledgement setting according to your needs. If you use Spring Cloud Stream with Kafka, the property is spring.cloud.stream.kafka.binder.required-acks.
Possible values:
At most once (acks=0)
The publisher does not care whether Kafka acknowledges or not.
Send and forget.
Data loss is possible.
At least once (acks=1)
If Kafka does not acknowledge, the publisher resends the message.
Duplication is possible.
The acknowledgement is sent before the message is copied to the replicas.
Exactly once (acks=all, together with an idempotent producer)
If Kafka does not acknowledge, the publisher resends the message.
However, if a message gets sent more than once to Kafka, there is no duplication.
An internal sequence number is used to decide whether the message has already been written to the topic.
The min.insync.replicas property needs to be set to control the minimum number of replicas that must be in sync before Kafka acknowledges to the producer.
A minimal producer sketch with these settings is shown below.
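This sketch uses the plain Kafka producer API; the broker address and topic name are placeholders, and with Spring Cloud Stream you would set the equivalent binder properties instead:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SafePublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.ACKS_CONFIG, "all");              // wait for all in-sync replicas
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true); // sequence-number de-duplication
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my.topic.v1", "key", "value")); // placeholder topic
        }
    }
}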
Scenario 2: If your data is being lost on the consumer side (Kafka --> consumer), you can change Kafka's auto-commit behaviour according to your usage. This is the property if you are using Spring Cloud Stream: spring.cloud.stream.kafka.bindings.input.consumer.autoCommitOffset.
By default autoCommitOffset is true, and every message that is delivered to the consumer is "committed" on Kafka's end, meaning it won't be sent again. However, if you set autoCommitOffset to false, you have the power to poll messages from Kafka in your code and, once you are done with your work, explicitly commit the offset to let Kafka know you are done with the message.
If a message is not committed, Kafka will keep redelivering it until it is.
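Here is a minimal sketch of manual offset commits with the plain consumer API; the broker address, group id, and topic are placeholders, and with Spring Cloud Stream you would use the acknowledgment header instead:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ManualCommitConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false); // take over commit control
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my.topic.v1")); // placeholder topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    process(record); // your business logic
                }
                consumer.commitSync(); // commit only after successful processing
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) { /* ... */ }
}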
Hope this helps you out, or at least points you in the right direction.
I am using:
spring-boot 2.2.10
spring-cloud-gcp-pubsub 1.2.5
google-cloud-pubsub 1.108.0
google-cloud-core 1.93.7
gax 1.57.1
grpc-core 1.30.2
I am consuming messages of different sizes from a GCP subscription. When a "big" message is sent to my client library:
1) It never reaches my listener code (I put a dumb logger there).
2) I can see "Received data on closed stream".
3) The message is never acked, never dequeued, never DLQ-ed.
4) The message is sent to my service over and over (the sent count metric keeps growing).
I know the gRPC max message size problem was solved some time ago, and gRPC keepAlive as well, so I am out of leads to investigate.
I've got the following problem. I'm using the RabbitTemplate class from spring-rabbit-2.0.5.RELEASE and send messages to different exchanges with it. By default everything works fine. But when one of the exchanges is deleted and there are a lot of messages to process, there is a problem with sending messages to the existing exchange: no error is thrown, the messages are just silently dropped.
The code can be simplified to this. In the given scenario, after deleting exchange EX2, only part of the messages will be sent to EX1. A simple fix would be to add a Thread.sleep(50) after each send, but that is obviously unacceptable.
RabbitTemplate rabbitTemplate = new RabbitTemplate();
for (int i = 0; i < 1000; i++) {
    rabbitTemplate.send("EX1", "RK1", someMessage);
    rabbitTemplate.send("EX2", "RK2", someMessage);
}
After doing some investigation I came to the following conclusions:
1) I'm reusing an existing channel, which is obvious.
2) After sending a message to a non-existent exchange, the channel is closed. Unfortunately it seems to be closed by RabbitMQ itself, and the shutdown notification is sent asynchronously to the client.
3) After receiving the notification about the closed channel, the client recreates it, but messages sent in the meantime are lost.
One possible solution would be to use a different channel for each exchange (it would work in my case, as I'm sending messages to only a couple of exchanges, fewer than 10).
But in general, this seems to be just the expected behaviour of RabbitTemplate when you are not using publisher confirms.
I think you need to study what Publisher Confirms and Returns are: https://docs.spring.io/spring-amqp/docs/2.1.3.RELEASE/reference/html/_reference.html#cf-pub-conf-ret
Also follow the link there about Scoped Operations.
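To give an idea, here is a minimal sketch of enabling confirms and returns with the spring-rabbit 2.0.x API (the host name is a placeholder; in your case the nack callback would fire when EX2 no longer exists):

import org.springframework.amqp.rabbit.connection.CachingConnectionFactory;
import org.springframework.amqp.rabbit.core.RabbitTemplate;

public class ConfirmingTemplateConfig {
    public static RabbitTemplate confirmingTemplate() {
        CachingConnectionFactory cf = new CachingConnectionFactory("localhost"); // placeholder host
        cf.setPublisherConfirms(true); // broker acks/nacks every publish
        cf.setPublisherReturns(true);  // unroutable messages come back instead of being dropped

        RabbitTemplate template = new RabbitTemplate(cf);
        template.setMandatory(true);   // required for returns
        template.setConfirmCallback((correlation, ack, cause) -> {
            if (!ack) {
                // e.g. the exchange was deleted; log or resend here
                System.err.println("Publish not confirmed: " + cause);
            }
        });
        template.setReturnCallback((message, replyCode, replyText, exchange, routingKey) ->
                System.err.println("Message returned from " + exchange + ": " + replyText));
        return template;
    }
}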
I'm trying to build a custom MQ exit to archive messages that hit a queue. I have the following code.
class MyMqExits implements WMQSendExit, WMQReceiveExit {

    @Override
    public ByteBuffer channelReceiveExit(MQCXP arg0, MQCD arg1, ByteBuffer arg2) {
        // Dump the raw contents of the received transmission buffer
        if (arg2) {
            def _bytes = arg2.array()
            def results = new String(_bytes)
            println results
        }
        return arg2
    }
...
The content of the message (header/body) is in the byte buffer, along with some unreadable binary information. How can I parse the message (including the body and the queue name) from arg2? We've gone through IBM's documentation, but haven't found an object or anything that makes this easy.
Assuming the following two points:
1) Your sender application has not hard-coded the queue name where it puts messages, so you can change the application configuration to send messages to a different object.
2) The MessageId of the archived message is not important, only the message body.
Then one alternative I can think of is to create an alias queue that resolves to a topic, and use two subscribers to receive messages.
1) Subscriber 1: an administratively defined durable subscription with a queue to receive messages. Use the same queue name from which your existing consumer application receives messages.
2) Subscriber 2: another administratively defined durable subscription with its own queue. You can write a simple Java application to get messages from this queue and archive them.
3) Both subscribers subscribe to the same topic.
Here are steps:
// Create a topic
define topic(ANY.TOPIC) TOPICSTR('/ANY_TOPIC')
// Create an alias queue that points to above created topic
define qalias(QA.APP) target(ANY.TOPIC) targtype(TOPIC)
// Create a queue for your application that does business logic. If one is available already then no need to create.
define ql(Q.BUSLOGIC)
// Create a durable subscription with destination queue as created in previous step.
define sub(SB.BUSLOGIC) topicstr('/ANY_TOPIC') dest(Q.BUSLOGIC)
// Create a queue for application that archives messages.
define ql(Q.ARCHIVE)
// Create another subscription with destination queue as created in previous step.
define sub(SB.ARCHIVE) topicstr('/ANY_TOPIC') dest(Q.ARCHIVE)
Write a simple MQ Java/JMS application to get messages from Q.ARCHIVE and archive them.
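A bare-bones sketch of such an application using the IBM MQ classes for JMS (the connection details are placeholders for your environment, and the archive() method is left to you):

import javax.jms.Connection;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Queue;
import javax.jms.Session;
import com.ibm.mq.jms.MQConnectionFactory;
import com.ibm.msg.client.wmq.WMQConstants;

public class ArchiveConsumer {
    public static void main(String[] args) throws Exception {
        MQConnectionFactory cf = new MQConnectionFactory();
        cf.setHostName("localhost");      // placeholder
        cf.setPort(1414);                 // placeholder listener port
        cf.setChannel("DEV.APP.SVRCONN"); // placeholder channel
        cf.setQueueManager("QM1");        // placeholder queue manager
        cf.setTransportType(WMQConstants.WMQ_CM_CLIENT);

        Connection conn = cf.createConnection();
        try {
            conn.start();
            Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Queue queue = session.createQueue("Q.ARCHIVE");
            MessageConsumer consumer = session.createConsumer(queue);
            Message msg;
            // Drain the queue, archiving each message
            while ((msg = consumer.receive(5000)) != null) {
                archive(msg); // write to your archive store
            }
        } finally {
            conn.close();
        }
    }

    private static void archive(Message msg) { /* ... */ }
}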
A receive exit is not going to give you the whole message. Send and receive exits operate on the transmission buffers sent/received by channels. These will contain various protocol flows which are not documented, because the protocol is not public, and part of those flows will be chunks of the messages broken down to fit into 32KB transmission buffers.
You don't give enough information in your question for me to know what type of channel you are using, but I'm guessing it's on the client side since you are writing it in Java and that is the only environment where that is applicable.
Writing the exit at the client side, you'll need to be careful you deal with the cases where the message is not successfully put to the target queue, and you'll need to manage syncpoints etc.
If you were using QMgr-QMgr channels, you should use a message exit to capture the MQXR_MSG invocations where the whole message is given to you. If you put any further messages in a channel message exit, the messages you put are included in the channel's Syncpoint and so committed if the original messages were committed.
Since you are using client-QMgr channels, you could look at an API Exit on the QMgr end (currently client side API Exits are only supported for C clients) and catch all the MQPUT calls. This exit would also give you the MQPUT return codes so you could code your exit to look out for, and deal with failed puts.
Of course, writing an exit is a complicated task, so it may be worth finding out if there are any pre-written tools that could do this for you instead of starting from scratch.
I fully agree with Morag and Shashi: this is the wrong approach. There is an open source project called Message Multiplexer (MMX) that will get a message from a queue and output it to one or more queues. Context information is maintained across the message puts. For more info on MMX go to: http://www.capitalware.com/mmx_overview.html
If you cannot change the source or target queues to insert MMX into the mix, then an API exit may do the trick. Here is a blog posting about message replication via an API exit: http://www.capitalware.com/rl_blog/?p=3304
This is quite an old question, but it's worth replying with an update that's relevant to MQ 9.2.3 or later. There is a new feature called Streaming Queues (see https://www.ibm.com/docs/en/ibm-mq/9.2?topic=scenarios-streaming-queues), and one of the use cases it is designed to support is putting a copy of every message sent to a given queue onto an alternative queue. Another application can then consume the duplicate messages and archive them, separately from the application that processes the original messages.
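For example, reusing the queue names from the earlier answer, the duplication can be configured entirely in MQSC; BESTEF means the copy is best-effort, while MUSTDUP would make the put fail if the copy cannot be delivered:

// Send a copy of every message put to Q.BUSLOGIC onto Q.ARCHIVE (MQ 9.2.3+)
alter ql(Q.BUSLOGIC) streamq(Q.ARCHIVE) strmqos(BESTEF)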
I'm debugging some Java code that uses Apache POI to pull data out of Microsoft Office documents. Occasionally, it encounters a large document and POI crashes when it runs out of memory. At that point, it tries to publish the error to RabbitMQ, so that other components can know that this step failed and take the appropriate actions. However, when it tries to publish to the queue, it gets a com.rabbitmq.client.AlreadyClosedException (clean connection shutdown; reason: Attempt to use closed channel).
Here's the error handler code:
try {
    // Extraction and indexing code
}
catch (Throwable t) {
    // Something went wrong! We'll publish the error and then move on with
    // our lives
    System.out.println("Error received when indexing message: ");
    t.printStackTrace();
    System.out.println();
    String error = PrintExc.format(t);
    message.put("error", error);
    if (mime == null) {
        mime = "application/vnd.unknown";
    }
    message.put("mime", mime);
    publish("IndexFailure", "", MessageProperties.PERSISTENT_BASIC, message);
}
For completeness, here's the publish method:
private void publish(String exch, String route,
        AMQP.BasicProperties props, Map<String, Object> message) throws Exception {
    chan.basicPublish(exch, route, props,
            JSONValue.toJSONString(message).getBytes());
}
I can't find any code within the try block that appears to close the RabbitMQ channel. Are there any circumstances in which the channel could be closed implicitly?
EDIT: I should note that the AlreadyClosedException is thrown by the basicPublish call inside publish.
An AMQP channel is closed on a channel error. Two common things that can cause a channel error:
Trying to publish a message to an exchange that doesn't exist
Trying to publish a message with the immediate flag set when there is no queue with an active consumer
I would look into setting up a ShutdownListener on the channel you're trying to publish with, using addShutdownListener(), to catch the shutdown event and look at what caused it.
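A minimal sketch of that (the broker host and exchange name are placeholders):

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class ShutdownListenerExample {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // placeholder broker host
        try (Connection conn = factory.newConnection()) {
            Channel chan = conn.createChannel();
            // Prints the AMQP method that closed the channel, e.g. a 404 channel error
            chan.addShutdownListener(cause ->
                    System.err.println("Channel closed: " + cause.getReason()));
            chan.basicPublish("no-such-exchange", "", null, "test".getBytes()); // triggers a 404
        }
    }
}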
Another reason in my case was that I had acknowledged a message twice by mistake. This led to RabbitMQ errors in the log like this after the second acknowledgement:
=ERROR REPORT==== 11-Dec-2012::09:48:29 ===
connection <0.6792.0>, channel 1 - error:
{amqp_error,precondition_failed,"unknown delivery tag 1",'basic.ack'}
After I removed the duplicate acknowledgement, the errors went away, the channel did not close anymore, and the AlreadyClosedExceptions were gone.
I'd like to add this information for other users searching for this topic.
Another possible reason for receiving a channel closed exception is when publishers and consumers access the channel/queue with different queue declarations/settings, for example:
Publisher
channel.queueDeclare("task_queue", durable, false, false, null);
Worker
channel.queueDeclare("task_queue", false, false, false, null);
From the RabbitMQ site:
RabbitMQ doesn't allow you to redefine an existing queue with different parameters and will return an error to any program that tries to do that
Apparently, there are many reasons for the AMQP connection and/or channels to close abruptly. In my case, there were too many unacknowledged messages on the queue, because the consumer didn't specify the prefetch count, so the connection was getting terminated every minute or so. Limiting the number of unacknowledged messages by setting the consumer's prefetch count to a non-zero value fixed the problem:
channel.basicQos(100);
For those who wonder why their consuming channels are closing: check whether you try to Ack or Nack the same delivery more than once.
In the rabbitmq log you would see messages like:
operation basic.ack caused a channel exception precondition_failed:
unknown delivery tag ...
I also had this problem. The reason in my case was that I first built the queue with durable = false, and when I later switched durable to true I got this error message in the log file:
"inequivalent arg 'durable' for queue 'logsQueue' in vhost '/':
received 'true' but current is 'false'"
Then I changed the name of the queue and it worked for me. I assumed that the RabbitMQ server keeps a record of the declared queues somewhere and cannot change a queue from durable to non-durable, or vice versa.
Again, I made durable = false for the new queue, and this time I got this error:
"inequivalent arg 'durable' for queue 'logsQueue1' in vhost '/':
received 'false' but current is 'true'"
My assumption was right. When I listed the queues on the RabbitMQ server with:
rabbitmqctl list_queues
I saw both queues on the server.
To summarize, there are 2 solutions:
1. Rename the queue, which is not a good solution.
2. Reset RabbitMQ (note that rabbitmqctl reset deletes all data on the node, including queues and exchanges):
rabbitmqctl stop_app
rabbitmqctl reset
rabbitmqctl start_app