KafkaException: Couldn't close the Consumer? - java

Today I came across a one-time exception, and I don't understand how it could occur.
It cannot be reproduced: it happened twice in a row and never again.
Code structure:
try {
    // create Apache KafkaConsumer, consume something, do stuff, call commit
    consumer.close();
} catch (Exception ex) {
    // log stuff
} finally {
    if (consumer != null) consumer.close();
}
Connection was OK - Kafka was alive because my script found and consumed a message from topic.
What I can't understand: in what possible case would I be unable to close the consumer after committing? Could Kafka have "died" in the middle of the process?
By the way, this is just test automation: a script that consumes a particular message.
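Because KafkaConsumer implements java.io.Closeable, the structure above could also be written with try-with-resources, so that close() runs exactly once and any exception it throws reaches the catch block. The sketch below illustrates only that control flow. StubConsumer is a hypothetical stand-in so the example runs without a broker; it is not part of the Kafka client.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseOnceSketch {

    // Hypothetical stand-in for a consumer; it only records how often close() is called.
    static class StubConsumer implements AutoCloseable {
        int closeCalls = 0;
        final List<String> log = new ArrayList<>();

        String poll() { return "message-1"; }
        void commit() { log.add("commit"); }

        @Override
        public void close() {
            closeCalls++;
            log.add("close");
        }
    }

    // Consume one message, commit, and let try-with-resources close the consumer.
    static String consumeOnce(StubConsumer consumer) {
        try (StubConsumer c = consumer) {
            String message = c.poll();
            c.commit();
            return message;
        } catch (RuntimeException ex) {
            // A failure in poll(), commit() or close() ends up here.
            return null;
        }
    }

    public static void main(String[] args) {
        StubConsumer consumer = new StubConsumer();
        System.out.println(consumeOnce(consumer) + ", close calls: " + consumer.closeCalls);
    }
}
```

With this shape there is no second close() in a finally block, which removes one place where a close-time exception could surface.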

Related

Spring Cloud Stream - notice and handle errors in broker

I am fairly new to developing distributed applications with messaging, and to Spring Cloud Stream in particular. I am currently wondering about best practices on how to deal with errors on the broker side.
In our application, we need to both consume and produce messages from/to multiple sources/destinations like this:
Consumer side
For consuming, we have defined multiple @Beans of type java.util.function.Consumer. The configuration for those looks like this:
spring.cloud.stream.bindings.consumeA-in-0.destination=inputA
spring.cloud.stream.bindings.consumeA-in-0.group=$Default
spring.cloud.stream.bindings.consumeB-in-0.destination=inputB
spring.cloud.stream.bindings.consumeB-in-0.group=$Default
This part works quite well: when starting the application, the exchanges "inputA" and "inputB" as well as the queues "inputA.$Default" and "inputB.$Default" with the corresponding bindings are automatically created in RabbitMQ.
Also, in case of an error (e.g. a queue is suddenly not available), the application gets notified immediately with a QueuesNotAvailableException and continuously tries to re-establish the connection.
My only question here is: Is there some way to handle this exception in code? Or, what are best practices to deal with failures like this on broker side?
Producer side
This one is more problematic. Producing messages is triggered by some internal logic, so we cannot use function @Beans here. Instead, we currently rely on StreamBridge to send messages. The problem is that this approach does not trigger creation of exchanges and queues on startup. So when our code calls streamBridge.send("outputA", message), the message is sent (result is true), but it just disappears into the void since RabbitMQ automatically drops unroutable messages.
I found that with this configuration, I can at least get RabbitMQ to create exchanges and queues as soon as the first message is sent:
spring.cloud.stream.source=produceA;produceB
spring.cloud.stream.default.producer.requiredGroups=$Default
spring.cloud.stream.bindings.produceA-out-0.destination=outputA
spring.cloud.stream.bindings.produceB-out-0.destination=outputB
I need to use streamBridge.send("produceA-out-0", message) in code to make it work, which is not too great since it means having explicit configuration hardcoded, but at least it works.
I also tried to implement the producer in a Reactor style as described in this answer, but in this case the exchange/queue is also not created on application startup, and the sent message just disappears even though the return status of the sending method is "OK".
Failures on the broker side are not registered at all with this approach - when I simulate one e.g. by deleting the queue or the exchange, it is not registered by the application. Only when another message is sent, I get in the logs:
ERROR 21804 --- [127.0.0.1:32404] o.s.a.r.c.CachingConnectionFactory : Shutdown Signal: channel error; protocol method: #method<channel.close>(reply-code=404, reply-text=NOT_FOUND - no exchange 'produceA-out-0' in vhost '/', class-id=60, method-id=40)
But still, the result of StreamBridge#send was true in this case. But we need to know that sending did actually fail at this point (we persist the state of the sent object using this boolean return value). Is there any way to accomplish that?
Any other suggestions on how to make this producer scenario more robust? Best practices?
EDIT
I found an interesting solution to the producer problem using correlations:
...
CorrelationData correlation = new CorrelationData(UUID.randomUUID().toString());
messageHeaderAccessor.setHeader(AmqpHeaders.PUBLISH_CONFIRM_CORRELATION, correlation);
Message<String> message = MessageBuilder.createMessage(payload, messageHeaderAccessor.getMessageHeaders());
boolean sent = streamBridge.send(channel, message);
try {
    final CorrelationData.Confirm confirm = correlation.getFuture().get(30, TimeUnit.SECONDS);
    if (correlation.getReturned() == null && confirm.isAck()) {
        // success logic
    } else {
        // failed logic
    }
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
    // failed logic
} catch (ExecutionException | TimeoutException e) {
    // failed logic
}
using these additional configurations:
spring.cloud.stream.rabbit.default.producer.useConfirmHeader=true
spring.rabbitmq.publisher-confirm-type=correlated
spring.rabbitmq.publisher-returns=true
This seems to work quite well, although I'm still clueless about the return value of StreamBridge#send: it is always true, and I cannot find any information on the cases in which it would be false. The rest is fine, though: I can get information on issues with the exchange or the queue from the correlation or the confirm.
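The decision made in the snippet above (success only when the confirm arrives in time, is an ack, and nothing was returned) can be isolated into a small helper. This is a sketch using only java.util.concurrent. ConfirmCheck, isDelivered, the Boolean future standing in for correlation.getFuture(), and the BooleanSupplier standing in for the getReturned() check are all hypothetical names for illustration, not Spring AMQP API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class ConfirmCheck {

    // ack: completes with true (ack) or false (nack); stand-in for the broker confirm.
    // wasReturned: evaluated after the confirm arrives; true if the message came back as unroutable.
    static boolean isDelivered(Future<Boolean> ack, BooleanSupplier wasReturned, long timeoutMillis) {
        try {
            boolean acked = Boolean.TRUE.equals(ack.get(timeoutMillis, TimeUnit.MILLISECONDS));
            return acked && !wasReturned.getAsBoolean();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        } catch (ExecutionException | TimeoutException e) {
            return false; // confirm failed or did not arrive in time
        }
    }

    public static void main(String[] args) {
        // Acked and not returned
        System.out.println(isDelivered(CompletableFuture.completedFuture(true), () -> false, 100));
        // Acked but returned as unroutable
        System.out.println(isDelivered(CompletableFuture.completedFuture(true), () -> true, 100));
        // Confirm never arrives, so the wait times out
        System.out.println(isDelivered(new CompletableFuture<Boolean>(), () -> false, 100));
    }
}
```

Keeping the check behind a plain boolean helper like this leaves the calling code free of broker-specific types.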
But this solution is very much focused on RabbitMQ, which causes two problems:
our application should be able to connect to different brokers (e.g. Azure Service Bus)
in tests we use Kafka binder and I don't know how to configure the application context to make it work in this case, too
Any help would be appreciated.
On the consumer side, you can listen for an event such as the ListenerContainerConsumerFailedEvent.
https://docs.spring.io/spring-amqp/docs/current/reference/html/#consumer-events
On the producer side, producers only know about exchanges, not any queues bound to them; hence the requiredGroups property which causes the queue to be bound.
You only need spring.cloud.stream.default.producer.requiredGroups=$Default - you can send to arbitrary destinations using the StreamBridge and the infrastructure will be created.
@SpringBootApplication
public class So70769305Application {

    public static void main(String[] args) {
        SpringApplication.run(So70769305Application.class, args);
    }

    @Bean
    ApplicationRunner runner(StreamBridge bridge) {
        return args -> bridge.send("foo", "test");
    }

}
spring.cloud.stream.default.producer.requiredGroups=$Default

How to handle RMQ connection loss?

We're using Java RMQ client in Scala and we're experiencing some issues on DEV environment. We have this fallback strategy set up:
def addConnectionShutdownListener(connection: Connection): Unit = {
  connection.addShutdownListener { cause: ShutdownSignalException =>
    logger.error(s"Error on RMQ connection: ${cause.getMessage}", cause)
    if (exitOnFail) {
      logger.info("Terminating process with RMQ consumer is shut down")
      System.exit(1)
    }
    else if (retryOnFail) {
      logger.info(s"Retrying to connect")
      retryCreatingConnection(1)
    }
  }
}
addConnectionShutdownListener(rmqConnection)
In a similar fashion, I added channel connection shutdown listener.
So there are 2 strategies which we use (and modify through config)
exit on fail
retry on fail
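As one possible shape for the "retry on fail" strategy, the sketch below retries a connection attempt with a capped exponential backoff. It is written in Java for illustration and is not the retryCreatingConnection implementation from the code above. RetrySketch, connectWithRetry and the Supplier standing in for the real connection factory call are hypothetical, and the delays are arbitrary.

```java
import java.util.function.Supplier;

public class RetrySketch {

    // Calls connect until it succeeds or maxAttempts is reached, doubling the delay after each failure.
    static <T> T connectWithRetry(Supplier<T> connect, int maxAttempts, long initialDelayMillis) {
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return connect.get();
            } catch (RuntimeException e) {
                last = e; // remember the most recent failure
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag and stop retrying
                    break;
                }
                delay = Math.min(delay * 2, 30_000L); // cap the backoff at 30 seconds
            }
        }
        throw new IllegalStateException("Connection attempts failed", last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = connectWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("broker unavailable");
            }
            return "connected";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```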
I set up the exit on fail strategy, and sometimes it works correctly: when an error happens I see the line "Terminating process with RMQ consumer is shut down" in the log and the service is restarted correctly (the Kubernetes pod is shut down and started again automatically). I disabled RMQ auto recovery because it didn't work at all.
The problem is that sometimes some queues are left without consumers, and messages are queued and left hanging, but this error message does not appear in the log. It's really hard to test, since I don't know what circumstances occurred on our DEV environment.
What could happen?
Is there a better way to handle a connection loss, or to be more precise - to handle a scenario when consumers are somehow detached from queue?
Thanks in advance,
Amer

Java - Vertx - Publish–Subscribe pattern : publishing message inside its own consumer. Is this a bad idea?

I am new to the publish-subscribe pattern, and we are using Vert.x in our application.
I am trying to do the following for a use case, where I publish inside the message's own consumer:
private void functionality() {
    EventBus eb = Vertx.currentContext().owner().eventBus();
    MessageConsumer<String> consumer = eb.consumer("myAddress");
    consumer.handler(message -> {
        if (condition1) {
            doOperationWhichMightChangeTheCondition();
            eb.publish("myAddress", "Start Operation");
        } else {
            log.info("Operations on all assets completed");
        }
    });
    eb.publish("myAddress", "Start Operation");
}
Is this a bad idea? Can this also lead to a StackOverflowError, as with recursive calls, or to any other issues?
The EventBus.publish method is asynchronous; it does not block to wait for consumers to receive/process the published message. So it is perfectly safe to call publish inside a consumer that will then consume the published message. Internally, Vert.x has Netty schedule another call to the consumer, which will not run until the current invocation (and any other methods scheduled ahead of it on the Netty event loop) completes. You can easily prove this to yourself by writing a test with a consumer that publishes to the address it is consuming from. You won't see a StackOverflowError.
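The scheduling argument above can be illustrated without Vert.x. The sketch below uses a single-threaded ExecutorService as a stand-in for the event loop (an assumption for illustration, not how Vert.x is implemented internally). SelfPublishSketch and run are hypothetical names. The handler "publishes" to itself by enqueueing another task and returning, so each invocation starts on a fresh stack frame.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SelfPublishSketch {

    // Runs a handler that re-publishes to itself until it has run `rounds` times.
    // Returns the number of invocations, or -1 if it did not finish in time.
    static int run(int rounds) {
        ExecutorService eventLoop = Executors.newSingleThreadExecutor(); // stand-in for an event loop
        AtomicInteger handled = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(1);

        Runnable[] handler = new Runnable[1];
        handler[0] = () -> {
            if (handled.incrementAndGet() < rounds) {
                eventLoop.execute(handler[0]); // "publish": enqueue, then return immediately
            } else {
                done.countDown();
            }
        };

        eventLoop.execute(handler[0]); // initial publish
        try {
            return done.await(30, TimeUnit.SECONDS) ? handled.get() : -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        } finally {
            eventLoop.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("handler invocations: " + run(100_000));
    }
}
```

A directly recursive version of the same handler would nest one stack frame per round; here the stack depth stays constant because every round is a separate task.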

How do I stop a Camel route when JVM gets to a certain heap size?

I am using Apache Camel to connect to various endpoints, including JMS topics, and write to a database. Sometimes the database connection fails (for whatever reason, database issue, network blip, etc) and the messages from the topic subscriber start backing up. At a certain point, there are so many messages backed up waiting to be written to the database that the application throws an out of memory error. So far I understand all that.
The problem I have is the following: When the application is frantically trying to garbage collect before eventually giving up and accepting that it is out of memory, the application stops working, but is still alive. This means that the topic subscriber is still seen as active by the JMS provider, but not reading anything off the topic, so the provider starts queueing up the messages. Eventually the provider falls over also when the maximum depth runs out.
How can I configure my application to either disconnect when reaching a certain heap usage, or kill itself completely much much faster when running out of memory? I believe there are some JVM parameters that allow the application to kill itself much quicker when running out of memory, but I am wondering if that is the best solution or whether there is another way?
First of all, I think you should use a JDBC connection pool that is capable of refreshing failed connections, so that you do not run into the described scenario in the first place, at least not if the DB/network issue is short-lived.
Next, I'd protect the message broker by applying producer flow control (at least that's what it is called in ActiveMQ), i.e. prevent message producers from submitting more messages once a certain memory threshold has been breached. If the thresholds are set correctly, that will prevent your message broker from falling over.
As for your original question: I'd use JMX to monitor the VM. If some metric, e.g. memory, breaches a threshold then you can suspend or shut down the route or the whole Camel context via the MBeans Camel exposes.
You can control (start/stop and suspend/resume) Camel routes using the Camel context methods .stop(), .start(), .suspend() and .resume().
You can spin a separate thread that monitors the current VM memory and stops the required route when a certain condition is met.
new Thread() {
    @Override
    public void run() {
        try {
            while (true) {
                long free = Runtime.getRuntime().freeMemory();
                boolean routeRunning = camelContext.isRouteStarted("yourRoute");
                if (free < threshold && routeRunning) {
                    camelContext.stopRoute("yourRoute");
                } else if (free > threshold && !routeRunning) {
                    camelContext.startRoute("yourRoute");
                }
                // Check every 10 seconds
                Thread.sleep(10000);
            }
        } catch (Exception e) {
            // Interrupted, or stopping/starting the route failed: end the monitor thread
        }
    }
}.start();
As commented in the other answer, relying on this is not particularly robust, but it is at least a little more robust than getting an OutOfMemoryError. Note that you need to .stop() the route; .suspend() does not deallocate resources, which means the connection with the queue provider stays open and the service still looks like it is open for business.
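One detail worth illustrating: Runtime.freeMemory() reports free space within the currently allocated heap (totalMemory()), and that heap can still grow up to maxMemory(). A check based on used heap as a fraction of the maximum may therefore be a steadier signal. The sketch below shows such a check together with separate stop and restart thresholds, so the route does not flap around a single value. HeapGuardSketch, Action, decide and the 0.85/0.60 thresholds are hypothetical choices for illustration, and the result would still need to be wired to the route start/stop calls.

```java
public class HeapGuardSketch {

    enum Action { STOP_ROUTE, START_ROUTE, NONE }

    // Fraction of the maximum heap currently in use.
    static double usedHeapFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    // Hysteresis: stop above `high`, restart only once usage has dropped below `low`.
    static Action decide(double usedFraction, boolean routeRunning, double high, double low) {
        if (routeRunning && usedFraction > high) {
            return Action.STOP_ROUTE;
        }
        if (!routeRunning && usedFraction < low) {
            return Action.START_ROUTE;
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        double used = usedHeapFraction();
        System.out.println("used heap fraction: " + used);
        System.out.println("action: " + decide(used, true, 0.85, 0.60));
    }
}
```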
You can also stop the route as part of the error handling of the route itself. This is possibly more robust, but it requires either manual intervention to restart the route once the error is cleared, or a scheduled route that periodically checks whether the error condition still exists and restarts the route once it is gone. The thing to keep in mind is that you cannot stop a route from the same thread that is servicing the route at the time, so you need to spin up a separate thread that does the stopping. For example:
from("jms://myqueue").routeId("sample")
    // Handle SQL Exceptions by shutting down the route
    .onException(SQLException.class)
        .process(new Processor() {
            // This processor spawns a new thread that stops the current route
            Thread stop;
            @Override
            public void process(final Exchange exchange) throws Exception {
                if (stop == null) {
                    stop = new Thread() {
                        @Override
                        public void run() {
                            try {
                                // Stop the current route
                                exchange.getContext().stopRoute("sample");
                            } catch (Exception e) {}
                        }
                    };
                    // start the thread in background (only once; a Thread cannot be started twice)
                    stop.start();
                }
            }
        })
    .end()
    // Standard route processors go here
    .to(...);

Does Spring Integration RabbitTemplate publish to persistent queue by default?

I have a scheduled task that performs the following bit of code:
try {
    rabbitTemplate.convertAndSend("TEST");
    if (!isOn()) {
        turnOn();
    }
}
catch (AmqpException e) {
    if (isOn()) {
        turnOff();
    }
}
Everything works just fine. It sends this message to the default "AMQP default" exchange. I do not have a consumer on the other end to consume these messages because I am just ensuring that the server is still alive. Will these messages accumulate over time and cause a memory leak?
Thanks!
K
Do you have a RabbitMQ user interface?
You should be able to see the queues that are being created and whether they are persistent or not. Last time I checked, the default behaviour of Spring AMQP is to create persistent queues.
Have a look at the RabbitMQ Management Plugin: http://www.rabbitmq.com/management.html
Using the RabbitMQ Management Plugin, you can also consume messages that you've published via your code.
Regarding what happens with the messages, they will just pile up and pile up until RabbitMQ hits its limits, then it will no longer accept messages until you purge the queue or consume those messages. With the default RabbitMQ settings, I was able to send about 4 million simple text messages to the queue before it started blocking.
