How to handle backpressure when using Apache Camel and Kafka? (Java)

I am trying to write an application that integrates with Kafka using Camel (version 3.4.2).
I have an approach borrowed from the answer to this question.
I have a route that listens for messages from a Kafka topic. The processing of a message is decoupled from its consumption by using a simple executor: each unit of processing is submitted as a task to this executor. Message ordering is not important; the only concern is how quickly and efficiently a message can be processed. I have disabled auto-commit and manually commit the offsets once the tasks are submitted to the executor. Losing the messages that are currently being processed (due to a crash/shutdown) is acceptable, but messages in Kafka that were never submitted for processing must not be lost (through committing their offsets). Now to the questions:
How can I efficiently handle the load? For example, there are 1000 messages but I can only process 100 in parallel at a time.
Right now my solution is to block the consumer polling thread and keep retrying the submission. Suspending polling would be a much better approach, but I cannot find any way to achieve that in Camel.
Is there a better (Camel) way to decouple processing from consumption and handle backpressure?
public static void main(String[] args) throws Exception {
    String consumerId = System.getProperty("consumerId", "1");
    ExecutorService executor = new ThreadPoolExecutor(100, 100, 0L, TimeUnit.MILLISECONDS,
            new SynchronousQueue<>());
    LOGGER.info("Consumer {} starting....", consumerId);
    Main main = new Main();
    main.init();
    CamelContext context = main.getCamelContext();
    ComponentsBuilderFactory.kafka().brokers("localhost:9092").metadataMaxAgeMs(120000).groupId("consumer")
            .autoOffsetReset("earliest").autoCommitEnable(false).allowManualCommit(true).maxPollRecords(100)
            .register(context, "kafka");
    ConsumerBean bean = new ConsumerBean();
    context.addRoutes(new RouteBuilder() {
        @Override
        public void configure() {
            from("kafka:test").process(exchange -> {
                LOGGER.info("Consumer {} - Exchange is {}", consumerId, exchange.getIn().getHeaders());
                processTask(exchange);
                commitOffset(exchange);
            });
        }

        private void processTask(Exchange exchange) throws InterruptedException {
            try {
                executor.submit(() -> bean.execute(exchange.getIn().getBody(String.class)));
            } catch (Exception e) {
                LOGGER.error("Exception occurred {}", e.getMessage());
                Thread.sleep(1000);
                processTask(exchange);
            }
        }

        private void commitOffset(Exchange exchange) {
            boolean lastOne = exchange.getIn().getHeader(KafkaConstants.LAST_RECORD_BEFORE_COMMIT, Boolean.class);
            if (lastOne) {
                KafkaManualCommit manual = exchange.getIn().getHeader(KafkaConstants.MANUAL_COMMIT,
                        KafkaManualCommit.class);
                if (manual != null) {
                    LOGGER.info("manually committing the offset for batch");
                    manual.commitSync();
                }
            } else {
                LOGGER.info("NOT time to commit the offset yet");
            }
        }
    });
    main.run();
}

You can use the Throttle EIP for this purpose.
from("your uri here")
.throttle(maxRequestCount)
.timePeriodMillis(inTimePeriodMs)
.to(yourProcessorUri)
.end()
Please take a look at the original documentation here.
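Applied to the route from the question, a minimal sketch could look roughly like this (the endpoint name and the bean call are taken from the question; the limits of 100 exchanges per second are purely illustrative assumptions):
// Sketch: consume from the Kafka topic and let at most 100 exchanges
// per second continue to the processing step.
from("kafka:test")
    .throttle(100).timePeriodMillis(1000)
    .process(exchange -> bean.execute(exchange.getIn().getBody(String.class)));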

Related

Pulsar client thread balance

I'm trying to implement a Pulsar client with multiple producers that distributes the load among the threads, but regardless of the value passed to ioThreads() and listenerThreads(), it always overloads the first thread (> 65% CPU while the other threads are completely idle).
I have tried a few things, including this "dynamic rebalancing" every hour (last method), but closing the producers in the middle of processing is certainly not the best approach.
This is the relevant code:
...
// pulsar client
pulsarClient = PulsarClient.builder() //
        .operationTimeout(config.getAppPulsarTimeout(), TimeUnit.SECONDS) //
        .ioThreads(config.getAppPulsarClientThreads()) //
        .listenerThreads(config.getAppPulsarClientThreads()) //
        .serviceUrl(config.getPulsarServiceUrl()).build();
...
private void createProducers() {
    String strConsumerTopic = this.config.getPulsarTopicInput();
    List<Integer> protCasesList = this.config.getEventProtoCaseList();
    for (Integer e : protCasesList) {
        String topicName = config.getPulsarTopicOutput().concat(String.valueOf(e));
        LOG.info("Creating producer for topic: {}", topicName);
        Producer<byte[]> protobufProducer = pulsarClient.newProducer().topic(topicName).enableBatching(false)
                .blockIfQueueFull(true).compressionType(CompressionType.NONE)
                .sendTimeout(config.getPulsarSendTimeout(), TimeUnit.SECONDS)
                .maxPendingMessages(config.getPulsarMaxPendingMessages()).create();
        this.mapLink.put(strConsumerTopic.concat(String.valueOf(e)), protobufProducer);
    }
}

public void closeProducers() {
    String strConsumerTopic = this.config.getPulsarTopicInput();
    List<Integer> protCasesList = this.config.getEventProtoCaseList();
    for (Integer e : protCasesList) {
        try {
            this.mapLink.get(strConsumerTopic.concat(String.valueOf(e))).close();
            LOG.info("{} producer correctly closed...",
                    this.mapLink.get(strConsumerTopic.concat(String.valueOf(e))).getProducerName());
        } catch (PulsarClientException e1) {
            LOG.error("Producer: {} not closed cause: {}",
                    this.mapLink.get(strConsumerTopic.concat(String.valueOf(e))).getProducerName(),
                    e1.getMessage());
        }
    }
}

public void rebalancePulsarThreads(boolean firstRun) {
    ThreadMXBean threadHandler = ManagementFactory.getThreadMXBean();
    ThreadInfo[] threadsInfo = threadHandler.getThreadInfo(threadHandler.getAllThreadIds());
    for (ThreadInfo threadInfo : threadsInfo) {
        if (threadInfo.getThreadName().contains("pulsar-client-io")) {
            // enable cpu time for all threads
            threadHandler.setThreadCpuTimeEnabled(true);
            // get cpu time for this specific thread
            long threadCPUTime = threadHandler.getThreadCpuTime(threadInfo.getThreadId());
            int thresholdCPUTime = 65;
            if (threadCPUTime > thresholdCPUTime) {
                LOG.warn("Pulsar client thread with CPU time greater than {}% - REBALANCING now", thresholdCPUTime);
                try {
                    closeProducers();
                } catch (Exception e) {
                    if (!firstRun) {
                        // producers will not be available in the first run
                        // therefore, the logging only happens when it is not the first run
                        LOG.warn("Unable to close Pulsar client threads on rebalancing: {}", e.getMessage());
                    }
                }
                try {
                    createPulsarProducers();
                } catch (Exception e) {
                    LOG.warn("Unable to create Pulsar client threads on rebalancing: {}", e.getMessage());
                }
            }
        }
    }
}
From what you describe, the most likely scenario is that all the topics you're using are served by a single broker.
If that's indeed the case (and leaving topic load balancing across brokers aside), it is normal that only one thread is busy: all these producers share a single pooled TCP connection, and each connection is assigned to one IO thread (listener threads are only used for consumer listeners).
If you want to force more threads, you can increase the maximum number of TCP connections per broker so that all the configured IO threads are used,
e.g.:
PulsarClient client = PulsarClient.builder()
        .serviceUrl("pulsar://localhost:6650")
        .ioThreads(16)
        .connectionsPerBroker(16)
        .build();

Alert if Kafka nodes are down

I have been given a set of Kafka nodes and a topic name. My web server receives a lot of HTTP request data which I need to process and then push to Kafka. Sometimes, when the Kafka nodes are down, my server still keeps pumping data, which blows up its memory and eventually brings the server down.
What I want is to stop publishing data when Kafka is down. My Java sample code is as follows:
static Producer<String, String> producer;

Produce() {
    Properties properties = new Properties();
    properties.put("request.required.acks", "1");
    properties.put("bootstrap.servers", "localhost:9092,localhost:9093,localhost:9094");
    properties.put("enabled", "true");
    properties.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    properties.put("kafka-topic", "pixel-server");
    properties.put("batch.size", "1000");
    properties.put("producer.type", "async");
    properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    producer = new KafkaProducer<String, String>(properties);
}

public static void main(String[] args) {
    Produce produce = new Produce();
    produce.send(producer, "pixel-server", "Some time");
}

// This method is called a lot of times
public void send(Producer<String, String> producer, String topic, String data) {
    ProducerRecord<String, String> producerRecord = new ProducerRecord<>(topic, data);
    Future<RecordMetadata> response = producer.send(producerRecord, (metadata, exception) -> {
        if (null != exception) {
            exception.printStackTrace();
        } else {
            System.out.println("Done");
        }
    });
}
I have just abstracted out some sample code. The send method is called numerous times. I just want to prevent sending any message when Kafka is down. What is an efficient way to tackle this situation?
If I were you, I'd try to implement a circuit breaker. When you hit a reasonable number of failures while sending your records, the circuit opens and some fallback behaviour kicks in. Once some condition is met (e.g. some time has passed), the circuit closes and you start sending records again. Also, vertx.io comes with its own solution.
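A minimal sketch of that idea in plain Java, without any library (the failure threshold, the cool-down duration and the fallback are arbitrary assumptions, not part of the original code): once too many consecutive sends have failed, new messages are diverted to the fallback until the cool-down has elapsed.
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class CircuitBreakerSender {

    private static final int FAILURE_THRESHOLD = 5;        // assumed: open after 5 consecutive failures
    private static final long OPEN_DURATION_MS = 30_000L;  // assumed: stay open for 30 seconds

    private final AtomicInteger consecutiveFailures = new AtomicInteger();
    private final AtomicLong openedAt = new AtomicLong(0);
    private final Producer<String, String> producer;

    public CircuitBreakerSender(Producer<String, String> producer) {
        this.producer = producer;
    }

    public void send(String topic, String data) {
        if (isOpen()) {
            // Fallback behaviour: drop, buffer to disk, answer with 503, etc.
            System.err.println("Circuit open - not publishing to Kafka");
            return;
        }
        producer.send(new ProducerRecord<>(topic, data), (metadata, exception) -> {
            if (exception != null) {
                if (consecutiveFailures.incrementAndGet() >= FAILURE_THRESHOLD) {
                    openedAt.set(System.currentTimeMillis());   // open the circuit
                }
            } else {
                consecutiveFailures.set(0);                     // healthy again
            }
        });
    }

    private boolean isOpen() {
        long since = openedAt.get();
        if (since == 0) {
            return false;
        }
        if (System.currentTimeMillis() - since > OPEN_DURATION_MS) {
            openedAt.set(0);                 // cool-down over: let traffic through again
            consecutiveFailures.set(0);
            return false;
        }
        return true;
    }
}
A library such as resilience4j (or the Vert.x circuit breaker mentioned above) gives you the same behaviour with half-open probing and metrics out of the box, so hand-rolling this is only worthwhile if you want to avoid the extra dependency.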

Stopping a Camel route from outside the route

We have a data processing application that runs on Karaf 2.4.3 with Camel 2.15.3.
In this application, we have a bunch of routes that import data. We have a management view that lists these routes and from which each route can be started. Those routes do not import data directly, but call other routes (some of them in other bundles, called via direct-vm), sometimes directly and sometimes in a splitter.
Is there a way to also completely stop a route, thereby stopping the entire exchange from being processed any further?
When simply using the stopRoute function like this:
route.getRouteContext().getCamelContext().stopRoute(route.getId());
I eventually get a success message "Graceful shutdown of 1 routes completed in 10 seconds" - but the exchange is still being processed...
So I tried to mimic the behaviour of the StopProcessor by setting the stop property, but that also didn't help:
public void stopRoute(Route route) {
    try {
        Collection<InflightExchange> browse = route.getRouteContext().getCamelContext().getInflightRepository()
                .browse();
        for (InflightExchange inflightExchange : browse) {
            String exchangeRouteId = inflightExchange.getRouteId();
            if ((exchangeRouteId != null) && exchangeRouteId.equals(route.getId())) {
                this.stopExchange(inflightExchange.getExchange());
            }
        }
    } catch (Exception e) {
        Notification.show("Error while trying to stop route", Type.ERROR_MESSAGE);
        LOGGER.error(e, e);
    }
}

public void stopExchange(Exchange exchange) throws Exception {
    AsyncProcessorHelper.process(new AsyncProcessor() {
        @Override
        public void process(Exchange exchange) throws Exception {
            AsyncProcessorHelper.process(this, exchange);
        }

        @Override
        public boolean process(Exchange exchange, AsyncCallback callback) {
            exchange.setProperty(Exchange.ROUTE_STOP, Boolean.TRUE);
            callback.done(true);
            return true;
        }
    }, exchange);
}
Is there any way to completely stop an exchange from being processed from outside the route?
Can you get hold of the exchange?
I use exchange.setProperty(Exchange.ROUTE_STOP, true);
The route then stops the flow and does not pass the exchange on to the next step.
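For illustration, a minimal sketch of that approach inside a RouteBuilder's configure() (the endpoint names and the "abortRequested" header are made up for this example): a processor marks the exchange, and Camel stops routing it any further.
from("direct:start")
    .process(new Processor() {
        @Override
        public void process(Exchange exchange) throws Exception {
            // Hypothetical condition - e.g. a flag set from the management view
            if (Boolean.TRUE.equals(exchange.getIn().getHeader("abortRequested", Boolean.class))) {
                exchange.setProperty(Exchange.ROUTE_STOP, Boolean.TRUE);   // stop routing this exchange
            }
        }
    })
    .to("direct:nextStep");   // skipped when ROUTE_STOP has been set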

Camel: stop the route when the jdbc connection loss is detected

I have an application with the following route:
from("netty:tcp://localhost:5150?sync=false&keepAlive=true")
.routeId("tcp.input")
.transform()
.simple("insert into tamponems (AVIS) values (\"${in.body}\");")
.to("jdbc:mydb");
This route receives a new message every 59 milliseconds. I want to stop the route when the connection to the database is lost, before a second message arrives. And above all, I never want to lose a message.
I proceeded this way:
I added an errorHandler:
errorHandler(deadLetterChannel("direct:backup")
.redeliveryDelay(5L)
.maximumRedeliveries(1)
.retryAttemptedLogLevel(LoggingLevel.WARN)
.logExhausted(false));
My errorHandler tries to redeliver the message and, if it fails again, redirects the message to a dead letter channel.
The following deadLetterChannel will stop the tcp.input route and try to redeliver the message to the database:
RoutePolicy policy = new StopRoutePolicy();

from("direct:backup")
    .routePolicy(policy)
    .errorHandler(
        defaultErrorHandler()
            .redeliveryDelay(1000L)
            .maximumRedeliveries(-1)
            .retryAttemptedLogLevel(LoggingLevel.ERROR)
    )
    .to("jdbc:mydb");
Here is the code of the routePolicy:
public class StopRoutePolicy extends RoutePolicySupport {

    private static final Logger LOG = LoggerFactory.getLogger(String.class);

    @Override
    public void onExchangeDone(Route route, Exchange exchange) {
        String stop = "tcp.input";
        CamelContext context = exchange.getContext();
        if (context.getRouteStatus(stop) != null && context.getRouteStatus(stop).isStarted()) {
            try {
                exchange.getContext().getInflightRepository().remove(exchange);
                LOG.info("STOP ROUTE: {}", stop);
                context.stopRoute(stop);
            } catch (Exception e) {
                getExceptionHandler().handleException(e);
            }
        }
    }
}
My problems with this method are:
In my "direct:backup" route, if I set maximumRedeliveries to -1, the tcp.input route never stops.
I'm losing messages during the stop.
This way of detecting the connection loss and stopping the route takes too long.
Does anybody have an idea how to make this faster, or how to do it differently so that no messages are lost?
I have finally found a way to resolve my problems. To make the application faster, I added asynchronous processing and multithreading with seda.
from("netty:tcp://localhost:5150?sync=false&keepAlive=true").to("seda:break");
from("seda:break").threads(5)
.routeId("tcp.input")
.transform()
.simple("insert into tamponems (AVIS) values (\"${in.body}\");")
.to("jdbc:mydb");
I did the same with the backup route.
from("seda:backup")
.routePolicy(policy)
.errorHandler(
defaultErrorHandler()
.redeliveryDelay(1000L)
.maximumRedeliveries(-1)
.retryAttemptedLogLevel(LoggingLevel.ERROR)
).threads(2).to("jdbc:mydb");
And I modified the routePolicy like that:
public class StopRoutePolicy extends RoutePolicySupport {

    private static final Logger LOG = LoggerFactory.getLogger(String.class);

    @Override
    public void onExchangeBegin(Route route, Exchange exchange) {
        String stop = "tcp.input";
        CamelContext context = exchange.getContext();
        if (context.getRouteStatus(stop) != null && context.getRouteStatus(stop).isStarted()) {
            try {
                exchange.getContext().getInflightRepository().remove(exchange);
                LOG.info("STOP ROUTE: {}", stop);
                context.stopRoute(stop);
            } catch (Exception e) {
                getExceptionHandler().handleException(e);
            }
        }
    }

    @Override
    public void onExchangeDone(Route route, Exchange exchange) {
        String stop = "tcp.input";
        CamelContext context = exchange.getContext();
        if (context.getRouteStatus(stop) != null && context.getRouteStatus(stop).isStopped()) {
            try {
                LOG.info("RESTART ROUTE: {}", stop);
                context.startRoute(stop);
            } catch (Exception e) {
                getExceptionHandler().handleException(e);
            }
        }
    }
}
With these updates, the TCP route is stopped before the backup route is processed, and the route is restarted when the JDBC connection is back.
Now, thanks to Camel, the application is able to handle a database failure without losing messages and without manual intervention.
I hope this helps you.

Netty slower than Tomcat

We just finished building a server to store data to disk and fronted it with Netty. During load testing we were seeing Netty scale to about 8,000 messages per second. Given our systems, this looked really low. For a benchmark, we wrote a Tomcat front-end and ran the same load tests. With these tests we were getting roughly 25,000 messages per second.
Here are the specs for our load testing machine:
Macbook Pro Quad core
16GB of RAM
Java 1.6
Here is the load test setup for Netty:
10 threads
100,000 messages per thread
Netty server code (pretty standard) - our Netty pipeline on the server is two handlers: a FrameDecoder and a SimpleChannelHandler that handles the request and response.
Client side JIO using Commons Pool to pool and reuse connections (the pool was sized the same as the # of threads)
Here is the load test setup for Tomcat:
10 threads
100,000 messages per thread
Tomcat 7.0.16 with default configuration using a Servlet to call the server code
Client side using URLConnection without any pooling
My main question is why there is such a huge difference in performance? Is there something obvious with respect to Netty that would get it to run faster than Tomcat?
Edit: Here is the main Netty server code:
NioServerSocketChannelFactory factory = new NioServerSocketChannelFactory();
ServerBootstrap server = new ServerBootstrap(factory);
server.setPipelineFactory(new ChannelPipelineFactory() {
    public ChannelPipeline getPipeline() {
        RequestDecoder decoder = injector.getInstance(RequestDecoder.class);
        ContentStoreChannelHandler handler = injector.getInstance(ContentStoreChannelHandler.class);
        return Channels.pipeline(decoder, handler);
    }
});
server.setOption("child.tcpNoDelay", true);
server.setOption("child.keepAlive", true);
Channel channel = server.bind(new InetSocketAddress(port));
allChannels.add(channel);
Our handlers look like this:
public class RequestDecoder extends FrameDecoder {

    @Override
    protected ChannelBuffer decode(ChannelHandlerContext ctx, Channel channel, ChannelBuffer buffer) {
        if (buffer.readableBytes() < 4) {
            return null;
        }
        buffer.markReaderIndex();
        int length = buffer.readInt();
        if (buffer.readableBytes() < length) {
            buffer.resetReaderIndex();
            return null;
        }
        return buffer;
    }
}

public class ContentStoreChannelHandler extends SimpleChannelHandler {

    private final RequestHandler handler;

    @Inject
    public ContentStoreChannelHandler(RequestHandler handler) {
        this.handler = handler;
    }

    @Override
    public void messageReceived(ChannelHandlerContext ctx, MessageEvent e) {
        ChannelBuffer in = (ChannelBuffer) e.getMessage();
        in.readerIndex(4);

        ChannelBuffer out = ChannelBuffers.dynamicBuffer(512);
        out.writerIndex(8); // Skip the length and status code

        boolean success = handler.handle(new ChannelBufferInputStream(in), new ChannelBufferOutputStream(out), new NettyErrorStream(out));
        if (success) {
            out.setInt(0, out.writerIndex() - 8); // length
            out.setInt(4, 0); // Status
        }

        Channels.write(e.getChannel(), out, e.getRemoteAddress());
    }

    @Override
    public void exceptionCaught(ChannelHandlerContext ctx, ExceptionEvent e) {
        Throwable throwable = e.getCause();
        ChannelBuffer out = ChannelBuffers.dynamicBuffer(8);
        out.writeInt(0); // Length
        out.writeInt(Errors.generalException.getCode()); // status
        Channels.write(ctx, e.getFuture(), out);
    }

    @Override
    public void channelOpen(ChannelHandlerContext ctx, ChannelStateEvent e) {
        NettyContentStoreServer.allChannels.add(e.getChannel());
    }
}
UPDATE:
I've managed to get my Netty solution to within 4,000/second. A few weeks back I was testing a client-side PING in my connection pool as a safeguard against idle sockets, but I forgot to remove that code before I started load testing. This code effectively PINGed the server every time a Socket was checked out from the pool (using Commons Pool). I commented that code out and I'm now getting 21,000/second with Netty and 25,000/second with Tomcat.
Although this is great news on the Netty side, I'm still getting 4,000/second less with Netty than with Tomcat. I can post my client side (which I thought I had ruled out, but apparently not) if anyone is interested in seeing it.
The messageReceived method is executed on a worker thread that is possibly getting blocked by RequestHandler#handle, which may be busy doing some I/O work.
You could try adding an OrderedMemoryAwareThreadPoolExecutor to the channel pipeline (recommended) for executing the handlers, or alternatively try dispatching your handler work to a new ThreadPoolExecutor and passing a reference to the socket channel for later writing the response back to the client. E.g.:
@Override
public void messageReceived(ChannelHandlerContext ctx, final MessageEvent e) {
    executor.submit(new Runnable() {
        @Override
        public void run() {
            processHandlerAndRespond(e);
        }
    });
}

private void processHandlerAndRespond(MessageEvent e) {
    ChannelBuffer in = (ChannelBuffer) e.getMessage();
    in.readerIndex(4);

    ChannelBuffer out = ChannelBuffers.dynamicBuffer(512);
    out.writerIndex(8); // Skip the length and status code

    boolean success = handler.handle(new ChannelBufferInputStream(in), new ChannelBufferOutputStream(out), new NettyErrorStream(out));
    if (success) {
        out.setInt(0, out.writerIndex() - 8); // length
        out.setInt(4, 0); // Status
    }

    Channels.write(e.getChannel(), out, e.getRemoteAddress());
}
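For the first (recommended) option, a minimal sketch of wiring an ExecutionHandler backed by an OrderedMemoryAwareThreadPoolExecutor into the pipeline from the question could look like this (Netty 3.x API; the thread count and memory limits are arbitrary assumptions, and server, injector, RequestDecoder and ContentStoreChannelHandler are the ones from the question's code):
import org.jboss.netty.channel.ChannelPipeline;
import org.jboss.netty.channel.ChannelPipelineFactory;
import org.jboss.netty.channel.Channels;
import org.jboss.netty.handler.execution.ExecutionHandler;
import org.jboss.netty.handler.execution.OrderedMemoryAwareThreadPoolExecutor;

// Create once and share across channels: 16 threads, 1 MB per-channel and
// 64 MB total limits on queued message memory (assumed values).
final ExecutionHandler executionHandler = new ExecutionHandler(
        new OrderedMemoryAwareThreadPoolExecutor(16, 1048576, 67108864));

server.setPipelineFactory(new ChannelPipelineFactory() {
    public ChannelPipeline getPipeline() {
        RequestDecoder decoder = injector.getInstance(RequestDecoder.class);
        ContentStoreChannelHandler handler = injector.getInstance(ContentStoreChannelHandler.class);
        // Handlers placed after executionHandler run on the executor's threads,
        // keeping the IO worker threads free while handle() does blocking work.
        return Channels.pipeline(decoder, executionHandler, handler);
    }
});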
