I have an efficiency problem in my project, which uses Camel with the Esper component.
I have several external datasources feeding information to Camel endpoints. Each Camel endpoint that receives data transfers it to a route that processes it and then delivers it to an Esper endpoint.
The image below illustrates this behavior:
The efficiency problem is that all of this is done by a single Java thread. Thus if I have many sources, there is a huge bottleneck.
The following code accurately illustrates what is going on with the image:
public final void configure() throws OperationNotSupportedException {
    RouteDefinition route = from("xmpp://localhost:5222/?blablabla...");
    // apply some filter
    FilterDefinition filterDefinition = route.filter().method(...);
    // apply main processor
    ExpressionNode expressionNode = filterDefinition.process(...);
    // set destination
    expressionNode.to("esper://session_X");
}
To fix this problem, I need to handle this situation with a pool of threads or some sort of parallel processing. I cannot use patterns like Multicast, Recipient List, etc., because all of those send the same message to multiple endpoints/clients, which is not the case in my example.
A possible solution would be having 1 thread per each "Datasource endpoint -> Route -> Esper endpoint" combination, like the image below:
Another possible solution is to have 1 thread receive everything from the datasources, and then dispatch it to multiple threads handling the route processing together with the other endpoint:
PS: I am open to any other possible suggestions you may have.
To achieve one of these I have considered using the Camel SEDA component; however, this one does not seem to allow me to have dynamic thread pools, because the concurrentConsumers property is static. Furthermore, I am not sure if I can use a SEDA endpoint at all, because I believe (although I am not completely sure) that the syntax for an endpoint like .to("seda:esper://session_X?concurrentConsumers=10") is invalid in Camel.
So, at this point I am quite lost and I don't know what to do:
- Is SEDA the solution I am looking for?
- If yes, how do I integrate it with the Esper endpoint given the syntax problem?
- Are there any other solutions / Camel components that could fix my problem?
You must define a separate seda route that distributes your messages to the Esper engine, such as (using the fluent style):
public final void configure() throws OperationNotSupportedException {
    from("xmpp://localhost:5222/?blablabla...")
        .filter().method(...)
        .process(...)
        .to("seda:sub");

    from("seda:sub?concurrentConsumers=10")
        .to("esper://session_X");
}
That said, seda should only be used if losing messages is not a problem. Otherwise you should use a more robust transport such as JMS, which allows messages to be persisted.
EDIT:
Beside seda, you could use threads(), where you could customize the threading behaviour by defining an ExecutorService:
public final void configure() throws OperationNotSupportedException {
    from("xmpp://localhost:5222/?blablabla...")
        .filter().method(...)
        .process(...)
        .threads()
        .executorService(Executors.newFixedThreadPool(2))
        .to("esper://session_X");
}
If you use seda or threads(), you may lose transaction safety in case of failures. For this case, or if you need to balance the workload across several remote hosts, you may use JMS. More information about this solution is found here.
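For illustration, a JMS-backed variant of the seda route above might look like the following sketch (it assumes a JMS component registered under the name `jms` in the Camel context; the queue name `esperInbox` is made up):

```java
public final void configure() throws OperationNotSupportedException {
    from("xmpp://localhost:5222/?blablabla...")
        .filter().method(...)
        .process(...)
        .to("jms:queue:esperInbox"); // messages are persisted by the broker

    // concurrentConsumers works here too, but messages survive a crash
    from("jms:queue:esperInbox?concurrentConsumers=10")
        .to("esper://session_X");
}
```

The structure is identical to the seda version; only the intermediate channel changes, which is what makes swapping them cheap.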
Related
I am using spring-kafka to implement a consumer that reads messages from a certain topic. All of these messages are processed by them being exported into another system via a REST API. For that, the code uses the WebClient from the Spring Webflux project, which results in reactive code:
@KafkaListener(topics = "${some.topic}", groupId = "my-group-id")
public void listenToTopic(final ConsumerRecord<String, String> record) {
    // minimal, non-reactive code here (logging, serializing the string)
    webClient.get().uri(...).retrieve().bodyToMono(String.class)
        // long, reactive chain here
        .subscribe();
}
Now I am wondering if this setup is reasonable, or if it could cause a lot of issues because the @KafkaListener logic from spring-kafka isn't inherently reactive. I wonder if it is necessary to use reactor-kafka instead.
My understanding of the whole reactive world and also the kafka world is very limited, but here is what I am currently assuming the above setup would entail:
1. The listenToTopic function will almost immediately return, because the bulk of the work is done in a reactive chain, which will not block the function from returning. This means that, from what I understand, the @KafkaListener logic will assume that the message is properly processed right there and then, so it will probably acknowledge it and at some point also commit it. If I understand correctly, that means the processing of the messages could get out of order: work could still be going on in the previous reactive chain while the listener already fetches the next record. So if the application relies on the messages being fully processed in strict order, the above setup would be bad; but if it does not, it would be okay?
2. Another issue with the above setup is that the application could overload itself with work if a lot of messages come in. Because the listener function returns almost immediately, a large number of messages could be processing inside reactive chains at the same time.
3. The retry logic that comes built in with the @KafkaListener logic would not really work here, because exceptions inside the reactive chain would not trigger it. Any retry logic would have to be handled by the reactive code inside the listener function itself.
4. When using reactor-kafka instead of the @KafkaListener annotation, one could change the behaviour described in point 1. Because the listener would now be integrated into the reactive chain, it would be possible to acknowledge a message only when the reactive chain has actually finished. This way, from what I understand, the next message will only be fetched after one message is fully processed, which would probably solve the issues described in points 2 and 3 as well.
The question: Is my understanding of the situation correct? Are there other issues that could be caused by this setup that I have missed?
Your understanding is correct; either switch to a non-reactive rest client (e.g. RestTemplate) or use reactor-kafka for the consumer.
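If you do stay with the non-reactive listener for a while, the overload concern described above can be mitigated by bounding the number of in-flight tasks. Here is a plain-Java sketch of that idea (the class and names are illustrative, not part of spring-kafka): the listener thread blocks once the limit is reached, which naturally throttles record consumption.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;

public class BoundedDispatcher {
    private final Semaphore permits;
    private final ExecutorService pool;

    public BoundedDispatcher(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
        this.pool = Executors.newCachedThreadPool();
    }

    // Blocks the calling (listener) thread once maxInFlight tasks are running,
    // which indirectly slows down how fast new records are pulled in.
    public Future<?> submit(Runnable task) throws InterruptedException {
        permits.acquire();
        return pool.submit(() -> {
            try {
                task.run();
            } finally {
                permits.release();
            }
        });
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```

This does not fix ordering or retries, only back-pressure; reactor-kafka remains the cleaner solution for the full set of issues.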
I am creating two Apache Camel (Blueprint XML) Kafka projects: one is kafka-producer, which accepts requests and stores them in the Kafka server, and the other is kafka-consumer, which picks up messages from the Kafka server and processes them.
This setup is working fine for a single topic and a single consumer. However, how do I create separate consumer groups within the same Kafka topic? How do I route multiple consumer-specific messages within the same topic to different consumer groups? Any help is appreciated. Thank you.
Your question is quite general, as it's not very clear what problem you are trying to solve; therefore it's hard to tell whether there's a better way to implement the solution.
Anyway, let's start by saying that, as far as I can understand, you are looking for a Selective Consumer (EIP), which is not supported out of the box by Kafka and its Consumer API. A Selective Consumer can choose which messages to pick from the queue or topic based on specific selector values set in advance by a producer. This feature must be implemented in the message broker as well, but Kafka has no such capability.
Kafka implements a hybrid solution between pure pub/sub and a queue. That being said, what you can do is subscribe to the topic with one or more consumer groups (more on that later) and filter out all messages you're not interested in by inspecting the messages themselves. In the messaging and EIP world, this pattern is known as an Array of Filters. As you can imagine, this happens after the message has been broadcast to all subscribers; therefore, if that solution does not fit your requirements or context, you can think of implementing a Content-Based Router, which dispatches each message to a subset of consumers under your centralized control (this implies intermediate consumer-specific channels, which could be other Kafka topics or seda/VM queues, of course).
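To make the Content-Based Router idea concrete, here is a hedged sketch in Camel's Java DSL (the `recipient` header and the target seda queues are invented for illustration; the real routing predicate depends on your message format):

```java
from("kafka:myTopic?brokers={{kafkaBootstrapServers}}&groupId=routerGroup")
    .choice()
        .when(header("recipient").isEqualTo("billing"))
            .to("seda:billing")
        .when(header("recipient").isEqualTo("shipping"))
            .to("seda:shipping")
        .otherwise()
            .to("seda:deadLetter");
```

A single consumer group reads the topic once, and the router decides which intermediate channel each message goes to.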
Moving to the second question, here is the official Kafka Component website: https://camel.apache.org/components/latest/kafka-component.html.
In order to create different consumer groups, you just have to define multiple routes, each of them having a dedicated groupId. By adding the groupId property, you inform the consumer group coordinators (which reside in the Kafka brokers) about the existence of multiple separate groups of consumers, and the brokers will treat them separately (by sending each group a copy of every log message stored in the topic)...
Here is an example:
public void configure() throws Exception {
    from("kafka:myTopic?brokers={{kafkaBootstrapServers}}" +
         "&groupId=myFirstConsumerGroup")
        .log("Message received by myFirstConsumerGroup : ${body}");

    from("kafka:myTopic?brokers={{kafkaBootstrapServers}}" +
         "&groupId=mySecondConsumerGroup")
        .log("Message received by mySecondConsumerGroup : ${body}");
}
As you can see, I created two routes in the same RouteBuilder, and therefore in the same Java process. That's a bad design decision in most of the use cases I can think of: there is no single responsibility, concerns are not segregated, and the routes will not scale independently. But again, it depends on your requirements/context.
For completeness, please consider taking a look at all the other Kafka component properties, as there may be many other configurations of interest, such as the number of consumer threads per group.
I tried to stay high level, in order to initiate the discussion... I'll edit my answer in case of new updates from you. Hope I helped!
Can I make concurrent calls using Spring JMSTemplate?
I want to make 4 external service calls in parallel and am exploring using Spring's JMSTemplate to perform these calls in parallel and wait for the execution to complete.
The other option that I am looking at is to use ExecutorService.
Is there any advantage using one over the other?
JMSTemplate is thread-safe, so making parallel calls to it is not a problem.
Messaging services are usually fast enough for most tasks and can receive your messages with minimal latency, so adding an ExecutorService is not usually the first thing you need. What you really need is to configure your JMS connection pool correctly and give it enough open connections (four in your case) so it can handle your parallel requests without blocking.
You only need an ExecutorService if you don't care about guaranteed delivery and your program needs more speed than your messaging service can deliver, which is highly unlikely.
As for receiving replies from your external service, you need to use JMS Request/Reply pattern (you can find examples in this article). Happily, as you're using Spring, you could make Spring Integration do lots of work for you. You need to configure outbound-gateway to send messages and inbound-gateway to receive responses. Since version 2.2 you can also use reply-listener to simplify things on your client side. All these components are covered in the official documentation (with examples as well).
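For comparison, the ExecutorService alternative mentioned in the question can be sketched in plain Java (the class and method names are mine; the callables stand in for the four external service calls):

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCalls {
    // Submits all calls at once and blocks until every one of them has finished.
    public static List<Future<String>> callAll(List<Callable<String>> calls)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(calls.size());
        try {
            return pool.invokeAll(calls); // returns only when all tasks are done
        } finally {
            pool.shutdown();
        }
    }
}
```

Note that this gives you parallelism but none of the delivery guarantees a JMS broker provides, which is the trade-off discussed above.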
If you need to talk to two or more JMS queues (send and/or receive) in parallel using asynchronous methods, the best option is @Async at the method level.
The linked guide uses RestTemplate, but in your case create JmsTemplate beans instead.
Prerequisites: create the proper JMS beans to connect to the queues. Used correctly, this will let you invoke two queues in parallel. It works; I have implemented it already, but I am only giving a skeleton here due to copyright issues.
More details: Spring Boot + Spring Async
https://spring.io/guides/gs/async-method/
Step 1: Create a class that kicks off the asynchronous JMS calls
@EnableAsync
public class JMSApplication {

    @Autowired
    JmsService jmsService;

    public void invokeMe() throws Exception {
        // Start the clock
        long start = System.currentTimeMillis();
        // Kick off multiple asynchronous lookups
        Future<Object> queue1 = jmsService.findQueue1();
        Future<Object> queue2 = jmsService.findQueue2();
        // Wait until they are all done
        while (!(queue1.isDone() && queue2.isDone())) {
            Thread.sleep(10); // 10-millisecond pause between each check
        }
        // Print results, including elapsed time
        System.out.println("Elapsed time: " + (System.currentTimeMillis() - start));
        System.out.println(queue1.get());
        System.out.println(queue2.get());
    }
}
Step 2: Write the service methods that contain the business logic for JMS
@Service
public class JmsService {

    @Async
    public Future<Object> findQueue1() {
        // Logic to invoke the JMS queue
    }

    @Async
    public Future<Object> findQueue2() {
        // Logic to invoke the JMS queue
    }
}
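As a side note, the busy-wait loop in step 1 can be avoided. Spring's @Async also supports CompletableFuture as a return type, which lets the caller wait on both results without polling. A plain-Java sketch with hypothetical stand-in methods (no Spring involved here):

```java
import java.util.concurrent.CompletableFuture;

public class JmsClient {
    // Hypothetical stand-ins for the two @Async service calls.
    static CompletableFuture<String> findQueue1() {
        return CompletableFuture.supplyAsync(() -> "reply-1");
    }

    static CompletableFuture<String> findQueue2() {
        return CompletableFuture.supplyAsync(() -> "reply-2");
    }

    public static String[] invokeBoth() {
        CompletableFuture<String> q1 = findQueue1();
        CompletableFuture<String> q2 = findQueue2();
        CompletableFuture.allOf(q1, q2).join(); // wait for both, no polling loop
        return new String[] { q1.join(), q2.join() };
    }
}
```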
I have a requirement in my Java web application where I need to send email alerts for certain conditions. For this I have used the javax.mail API, and sending email works just fine. But the problem is that the program's execution waits until the methods sending the email have finished. As there are hundreds of emails to be sent at various points... this reduces performance significantly.
I am using Spring and have also used Spring AOP. Can anyone suggest how I can separate my business logic from the email-sending functionality? It should work like this:
Sending emails is my advice, which gets executed when the xyz method is called. The main execution should not wait for the advice to finish; rather, it should return and execute further business logic, with the email sending executed separately.
Here creating new threads seems the obvious choice, but I think there could be some better way. Is there? Thanks.
You can make the mail-sending method @Async. This way Spring will execute it in a separate thread. Read this blog post about it: Creating Asynchronous Methods
What you describe is asynchronous execution, and the natural way to do asynchronous execution in Java is to use threads.
You can introduce an Executor, e.g. Executors.newFixedThreadPool(), and use it to offload the mailing task onto separate threads.
The aspect itself is an unsuitable place for this, since it would introduce state into the aspect. For example, you may want to check whether a mail task was successful by using the returned Future:
class Mailer {
    private final ExecutorService executor = Executors.newFixedThreadPool(maxMailingThreads);
    //...
    public void doMail(MailTask anEmail) throws Exception {
        Future<MailTaskResult> future = executor.submit(new MailTask(anEmail));
        future.get().isSuccessful(); // note: get() blocks; handle success or failure somehow
    }
}
Better to move this logic into a separate class and call it from the aspect somehow.
Treat the email sending functionality like an IO device. Make it a plugin to your business logic. Do not allow any knowledge of the fact that you're even talking to the email code into your business logic. Make the email logic depend on the business logic. Never the other way around.
Here's a very good talk about this kind of architecture:
https://vimeo.com/97530863
Here's a series debating it:
https://www.youtube.com/watch?v=z9quxZsLcfo
Here's a ruby master demonstrating it with real code. We miss him.
https://www.youtube.com/watch?v=tg5RFeSfBM4
If your business rules are interesting enough to be worth respecting, then this is the way to make them the masters of your application. Express them using only Java. Don't accept any help: no Spring, no weird annotations, just business rules. Push all that "help" out to the mail code.
Do this and your app will scale well. I think this is the best way to put it:
That's from a hexagonal architecture post. But the idea of giving your business rules a safe place to live removed from implementation detail shows up in many architectures. This answer rounds them up nicely.
Use a localhost MTA (like OpenSMTPD) and then relay to your real SMTP server, like Amazon SES ("Satellite" mode). It won't block.
I did a test and sent 1,000 emails in 2.8 seconds this way.
It's simpler than doing async in java, and is useful across multiple applications.
As for separating logic, raise a Spring Application Event when needed, and make another class to listen to it, and send your email from there. Or consider something like Guava's EventBus
Consider creating a separate thread to send emails within your application. This will allow parallel execution (application + email sending).
If you want another approach, you can create a separate back-end application that only sends emails, although you will need to submit the email messages to it somehow. An asynchronous way to do this is to send a JMS message to the email application.
On an ESB like Apache Camel, what mechanism is actually "marching" (pulling/pushing) messages along the routes from endpoint to endpoint?
Does the Camel RouteBuilder just compose a graph of Endpoints and Routes and know which destination/next Endpoint to pass a message to after it visits a certain Endpoint, or do the Endpoints themselves know the next destination for the messages they have processed?
Either way, I'm confused:
If it is the RouteBuilder that knows the "flow" of messages through the system, then this RouteBuilder would need to know the business logic of when Endpoint A should pass the message next to Endpoint B vs. Endpoint C, but in all the Camel examples I see, this business logic doesn't exist; and
It seems to be that putting that kind of "flow" business logic in the Endpoints themselves couples them together and defeats some of the basic principles of SOA/ESB/EIP, etc.
Under the hood I believe camel is constructing a pure graph where each node is a Camel endpoint/processor, and each edge is a route between two endpoints (a source and a destination). This graph is precisely what RouteBuilder is building when you invoke its API. When you go to start() a Camel route, the graph is most likely validated and translated into a series of Runnables that need to be executed, and probably uses some kind of custom Executor or thread management to handle these Runnables.
Thus, the execution of the Runnables (processors processing messages as they arrive) are handled by this custom Executor. This is the mechanism that "marches messages along", although the order in which the tasks are queued up is driven by the overarching structure of the graph composed by RouteBuilder.
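The description above can be illustrated with a toy model. This is emphatically not Camel's actual implementation, just the core idea: a graph collapsed to an ordered chain of processors, with an executor driving each message through the chain.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.UnaryOperator;

public class ToyRoute {
    // A route reduced to its essence: an ordered chain of processors.
    private final List<UnaryOperator<String>> processors;
    // The executor is what actually "marches" messages through the chain.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    public ToyRoute(List<UnaryOperator<String>> processors) {
        this.processors = processors;
    }

    public Future<String> send(String message) {
        return executor.submit(() -> {
            String current = message;
            for (UnaryOperator<String> p : processors) {
                current = p.apply(current); // each hop hands its output to the next
            }
            return current;
        });
    }

    public void stop() {
        executor.shutdown();
    }
}
```

Note that the "flow" knowledge lives entirely in the chain built up front (the RouteBuilder's job), not in the individual processors, which matches the decoupling the question asks about.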
I suggest reading this Q&A first:
What exactly is Apache Camel?
... and the links it refers to, on some more background about Apache Camel.
The business logic can be any kind of logic, such as a Java bean (POJO), and Camel allows you to access your business logic in a loosely coupled fashion. See for example these links:
http://camel.apache.org/service-activator.html
http://camel.apache.org/bean-integration.html
http://camel.apache.org/bean.html
http://camel.apache.org/bean-binding.html
http://camel.apache.org/hiding-middleware.html
http://camel.apache.org/spring-remoting.html