Apache Camel: What marches messages along?

On an ESB like Apache Camel, what mechanism actually "marches" (pulls/pushes) messages along a route from endpoint to endpoint?
Does the Camel RouteBuilder just compose a graph of Endpoints and Routes, knowing which destination/next Endpoint a message should be passed to after it visits a certain Endpoint, or do the Endpoints themselves know the next destination for the message they have processed?
Either way, I'm confused:
If it is the RouteBuilder that knows the "flow" of messages through the system, then the RouteBuilder would need to know the business logic of when Endpoint A should pass the message to Endpoint B versus Endpoint C; but in all the Camel examples I see, this business logic doesn't exist; and
It seems that putting that kind of "flow" business logic in the Endpoints themselves couples them together and defeats some of the basic principles of SOA/ESB/EIP, etc.

Under the hood, I believe Camel constructs a graph where each node is a Camel endpoint/processor and each edge is a route between two of them (a source and a destination). This graph is precisely what RouteBuilder builds when you invoke its API. When you start() a Camel route, the graph is most likely validated and translated into a series of Runnables, and some kind of custom Executor or thread-management strategy handles those Runnables.
Thus, the execution of the Runnables (processors handling messages as they arrive) is driven by this custom Executor. That is the mechanism that "marches messages along", although the order in which the tasks are queued up is governed by the overarching structure of the graph composed by the RouteBuilder.
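To make the second half of the question concrete: in Camel, the "flow" logic lives in the route definition, not inside the endpoints. Here is a minimal sketch (queue names and the "type" header are hypothetical) of a Content-Based Router that decides whether a message goes to Endpoint B or Endpoint C while the endpoints themselves stay decoupled:
import org.apache.camel.builder.RouteBuilder;

public class OrderRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        // The route, not the endpoints, decides where a message goes next
        from("activemq:queue:orders")                        // hypothetical source
            .choice()                                        // Content-Based Router EIP
                .when(header("type").isEqualTo("priority"))  // hypothetical header
                    .to("activemq:queue:priorityOrders")     // "Endpoint B"
                .otherwise()
                    .to("activemq:queue:standardOrders");    // "Endpoint C"
    }
}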

I suggest reading this Q&A first:
What exactly is Apache Camel?
... and the links it refers to, for more background on Apache Camel.
The business logic can be any kind of logic, such as a Java bean (POJO), and Camel allows you to access your business logic in a loosely coupled fashion. See, for example, these links:
http://camel.apache.org/service-activator.html
http://camel.apache.org/bean-integration.html
http://camel.apache.org/bean.html
http://camel.apache.org/bean-binding.html
http://camel.apache.org/hiding-middleware.html
http://camel.apache.org/spring-remoting.html
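As a small illustration of that loose coupling (queue names and the POJO are hypothetical), a route can hand the message body to a plain Java bean via bean binding, and the bean never imports anything from Camel:
import org.apache.camel.builder.RouteBuilder;

// Plain POJO: no Camel dependencies; bean binding maps the message body to the parameter
class InvoiceService {
    public String process(String body) {
        return body.toUpperCase();
    }
}

public class InvoiceRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        from("jms:queue:invoices")                  // hypothetical source endpoint
            .bean(InvoiceService.class, "process")  // business logic stays outside the route
            .to("jms:queue:processedInvoices");     // hypothetical destination
    }
}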

Related

Apache Camel - Kafka component - single producer multiple consumer

I am creating two Apache Camel (Blueprint XML) Kafka projects: one is a kafka-producer, which accepts requests and stores them in the Kafka server, and the other is a kafka-consumer, which picks up messages from the Kafka server and processes them.
This setup works fine for a single topic and a single consumer. However, how do I create separate consumer groups within the same Kafka topic? And how do I route consumer-specific messages within the same topic to different consumer groups? Any help is appreciated. Thank you.
Your question is quite general, and it's not very clear what problem you are trying to solve, so it's hard to tell whether there's a better way to implement the solution.
Anyway, let's start by saying that, as far as I can understand, you are looking for a Selective Consumer (EIP), which is not supported out of the box by Kafka and its Consumer API. A Selective Consumer chooses which messages to pick from a queue or topic based on specific selector values set in advance by the producer. This feature must also be implemented in the message broker, but Kafka has no such capability.
Kafka implements a hybrid of pure pub/sub and queueing. That being said, what you can do is subscribe to the topic with one or more consumer groups (more on that later) and filter out all messages you're not interested in by inspecting the messages themselves. In the messaging and EIP world, this pattern is known as an Array of Filters. As you can imagine, this happens after the message has been broadcast to all subscribers; if that solution does not fit your requirements or context, you can instead implement a Content-Based Router, which dispatches each message to a subset of consumers only, under your centralized control (this implies intermediate consumer-specific channels, which could be other Kafka topics or seda/VM queues, of course). A sketch of such a router follows below.
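Here is a minimal sketch of that Content-Based Router idea, assuming a hypothetical customerType header set by the producer and seda queues as the consumer-specific channels:
public void configure() throws Exception {
    // one group reads everything, then routes to consumer-specific channels
    from("kafka:myTopic?brokers={{kafkaBootstrapServers}}&groupId=routerGroup")
        .choice()
            .when(header("customerType").isEqualTo("premium"))  // hypothetical selector header
                .to("seda:premiumConsumers")
            .otherwise()
                .to("seda:standardConsumers");
}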
Moving to the second question, here is the official Kafka Component website: https://camel.apache.org/components/latest/kafka-component.html.
In order to create different consumer groups, you just have to define multiple routes, each with a dedicated groupId. By setting the groupId property, you inform the Consumer Group coordinators (which reside in the Kafka brokers) about the existence of multiple separate groups of consumers, and the brokers will use that to discriminate between them and treat them separately (by sending each group a copy of every log message stored in the topic)...
Here is an example:
public void configure() throws Exception {
    from("kafka:myTopic?brokers={{kafkaBootstrapServers}}"
            + "&groupId=myFirstConsumerGroup")
        .log("Message received by myFirstConsumerGroup : ${body}");

    from("kafka:myTopic?brokers={{kafkaBootstrapServers}}"
            + "&groupId=mySecondConsumerGroup")
        .log("Message received by mySecondConsumerGroup : ${body}");
}
As you can see, I created two routes in the same RouteBuilder, indeed in the same Java process. That's a very bad design decision in most of the use cases I can think of, because there is no single responsibility, concerns are not segregated, and the routes will not scale independently. But again, it depends on your requirements/context.
For completeness, please consider taking a look at all the other Kafka component properties, as there may be many other configurations of interest to you, such as the number of consumer threads per group; see the sketch below.
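For instance, a minimal sketch (with a hypothetical thread count) of raising the number of consumers in a group via the consumersCount option:
from("kafka:myTopic?brokers={{kafkaBootstrapServers}}"
        + "&groupId=myFirstConsumerGroup"
        + "&consumersCount=3")   // three consumers (threads) within this group
    .log("Message received by myFirstConsumerGroup : ${body}");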
I tried to stay high level in order to start the discussion... I'll edit my answer if you post updates. Hope I helped!

RabbitMQ Microservices - Parallel processing

I'm working on an application with a microservices architecture, using RabbitMQ as the messaging system.
Calls between microservices are asynchronous HTTP requests, and each service is subscribed to specific queues.
My question: given that the calls are stateless, how can I partition message consumption not by routing key on the RabbitMQ queue but by the HTTP call itself? That is to say, for n calls, every service must be able to listen only to the messages it needs.
Sorry for the ambiguity; I'll try to explain further.
The scenario is a microservice architecture in which, due to a huge data response, the calling service receives the answer on a listener RabbitMQ queue.
So let's imagine that two calls are made simultaneously and both queries start loading data into the same queue. The calling service is waiting for messages and accumulates the ones it receives, but it cannot differentiate between the data of caller 1 and the data of caller 2.
Is there a better implementation for the listener?
I'm not sure I understood the question completely, but here is what I can suggest based on the description:
If each service is hooked to a particular listener and you don't want to associate a routing key with the Queue+Listener integration, you can try header arguments instead. [You can use the QueueBuilder.withArguments API to set specific header values that the queue is supposed to listen for.]
There needs to be a mechanism through which an exchange binds to a particular queue and, consequently, to a listener service; see the sketch after the diagram below.
Publisher -> Exchange ---> (with headers) binds to Queue -> Listener
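A minimal sketch of that idea with Spring AMQP (exchange, queue, and header names are hypothetical): the headers-exchange binding declares which header values a queue accepts, so each listener only receives the messages addressed to it:
import org.springframework.amqp.core.Binding;
import org.springframework.amqp.core.BindingBuilder;
import org.springframework.amqp.core.HeadersExchange;
import org.springframework.amqp.core.Queue;
import org.springframework.amqp.core.QueueBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class HeaderRoutingConfig {

    @Bean
    public HeadersExchange callExchange() {
        return new HeadersExchange("calls.exchange");               // hypothetical exchange name
    }

    @Bean
    public Queue serviceAQueue() {
        return QueueBuilder.durable("service-a.responses").build(); // hypothetical queue name
    }

    @Bean
    public Binding serviceABinding(Queue serviceAQueue, HeadersExchange callExchange) {
        // only messages whose "callerId" header equals "service-a" land in this queue
        return BindingBuilder.bind(serviceAQueue)
                             .to(callExchange)
                             .where("callerId").matches("service-a");
    }
}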

Backpressure mechanism in Spring Web-Flux

I'm a beginner with Spring WebFlux. I wrote a controller as follows:
@RestController
public class FirstController {

    @GetMapping("/first")
    public Mono<String> getAllTweets() {
        return Mono.just("I am First Mono");
    }
}
I know that one of the benefits of reactive programming is backpressure, which can balance the request and response rates. I want to understand how to use the backpressure mechanism in Spring WebFlux.
Backpressure in WebFlux
In order to understand how backpressure works in the current implementation of the WebFlux framework, we have to recap the transport layer used by default. Communication between browser and server (and usually server-to-server communication as well) is done over a TCP connection, and WebFlux uses that transport for client-server communication too.
Then, in order to understand what the term backpressure control means, we have to recap what backpressure means from the perspective of the Reactive Streams specification:
The basic semantics define how the transmission of stream elements is regulated through back-pressure.
From that statement, we may conclude that in Reactive Streams, backpressure is a mechanism that regulates demand by transmitting (notifying) how many elements the recipient can consume. And here we have a tricky point: TCP works with a byte abstraction rather than a logical-element abstraction. What we usually want from backpressure control is control over the number of logical elements sent/received to/from the network. Even though TCP has its own flow control (see the meaning here and the animation there), that flow control operates on bytes rather than on logical elements.
In the current implementation of the WebFlux module, backpressure is regulated by the transport's flow control, and it does not expose the real demand of the recipient. The interaction flow can be seen in the following diagram:
For simplicity, the diagram shows communication between two microservices, where the left one sends a stream of data and the right one consumes that stream. The following numbered list briefly explains the diagram:
1. The WebFlux framework takes care of converting logical elements to bytes and back, and of transferring/receiving them to/from TCP (the network).
2. The consumer starts long-running processing of an element and requests the next elements once the job is completed.
3. While there is no demand from the business logic, WebFlux enqueues the bytes that arrive from the network without acknowledging them (there is no demand from the business logic).
4. Because of the nature of TCP flow control, Service A may still send data to the network.
As we may notice from the diagram, the demand exposed by the recipient differs from the demand of the sender (demand here is in logical elements). This means the two demands are isolated: backpressure works only for the WebFlux <-> Business logic (Service) interaction and is barely exposed for the Service A <-> Service B interaction. All of which means that backpressure control in WebFlux is not as fair as we expect it to be.
But I still want to know how to control backpressure
If we still want unfair backpressure control in WebFlux, we can get it with the support of Project Reactor operators such as limitRate(). The following example shows how we may use that operator:
@PostMapping("/tweets")
public Mono<Void> postAllTweets(Flux<Tweet> tweetsFlux) {
    return tweetService.process(tweetsFlux.limitRate(10))
                       .then();
}
As we can see from the example, the limitRate() operator lets us define the number of elements to be prefetched at once. That means that even if the final subscriber requests Long.MAX_VALUE elements, the limitRate operator splits that demand into chunks and does not allow more than the configured amount to be consumed at once. We can do the same with the element-sending process:
@GetMapping("/tweets")
public Flux<Tweet> getAllTweets() {
    return tweetService.retrieveAll()
                       .limitRate(10);
}
The above example shows that even if WebFlux requests more than 10 elements at a time, limitRate() throttles the demand to the prefetch size and prevents more than the specified number of elements from being consumed at once.
Another option is to implement your own Subscriber or extend BaseSubscriber from Project Reactor. For instance, the following is a naive example of how we may do that:
class MyCustomBackpressureSubscriber<T> extends BaseSubscriber<T> {

    int consumed;
    final int limit = 5;

    @Override
    protected void hookOnSubscribe(Subscription subscription) {
        request(limit);
    }

    @Override
    protected void hookOnNext(T value) {
        // do business logic here
        consumed++;
        if (consumed == limit) {
            consumed = 0;
            request(limit);
        }
    }
}
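To show where such a subscriber plugs in, here is a minimal usage sketch (the range publisher is just a stand-in): elements are consumed five at a time, and the next five are requested only after the current batch is done:
Flux.range(1, 100)                                      // hypothetical upstream publisher
    .subscribe(new MyCustomBackpressureSubscriber<>()); // pulls 5 elements per request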
Fair backpressure with RSocket Protocol
In order to achieve logical-element backpressure across network boundaries, we need an appropriate protocol. Fortunately, there is one, called the RSocket protocol. RSocket is an application-level protocol that allows real demand to be transferred across network boundaries.
There is an RSocket-Java implementation of that protocol that allows setting up an RSocket server. For server-to-server communication, the same RSocket-Java library provides a client implementation as well. To learn more about how to use RSocket-Java, please see the examples here.
For browser-server communication, there is an RSocket-JS implementation that allows wiring streaming communication between browser and server over WebSocket.
Known frameworks on top of RSocket
Nowadays there are a few frameworks built on top of the RSocket protocol.
Proteus
One of those frameworks is the Proteus project, which offers full-fledged microservices built on top of RSocket. Proteus is also well integrated with the Spring framework, so we can now achieve fair backpressure control (see the examples there).
Further readings
https://www.netifi.com/proteus
https://medium.com/netifi
http://scalecube.io/

Handling JMS-acknowledgements in Camel

We are implementing a distributed system based (among other things) on JMS and REST calls. Currently we are looking at two components, A and B. Component A reads from an ActiveMQ queue (via Camel's from), processes the message, and sends it on to B via REST (this is done via Camel's .to/.inOnly). B processes the message further.
In A this looks roughly like this:
from(activeMqInQueue)
    .process(/* someBean */)
    .inOnly(/* URI of B */)
    .end();
Some time later, B will make an async call (decoupled by a seda queue) back to A. From Camel's perspective, the two calls have nothing to do with each other, but from ours, it is important to acknowledge the original message once we get an answer from B. Obviously, we have some form of handler that can relate the outgoing and incoming calls, but what we are lacking is the ability to explicitly acknowledge the original message.
How is this done or what pattern better suits our needs?

How to improve efficiency with Camel?

I have an efficiency problem in my project, which uses Camel with the Esper component.
I have several external data sources feeding information to Camel endpoints. Each Camel endpoint that receives data transfers it to a route that processes it and then delivers it to an Esper endpoint.
The image below illustrates this behavior:
The efficiency problem is that all of this is done by a single Java thread, so if I have many sources, there is a huge bottleneck.
The following code illustrates what is going on in the image:
public final void configure() throws OperationNotSupportedException {
    RouteDefinition route = from("xmpp://localhost:5222/?blablabla...");
    // apply some filter
    FilterDefinition filterDefinition = route.filter().method(...);
    // apply main processor
    ExpressionNode expressionNode = filterDefinition.process(...);
    // set destination
    expressionNode = filterDefinition.to("esper://session_X");
}
To fix this, I have to handle the situation with a pool of threads or some sort of parallel processing. I cannot use patterns like Multicast, Recipient List, etc., because all of those send the same message to multiple endpoints/clients, which is not the case in my example.
A possible solution would be to have one thread per "data source endpoint -> route -> Esper endpoint" combination, as in the image below:
Another possible solution is to have one thread receive everything from the data sources and then dispatch it to multiple threads that handle the route processing together with the other endpoint:
PS: I am open to any other suggestions you may have.
To achieve one of these, I have considered using the Camel SEDA component; however, it does not seem to allow dynamic thread pools, because the concurrentConsumers property is static. Furthermore, I am not sure I can use a SEDA endpoint at all, because I believe (although I am not completely sure) that the syntax of an endpoint like .to("seda:esper://session_X?concurrentConsumers=10") is invalid in Camel.
So, at this point I am quite lost and I don't know what to do:
- Is SEDA the solution I am looking for?
- If yes, how do I integrate it with the Esper endpoint given the syntax problem?
- Are there any other solutions / Camel components that could fix my problem?
You must define a separate seda route that distributes your messages to the Esper engine, such as (using the fluent style):
public final void configure() throws OperationNotSupportedException {
    from("xmpp://localhost:5222/?blablabla...")
        .filter().method(...)
        .process(...)
        .to("seda:sub");

    from("seda:sub?concurrentConsumers=10")
        .to("esper://session_X");
}
That said, seda should only be used if losing messages is not a problem. Otherwise, you should use a more robust transport such as JMS, which allows messages to be persisted.
EDIT:
Besides seda, you could use threads(), where you can customize the threading behaviour by defining an ExecutorService:
public final void configure() throws OperationNotSupportedException {
    from("xmpp://localhost:5222/?blablabla...")
        .filter().method(...)
        .process(...)
        .threads()
            .executorService(Executors.newFixedThreadPool(2))
        .to("esper://session_X");
}
If you use seda or threads(), you may lose transaction safety in case of failure. In that case, or if you need to balance the workload across several remote hosts, you can use JMS; more information about this solution is found here, and a sketch follows below.
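A minimal sketch of that JMS variant (it assumes a broker is configured under the "jms" component name; the queue name is hypothetical). The hand-off is persisted by the broker, and concurrentConsumers spreads consumption across threads:
public final void configure() throws Exception {
    // the broker persists each message, so a crash does not lose it (unlike seda)
    from("xmpp://localhost:5222/?blablabla...")
        .to("jms:queue:esperFeed");

    // ten JMS listener threads pull from the queue and feed the Esper engine
    from("jms:queue:esperFeed?concurrentConsumers=10")
        .to("esper://session_X");
}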
