I have a Java application (say A) which communicates with another application (say B) via TCP sockets. My Java application is multithreaded and can handle up to 100 threads. To communicate from A to B we have 10 sockets.
Challenges -
Connection pooling - we need a connection pooling mechanism so that n (say 100) threads of application A can communicate with application B over x (say 10) TCP sockets.
Multithreading - how can two threads share the same socket, sending their requests one after another and getting back the responses mapped to the appropriate thread?
Multiple requests - is it possible for two threads to send requests on a single socket simultaneously?
Can we overcome these challenges with an existing framework? Is it possible?
I have heard that Spring Integration, Apache Camel, or a local MQ can solve this. Any examples?
With Spring Integration:
CachingClientConnectionFactory
TcpOutboundGateway (with CachingClientConnectionFactory).
Collaborating Outbound and Inbound Channel Adapters.
With collaborating channel adapters, though, you have to do your own request/reply correlation (usually based on something in the message); the replies may not come back in the order the requests were sent. Since there is no standard way to perform that correlation, the framework doesn't do it for you.
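For the gateway option (the second one above), a rough sketch in Spring Integration Java configuration could look like the following; the host, port, pool size and channel name are placeholders, not taken from the question:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.annotation.ServiceActivator;
import org.springframework.integration.ip.tcp.TcpOutboundGateway;
import org.springframework.integration.ip.tcp.connection.AbstractClientConnectionFactory;
import org.springframework.integration.ip.tcp.connection.CachingClientConnectionFactory;
import org.springframework.integration.ip.tcp.connection.TcpNetClientConnectionFactory;

@Configuration
public class TcpClientConfig {

    @Bean
    public AbstractClientConnectionFactory clientConnectionFactory() {
        // Plain client factory pointing at application B (placeholder host/port)
        return new TcpNetClientConnectionFactory("app-b-host", 9999);
    }

    @Bean
    public CachingClientConnectionFactory cachingClientConnectionFactory(
            AbstractClientConnectionFactory clientConnectionFactory) {
        // Pool of 10 TCP connections shared by up to 100 calling threads
        return new CachingClientConnectionFactory(clientConnectionFactory, 10);
    }

    @Bean
    @ServiceActivator(inputChannel = "toApplicationB")
    public TcpOutboundGateway tcpOutboundGateway(
            CachingClientConnectionFactory cachingClientConnectionFactory) {
        // Each request checks a connection out of the pool and waits for its reply on it
        TcpOutboundGateway gateway = new TcpOutboundGateway();
        gateway.setConnectionFactory(cachingClientConnectionFactory);
        return gateway;
    }
}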
I was able to resolve the problem stated in the question with jPOS.
jPOS can do multiplexing. It uses ISOMsg fields 11 and 41 (STAN and terminal id) to match a response to its request.
jPOS also provides a pooling mechanism.
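A minimal sketch of sending a request through a jPOS MUX (the MUX name "mux.mymux" and the field values are placeholders; the QMUX and its channel pool are assumed to be deployed in a Q2 container):

import org.jpos.iso.ISOMsg;
import org.jpos.iso.MUX;
import org.jpos.util.NameRegistrar;

public class JposMuxClient {

    public ISOMsg send() throws Exception {
        // Look up the QMUX deployed under the name "mymux" (placeholder)
        MUX mux = (MUX) NameRegistrar.get("mux.mymux");

        ISOMsg request = new ISOMsg();
        request.setMTI("0200");
        request.set(11, "000001");   // STAN - part of the key used to match the response
        request.set(41, "TERM0001"); // terminal id - also part of the default match key

        // Blocks up to 30 seconds; the MUX multiplexes concurrent requests over the pooled channels
        return mux.request(request, 30000);
    }
}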
I am writing a service which serves stateless incoming requests. The requests are all mathematical calculations, which do not take very long to execute (max 2 ms).
I use TIBCO EMS to communicate between client and server. A client library is provided which wraps the client-side logic (e.g. converting data into an EMS message) and sends the request to a request queue. The server side processes the request and sends the response to a separate queue. This works fine.
The server side is multi-threaded. A new thread is created when a new incoming request is received. Requests are therefore handled concurrently.
The server side uses a single EMS connection to the EMS server. However, because an EMS Session is not thread-safe, if I want each thread to be able to write its response to the EMS queue, I have to create one Session per thread using the ConnectionFactory. This degrades performance.
The time spent on traffic is around 3-4 ms, i.e. the time between a request being sent and a response being received is around 5-6 ms (3-4 ms for transport and marshalling/unmarshalling, 2 ms for the calculation).
Is there any solution which allows me to send to an EMS queue concurrently without creating too many JMS objects?
Are there any other important rules I need to follow to further optimize the service? Some basic optimization guidelines are already followed:
Use CachedConnectionPool
Send JMS messages as NON_PERSISTENT
Use one EMS connection for all requests.
Thank you very much.
The behavior you are experiencing is not specific to EMS; it is dictated by the JMS specification itself. Here is an extract from Section 2.8 of the JMS specification:
There are two reasons for restricting concurrent access to Sessions. First, Sessions are the JMS entity that supports transactions. It is very difficult to implement transactions that are multithreaded. Second, Sessions support asynchronous message consumption. It is important that JMS not require that client code used for asynchronous message consumption be capable of handling multiple, concurrent messages. In addition, if a Session has been set up with multiple, asynchronous consumers, it is important that the client is not forced to handle the case where these separate consumers are concurrently executing. These restrictions make JMS easier to use for typical clients. More sophisticated clients can get the concurrency they desire by using multiple sessions.
If you want to avoid the creation (and destruction) of that many objects, you might want to pre-create a pool of threads, and allocate a session to each thread up front.
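A rough sketch of that approach using plain javax.jms types (the class and method names here are illustrative, not part of the EMS API): each long-lived worker thread lazily creates its own Session and MessageProducer once and then reuses them for every response it sends.

import javax.jms.Connection;
import javax.jms.DeliveryMode;
import javax.jms.Destination;
import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;

public class PerThreadJmsSender {

    private static final class JmsContext {
        final Session session;
        final MessageProducer producer;
        JmsContext(Session session, MessageProducer producer) {
            this.session = session;
            this.producer = producer;
        }
    }

    private final Connection connection;  // the single shared EMS connection
    private final Destination replyQueue; // the response queue

    // One Session/MessageProducer pair per worker thread, created lazily and reused
    private final ThreadLocal<JmsContext> context = ThreadLocal.withInitial(() -> {
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(replyQueue);
            producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);
            return new JmsContext(session, producer);
        } catch (JMSException e) {
            throw new IllegalStateException(e);
        }
    });

    public PerThreadJmsSender(Connection connection, Destination replyQueue) {
        this.connection = connection;
        this.replyQueue = replyQueue;
    }

    public void sendReply(String payload) throws JMSException {
        JmsContext ctx = context.get();
        TextMessage message = ctx.session.createTextMessage(payload);
        ctx.producer.send(message);
    }
}

Combined with a fixed-size thread pool (instead of a new thread per request), the number of Sessions stays bounded by the pool size.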
So the problem is I have fifteen clients which need to be able to communicate with each other. My question is: how should this be done? Clearly one way is to simply make the clients also act as servers, but that means 105 unique connections to fully connect the fifteen clients. I'd rather not do this as it seems messy.
Current solution:
For each new connection, the server spins off a separate thread to listen on it. Each client has a separate thread monitoring its channel for incoming information.
The server acts as a message router: when Process 1 needs to send a message to Process 2, it sends the message to the server, indicating the intended recipient, the sender, and the message body.
Upon receiving it, the server passes the message on to Process 2, whose listening thread detects it and hands it to the process.
And so on for each message between the clients.
This seems clunky. Is there a better methodology/package to use for this?
A UDP multicast system would work for this, but it will get complicated to build yourself, since you have to handle synchronization and fault detection/correction, as well as nodes dropping in and out of the group.
There are various middleware solutions including distributed caches that already address this problem pretty well. Look at Infinispan. If that's too high level and you just want a lower level solution, try JGroups. I only list those because I know they are quick and usable, but there are many others out there.
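As an illustration of the lower-level route, here is a minimal JGroups peer, written against the classic JGroups 3.x/4.x API with the default protocol stack; the cluster name is a placeholder. Every client joins the same group and can send to all members or to one specific member:

import org.jgroups.JChannel;
import org.jgroups.Message;
import org.jgroups.ReceiverAdapter;

public class PeerNode {

    public static void main(String[] args) throws Exception {
        JChannel channel = new JChannel(); // default protocol stack (udp.xml)

        channel.setReceiver(new ReceiverAdapter() {
            @Override
            public void receive(Message msg) {
                // Called for every message delivered to this member
                System.out.println("from " + msg.getSrc() + ": " + msg.getObject());
            }
        });

        channel.connect("demo-cluster"); // all fifteen clients join the same group

        // null destination = send to every member; pass a member's Address for point-to-point
        channel.send(new Message(null, "hello from " + channel.getAddress()));

        Thread.sleep(5000);
        channel.close();
    }
}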
The RabbitMQ Java client has the following concepts:
Connection - a connection to a RabbitMQ server instance
Channel - ???
Consumer thread pool - a pool of threads that consume messages off the RabbitMQ server queues
Queue - a structure that holds messages in FIFO order
I'm trying to understand the relationship, and more importantly, the associations between them.
I'm still not quite sure what a Channel is, other than the fact that this is the structure that you publish and consume from, and that it is created from an open connection. If someone could explain to me what the "Channel" represents, it might help clear a few things up.
What is the relationship between Channel and Queue? Can the same Channel be used to communicate with multiple Queues, or does it have to be 1:1?
What is the relationship between Queue and the Consumer Pool? Can multiple Consumers be subscribed to the same Queue? Can multiple Queues be consumed by the same Consumer? Or is the relationship 1:1?
A Connection represents a real TCP connection to the message broker, whereas a Channel is a virtual connection (AMQP connection) inside it. This way you can use as many (virtual) connections as you want inside your application without overloading the broker with TCP connections.
You can use one Channel for everything. However, if you have multiple threads, it's suggested to use a different Channel for each thread.
On Channel thread-safety, the Java Client API Guide says:
Channel instances are safe for use by multiple threads. Requests into a Channel are serialized, with only one thread being able to run a command on the Channel at a time. Even so, applications should prefer using a Channel per thread instead of sharing the same Channel across multiple threads.
There is no direct relation between Channel and Queue. A Channel is used to send AMQP commands to the broker. This can be the creation of a queue or similar, but these concepts are not tied together.
Each Consumer runs in its own thread allocated from the consumer thread pool. If multiple Consumers are subscribed to the same Queue, the broker uses round-robin to distribute the messages between them equally. See Tutorial two: "Work Queues".
It is also possible to attach the same Consumer to multiple Queues.
You can think of Consumers as callbacks: they are called every time a message arrives on a Queue the Consumer is bound to. In the Java client, each Consumer has a handleDelivery(...) method, which represents the callback. What you typically do is subclass DefaultConsumer and override handleDelivery(...). Note: if you attach the same Consumer instance to multiple queues, this method will be called by different threads, so take care of synchronization if necessary.
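A short sketch of that pattern with the RabbitMQ Java client (the broker host and queue name are placeholders):

import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DefaultConsumer;
import com.rabbitmq.client.Envelope;

import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class WorkQueueConsumer {

    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // placeholder broker host
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();
        channel.queueDeclare("task_queue", true, false, false, null);

        // Subclass DefaultConsumer and override handleDelivery(...) - the callback method
        DefaultConsumer consumer = new DefaultConsumer(channel) {
            @Override
            public void handleDelivery(String consumerTag, Envelope envelope,
                                       AMQP.BasicProperties properties, byte[] body)
                    throws IOException {
                System.out.println("Received: " + new String(body, StandardCharsets.UTF_8));
                // Manual ack so the broker only discards the message once it was processed
                getChannel().basicAck(envelope.getDeliveryTag(), false);
            }
        };

        channel.basicConsume("task_queue", false, consumer); // autoAck = false
    }
}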
A good conceptual understanding of what the AMQP protocol does "under the hood" is useful here. I would offer that the documentation and API that AMQP 0.9.1 chose to deploy make this particularly confusing, so the question itself is one which many people have to wrestle with.
TL;DR
A connection is the physical negotiated TCP socket with the AMQP server. Properly-implemented clients will have one of these per application, thread-safe, sharable among threads.
A channel is a single application session on the connection. A thread will have one or more of these sessions. In the AMQP 0.9.1 architecture, these are not to be shared among threads, and a session should be closed/destroyed when the thread that created it is finished with it. Channels are also closed by the server when various protocol violations occur.
A consumer is a virtual construct that represents the presence of a "mailbox" on a particular channel. The use of a consumer tells the broker to push messages from a particular queue to that channel endpoint.
Connection Facts
First, as others have correctly pointed out, a connection is the object that represents the actual TCP connection to the server. Connections are specified at the protocol level in AMQP, and all communication with the broker happens over one or more connections.
Since it's an actual TCP connection, it has an IP Address and Port #.
Protocol parameters are negotiated on a per-client basis as part of setting up the connection (a process known as the handshake).
It is designed to be long-lived; there are few cases where connection closure is part of the protocol design.
From an OSI perspective, it probably resides somewhere around Layer 6.
Heartbeats can be set up to monitor the connection status, as TCP does not contain anything in and of itself to do this.
It is best to have a dedicated thread manage reads and writes to the underlying TCP socket. Most, if not all, RabbitMQ clients do this. In that regard, they are generally thread-safe.
Relatively speaking, connections are "expensive" to create (due to the handshake), but practically speaking, this really doesn't matter. Most processes really will only need one connection object. But, you can maintain connections in a pool, if you find you need more throughput than a single thread/socket can provide (unlikely with current computing technology).
Channel Facts
A Channel is the application session that is opened for each piece of your app to communicate with the RabbitMQ broker. It operates over a single connection, and represents a session with the broker.
As it represents a logical part of application logic, each channel usually exists on its own thread.
Typically, all channels opened by your app will share a single connection (they are lightweight sessions that operate on top of the connection). Connections are thread-safe, so this is OK.
Most AMQP operations take place over channels.
From an OSI Layer perspective, channels are probably around Layer 7.
Channels are designed to be transient; part of the design of AMQP is that the channel is typically closed in response to an error (e.g. re-declaring a queue with different parameters before deleting the existing queue).
Since they are transient, channels should not be pooled by your app.
The server uses an integer to identify a channel. When the thread managing the connection receives a packet for a particular channel, it uses this number to tell the broker which channel/session the packet belongs to.
Channels are not generally thread-safe as it would make no sense to share them among threads. If you have another thread that needs to use the broker, a new channel is needed.
Consumer Facts
A consumer is an object defined by the AMQP protocol. It is neither a channel nor a connection, instead being something that your particular application uses as a "mailbox" of sorts to drop messages.
"Creating a consumer" means that you tell the broker (using a channel via a connection) that you would like messages pushed to you over that channel. In response, the broker will register that you have a consumer on the channel and begin pushing messages to you.
Each message pushed over the connection will reference both a channel number and a consumer number. In that way, the connection-managing thread (in this case, within the Java API) knows what to do with the message; then, the channel-handling thread also knows what to do with the message.
Consumer implementation has the widest amount of variation, because it's literally application-specific. In my implementation, I chose to spin off a task each time a message arrived via the consumer; thus, I had a thread managing the connection, a thread managing the channel (and by extension, the consumer), and one or more task threads for each message delivered via the consumer.
Closing a connection closes all channels on the connection. Closing a channel closes all consumers on the channel. It is also possible to cancel a consumer (without closing the channel). There are various cases when it makes sense to do any of the three things.
Typically, the implementation of a consumer in an AMQP client will allocate one dedicated channel to the consumer to avoid conflicts with the activities of other threads or code (including publishing).
As for what you mean by the consumer thread pool, I suspect the Java client is doing something similar to what I programmed my client to do (mine was based on the .NET client, but heavily modified).
I found this article, which explains all aspects of the AMQP model (the channel being one of them), and found it very helpful in rounding out my understanding:
https://www.rabbitmq.com/tutorials/amqp-concepts.html
Some applications need multiple connections to an AMQP broker. However, it is undesirable to keep many TCP connections open at the same time because doing so consumes system resources and makes it more difficult to configure firewalls. AMQP 0-9-1 connections are multiplexed with channels that can be thought of as "lightweight connections that share a single TCP connection".
For applications that use multiple threads/processes for processing, it is very common to open a new channel per thread/process and not share channels between them.
Communication on a particular channel is completely separate from communication on another channel, therefore every AMQP method also carries a channel number that clients use to figure out which channel the method is for (and thus, which event handler needs to be invoked, for example).
There is a relation between them: a TCP connection can have multiple Channels.
Channel: a virtual connection inside a connection. Publishing and consuming messages from a queue is all done over a channel. Connection: the TCP connection between your application and the RabbitMQ broker.
In a multithreaded architecture you might otherwise need a separate connection per thread. That leads to underutilization of each TCP connection, and it adds overhead to the operating system, which has to establish as many TCP connections as are needed at peak load; system performance could be drastically reduced. This is where channels come in handy: they create virtual connections inside a single TCP connection, which cuts that OS overhead and lets us perform operations concurrently in a faster, more reliable way.
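To make that concrete, a small sketch with the RabbitMQ Java client (recent versions, where Channel is AutoCloseable): one shared TCP connection, and each publishing thread opening its own channel. Host and queue name are placeholders.

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

import java.nio.charset.StandardCharsets;

public class ChannelPerThreadPublisher {

    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // placeholder broker host

        // One TCP connection shared by the whole application...
        Connection connection = factory.newConnection();

        Runnable publisherTask = () -> {
            // ...and one lightweight channel per thread
            try (Channel channel = connection.createChannel()) {
                channel.queueDeclare("demo.queue", false, false, false, null);
                String body = "hello from " + Thread.currentThread().getName();
                channel.basicPublish("", "demo.queue", null,
                        body.getBytes(StandardCharsets.UTF_8));
            } catch (Exception e) {
                e.printStackTrace();
            }
        };

        Thread t1 = new Thread(publisherTask);
        Thread t2 = new Thread(publisherTask);
        t1.start(); t2.start();
        t1.join(); t2.join();
        connection.close();
    }
}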
RabbitMQ RPC
I decided to use RabbitMQ RPC as described here.
My Setup
Incoming web requests (on Tomcat) will dispatch RPC requests over RabbitMQ to different services and assemble the results. I use one reply queue with one custom consumer that listens to all RPC responses and collects them with their correlation id in a simple hash map. Nothing fancy there.
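For illustration, the response-collecting consumer might look roughly like this (the names and the use of CompletableFuture instead of a plain hash map are mine, not from the original setup):

import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.DefaultConsumer;
import com.rabbitmq.client.Envelope;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class RpcClient {

    private final Channel channel;
    private final String replyQueue;
    // correlation id -> pending response
    private final ConcurrentMap<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    public RpcClient(Channel channel, String replyQueue) throws IOException {
        this.channel = channel;
        this.replyQueue = replyQueue;
        // A single consumer on the shared reply queue routes responses by correlation id
        channel.basicConsume(replyQueue, true, new DefaultConsumer(channel) {
            @Override
            public void handleDelivery(String consumerTag, Envelope envelope,
                                       AMQP.BasicProperties properties, byte[] body) {
                CompletableFuture<String> future = pending.remove(properties.getCorrelationId());
                if (future != null) {
                    future.complete(new String(body, StandardCharsets.UTF_8));
                }
            }
        });
    }

    public CompletableFuture<String> call(String routingKey, String request) throws IOException {
        String correlationId = UUID.randomUUID().toString();
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(correlationId, future);
        AMQP.BasicProperties props = new AMQP.BasicProperties.Builder()
                .correlationId(correlationId)
                .replyTo(replyQueue)
                .build();
        channel.basicPublish("", routingKey, props, request.getBytes(StandardCharsets.UTF_8));
        return future;
    }
}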
This works great in a simple integration test on controller level.
Problem
When I try to do this in a web project deployed on Tomcat, Tomcat refuses to shut down. jstack and some debugging showed me that a thread is spawned to listen for the RPC response and is blocking Tomcat from shutting down gracefully. I guess this is because the thread is created at application level instead of request level and is not managed by Tomcat. When I set breakpoints in Servlet.destroy() or ServletContextListener.contextDestroyed(ServletContextEvent sce), they are not reached, so I see no way to manually clean things up.
Alternative
As an alternative, I could use a new reply queue (and a simple QueueingConsumer) for each web request. I've tested this; it works, and Tomcat shuts down as it should. But I'm wondering if this is the way to go. Can a RabbitMQ cluster deal with thousands (or even millions) of short-lived queues/consumers? I can imagine the queues aren't that big, but still: the constant broadcasting to all cluster nodes, the total memory footprint...
Question
So in short, is it wise to create a queue for each incoming web request, or how should I set up RabbitMQ with one queue and consumer so that Tomcat can shut down gracefully?
I found a solution for my problem:
The Java client creates its own threads. There is the option to supply your own ExecutorService when creating a new connection. By doing so in ServletContextListener.contextInitialized(), one can keep track of the ExecutorService and shut it down manually in ServletContextListener.contextDestroyed():
executorService.shutdown();
executorService.awaitTermination(20, TimeUnit.SECONDS);
I used Executors.newCachedThreadPool(), as the threads perform many short executions, and they get cleaned up after being idle for more than 60 s.
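Put together, the listener could look roughly like this (class and attribute names are illustrative, not from the original post):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class RabbitLifecycleListener implements ServletContextListener {

    private ExecutorService executorService;
    private Connection connection;

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        try {
            executorService = Executors.newCachedThreadPool();
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost"); // placeholder broker host
            // Hand the client our own executor so its consumer threads are under our control
            connection = factory.newConnection(executorService);
            sce.getServletContext().setAttribute("rabbitConnection", connection);
        } catch (Exception e) {
            throw new RuntimeException("Could not connect to RabbitMQ", e);
        }
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        try {
            connection.close();
            executorService.shutdown();
            executorService.awaitTermination(20, TimeUnit.SECONDS);
        } catch (Exception e) {
            // Tomcat is shutting down anyway; just log and move on
            e.printStackTrace();
        }
    }
}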
This is the link to the RabbitMQ Google Group thread (thanks to Michael Klishin for pointing me in the right direction).