RabbitMQ multi-threaded channels and queue binding - java

I have inherited some legacy RabbitMQ code that is giving me some serious headaches. Can anyone help, ideally pointing to some "official" documentation where I can browse for similar questions?
We create some channels to receive responses from workers which perform a search, like so:
channelIn.queueDeclare("", false, false, true, null);
channelIn.queueBind("", AmqpClient.NAME_EXCHANGE,
AmqpClient.ROUTING_KEY_ROOT_INCOMING + uniqueId);
My understanding from browsing mailing lists and forums is that (1) declaring a queue with an empty name lets the server auto-generate a unique name, and (2) queue names must be globally unique.
Is this true?
Also, in the second line above, my understanding based on some liberal interpretation of blogs and mailing lists is that queueBind with an empty queue name automatically binds to the last created queue. It seems nice, because then you wouldn't have to pull the auto-generated name out of the clunky DeclareOk object.
Is this true? If so, will this work in a multithreaded environment?
I.e. is it possible some channel will bind itself to another channel's queue, then if that other channel closes, the incorrectly bound channel would get an error trying to use the queue? (note that the queue was created with autodelete=true.) My testing leads me to think yes, but I'm not confident that's where the problem is.

I cannot be certain that this will work in a multithreaded environment. It may be fine a high percentage of the time but it is possible you will get the wrong queue. Why take the risk?
Wouldn't this be better and safer?
String queueName = channelIn.queueDeclare("", false, false, true, null).getQueue();
channelIn.queueBind(queueName, AmqpClient.NAME_EXCHANGE,
        AmqpClient.ROUTING_KEY_ROOT_INCOMING + uniqueId);
Not exactly clunky.
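For completeness, the returned name is also what you would hand to basicConsume. A minimal sketch, assuming the standard RabbitMQ Java client and the question's own channelIn and AmqpClient constants; the handler body is a placeholder:

import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.DefaultConsumer;
import com.rabbitmq.client.Envelope;

String queueName = channelIn.queueDeclare("", false, false, true, null).getQueue();
channelIn.queueBind(queueName, AmqpClient.NAME_EXCHANGE,
        AmqpClient.ROUTING_KEY_ROOT_INCOMING + uniqueId);
channelIn.basicConsume(queueName, true, new DefaultConsumer(channelIn) {
    @Override
    public void handleDelivery(String consumerTag, Envelope envelope,
                               AMQP.BasicProperties properties, byte[] body) {
        // handle the worker's response here (placeholder)
    }
});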

Q: What happens when a queue is declared with no name?
A: The server picks a unique name for the queue. When no name is supplied, the RabbitMQ server will generate a unique-for-that-RabbitMQ-cluster name, create a queue with that name, and then transmit the name back to the client that called queue.declare. RabbitMQ does this in a thread-safe way internally (e.g. many clients calling queue.declare with blank names will never get the same name). Here is the documentation on this behavior.
Q: Do queue names need to be globally unique?
A: No, but they may need to be in your use case. Any number of publishers and subscribers can share a queue. Queue declarations are idempotent, so if 2 clients declare a queue with the same name and settings at the same time, or at different times, the server state will be the same as if just one declared it. Queues with blank names, however, will never collide. Consider declaring a queue with a blank name as if it were two operations: an RPC asking RabbitMQ "give me a globally unique name that you will reserve just for my use", and then idempotently declaring a queue with that name.
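A small sketch of the two cases described above, assuming an open com.rabbitmq.client.Channel named channel; the name "search.results" is just a placeholder:

// Named queue: idempotent. Both calls succeed and refer to the same queue,
// provided the durable/exclusive/auto-delete/arguments settings match.
channel.queueDeclare("search.results", false, false, true, null);
channel.queueDeclare("search.results", false, false, true, null);

// Blank name: the broker reserves a fresh, cluster-unique name each time.
String q1 = channel.queueDeclare("", false, false, true, null).getQueue();
String q2 = channel.queueDeclare("", false, false, true, null).getQueue();
// q1 and q2 are always different (e.g. "amq.gen-..." style names).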
Q: Will queue.bind with a blank name bind to the last created queue in a multithreaded environment?
A: Yes, but you should not do that; it achieves nothing, is confusing, and has unspecified/poorly-specified behavior. This technique is largely pointless and prone to bugs in client code (what if lines got added between the declare and the bind? Then it would be very hard to determine which queue was being bound).
Instead, use the return value of queueDeclare; that return value will contain the name of the queue that was declared. If you declared a queue with no name, the return value of queueDeclare will contain the new globally-unique name provided by RabbitMQ. You can provide that explicitly to subsequent calls that work with that queue (like binding it).
For an additional reason not to do this, the documentation regarding blank-queue-name behavior is highly ambiguous:
The client MUST either specify a queue name or have previously
declared a queue on the same channel
What does that mean? If more than one queue was declared, which one will be bound? What if the previously-declared queue was then deleted on that same channel? This seems like a very good reason to be as explicit as possible and not rely on this behavior.
Q: Can queues get deleted "underneath" channels connected to them?
A: Yes, in specific circumstances. Minor clarification on your question's terminology: channels don't "bind" themselves to queues: a channel can consume a queue, though. Think of a channel like a network port and a queue like a remote peer: you don't bind a port to a remote peer, but you can talk to more than one peer through the same port. Consumers are the equivalent of connected sockets; not channels. Anyway:
Channels don't matter here, but consumers and connections do (you can have more than one consumer, even on the same queue, per channel; you can have more than one channel per connection). Here are the situations in which a queue can be deleted "underneath" a channel subscribing to it (I may have missed some, but these are all the non-disastrous conditions I know of, i.e. excluding "the server exploded" scenarios):
A queue was declared with exclusive set to true, and the connection on which the queue was declared closes. The channel used to declare the queue can be closed, but so long as the connection stays open the queue will keep existing. Clients connected to the exclusive queue will see it disappear. However, clients may not be able to access the exclusive queue for consumption in the first place if it is "locked" to its declarer--the documentation is not clear on what "used" means with regards to exclusive locking.
A queue which is manually deleted via a queue.delete call. In this case, all consumers connected to the queue may encounter an error the next time they try to use it.
Note that in many client situations, consumers are often "passive" enough that they won't realize that a queue is gone; they'll just listen forever on what is effectively a closed socket. Publishing to a queue, or attempting to redeclare it with passive set (an existence poll), is guaranteed to surface the nonexistence; consumption alone is not: sometimes you will see a "this queue was deleted!" error, sometimes it will take minutes or hours to arrive, and sometimes you will never see such an error if all you're doing is consuming.
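As a sketch of that existence poll, assuming the standard RabbitMQ Java client (a failed passive declare closes the channel it was issued on, so a throwaway channel is used here):

import java.io.IOException;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;

static boolean queueStillExists(Connection conn, String queueName) throws Exception {
    Channel probe = conn.createChannel();
    try {
        // Passive declare: succeeds only if the queue already exists; never creates it.
        probe.queueDeclarePassive(queueName);
        probe.close();
        return true;
    } catch (IOException e) {
        // The broker replied 404 NOT_FOUND and closed the probe channel: the queue is gone.
        return false;
    }
}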
Q: Will auto_delete queues get deleted "underneath" one consumer when another consumer exits?
A: No. auto_delete queues are deleted sometime after the last consumer leaves the queue. So if you start two consumers on an auto_delete queue, you can exit one without disturbing the other. Here's the documentation on that behavior.
Additionally, queues which expire (via per-queue TTL) follow the same behavior: the queue will only go away sometime after the last consumer leaves.

Related

Manage delivery of JMS messages to multiple servers

Our app uses Spring Boot and JMS messages with Tibco. We have two production servers running and processing messages concurrently. The servers listen to the same single queue, and each server has 10 concurrent listeners. I do not want the very same message to get processed by both servers at the same time. Nothing prevents our queue from having duplicate messages, e.g. we can have two copies of message A in the queue. If the messages in the queue are A, A, B, C, D, and the first A gets delivered to server1 while the second A gets delivered to server2, and both servers process A at the same time, then there is a chance of creating duplicate entities. I want to find a way to send all A messages to only one server. I can't use a Message Selector because we have the same code base running on both servers. This is what I'm considering:
Based on the message, set properties in the headers. Once the message gets delivered to the process() method, depending on which server is processing it, either discard it (simply return) or process the message and acknowledge it. The problem with this solution is that since we need to dynamically find out which server is processing the message, the server name needs to be hardcoded, meaning if the server moves, the code breaks!
Another solution that might work is the Destination field.
https://docs.spring.io/spring/docs/4.0.x/spring-framework-reference/html/jms.html
Destinations, like ConnectionFactories, are JMS administered objects that can be stored and retrieved in JNDI. When configuring a Spring application context you can use the JNDI factory class JndiObjectFactoryBean to perform dependency injection on your object's references to JMS destinations.
It's something I have never done before. Is there any way to configure the Destination so that it picks the right server to route the message to? Meaning, if message1 is supposed to be delivered to server1, then it never gets delivered to server2 and remains in the queue until server1 consumes it?
What are other ways to implement this?
EDIT:
I still do not know the best way to send certain messages to only one server for processing; however, I accepted the answer suggesting database validation, because that is what we are considering to avoid creating duplicate entities when processing the data.
I think the idea of using the JMS Destination is a non-starter as there is nothing in the JMS specification which guarantees any kind of link between the destination and a broker. The destination is just an encapsulation for the provider-specific queue/topic name.
The bottom line here is that you either need to prevent the duplicate messages in the first place or have some way to coordinate the consumers to deal with the duplicates after they've been pulled off the queue. I think you could do either of these using an external system like a database, e.g.:
When producing the message check the database for an indication that the message was sent already. If no indication is found then write a record to the database (will need to use a primary key to prevent duplicates) and send the message. Otherwise don't send the message.
When consuming the message check the database for an indication that the message is being (or was) consumed already. If no indication is found then write a record to the database (will need to use a primary key to prevent duplicates) and process the message. Otherwise just acknowledge the message without processing it.
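A minimal sketch of the consumer-side check under some assumptions: a PROCESSED_MESSAGES table whose MESSAGE_ID column is the primary key, and a message property ("entityKey" here) that uniquely identifies the logical message; both names are placeholders, and some JDBC drivers report a duplicate key as a plain SQLException with SQLState 23xxx rather than the subclass used below:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;
import javax.jms.JMSException;
import javax.jms.Message;

// Returns true if this server "won" the message and should process it;
// false means another server already claimed it, so just acknowledge it.
static boolean claimMessage(Connection db, Message m) throws SQLException, JMSException {
    String key = m.getStringProperty("entityKey");  // placeholder property name
    try (PreparedStatement ps = db.prepareStatement(
            "INSERT INTO PROCESSED_MESSAGES (MESSAGE_ID) VALUES (?)")) {
        ps.setString(1, key);
        ps.executeUpdate();
        return true;   // the first INSERT wins thanks to the primary key
    } catch (SQLIntegrityConstraintViolationException e) {
        return false;  // duplicate key: another server is (or was) processing it
    }
}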
I suggest an alternative to "post DB sync".
Keep the servers and listeners as-is, and broadcast all the processed messages on a topic. For servers that are just starting, you can use "durable subscribers" so they do not miss any messages.
If you broadcast each start and end of processing for messages A, B, C, etc., AND consider adding a little pause (a few milliseconds), you should avoid collisions. That is the main risk, of course.
It's not clear to me if you should validate for duplicate processing at the beginning or end of a message processing... it depends on your needs.
If this whole idea is not acceptable, DB validation might be the only option, but as stated in comments above, I fear for scaling.

How to publish to multiple queues with work queue behavior?

Using RabbitMQ, I have two types of consumers: FileConsumer writes messages to file and MailConsumer mails messages. There may be multiple consumers of each type, say three running MailConsumers and one FileConsumer instance.
How can I do this:
Each published message should be handled by exactly one FileConsumer instance and one MailConsumer instance
Publishing a message should be done once, not one time for each queue (if possible)
If there are no consumers connected, messages should be queued until consumed, not dropped
What type of exchange etc should I use to get this behavior? I'd really like to see some example/pseudo-code to make this clear.
This should be easy to do, but I couldn't figure it out from the docs. It seems the fanout example should work, but I'm confused by these "anonymous queues", which seem like they would lead to sending the same message to each consumer.
If you create a queue without the auto-delete flag, the queue will stay alive even after consumers disconnect.
Note that if you declare the queue as durable, it will still be present after a broker restart.
If you then publish messages with the delivery-mode=2 property set (meaning the messages are persistent), those messages will survive a broker restart as well, provided they sit in durable queues (which is why making the queue durable matters).
Using the fanout exchange type is not mandatory. You can also use a topic exchange for finer-grained routing if you need it.
UPD: a step-by-step way to get what you show in your schema.
Declare a durable exchange, say main, as exchange.declare(exchange-name=main, type=fanout, durable=true).
Declare two queues, say files and mails, as queue.declare(queue-name=files, durable=true) and queue.declare(queue-name=mails, durable=true).
Bind both queues to exchange as queue.bind(queue-name=files, exchange-name=main) and queue.bind(queue-name=mails, exchange-name=main).
At this point you can publish messages to the main exchange (see the note about delivery-mode above) and consume them with any number of consumers: from files with FileConsumer and from mails with MailConsumer. Without any consumers on the queues, messages will be queued and stay there until they are consumed (or until a broker restart, if they are not persistent).
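A rough sketch of those steps with the RabbitMQ Java client (connection parameters, error handling and the consumers themselves are omitted; the exchange and queue names match the example above):

import java.nio.charset.StandardCharsets;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.MessageProperties;

public class FanoutSetup {
    public static void main(String[] args) throws Exception {
        Connection conn = new ConnectionFactory().newConnection();
        Channel ch = conn.createChannel();

        // 1. Durable fanout exchange.
        ch.exchangeDeclare("main", "fanout", true);

        // 2. Two durable queues, one per consumer type.
        ch.queueDeclare("files", true, false, false, null);
        ch.queueDeclare("mails", true, false, false, null);

        // 3. Bind both queues to the exchange (fanout ignores the routing key).
        ch.queueBind("files", "main", "");
        ch.queueBind("mails", "main", "");

        // 4. Publish once; the exchange copies the message into both queues.
        //    PERSISTENT_TEXT_PLAIN sets delivery-mode=2 so the message survives a restart.
        ch.basicPublish("main", "", MessageProperties.PERSISTENT_TEXT_PLAIN,
                "hello".getBytes(StandardCharsets.UTF_8));

        ch.close();
        conn.close();
    }
}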

How to get number of consumers connected to Websphere MQ queue from Java

I am trying to get the number of consumers of a particular Websphere MQ queue from Java? I need to know whether someone is going to consume the messages before placing them on the queue.
First, it is worth noting that the design proposed is a very, VERY bad design. The effect is to turn async messaging back into synchronous messaging. This couples message producers to consumers, introduces location and resolution dependencies, breaks clustering, defeats WMQ's load distribution and balancing, embeds network topology into the application, and makes the whole system brittle. Please do not blame WMQ for not working correctly after intentionally defeating all its best features except the actual queue/dequeue operations.
However, to answer your question more directly, use the getOpenInputCount method of the queue object to obtain the number of open input handles. Here's how:
// openOptions must include the MQOO_INQUIRE option, or the inquiry below will fail
MQQueue outQ = qMgr.accessQueue(qName,
        openOptions,
        null,   // default q manager
        null,   // no dynamic q name
        null);  // no alternate user id
int inCount = outQ.getOpenInputCount();
Note that you can only inquire the input handles on a local queue. If the queue is hosted on a QMgr other than the one where the message sender is connected, this method will not work. Of course it is the normal case that the message sender and receiver would reside on different QMgrs. However since you do not mention much about the design, I'll assume for purposes of this answer that connections from the message producer and consumer attach to the same QMgr. If that's not the case, we need to have a discussion about PCF and even stronger warnings about the design.

Temporarily disable delivery of JMS message based on message property

I have a requirement and I am currently not sure whether it is possible at all. I would like to temporarily disable the delivery of a JMS message if the message contains a specified property. Currently I am using HornetQ as the message provider.
Let's make an example:
The queue contains the following three entries:
{1, "foo", "A_CATEGORY"}
{2, "bar", "B_CATEGORY"}
{9, "bof", "A_CATEGORY"}
At a certain point the app must be able to tell the HornetQ message server that messages belonging to B_CATEGORY shouldn't be delivered at the moment (e.g. because the underlying database for B_CATEGORY objects gets updated). So the message with id 2 wouldn't be delivered at the moment, while 1 and 9 would be delivered as they have a different value for the category object.
It must happen from Java code, without restarting the application at all. Is this possible?
Thanks for your help!
Just thought about an alternative design approach for this problem. Let's assume that the first queue contains messages of all kinds of categories (by the way, it isn't possible to create a queue per category, as there could be a lot of them). This 'normal' queue is configured normally (e.g. with no expiry, but with a DLQ).
Now if a listener consumes such a message and sees that it can't process messages belonging to a certain category, it puts the message into a second queue. This queue is configured with a redelivery delay and also an expiry time. If one sets the expiry time high enough (but of course not so high that the queue overflows) and the redelivery delay not too short, this should work out if there is no direct solution to the question above.
Of course one must estimate how many of those queue entries could be created during the time a category can't be processed, and also how long such an unavailability of a category could last, so that the redelivery delay can be adjusted accordingly.
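A rough JMS sketch of that forwarding step (the "category" property and the isBlocked/process helpers are placeholders; the redelivery delay and expiry live in the HornetQ configuration of the hold queue, not in this code):

import javax.jms.Message;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;

// Called for each message arriving on the 'normal' queue.
static void handle(Session session, Queue holdQueue, Message message) throws Exception {
    String category = message.getStringProperty("category");   // placeholder property name
    if (isBlocked(category)) {
        // Park the message on the hold queue; the broker's redelivery delay
        // and expiry settings bring it back (or drop it) later.
        MessageProducer toHold = session.createProducer(holdQueue);
        toHold.send(message);
        toHold.close();
    } else {
        process(message);   // normal processing (placeholder)
    }
}

static boolean isBlocked(String category) { /* placeholder: look up the blocked set */ return false; }
static void process(Message m) { /* placeholder: real processing */ }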
As far as I can tell, it is not possible with message driven beans.
Similar functionality is achievable with a standard JMS consumer:
MessageConsumer c = session.createConsumer(destination);
while (bCategoryCanBeProcessed) {          // placeholder flag: B_CATEGORY is OK to process
    Message m = c.receive();
    // process messages of every category
}
c.close();
// now create a different consumer with a message selector ignoring B_CATEGORY
MessageConsumer c1 = session.createConsumer(destination, "Category <> 'B_CATEGORY'");
while (bCategoryIsLocked) {                // placeholder flag: B_CATEGORY is locked
    Message m = c1.receive();
    // process only non-B messages
}
c1.close();
// go back to the start
This example assumes you're able to tell when to process B's again based on the messages received. If not, then you could resume the normal routine after certain time. The example also presents only a single thread of execution.
Exploring this path further, you could take a look at Spring's DefaultMessageListenerContainer, Spring's take on a message-driven bean. It can do exactly what I described, but in a far more advanced way. It can be fed a message selector, and that selector is live: you can change it at any time. It also handles messages in multiple threads if you set concurrentConsumers higher than 1.
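A hedged sketch of that container setup (queue name, selector, and listener body are placeholders; note that a selector changed at runtime only takes effect as the container refreshes its consumers, which depends on its caching settings):

import javax.jms.ConnectionFactory;
import javax.jms.MessageListener;
import org.springframework.jms.listener.DefaultMessageListenerContainer;

static DefaultMessageListenerContainer startContainer(ConnectionFactory connectionFactory) {
    DefaultMessageListenerContainer container = new DefaultMessageListenerContainer();
    container.setConnectionFactory(connectionFactory);
    container.setDestinationName("categoryQueue");   // placeholder queue name
    container.setConcurrentConsumers(4);             // multiple listener threads
    container.setMessageListener((MessageListener) message -> {
        // process the message (placeholder)
    });
    container.afterPropertiesSet();
    container.start();
    return container;
}

// Later, when B_CATEGORY must be paused:
//   container.setMessageSelector("Category <> 'B_CATEGORY'");
// and when B_CATEGORY may flow again:
//   container.setMessageSelector(null);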
As for your solution with redirecting messages to another queue while they cannot be processed, please notice that it generates extra traffic; you do want all your messages to be processed in the end, right? Why not leave them where they are and just fetch them in appropriate time? You won't have to estimate the redelivery delay ahead, which might be hard.
You could create a core queue (or a subscription) with a filter and stop the queue using the management API. Or, if you are running HornetQ embedded, you could simply pause the server-side Queue object directly.
As this would be a very custom feature, you would probably need to run embedded, or make special adjustments on your own branch.

Multithread-safe JDBC Save or Update

We have a JMS queue of job statuses, and two identical processes pulling from the queue to persist the statuses via JDBC. When a job status is pulled from the queue, the database is checked to see if there is already a row for the job. If so, the existing row is updated with new status. If not, a row is created for this initial status.
What we are seeing is that a small percentage of new jobs are being added to the database twice. We are pretty sure this is because the job's initial status is quickly followed by a status update - one process gets one, another process the other. Both processes check to see if the job is new, and since it has not been recorded yet, both create a record for it.
So, my question is, how would you go about preventing this in a vendor-neutral way? Can it be done without locking the entire table?
EDIT: For those saying the "architecture" is unsound - I agree, but am not at liberty to change it.
Create a unique constraint on JOB_ID, and retry persisting the status if a constraint violation exception occurs.
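A minimal vendor-neutral sketch of that pattern, assuming a JOB_STATUS table with a unique constraint on JOB_ID (table and column names are placeholders; some drivers report the violation as a plain SQLException with SQLState 23xxx instead of the subclass caught here):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;

static void saveStatus(Connection db, String jobId, String status) throws SQLException {
    // Try the update first; zero rows affected means the job is new.
    try (PreparedStatement upd = db.prepareStatement(
            "UPDATE JOB_STATUS SET STATUS = ? WHERE JOB_ID = ?")) {
        upd.setString(1, status);
        upd.setString(2, jobId);
        if (upd.executeUpdate() > 0) {
            return;
        }
    }
    try (PreparedStatement ins = db.prepareStatement(
            "INSERT INTO JOB_STATUS (JOB_ID, STATUS) VALUES (?, ?)")) {
        ins.setString(1, jobId);
        ins.setString(2, status);
        ins.executeUpdate();
    } catch (SQLIntegrityConstraintViolationException e) {
        // The other process inserted the row between our UPDATE and INSERT:
        // retry, which now takes the UPDATE path.
        saveStatus(db, jobId, status);
    }
}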
That being said, I think your architecture is unsound: if two processes are pulling messages from the queue, there is no guarantee they will write them to the database in queue order. One consumer might be a bit slower, a packet might be dropped, and so on, causing the other consumer to persist the later message first, so the newer state gets overwritten by the older one.
One way to guard against that is to include sequence numbers in the messages, update the row only if the sequence number is as expected, and delay the update otherwise (this is vulnerable to lost messages, though ...).
Of course, the easiest way would be to have only one consumer ...
JDBC connections are not thread safe, so there's nothing to be done about that.
"...two identical processes pulling from the queue to persist the statuses via JDBC..."
I don't understand this at all. Why two identical processes? Wouldn't it be better to have a pool of message queue listeners, each of which would handle messages landing on the queue? Each listener would have its own thread; each one would be its own transaction. A Java EE app server allows you to configure the size of the message listener pool to match the load.
I think a design that duplicates a process like this is asking for trouble.
You could also change the isolation level on the JDBC connection. If you make it SERIALIZABLE you'll ensure ACID at the price of slower performance.
Since it's an asynchronous process, performance will only be an issue if you find that the listeners can't keep up with the messages landing on the queue. If that's the case, you can try increasing the size of the listener pool until you have adequate capacity to process the incoming messages.
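For reference, a small sketch of raising the isolation level on a plain JDBC connection (whether SERIALIZABLE alone closes the race depends on the database and on how the check-then-insert is written; dataSource is an assumed javax.sql.DataSource):

import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

static void persistStatusSerializable(DataSource dataSource) throws SQLException {
    try (Connection db = dataSource.getConnection()) {
        db.setAutoCommit(false);
        db.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
        try {
            // check for an existing JOB_ID row, then insert or update it here
            db.commit();
        } catch (SQLException e) {
            db.rollback();
            throw e;
        }
    }
}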
