Manage delivery of JMS messages to multiple servers - java

Our app uses Spring Boot and JMS messages with Tibco. We have two production servers running and processing messages concurrently. Both servers listen to the same queue, and each server has 10 concurrent listeners. I do not want the very same message to be processed by both servers at the same time. Nothing prevents our queue from containing duplicate messages; for example, we can have two copies of message A in the queue. If the messages in the queue are A, A, B, C, D, then if the first A gets delivered to server1 and the second A gets delivered to server2, and both servers process A at the same time, there is a chance of creating duplicate entities. I want to find a way to send all A messages to only one server. I can't use a Message Selector because we have the same code base running on both servers. This is what I'm considering:
Based on the message, set properties in the headers. Once the message is delivered to the process() method, depending on which server is processing it, either discard it (simply return) or process it and acknowledge it. The problem with this solution is that since we need to dynamically find out which server is processing the message, the server name needs to be hardcoded, meaning if the server moves, the code breaks!
Another solution that might work is the Destination field.
https://docs.spring.io/spring/docs/4.0.x/spring-framework-reference/html/jms.html
Destinations, like ConnectionFactories, are JMS administered objects
that can be stored and retrieved in JNDI. When configuring a Spring
application context you can use the JNDI factory class
JndiObjectFactoryBean to perform dependency
injection on your object’s references to JMS destinations.
It's something I have never done before. Is there any way to configure the Destination so that it routes the message to the right server? Meaning, if message1 is supposed to be delivered to server1, then it does not even get delivered to server2 and remains in the queue until server1 consumes it?
What are other ways to implement this?
EDIT:
I still do not know the best way to send certain messages to only one server for processing; however, I accepted the answer suggesting database validation, because that is what we are considering to avoid creating duplicate entities when processing the data.

I think the idea of using the JMS Destination is a non-starter as there is nothing in the JMS specification which guarantees any kind of link between the destination and a broker. The destination is just an encapsulation for the provider-specific queue/topic name.
The bottom line here is that you either need to prevent the duplicate messages in the first place or have some way to coordinate the consumers to deal with the duplicates after they've been pulled off the queue. I think you could do either of these using an external system like a database, e.g.:
When producing the message check the database for an indication that the message was sent already. If no indication is found then write a record to the database (will need to use a primary key to prevent duplicates) and send the message. Otherwise don't send the message.
When consuming the message check the database for an indication that the message is being (or was) consumed already. If no indication is found then write a record to the database (will need to use a primary key to prevent duplicates) and process the message. Otherwise just acknowledge the message without processing it.
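As a rough sketch of the consumer-side variant, here is one way the database check could look with plain JDBC. The processed_messages table, its message_id primary-key column, and the BUSINESS_KEY header carrying the business identifier are assumptions made for illustration, not part of the answer.

// Hypothetical sketch: deduplication on consume using a database primary key.
// Assumes a table like: CREATE TABLE processed_messages (message_id VARCHAR(255) PRIMARY KEY)
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.sql.DataSource;

public class DedupConsumer {

    private final DataSource dataSource; // injected elsewhere

    public DedupConsumer(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public void process(Message message) throws JMSException {
        // A business key carried in the message, e.g. a header set by the producer (assumed name)
        String businessKey = message.getStringProperty("BUSINESS_KEY");
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO processed_messages (message_id) VALUES (?)")) {
            ps.setString(1, businessKey);
            ps.executeUpdate();          // succeeds only for the first copy of this key
        } catch (SQLException duplicate) {
            // Primary key violation: another server (or this one) already claimed the message
            // (in production you would check specifically for a key-violation error code)
            message.acknowledge();       // acknowledge without processing
            return;
        }
        doRealWork(message);             // only one server ever reaches this point for a given key
        message.acknowledge();
    }

    private void doRealWork(Message message) { /* business logic */ }
}

The producer-side variant is the same idea, with the insert performed just before send() rather than before processing.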

I suggest an alternative to the database-synchronization ("post DB sync") approach suggested above.
Keep the servers and listeners as-is, and broadcast all the processed messages on a topic. For servers that are just starting, you can use durable subscribers so they do not miss any messages.
If you broadcast the start and end of processing for messages A, B, C, etc., and consider adding a little pause (in milliseconds), you should avoid collisions. That is the main risk, of course.
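To make that coordination idea more concrete, here is a rough sketch under several assumptions: a shared coordination topic (called PROCESSING.EVENTS here), a BUSINESS_KEY header identifying duplicate messages, and a short pause before claiming a message. It only illustrates the approach; as noted above, the race between the two servers is not fully eliminated.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.MessageProducer;
import javax.jms.Session;

public class ProcessingCoordinator implements MessageListener {

    private final Map<String, String> inFlight = new ConcurrentHashMap<>(); // business key -> server that announced it
    private final Session session;               // session used only for publishing coordination events
    private final MessageProducer topicProducer; // producer on the "PROCESSING.EVENTS" topic (assumed name)
    private final String serverId;               // unique per server instance

    public ProcessingCoordinator(Session session, MessageProducer topicProducer, String serverId) {
        this.session = session;
        this.topicProducer = topicProducer;
        this.serverId = serverId;
    }

    // Subscriber on the coordination topic (a durable subscriber if announcements must not be missed).
    @Override
    public void onMessage(Message announcement) {
        try {
            String key = announcement.getStringProperty("BUSINESS_KEY");
            String owner = announcement.getStringProperty("SERVER_ID");
            if ("START".equals(announcement.getStringProperty("EVENT"))) {
                inFlight.putIfAbsent(key, owner);     // first announcer wins
            } else {                                  // "END"
                inFlight.remove(key, owner);
            }
        } catch (JMSException e) { /* log */ }
    }

    // Called by the queue listener before processing; true means "this server should process the message".
    public boolean tryClaim(String businessKey) throws JMSException, InterruptedException {
        Message start = session.createMessage();
        start.setStringProperty("EVENT", "START");
        start.setStringProperty("BUSINESS_KEY", businessKey);
        start.setStringProperty("SERVER_ID", serverId);
        topicProducer.send(start);
        Thread.sleep(50);                             // the "little pause"; collisions are still possible
        String owner = inFlight.get(businessKey);
        return owner == null || owner.equals(serverId);
    }
}

A matching "END" event published after processing ties in with the question below of whether to validate at the start or the end of processing.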
It's not clear to me whether you should validate for duplicate processing at the beginning or at the end of message processing... it depends on your needs.
If this whole idea is not acceptable, DB validation might be the only option, but as stated in the comments above, I fear for scaling.

Related

Best Practice for resilience of messages across RabbitMQ queues

I am trying to understand the best use of RabbitMQ to satisfy the following problem.
As context I'm not concerned with performance in this use case (my peak TPS for this flow is 2 TPS) but I am concerned about resilience.
I have RabbitMQ installed in a cluster and, ignoring dead letter queues, the basic flow is: a service receives a request and creates a persistent message, which it writes, in a transaction, to a durable queue (at this point I'm happy the request is secured to disk). Another process listens for a message, reads it (not using auto-ack), does a bunch of work, and writes a new message to a different exchange/queue in a transaction (again, I'm now happy this message is secured to disk). Assuming the transaction completes successfully, it manually acks the original message.
At this point my only failure scenario is a failure between the commit of the transaction that writes to my second queue and the return of the ack. This could lead to a message being processed twice. Is there anything else I can do to plug this gap, or do I have to figure out a way of handling duplicate messages?
As a final bit of context, the services are written in Java, so they are using the Java client libs.
Paul Fitz.
First of all, I suggest you look at this guide here, which has a lot of valid information on your topic.
From the RabbitMQ guide:
At the Producer
When using confirms, producers recovering from a channel or connection
failure should retransmit any messages for which an acknowledgement
has not been received from the broker. There is a possibility of
message duplication here, because the broker might have sent a
confirmation that never reached the producer (due to network failures,
etc). Therefore consumer applications will need to perform
deduplication or handle incoming messages in an idempotent manner.
At the Consumer
In the event of network failure (or a node crashing), messages can be
duplicated, and consumers must be prepared to handle them. If
possible, the simplest way to handle this is to ensure that your
consumers handle messages in an idempotent way rather than explicitly
deal with deduplication.
So, the point is that it is not possible in any way at all to guarantee that this "failure" scenario of yours will not happen. You will always have to deal with network failures, disk failures, fill-in-the-blank failures, etc.
What you have to do here is lean on the messaging architecture and, if possible, make your messages "idempotent" (which means that even if you process the same message twice, nothing wrong will happen; check this).
If you can't, then you should maintain some kind of "processed messages" list (for example, you can put a GUID inside every message) and check this list every time you receive a message; duplicates can simply be discarded.
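A minimal sketch of that "processed messages" list with the RabbitMQ Java client; the GUID is assumed to travel in the standard messageId property, and the in-memory set is only for illustration (with several consumers you would back it with a shared store such as a database).

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.DeliverCallback;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class DeduplicatingConsumer {

    // GUIDs already handled; a shared store (DB, cache) is needed if several consumers compete.
    private final Set<String> processedGuids = ConcurrentHashMap.newKeySet();

    public void start(Channel channel, String queueName) throws Exception {
        DeliverCallback callback = (consumerTag, delivery) -> {
            String guid = delivery.getProperties().getMessageId();   // assumed to be set by the producer
            long tag = delivery.getEnvelope().getDeliveryTag();
            if (guid != null && !processedGuids.add(guid)) {
                channel.basicAck(tag, false);                        // duplicate: just discard it
                return;
            }
            process(delivery.getBody());                             // idempotent business logic
            channel.basicAck(tag, false);                            // manual ack, as in the question
        };
        channel.basicConsume(queueName, false, callback, consumerTag -> { });
    }

    private void process(byte[] body) { /* ... */ }
}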
To be more "theorical", this post from Brave New Geek is very interesting:
Within the context of a distributed system, you cannot have
exactly-once message delivery.
Hope it helps :)

JMS Message Ordering and Transaction Rollback

We're building a system that will be sending messages from one application to another via JMS (Using Websphere MQ if that matters). These messages are of the form "Create x" or "Delete x". (The end result of this is that a third-party system needs to be informed of the Create and Delete messages, so the Read end of the JMS queue is going to talk to the third-party system, whilst the Write end of the JMS queue is just broadcasting messages out to be handled)
The problem that we're worried about here is if one of the messages fails. The initial thought here was simply to roll the failures back onto the JMS queue and let the normal retry mechanism handle it. That works until you get a Delete followed by a Create for the same identifier, e.g.
Delete 123 - Fails, gets rolled back on to the queue
Create 123 - Succeeds
Delete 123 - Retry from earlier failure
The end result of this is that the third party was told to Create 123 and then immediately to Delete 123, instead of the other way around.
Whilst not ideal, from what I've been reading up on it seems that Message Affinity would help here, so that we can guarantee that the messages are processed in the correct order. However, I'm not sure how message affinity will work when messages are processed and failed back onto the queue? (Message Affinity is generally considered a bad idea, but the load here isn't going to be great, and the risk of poison messages is very low. It's simply the risk that the third-party that we're interacting with has a brief outage that we're concerned with)
Failing that, are there any better thoughts on how to do this?
Edit - Further complications. The system we're building to integrate with the third-party is to replace a system they used from a different supplier until recently. As such, there's a bunch of data that is already in the third-party, but it turns out to be very difficult to actually get this out. (The third-party doesn't even send success/failure messages back, merely an acknowledgement of receipt!), so we don't actually know the initial state of the system.
The definitive way to address this is to include in the message a sequence such that the earlier message can't overwrite the later one.
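As a rough sketch of what "include a sequence" could mean on the consuming side, assume the producer stamps each message with an ENTITY_ID and a monotonically increasing EVENT_SEQUENCE property (names invented here for illustration); a redelivered Delete that is older than the Create already forwarded can then be recognised and dropped or parked.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.jms.JMSException;
import javax.jms.Message;

public class SequenceGuard {

    // Highest sequence number applied so far, per business identifier (e.g. "123").
    private final Map<String, Long> lastApplied = new ConcurrentHashMap<>();

    // Returns true if the message should be applied, false if it is older than what was already sent on.
    public boolean shouldApply(Message message) throws JMSException {
        String id = message.getStringProperty("ENTITY_ID");        // assumed property set by the producer
        long sequence = message.getLongProperty("EVENT_SEQUENCE"); // assumed increasing per entity
        Long previous = lastApplied.get(id);
        if (previous != null && sequence <= previous) {
            return false;   // e.g. a redelivered "Delete 123" older than the "Create 123" already forwarded
        }
        lastApplied.put(id, sequence);
        return true;
    }
}

In practice the last-applied sequence would be stored in a database rather than in memory so that it survives restarts and is shared by all consumers.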
Once upon a time, your transactions at the bank were processed in the order they arrived. However, it was this exact problem that caused that to change. People became aware that their account balance could be positively or negatively affected depending on the order in which the transactions were processed. When it was left to chance it was occasionally a problem for people but in general no malice was perceived.
Later, banks started to memo-post transactions during the day, then sort them into the order most favorable to the bank prior to processing them. For example, if the largest checks cleared first and the account ran out of money, several smaller checks might bounce, generating multiple bounce fees for the bank. Once this was discovered to have become widespread practice, it was changed to always apply the transactions in the order most favorable to the account holder. (At least here in the US.)
This is a common problem and it's been solved many times, the first being decades ago. There really is no excuse anymore for an Enterprise-grade application to both
Use asynchronous messaging in which delivery order by design cannot be guaranteed, and
Contain an inherent dependency on delivery order to assure the integrity of the transactions and data.
The mention of message affinities hints at the solution to this when approached as a transport problem. To guarantee message order delivery requires the following:
One and only one sender of messages
One and only one node from which messages are sent.
One and only one path between sender and receiver of messages.
One and only one queue at which messages are received.
All messages processed under syncpoint.
The ability to pend processing of any messages whilst an orphaned transaction exists (connection exception handling).
No backout queue on the input queue.
No DLQ if the messages traverse a channel.
No use of Priority delivery on any queue along the path.
One and only one receiver of messages.
No ability to scale the application other than by adding CPU and memory to the node(s) hosting the QMgr(s).
Or the problem could be addressed in the application design using memo posting, compensating transactions, or any of the other techniques commonly used to eliminate sequence dependencies.
(Side note: If the application in question is an off-the-shelf vendor package this would seem to me to create a credibility issue. No application can claim to be robust if something so commonplace as a backed out message can mess with data integrity.)
One way to avoid the scenario you described above is to have different classifications of message failures, keeping in mind that your messages should be processed in order (message affinity).
Consider this scenario: if the application receives a "Delete x" for which it has not received a "Create x" before, classify this error scenario as a "Business Error", since the error occurred because the producer sent the wrong message, or sent the messages in the wrong order.
Once you classify this error as a "Business Error", you should not call rollback; instead, insert the message into the database, identified as a "Business Error".
In this way, you have committed the message off the queue, reduced the risk of rollback, and further reduced the risk of inconsistent behaviour in your application.
Now consider another scenario: if your application itself has a problem (for example, the database or web server goes down, or a similar technical error), then use the rollback mechanism of the JMS queue and treat the error as a "Technical Error".
If a "Technical Error" occurs, the JMS queue will retry the message until your application is able to accept and process it.
Once your application is back up after the "Technical Error" and retries the messages in sequential order, the same rule applies: if a "Business Error" occurs, that message is not retried again.
Note: the "Business Error" classification should be agreed upon by all parties, i.e. if you mark a message as a "Business Error", it means the message is no longer useful and your producer should send a new "Delete x" for any valid "Create x".
Some of the "Business Error" you can take into accounts are--
Received "Delete x" before "Create x"
Received "Create x" after "Create x"
Received "Delete x" after valid "Delete x"

Handling Failed calls on the Consumer end (in a Producer/Consumer Model)

Let me try explaining the situation:
There is a messaging system that we are going to incorporate which could either be a Queue or Topic (JMS terms).
1 ) Producer/Publisher : There is a service A. A produces messages and writes to a Queue/Topic
2 ) Consumer/Subscriber : There is a service B. B asynchronously reads messages from the Queue/Topic. B then calls a web service and passes the message to it. The web service takes a significant amount of time to process the message. (This action need not be processed in real time.)
The Message Broker is Tibco
My intention is: not to miss processing any message from A, and to re-process a message at a later point in time in case the processing failed the first time (perhaps as a batch).
Question:
I was thinking of writing the message to a DB before making a webservice call. If the call succeeds, I would mark the message processed. Otherwise failed. Later, in a cron job, I would process all the requests that had initially failed.
Is writing to a DB a typical way of doing this?
Since you have a fail callback, you can just requeue your Message and have your Consumer/Subscriber pick it up and try again. If it failed because of some problem in the web service and you want to wait X amount of time before trying again, then you can either schedule the web service to be called at a later date for that specific Message (look into ScheduledExecutorService) or do as you described and use a cron job with some database entries.
If you only want it to try again once per message, then keep an internal counter either with the Message or within a Map<Message, Integer> as a counter for each Message.
Crudely put, that is the technique, although there could be out-of-the-box solutions available which you can use. Typical ESB solutions support reliable messaging. Have a look at Mule ESB or Apache ActiveMQ as well.
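A small sketch of the per-message retry counter and delayed retry described above; the map keyed by JMSMessageID, the two-attempt limit, and the fixed five-minute delay are illustrative assumptions.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import javax.jms.JMSException;
import javax.jms.Message;

public class RetryingConsumer {

    private static final int MAX_ATTEMPTS = 2;

    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final Map<String, Integer> attempts = new ConcurrentHashMap<>(); // JMSMessageID -> tries so far

    public void consume(Message message) throws JMSException {
        String id = message.getJMSMessageID();
        try {
            callWebService(message);
            attempts.remove(id);
        } catch (Exception webServiceFailure) {
            int tried = attempts.merge(id, 1, Integer::sum);
            if (tried < MAX_ATTEMPTS) {
                // Retry the same message later instead of immediately, e.g. after 5 minutes
                scheduler.schedule(() -> {
                    try { consume(message); } catch (JMSException e) { /* log */ }
                }, 5, TimeUnit.MINUTES);
            } else {
                markFailedForCronJob(message);   // hypothetical: store in a DB for the batch job to retry
            }
        }
    }

    private void callWebService(Message message) { /* ... */ }
    private void markFailedForCronJob(Message message) { /* ... */ }
}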
It might be interesting to take advantage of the EMS platform you already have (example 1) instead of building a custom solution (example 2).
But it all depends on the implementation language:
Example 1 - EMS is the "keeper" : If I were to solve such problem with TIBCO BusinessWorks, I would use the "JMS transaction" feature of BW. By encompassing the EMS read and the WS call within the same "group", you ask for them to be both applied, or not at all. If the call failed for some reason, the message would be returned to EMS.
Two problems with this solution : You might not have BW, and the first failed operation would block all the rest of the batch process (that may be the desired behavior).
FYI, I understand it is possible to use such feature in "pure java", but I never tried it : http://www.javaworld.com/javaworld/jw-02-2002/jw-0315-jms.html
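In "pure Java" the equivalent of that BW group is a locally transacted JMS session, roughly as in the sketch below; the queue name, the connection-factory lookup, and the web-service call are placeholders.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Session;

public class TransactedBridge {

    public void run(ConnectionFactory factory, String queueName) throws Exception {
        Connection connection = factory.createConnection();
        connection.start();
        // Locally transacted session: the receive only becomes final once commit() is called
        Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
        MessageConsumer consumer = session.createConsumer(session.createQueue(queueName));
        while (true) {
            Message message = consumer.receive();
            try {
                callWebService(message);   // the work that must succeed
                session.commit();          // the broker removes the message only now
            } catch (Exception e) {
                session.rollback();        // the message goes back to the queue for redelivery
            }
        }
    }

    private void callWebService(Message message) { /* ... */ }
}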
Example 2 - A DB is the "keeper" : If you go with your "DB" method, your queue/topic customer continuously drops insert data in a DB, and all records represent a task to be executed. This feels an awful lot like the simple "mapping engine" problem every integration middleware aims to make easier. You could solve this with anything from a custom java code and multiples threads (DB inserter, WS job handlers, etc.) to an EAI middleware (like BW) or even a BPM engine (TIBCO has many solutions for that)
Of course, there are also other vendors... EMS is a JMS standard implementation, as you know.
I would recommend using the built-in EMS (and JMS) features, as "guaranteed delivery" is what it's built for ;) - no DB needed at all...
You need to be aware that the first decisions will be:
do you need to deliver in order? (then only 1 JMS Session and client acknowledge mode should be used)
how often, and at what intervals, do you want to retry? (so as not to create an infinite loop for a message that couldn't be processed by that web service)
This is independent of whatever kind of client you use (TIBCO BW or, e.g., Java onMessage() in an MDB).
For "in order" delivery: make shure only 1 JMS Session processes the messages and it uses Client acknolwedge mode. After you process the message sucessfully, you need to acknowledge the message with either calling the JMS API "acknowledge()" method or in TIBCO BW by executing the "commit" activity.
In case of an error you don't execute the acknowledge for the method, so the message will be put back in the Queue for redelivery (you can see how many times it was redelivered in the JMS header).
EMS's Explicit Client Acknolwedge mode also enables you to do the same if order is not important and you need a few client threads to process the message.
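A minimal plain-JMS sketch of that single-session, client-acknowledge pattern; the queue name and the processing call are placeholders, and the recover() call is just one way to trigger redelivery after a failure.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Session;

public class InOrderConsumer {

    public void run(ConnectionFactory factory, String queueName) throws Exception {
        Connection connection = factory.createConnection();
        connection.start();
        // One session, one consumer: preserves queue order; CLIENT_ACKNOWLEDGE leaves the ack to us
        Session session = connection.createSession(false, Session.CLIENT_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(session.createQueue(queueName));
        while (true) {
            Message message = consumer.receive();
            try {
                process(message);          // call the web service, run stored procedures, etc.
                message.acknowledge();     // only now is the message removed from the queue
            } catch (Exception e) {
                // No acknowledge: the message will be redelivered (see JMSXDeliveryCount / redelivery settings)
                session.recover();         // ask the provider to redeliver unacknowledged messages
            }
        }
    }

    private void process(Message message) { /* ... */ }
}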
For controlling how often the message gets processed, use:
the max redelivery property of the EMS queue (e.g. you could put the message in the dead letter queue after x redeliveries so it does not hold up other messages)
the redelivery delay, to put a "pause" between redeliveries. This is useful in case the web service needs to recover after a crash and should not get stormed by the same message again and again at short intervals through redelivery.
Hope that helps
Cheers
Seb

ActiveMQ: How to handle broker failovers while using temporary queues

In my JMS applications we use temporary queues on producers to be able to receive replies back from consumer applications.
I am facing exactly the same issue as mentioned in this thread: http://activemq.2283324.n4.nabble.com/jira-Created-AMQ-3336-Temporary-Destination-errors-on-H-A-failover-in-broker-network-with-Failover-tt-td3551034.html#a3612738
Whenever I restarted an arbitrary broker in my network, I was getting many errors like this in my consumer application log while trying to send a reply to a temporary queue:
javax.jms.InvalidDestinationException:
Cannot publish to a deleted Destination: temp-queue://ID:...
Then I saw Gary's response there suggesting to use
jms.watchTopicAdvisories=false
as a URL param on the client broker URL. I promptly changed my client broker URLs with this additional parameter. However, now I am seeing errors like this when I restart my brokers in the network during this failover testing:
javax.jms.JMSException: The destination temp-queue://ID:client.host-65070-1308610734958-2:1:1 does not exist.
I am using ActiveMQ 5.5 version. And my client broker URL looks like this:
failover:(tcp://amq-host1:61616,tcp://amq-host2.tred.aol.com:61616,tcp://amq-host3:61616,tcp://amq-host4:61616)?jms.useAsyncSend=true&timeout=5000&jms.watchTopicAdvisories=false
Additionally here is my activemq config XML for one of the 4 brokers:
amq1.xml
Can someone here please look into this problem and suggest what mistake I am making in this setup?
Update:
To clarify further on how I am doing request-response in my code:
I already use a per-producer destination (i.e. a temporary queue) and set it in the reply-to header of every message.
I am already sending a per-message unique correlation identifier in the JMSCorrelationID header.
As far as I know, even Camel and Spring use a temporary queue for their request-response mechanism. The only difference is that the Spring JMS implementation creates and destroys a temporary queue for every message, whereas I create a temporary queue for the lifetime of the producer. This temporary queue is destroyed when the client (producer) app shuts down, or by the AMQ broker when it realizes there is no active producer attached to it.
I am already setting a message expiry on each message on the producer side so that a message is not held up in a queue for too long (60 sec).
There is a broker attribute, org.apache.activemq.broker.BrokerService#cacheTempDestinations that should help in the failover: case.
Set that to true in xml configuration, and a temp destination will not be removed immediately when a client disconnects.
A fast failover: reconnect will be able to produce and/or consume from the temp queue again.
There is a timer task based on timeBeforePurgeTempDestinations (default 5 seconds) that handles cache removal.
One caveat though, I don't see any tests in activemq-core that make use of that attribute so I can't give you any guarantee on this one.
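For an embedded broker configured from Java instead of XML, the same attributes can be set programmatically, assuming the BrokerService setters mirror the XML attribute names (the same caveat as above about guarantees applies).

import org.apache.activemq.broker.BrokerService;

public class TempDestinationFriendlyBroker {

    public static void main(String[] args) throws Exception {
        BrokerService broker = new BrokerService();
        broker.setBrokerName("amq-host1");
        broker.addConnector("tcp://0.0.0.0:61616");
        // Keep temp destinations around briefly after a client disconnect, so a fast
        // failover: reconnect can still produce/consume from them
        broker.setCacheTempDestinations(true);
        broker.setTimeBeforePurgeTempDestinations(5000); // purge delay, default 5 seconds
        broker.start();
        broker.waitUntilStopped();
    }
}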
Temporary queues are created on the broker to which the requestor (producer) in your request-reply scenario connects. They are created from a javax.jms.Session, and their lifetime is tied to that client's connection, so when it goes away, whether because of a client disconnect or a broker failure/failover, those queues are permanently gone. None of the other brokers will understand what is meant when one of your consumers attempts to reply to those queues; hence your exception.
This requires an architectural shift in mindset assuming that you want to deal with failover and persist all your messages. Here is a general way that you could attack the problem:
Your reply-to headers should refer to a queue specific to the requestor process: e.g. queue:response.<client id>. The client id might be a standard name if you have a limited number of clients, or a UUID if you have a large number of these.
The outbound message should set a correlation identifier (simply a string that lets you associate a request with a response - requestors, after all, might make more than one request at the same time). This is set in the JMSCorrelationID header, and ought to be copied from the request to the response message.
The requestor needs to set up a listener on that queue that will return the message body to the requesting thread based on that correlation id. There is some multithreading code that needs to be written for this, as you'll need to manually manage something like a map of correlation ids to originating threads (via Futures perhaps).
This is a similar approach to that taken by Apache Camel for request-response over messaging.
One thing to be mindful of is that the queue will not go away when the client does, so you should set a time to live on the response message such that it gets deleted from the broker if it has not been consumed, otherwise you will get a backlog of unconsumed messages. You will also need to set up a dead letter queue strategy to automatically discard expired messages.
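A condensed sketch of those steps on the requestor side; the fixed reply queue, the use of CompletableFuture for the correlation-id map, and the timeout are assumptions made for illustration.

import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import javax.jms.Destination;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;

public class Requestor {

    private final Session sendSession;               // used only by the requesting thread
    private final MessageProducer requestProducer;   // producer on the request queue
    private final Destination replyQueue;            // fixed queue, e.g. "response.<client id>"
    private final Map<String, CompletableFuture<Message>> pending = new ConcurrentHashMap<>();

    // Two sessions, because a JMS Session must not be shared between the sender thread and the listener thread.
    public Requestor(Session sendSession, Session replySession,
                     MessageProducer requestProducer, Destination replyQueue) throws JMSException {
        this.sendSession = sendSession;
        this.requestProducer = requestProducer;
        this.replyQueue = replyQueue;
        replySession.createConsumer(replyQueue).setMessageListener(reply -> {
            try {
                CompletableFuture<Message> waiter = pending.remove(reply.getJMSCorrelationID());
                if (waiter != null) {
                    waiter.complete(reply);
                }
                // else: a late reply after the timeout; its time-to-live will eventually discard it
            } catch (JMSException e) { /* log */ }
        });
    }

    public Message request(String body, long timeoutSeconds) throws Exception {
        TextMessage request = sendSession.createTextMessage(body);
        String correlationId = UUID.randomUUID().toString();
        request.setJMSCorrelationID(correlationId);   // the responder must copy this onto the reply
        request.setJMSReplyTo(replyQueue);
        CompletableFuture<Message> future = new CompletableFuture<>();
        pending.put(correlationId, future);
        requestProducer.send(request);
        try {
            return future.get(timeoutSeconds, TimeUnit.SECONDS);
        } finally {
            pending.remove(correlationId);            // clean up if we timed out
        }
    }
}

The responding side simply reads the JMSReplyTo and JMSCorrelationID from the request and copies the latter onto its reply.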

Multithreaded JMS code : CLIENT_ACKNOWLEDGE or transacted session

Edited Question: I am working on multithreaded JMS receiver and publisher code (a stand-alone multithreaded Java application). The MOM is SonicMQ.
An XML message is received from a queue, stored procedures (which take 70 seconds to execute) are called, and a response is sent to a topic within 90 seconds.
I need to handle the condition where the broker is down or the application is on a scheduled shutdown, i.e. a condition in which messages have been received from the queue and are being processed in Java, and in the meantime both the queue and the topic go down. To handle those messages which are no longer on the queue and not yet sent to the topic but are still in Java memory, I have the following options:
(1) Create a CLIENT_ACKNOWLEDGE session as follows:
connection.createSession(false, javax.jms.Session.CLIENT_ACKNOWLEDGE)
Here I will acknowledge the message only after the successful completion of the transactions (stored procedures).
(2) Use a transacted session, i.e. connection.createSession(true, -1). In this approach, because of some exception in the transaction (stored procedure), the message is rolled back and redelivered. It is rolled back again and again, and this continues until I kill the program. Can I limit the number of redeliveries of JMS messages from the queue?
Also, of the above two approaches, which one is better?
The interface progress.message.jclient.ConnectionFactory has a method setMaxDeliveryCount(java.lang.Integer value) where you can set the maximum number of times a message will be redelivered to your MessageConsumer. When this number of times is up, it will be moved to the SonicMQ.deadMessage queue.
You can check this in the book "Sonic MQ Application Programming Guide" on page 210 (in version 7.6).
As to your question about which is better... that depends on whether the stored procedure minds being executed multiple times. If that is a problem, you should use a transaction that spans the JMS queue and the database both (Sonic has support for XA transactions). If you don't mind executing multiple times, then I would go for not acknowledging the message and aborting the processing when you notice that the broker is down (when you attempt to acknowledge the message, most likely). This way, another processor is able to handle the message if the first one is unable to do so after a connection failure.
If the messages take variable time to process, you may also want to look at the SINGLE_MESSAGE_ACKNOWLEDGE mode of the Sonic JMS Session. Normally, calling acknowledge() on a message also acknowledges all messages that came before it. If you're processing them out of order, that's not what you want to happen. In single message acknowledge mode (which isn't in the JMS standard), acknowledge() only acknowledges the message on which it is called.
If you are worried about communicating with a message queue/broker/server/etc that might be down, and how that interrupts the overall flow of the larger process you are trying to design, then you should probably look into a JMS queue that supports clustering of servers so you can still reliably produce/consume messages when individual servers in the cluster go down.
Your question isn't 100% clear, but it seems the issue is that you're throwing an exception while processing a message when you really shouldn't be.
If there is an actual problem with the message, say the xml is malformed or it's invalid according to your data model, you do not want to roll back your transaction. You might want to log the error, but you have successfully processed that message, it's just that "success" in this case means that you've identified the message as problematic.
On the other hand, if there is a problem in processing the message that is caused by something external to the message (e.g. the database is down, or the destination topic is unavailable) you probably do want to roll the transaction back, however you also want to make sure you stop consuming messages until the problem is resolved otherwise you'll end up with the scenario you've described where you continually process the same message over and over and fail every time you try to access whatever resource is currently unavailable.
Without knowing what messaging provider you are using, I don't know whether this will help you.
MQ Series messages have a backout counter, which can be enabled by configuring the harden backout counter option on the queue.
When I have previously had this problem, I did something like the following:
Message msg = consumer.receive();   // get/receive message from queue
// The MQ backout count is exposed through the standard JMSXDeliveryCount property (first delivery = 1)
int deliveryCount = msg.getIntProperty("JMSXDeliveryCount");
if (deliveryCount > n) {
    moveMessageToAppDeadLetterQueue(msg);
    return;
}
processMessage(msg);
The MQ series header fields are accessible as JMS properties.
Using the above approach would also help if you can use XA transactions to rollback or commit the database and the queue manager simultaneously.
However, XA transactions do incur a significant performance penalty, and with stored procs this probably isn't possible.
An alternative approach would be to write the message immediately to a message_table as a blob, and then commit the message from the queue.
Put a trigger on the message_table to invoke the stored proc, and then add the JMS response mechanism into the stored proc.
