Edited question: I am working on multithreaded JMS receiver and publisher code (a standalone multithreaded Java application). The MOM is SonicMQ.
An XML message is received from a queue, stored procedures (which take 70 sec to execute) are called, and the response is sent to a topic within 90 sec.
I need to handle the case where the broker is down or the application is on a scheduled shutdown, i.e. messages have been received from the queue and are being processed in Java when both the queue and the topic go down. To handle those messages, which are no longer on the queue and not yet sent to the topic but are still in Java memory, I have the following options:
(1) To create CLIENT_ACKNOWLEDGE session as :
connection.createSession(false, javax.jms.Session.CLIENT_ACKNOWLEDGE)
Here I will acknowledge message only after the successful completion of transactions(stored procedures)
(2) To use a transacted session, i.e. connection.createSession(true, -1). In this approach, when an exception occurs in the transaction (stored procedure), the message is rolled back and redelivered. Messages are rolled back again and again until I kill the program. Can I limit the number of redeliveries of JMS messages from the queue?
Also, which of the above two approaches is better?
The interface progress.message.jclient.ConnectionFactory has a method setMaxDeliveryCount(java.lang.Integer value) where you can set the maximum number of times a message will be redelivered to your MessageConsumer. When this number of times is up, it will be moved to the SonicMQ.deadMessage queue.
You can check this in the book "Sonic MQ Application Programming Guide" on page 210 (in version 7.6).
As to your question about which is better... that depends on whether the stored procedure minds being executed multiple times. If that is a problem, you should use a transaction that spans the JMS queue and the database both (Sonic has support for XA transactions). If you don't mind executing multiple times, then I would go for not acknowledging the message and aborting the processing when you notice that the broker is down (when you attempt to acknowledge the message, most likely). This way, another processor is able to handle the message if the first one is unable to do so after a connection failure.
If the messages take variable time to process, you may also want to look at the SINGLE_MESSAGE_ACKNOWLEDGE mode of the Sonic JMS Session. Normally, calling acknowledge() on a message also acknowledges all messages that came before it. If you're processing them out of order, that's not what you want to happen. In single message acknowledge mode (which isn't in the JMS standard), acknowledge() only acknowledges the message on which it is called.
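To make the difference between the two acknowledge semantics concrete, here is a small toy model in plain Java. It is not the Sonic or JMS API: the class AckModel and its methods are hypothetical names invented for this sketch, and it only tracks which delivered messages would remain unacknowledged under each behaviour.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of a session's delivered messages (hypothetical, not a JMS class).
class AckModel {
    // message id -> acknowledged flag, kept in delivery order
    private final Map<String, Boolean> delivered = new LinkedHashMap<>();

    void deliver(String id) {
        delivered.put(id, false);
    }

    // Standard CLIENT_ACKNOWLEDGE behaviour: acknowledging one message
    // also acknowledges every message delivered before it.
    void cumulativeAck(String id) {
        if (!delivered.containsKey(id)) {
            throw new IllegalArgumentException("unknown id: " + id);
        }
        for (Map.Entry<String, Boolean> e : delivered.entrySet()) {
            e.setValue(true);
            if (e.getKey().equals(id)) {
                break;
            }
        }
    }

    // Single-message behaviour: only the named message is acknowledged.
    void singleAck(String id) {
        if (delivered.replace(id, true) == null) {
            throw new IllegalArgumentException("unknown id: " + id);
        }
    }

    // Ids that are still unacknowledged, in delivery order.
    List<String> unacked() {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : delivered.entrySet()) {
            if (!e.getValue()) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

With messages m1, m2, m3 delivered in that order, singleAck("m3") leaves m1 and m2 unacknowledged, while cumulativeAck("m3") leaves nothing unacknowledged, which is the behaviour you want to avoid when processing out of order.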
If you are worried about communicating with a message queue/broker/server/etc that might be down, and how that interrupts the overall flow of the larger process you are trying to design, then you should probably look into a JMS queue that supports clustering of servers so you can still reliably produce/consume messages when individual servers in the cluster go down.
Your question isn't 100% clear, but it seems the issue is that you're throwing an exception while processing a message when you really shouldn't be.
If there is an actual problem with the message, say the xml is malformed or it's invalid according to your data model, you do not want to roll back your transaction. You might want to log the error, but you have successfully processed that message, it's just that "success" in this case means that you've identified the message as problematic.
On the other hand, if the failure is caused by something external to the message (e.g. the database is down, or the destination topic is unavailable), you probably do want to roll the transaction back. However, you also want to stop consuming messages until the problem is resolved; otherwise you'll end up with the scenario you've described, where you process the same message over and over and fail every time you try to access whatever resource is currently unavailable.
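A rough sketch of that distinction in plain Java follows. Everything in it is an assumption made for illustration: InvalidMessageException, the Handler interface and the Action values are hypothetical names, and the caller is expected to translate the returned Action into the real commit, rollback and "stop consuming" calls of its JMS session.

```java
// Hypothetical helper that decides what to do with a message after processing.
class FailureHandling {
    // Assumed marker exception for a message that can never succeed
    // (malformed XML, invalid according to the data model, ...).
    static class InvalidMessageException extends Exception {
        InvalidMessageException(String reason) {
            super(reason);
        }
    }

    enum Action { COMMIT, COMMIT_AFTER_LOGGING, ROLLBACK_AND_PAUSE }

    interface Handler {
        void process(String body) throws Exception;
    }

    static Action handle(String body, Handler handler) {
        try {
            handler.process(body);
            return Action.COMMIT;
        } catch (InvalidMessageException e) {
            // The message itself is bad: redelivery cannot help, so log it
            // and treat the message as handled.
            System.err.println("Discarding bad message: " + e.getMessage());
            return Action.COMMIT_AFTER_LOGGING;
        } catch (Exception e) {
            // Anything else is treated as an external problem: roll back and
            // stop consuming until the resource is available again.
            return Action.ROLLBACK_AND_PAUSE;
        }
    }
}
```

Treating every non-InvalidMessageException failure as transient is a simplification; a real application would probably classify its exceptions more carefully.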
Without knowing which messaging provider you are using, I don't know whether this will help you.
MQ Series messages have a backout counter, that can be enabled by configuring the harden backout counter option on the queue.
When I have had this problem previously, I have done the following:
// get/receive message from queue
if ( backout counter > n ) {
    move_message_to_app_dead_letter_queue();
    return;
}
process_message();
The MQ series header fields are accessible as JMS properties.
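The check above can be simulated end to end in plain Java. This is only a sketch under stated assumptions: Msg, deliveryCount and drain are hypothetical names, the in-memory Deque stands in for the real queue, and putting a message back at the head of the queue stands in for a rollback. With a real provider the count would come from a message property such as JMSXDeliveryCount or the provider's backout counter.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

class BackoutDemo {
    // Minimal stand-in for a queued message with a delivery counter.
    static final class Msg {
        final String body;
        int deliveryCount; // incremented on each delivery, like a backout counter

        Msg(String body) {
            this.body = body;
        }
    }

    // Drains the queue and returns the bodies that were moved to the
    // application's dead-letter list after exceeding maxDeliveries.
    static List<String> drain(Deque<Msg> queue, int maxDeliveries, Predicate<String> processor) {
        List<String> deadLetters = new ArrayList<>();
        while (!queue.isEmpty()) {
            Msg m = queue.poll();
            m.deliveryCount++;
            if (m.deliveryCount > maxDeliveries) {
                deadLetters.add(m.body); // give up on this message
                continue;
            }
            if (!processor.test(m.body)) {
                queue.addFirst(m); // "rollback": the message is redelivered
            }
        }
        return deadLetters;
    }
}
```

With maxDeliveries set to 3, a message that always fails is attempted three times and is moved to the dead-letter list on the fourth delivery, so it no longer blocks the messages behind it.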
Using the above approach would also help if you can use XA transactions to rollback or commit the database and the queue manager simultaneously.
However, XA transactions do incur a significant performance penalty, and with stored procedures this probably isn't possible.
An alternative approach would be to write the message immediately to a message_table as a blob, and then commit the message from the queue.
Put a trigger on the message_table to invoke the stored proc, and then add the JMS response mechanism into the stored proc.
I have below configuration for rabbitmq
prefetchCount:1
ack-mode:auto.
I have one exchange, one queue attached to that exchange, and one consumer attached to that queue. As per my understanding, the steps below happen if the queue has multiple messages.
The queue writes data to a channel.
As ack-mode is auto, as soon as the queue writes a message to the channel, the message is removed from the queue.
The message reaches the consumer, and the consumer starts working on that data.
As the queue has received the acknowledgement for the previous message, it writes the next data to the channel.
Now, my doubt is: suppose the consumer is not finished with the previous data yet. What will happen to the next data the queue has written to the channel?
Also, suppose prefetchCount is 10 and I have just one consumer attached to the queue. Where will these 10 messages reside?
The scenario you have described is one that is mentioned in the documentation for RabbitMQ, and elaborated in this blog post. Specifically, if you set a sufficiently large prefetch count, and have a relatively small publish rate, your RabbitMQ server turns into a fancy network switch. When acknowledgement mode is set to automatic, prefetch limiting is effectively disabled, as there are never unacknowledged messages. With automatic acknowledgement, the message is acknowledged as soon as it is delivered. This is the same as having an arbitrarily large prefetch count.
With prefetch >1, the messages are stored within a buffer in the client library. The exact data structure will depend upon the client library used, but to my knowledge, all implementations store the messages in RAM. Further, with automatic acknowledgements, you have no way of knowing when a specific consumer actually read and processed a message.
So, there are a few takeaways here:
Prefetch limit is irrelevant with automatic acknowledgements, as there are never any unacknowledged messages, thus
Automatic acknowledgements don't make much sense when using a consumer
Sufficiently-large prefetch when auto-ack is off, or any use of autoack = on will result in the message broker not doing any queuing, and instead doing routing only.
Now, here's a little bit of expert opinion. I find the whole notion of a message broker that "pushes" messages out to be a little backwards, and for this very reason- it's difficult to configure properly, and it is unclear what the benefit is. A queue system is a natural fit for a pull-based system. The processor can ask the broker for the next message when it is done processing the current message. This approach will ensure that load is balanced naturally and the messages don't get lost when processors disconnect or get knocked out.
Therefore, my recommendation is to drop the use of consumers altogether and switch over to using basic.get.
I am trying to understand the best use of RabbitMQ to satisfy the following problem.
As context I'm not concerned with performance in this use case (my peak TPS for this flow is 2 TPS) but I am concerned about resilience.
I have RabbitMQ installed in a cluster and, ignoring dead letter queues, the basic flow is this: a service receives a request and creates a persistent message, which it queues, in a transaction, to a durable queue (at this point I'm happy the request is secured to disk). Another process then listens for the message, reads it (not using auto ack), does a bunch of stuff, and writes a new message to a different exchange queue in a transaction (again, I'm now happy this message is secured to disk). Assuming the transaction completes successfully, it manually acks the message from the original queue.
At this point my only failure scenario is a failure between the commit of the transaction that writes to my second queue and the return of the ack. This could lead to a message being processed twice. Is there anything else I can do to plug this gap, or do I have to figure out a way of handling duplicate messages?
As a final bit of context the services are written in java so using the java client libs.
Paul Fitz.
First of all, I suggest you look at this guide here, which has a lot of valid information on your topic.
From the RabbitMQ guide:
At the Producer
When using confirms, producers recovering from a channel or connection
failure should retransmit any messages for which an acknowledgement
has not been received from the broker. There is a possibility of
message duplication here, because the broker might have sent a
confirmation that never reached the producer (due to network failures,
etc). Therefore consumer applications will need to perform
deduplication or handle incoming messages in an idempotent manner.
At the Consumer
In the event of network failure (or a node crashing), messages can be
duplicated, and consumers must be prepared to handle them. If
possible, the simplest way to handle this is to ensure that your
consumers handle messages in an idempotent way rather than explicitly
deal with deduplication.
So, the point is that it is not possible in any way to guarantee that this "failure" scenario of yours will not happen. You will always have to deal with network failure, disk failure, put-something-here failure, etc.
What you have to do here is lean on the messaging architecture and, if possible, implement "idempotency" for your messages (which means that nothing goes wrong even if you process the same message twice; check this).
If you can't, then you should keep some kind of "processed message" list (for example, you can put a GUID inside every message) and check this list every time you receive a message; messages already on the list can simply be discarded.
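A minimal sketch of such a "processed message" list in plain Java is shown below. The class name DedupConsumer is hypothetical, and the in-memory set is an assumption made to keep the example short: it is lost on restart, so a real service would more likely store the ids in the same database transaction as the work itself.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class DedupConsumer {
    // Ids (e.g. a GUID carried in each message) that were already handled.
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();

    // Runs the work only for ids not seen before.
    // Returns true if the work ran, false if the message was a duplicate.
    boolean handle(String messageId, Runnable work) {
        if (!processedIds.add(messageId)) {
            return false; // duplicate delivery: discard
        }
        try {
            work.run();
            return true;
        } catch (RuntimeException e) {
            // The work failed, so allow a later redelivery to try again.
            processedIds.remove(messageId);
            throw e;
        }
    }
}
```

Calling handle twice with the same id runs the work once; the second call returns false, and the caller can then ack the duplicate without doing anything else.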
To be more "theoretical", this post from Brave New Geek is very interesting:
Within the context of a distributed system, you cannot have
exactly-once message delivery.
Hope it helps :)
I am trying to solve the following case:
I am consuming messages, but take an outage in a system I am depending on for proper message processing (say a Database for example)
I am using CLIENT_ACKNOWLEDGE, and only calling the .acknowledge() method when no exception is thrown.
This works fine when I throw an exception, messages are not acknowledged, and I can see the unacknowledged queue building up. However, these messages have all already been delivered to the consumer.
Suppose now the Database comes back online, and any new message is processed successfully. So I call .acknowledge on them. I read that calling .acknowledge() acknowledges not only that message, but also all previously received messages in the consumer.
This is not what I want! I need these previously unacknowledged messages to be redelivered / retried. I would like to keep them on the queue and let JMS handle the retry, since maintaining a Collection in the consumer of "messages to be retried" might put at risk losing those messages ( since .acknowledge already ack'ed all of them + say the hardware failed).
Is there a way to explicitly acknowledge specific messages and not have this "acknowledge all prior messages" behavior?
Acknowledging a specific message is not defined by the JMS specification. Hence some JMS implementations provide per-message acknowledgement and some don't. You will need to check your JMS provider's documentation.
Message queues generally have an option for how messages are delivered to a client: either first in, first out (FIFO) or priority based. Choose the FIFO option so that all messages are delivered in the same order they came into the queue. When the database goes offline and comes back, call the recover method to have all unacknowledged messages redelivered in the same order.
You need to call recover on your session after the failure to restart message delivery from the first unacked message. From the JMS 1.1 spec, section 4.4.11:
When CLIENT_ACKNOWLEDGE mode is used, a client may build up a large
number of unacknowledged messages while attempting to process them. A
JMS provider should provide administrators with a way to limit client
over-run so that clients are not driven to resource exhaustion and
ensuing failure when some resource they are using is temporarily
blocked.
A session’s recover method is used to stop a session and restart it
with its first unacknowledged message. In effect, the session’s series
of delivered messages is reset to the point after its last
acknowledged message. The messages it now delivers may be different
from those that were originally delivered due to message expiration
and the arrival of higher-priority messages.
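The effect of recover described in the quoted section can be pictured with a tiny plain-Java model. RecoverModel is a hypothetical class, not part of JMS, and it deliberately ignores expiration and priorities, which the spec says may change what is redelivered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of one CLIENT_ACKNOWLEDGE session reading from one queue.
class RecoverModel {
    private final List<String> queue;
    private int next = 0;         // index of the next message to deliver
    private int firstUnacked = 0; // index of the first unacknowledged message

    RecoverModel(String... messages) {
        this.queue = new ArrayList<>(Arrays.asList(messages));
    }

    // Delivers the next message, or null when nothing is left.
    String receive() {
        return next < queue.size() ? queue.get(next++) : null;
    }

    // Acknowledges everything delivered so far.
    void acknowledge() {
        firstUnacked = next;
    }

    // Restarts delivery from the first unacknowledged message.
    void recover() {
        next = firstUnacked;
    }
}
```

For example, after receiving and acknowledging "a", then receiving "b" and "c" while the database is down, calling recover() makes the next receive() return "b" again.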
Let me try explaining the situation:
There is a messaging system that we are going to incorporate which could either be a Queue or Topic (JMS terms).
1 ) Producer/Publisher : There is a service A. A produces messages and writes to a Queue/Topic
2 ) Consumer/Subscriber : There is a service B. B asynchronously reads messages from Queue/Topic. B then calls a web service and passes the message to it. The webservice takes significant amount of time to process the message. (This action need not be processed real-time.)
The Message Broker is Tibco
My intention is : Not to miss out processing any message from A. Re-process it at a later point in time in case the processing failed for the first time (perhaps as a batch).
Question:
I was thinking of writing the message to a DB before making a webservice call. If the call succeeds, I would mark the message processed. Otherwise failed. Later, in a cron job, I would process all the requests that had initially failed.
Is writing to a DB a typical way of doing this?
Since you have a fail callback, you can just requeue your Message and have your Consumer/Subscriber pick it up and try again. If it failed because of some problem in the web service and you want to wait X time before trying again, then you can either schedule the web service call for that specific Message at a later time (look into ScheduledExecutorService) or do as you described and use a cron job with some database entries.
If you only want it to try again once per message, then keep an internal counter either with the Message or within a Map<Message, Integer> as a counter for each Message.
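A sketch of the counter-plus-scheduler idea follows. RetryScheduler and its method names are hypothetical, and the counter is keyed by a message id string rather than by the Message object itself, which is an assumption made so the map does not depend on how the provider implements equals and hashCode.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class RetryScheduler {
    private final Map<String, Integer> attempts = new ConcurrentHashMap<>();
    private final int maxAttempts;
    private final ScheduledExecutorService scheduler;

    RetryScheduler(int maxAttempts, ScheduledExecutorService scheduler) {
        this.maxAttempts = maxAttempts;
        this.scheduler = scheduler;
    }

    // Records one more attempt for the message; true while the budget lasts.
    boolean registerAttempt(String messageId) {
        return attempts.merge(messageId, 1, Integer::sum) <= maxAttempts;
    }

    // Schedules the web service call after the given delay, or returns null
    // when the message has used up its attempts.
    ScheduledFuture<?> retryLater(String messageId, Runnable call, long delay, TimeUnit unit) {
        if (!registerAttempt(messageId)) {
            return null;
        }
        return scheduler.schedule(call, delay, unit);
    }
}
```

A caller could create it with Executors.newSingleThreadScheduledExecutor() and, on failure, call retryLater(id, task, 30, TimeUnit.SECONDS); a null result means the message should go to the failed pile (the database table or cron job mentioned above).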
Crudely put that is the technique, although there could be out-of-the-box solutions available which you can use. Typical ESB solutions support reliable messaging. Have a look at MuleESB or Apache ActiveMQ as well.
It might be interesting to take advantage of the EMS platform you already have (example 1) instead of building a custom solution (example 2).
But it all depends on the implementation language:
Example 1 - EMS is the "keeper" : If I were to solve such problem with TIBCO BusinessWorks, I would use the "JMS transaction" feature of BW. By encompassing the EMS read and the WS call within the same "group", you ask for them to be both applied, or not at all. If the call failed for some reason, the message would be returned to EMS.
Two problems with this solution : You might not have BW, and the first failed operation would block all the rest of the batch process (that may be the desired behavior).
FYI, I understand it is possible to use such feature in "pure java", but I never tried it : http://www.javaworld.com/javaworld/jw-02-2002/jw-0315-jms.html
Example 2 - A DB is the "keeper" : If you go with your "DB" method, your queue/topic consumer continuously inserts data into a DB, and each record represents a task to be executed. This feels an awful lot like the simple "mapping engine" problem every integration middleware aims to make easier. You could solve this with anything from custom Java code and multiple threads (DB inserter, WS job handlers, etc.) to an EAI middleware (like BW) or even a BPM engine (TIBCO has many solutions for that).
Of course, there are also other vendors... EMS is a JMS standard implementation, as you know.
I would recommend using the built in EMS (& JMS) features,as "guaranteed delivery" is what it's built for ;) - no db needed at all...
You need to be aware that the first decision will be:
do you need to deliver in order? (then only 1 JMS Session and Client Ack mode should be used)
how often and at what intervals do you want to retry? (So that a message the web service cannot process does not loop forever.)
This is independent whatever kind of client you use (TIBCO BW or e.g. Java onMessage() in a MDB).
For "in order" delivery: make sure only 1 JMS Session processes the messages and that it uses Client Acknowledge mode. After you process the message successfully, you need to acknowledge it, either by calling the JMS API "acknowledge()" method or, in TIBCO BW, by executing the "commit" activity.
In case of an error you don't execute the acknowledge for the message, so the message will be put back in the Queue for redelivery (you can see how many times it was redelivered in the JMS header).
EMS's Explicit Client Acknowledge mode also enables you to do the same if order is not important and you need a few client threads to process the messages.
To control how often the message gets processed, use:
max redelivery properties of the EMS queue (e.g. you could put the message in the dead letter queue after x redeliveries so that it does not hold up other messages)
redelivery delay, to put a "pause" between redeliveries. This is useful when the Web Service needs to recover after a crash and should not be stormed by the same message again and again at short intervals through redelivery.
Hope that helps
Cheers
Seb
I want to send a batch of 20k JMS messages to a same queue. I'm splitting the task up using 10 threads, so each will be processing 2k messages. I don't need transactions.
I was wondering if having one connection, one session, and 10 producers is the recommended way to go or not?
How about if I had one producer shared by all the threads? Would my messages be corrupt or would it be sent out synchronized (giving no performance gain)?
What's the general guideline of deciding whether to create a new connection or session if I'm always connecting to the same queue?
Thank you and sorry for asking a lot at once.
(Here's a similar question, but it didn't quite answer what I was looking for. Long lived JMS sessions. Is Keeping JMS connections / JMS sessions allways open a bad pratice? )
Is it OK if some of the messages are duplicated or lost? When the JMS client connects to the JMS broker over the network there are three phases to any API call.
The API call, including any message data, is transmitted over the wire to the broker.
The API call is executed by the broker.
The result code and any message data is transmitted back to the client.
Consider the producer for a minute. If the connection is broken in the first step then the broker never got the message and the app would need to send it again. If the connection is broken in the third step then the message has been successfully sent and sending it again would produce a duplicate message. The app cannot tell the difference between these and so the only safe choice is to resend the message on error. If the session is transacted the message can be safely resent in all cases because if the original had made it to the broker, it will be rolled back.
Consider the consumer. If the connection is lost in the third step then the message is deleted from the queue but never made it back to the client. But if the session is transacted the message will be redelivered when the application reconnects.
Outside of transactions there is the possibility of lost or duplicate messages. Inside of a transaction the same window of ambiguity exists but it is on the COMMIT call rather than the PUT or GET. With transacted sessions it is possible to send or receive a message twice but not to lose one.
The JMS spec recognizes this window of ambiguity and provides the following guidance:
If a failure occurs between
the time a client commits its work on
a Session and the commit method
returns, the client cannot determine
if the transaction was committed or
rolled back. The same ambiguity exists
when a failure occurs between the
non-transactional send of a PERSISTENT
message and the return from the
sending method.
It is up to a JMS application to deal
with this ambiguity. In some cases,
this may cause a client to produce
functionally duplicate messages.
A message that is redelivered due to
session recovery is not considered a
duplicate message.
JMS sessions should always be transacted except for cases where it really is OK to lose messages. If the sessions are transacted then you'd need session and connection per-thread due to the JMS thread model.
Any advice about performance impacts would be vendor-specific but in general persistent messages outside of syncpoint are hardened to disk before the API call returns. But a transacted call can return before the persistent message is written to disk so long as the message is persisted before the COMMIT returns. If the vendor optimizes based on this, then it is much more performant to write several messages to disk and then commit them in batches. This allows the broker to optimize writes and disk flushes by disk block rather than per-message. The number of messages to put in the transaction decreases with the size of the message and beyond a certain message size dwindles back down to one.
If your 20k messages are relatively small (measured in k and not mb) then you probably want to use transacted sessions per thread and tune the commit interval.
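The commit-interval idea can be sketched without any vendor API. In the plain-Java example below, TxSender is a hypothetical stand-in for a transacted JMS session and its producer, and commitInterval is the tuning knob; with a real session, send and commit would map to producer.send(...) and session.commit().

```java
import java.util.List;

class BatchCommit {
    // Hypothetical stand-in for a transacted session plus its producer.
    interface TxSender {
        void send(String body);
        void commit();
    }

    // Sends every message, committing after each full batch and once more
    // for a partial final batch. Returns the number of commits issued.
    static int sendInBatches(List<String> messages, TxSender sender, int commitInterval) {
        if (commitInterval < 1) {
            throw new IllegalArgumentException("commitInterval must be >= 1");
        }
        int commits = 0;
        int pending = 0;
        for (String m : messages) {
            sender.send(m);
            if (++pending == commitInterval) {
                sender.commit();
                commits++;
                pending = 0;
            }
        }
        if (pending > 0) {
            sender.commit(); // flush the last, smaller batch
            commits++;
        }
        return commits;
    }
}
```

Each of the 10 threads would call this with its own sender and its own 2k messages; with an interval of 100, that is 20 commits per thread instead of 2,000 individually hardened sends. The right interval depends on message size and on the broker, so it needs measuring.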
In most scenarios it is sufficient to work with one connection and multiple sessions, using one session per thread. In some environments you can gain additional performance by using multiple connections:
Some messaging systems support a cluster mode, where connections get load-balanced to different nodes. With multiple connections you can use the performance of multiple nodes in this scenario (which of course only helps when the bottleneck is on the side of the message broker).
The best solution would be to use a pool of connections, and give the administrator some options to configure the behaviour in the specific area.
I was wondering if having one connection, one session, and 10 producers
is the recommended way to go or not?
Sure, but note that you are then using a single thread only, i.e. the one in which you created the Session object. All 10 producers are bound to this Session object and consequently to the same thread.
How about if I had one producer shared by all the threads? Would my messages
be corrupt or would it be sent out synchronized (giving no performance gain)?
Very bad idea I would say. JMS specs clearly say Session should not be shared by more than one thread. It is not thread safe.
What's the general guideline of deciding whether to create a new connection
or session if I'm always connecting to the same queue?
If your system supports multithreading, then you can create multiple sessions (each session corresponding to a single thread) from a single connection. Each session can then have multiple producers/consumers, but none of these may be shared among threads.
From what I found investigating this topic, one session means one thread. This is based on the JMS spec. If you want multithreading (multiple producers/consumers), multiple sessions need to be created; one connection is fine.
In theory Connections are thread-safe but all the others are not, so you should create one session per thread.
In reality, it depends on the JMS implementation you are using.
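As a sketch of the "one connection, one session per thread" layout, the plain-Java example below splits the messages across worker threads and has each worker create its own sender. The names PerThreadSessions and sessionFactory are hypothetical; with real JMS the factory would call connection.createSession(...) and session.createProducer(queue) on the worker thread and return something that wraps producer.send(...).

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.function.Supplier;

class PerThreadSessions {
    // Splits the messages into one slice per thread. Each worker obtains its
    // own sender from the factory, so no session is shared between threads.
    // Returns the number of messages handed to senders.
    static int sendAll(List<String> messages, int threads,
                       Supplier<Consumer<String>> sessionFactory) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger sent = new AtomicInteger();
        int chunk = (messages.size() + threads - 1) / threads;
        for (int t = 0; t < threads; t++) {
            int from = Math.min(t * chunk, messages.size());
            int to = Math.min(from + chunk, messages.size());
            List<String> slice = messages.subList(from, to);
            pool.submit(() -> {
                Consumer<String> sender = sessionFactory.get(); // created on the worker thread
                for (String m : slice) {
                    sender.accept(m);
                    sent.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("timed out waiting for senders");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sending", e);
        }
        return sent.get();
    }
}
```

For 20k messages and 10 threads this creates 10 senders, each handling 2k messages. Error handling is left out: an exception thrown by a sender ends that worker silently here, which a real application would need to surface.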