JMS - Asynchronous Processing - Dealing with Parent / Child processes Dependencies - java

Issue: I have a single process request start that breaks up into multiple levels of queues / MDBs to speed up processing through parallelism. The question is, what is the best way to know when each level of processing is complete to do a close out process? Keep in mind I am dealing with high volumes of messages so performance has to be taken into consideration.
Technology Stack:
EJB 3.0 MDB's
Hibernate 4.2.11 ()
Spring 4.0.1
Websphere MQ's
Oracle 11g database.
Solution 1 Attempted: Parent process polls sub process until its complete at each sub-level. This means an open MDB session that will consistently poll a response queue for messages at each level for it's sub process to complete.
Advantage: You avoid calls constantly to the database to determine, "Am I done yet" .
Disadvantage:
This solution maintains consistent open input connections to MQ while it waits and polls for the process to complete. Number of connections go up as you scale.
If any messages are lost, it will throw off the count of the polling mechanism. Not very reliable.
If you have persistence turned on ( which you should most times ) it will re-process the initial request since it will still be open, re-doing the entire request.
Solution 2 Proposal:
Instead of using polling mechanism, use more MDB's binded to response queues at each level of processing. Have everything work in isolation.
How to determine if the process is complete? As each message comes to response MDB , it can check the database status table to determine if its complete.
Advantage:
List item
All messages work in isolation.
Better suited for supporting high availability and persistent messages.
Prevents any long running open processes against the MQ.
Disadvantage: This could mean many calls to the database for determining completeness. I think this would be a major scalability issue as the number of messages go up.
I haven't used Spring batch and Spring Integration a lot but maybe that is where I should be looking for solution. Hoping someone with a lot of experience in message flow's with MDB's and MQ can give me some direction in terms of scaling / determining when a process is complete.

A different solution would be to use the JMS bus for the children to notify the parent process when their sub task is complete. The architecture would look like this.
Parent has a list of child tasks (with ids) that need to be completed.
A new topic in the MQ where the child can post completion status.
A new MDB that is bound to the completion status topic that updates the parent (for e.g. remove the completed tasks from the list).
When the completion status for all the sub tasks is received, the parent is done.
This way, there is no additional load due to polling on the children (solution 1) or the database (solution 2).
This would also allow you to extend the architecture to introduce more sophisticated control mechanisms. For e.g. you can monitor the progress of the tasks / make sure none of them crashed by posting periodic progress messages from children to the status topic.

Related

Queueing a message in JMS, for delayed processing

I have a piece of middleware that sits between two JMS queues. From one it reads, processes some data into the database, and writes to the other.
Here is a small diagram to depict the design:
With that in mind, I have some interesting logic that I would like to integrate into the service.
Scenario 1: Say the middleware service receives a message from Queue 1, and hits the database to store portions of that message. If all goes well, it constructs a new message with some data, and writes it to Queue 2.
Scenario 2: Say that the database complains about something, when the service attempts to perform some logic after getting a message from Queue 1.In this case, instead of writing a message to Queue 2, I would re-try to perform the database functionality in incremental timeouts. i.e Try again in 5 sec., then 30 sec, then 1 minute if still down. The catch of course, is to be able to read other messages independently of this re-try. i.e Re-try to process this one request, while listening for other requests.
With that in mind, what is both the correct and most modern way to construct a future proof solution?
After reading some posts on the net, it seems that I have several options.
One, I could spin off a new thread once a new message is received, so that I can both perform the "re-try" functionality and listen to new requests.
Two, I could possibly send the message back to the Queue, with a delay. i.e If the process failed to execute in the db, write the message to the JMS queue by adding some amount of delay to it.
I am more fond of the first solution, however, I wanted to get the opinion of the community if there is a newer/better way to solve for this functionality in java 7. Is there something built into JMS to support this sort of "send message back for reprocessing at a specific time"?
JMS 2.0 specification describes the concept of delayed delivery of messages. See "What's new" section of https://java.net/projects/jms-spec/pages/JMS20FinalReleaseMany JMS providers have implemented the delayed delivery feature.
But I wonder how the delayed delivery will help your scenario. Since the database writes have issues, subsequent messages processing and attempt to write to database might end up in same situation. I guess it might be better to sort out issues with database updates and then pickup messages from queue.

How to effectively process lot of objects on a list on server side

I have a List which contains a lot of objects.
The problem is that i have to process these objects (process includes cloning, deep copy, and making DB calls, running business logic etc etc.
Doing this in a normal fashion, first come first serve is really time consuming and in a web application , this generally results in transaction timeouts at the server side (as this processing is anync from client perspective).
How do i process those objects so as to take minimal time and not overload the DB.
I'm using java 7 on server environment.
I'm already using a messaging solution , rabbitmq, which gets me the item and its quantity. problem occurs when i try to deep copy items to mimic real items (business logic every item should be uniquely processed) and save them to DB.
After some discussions, the viable solution is using a ABQ (array blocking queues) which is processed by a pool of threads.
Following are the thought out benefits:
1) we wont have to manage the 3rd party queues created e.g. rabbitmq
2) At any point in time the blocking queue wont have all the items to be processed as the consumer threads will be simultaneously processing them, so it will leave lesser memory footprint.
#cody123 i'm using spring batch for retry mechanisms in this case.
After another round of profiling i found that the bottle neck was the DB connection pool having low number of max connections.
I deduced this by running the same transaction without db thread pool and it went perfectly well and completed without any exception.
However combining the previous approach i.e. managing an ABQ and light commits with HA DB will be the best solution.

Handling Failed calls on the Consumer end (in a Producer/Consumer Model)

Let me try explaining the situation:
There is a messaging system that we are going to incorporate which could either be a Queue or Topic (JMS terms).
1 ) Producer/Publisher : There is a service A. A produces messages and writes to a Queue/Topic
2 ) Consumer/Subscriber : There is a service B. B asynchronously reads messages from Queue/Topic. B then calls a web service and passes the message to it. The webservice takes significant amount of time to process the message. (This action need not be processed real-time.)
The Message Broker is Tibco
My intention is : Not to miss out processing any message from A. Re-process it at a later point in time in case the processing failed for the first time (perhaps as a batch).
Question:
I was thinking of writing the message to a DB before making a webservice call. If the call succeeds, I would mark the message processed. Otherwise failed. Later, in a cron job, I would process all the requests that had initially failed.
Is writing to a DB a typical way of doing this?
Since you have a fail callback, you can just requeue your Message and have your Consumer/Subscriber pick it up and try again. If it failed because of some problem in the web service and you want to wait X time before trying again then you can do either schedule for the web service to be called at a later date for that specific Message (look into ScheduledExecutorService) or do as you described and use a cron job with some database entries.
If you only want it to try again once per message, then keep an internal counter either with the Message or within a Map<Message, Integer> as a counter for each Message.
Crudely put that is the technique, although there could be out-of-the-box solutions available which you can use. Typical ESB solutions support reliable messaging. Have a look at MuleESB or Apache ActiveMQ as well.
It might be interesting to take advantage of the EMS platform your already have (example 1) instead of building a custom solution (example 2).
But it all depends on the implementation language:
Example 1 - EMS is the "keeper" : If I were to solve such problem with TIBCO BusinessWorks, I would use the "JMS transaction" feature of BW. By encompassing the EMS read and the WS call within the same "group", you ask for them to be both applied, or not at all. If the call failed for some reason, the message would be returned to EMS.
Two problems with this solution : You might not have BW, and the first failed operation would block all the rest of the batch process (that may be the desired behavior).
FYI, I understand it is possible to use such feature in "pure java", but I never tried it : http://www.javaworld.com/javaworld/jw-02-2002/jw-0315-jms.html
Example 2 - A DB is the "keeper" : If you go with your "DB" method, your queue/topic customer continuously drops insert data in a DB, and all records represent a task to be executed. This feels an awful lot like the simple "mapping engine" problem every integration middleware aims to make easier. You could solve this with anything from a custom java code and multiples threads (DB inserter, WS job handlers, etc.) to an EAI middleware (like BW) or even a BPM engine (TIBCO has many solutions for that)
Of course, there are also other vendors... EMS is a JMS standard implementation, as you know.
I would recommend using the built in EMS (& JMS) features,as "guaranteed delivery" is what it's built for ;) - no db needed at all...
You need to be aware that the first decision will be:
do you need to deliver in order? (then only 1 JMS Session and Client Ack mode should be used)
how often and in what reoccuring times do you want to retry? (To not make an infinite loop of a message that couldn't be processed by that web service).
This is independent whatever kind of client you use (TIBCO BW or e.g. Java onMessage() in a MDB).
For "in order" delivery: make shure only 1 JMS Session processes the messages and it uses Client acknolwedge mode. After you process the message sucessfully, you need to acknowledge the message with either calling the JMS API "acknowledge()" method or in TIBCO BW by executing the "commit" activity.
In case of an error you don't execute the acknowledge for the method, so the message will be put back in the Queue for redelivery (you can see how many times it was redelivered in the JMS header).
EMS's Explicit Client Acknolwedge mode also enables you to do the same if order is not important and you need a few client threads to process the message.
For controlling how often the message get's processed use:
max redelivery properties of the EMS queue (e.g. you could put the message in the dead
letter queue afer x redelivery to not hold up other messages)
redelivery delay to put a "pause" in between redelivery. This is useful in case the
Web Service needs to recover after a crash and not gets stormed by the same message again and again in high intervall through redelivery.
Hope that helps
Cheers
Seb

How can I unregister from a JMS message group without killing the entire consumer (ActiveMQ)

In my distributed application, I am dispatching processing requests to a JMS queue. I have multiple nodes consuming from that queue (load balancing). Processing the requests requires a rather large chunk of user-specific data to be loaded into memory and I obviously want to keep that data in memory for subsequent requests. Thus, I'm using JMSXGroupId with the user-id to make sure, that all requests for a specific user are handled by the node that already has the data cached.
After some time, when the user is no longer active, I want to unload the data on the node. At that same time, I would like that node to give up ownership of the associated JMS message group.
I know I give up ownership of the group by shutting down the corresponding consumer. However, that would mean I'd lose ownership of all groups associated with that consumer and not just the one for which I just unloaded the cached data.
Is there a way of giving up ownership of a specific group on the consumer side?
A broker-independant way would be preferable but I'd settle for an ActiveMQ specific solution if that is the only way. Also, feel free to suggest how this might be done with your favorite message broker.
You can't do this right now without closing the consumer.
Consumer != connection btw - so why aren't you using multiple consumers per connection - one per group ?

Multithread-safe JDBC Save or Update

We have a JMS queue of job statuses, and two identical processes pulling from the queue to persist the statuses via JDBC. When a job status is pulled from the queue, the database is checked to see if there is already a row for the job. If so, the existing row is updated with new status. If not, a row is created for this initial status.
What we are seeing is that a small percentage of new jobs are being added to the database twice. We are pretty sure this is because the job's initial status is quickly followed by a status update - one process gets one, another process the other. Both processes check to see if the job is new, and since it has not been recorded yet, both create a record for it.
So, my question is, how would you go about preventing this in a vendor-neutral way? Can it be done without locking the entire table?
EDIT: For those saying the "architecture" is unsound - I agree, but am not at liberty to change it.
Create a unique constraint on JOB_ID, and retry to persist the status in the event of a constraint violation exception.
That being said, I think your architecture is unsound: If two processes are pulling messages from the queue, it is not guaranteed they will write them to the database in queue order: one consumer might be a bit slower, a packet might be dropped, ..., causing the other consumer to persist the later messages first, causing them to be overridden with the earlier state.
One way to guard against that is to include sequence numbers in the messages, update the row only if the sequence number is as expected, and delay the update otherwise (this is vulnerable to lost messages, though ...).
Of course, the easiest way would be to have only one consumer ...
JDBC connections are not thread safe, so there's nothing to be done about that.
"...two identical processes pulling from the queue to persist the statuses via JDBC..."
I don't understand this at all. Why two identical processes? Wouldn't it be better to have a pool of message queue listeners, each of which would handle messages landing on the queue? Each listener would have its own thread; each one would be its own transaction. A Java EE app server allows you to configure the size of the message listener pool to match the load.
I think a design that duplicates a process like this is asking for trouble.
You could also change the isolation level on the JDBC connection. If you make it SERIALIZABLE you'll ensure ACID at the price of slower performance.
Since it's an asynchronous process, performance will only be an issue if you find that the listeners can't keep up with the messages landing on the queue. If that's the case, you can try increasing the size of the listener pool until you have adequate capacity to process the incoming messages.

Categories