JMS with consumer performing a long-running process (10 mins) - Java

I have a scenario in a Spring Boot application: requests arrive from other applications via a REST "Run Process" service and are placed on MQ queues. Consumers then process them one by one, and each consumer calls another REST service, "Initiate Request", which takes around 10 minutes to return a response. I am looking for ideas/solutions where I can fire the "Initiate Request" call, forget it, and then stop the consumer. Once "Initiate Request" completes its processing, the system will send an event notification indicating that the process has completed or failed, and based on that I would like to proceed with the next queue item. Is there a way to stop and start consumers based on the notification, to avoid long-running threads? If you have come across this problem, let me know how you resolved it.
There are other solutions, like:
Consumers persist the data to a database and process it row by row.
Using WebFlux we can avoid blocking the JMS consumer thread, but not the thread on which the consumer calls REST.
A sketch of the stop/start approach follows below.
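One way to implement the stop/start is Spring JMS's JmsListenerEndpointRegistry, which can stop and start a listener container by its id. Below is a minimal sketch under that assumption; the listener id, queue name, and the initiateRequest/onProcessCompleted methods are hypothetical:

    import org.springframework.jms.annotation.JmsListener;
    import org.springframework.jms.config.JmsListenerEndpointRegistry;
    import org.springframework.stereotype.Component;

    @Component
    public class RunProcessConsumer {

        private static final String LISTENER_ID = "runProcessListener"; // hypothetical id

        private final JmsListenerEndpointRegistry registry;

        public RunProcessConsumer(JmsListenerEndpointRegistry registry) {
            this.registry = registry;
        }

        @JmsListener(id = LISTENER_ID, destination = "run-process-queue") // hypothetical queue
        public void onMessage(String payload) {
            initiateRequest(payload); // fire the 10-minute REST call and forget it
            // Pause consumption so no further queue items are picked up.
            // Depending on the container, stopping from the listener thread
            // may need to be offloaded to another thread.
            registry.getListenerContainer(LISTENER_ID).stop();
        }

        // Call this from whatever receives the completed/failed event notification.
        public void onProcessCompleted() {
            registry.getListenerContainer(LISTENER_ID).start(); // resume with the next item
        }

        private void initiateRequest(String payload) {
            // asynchronous call to the "Initiate Request" REST service goes here
        }
    }
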

Related

How to create an infinite receiver/listener in Azure Service Bus - Java Spring Boot

The requirement is to create a background listener process that will receive and process messages from Service Bus subscriptions.
I have searched many resources but couldn't find a solution. The messages should not be received in DELETE mode but rather in PEEK_LOCK mode, so that we can abandon a message whenever an error occurs in the processing logic and it will be re-queued.
Service Bus tier: Standard.
Thank you.
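For what it's worth, a minimal sketch of such a listener with the azure-messaging-servicebus SDK's ServiceBusProcessorClient, which keeps receiving in the background in PEEK_LOCK mode; the connection string environment variable, topic, and subscription names are placeholders:

    import com.azure.messaging.servicebus.ServiceBusClientBuilder;
    import com.azure.messaging.servicebus.ServiceBusErrorContext;
    import com.azure.messaging.servicebus.ServiceBusProcessorClient;
    import com.azure.messaging.servicebus.ServiceBusReceivedMessageContext;
    import com.azure.messaging.servicebus.models.ServiceBusReceiveMode;

    public class SubscriptionListener {

        public static void main(String[] args) {
            ServiceBusProcessorClient processor = new ServiceBusClientBuilder()
                .connectionString(System.getenv("SERVICEBUS_CONNECTION")) // assumed env var
                .processor()
                .topicName("my-topic")                         // hypothetical names
                .subscriptionName("my-subscription")
                .receiveMode(ServiceBusReceiveMode.PEEK_LOCK)  // lock, don't delete
                .disableAutoComplete()                         // settle messages explicitly
                .processMessage(SubscriptionListener::onMessage)
                .processError(SubscriptionListener::onError)
                .buildProcessorClient();
            processor.start(); // receives on background threads until stop()/close()
        }

        private static void onMessage(ServiceBusReceivedMessageContext context) {
            try {
                // ... processing logic ...
                context.complete();  // remove the message from the subscription
            } catch (Exception e) {
                context.abandon();   // release the lock so the message is redelivered
            }
        }

        private static void onError(ServiceBusErrorContext context) {
            System.err.println("Service Bus error: " + context.getException());
        }
    }
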

Processing specific AWS SQS messages in specific threads

The Java application posts async jobs to AWS and gets back a JobID. When the async job is finished, a message will appear in an SQS queue with that JobID. Each JobID is handled by a different thread. Each of those threads also polls SQS for messages until it finds the message which contains its JobID. Additionally, the application is distributed into multiple services so there can't be a single SQS processor.
I saw that SQS returns a maximum of 10 messages and after they are returned, a visibility timeout is applied so that they are not re-sent to other consumers. However, my consumers are the threads that want to consume only a single message and let the rest be consumed by other threads. Should I set the visibility timeout to 0? Will this make it so all consumers get the same set of 10 messages on every request? What's the best way for each consumer to sift through all the messages and find the one it wants?
TL;DR: SQS has 100 messages and there are 100 consumers, one for each message. How should I go about having each consumer find the message it wants (based on its JobID)?
EDIT: I know that this is not an appropriate use of SQS and I'd be very glad not to use it at all, but our main integration is with Amazon Textract, which makes SQS mandatory for its asynchronous operations. Each Textract request is processed by a different thread, which means that each needs to get back a specific SQS message; the consumers are not interchangeable. Not to mention the possibility of a clustered environment, for which I'd like to avoid having to do any synchronization...
EDIT 2: This is for an on-premises, Setup.exe-based, dev-hands-off application where we want to minimize the number of AWS services used (both for cost and for customer setup/maintenance reasons) as well as the use of external components, again to minimize customer deployment/maintenance/servers. I understand that we are living in the world of microservices, but there are still applications that want to benefit from intelligent services without being cloud-native themselves.
This is not an appropriate architecture for using Amazon SQS. Your processes should not be trying to find a specific message from an Amazon SQS queue.
You should find a different architecture for this message-passing task. Some ideas:
Create a message in Amazon S3 with an 'expected' key, and have each thread look for that object as a return message. (This is effectively using Amazon S3 as a key-value store; see the sketch after these ideas.)
Have a single Lambda function retrieve messages from SQS and update a database (or S3 as above). Then, have the threads consult the database instead of SQS.
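A minimal sketch of the S3-polling idea using the AWS SDK for Java v2; the bucket name and key layout are assumptions:

    import software.amazon.awssdk.services.s3.S3Client;
    import software.amazon.awssdk.services.s3.model.HeadObjectRequest;
    import software.amazon.awssdk.services.s3.model.NoSuchKeyException;

    public class S3ResultPoller {

        // Block until the "return message" object for this JobID appears.
        static void waitForResult(S3Client s3, String jobId) throws InterruptedException {
            HeadObjectRequest head = HeadObjectRequest.builder()
                    .bucket("textract-results")     // hypothetical bucket
                    .key("jobs/" + jobId + ".json") // hypothetical key layout
                    .build();
            while (true) {
                try {
                    s3.headObject(head);            // object exists: result has arrived
                    return;
                } catch (NoSuchKeyException e) {
                    Thread.sleep(2_000);            // not there yet, poll again
                }
            }
        }
    }
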
I think you need to put something in between SQS and your threads. Like a DynamoDB table. You could have a Lambda function that processes all the SQS messages and just translates them into DynamoDB records. Then your different threads could easily check for the specific records they are interested in using a DynamoDB query.
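And a sketch of the thread-side lookup against such a DynamoDB table (the table name and key attribute are hypothetical):

    import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
    import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
    import software.amazon.awssdk.services.dynamodb.model.GetItemRequest;
    import software.amazon.awssdk.services.dynamodb.model.GetItemResponse;

    import java.util.Map;

    public class JobResultLookup {

        // Returns the translated record for this JobID, or null if not there yet.
        static Map<String, AttributeValue> findResult(DynamoDbClient dynamoDb, String jobId) {
            GetItemResponse response = dynamoDb.getItem(GetItemRequest.builder()
                    .tableName("textract-results") // hypothetical table
                    .key(Map.of("JobId", AttributeValue.builder().s(jobId).build()))
                    .build());
            return response.hasItem() ? response.item() : null;
        }
    }
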
Just because Textract mandates that you use SQS doesn't mean the final step in your architecture has to read those messages directly from SQS. In this case SQS is just a message bus that can integrate with other services in AWS, and those services are your building blocks you can use to create the architecture you need.

Using Spring Cloud Stream, how to send messages to a binder synchronously?

I'm building a cloud stream application using Spring with Azure Event Hubs and Service Bus.
In my use case I'm trying to achieve the following functionality:
The application receives messages from a single binder (Event Hub).
It processes the messages in a few steps, say steps A, B, and C.
Each step in the process creates objects.
The objects created in each step are streamed to different binders. If sending messages fails in any step, don't proceed.
The question is: is the message sending synchronous or asynchronous? Will it wait until all messages are sent in step A before executing the next steps?
To make sure the message production happens synchronously you can simply set the spring.cloud.stream.eventhub.bindings.<channel-name>.producer.sync property to true.
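For example, in application.properties it might look like the following sketch; the binding name output-channel is a placeholder, and the exact property prefix can differ between versions of the Event Hubs binder, so check the documentation for the version you are using:

    # Make producers on this binding block until the send completes (sketch)
    spring.cloud.stream.eventhub.bindings.output-channel.producer.sync=true
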

How to check if data is getting processed in Spring Integration or sitting idle

This is regarding a Spring Integration (SI) application which, in my case, has many endpoints. When data enters the application, it takes about 60 seconds to be processed completely.
I am now trying to build a shutdown mechanism for this application which will do the following things:
It will first stop the ingestion-layer endpoint (in my case a Kafka listener), so that no more messages enter the application.
It will then wait for 60 seconds before shutting down, so that existing messages get processed.
But this wait time is hardcoded, and I want to check whether the application is actually processing any data. If yes, wait 30 seconds and then check again; if no data is being processed, shut down the application.
Kindly let me know if there are any ways to check whether data is present in any of the SI endpoints.
There are no hooks like this in the out-of-the-box components, and it is probably not even possible to implement, since all the components in the framework are stateless.
Now tell me, please, what makes you think that you need to implement your own shutdown mechanism? Why is the regular ApplicationContext.close() not enough for you?
See more about lifecycle in the docs: https://docs.spring.io/spring-framework/docs/current/reference/html/core.html#beans-factory-nature
With that on board, the framework indeed stops inbound endpoints first, so that no external data can enter the application while it is shutting down. Then it stops all other internal endpoints so that they stop processing their incoming messages. But all in-flight messages are still processed: the application context does not finish closing while something is still executing.
If that is still not enough for you, I'd suggest something like an AtomicInteger activeCount as a global bean. You call incrementAndGet() when a message is emitted by your mentioned Kafka listener, and decrementAndGet() when you are done processing the message at the end of the flow. While your custom shutdown function is in progress, you just check activeCount.get() until it reaches 0, and then kill your process gracefully.
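A minimal sketch of that counter, assuming a Spring-managed bean; all names are hypothetical:

    import java.util.concurrent.atomic.AtomicInteger;

    import org.springframework.stereotype.Component;

    // Global bean tracking in-flight messages.
    @Component
    public class ActiveCount {

        private final AtomicInteger activeCount = new AtomicInteger();

        public void messageEntered() { activeCount.incrementAndGet(); } // call from the Kafka listener
        public void messageDone()    { activeCount.decrementAndGet(); } // call at the end of the flow

        // Custom shutdown: stop the inbound endpoint first, then wait until idle.
        public void awaitIdle() throws InterruptedException {
            while (activeCount.get() > 0) {
                Thread.sleep(500); // re-check instead of a hard-coded 60s wait
            }
        }
    }
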
But again: we don't need all of that, because the standard ApplicationContext.close() covers us.

Architecture of Java Servlet Browser push notification

I am implementing browser push notifications via Google Cloud Messaging (GCM) and the Firefox Push Notification System (FPNS). For this, we have to make HTTP POST requests to GCM and FPNS.
To make an HTTP request to GCM/FPNS we need user registration IDs. Using JavaScript we collect registration IDs and store them in Cassandra. Each record contains the user registration information (registration ID and browser type).
When we make an HTTP request to GCM/FPNS, we send the registration IDs along with the request, choosing the target based on browser type (if a registration ID belongs to Chrome we make a GCM request, otherwise an FPNS request). For example, if we have 10,000 records we make around 10,000 requests to GCM/FPNS.
Once GCM/FPNS receives the user registration IDs, it sends a push notification to the browser. In the browser, we have JavaScript code (a service worker) to handle the notification event.
For the above requirement, a synchronous servlet architecture is not good enough: processing 10,000 records may take, say, 10 to 15 minutes even with multithreading, and may cause Tomcat memory leaks and an OutOfMemoryError.
When I was searching online, people suggested an asynchronous servlet architecture: once we take the request from the client to send the notifications, we respond immediately (something like '200 OK, added to queue') and also add the request to a message queue (JMS). From JMS we use multithreading to make asynchronous HTTP requests.
I am not finding the correct way of doing this. Can you suggest a way of implementing this functionality (architecture design and control flow)?
Short of changing to something like PubNub, I would create a worker queue. This could be done with JMS or just a shared queue (search for producer/consumer). JMS would, in my opinion, be the easiest, though it gets harder to distribute in a cluster.
Basically, you could continue to have a synchronous servlet: it takes the message, puts it on the queue, and returns the 200. Placing a message on the queue involves very minimal blocking, a couple of milliseconds at most. A sketch is below.
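A minimal sketch of such a servlet, assuming a container-managed JMS 2.0 ConnectionFactory; the JNDI names and request parameter are hypothetical:

    import java.io.IOException;

    import javax.annotation.Resource;
    import javax.jms.ConnectionFactory;
    import javax.jms.JMSContext;
    import javax.jms.Queue;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Synchronous servlet that only enqueues the work and returns immediately.
    @WebServlet("/notify")
    public class NotifyServlet extends HttpServlet {

        @Resource(lookup = "java:/ConnectionFactory")   // container-provided factory
        private ConnectionFactory connectionFactory;

        @Resource(lookup = "java:/queue/notifications") // hypothetical queue
        private Queue queue;

        @Override
        protected void doPost(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            String payload = req.getParameter("message");
            try (JMSContext context = connectionFactory.createContext()) {
                context.createProducer().send(queue, payload); // milliseconds of blocking
            }
            resp.setStatus(HttpServletResponse.SC_ACCEPTED);   // "added to queue"
            resp.getWriter().write("Added to queue");
        }
    }
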
As you indicated, on the queue-consumer side you would then have to handle many requests. Depending on the latency requirements of your system, you may need to use threads or offload that work; it really depends on how fast you need to send the messages. A consumer sketch follows.
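On the consumer side, if you are in a full Java EE container, a message-driven bean is one option; here is a sketch with hypothetical names (on plain Tomcat you would use a JMS client library's listener container instead):

    import javax.ejb.ActivationConfigProperty;
    import javax.ejb.MessageDriven;
    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.TextMessage;

    // Drains the queue and makes the outbound GCM/FPNS call.
    @MessageDriven(activationConfig = {
            @ActivationConfigProperty(propertyName = "destinationLookup",
                                      propertyValue = "java:/queue/notifications"),
            @ActivationConfigProperty(propertyName = "destinationType",
                                      propertyValue = "javax.jms.Queue")
    })
    public class NotificationConsumer implements MessageListener {

        @Override
        public void onMessage(Message message) {
            try {
                String registrationId = ((TextMessage) message).getText();
                // make the HTTP POST to GCM or FPNS here; hand off to a
                // thread pool if per-message latency matters
            } catch (Exception e) {
                // log and decide whether to let the container redeliver
            }
        }
    }
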
For a totally different architecture, you could consider a "queue in the cloud". I've used Amazon SQS for things like this. You wouldn't even have a servlet; the message would go straight to SQS, and then something else would pull it off and process it.
For reference, I don't work for Amazon or PubNub.
