Waiting for all threads to finish in Spring Integration - java

I have a self-executable jar program that relies heavily on Spring Integration. The problem I am having is that the program is terminating before the other Spring beans have completely finished.
Below is a cut-down version of the code I'm using, I can supply more code/configuration if needed. The entry point is a main() method, which bootstraps Spring and starts the import process:
public static void main(String[] args) {
    ClassPathXmlApplicationContext ctx = new ClassPathXmlApplicationContext("flow.xml");
    DataImporter importer = (DataImporter) ctx.getBean("MyImporterBean");
    try {
        importer.startImport();
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        ctx.close();
    }
}
The DataImporter contains a simple loop that fires messages to a Spring Integration gateway. This delivers an active "push" approach to the flow, rather than the common approach of polling for data. This is where my problem comes in:
public void startImport() throws Exception {
    for (Item item : items) {
        gatewayBean.publish(item);
        Thread.sleep(200); // yield period
    }
}
For completeness, the flow XML looks something like this:
<gateway default-request-channel="inChannel" service-interface="GatewayBean" />
<splitter input-channel="inChannel" output-channel="splitChannel" />
<payload-type-router input-channel="splitChannel">
    <mapping type="Item" channel="itemChannel" />
    <mapping type="SomeOtherItem" channel="anotherChannel" />
</payload-type-router>
<outbound-channel-adapter channel="itemChannel" ref="DAOBean" method="persist" />
The flow starts and processes items effectively, but once the startImport() loop finishes, the main thread terminates and immediately tears down all the Spring Integration threads. This results in a race condition: the last n items are not completely processed before the program exits.
I have an idea of maintaining a reference count of the items I am processing, but this is proving quite complicated, since the flow often splits/routes the messages to multiple service activators, which makes it difficult to determine when each item has "finished".
What I think I need is some way to either check that no Spring beans are still executing, or to flag that all items sent to the gateway have been completely processed before terminating.
My question is, how might I go about doing either of these, or is there a better approach to my problem I haven't thought of?

You're not using a request-response pattern here.
An outbound-channel-adapter is a fire-and-forget action. If you want to wait for the response, you should use an outbound gateway that waits for a reply, connect the reply back to the original gateway, and then use a request-reply call in Java (sendAndReceive semantics) rather than a plain one-way publish.
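A rough sketch of that change, keeping the names from the question (the return type and the reply wiring are assumptions): give the gateway method a return type so the caller blocks for a reply, and replace the outbound-channel-adapter with a service-activator, whose return value is routed back to the gateway automatically:

<gateway default-request-channel="inChannel" service-interface="GatewayBean" />
<!-- service-activator (not outbound-channel-adapter) sends persist()'s
     return value back to the gateway's reply channel automatically -->
<service-activator input-channel="itemChannel" ref="DAOBean" method="persist" />

public interface GatewayBean {
    // A non-void return type turns publish() into a blocking
    // request-reply call (sendAndReceive under the covers)
    Object publish(Item item);
}

Because the flow contains a splitter, each publish() would otherwise produce several replies; you would normally add an <aggregator> in front of the reply so the gateway receives exactly one result per call.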

If you can get an Item to determine whether it is still needed (a processingFinished() callback or something similar, executed in the back-end stages), you can register all Items with a central authority that keeps track of the number of unfinished Items and effectively determines a termination condition.
If this approach is feasible, you could even think of packaging the items into FutureTask objects or making use of similar concepts from java.util.concurrent.
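A minimal sketch of such a central authority, assuming each item is marked finished exactly once by the back-end stage (the class and method names are illustrative):

import java.util.concurrent.Phaser;

public class ItemTracker {

    // One party for the main thread; every in-flight item adds another
    private final Phaser phaser = new Phaser(1);

    // Called on the main thread just before an item is sent to the gateway
    public void register() {
        phaser.register();
    }

    // Called by the back-end stage once an item is fully processed
    public void processingFinished() {
        phaser.arriveAndDeregister();
    }

    // Called by main() after the import loop; blocks until every
    // registered item has been marked finished
    public void awaitCompletion() {
        phaser.arriveAndAwaitAdvance();
    }
}

With this in place, main() would call awaitCompletion() before ctx.close().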
Edit: a second idea:
Have you thought about making the channels more intelligent? A sender closes the channel once it has no more data to send. In this scenario the worker beans do not have to be daemon threads, but can derive their termination criterion from a closed and empty input channel.
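The concept can be sketched in plain Java with a poison-pill sentinel on a BlockingQueue; note this illustrates the idea only and is not Spring Integration API:

import java.util.concurrent.BlockingQueue;

class Worker implements Runnable {

    static final Object POISON = new Object(); // "channel closed" marker
    private final BlockingQueue<Object> channel;

    Worker(BlockingQueue<Object> channel) {
        this.channel = channel;
    }

    public void run() {
        try {
            while (true) {
                Object item = channel.take();
                if (item == POISON) {
                    return; // sender finished and queue drained: terminate
                }
                // ... process item ...
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

The sender simply puts Worker.POISON on the queue after its last real item.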

Related

Spring AMQP @RabbitListener is not ready to receive messages on @ApplicationReadyEvent. Queues/Bindings declared too slow?

We have a large multi-service Java Spring app that declares about 100 exchanges and queues in RabbitMQ on startup. Some are declared explicitly via beans, but most are declared implicitly via @RabbitListener annotations.
@Component
@RabbitListener(
        bindings = @QueueBinding(key = {"example.routingkey"},
                exchange = @Exchange(value = "example.exchange", type = ExchangeTypes.TOPIC),
                value = @Queue(name = "example_queue", autoDelete = "true", exclusive = "true")))
public class ExampleListener {

    @RabbitHandler
    public void handleRequest(final ExampleRequest request) {
        System.out.println("got request!");
    }
}
There are quite a lot of these listeners in the whole application.
The services of the application sometimes talk to each other via RabbitMQ; take, for example, a publisher that publishes a message to the example exchange that the ExampleListener above is bound to.
If that publish happens too early in the application lifecycle (but AFTER all the Spring lifecycle events are through, so after ApplicationReadyEvent and ContextStartedEvent), the binding of the example queue to the example exchange has not yet happened, and the very first publish-and-reply chain will fail. In other words, the ExampleListener above would not print "got request!".
We "fixed" this problem by simply waiting 3 seconds before we start sending any RabbitMq messages to give it time to declare all queues,exchanges and bindings but this seems like a very suboptimal solution.
Does anyone else have some advice on how to fix this problem? It is quite hard to recreate as I would guess that it only occurs with a large amount of queues/exchanges/bindings that RabbitMq can not create fast enough. Forcing Spring to synchronize this creation process and wait for a confirmation by RabbitMq would probably fix this but as I see it, there is no built in way to do this.
Are you using multiple connection factories?
Or are you setting usePublisherConnection on the RabbitTemplate? (which is recommended, especially for a complex application like yours).
Normally, a single connection is used and all users of it will block until the admin has declared all the elements (it is run as a connection listener).
If the template is using a different connection factory, it will not block because a different connection is used.
If that is the case, and you are using the CachingConnectionFactory, you can call createConnection().close() on the consumer connection factory during initialization, before sending any messages. That call will block until all the declarations are done.
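A minimal sketch of that initialization call, assuming a Spring Boot app and a CachingConnectionFactory bean for the consumer side (the bean and method names are illustrative):

import org.springframework.amqp.rabbit.connection.CachingConnectionFactory;
import org.springframework.boot.ApplicationRunner;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class DeclarationsBarrier {

    // Opening (and immediately closing) a connection blocks until the
    // RabbitAdmin connection listener has declared every queue, exchange
    // and binding, so anything that runs afterwards can safely publish
    @Bean
    ApplicationRunner waitForDeclarations(CachingConnectionFactory consumerConnectionFactory) {
        return args -> consumerConnectionFactory.createConnection().close();
    }
}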

@Async vs message queue difference

I have a Spring Boot project, deployed on two servers behind nginx. One method in the project will:
1. set some key-values in Redis
2. insert something in the DB
After step 1, I want to do step 2 asynchronously.
One solution is to make doDB() a Spring Boot @Async method:
class A {
    private B b;

    public void someMethod() { // method name is a placeholder
        doRedis(); // 1. set some key-values in Redis
        b.doDB();  // 2. insert something in the DB, asynchronously
    }
}

class B {
    @Async
    public void doDB() { /* ... */ }
}
Another solution is to send a message to MQ:
class A {
    public void someMethod() { // method name is a placeholder
        doRedis();     // 1. set some key-values in Redis
        sendMessage(); // 2. publish a message for the DB work
    }
}

class B {
    public void onMessage(Message message) {
        doDB();
    }
}
If classes A and B are both in the same Spring Boot project, and that project is simply deployed on two servers, I think @Async is enough; there is no need for MQ to achieve asynchrony, because it makes no difference whether server one or server two runs B.doDB(). If class B is in another project, then MQ is a good fit, because it decouples project one (the Redis work) from project two (the DB work).
Is that right? Thanks!
Basically, you are right: if it all runs in the same application on the same server, there is no need for MQ, because @Async already has a queue behind it. But there are some key points you should decide on, even within the same application:
If you care about ordering, a message queue is more meaningful. You can use @Async for this too, but you have to configure the thread pool to process async tasks with a single thread (see the sketch after this list).
If you care about losing messages when something bad happens before they are processed, you should use an MQ that persists messages to disk (or elsewhere) so the remaining messages can be processed later.
If your application gets a lot of requests and you did not size the async thread pool carefully, you could get overflow errors or other problems with machine resources.
Choose within the capabilities of your application and do not over-engineer from day one; build up from what you have and what you have already tested.
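For the ordering point above, a minimal sketch of pinning @Async to a single-threaded, bounded executor (the bean name and sizes are illustrative):

import java.util.concurrent.Executor;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
@EnableAsync
class AsyncConfig {

    // One thread processes @Async tasks strictly in submission order;
    // the bounded queue guards against the overflow problem above
    @Bean("singleThreadExecutor")
    Executor singleThreadExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(1);
        executor.setMaxPoolSize(1);
        executor.setQueueCapacity(1000);
        executor.initialize();
        return executor;
    }
}

Methods then opt in with @Async("singleThreadExecutor").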

Spring DeferredResult setting result not resuming request

I'm experiencing some issues using DeferredResult with Spring; I think I'm misunderstanding something about them.
These DeferredResult are used for long polling.
I have an application on which multiple users can be logged in, and which displays a list of items from a database. The same items are visible for all the connected users. Any user can, at a given time, select one of these items and interact with it, but then the other users cannot interact with this specific item anymore (say like an item is "locked" for the others as soon as someone select it until the user "releases" it).
Anyway, I'm using long polling to notify the other users when one user "takes" an item, so that they know the item is now "locked" and can update their interface.
Say I have, for example, a URL for the polling like /polling, and another like /take/{itemId}.
I have a web application using Spring MVC with the "old-style" XML configuration, including the parameters for asynchronous processing: <task:annotation-driven> in my servlet config, and more importantly (I think) <async-supported>true</async-supported> in web.xml.
When a user calls /polling, the request is returning a DeferredResult:
private final BlockingQueue<DeferredResult<List<MyItem>>> pendingRequests = new LinkedBlockingQueue<>();

@RequestMapping("/polling")
public DeferredResult<List<MyItem>> poll(...) {
    final DeferredResult<List<MyItem>> def = new DeferredResult<>(60000L);
    // ... deferred result init, e.g. onCompletion(), onTimeout()...
    // add the deferred result to the queue
    pendingRequests.add(def);
    return def;
}
Then when a user "takes" an item and calls /take/{itemId}, something like this:
#RequestMapping("/take/{itemId}")
public void takeItem(...) {
// ..marking the item as taken and saving it into DB..
// and then, notifying the other pending requests
// the item has been taken by someone
List<MyItem> updatedItemsList = getLastItemsFromDb();
for (DeferredResult<...> d : pendingRequests) {
d.setResult(updatedItemsList);
}
}
Note that in the updatedItemsList list, the specific item is now marked as "taken".
So here it is: this seems to work fine that way. Immediately after the result is set on each request, the corresponding request resumes and "breaks" the long-polling cycle without waiting for the request timeout, so the front-end JavaScript can update the list and issue a new long request.
The problem is that I recently tried to convert this web application to Spring Boot and all-Java, annotation-based configuration.
And this behavior does not work anymore. It's as if setting the DeferredResult (in the for loop of the second request handler) no longer triggers the requests to resume; they have to wait for the timeout before returning the result.
However, I found that calling the setResult method on the DeferredResult objects in a TaskExecutor, for example, makes everything work like before.
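For reference, the workaround looks roughly like this inside takeItem() (the taskExecutor wiring is an assumption):

List<MyItem> updatedItemsList = getLastItemsFromDb();
// Resuming the pending requests from a separate thread is what made
// the Java-based configuration behave like the XML one again
taskExecutor.execute(() -> {
    for (DeferredResult<List<MyItem>> d : pendingRequests) {
        d.setResult(updatedItemsList);
    }
});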
My question is: why? Why does this work fine with the XML-style configuration, without any executor or explicit background setResult() call, but not with the Java-based configuration?
Did I miss something?
FYI, the Java configuration is rather classic: I set @EnableAutoConfiguration and @EnableAsync, and extend WebMvcConfigurerAdapter.
Thanks in advance for reading this and taking the time to reply!

Using Spring @Async and ThreadPoolTaskScheduler with pool-size=1

We have a service implementation in our Spring-based web application that increments some statistics counters in the DB. Since we don't want to slow down response times for the user, we made them asynchronous using Spring's @Async:
public interface ReportingService {

    @Async
    Future<Void> incrementLoginCounter(Long userid);

    @Async
    Future<Void> incrementReadCounter(Long userid, Long productId);
}
And the Spring task configuration looks like this:
<task:annotation-driven executor="taskExecutor" />
<task:executor id="taskExecutor" pool-size="10" />
Now, with pool-size="10", we have concurrency issues when two threads try to create the same initial record that will contain the counter.
Is it a good idea to set pool-size="1" to avoid those conflicts? Does this have any side effects? We have quite a few places that fire async operations to update statistics.
The side effects would depend on the speed at which tasks are added to the executor compared to how quickly a single thread can process them. If the number of tasks added per second is greater than what a single thread can process in a second, you run the risk of the queue growing over time until you finally get an out-of-memory error.
Check out the executor section at this page Task Execution. They state that having an unbounded queue is not a good idea.
If you know that you can process tasks faster than they will be added then you are probably safe. If not, you should add a queue capacity and handle the input thread blocking if the queue reaches this size.
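With the XML shown in the question, that could look like this (the capacity value is illustrative); CALLER_RUNS makes the submitting thread run the task itself when the queue is full, which naturally throttles producers instead of throwing:

<task:executor id="taskExecutor" pool-size="1" queue-capacity="200" rejection-policy="CALLER_RUNS" />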
Looking at the two examples you posted, instead of a constant stream of @Async calls, consider updating a JVM-local variable upon client requests, and have a background thread write it to the database every now and then. Along the lines of (mind the semi-pseudo-code):
class DefaultReportingService implements ReportingService {

    private final ConcurrentMap<Long, AtomicLong> numLogins = new ConcurrentHashMap<>();

    public void incrementLoginCounterForUser(Long userId) {
        // computeIfAbsent lazily creates the counter on first use
        numLogins.computeIfAbsent(userId, id -> new AtomicLong()).incrementAndGet();
    }

    @Scheduled(fixedDelay = 60_000) // interval is illustrative
    void saveLoginCountersToDb() {
        for (Map.Entry<Long, AtomicLong> entry : numLogins.entrySet()) {
            AtomicLong counter = entry.getValue();
            long toBeSummedWithTheValueInDb = counter.getAndSet(0L);
            // ... add the delta for entry.getKey() to the value in the DB
        }
    }
}

Finishing a HttpServletResponse but continue processing

I have a situation that seems to fit the Servlet 3.0 async / Comet model, but all I need to do is return a 200 response code (or other) after accepting the incoming parameters.
Is there a way for a HttpServlet to complete the http request/response handshake and yet continue processing?
Something like...
doPost(req, response) {
    // verify input params...
    response.setStatus(SC_OK);
    response.close();
    // execute long query
}
EDIT: Looking at the javax.servlet package, the proper phrasing of my question is:
How do I commit a response?
as in ServletResponse.isCommitted().
Here's how I've handled this situation:
When the app starts up, create an ExecutorService with Executors.newFixedThreadPool(numThreads) (there are other types of executors, but I suggest starting with this one)
In doPost(), create an instance of Runnable which will perform the desired processing - your task - and submit it to the ExecutorService like so: executor.execute(task)
Finally, you should return the HTTP Status 202 Accepted, and, if possible, a Location header indicating where a client will be able to check up on the status of the processing.
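A sketch of those three steps in a plain servlet (the class name and status URL are illustrative):

import java.io.IOException;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class AsyncTaskServlet extends HttpServlet {

    // Step 1: a fixed-size pool created once for the servlet's lifetime
    private final ExecutorService executor = Executors.newFixedThreadPool(4);

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        final String taskId = UUID.randomUUID().toString();

        // Step 2: hand the long-running work to the pool and return at once
        executor.execute(new Runnable() {
            public void run() {
                // ... execute long query, store its result under taskId ...
            }
        });

        // Step 3: 202 Accepted plus a Location the client can poll
        resp.setStatus(HttpServletResponse.SC_ACCEPTED);
        resp.setHeader("Location", "/status/" + taskId);
    }

    @Override
    public void destroy() {
        executor.shutdown(); // stop accepting tasks on servlet shutdown
    }
}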
I highly recommend you read Java Concurrency in Practice, it's a fantastic and very practical book.
One possibility for your servlet to accept a request for background processing is to hand the work off to a separate thread, which then executes in the background.
Using Spring, you can invoke a separate thread using a TaskExecutor. The advantage of using Spring over the standard JDK 5 java.util.concurrent.Executor is that, if you're on an application server that requires managed threads (IBM WebSphere or Oracle WebLogic), you can use the WorkManagerTaskExecutor to hook into the CommonJ work managers.
Another alternative would be to move the long query logic into a Message Driven Bean or Message Driven POJO (Spring JMS can help here) and let the servlet simply post a message on a JMS queue. That would have the advantage that should the load on your web container become too great because of your long running query, you could easily move the MDB onto a different (dedicated) system.
You can continue processing in a separate thread.
The response is committed once you return from the doPost() method.
This example may help:
protected void doPost(HttpServletRequest req, HttpServletResponse resp) {
    // do something and write the response
    final ExecutorService executor = Executors.newSingleThreadExecutor();
    executor.execute(new Runnable() {
        @Override
        public void run() {
            // processing after the response has been committed
        }
    });
    executor.shutdown(); // let the task finish, then release the thread
}
