How to continue on client when heavy server computation is done

This might be a simple problem, but I can't seem to find a good solution right now.
I've got:
OldApp - a Java application started from the command line (no web front here)
NewApp - a Java application with a REST api behind Apache
I want OldApp to call NewApp through its REST api and when NewApp is done, OldApp should continue.
My problem is that NewApp does a lot of work that can take a long time, which in some cases causes a timeout in Apache, which then sends a 502 error to OldApp. The computations continue in NewApp, but OldApp does not know when NewApp is done.
One solution I thought of is to fork a thread in NewApp, store some kind of ID for the API request, and return it to OldApp. OldApp could then poll NewApp to see whether the thread is done and, if so, continue. Otherwise it keeps polling.
Are there any good design patterns for something like this? Am I complicating things? Any tips on how to think about this?

If NewApp is taking a long time, it should immediately return a 202 Accepted. The response should contain a Location header indicating where the user can go to look up the result when it's done, and an estimate of when the request will be done.
OldApp should wait until the estimate time is reached, then submit a new GET call to the location. The response from that GET will either be the expected data, or an entity with a new estimated time. OldApp can then try again at the later time, repeating until the expected data is available.
So the conversation might look like:
POST /widgets
response:
202 Accepted
Location: "http://server/v1/widgets/12345"
{
"estimatedAvailableAt": "<whenever>"
}
.
GET /widgets/12345
response:
200 OK
Location: "http://server/v1/widgets/12345"
{
"estimatedAvailableAt": "<wheneverElse>"
}
.
GET /widgets/12345
response:
200 OK
Location: "http://server/v1/widgets/12345"
{
"myProperty": "myValue",
...
}
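
A minimal client-side sketch of that loop in Java (11 or later, for java.net.http). The class name PollingClient, the fixed delay between attempts, and the rule that a body still containing "estimatedAvailableAt" means "not ready yet" are assumptions made for illustration. A real client would parse the estimate and wait until that time instead.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.function.Predicate;
import java.util.function.Supplier;

class PollingClient {

    // Calls fetch repeatedly until isReady accepts the body or the attempts run out.
    static String pollUntilReady(Supplier<String> fetch, Predicate<String> isReady,
                                 long delayMillis, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String body = fetch.get();
            if (isReady.test(body)) {
                return body;
            }
            if (attempt < maxAttempts) {
                sleep(delayMillis);
            }
        }
        throw new IllegalStateException("Result not ready after " + maxAttempts + " attempts");
    }

    // Plain GET against the URL taken from the Location header (hypothetical helper).
    static String httpGet(HttpClient client, URI uri) {
        try {
            HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
            return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
        } catch (java.io.IOException e) {
            throw new java.io.UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while polling", e);
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while polling", e);
        }
    }
}
```

OldApp could then call something like PollingClient.pollUntilReady(() -> PollingClient.httpGet(client, jobUri), body -> !body.contains("estimatedAvailableAt"), 2000, 30), where client and jobUri are its own HttpClient and the URI from the 202 response.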

Yes, that's exactly what people are doing with REST now. Because there is no way for the server to open a connection to the client, the client just polls frequently. There is also an improved method called "long polling", where the connection between client and server has a long timeout and the server sends the information back to the connected client as soon as it becomes available.

The question is about Java and servlets, so I would suggest looking at Servlet 3.0 asynchronous support.
From a design perspective, you would need to return a 202 Accepted with an ID and a URL for the job. OldApp needs to check for the result of the operation using that URL.
The task that you fork on the server needs to implement the Callable interface. I would also recommend using a thread pool for this. The GET URL for the forked job can check the status of the Future object and return it to the user.
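
A rough sketch of the server side of that idea, using only java.util.concurrent. The class name JobRegistry, the pool size of 4, and the status strings are assumptions for illustration, not part of any servlet API. The POST handler would call submit() and put the returned ID into the Location header, and the GET handler would call status().

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class JobRegistry {
    // Bounded pool so a burst of requests cannot create unlimited threads.
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final Map<String, Future<String>> jobs = new ConcurrentHashMap<>();

    // Starts the work in the background and returns the job ID right away.
    String submit(Callable<String> work) {
        String id = UUID.randomUUID().toString();
        jobs.put(id, pool.submit(work));
        return id;
    }

    // Returns "UNKNOWN", "RUNNING", "FAILED" or "DONE:<result>" for the given job ID.
    String status(String id) {
        Future<String> future = jobs.get(id);
        if (future == null) {
            return "UNKNOWN";
        }
        if (!future.isDone()) {
            return "RUNNING";
        }
        try {
            return "DONE:" + future.get(); // does not block, the job has finished
        } catch (ExecutionException e) {
            return "FAILED";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "RUNNING";
        }
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

Finished jobs stay in the map forever in this sketch. A real service would probably evict them after the result has been fetched or after some expiry time.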

Related

Correct way to handle a REST API path that can be called synchronously or asynchronously

I'm working on a Spring Boot REST API that handles documents and can launch a check on a document.
I have a document resource: /doc:
Create a doc with POST /doc
Rest of the CRUD actions with /doc/{id}
Now I can launch a check on a doc; a check can be seen either as an action or as a sub-resource.
It's pretty straightforward to launch (create) a check on a document: POST /doc/{id}/check
The check can however take some time so I want to give the user the choice to launch a synchronous or asynchronous check.
How would I handle this path wise?
Should the user choose sync or async check through a query parameter on POST /doc/{id}/check?
Should I create 2 separate paths?
Also, in the case of an async check, I would create a temporary Task resource that can be polled to know the status of the check.
But then if both check and task are returned from the same path it gets confusing, no?
I read an article that says the resource returned in async mode should be a check resource filled in as much as possible, but with a link to the task that can be polled.
That seems like a good way; if async, I would return a partial check with a link to the /task/{id} associated with the check.
However I'm still confused as to what path my API should offer to let the user pick between sync and async checks.
How would you handle it path and resource wise?

Basically it's up to you. Usually, if you want to query a big chunk of data such as /resource/{id}, most APIs I have used use GET for synchronous requests and POST for async requests that return a task or job ID.
For POST in your case, if the creation/checking takes time, I would consider always doing it asynchronously and returning HTTP 202 Accepted and a doc/{id}/check/{id} URL where the user can see the result if it is ready, or a status saying that it is still working.
If you want to give them a choice to wait or not, it's up to you how to do it. A request header can be used to modify the behavior, for example Expect: 202-accepted for async calls and no header or Expect: 201-created for synchronous calls. Be aware that the HTTP specification itself only defines 100-continue as an Expect value; the standardized way to ask for asynchronous processing is the Prefer: respond-async header (RFC 7240). Either way, a header makes the API a bit less obvious to use. Most people (including me) would probably stick to adding a parameter to the URL for clarity. I don't think it should be in the POST data, because that should only hold data related to the object you are creating.

There are multiple questions here. I will try to answer them one by one.
Checking the health of a resource can be done with a query param:
/doc/{id} - GET - Get the resource details
/doc/{id}?healthCheck=true&async=true - GET - Get the resource details and trigger an async health check
For the async health check, the response, as you mentioned, will be 202 and will contain a link to the health status URL:
HTTP/1.1 202 Accepted
Location: /doc/12345/status
If the client sends a GET request to this endpoint, the response should contain the current status of the request. Optionally, it could also include an estimated time to completion or a link to cancel the operation.
Reference
https://learn.microsoft.com/en-us/azure/architecture/best-practices/api-design

Is it legal to write and close the response in JSP, then do some extra job?

I'm working on a web app which communicates with the server through AJAX requests. A special type of "close" request takes 5 secs; the web app should just fire and forget it, since the result is irrelevant. Due to browser behavior (only a limited number of simultaneous AJAX requests are performed), a 5-sec request may hold up other AJAX requests, which is unacceptable.
The smart folks here on StackOverflow have advised me to write a small server-side proxy, which the web app should call instead of the original 5-sec one. The proxy should respond immediately, close the response channel, then perform an HTTP request and wait for it, spending the 5 secs server-side instead of client-side. (The original question is here: Is there a way to perform fire-and-forget AJAX request?)
The server is a Tomcat with JSP, and I can write small JSP pages. (I'm not an experienced JSP ninja, but I'm not afraid of Java.) My question is: is it legal to write such a JSP, and what's the best practice for the following:
send the response,
close reply channel (is out.close() enough?), in order to end the AJAX request at client-side,
fire and process (actually: just drop response) an HTTP request "in background", which may take as long as 5 secs?

It's not (only) your browser you should worry about. Blocking a Tomcat thread for 5s severely limits your maximum number of users as well (how many requests per second do you need to handle ultimately?)
So making it "more" asynchronous on the server might make sense.
Doing it in JSP alone (with scriptlets?!) will in no way be a robust implementation, but if you need to do it that way, you should think about starting the "work to do" in a separate Thread.
So instead of
<%
do_something_heavy();
%>
you would do something like
<%
    // Run the heavy work on its own thread so the request thread can return at once
    new Thread(new Runnable() {
        public void run() {
            do_something_heavy();
        }
    }).start();
%>
There are other options as well (JMS, ExecutorService, Spring @Async...) but this should get you started quickly.
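
As one illustration of the ExecutorService option, here is a small sketch that hands the work to a shared, bounded pool instead of creating a new thread per request. The class name BackgroundTasks and the pool size of 2 are assumptions. In a real web app the pool would normally be created and shut down by a ServletContextListener rather than held in a static field.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BackgroundTasks {
    // Shared pool: at most two background tasks run at once, the rest wait in a queue.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(2, runnable -> {
        Thread thread = new Thread(runnable, "background-task");
        thread.setDaemon(true); // do not keep the JVM alive just for these tasks
        return thread;
    });

    // Queues the task and returns immediately; failures are only logged.
    static void fireAndForget(Runnable task) {
        POOL.execute(() -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                System.err.println("Background task failed: " + e);
            }
        });
    }
}
```

The scriptlet or servlet would then call BackgroundTasks.fireAndForget(() -> do_something_heavy()) and finish the response as usual.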

First, it is best to separate business logic from the view: write the Java code in a servlet and delegate only the view to the JSP.
To execute your task asynchronously in the servlet code you can:
Invoke a submit method of an ExecutorService
Make a call to a JMS
Manually create a thread and start it
Then you can forward to the jsp.
TIP: It is possible to assign an id to the long task and return it in the JSP with a link to monitor the status of the task.
Basically you do something like this:
Accept the request
Start asynchronously a thread to execute the long task
Return immediately without waiting for the long task termination
Or using an id:
Accept the request
Calculate the id of the task
Start asynchronously a thread to execute the long task with the desired id
Return immediately a link with the id of the long task without waiting for the termination

Servlet 3.0: Can't send an asynchronous response?

I'm having trouble establishing AsyncContexts for users and using them to push notifications to them. On page load I have some jQuery code to send the request:
$.post("TestServlet", {
    action: "registerAsynchronousContext"
}, function(data, textStatus, jqXHR) {
    alert("Server received async request"); //Placed here for debugging
}, "json");
And in "TestServlet" I have this code in the doPost method:
HttpSession userSession = request.getSession();
String userIDString = userSession.getAttribute("id").toString();
String paramAction = request.getParameter("action");
if(paramAction.equals("registerAsynchronousContext"))
{
    AsyncContext userAsyncContext = request.startAsync();
    HashMap<String, AsyncContext> userAsynchronousContextHashMap = (HashMap<String, AsyncContext>)getServletContext().getAttribute("userAsynchronousContextHashMap");
    userAsynchronousContextHashMap.put(userIDString, userAsyncContext);
    getServletContext().setAttribute("userAsynchronousContextHashMap", userAsynchronousContextHashMap);
    System.out.println("Put asynchronous request in global map");
}
//userAsynchronousContextHashMap is created by a ContextListener on the start of the web-app
However, according to Opera Dragonfly (a debugging tool like Firebug), it appears that the server sends an HTTP 500 response about 30000ms after the request is sent.
Any responses created with userAsyncContext.getResponse().getWriter().print(SOME_JSON) and sent before the HTTP 500 response are not received by the browser, and I don't know why. A response sent through the regular response object (response.print(SOME_JSON)) is received by the browser ONLY if all the code in the "if" statement dealing with AsyncContext is not present.
Can someone help me out? I have a feeling this is due to my misunderstanding of how the asynchronous API works. I thought that I would be able to store these AsyncContexts in a global map, then retrieve them and use their response objects to push things to the clients. However, it doesn't seem as if the AsyncContexts can write back to the clients.
Any help would be appreciated.

I solved the issue. It seems there were several problems with my approach:
In Glassfish, AsyncContext objects all have a default timeout period of 30,000 milliseconds (30 seconds). Once this period expires, the entire response is committed back to the client, meaning you won't be able to use it again.
If you're implementing long-polling this might not be much of an issue (since you'll end up sending another request after the response anyway), but if you wish to implement streaming (sending data back to the client without committing the response) you'll want to either increase the timeout or get rid of it altogether.
This can be accomplished with an AsyncContext's .setTimeout() method. Do note that while the spec states: "A timeout value of zero or less indicates no timeout.", Glassfish (at this time) seems to interpret 0 as "immediate response required", and any negative number as "no timeout".
If you're implementing streaming, you must use the PrintWriter's .flush() method to push the data to the client after you're done using its .print(), .println() or .write() methods to write the data.
On the client side, if you've streamed the data, it will trigger a readyState of 3 ("interactive", which means that the browser is in the process of receiving a response). If you are using jQuery, there is no easy way to handle a readyState of 3, so you're going to have to fall back to plain JavaScript in order to both send the request and handle the response if you're implementing streaming.

I have noticed that in Glassfish, if you use AsyncContext and call .setTimeout() with a negative number, the connection is broken anyway. To fix this I had to change the Glassfish configuration (asadmin set configs.config.server-config.network-config.protocols.protocol.http-listener-1.http.) and set the timeout to -1. All this is to stop Glassfish from closing the connections after 30 sec.

Java, PhpBB and creation of new topic

I need to create topics on my board programmatically. I use Java and HtmlUnit for this.
But there is one problem: if the program posts once, all is okay (the forum response is HTTP 200), but if I start the program again, the phpBB response is «HTTP 304» with a redirect to the category where the new topic should be located, and the topic is not added. The question is how to fix this?
Here is a WireShark dump of the first successful topic addition (login, posting):
http://a2k.in/2ai
And here is same request but with 304 redirect:
http://a2k.in/2aj
Posting is done from an admin account with no posting time limitations.
Here is the log of posting from a browser (Chrome):
http://a2k.in/2ak
What is the problem? The difference between my request and the browser request is in the headers «Cache-Control: max-age=0» and «Origin: http://localhost». Maybe the problem is in the cache control?

Maybe a bit late, but I just saw this.
I had the same problem when posting more than one thread.
It looks like phpBB has some kind of flood protection.
At least for my implementation, it helped to simply add a timer/delay between posts. (I think I have it set to somewhere around 3 sec; it may work with one or two as well, not sure, I wasn't in a hurry.)
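
One way to keep such a delay in one place is a tiny helper that tells the caller how long to wait before the next post. This is only a sketch: the class name PostThrottle is made up, and the 3000 ms gap in the usage note is the rough value mentioned above, not a documented phpBB limit.

```java
class PostThrottle {
    private final long minGapMillis;
    private long lastPostAt;
    private boolean hasPosted;

    PostThrottle(long minGapMillis) {
        this.minGapMillis = minGapMillis;
    }

    // How many milliseconds the caller should still wait before posting at nowMillis.
    synchronized long waitNeeded(long nowMillis) {
        if (!hasPosted) {
            return 0;
        }
        long elapsed = nowMillis - lastPostAt;
        return Math.max(0, minGapMillis - elapsed);
    }

    // Records that a post was sent at nowMillis.
    synchronized void markPosted(long nowMillis) {
        lastPostAt = nowMillis;
        hasPosted = true;
    }
}
```

Before each HtmlUnit form submit, the program would create the helper once with new PostThrottle(3000), call Thread.sleep(throttle.waitNeeded(System.currentTimeMillis())), and then call throttle.markPosted(System.currentTimeMillis()) after the submit.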

Salesforce/PHP - Bulk Outbound message (SOAP), Time out issue - See update #2

Salesforce can send up to 100 requests inside 1 SOAP message. While receiving this type of Bulk Outbound message request, my PHP script finishes executing but SF fails to accept the ACK used to clear the message queue on the Salesforce side of things. Looking at the Outbound message log (monitoring) I see all the messages in a pending state with the Delivery Failure Reason "java.net.SocketTimeoutException: Read timed out". If my script has finished execution, why do I get this error?
I have tried these methods to increase the execution time on my server as I have no access on the Salesforce side:
set_time_limit(0); // in the script
max_execution_time = 360 ; Maximum execution time of each script, in seconds
max_input_time = 360 ; Maximum amount of time each script may spend parsing request data
memory_limit = 32M ; Maximum amount of memory a script may consume
I used the high settings just for testing.
Any thoughts as to why this is failing the ACK delivery back to Salesforce?
Here is some of the code:
This is how I accept and send the ACK file for the incoming SOAP request
$data = 'php://input';
$content = file_get_contents($data);
if($content) {
    respond('true');
} else {
    respond('false');
}
The respond function
function respond($tf) {
$ACK = <<<ACK
<?xml version = "1.0" encoding = "utf-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<notifications xmlns="http://soap.sforce.com/2005/09/outbound">
<Ack>$tf</Ack>
</notifications>
</soapenv:Body>
</soapenv:Envelope>
ACK;
print trim($ACK);
}
These are in a generic script that I include into the script that uses the data for a specific workflow. I can process about 25 requests (that are in 1 SOAP response), but once I go over that I get the timeout error in the Salesforce queue. For 50 requests it usually takes my PHP script 86.77 seconds.
Could it be Apache? PHP?
I have also tested just accepting the 100-request SOAP response and sending the ACK without any processing; the queue clears out, so I know it's on my side of things.
I show no errors in the apache log, the script runs fine.
I did find some info on the Salesforce site but still no luck. Here is the link.
Also I'm using the PHP Toolkit 11 (From Salesforce).
Other forum with good SF help
Thanks for any insight into this,
--Phill
UPDATE:
If I receive the incoming message and print the response, should this happen first regardless of whether I do anything else afterwards? Or does it wait for my process to finish and then print the response?
UPDATE #2:
okay I think I have the problem:
PHP uses a single-threaded processing approach and will not send back the ACK file until the thread has completed its processing. Is there a way to make this a multi-threaded process?
Thread #1 - accept the incoming SOAP request and send back the ACK
Thread #2 - Process the SOAP request
I know I could break it up into like a DB table or flat file, but is there a way to accomplish this without doing that?
I'm going to try to close the socket after the ACK submission and continue the processing, cross my fingers it will work.

Sounds like the outbound message is hitting the timeout. Other users have reported timeouts as low as 10 seconds (see forum link below). The sandbox instance that I use (cs1) is timing out after about 1 minute, from my testing. It's possible that the timeout is an organization or instance level setting that Salesforce controls.
Two things you could try:
Open a support ticket with Salesforce to see if they can increase the timeout value for outbound messages. From my experience, there are a lot of settings that they can modify on the organization level - this might be one of them.
Offload processing of your data, so that the ACK is sent immediately back to Salesforce. Then the actual processing of your data will take place asynchronously, e.g. via a message queue, a separate thread, etc.
Some other resources that might be helpful:
related Salesforce forum discussion
Outbound messaging documentation

I think they time out while waiting for your script to end.
There is a way you could try to fix this.
Output the envelope with the ack message at the beginning and then flush the output so that their server gets it before you finish processing. No threading, just plain rethinking of priorities :)
read this for best info on flushing content

Are you 100% sure that Salesforce will wait the amount of time your scripts need to run? 80 seconds seems like a long time to me.
If all requests failed I would guess that Salesforce expects you to set the Content-Type header appropriately, but this does not seem to be the case.

I don't know about Salesforce, but if you want to make some multithreading with PHP you should take a look at this code example and more precisely to pcntl_fork().
N.B: pcntl is not enabled by default and won't work on Windows platforms.

So what I've done is:
Accept all incoming OBMs and parse them into a DB
When this is done, kick off a process that runs in the background (actually I send it to the background so the script can end)
Send the ACK file back
Accepting the raw data, parsing it into fields and inserting it into a DB is fairly quick. Then I issue a Linux command line command that sends the processing script to run in the background. Then I send the ACK file to SF and the script ends within the allotted time. It is cumbersome to split the script process into two separate stages, but it works.
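
The same "store first, acknowledge, process later" split can be sketched in Java (not PHP, and not the code used above) with an in-memory queue and a single worker thread. All names here are hypothetical, and the upper-casing step merely stands in for the slow processing. Unlike the DB table described above, an in-memory queue loses pending messages if the process dies.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

class AckFirstReceiver {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    final List<String> processed = new CopyOnWriteArrayList<>();

    // Starts the single background worker that drains the queue.
    void start() {
        Thread worker = new Thread(this::drain, "obm-worker");
        worker.setDaemon(true);
        worker.start();
    }

    // Stores the raw message and returns the Ack value without waiting for processing.
    String receive(String rawMessage) {
        boolean stored = rawMessage != null && !rawMessage.isEmpty() && pending.offer(rawMessage);
        return stored ? "true" : "false";
    }

    private void drain() {
        try {
            while (true) {
                String message = pending.take(); // blocks until a message arrives
                processed.add(message.toUpperCase()); // placeholder for the slow work
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```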
