We have two App Engine (Java) apps. One uses URLFetch to call the other to create an appointment. In the receiver, we've added a feature that uses the Channel API to check whether there are any open channels and, if so, notify them about the new data.
The URLFetch call is failing with a SocketTimeoutException. All the code in the receiver is executed (including all open channels being notified) but the calling app still gets a SocketTimeoutException. When I comment out the channel notification line, no error.
This happens only in the deployed app, not in dev mode. Also, the call doesn't come close to reaching the 60-second (or even the old 10-second) timeout allowed by URLFetch.
The default deadline for urlfetch is 5 seconds, so if your application takes more than 5 seconds to load and execute the handler, the call will fail with a SocketTimeoutException.
As described in the documentation, you can set a longer deadline for your urlfetch call using setConnectTimeout or setReadTimeout.
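For example, if the calling app goes through java.net.HttpURLConnection (which App Engine serves via URLFetch), a minimal sketch might look like this; the receiver URL is a placeholder and error handling is omitted:

    import java.net.HttpURLConnection;
    import java.net.URL;

    // Placeholder endpoint for the receiving app; substitute your own URL.
    URL url = new URL("https://receiver-app.appspot.com/appointments");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setConnectTimeout(10000); // allow up to 10 s to establish the connection
    conn.setReadTimeout(60000);    // allow up to 60 s to wait for the response
    conn.setRequestMethod("POST");
    int status = conn.getResponseCode(); // performs the fetch
    conn.disconnect();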
In addition, it is a good idea to move API calls that can be deferred (i.e. not necessary to build the HTTP response) to a task queue:
the deadline for a task queue request is longer (10 minutes instead of 60 s)
the task will be retried if it fails
the urlfetch timeout is longer too (10 minutes)
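In your case, the channel notification could be enqueued instead of run inline. A rough sketch using the App Engine Task Queue API; the worker URL and the appointmentId parameter are placeholders for your own handler:

    import com.google.appengine.api.taskqueue.Queue;
    import com.google.appengine.api.taskqueue.QueueFactory;
    import com.google.appengine.api.taskqueue.TaskOptions;

    // Enqueue the notification work; a servlet mapped to /tasks/notify-channels
    // (your own handler) would perform the Channel API sends.
    Queue queue = QueueFactory.getDefaultQueue();
    queue.add(TaskOptions.Builder.withUrl("/tasks/notify-channels")
            .param("appointmentId", appointmentId));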
Related
My team maintains an application (written in Java) that processes long-running batch jobs. These jobs need to run in a defined sequence, so the application starts a socket server on a pre-defined port to accept job execution requests. It keeps the socket open until the job completes (with success or failure). This way the job scheduler knows when one job ends and, upon successful completion, triggers the next job in the pre-defined sequence. If the job fails, the scheduler sends out an alert.
This is a setup we have had for over a decade. Some jobs run for a few minutes and others take a couple of hours (depending on the volume) to complete. The setup has worked without any issues.
Now we need to move this application to a container (Red Hat OpenShift Container Platform), and the infra policy in place allows only the default HTTPS port to be exposed. The scheduler sits outside OCP and cannot access any port other than the default HTTPS port.
In theory, we could use HTTPS, set the client timeout to a very large value, and try to mimic the current TCP socket setup. But would this setup be reliable enough, given that HTTP is designed to serve short-lived requests?
There isn't a reliable way to keep a connection alive for a long period over the internet. The nodes between your client and server (routers, load balancers, proxies, NAT gateways, etc.) may drop the connection mid-stream under load, happily ignore your HTTP keep-alive request, or enforce an internal maximum connection duration that kills long-running TCP connections. You may find it works for you today, but there is no guarantee it will work tomorrow.
So you'll probably need to submit the job as a short-lived request and check the status via other means:
Push-based strategy: send a webhook URL as part of the job submission and have the server call it (possibly with retries) on job completion to notify interested parties.
Pull-based strategy: have the server return a job ID on submission, then have the client check the status periodically. Given your job durations, you may want to implement this with some form of exponential backoff up to a limit: for example, first check after 2 seconds, then wait 4 seconds before the next check, then 8 seconds, and so on, up to the maximum time you are happy to wait between checks. That way you find out about short job completions sooner without checking too frequently for long jobs.
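A minimal sketch of that polling loop with capped exponential backoff; JobClient and JobStatus are hypothetical stand-ins for your own HTTP client and status response:

    // Poll the job status, doubling the wait each time, capped at 5 minutes.
    void waitForCompletion(JobClient client, String jobId) throws InterruptedException {
        long delayMs = 2000;            // first check after 2 seconds
        final long maxDelayMs = 300000; // never wait more than 5 minutes between checks
        while (true) {
            Thread.sleep(delayMs);
            JobStatus status = client.checkStatus(jobId);
            if (status.isFinished()) {
                return;
            }
            delayMs = Math.min(delayMs * 2, maxDelayMs);
        }
    }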
When you worked with sockets and the TCP protocol, you were in control of how long connections stayed open. With HTTP you only control logical connections, not physical ones. The actual connections are managed by the OS, and IT people can usually configure all those timeouts. By default, even when you close a logical connection, the underlying connection is not closed right away, in anticipation of the next communication; it is closed by the OS and is not controlled by your code. And even if it does close, your next request will reopen it transparently, so it doesn't really matter whether it was closed or not; it should be transparent to your code. In short, I expect you can move to HTTP/HTTPS with no problems, but you will have to test and see. For other options for server-to-client communication, see my answer to this question: How to continues send data from backend to frontend when something changes
We have had bad experiences with long-standing HTTP/HTTPS connections. We used to schedule short jobs (only a couple of minutes) via HTTP and wait for them to finish and send a response. This worked fine until the jobs got longer (hours) and some network infrastructure closed the inactive connections. We ended up only submitting the request via HTTP, getting an immediate response, and then implementing polling to wait for the result. At the time, the migration was pretty quick for us, but since then we have migrated even further to use "webhooks", i.e. allowing the processor of the job to signal its state back to the server at a known webhook address.
IMHO, you should upgrade your scheduler to talk to a REST API server. WebSocket isn't effective in this scenario, since the connection will be inactive most of the time.
The jobs can be short-lived or long-running. So when a long-running job fails in the middle, how does the restart happen? Does it start from the beginning again?
In a similar scenario, we had a database to keep track of the progress of the job (number of records successfully processed), so jobs could resume after a failure. With such a design, another web service can monitor the status of the job by looking at the database, so the main process is not impacted by constant polling from the client.
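A rough sketch of that checkpointing idea, assuming a hypothetical JobProgressDao over a progress table (job_id, records_done, status) and a process() step and Record type you supply:

    // Persist progress after each record (or batch) so a restarted job can resume
    // where it left off instead of starting from the beginning.
    void runJob(String jobId, List<Record> records, JobProgressDao dao) {
        int done = dao.loadRecordsDone(jobId);   // 0 on first run, >0 after a restart
        for (int i = done; i < records.size(); i++) {
            process(records.get(i));
            dao.saveRecordsDone(jobId, i + 1);   // checkpoint
        }
        dao.markCompleted(jobId);
    }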
We have a Spring Java app that uses EWS to connect to our on-prem Exchange 2016 server and pull emails via streaming. Every 30 minutes a new 30-minute subscription is made (via a new thread). We assume the old connection just expires.
When one instance is running in our environment, it works perfectly fine, but when two instances run, after some time one instance will eventually start throwing an error:
You have exceeded the available concurrent connections for your account. Try again once your other requests have completed.
It looks like an issue caused by throttling. I found that the Exchange server's config is:
EWSMaxConcurrency=27, MaxStreamingConcurrency=10,
HangingConnectionLimit=10
Our code previously didn't explicitly close connections or unsubscribe (it ran fine that way with one instance). We tried adding both, but the issue still persists, and we noticed the close method for StreamingSubscriptionConnection throws an error. The team that handles the Exchange server can find errors referencing the exceeded-connection-count error above, but nothing relating to the close error:
...[m.e.w.d.n.StreamingSubscriptionConnection.close(349)]: java.lang.Exception: microsoft.exchange.webservices.data.notification.StreamingSubscriptionConnection
Currently we don't have much ability to make changes on the Exchange server side. I'm not familiar with SOAP messages, but I was planning to look into how to monitor them to see what inbound and outbound messages there are, for some insight.
For the service I set service.setTraceEnabled(true) and service.setTraceFlags(EnumSet.allOf(TraceFlags.class)).
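For context, the setup looks roughly like this, with an explicit trace listener added as an assumption on my part (this presumes ews-java-api exposes ITraceListener/setTraceListener as in the .NET EWS Managed API; imports omitted, and log is an assumed SLF4J logger):

    // Enable tracing of all EWS traffic and route it through our own logging,
    // so subscribe/unsubscribe requests show up as well, not just notifications.
    service.setTraceEnabled(true);
    service.setTraceFlags(EnumSet.allOf(TraceFlags.class));
    service.setTraceListener(new ITraceListener() {
        public void trace(String traceType, String traceMessage) {
            log.info("EWS {}: {}", traceType, traceMessage);
        }
    });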
However, I only see trace messages in the console when an email arrives. I don't see any messages during startup when a subscription/connection is created.
Can anyone provide any advice on how I can monitor these subscription-related messages?
I tried using SoapUI but I'm having difficulty applying our server's WSDL. I considered using the Tunnelij plugin for IntelliJ, but I'm not too familiar with how to set that up either.
My suspicion is that there is some intermittent latency issue on the Exchange server side; perhaps response messages are not coming back in a timely manner, and this is causing the problem. I presume that if I monitor these SOAP messages I should see more than 10 subscribe requests before that error appears.
The EWS logs on the CAS (Client Access Server) should have details about the throttling issue. Are you using impersonation in your application? If you are not using impersonation, the concurrent connections are charged against the account you are connecting with; with impersonation, they are charged against the account you are impersonating. The difference is that a single user can have no more than 10 streaming subscriptions (unless you modify the web.config), whereas if you use impersonation you can scale your application to thousands of users; see https://github.com/MicrosoftDocs/office-developer-exchange-docs/blob/main/docs/exchange-web-services/how-to-maintain-affinity-between-group-of-subscriptions-and-mailbox-server.md
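For reference, a minimal sketch of switching the ExchangeService to impersonation with ews-java-api; the mailbox address is a placeholder and imports are omitted:

    // Subscriptions are then charged against the impersonated mailbox rather than
    // the single service account, which avoids the per-user streaming subscription cap.
    service.setImpersonatedUserId(
            new ImpersonatedUserId(ConnectingIdType.SmtpAddress, "user@example.com"));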
When I send a stop signal (either with kill -SIGINT <pid>, System.exit(0), or environment.getApplicationContext().getServer().stop()) to the application, it waits for the shutdownGracePeriod (by default 30 seconds, or whatever I configure in the .yml file) and does not accept new requests. However, my requirement is to make the server wait for the ongoing request to complete before stopping. The ongoing request may take 30 seconds or 30 minutes; it is unknown. Can somebody suggest a way to achieve this?
Note: I've referred to the links below but could not achieve this.
How to shutdown dropwizard application?
shutdownGracePeriod
We've used an in-app healthcheck combined with an external load-balancer service and pre-stop scripts. The healthcheck is turned off by the pre-stop script, so the healthcheck reports the instance as unhealthy and no new requests are sent by the load balancer (but existing ones are still processed); only after a draining period is a stop signal sent to the application.
Even this, though, has a specified time limit. I don't know how you would monitor requests that last an unknown amount of time.
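A minimal sketch of such a toggleable healthcheck in Dropwizard; the draining flag and the hook that flips it are assumptions, not part of Dropwizard itself:

    import com.codahale.metrics.health.HealthCheck;
    import java.util.concurrent.atomic.AtomicBoolean;

    // Reports unhealthy once the pre-stop hook flips the flag, so the load balancer
    // stops routing new traffic while in-flight requests finish.
    public class DrainingHealthCheck extends HealthCheck {
        private final AtomicBoolean accepting = new AtomicBoolean(true);

        public void startDraining() {   // called from whatever your pre-stop script can reach
            accepting.set(false);
        }

        @Override
        protected Result check() {
            return accepting.get() ? Result.healthy() : Result.unhealthy("draining");
        }
    }

You would register it with environment.healthChecks().register("draining", check) and expose startDraining() via an admin endpoint or similar.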
I have JBoss 5 with ejb3 beans deployed to it.
If a bean method execution takes a very long time (I checked with one that ran for 2 hours), the client does not receive the answer when the EJB method finishes (whether with an exception or not).
The client is blocked waiting for response from socket.
Why does that happen?
Most likely this is caused by a (stateful) router, packet filter, load balancer, SSL box, or whatever sits in between: they just terminate the connection after a certain period of inactivity, and the real endpoints are not notified. Experience shows that it's normally out of your control to set suitable timeouts on each device.
Anyway, in your case, instead of curing the symptoms: a running request needs an open TCP connection and possibly blocks a thread. So consider changing the design of your system from synchronous to asynchronous:
Use polling here; every minute should be enough. You need one function to submit a task and another that returns either "not yet ready" or "here is the result" (see the sketch after this list).
Use JMS queues in your client to submit tasks and to receive results.
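A minimal sketch of the polling contract from the first option; the interface and type names are hypothetical:

    // Submission returns immediately with an ID; the client polls until a result exists.
    public interface TaskService {
        String submitTask(TaskRequest request);   // returns a task ID right away
        TaskResult pollResult(String taskId);     // null while "not yet ready"
    }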
I need to add a timeout to a J2ME application that uses ksoap 2 to connect to a web service.
I've tried the method described as a possible pseudo timeout at http://ksoap2.sourceforge.net/doc/api/org/ksoap2/transport/HttpTransport.html, but it doesn't seem to function on this device.
I'd run the connection on another thread and kill it if a timer fires, but there's no way to kill a thread before it finishes executing in J2ME, per http://developers.sun.com/mobility/midp/articles/threading2/ (this is an embedded device, so I can't just leave an indefinite number of threads blocking in the background). I can't use the poll-a-boolean method, since it's the single attempt to open the connection that blocks.
The system timeout seems to vary between device models and is too long for my purposes.
Does anybody have any thoughts as to something that might work?
I ended up using the Socket class which has the setSoTimeout() method.
I could mention that I made a modification to KSoap2 v2.5.2 to support a timeout for the HttpTransportSE class. It throws a SocketTimeoutException when the timeout occurs.
Both the jar and the source are found at this URL: http://www.lightsoft.se/?p=707
Keep in mind you are not dealing with fully functional computers. On some devices, you just can't interrupt network operations, especially the TCP connect.
This is what we do,
Before making the connection, create a monitoring timer thread with a short period (say 2 seconds).
In the monitoring thread, you can send some message to the device pretending you are making progress, as long as the time limit has not been reached.
If the time limit is reached, try to interrupt the other thread by calling Thread.interrupt(). This call is available in MIDP.
On the connection thread, just quit if interrupted.
This works great on all emulators, but on some phones the connection thread doesn't get the exception until 5 minutes later.
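A rough sketch of those steps using java.util.Timer as the watchdog; connectAndCall() is a hypothetical wrapper around the blocking ksoap2 call, and the 10-second limit is just an example:

    // Run the blocking call on its own thread.
    final Thread connectionThread = new Thread(new Runnable() {
        public void run() {
            try {
                connectAndCall();            // blocking ksoap2 HttpTransport call
            } catch (Exception e) {
                // treat interruption / IO failure as a timeout
            }
        }
    });
    connectionThread.start();

    // Watchdog: interrupt the connection thread if it runs past the limit.
    Timer watchdog = new Timer();
    watchdog.schedule(new TimerTask() {
        public void run() {
            if (connectionThread.isAlive()) {
                connectionThread.interrupt(); // best effort; some devices ignore this
            }
        }
    }, 10000);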