What's the correct way of handling a websocket error besides logging it?
Regarding onError(), the Endpoint documentation states that:
Developers may implement this method when the web socket session
creates some kind of error that is not modeled in the web socket
protocol. This may for example be a notification that an incoming
message is too big to handle, or that the incoming message could not
be encoded.
There are a number of categories of exception that this method is
(currently) defined to handle:
connection problems, for example, a socket failure that occurs before the web socket connection can be formally closed. These are modeled as SessionExceptions
runtime errors thrown by developer-created message handler calls.
conversion errors encoding incoming messages before any message handler has been called. These are modeled as DecodeExceptions
Are all of these types of exceptions fatal, causing the websocket to close?
Should the onError() method close the websocket (call Session.close()) if an error occurs?
So far, I assumed it's my responsibility to cleanly close the session and inform the client about the close reason. This is why my onError() tried invoking session.close() if session.isOpen() returned true, but this caused Tomcat (8.0.15) to throw a NullPointerException:
...
Caused by: java.lang.NullPointerException
at org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer.onWritePossible(WsRemoteEndpointImplServer.java:96)
at org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer.doWrite(WsRemoteEndpointImplServer.java:81)
at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.writeMessagePart(WsRemoteEndpointImplBase.java:444)
at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.startMessage(WsRemoteEndpointImplBase.java:335)
at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.startMessageBlock(WsRemoteEndpointImplBase.java:264)
at org.apache.tomcat.websocket.WsSession.sendCloseMessage(WsSession.java:536)
at org.apache.tomcat.websocket.WsSession.doClose(WsSession.java:464)
at org.apache.tomcat.websocket.WsSession.close(WsSession.java:441)
at my.package.MyEndpoint.onWebSocketError(MyEndpoint.java:229)
... 18 more
Is this a tomcat bug, a misunderstanding on my part, or both?
Edit: It seems that the Java EE websocket example dukeeetf2 assumes that errors are fatal and that there's no need to close the session. The errors are logged, and the session is removed:
@OnError
public void error(Session session, Throwable t) {
    /* Remove this connection from the queue */
    queue.remove(session);
    logger.log(Level.INFO, t.toString());
    logger.log(Level.INFO, "Connection error.");
}
An @OnError method invocation does not mean that the Session will be closed; you can do whatever you want, it depends on the contract specified by your application.
The stacktrace from the Tomcat implementation does look like a bug.
Regarding the dukeeetf2 sample - this code seems to rely on other assumptions: the endpoints do not throw exceptions, so everything caught here comes from the underlying WebSocket framework implementation. That does not necessarily mean there is a "connection error". I would maybe do the close right away (if that is how I want my application to handle errors); this implementation could result in opened connections without any messages.
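For illustration, here is a minimal sketch of the "close right away" approach using the javax.websocket API (class name, endpoint path and close message are hypothetical, not from the original question):
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.websocket.CloseReason;
import javax.websocket.OnError;
import javax.websocket.Session;
import javax.websocket.server.ServerEndpoint;

@ServerEndpoint("/demo")
public class ClosingErrorEndpoint {
    private static final Logger logger = Logger.getLogger(ClosingErrorEndpoint.class.getName());

    @OnError
    public void onError(Session session, Throwable t) {
        logger.log(Level.WARNING, "WebSocket error", t);
        // Close with an explicit reason so the client knows why the session
        // ended, instead of being left with a half-dead connection.
        if (session.isOpen()) {
            try {
                session.close(new CloseReason(
                        CloseReason.CloseCodes.UNEXPECTED_CONDITION, "Internal error"));
            } catch (IOException e) {
                // The connection may already be gone; nothing more can be done.
                logger.log(Level.FINE, "Failed to close session after error", e);
            }
        }
    }
}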
I know this is a bit dated, but I ended up here today while looking for this info.
Depending on how you rely on the state of the websocket, you need to close the session manually, at least for the javax.websocket implementation.
In my case, the error was causing a problem for our client session administration code, so I closed the session as in the example above.
I think it depends on what you need, but the container certainly does not close the session for you in this implementation.
Related
I have a class that wraps a SocketChannel and has a close() method as follows:
public void close() {
    // ... logic ...
    try {
        socketChannel.close();
    } catch (IOException e) {
        // ???
    }
    this.isConnected = false;
}
At the end of this operation I want the socketChannel to be closed and no longer registered with its selector. From what I've read, the above code is sufficient for that, but what happens if I get an IOException?
My feeling is that "swallowing" it is enough, but am I missing something?
The answer will depend on whether it matters that the close threw an exception. And if it matters, the next question is whether you need to do something about it ... other than reporting it.
Scenario #1.
A web server gets an exception when closing the output stream it sent the response on. A typical cause is that the user closed his web browser or lost his network connection at the wrong moment. The server-side exception doesn't matter (to the server / server admin) and is not even worth logging.
Scenario #2.
You are doing something that involves talking to multiple servers, and it is important to know that they all "got the message". If an exception occurs in the close, that may be an indication that that didn't happen. Probably you need to log this. Maybe you need to tell the servers. Maybe you need to cause some enclosing transaction to rollback.
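For scenario #1, where the failure only needs to be visible to whoever reads the logs, a minimal sketch of a quiet-close helper (class and method names are hypothetical) that logs the exception instead of silently swallowing it:
import java.io.IOException;
import java.nio.channels.SocketChannel;
import java.util.logging.Level;
import java.util.logging.Logger;

public final class ChannelUtil {
    private static final Logger logger = Logger.getLogger(ChannelUtil.class.getName());

    private ChannelUtil() {}

    // Close the channel, logging rather than silently swallowing any IOException.
    public static void closeQuietly(SocketChannel channel) {
        try {
            channel.close();
        } catch (IOException e) {
            logger.log(Level.FINE, "Ignored exception while closing channel", e);
        }
    }
}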
I am using Apache Tomcat 8.0.33.
I was going through the Java documentation for RemoteEndpoint.Basic, which says that sendText(String text) blocks until all of the message has been transmitted.
But I noticed that when the client loses its internet connection and sendText() is called on the server side, it doesn't throw an IOException immediately; the method returns normally. The IOException is thrown later, and the onError() method is called.
Is this normal behaviour? Shouldn't sendText() block until the whole message has been transmitted successfully, or throw an IOException immediately if there's a problem?
Yes, this behavior is normal.
Depending on how the client disconnects, the server might not know, and the message will sit in the network buffer until the network stack figures out that the client has gone away.
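If you want a completion signal, one option is the asynchronous remote, which reports the outcome of the write through a Future. A minimal sketch (helper class and method names are hypothetical); note that even this only confirms the container finished the write, not that the peer actually received the bytes:
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import javax.websocket.Session;

public class SendHelper {
    // Send a text message and wait for the container to report the outcome.
    // result.get() throws an ExecutionException if the transmission failed.
    public static void sendAndConfirm(Session session, String message)
            throws InterruptedException, ExecutionException {
        Future<Void> result = session.getAsyncRemote().sendText(message);
        result.get();
    }
}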
I have a JUnit test of a JAX-RS web service. The test launches embedded tomcat, and then talks to it via the Apache CXF JAX-RS client.
Consider this backtrace:
Caused by: java.net.SocketException: Socket Closed
at java.net.PlainSocketImpl.getOption(PlainSocketImpl.java:286)
at java.net.Socket.getSoTimeout(Socket.java:1032)
at sun.net.www.http.HttpClient.available(HttpClient.java:356)
at sun.net.www.http.HttpClient.New(HttpClient.java:273)
at sun.net.www.http.HttpClient.New(HttpClient.java:310)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:987)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:923)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:841)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1031)
This fails only on CentOS 4.8. The same unit test (which launches an embedded tomcat and then talks to a web service in it) works just fine on a wide variety of other systems. Note the extreme oddity of this backtrace: HttpURLConnection has called HttpClient to get a new connection, and that latter class has apparently closed its own socket before the connection has been returned where any code of mine could get to it.
Further, there are companion tests that do the same server setup of the same service and talk to it without issues.
Even further, the following incantation (slightly abbreviated) is a workaround:
@Before
public void pingServiceToWorkAroundCentos() {
    try {
        /* ... code to make a connection to the service and close it ... */
    } catch (Throwable t) {
        // do nothing
    }
}
In other words, if I arrange for an extra throwaway connection before running each of the test cases, that uses up whatever this problem is.
What could this be?
Since there is only a backtrace and no code here, I am assuming that there is some sort of race condition or bug where the socket is being closed by another thread while the current thread is attempting to get the OutputStream.
Looking at the source for the JDK I see this...
public Object getOption(int opt) throws SocketException {
    if (isClosedOrPending()) {
        throw new SocketException("Socket Closed");
    }
    ... snip ...
The isClosedOrPending() method checks whether the internal FD is null or a close is pending, i.e. close() has been called on the socket.
Good luck tracking it down.
Nothing mysterious about it. You have closed the socket and then continued to use it.
Closing either the input or the output stream of the socket closes the other stream and the socket.
I am pretty sure this is a JDK bug.
HttpClient was modified in a recent commit:
http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/diff/e6dc1d9bc70b/src/share/classes/sun/net/www/http/HttpClient.java
The getSoTimeout() call needs to be in a try/catch block; for now, unfortunately, the only real option is to downgrade the JDK.
Looks similar to an issue we ran into where the HttpClient pooled connections were kept alive longer than the corresponding server-side connections in Tomcat. Basically this results in stale connections in the HttpClient connection pool, and when HttpClient tries to use these, they fail. I believe HttpClient actually recovers from this using the standard retry handler.
The solution is to double-check your timeout settings on the client and server side, and your retry policy.
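As a sketch of the client side of that double-check (the 20-second figure is an assumption; match it to your server's keep-alive timeout), HttpClient 4.x lets you cap how long a pooled connection is considered reusable via a keep-alive strategy:
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public class ClientFactory {
    public static CloseableHttpClient create() {
        return HttpClients.custom()
                // Treat pooled connections as reusable for at most 20 seconds,
                // which should stay below the server's keep-alive timeout.
                .setKeepAliveStrategy((response, context) -> 20_000L)
                .build();
    }
}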
I have a webservice which accepts a POST method with XML. It works fine, but on some random occasions it fails to communicate with the server, throwing an IOException with the message "The target server failed to respond". The subsequent calls work fine.
It happens mostly when I make some calls and then leave my application idle for 10-15 minutes; the first call I make after that returns this error.
I tried a couple of things.
I set up the retry handler like this:
HttpRequestRetryHandler retryHandler = new HttpRequestRetryHandler() {
    public boolean retryRequest(IOException e, int retryCount, HttpContext httpCtx) {
        if (retryCount >= 3) {
            Logger.warn(CALLER, "Maximum tries reached, exception would be thrown to outer block");
            return false;
        }
        if (e instanceof org.apache.http.NoHttpResponseException) {
            Logger.warn(CALLER, "No response from server on " + retryCount + " call");
            return true;
        }
        return false;
    }
};
httpPost.getParams().setParameter(HttpMethodParams.RETRY_HANDLER, retryHandler);
but this retry handler never got called (yes, I am using the right instanceof check); while debugging, this class is never invoked.
I even tried setting HttpProtocolParams.setUseExpectContinue(httpClient.getParams(), false); but no use. Can someone suggest what I can do now?
IMPORTANT
Besides figuring out why I am getting the exception, one of my important concerns is why the retry handler isn't working here.
Most likely persistent connections that are kept alive by the connection manager become stale. That is, the target server shuts down the connection on its end without HttpClient being able to react to that event while the connection is idle, thus rendering the connection half-closed or 'stale'. Usually this is not a problem: HttpClient employs several techniques to verify connection validity upon its lease from the pool.
Even if the stale connection check is disabled and a stale connection is used to transmit a request message, the request execution usually fails in the write operation with a SocketException and gets automatically retried. However, under some circumstances the write operation can terminate without an exception and the subsequent read operation returns -1 (end of stream). In this case HttpClient has no other choice but to assume the request succeeded but the server failed to respond, most likely due to an unexpected error on the server side.
The simplest way to remedy the situation is to evict expired connections, and connections that have been idle longer than, say, 1 minute, from the pool after a period of inactivity. For details please see section 2.5, "Connection eviction policy", of the HttpClient 4.5 tutorial.
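With HttpClient 4.4+ this eviction policy can be enabled directly on the builder; a minimal sketch (the one-minute idle threshold follows the suggestion above):
import java.util.concurrent.TimeUnit;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public class EvictingClientFactory {
    public static CloseableHttpClient create() {
        return HttpClients.custom()
                .evictExpiredConnections()                 // drop connections past their keep-alive
                .evictIdleConnections(1, TimeUnit.MINUTES) // drop connections idle for over a minute
                .build();
    }
}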
The accepted answer is right but lacks a solution. To avoid this error, you can add setHttpRequestRetryHandler (or setRetryHandler for Apache HttpComponents 4.4) to your HTTP client, as in this answer.
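A minimal sketch of that approach using the stock DefaultHttpRequestRetryHandler (the retry count of 3 is an assumption; the second argument enables retrying requests that were already sent, which is only safe if your requests are idempotent):
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.DefaultHttpRequestRetryHandler;
import org.apache.http.impl.client.HttpClients;

public class RetryingClientFactory {
    public static CloseableHttpClient create() {
        // Retry up to 3 times, including requests that were already sent,
        // which is what a stale-connection failure looks like.
        return HttpClients.custom()
                .setRetryHandler(new DefaultHttpRequestRetryHandler(3, true))
                .build();
    }
}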
HttpClient 4.4 suffered from a bug in this area relating to validating possibly stale connections before returning them to the requestor. It didn't validate whether a connection was stale, and this then resulted in an immediate NoHttpResponseException.
This issue was resolved in HttpClient 4.4.1. See this JIRA and the release notes
Solution: change the ReuseStrategy to never
Since this problem is very complex and there are so many different factors that can fail, I was happy to find this solution in another post: How to solve org.apache.http.NoHttpResponseException
Never reuse connections:
configure in org.apache.http.impl.client.AbstractHttpClient:
httpClient.setReuseStrategy(new NoConnectionReuseStrategy());
The same can be configured on an org.apache.http.impl.client.HttpClientBuilder:
builder.setConnectionReuseStrategy(new NoConnectionReuseStrategy());
Although the accepted answer is right, IMHO it is just a workaround.
To be clear: it's a perfectly normal situation that a persistent connection may become stale. But unfortunately it's very bad when the HTTP client library cannot handle it properly.
Since this faulty behavior in Apache HttpClient was not fixed for many years, I definitely would prefer to switch to a library that can easily recover from a stale connection problem, e.g. OkHttp.
Why?
OkHttp pools HTTP connections by default.
It gracefully recovers from situations where an HTTP connection becomes stale and the request cannot be retried because it is not idempotent (e.g. POST). I cannot say the same about Apache HttpClient (see the NoHttpResponseException mentioned above).
It has supported HTTP/2.0 since early drafts and beta versions.
When I switched to OkHttp, my problems with NoHttpResponseException disappeared forever.
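For reference, a minimal OkHttp 3 sketch (the URL is a placeholder); retryOnConnectionFailure is true by default and is what silently recovers from stale pooled connections:
import java.io.IOException;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;

public class OkHttpExample {
    public static void main(String[] args) throws IOException {
        OkHttpClient client = new OkHttpClient.Builder()
                .retryOnConnectionFailure(true) // the default, shown explicitly
                .build();
        Request request = new Request.Builder()
                .url("https://example.com/") // placeholder URL
                .build();
        try (Response response = client.newCall(request).execute()) {
            System.out.println(response.code());
        }
    }
}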
Nowadays, most HTTP connections are considered persistent unless declared otherwise. However, to save server resources, the connection is rarely kept open forever; the default connection timeout of many servers is rather short, for example 5 seconds for Apache httpd 2.2 and above.
The org.apache.http.NoHttpResponseException error most likely comes from a persistent connection that was closed by the server.
It's possible to set the maximum time to keep unused connections open in the Apache Http client pool, in milliseconds.
With Spring Boot, one way to achieve this:
public class RestTemplateCustomizers {
    static public class MaxConnectionTimeCustomizer implements RestTemplateCustomizer {
        @Override
        public void customize(RestTemplate restTemplate) {
            HttpClient httpClient = HttpClientBuilder
                    .create()
                    .setConnectionTimeToLive(1000, TimeUnit.MILLISECONDS)
                    .build();
            restTemplate.setRequestFactory(
                    new HttpComponentsClientHttpRequestFactory(httpClient));
        }
    }
}
// In your service that uses a RestTemplate
public MyRestService(RestTemplateBuilder builder) {
    restTemplate = builder
            .customizers(new RestTemplateCustomizers.MaxConnectionTimeCustomizer())
            .build();
}
This can happen if disableContentCompression() is set on an HttpClient that uses a pooling connection manager, and the target server is trying to use gzip compression.
I had the same problem with Apache HttpClient 4.5.5. Adding the default header
Connection: close
resolved the problem.
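If you build the client programmatically, one way to send that header on every request (a sketch, assuming HttpClient 4.x) is a default header on the builder:
import java.util.Collections;
import org.apache.http.HttpHeaders;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicHeader;

public class NonPersistentClientFactory {
    public static CloseableHttpClient create() {
        // "Connection: close" on every request means no connection is ever
        // reused, trading throughput for robustness.
        return HttpClients.custom()
                .setDefaultHeaders(Collections.singletonList(
                        new BasicHeader(HttpHeaders.CONNECTION, "close")))
                .build();
    }
}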
Use PoolingHttpClientConnectionManager instead of BasicHttpClientConnectionManager
BasicHttpClientConnectionManager maintains only one connection at a time. It will make an effort to reuse that connection for subsequent requests with the same route; if the route does not match, it will close the existing connection and re-open it for the given route.
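A minimal sketch of wiring in the pooling manager (the pool sizes are assumptions; tune them for your workload):
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class PooledClientFactory {
    public static CloseableHttpClient create() {
        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        cm.setMaxTotal(100);          // total connections across all routes
        cm.setDefaultMaxPerRoute(20); // connections per host
        return HttpClients.custom()
                .setConnectionManager(cm)
                .build();
    }
}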
I faced the same issue and resolved it by adding "Connection: close" via a WireMock extension.
Step 1: create a new class ConnectionCloseExtension
import com.github.tomakehurst.wiremock.common.FileSource;
import com.github.tomakehurst.wiremock.extension.Parameters;
import com.github.tomakehurst.wiremock.extension.ResponseTransformer;
import com.github.tomakehurst.wiremock.http.HttpHeader;
import com.github.tomakehurst.wiremock.http.HttpHeaders;
import com.github.tomakehurst.wiremock.http.Request;
import com.github.tomakehurst.wiremock.http.Response;
public class ConnectionCloseExtension extends ResponseTransformer {

    @Override
    public Response transform(Request request, Response response, FileSource files, Parameters parameters) {
        return Response.Builder
                .like(response)
                .headers(HttpHeaders.copyOf(response.getHeaders())
                        .plus(new HttpHeader("Connection", "Close")))
                .build();
    }

    @Override
    public String getName() {
        return "ConnectionCloseExtension";
    }
}
Step 2: set the extension class in the WireMockServer like below:
final WireMockServer wireMockServer = new WireMockServer(options()
        .extensions(ConnectionCloseExtension.class)
        .port(httpPort));
In JMS it is easy to find out if a connection is lost: an exception happens. But how do I find out that the connection is there again?
Scenario: I use JMS to communicate with my server. Now my connection breaks (the server is down), which results in an exception. So far, so good. If the server comes up again and the connection is reestablished, how do I know that?
I don't see any listeners that would provide such information.
Ahhh...the old exception handling/reconnection conundrum.
There are some transport providers that will automatically reconnect your application for you, and some that make the app drive reconnection. In general, automatic reconnection hides the exception from the application. The downside is that you don't want the app to hang forever if all the remote messaging nodes are down, so ultimately you must include some reconnection logic.
Now here's the interesting part - how do you handle the exceptions in a provider neutral way? The JMS exception is practically worthless. For example, a "security exception" can be that the Java security policies are too restrictive, that the file system permissions are too restrictive, that the LDAP credentials failed, that the connection to the transport failed, that the open of the queue or topic failed or any of dozens of other security-related problems. It's the linked exception that has the details from the transport provider that really help debug the problem. My clients have generally taken one of three different approaches here...
Treat all errors the same. Close all objects and reinitialize them. This is JMS-portable.
Allow the app to inspect the linked exceptions to distinguish between fatal and transient errors (i.e. auth error vs. queue full). Not provider-portable.
Provider-specific error-handling classes. A hybrid of the other two.
In your case, the queue and topic objects are probably only valid in the context of the original connection. Assuming a provider that reconnects automatically, the fact that you got an exception means the reconnect failed and the context for the queue and topic objects could not be restored. Close all objects and reconnect.
Whether you want to do something more provider-specific such as distinguish between transient and permanent errors is one of those "it depends" things and you'll have to figure that out on a case-by-case basis.
The best way to monitor for connection exceptions is to set an exception listener, for example:
ConnectionFactory connectionFactory = (ConnectionFactory) context.lookup("jmsContextName");
connection = connectionFactory.createConnection();
connection.setExceptionListener(new ExceptionListener() {
    @Override
    public void onException(JMSException exception) {
        logger.error("ExceptionListener triggered: " + exception.getMessage(), exception);
        try {
            Thread.sleep(5000); // Wait 5 seconds (JMS server restarted?)
            restartJSMConnection();
        } catch (InterruptedException e) {
            logger.error("Error pausing thread: " + e.getMessage());
        }
    }
});
connection.start();
The JMS spec does not describe any transport protocol; it says nothing about connections (i.e., whether the broker should keep them alive or establish a new connection for every session). So, I think what you mean by
Now my connection breaks (server is down), which results in an exception.
is that you are trying to send a message and you are getting a JMSException.
I think the only way to see whether the broker is up is to try to send a message.
Your only option in the case of a connection-related JMSException is to attempt to reestablish the connection in your exception handler and retry the operation.
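A minimal sketch of that reconnect-and-retry idea (the class name and the fixed 5-second back-off are assumptions; production code would cap the attempts or use exponential back-off):
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;

public class JmsReconnector {
    private final ConnectionFactory factory;

    public JmsReconnector(ConnectionFactory factory) {
        this.factory = factory;
    }

    // Keep trying to open and start a connection until the broker is back.
    public Connection reconnect() throws InterruptedException {
        while (true) {
            try {
                Connection connection = factory.createConnection();
                connection.start();
                return connection;
            } catch (JMSException e) {
                Thread.sleep(5_000); // broker still down; wait and retry
            }
        }
    }
}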