So basically I am simulating a 'Connection reset by peer' locally using Toxiproxy or WireMock (same behaviour with both) and I get the exception below.
The tricky part is that it bypasses any defined exception handlers, so there's no way to recover the flow when there's a failure.
I tried configuring the WebClient as mentioned here: https://github.com/reactor/reactor-netty/issues/388 by either removing the connection pool or simply setting the HttpClient keepAlive property to false.
I also defined a global exception handling mechanism by extending ErrorWebExceptionHandler, but the error never reaches it.
Do you have any suggestions on how to manage this one, or a proper configuration I could try for TcpClient, HttpClient or WebClient? What am I missing?
There must be a way to handle this properly.
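For reference, here is roughly how the client is configured and consumed (a minimal sketch assuming Spring WebFlux over Reactor Netty; the endpoint, types and fallback are illustrative, not my exact code):

// Disable pooling and keep-alive, as suggested in reactor-netty issue #388
HttpClient httpClient = HttpClient.create(ConnectionProvider.newConnection())
        .keepAlive(false);

WebClient webClient = WebClient.builder()
        .clientConnector(new ReactorClientHttpConnector(httpClient))
        .build();

Mono<String> response = webClient.get()
        .uri("http://localhost:8989/test")
        .retrieve()
        .bodyToMono(String.class)
        // in theory the reset should be catchable at the call site
        .onErrorResume(IOException.class, e -> Mono.just("fallback"));

And this is the exception that still shows up in the logs: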
[26/04/22 20:02:28] lvl=ERROR [ioEventLoopGroup-4-3] r.n.r.PooledConnectionProvider - [id: 0xaf559a88, L:0.0.0.0/0.0.0.0:56755 ! R:localhost/127.0.0.1:8989] Pooled connection observed an error
java.io.IOException: Connection reset by peer
at java.base/sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at java.base/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:276)
at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:233)
at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:223)
at java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:358)
at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:247)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1140)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:347)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:697)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:632)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:549)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:511)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
I know there are a ton of these posts, but this one is a little different. We are using vended code for part of our data processing system, and part of the system sends emails to clients when certain events take place on data insertion or deletion. Recently we have started getting 'address already in use' exceptions.

We checked the repository history, and nothing has changed in our code for this system in the last 6 months. We have already tried the typical solutions for this issue, including increasing the number of connections allowed to the port, with little success.

We had a meeting with the vendor, and I asked if anything had changed in their code, and whether they would assure us that all connections in their code are explicitly closed. They indicated that they are explicitly closing all sockets. However, they didn't show us the code, so there is no way for us to know if this is true other than taking their word for it.

So the only thing I can think of to do is continue to increase the number of connections to the port until we stop getting bind exceptions. What is the industry standard for the max number of connections to port 25; is there one? Also, if anyone has any other suggestions I would greatly appreciate it. Thanks so much in advance, Robert
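Edit: to clarify what I mean by 'explicitly closed', I'd expect something like this hypothetical JavaMail sketch (not the vendor's actual code, which we have never seen); if instead a fresh connection were opened per message under load, the client side could exhaust its local ports and produce exactly this kind of BindException:

import java.util.Properties;
import javax.mail.Message;
import javax.mail.Session;
import javax.mail.Transport;

public class SmtpBatchSender {
    // Reuse one SMTP connection for a batch and always release it,
    // even on failure, so the local socket is freed promptly.
    public static void send(String host, Message[] messages) throws Exception {
        Properties props = new Properties();
        props.put("mail.smtp.host", host);
        Session session = Session.getInstance(props);
        Transport transport = session.getTransport("smtp");
        try {
            transport.connect();
            for (Message msg : messages) {
                transport.sendMessage(msg, msg.getAllRecipients());
            }
        } finally {
            transport.close();
        }
    }
}

The relevant log entries: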
20210505112127.716 ERROR m.fiserv.ppx.business.notification.EmailNotifier : MessagingException from notify
javax.mail.MessagingException: Could not connect to SMTP host: SERVER.URL.COM, port: 25;
nested exception is:
java.net.BindException: Address already in use: connect
at com.sun.mail.smtp.SMTPTransport.openServer(SMTPTransport.java:1545)
at com.sun.mail.smtp.SMTPTransport.protocolConnect(SMTPTransport.java:453)
Caused by:
java.net.BindException: Address already in use: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:90)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:380)
20210505131529.950 ERROR erv.ppx.web.controller.AuditReportViewController : Error while generating HTML
net.sf.jasperreports.engine.JRException: Error writing to OutputStream writer : CorpAdminAuditReport
at net.sf.jasperreports.engine.export.JRHtmlExporter.exportReport(JRHtmlExporter.java:496)
at com.fiserv.ppx.web.controller.AuditReportViewController.generateReport(AuditReportViewController.java:184)
Caused by:
com.ibm.wsspi.webcontainer.ClosedConnectionException: OutputStream encountered error during write
at com.ibm.ws.webcontainer.channel.WCCByteBufferOutputStream.write(WCCByteBufferOutputStream.java:188)
at com.ibm.ws.webcontainer.srt.SRTOutputStream.write(SRTOutputStream.java:97)
20210505140706.240 ERROR com.fiserv.ppx.business.db.DBConnectionUtil : Exception in getting for AppServer connection from DataSource.
com.ibm.websphere.ce.cm.ConnectionWaitTimeoutException: J2CA1010E: Connection not available; timed out waiting for 180,005 seconds.
at com.ibm.ws.rsadapter.AdapterUtil.toSQLException(AdapterUtil.java:1680)
at com.ibm.ws.rsadapter.jdbc.WSJdbcDataSource.getConnection(WSJdbcDataSource.java:661)
at com.ibm.ws.rsadapter.jdbc.WSJdbcDataSource.getConnection(WSJdbcDataSource.java:611)
Caused by:
com.ibm.websphere.ce.j2c.ConnectionWaitTimeoutException: J2CA1010E: Connection not available; timed out waiting for 180,005 seconds.
at com.ibm.ejs.j2c.FreePool.createOrWaitForConnection(FreePool.java:1781)
at com.ibm.ejs.j2c.PoolManager.reserve(PoolManager.java:3834)
at com.ibm.ejs.j2c.PoolManager.reserve(PoolManager.java:3082)
20210505140731.341 ERROR com.fiserv.ppx.business.db.DBConnectionUtil : Exception in getting for AppServer connection from DataSource.
com.ibm.websphere.ce.cm.ConnectionWaitTimeoutException: J2CA1010E: Connection not available; timed out waiting for 180,010 seconds.
at com.ibm.ws.rsadapter.AdapterUtil.toSQLException(AdapterUtil.java:1680)
at com.ibm.ws.rsadapter.jdbc.WSJdbcDataSource.getConnection(WSJdbcDataSource.java:661)
at com.ibm.ws.rsadapter.jdbc.WSJdbcDataSource.getConnection(WSJdbcDataSource.java:611)
Caused by:
com.ibm.websphere.ce.j2c.ConnectionWaitTimeoutException: J2CA1010E: Connection not available; timed out waiting for 180,010 seconds.
at com.ibm.ejs.j2c.FreePool.createOrWaitForConnection(FreePool.java:1781)
at com.ibm.ejs.j2c.PoolManager.reserve(PoolManager.java:3904)
at com.ibm.ejs.j2c.PoolManager.reserve(PoolManager.java:3082)
20210505140731.341 ERROR com.fiserv.ppx.sso.controller.SSOController : SSO Configuration error
java.lang.NullPointerException
at com.fiserv.ppx.business.db.PPXDbTransactionManager.<init>(PPXDbTransactionManager.java:60)
at com.fiserv.ppx.sso.impl.SSOLoginAuthenticator.authenticateSSOUser(SSOLoginAuthenticator.java:157)
I'm trying to use Reactor Netty TcpClient in a reactive way to interact with hosts that may be unreachable. Here is an example of the channel initialization logic:
ConnectionProvider connectionProvider = ConnectionProvider.fixed("fixed", 50);
TcpClient.create(connectionProvider)
.host(host).port(port)
.wiretap(true)
.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 50)
.doOnConnect(x -> log.trace("Connect to {}:{}", host, port))
.doOnConnected(conn -> log.trace("Connected {}", conn.channel()))
.connect()
.subscribe(this::utilizeConnection);
The output that I'm receiving:
2019-09-04 08:23:13.612 TRACE 71988 --- [ioEventLoop-4-3] c.c.pcb.poc.network.tcp.NettyTcpSender : Connect to 192.168.88.210:2000
2019-09-04 08:23:13.684 WARN 71988 --- [actor-tcp-nio-4] io.netty.util.concurrent.DefaultPromise : An exception was thrown by reactor.netty.resources.PooledConnectionProvider$DisposableAcquire.operationComplete()
reactor.core.Exceptions$ErrorCallbackNotImplemented: io.netty.channel.ConnectTimeoutException: connection timed out: /192.168.88.210:2000
Caused by: io.netty.channel.ConnectTimeoutException: connection timed out: /192.168.88.210:2000
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:267) ~[netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38) ~[netty-common-4.1.36.Final.jar:4.1.36.Final]
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:127) ~[netty-common-4.1.36.Final.jar:4.1.36.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) [netty-common-4.1.36.Final.jar:4.1.36.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:405) [netty-common-4.1.36.Final.jar:4.1.36.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:906) [netty-common-4.1.36.Final.jar:4.1.36.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.36.Final.jar:4.1.36.Final]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Assembly trace from producer [reactor.core.publisher.MonoCreate] :
reactor.core.publisher.Mono.create(Mono.java:183)
reactor.netty.resources.PooledConnectionProvider.acquire(PooledConnectionProvider.java:130)
Error has been observed by the following operator(s):
|_ Mono.create ⇢ reactor.netty.resources.PooledConnectionProvider.acquire(PooledConnectionProvider.java:130)
|_ Mono.doOnSubscribe ⇢ reactor.netty.tcp.TcpClientDoOn.connect(TcpClientDoOn.java:58)
The 'inbound' and 'outbound' each have a dedicated method to handle their errors, but they work on top of a Connection instance that will never be created if you get a connection timeout.
What I tried:
- The exception I'm receiving is wrapped in 'ErrorCallbackNotImplemented', but I wasn't able to find any way to implement an 'ErrorCallback'.
- The log contains a warning message from 'io.netty.util.concurrent.DefaultPromise', but I wasn't able to find a way to supply my own Promise that would handle it the right way.
- I found no configuration that would somehow intercept connection timeouts.
- Workaround: the blocking approach to creating a connection (.block() instead of .subscribe()) would let me catch any connection-creation exception in a plain try-catch block, but I'd lose the benefits of the reactive approach.
Can somebody suggest anything that would help me find the right way to handle 'io.netty.channel.ConnectTimeoutException'?
Do not forget to implement your error callback
Usually reactor.core.Exceptions$ErrorCallbackNotImplemented is raised when the subscription was made via a lambda-based .subscribe method without an error consumer (the same applies to Mono and Flux).
If you look at the sources here and here, you will find the places where reactor.core.Exceptions$ErrorCallbackNotImplemented is thrown!
Action Points
In order to handle the original io.netty.channel.ConnectTimeoutException, I would recommend looking at the Handling Errors section of the Project Reactor documentation.
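For example, a minimal sketch of the snippet from the question with an error callback supplied (the handling logic itself is illustrative):

TcpClient.create(connectionProvider)
        .host(host).port(port)
        .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 50)
        .connect()
        .subscribe(
                this::utilizeConnection,                              // onNext: connection established
                e -> log.warn("Could not connect: {}", e.toString())  // onError: ConnectTimeoutException lands here
        );

Alternatively, an operator such as .onErrorResume(ConnectTimeoutException.class, e -> Mono.empty()) placed before the subscribe keeps the handling inside the reactive pipeline.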
I have a service-to-service connection that is intermittently throwing SSLHandshakeExceptions from a jersey client.
public static class MyClientFilter extends ClientFilter {
    @Override
    public ClientResponse handle(ClientRequest cr) throws ClientHandlerException {
        try {
            return getNext().handle(cr);
        } catch (ClientHandlerException e) {
            Throwable rootCause = e.getCause() != null ? e.getCause() : e;
            if (ConnectException.class.isInstance(rootCause) ||
                    SocketException.class.isInstance(rootCause) ||
                    SSLHandshakeException.class.isInstance(rootCause) // maybe?
            ) {
                //do some retry logic
            }
            throw e; // rethrow so callers still see the failure if it isn't retried
        }
    }
}
The fact that it is only happening intermittently (very rarely) says to me that my certificates and TLS are all configured correctly. In my client I am attempting to retry connections if they fail due to connection or socket exceptions. I am considering making an SSLHandshakeException also a retry-able exception, because in my case it seems like it should be, but I am wondering whether an SSLHandshakeException could be caused by a connection or socket issue and, if so, whether there is a way to tell.
Update:
The message of the exception seems to indicate that it could be a connection issue that is not related to SSL configuration:
Caused by: javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1002)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1564)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:347)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:249)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
... 44 common frames omitted
Can an SSLHandshakeException be a retry-able exception?
It is not entirely clear what you are asking:
- Does SSLHandshakeException itself retry? No, of course not.
- Are you permitted to retry a connection attempt following an SSLHandshakeException? Yes, you are permitted to retry.
- Is it advisable to retry? It will probably just fail again, but that depends on what is causing the connection to fail.
- Is it advisable to retry repeatedly? Definitely not.
Really what this boils down to is diagnosing the cause of the connection failures. To do this you will need to enable client-side debug logging for the SSL connections.
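For instance, the standard JSSE debug switch can be enabled like this (a minimal example; the exact debug categories can be tuned):

// Pass as a JVM flag when starting the client:
//   -Djavax.net.debug=ssl,handshake
// or set it programmatically before the first TLS connection is made:
System.setProperty("javax.net.debug", "ssl,handshake");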
A common cause for this kind of problem is that the client and server cannot negotiate a mutually acceptable SSL/TLS protocol version or cryptographic suite. This typically happens when one end is using an old SSL / TLS stack that is (by current standards) insecure. If this is the root cause then retrying won't help.
It is also possible ... but extremely unlikely ... that the server or the network "glitched" at just the wrong time.
The message of the exception seems to indicate that it could be a connection issue that is not related to SSL configuration.
Actually, I doubt it. It is standard behavior for a server to simply close the connection if the negotiation has failed; see RFC 8446 Section 4.1 for the details. The client will see that as a broken connection.
I am writing an HTTP client with Netty 4.1.12.Final and I have unit tests simulating the crash of the HTTP server in order to be able to handle it.
I noticed that, when it happens, the exceptionCaught callback method of my inbound handler is called with:
java.io.IOException: Une connexion existante a dû être fermée par l’hôte distant
at sun.nio.ch.SocketDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(Unknown Source)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.read(Unknown Source)
at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:288)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1100)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:372)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:123)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
Where the English equivalent of the exception message is quite probably:
java.io.IOException: An existing connection was forcibly closed by the remote host
Since this callback method is also called when an exception is thrown from the channelRead0 method of my inbound handler, I have a few questions:
1. Should I always consider an IOException "received" in the exceptionCaught callback as an indication that there is no point in continuing to use the channel?
2. Since channelRead0 is declared to throw Exception, should I catch all IOExceptions inside it in order to be sure that, when "receiving" an IOException in the exceptionCaught callback, it is related to the Channel?
3. Is there a way to know whether an exception "received" in the exceptionCaught callback is related to I/O operations or to handler operations?
Thank you for any hint!
1) If we are talking about a TCP connection, then yes: every IOException will result in the connection being closed automatically by Netty, as there is no way to recover.
2) I think I don't completely understand the question, as each exception passed through the exceptionCaught(...) method is related to the channel, which can be obtained via ctx.channel().
3) No, there is no way in general. That said, if it's a TCP connection, the exception is an IOException and it is triggered by the actual transport, we will close the connection.
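As an illustration, a minimal exceptionCaught sketch along those lines (the split and the logging are illustrative, not a prescribed pattern):

@Override
public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) {
    if (cause instanceof IOException) {
        // Transport-level failure: the channel is no longer usable.
        ctx.close();
    } else {
        // Most likely thrown by a handler (e.g. from channelRead0): log it,
        // then decide per case whether the channel can survive.
        System.err.println("Handler error on " + ctx.channel() + ": " + cause);
        ctx.close();
    }
}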
I'm using a Jetty-based servlet to do RPC, and I'm having an issue where a request that takes a long time throws the following exception on the server:
2012-02-11 21:07:07,673 [btpool0-4] DEBUG org.mortbay.log - EXCEPTION
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(Unknown Source)
at org.mortbay.io.ByteArrayBuffer.readFrom(ByteArrayBuffer.java:168)
at org.mortbay.io.bio.StreamEndPoint.fill(StreamEndPoint.java:99)
at org.mortbay.jetty.bio.SocketConnector$Connection.fill(SocketConnector.java:190)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:277)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:203)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:357)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:217)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:475)
2012-02-11 21:07:07,674 [btpool0-4] DEBUG org.mortbay.log - EOF
I tried setting the Connection: Keep-Alive HTTP request header, but that had no effect, and from what I can gather, HTTP/1.1 (which I'm pretty sure I'm using) uses persistent connections by default.
So I think there are two ways I can try to address this:
1. Figure out how to prevent the timeout exception from being thrown at all.
2. Have the client issue the initial request without waiting for a response, and then ping with separate requests to check when the server is done.
Update (2/12/2012): I set the maxIdleTime as Tim suggested and that did extend the time before the timeout occurred, but then I started getting a new exception:
2012-02-11 23:24:01,187 [btpool0-1] DEBUG org.mortbay.log - EXCEPTION
java.io.IOException: An existing connection was forcibly closed by the remote host
at sun.nio.ch.SocketDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(Unknown Source)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.read(Unknown Source)
at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
at org.mortbay.io.nio.ChannelEndPoint.fill(ChannelEndPoint.java:129)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:277)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:203)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:357)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:329)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:475)
So something outside of Jetty was killing the connection; a firewall is my prime suspect. What I ended up doing was making the server process the request with multiple threads: the original thread immediately responds to the HTTP request, and a second thread is kicked off to perform the action that was taking a long time. The client then polls with HTTP requests to check when the action on the server is complete.
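In outline, the pattern looks like the hypothetical sketch below (simplified; a real version needs job expiry, authentication and error handling):

import java.io.IOException;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class LongTaskServlet extends HttpServlet {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final ConcurrentMap<String, Future<?>> jobs = new ConcurrentHashMap<String, Future<?>>();

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        String jobId = UUID.randomUUID().toString();
        jobs.put(jobId, pool.submit(new Runnable() {
            public void run() {
                performLongRunningAction();
            }
        }));
        resp.getWriter().write(jobId); // respond immediately; the client polls via GET
    }

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        Future<?> job = jobs.get(req.getParameter("jobId"));
        resp.getWriter().write(job != null && job.isDone() ? "DONE" : "PENDING");
    }

    private void performLongRunningAction() {
        // the slow work goes here
    }
}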
This is a socket timeout, so nothing you do at the HTTP level can fix it; hence your keep-alive not achieving anything.
Try setting the maxIdleTime on the SocketConnector
See here: http://docs.codehaus.org/display/JETTY/Configuring+Connectors (archive link)
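For instance, with the Jetty 6 (org.mortbay) API that matches the stack traces above, the idle timeout can be raised like this (a minimal sketch; the port and timeout values are illustrative):

import org.mortbay.jetty.Server;
import org.mortbay.jetty.bio.SocketConnector;

public class JettyIdleTimeoutExample {
    public static void main(String[] args) throws Exception {
        Server server = new Server();
        SocketConnector connector = new SocketConnector();
        connector.setPort(8080);
        connector.setMaxIdleTime(300000); // ms of idle time allowed before the read times out
        server.addConnector(connector);
        server.start();
    }
}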