I have a REST service that calls another remote service.
Most of the time the communication works fine, but occasionally, I encounter
org.apache.cxf.jaxrs.client.ClientWebApplicationException:
org.apache.cxf.interceptor.Fault: Could not send Message.
org.apache.cxf.interceptor.Fault: Could not send Message.
SocketException invoking https://url: Unexpected end of file from server
I did some research and found that it means the remote server shut down the connection unexpectedly.
It really puzzles me, because everything (input, headers, etc.) is the same, and I was only testing with a small number of requests (50-100). I have tried sending them both sequentially and in parallel, and only a few run into this issue.
Why would this happen? Do I have to ask the remote server to reconfigure, or do I have to implement a retry pattern in my service in this case?
Any hint?
Thanks
P.S. I am using org.apache.cxf.jaxrs.client.WebClient to invoke the remote service. Will it make any difference if I switch to HttpClient?
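In case a retry does turn out to be the pragmatic fix, this is a minimal sketch of what I have in mind around WebClient (my own illustration, not tested against the failing service; MAX_RETRIES and the JSON media type are arbitrary assumptions):

import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
import org.apache.cxf.jaxrs.client.WebClient;

public class RetryingCaller {
    private static final int MAX_RETRIES = 3; // arbitrary

    public Response postWithRetry(String url, Object payload) {
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= MAX_RETRIES; attempt++) {
            try {
                // a fresh WebClient per attempt, so a half-closed connection is not reused
                WebClient client = WebClient.create(url).type(MediaType.APPLICATION_JSON);
                return client.post(payload);
            } catch (RuntimeException e) {
                // ClientWebApplicationException wrapping the Fault is a RuntimeException
                lastFailure = e;
            }
        }
        throw lastFailure;
    }
}

Whether retrying is appropriate of course depends on whether the remote call is idempotent.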
The problem:
I am seeing some strange behaviour from a Jetty server (REST over HTTPS) when some client connections are closed (client-side) before the server has had time to reply. Normally this is handled well and expected by a web server/application server, but in one specific instance something breaks and the server stops replying.
I am trying to reproduce the issue programmatically and locally, by opening a client connection and closing it before the server has had time to reply, but I do not have much experience with this kind of situation; normally the clients I write are not expected to die immediately.
I do not care which language/tool I use to replicate the case; it can be a Java program, a netcat command, telnet, dotnetcore... The only constraint is that it should run in a Kubernetes pod, if possible.
I have tried using Java to open a socket and close it immediately, or to create an HTTP client and stop it immediately after the request is sent, but with no luck so far.
At the same time I am looking at netcat, but I fear it is too low-level for a REST request.
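For reference, this is roughly the kind of client I have been experimenting with: it sends a request over TLS and closes the socket without reading the reply. Host, port and path are placeholders, and a self-signed server certificate would additionally need a trust store, which I am leaving out here.

import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public class AbruptClient {
    public static void main(String[] args) throws Exception {
        String host = "localhost";   // placeholder
        int port = 8443;             // placeholder
        String request = "GET /some/endpoint HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Connection: close\r\n"
                + "\r\n";

        SSLSocket socket = (SSLSocket) SSLSocketFactory.getDefault().createSocket(host, port);
        // socket.setSoLinger(true, 0); // uncomment to force a TCP RST instead of a normal FIN on close
        socket.startHandshake();
        OutputStream out = socket.getOutputStream();
        out.write(request.getBytes(StandardCharsets.US_ASCII));
        out.flush();
        // close immediately, before the server has had time to reply
        socket.close();
    }
}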
I am upgrading a standalone Java app that uses IBM MQ to send messages to a local WebSphere 8.5 server. The existing app uses a handful of different JARs for the MQ code (mq, mqbind, mqjms, connector-api, jms).
For the new one I saw that there is now an all-encompassing "allclient" MQ JAR (https://mvnrepository.com/artifact/com.ibm.mq/com.ibm.mq.allclient/9.2.0.0) so I decided to use that.
It appears to work just fine for the first few messages, but after 4-5 messages have been sent, all subsequent messages fail with reason code 2594 (https://www.ibm.com/support/knowledgecenter/en/SSFKSJ_9.1.0/com.ibm.mq.tro.doc/q120510_.htm):
Caused by: com.ibm.mq.jmqi.JmqiException: CC=2;RC=2594;AMQ9204: Connection to host 'localhost(5558)' rejected. [1=com.ibm.mq.jmqi.JmqiException[CC=2;RC=2594;AMQ9503: Channel negotiation failed. [3=WAS.JMS.SVRCONN ]],3=localhost(5558),5=RemoteConnection.initSess]
at com.ibm.mq.jmqi.remote.api.RemoteFAP$Connector.jmqiConnect(RemoteFAP.java:13588)
at com.ibm.mq.jmqi.remote.api.RemoteFAP$Connector.access$100(RemoteFAP.java:13125)
at com.ibm.mq.jmqi.remote.api.RemoteFAP.jmqiConnect(RemoteFAP.java:1430)
at com.ibm.mq.jmqi.remote.api.RemoteFAP.jmqiConnect(RemoteFAP.java:1389)
at com.ibm.mq.ese.jmqi.InterceptedJmqiImpl.jmqiConnect(InterceptedJmqiImpl.java:377)
at com.ibm.mq.ese.jmqi.ESEJMQI.jmqiConnect(ESEJMQI.java:562)
at com.ibm.mq.MQSESSION.MQCONNX_j(MQSESSION.java:916)
at com.ibm.mq.MQManagedConnectionJ11.<init>(MQManagedConnectionJ11.java:240)
On the server side, I get the following in the console:
CWSIC3712E: A WebSphere MQ client, previously connected from host 127.0.0.1:58963 on transport chain InboundBasicMQLink, has been disconnected because of exception java.io.IOException: Async IO operation failed (1), reason: RC: 55 The specified network resource or device is no longer available.
After this error occurs, any subsequent attempt to send a message fails with the same error. I have to restart the app, at which point the same thing repeats: the first 4-5 messages send before the failures begin. If I switch back to the old JARs without changing the code, I can send an unlimited number of messages without any issues.
The reason code confuses me ("An MQCONN or MQCONNX call was issued from a client connected application, but it failed to agree a password protection algorithm with the queue manager.") because if it is truly a password issue, why do the first few messages send without issue? It does not seem to be a problem with closing/disconnecting the queue/manager, because I wait a few seconds between sends and can breakpoint/println to confirm that both are closed each time before the next send.
Any ideas?
I found a partial workaround to this problem:
In the Java code we set a "replyToQueueName" on our MQMessage object. If I remove that setter, the problem seems to go away (we can send as many messages as we want without error).
I'm not certain why that works in this particular case. The failure happens on the MQQueueManager construction, which is much higher up in the code than the replyToQueueName setter. That, combined with the bug only occurring after 4-5 messages have been sent, seems to indicate that something is not being "closed" properly, but as far as I know there is no way to "close" a message, and we are already closing/disconnecting the manager and the queue.
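For context, a stripped-down sketch of our send path (connection details are taken from the error message above, queue manager and queue names are placeholders); the replyToQueueName line is the one whose removal makes the problem disappear:

import com.ibm.mq.MQEnvironment;
import com.ibm.mq.MQMessage;
import com.ibm.mq.MQPutMessageOptions;
import com.ibm.mq.MQQueue;
import com.ibm.mq.MQQueueManager;
import com.ibm.mq.constants.CMQC;

public class MqSender {
    public void send(String text) throws Exception {
        // connection details as in the error message above
        MQEnvironment.hostname = "localhost";
        MQEnvironment.port = 5558;
        MQEnvironment.channel = "WAS.JMS.SVRCONN";

        MQQueueManager manager = new MQQueueManager("QMGR");                    // placeholder name
        MQQueue queue = manager.accessQueue("TARGET.QUEUE", CMQC.MQOO_OUTPUT);  // placeholder name
        try {
            MQMessage message = new MQMessage();
            message.replyToQueueName = "REPLY.QUEUE"; // removing this line avoids the 2594 failures
            message.writeString(text);
            queue.put(message, new MQPutMessageOptions());
        } finally {
            // the queue and the manager are closed/disconnected after every send
            queue.close();
            manager.disconnect();
        }
    }
}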
I have created a microservice for a webcam stream with gRPC. The streaming works fine, but cancelling the stream only works on the client side.
If the client calls CancellableContext.cancel, the video streaming stops on the client, but the server keeps streaming from the cam. When the cancellation happens, the server throws a "Transport failed" exception.
Can this exception be caught on the server side so that the streaming (or other operations) can be stopped there?
ClientCall<KameraStreamRequest, KameraStreamResponse> call =
        (ClientCall) imageStreamBlockingStub.getChannel()
                .newCall(ImageStreamServiceGrpc.METHOD_IMAGE_DATA_STREAM,
                         imageStreamBlockingStub.getCallOptions());
call.sendMessage(KameraStreamRequest.newBuilder().setStreamState(StreamState.STOP).build());
The streaming is started with a simple request that carries an enum set to State.START. If I call the code above to change the state to STOP, I get an exception:
Exception in thread "main" java.lang.IllegalStateException: Not started
at com.google.common.base.Preconditions.checkState(Preconditions.java:174)
at io.grpc.internal.ClientCallImpl.sendMessage(ClientCallImpl.java:388)
at org.cpm42.grpcservice.ImageStreamClient.cancelStream(ImageStreamClient.java:70)
at org.cpm42.main.StreamClientMainClass.main(StreamClientMainClass.java:21)
I have read that this could be a bug.
Is it possible to get hold of sessions, connections, or something similar on the server side?
Thanks
This may be a slightly different problem, but for a simple client/server project I needed to add channel.shutdown().awaitTermination(5, SECONDS); to the client method making the requests in order to get rid of this exception:
io.grpc.netty.NettyServerTransport notifyTerminated
SEVERE: Transport failed
java.io.IOException: An existing connection was forcibly closed by the remote host.
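Roughly how that fits into the client (the channel construction is only illustrative, and on older grpc-java versions usePlaintext(true) is needed instead of usePlaintext()):

import java.util.concurrent.TimeUnit;
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public class SimpleClient {
    public void callServer() throws InterruptedException {
        ManagedChannel channel = ManagedChannelBuilder.forAddress("localhost", 50051)
                .usePlaintext()
                .build();
        try {
            // make the blocking request(s) through a stub created on this channel,
            // e.g. MyServiceGrpc.newBlockingStub(channel).someCall(request);
        } finally {
            // shutting the channel down cleanly stopped the abrupt-disconnect
            // logging on the server for me
            channel.shutdown().awaitTermination(5, TimeUnit.SECONDS);
        }
    }
}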
What does the rest of your stacktrace say?
Not sure if it helps, but on the server side you could try polling the gRPC Context to see whether the call has been cancelled, and act accordingly:
import io.grpc.Context;
{ ...
Context.current().isCancelled()
...}
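A slightly fuller sketch of what that could look like in the server-side streaming implementation; the generated class and method names are only inferred from the stubs in your question, and captureFrame/releaseCamera are placeholders:

import io.grpc.Context;
import io.grpc.stub.StreamObserver;

public class ImageStreamServiceImpl extends ImageStreamServiceGrpc.ImageStreamServiceImplBase {

    @Override
    public void imageDataStream(KameraStreamRequest request,
                                StreamObserver<KameraStreamResponse> responseObserver) {
        // keep streaming frames until the client cancels the call
        while (!Context.current().isCancelled()) {
            responseObserver.onNext(captureFrame());
        }
        // the client has cancelled: stop the webcam instead of relying on the
        // "Transport failed" exception as the only signal
        releaseCamera();
    }

    private KameraStreamResponse captureFrame() {
        return null; // placeholder: grab one frame from the camera
    }

    private void releaseCamera() {
        // placeholder: shut the camera down
    }
}

If the responseObserver is actually a ServerCallStreamObserver, registering a callback with setOnCancelHandler(...) avoids polling altogether.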
I'm creating a small utility that receives a lot of HTTP requests. It is written in Java and uses embedded Jetty to handle the requests over HTTPS.
I have a load-testing tool for it, but after it has been running for some time it starts to throw exceptions:
java.net.BindException: Address already in use: connect
(note: this is on the sender's side, not in my project)
As I understand it, this means no free local (ephemeral) port was available when another connect was attempted. Throughput is about 1000 requests per second, and failures start to appear somewhere after 20,000 to 50,000 requests.
However, when I use the same load-testing tool against another program (a simple consumer written in Scala with Netty by a colleague - it just receives every request and returns an empty OK response), there is no such problem with sockets (though it is typically 1.5-2 times slower).
I wonder if this could be fixed by somehow telling Jetty to close connections immediately after the response is sent; each new request is sent over a new connection anyway. I tried playing with Connector#setIdleTimeout - it seems to be 30000 by default - but had no success.
What can I do to fix this, or at least to investigate the matter more deeply and find the cause (in case my assumptions are wrong)?
UPD: Thanks for the suggestions. I don't think I am allowed to post the source, but I get the idea that I should study the client's code (this will keep me busy for some time, since it is written in Scala).
It turned out there really was a problem with the client: it sends requests with Connection: Keep-Alive in the header, even though it creates a new HttpURLConnection for each request and calls disconnect() afterwards.
Since I am not allowed to change the testing utility, it was enough to fix this on the server side by sending Connection: close in the response header.
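In embedded Jetty that change is just a header on the response; a minimal sketch with a handler (the handler itself is illustrative, only the setHeader call matters):

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.eclipse.jetty.server.Request;
import org.eclipse.jetty.server.handler.AbstractHandler;

public class ClosingHandler extends AbstractHandler {
    @Override
    public void handle(String target, Request baseRequest,
                       HttpServletRequest request, HttpServletResponse response)
            throws IOException, ServletException {
        // ask the client not to keep the connection alive; Jetty closes it after
        // the response, so the load generator stops piling up open connections
        response.setHeader("Connection", "close");
        response.setStatus(HttpServletResponse.SC_OK);
        baseRequest.setHandled(true);
    }
}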
I have a Java-based client that receives data from a Tomcat 6.0.24 server webapp via JAX-WS. I recently upgraded the server with new functionality that can take a long time (over 30 seconds) to run for certain inputs.
It turns out that for these long operations, some kind of timeout is occurring. The client gets an HTTP 400 Bad Request error, and shortly afterwards (according to my log timestamps, at least) the server reports a Broken Pipe.
Here's the client's error message:
com.sun.xml.internal.ws.client.ClientTransportException: The server sent HTTP status code 400: Bad Request
And the server's:
javax.xml.ws.WebServiceException: javax.xml.stream.XMLStreamException: ClientAbortException: java.net.SocketException: Broken pipe
I've tried experimenting with adding timeout settings on the service's BindingProvider, but that doesn't seem to change anything. The default timeout is supposed to be infinite, right?
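For reference, this is the kind of thing I tried (the property names below are the ones for the JDK's built-in JAX-WS RI, values in milliseconds; other stacks use different keys, so treat this as a sketch rather than a definitive list):

import java.util.Map;
import javax.xml.ws.BindingProvider;

public class TimeoutConfig {
    // "port" is the generated JAX-WS proxy for the service
    static void applyTimeouts(Object port) {
        Map<String, Object> ctx = ((BindingProvider) port).getRequestContext();
        ctx.put("com.sun.xml.internal.ws.connect.timeout", 60000);
        ctx.put("com.sun.xml.internal.ws.request.timeout", 120000);
    }
}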
I don't know if it's relevant, but it might be worth noting that the client is an OSGI bundle running in a Karaf OSGI framework.
Bottom line, I have no idea what's going on here. Note that the new functionality does work when it doesn't have to run for too long. Also note that the size of the new functionality's response is not any larger than usual - it just takes longer to calculate.
In the end, the problem was caused by some sort of anti-DoS measure on the server's public gateway. Unfortunately, the IT department refused to fix it, forcing me to switch to polling-based communication. Oh well.