Some Spring WebSocket Sessions never disconnect - java

I have a websocket solution for duplex communication between mobile apps and a Java backend, using Spring WebSockets with STOMP. I have implemented a ping/pong solution to keep the websockets open longer than 30 seconds, because I need longer sessions than that. Sometimes I get these errors in the logs, which seem to come from checkSession() in Spring's SubProtocolWebSocketHandler:
server.log: 07:38:41,090 ERROR [org.springframework.web.socket.messaging.SubProtocolWebSocketHandler] (ajp-http-executor-threads - 14526905) No messages received after 60205 ms. Closing StandardWebSocketSession[id=214a10, uri=/base/api/websocket].
They are not very frequent, but they happen every day, and the 60-second figure seems plausible since it is hardcoded into the Spring class mentioned above. But after the application has been running for a while, I start getting large numbers of these really long-lived 'timeouts':
server.log: 00:09:25,961 ERROR [org.springframework.web.socket.messaging.SubProtocolWebSocketHandler] (ajp-http-executor-threads - 14199679) No messages received after 208049286 ms. Closing StandardWebSocketSession[id=11a9d9, uri=/base/api/websocket].
And at about this time the application starts experiencing problems.
I've been trying to search for this behavior but haven't found it described anywhere on the web. Has anyone seen this problem before, know a solution, or can explain it to me?

We found some things:
We have added our own ping/pong functionality at the STOMP level that runs every 30 seconds.
The mobile client had a bug that caused it to keep replying to the pings even after going into screensaver mode. This meant that the websocket was never closed or timed out.
On each pong message the server received, Spring's check found that no 'real' messages had been received for a very long time and wrote the log line above. It then tries to close the websocket with this code:
session.close(CloseStatus.SESSION_NOT_RELIABLE);
but I suspect this doesn't close the session correctly. And even if it did, the mobile clients would try to reconnect. So after 30 more seconds another pong message reaches the server, causing yet another one of these log lines to be written. And so on, forever...
The solution was to write some server-side code to close old websockets, based on this project, and also to fix the bug in the mobile clients that made them respond to ping/pong while in screensaver mode. A minimal sketch of the cleanup idea follows.
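For reference, here is a rough sketch of what such a cleanup task could look like in a Spring app with @EnableScheduling. The class name, the 5-minute threshold, and the register/touch hooks are illustrative choices of mine, not taken from the linked project:

```java
// Hypothetical sketch: track the last 'real' message per session and close
// sessions that have gone quiet. Names and thresholds are illustrative.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import org.springframework.web.socket.CloseStatus;
import org.springframework.web.socket.WebSocketSession;

@Component
public class StaleSessionReaper {

    private static final long MAX_IDLE_MS = 5 * 60 * 1000; // assumption: 5 minutes

    private final Map<String, WebSocketSession> sessions = new ConcurrentHashMap<>();
    private final Map<String, Long> lastRealMessage = new ConcurrentHashMap<>();

    public void register(WebSocketSession session) {
        sessions.put(session.getId(), session);
        lastRealMessage.put(session.getId(), System.currentTimeMillis());
    }

    // Call this from your handler for application messages only, NOT for pongs,
    // so that pongs alone can no longer keep a session alive forever.
    public void touch(WebSocketSession session) {
        lastRealMessage.put(session.getId(), System.currentTimeMillis());
    }

    @Scheduled(fixedRate = 30_000) // requires @EnableScheduling on a config class
    public void reapStaleSessions() {
        long now = System.currentTimeMillis();
        sessions.forEach((id, session) -> {
            Long last = lastRealMessage.get(id);
            if (last != null && now - last > MAX_IDLE_MS) {
                try {
                    session.close(CloseStatus.SESSION_NOT_RELIABLE);
                } catch (Exception e) {
                    // The peer may already be gone; drop our references anyway.
                } finally {
                    sessions.remove(id);
                    lastRealMessage.remove(id);
                }
            }
        });
    }
}
```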
Oh, one thing that might be good for other people to know: clients should never be trusted. We saw that they could sometimes send multiple websocket connection requests within one millisecond, so make sure to handle these 'duplicate requests' in some way! One way to do that is sketched below.
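If you are on Spring, one way to reject such near-duplicate handshakes is a HandshakeInterceptor. A hedged sketch, assuming each client can be identified somehow (the clientId extraction below is a placeholder):

```java
// Hypothetical sketch: reject a handshake if the same client connected less
// than a second ago. The identification scheme and threshold are assumptions.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.springframework.http.HttpStatus;
import org.springframework.http.server.ServerHttpRequest;
import org.springframework.http.server.ServerHttpResponse;
import org.springframework.web.socket.WebSocketHandler;
import org.springframework.web.socket.server.HandshakeInterceptor;

public class DuplicateHandshakeInterceptor implements HandshakeInterceptor {

    private static final long MIN_INTERVAL_MS = 1000; // assumption: 1 connect/second/client

    private final Map<String, Long> lastHandshake = new ConcurrentHashMap<>();

    @Override
    public boolean beforeHandshake(ServerHttpRequest request, ServerHttpResponse response,
                                   WebSocketHandler wsHandler, Map<String, Object> attributes) {
        // Placeholder: derive a real client id (token, header, query param) here.
        String clientId = String.valueOf(request.getRemoteAddress());
        long now = System.currentTimeMillis();
        Long last = lastHandshake.put(clientId, now);
        if (last != null && now - last < MIN_INTERVAL_MS) {
            response.setStatusCode(HttpStatus.TOO_MANY_REQUESTS);
            return false; // reject the duplicate handshake
        }
        return true;
    }

    @Override
    public void afterHandshake(ServerHttpRequest request, ServerHttpResponse response,
                               WebSocketHandler wsHandler, Exception exception) {
        // no-op
    }
}
```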

I am also facing the same problem.
The netstat output on Linux shows the TCP connections and their states as below:
1 LISTEN
13 ESTABLISHED
67 CLOSE_WAIT
67 TCP connections are waiting to be closed, but they never get closed. (CLOSE_WAIT means the remote side has already closed the connection but the local application has not yet called close() on its socket, so a growing CLOSE_WAIT count usually points to a missing close() in the application code.)

Related

How can I monitor EWS SOAP messages relating to subscription creation

We have a Spring Java app using EWS to connect to our on-prem Exchange 2016 server and 'stream' incoming emails. Every 30 minutes a new 30-minute subscription is made (via a new thread). We assume the old connection simply expires.
When one instance is running in our environment it works perfectly fine, but when two instances run, after some time one of them will eventually start throwing errors about
You have exceeded the available concurrent connections for your account. Try again once your other requests have completed.
It looks like we are being throttled. I found that the Exchange server's config is:
EWSMaxConcurrency=27, MaxStreamingConcurrency=10,
HangingConnectionLimit=10
Our code previously didn't explicitly close connections or unsubscribe (it ran fine that way with one instance). We tried adding both, but the issue still persists, and we noticed that the close method of StreamingSubscriptionConnection throws an error. The team that handles the Exchange server can find errors referencing the exceeded-connection-count message above, but nothing relating to the close-connection error:
...[m.e.w.d.n.StreamingSubscriptionConnection.close(349)]: java.lang.Exception: microsoft.exchange.webservices.data.notification.StreamingSubscriptionConnection
Currently we don't have much ability to make changes on the Exchange server side. I'm not familiar with SOAP messages, but I was planning to look into how to monitor them, to see what inbound and outbound messages there are and gain some insight.
For the service I set service.setTraceEnabled(true) and service.setTraceFlags(EnumSet.allOf(TraceFlags.class)).
However, I only see trace messages in the console when an email arrives. I don't see any messages during startup when a subscription/connection is created.
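For completeness, this is roughly my tracing setup. I assume that also registering an explicit ITraceListener (sketch below; the listener body is mine, and package paths may differ slightly between ews-java-api versions) would capture the subscription SOAP request/response pairs rather than only what happens to reach the console:

```java
// Sketch: route all EWS SOAP traffic through a trace listener, including the
// CreateSubscription/GetStreamingEvents calls made at startup.
import java.util.EnumSet;
import microsoft.exchange.webservices.data.core.ExchangeService;
import microsoft.exchange.webservices.data.core.enumeration.misc.TraceFlags;
import microsoft.exchange.webservices.data.misc.ITraceListener;

public class EwsTracing {
    public static void enableTracing(ExchangeService service) {
        service.setTraceEnabled(true);
        service.setTraceFlags(EnumSet.allOf(TraceFlags.class));
        service.setTraceListener(new ITraceListener() {
            @Override
            public void trace(String traceType, String traceMessage) {
                // Send to your real logger instead of stdout in production.
                System.out.println("[EWS " + traceType + "] " + traceMessage);
            }
        });
    }
}
```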
Can anyone provide any advice on how I can monitor these subscription-related messages?
I tried using SoapUI but I'm having difficulty applying our server's WSDL. I considered using the Tunnelij plugin for IntelliJ, but I'm not too familiar with how to set that up either.
My suspicion is that there is some intermittent latency issue on the Exchange server side; perhaps response messages are not coming back in a timely manner, and this is what breaks things. I presume that if I monitor these SOAP messages, I should see more than 10 subscribe requests before that error appears.
The EWS logs on the CAS (Client Access Server) should have details about the throttling issue. Are you using impersonation in your application? If you are not using impersonation, the concurrent connections are charged against the account you are connecting with; with impersonation they are charged against the account you are impersonating. The difference is that a single user can have no more than 10 streaming subscriptions (unless you modify the web.config), whereas with impersonation you can scale your application to thousands of users; see https://github.com/MicrosoftDocs/office-developer-exchange-docs/blob/main/docs/exchange-web-services/how-to-maintain-affinity-between-group-of-subscriptions-and-mailbox-server.md
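If you go the impersonation route, a minimal sketch with ews-java-api looks something like this (the mailbox address is a placeholder, and package paths can vary between versions):

```java
// Sketch: impersonate the target mailbox so subscription limits are charged
// per impersonated user rather than against the single service account.
import microsoft.exchange.webservices.data.core.ExchangeService;
import microsoft.exchange.webservices.data.core.enumeration.misc.ConnectingIdType;
import microsoft.exchange.webservices.data.misc.ImpersonatedUserId;

public class EwsImpersonation {
    public static void impersonate(ExchangeService service, String mailboxSmtpAddress) {
        service.setImpersonatedUserId(
                new ImpersonatedUserId(ConnectingIdType.SmtpAddress, mailboxSmtpAddress));
    }
}
```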

OkHttp websocket server shutdown detection

I am developing an Android client app in Java for a NodeMCU websocket server. I successfully created the client and connected through a websocket client service. I can detect a server failure/close when sending data, but I can't detect it at the time of failure: if the server is powered off, I can't know until some data is sent. How can I detect the server failure at the moment it happens? I am using the OkHttp 4.1.0 library. Can anyone help?
You can't. It's not possible; but there are workarounds, see below.
Why isn't it possible? Internally, the internet is packet switched, which means data is first gathered up into packets, and then these packets are sent.
Most of the stuff you do on the web feels like it is 'streams' instead (you send 1 character, and one character arrives on the other side). But that's all based on protocols that are built on top of the packet nature of the internet.
When you have an open connection between 2 computers via the internet, no data is actually being sent, at all. It's not like you have a line reserved. Old telephone networks did work like that: When you dialled somebody, you got a dedicated line, and once the line got interrupted, you'd hear beeps to indicate this.
That is not how the internet works. Those wires and everything in between have no idea that there is an open connection at all. That's just some bits in memory on your computer and on the server which lets them identify certain packets as part of the longer conversation those 2 machines were having, is all.
Thus we arrive at why this isn't possible: given that no packets flow at all until one side actually sends data to the other, it is impossible to tell the difference between 'no data being sent right now' and 'somebody tripped over the power cable in the server park'. That's why you don't get the information until you send something. (And the only reason you get it then is that the protocol dictates the server send back a confirmation of receiving what you sent. If that takes too long, your computer resends a few times in case the packet just got lost somewhere, eventually gives up, concludes that the server can no longer be reached, and only then do you get the IOException.)
Workarounds
A simple one is to upgrade your own protocol: dictate that the server or client (it doesn't matter who takes the responsibility) sends a do-nothing message at least once a minute. You can then conclude that the connection is probably dead after not receiving one for, say, 100 seconds: start a 100-second timer and reset it every time you receive any data whatsoever. If the timer ever runs out, the connection is likely dead.
A version of this idea is built into the protocol that makes connections feel like streams of data. That protocol is called TCP/IP, and the feature is called keepalive.
The problem is, you possibly don't get to dictate the TCP/IP settings for your websocket connection. If you can, you can turn on keepalive (for example, in Java you use Socket to make raw TCP/IP connections, and it has a setKeepAlive(true) method). Check the API to see whether you can get at the socket, or otherwise scan the docs for 'keepalive' and see if there's anything there.
I bet there won't be, which means you have to use the trick mentioned above: update your server code to send a 'hello!' 60 seconds after any conversation, and update your client code to give up on the connection once 100 seconds have passed (give it 40 additional seconds; sometimes the internet gets a little backed up or servers get a little busy).
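As it happens, OkHttp has this trick built in at the websocket level: configure a ping interval and the client sends websocket pings for you, failing the connection (your listener's onFailure is called) when a pong doesn't come back in time. A sketch, with a placeholder URL:

```java
// Sketch: OkHttp's built-in websocket pings give you a timely failure signal
// when the server goes away, instead of waiting until the next write.
import java.util.concurrent.TimeUnit;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;
import okhttp3.WebSocket;
import okhttp3.WebSocketListener;

public class WatchedSocket {
    public static WebSocket connect() {
        OkHttpClient client = new OkHttpClient.Builder()
                .pingInterval(30, TimeUnit.SECONDS) // ping every 30 seconds
                .build();
        Request request = new Request.Builder()
                .url("ws://192.168.4.1:81/") // placeholder NodeMCU address
                .build();
        return client.newWebSocket(request, new WebSocketListener() {
            @Override
            public void onFailure(WebSocket webSocket, Throwable t, Response response) {
                // Called when a ping goes unanswered or the TCP connection breaks.
                System.out.println("Server unreachable: " + t);
            }
        });
    }
}
```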

IllegalArgumentException in ByteBuffer during websocket send

I have a fairly complex websocket-based application running on an up-to-date Tomcat 8 server.
At fairly random intervals I get this exception, simultaneously for all connected users.
java.lang.IllegalArgumentException
at java.nio.Buffer.limit(Buffer.java:275)
at org.apache.tomcat.websocket.PerMessageDeflate.sendMessagePart(PerMessageDeflate.java:377)
at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendMessageBlock(WsRemoteEndpointImplBase.java:284)
at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendMessageBlock(WsRemoteEndpointImplBase.java:258)
at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendPartialBytes(WsRemoteEndpointImplBase.java:161)
at org.apache.tomcat.websocket.WsRemoteEndpointBasic.sendBinary(WsRemoteEndpointBasic.java:56)
at org.springframework.web.socket.adapter.standard.StandardWebSocketSession.sendBinaryMessage(StandardWebSocketSession.java:202)
at org.springframework.web.socket.adapter.AbstractWebSocketSession.sendMessage(AbstractWebSocketSession.java:105)
at org.infpls.royale.server.game.session.RoyaleSession.sendImmiediate(RoyaleSession.java:69)
at org.infpls.royale.server.game.session.SessionThread.run(SessionThread.java:46)
After this exception is thrown, the web socket is left in a PARTIAL_WRITING state and disconnects on the next write attempt.
I've seen it happen 15 minutes after starting Tomcat and I've seen it happen after idling on the server for 8 hours. I cannot find any correlation to what users are doing on the server and when this exception is thrown.
The problem seems to happen fairly deep in Spring/Java NIO code, and I am not sure how to debug it.
Any help would be greatly appreciated!
I took a shot in the dark after reading some loosely related issues from other people: instead of passing the same ByteBuffer off to the client threads to send, I made a full copy of the byte buffer for each thread that would send it.
I also switched from using websocket.sendBinary(ByteBuffer bb) to using websocket.sendBinary(byte[] bb) instead.
That seems to have done the trick, as I have not seen this bug happen again while running the server in production mode under pretty heavy load for 12 hours. If I find further information about this I will update it here.
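For anyone wanting to do the same, this is roughly the copy helper I mean (a sketch; the class name is mine). The point is that a ByteBuffer's position and limit are mutated by every send, so handing one instance to several threads is unsafe, while a full copy gives each thread its own independent state:

```java
// Sketch: deep-copy a ByteBuffer so each sender thread gets its own
// independent position/limit state.
import java.nio.ByteBuffer;

public final class Buffers {
    private Buffers() {}

    public static ByteBuffer deepCopy(ByteBuffer original) {
        ByteBuffer copy = ByteBuffer.allocate(original.remaining());
        copy.put(original.duplicate()); // duplicate() leaves the source's position untouched
        copy.flip();                    // make the copy ready for reading
        return copy;
    }
}
```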

Red5 crashes after a couple of seconds when using RTMPT

We've been having this problem for a long time and still cannot find out where the problem is. Our application uses RTMP for video streaming, and if the web client cannot connect it falls back to RTMPT (RTMP over HTTP). This causes the video to freeze after a couple of seconds of playback.
I have already found some forums where people seem to be having the same issue, but none of the proposed solutions worked. One suggestion was to turn off video recording, but that didn't help. I have also read that it seems to be a threading problem in Red5, but before hacking into Red5 I would like to know whether somebody has a patch or anything else that fixes this.
One thing more: we've been testing this on Macs, if that matters. Thank you very much in advance.
The very first thing you should look at is the Red5 error log.
Also, Red5 occasionally produces output that might not end up in the log but only on plain stdout.
There is a red5-debug.sh or red5-highperf.sh that outputs/logs everything to a file called std.out.
You should use those logs to start your analysis. You may well already see something in them, for example exceptions like:
broken pipe
connection closed due to too long xxx
handshake error
encoding issue in packet xyz
unexpected connection closed
call xyz cannot be handled
too many connections
heap space error
too many open files
Some of them are operating system specific, like for example the number of open files. Some are not.
Also it is very important that you are using the latest revision of Red5 and not an old version. You did not tell us what version you are using.
However, from symptoms like video freezes or occasional disconnects alone, you won't be able to start a real analysis of the problem.
Sebastian
Were you connected to the server when the video froze, or after that? I am not sure, but I think the connection closed, which caused the stream to freeze. Check the Red5 access logs for any 'idle' packets (possibly after 'send' packet(s), and more than one in number).
Another thing you could have a look at is your web server's log files, because RTMPT runs over HTTP. I once had a problem with my anti-DDoS program on the server: RTMPT makes many connections one after another, and these TCP connections remain alive for about 4 minutes by default. You can easily end up with hundreds of connections at the same time, which can be seen as a DDoS attack, with the result that the client's IP address gets banned.

Jetty interrupting connection

I have an application with a long-running request that returns a stream (a huge JSON document).
The application is written in Java and I'm using Jetty as the server.
The problem is that after receiving data for some time, the transfer stops. I ran some tests and sometimes I got 10%, 15%, 40%... it doesn't matter; Jetty interrupts the connection at some point. I even isolated one machine with no other requests, and the same thing happens.
I don't know how to debug this, because I don't see any error. The connection is simply interrupted.
Any help is appreciated.
What version of Jetty?
Is this on a slow connection, perhaps?
If so, you are likely encountering idle timeouts.
It can happen like this:
The server has a large amount of data to send; it sends until there is TCP backpressure from the client telling it "whoa!".
So the server waits until the TCP layer says it's OK to start sending again.
The client is slow.
This wait is longer than the configured idle timeout for that connector.
The server closes the connection.
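If that matches your case, raising the connector's idle timeout should make the interruptions disappear. A sketch for embedded Jetty; the 5-minute value is only an example, and if you configure Jetty through jetty.xml the equivalent knob is the connector's idleTimeout:

```java
// Sketch: raise the connector idle timeout so slow clients are not
// disconnected mid-stream while the server waits out TCP backpressure.
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;

public class JettyIdleTimeout {
    public static Server create() {
        Server server = new Server();
        ServerConnector connector = new ServerConnector(server);
        connector.setPort(8080);
        connector.setIdleTimeout(5 * 60 * 1000); // milliseconds; example value
        server.addConnector(connector);
        return server;
    }
}
```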
