The problem: I'm seeing packet loss inside my server. I say "inside" because I captured all the traffic with Wireshark and confirmed that the packets arrived at the server but never reached the channelRead0 method.
I built a SIP server using Netty. The system uses UDP to communicate with other SIP endpoints and works fine at low load.
My question is about design. Since SIP is a session protocol, I need to check which session each received packet belongs to. The heaviest work is surely on the synchronized list that holds all sessions (I know I need to optimize this in the future).
The whole system logic is inside the channelRead0 method, and this is probably the reason I'm losing some packets. The problem starts to happen at around 500 packets/sec.
There is no database connection (yet); the only I/O is writing logs to a file, which has almost no impact.
The question: How should I properly design this to handle 5000 packets/sec? Maybe put all packets in a synchronized queue and handle them later?
Thanks for all help
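One way to picture the "queue and handle later" idea from the question is the sketch below. It is only an illustration under assumptions: SessionDispatcher and its methods are hypothetical names, the session key is assumed to be something like the SIP Call-ID, and nothing here is Netty API. The receiving thread only picks a worker by hashing the session key and returns; each worker is single-threaded, so packets of one session stay in order, and the session table is a ConcurrentHashMap instead of a synchronized list.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical dispatcher; the names are illustrative and not part of Netty. */
class SessionDispatcher {
    private final ExecutorService[] workers;
    // Keyed lookup avoids scanning and locking one shared list for every packet.
    private final ConcurrentHashMap<String, AtomicInteger> packetsPerSession =
            new ConcurrentHashMap<>();

    SessionDispatcher(int workerCount) {
        workers = new ExecutorService[workerCount];
        for (int i = 0; i < workerCount; i++) {
            // Each worker has its own (unbounded) queue and a single thread.
            workers[i] = Executors.newSingleThreadExecutor();
        }
    }

    /** Meant to be called from the receiving thread; it only enqueues and returns. */
    void dispatch(String sessionKey, byte[] packet) {
        // The same key always maps to the same worker, which keeps per-session order.
        int index = Math.floorMod(sessionKey.hashCode(), workers.length);
        workers[index].execute(() -> handle(sessionKey, packet));
    }

    /** Placeholder for the real session logic; here it only counts packets. */
    private void handle(String sessionKey, byte[] packet) {
        packetsPerSession.computeIfAbsent(sessionKey, k -> new AtomicInteger())
                .incrementAndGet();
    }

    int packetCount(String sessionKey) {
        AtomicInteger count = packetsPerSession.get(sessionKey);
        return count == null ? 0 : count.get();
    }

    /** Stops accepting work and waits up to timeoutMillis per worker for queued packets. */
    boolean shutdownAndWait(long timeoutMillis) {
        for (ExecutorService worker : workers) {
            worker.shutdown();
        }
        try {
            for (ExecutorService worker : workers) {
                if (!worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    return false;
                }
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

In a Netty handler, channelRead0 would then only extract the key and call dispatch. Whether this removes the loss depends on where the packets are actually dropped, so it is worth measuring before and after.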
I have a server setup using MINA version 2.
I don't have much experience with sockets and TCP.
The problem is that if I make a connection to my server, then unplug my internet and close the connection (so the server doesn't get a notification of the connection being closed), the server will forever think that my connection is still active and valid.
The server will continue to send messages to my connection and doesn't throw any exceptions, even though there is nothing on my computer bound to the local port.
How can I test that the connection still exists?
I've tried running MINA logging in debug mode, and logging
IoSession.isConnected(), IoSession.isActive(), IoSession.isClosing()
They always return true, true, false. Also, in debug mode, there was no useful information stating that the connection was lost. It just logged the regular "sent message" stuff, as if there was nothing wrong.
From using Flash ActionScript, I have had experiences where Flash throws errors saying that it's operating on an invalid socket. That leads me to believe the socket on the server is no longer valid for the connection. In other words, if Flash can detect invalid sockets, a Java server should be able to detect them too, correct?
If there is truly no way to detect dead connections, I can always set up a keep-alive routine where the client constantly sends an "I'm here" message to the server, and the server closes sessions that haven't had an incoming message for a number of seconds.
EDIT: After learning that "sockets" are private and never shared over the network I managed to find better results for my issue and I found this SO thread.
Java socket API: How to tell if a connection has been closed?
IOException 'Connection reset by peer' doesn't occur when I write to the IoSession in MINA.
Is there any way at all in Java to detect when an ACK to a TCP packet was not received after sending a packet? An ACK Timeout?
Yet apparently my computer should send a RST to the server, according to this answer.
But that seems like a bad way of port scanning. Is this how port scanning works? Port scanners send data to a port and the victim's service responds with a RST? Sorry, I think I need a new question for all this. But it's odd that MINA doesn't throw 'Connection reset by peer' when it sends data. So my computer must not be sending a RST.
The concept of socket or connection in Internet protocols is an illusion. It's a convenient abstraction that is provided to you by the operating system and the TCP stack, but in reality, it's all fake.
Under the hood, everything on the Internet takes the form of individual packets.
From the perspective of a computer sending packets to another computer, there is no built-in way to know whether that computer is actually receiving the packets, unless that computer (or some other computer in between, like a router) tells you that the packets were, or were not, received.
From the perspective of a computer expecting to receive packets from another computer, there is no way to know in advance whether any packets are coming, will ever come, or in what order -- until they actually arrive. And once they arrive, just the fact that you received one packet does not mean you'll receive any more in the future.
That's why I say connections or sockets are an illusion. The way that the operating system determines whether a connection is "alive" or not, is simply by waiting an arbitrary amount of time. After that amount of time -- called a timeout -- if one side of the TCP connection doesn't hear back from the other side, it will just assume that the other end has been disconnected, and arbitrarily set the connection status to "closed", "dead" or "terminated" ("timed out").
Your server has no clue that you've pulled the plug on your Internet connection. It has no way of knowing that.
Your server's TCP stack has been configured a certain way to wait an arbitrary amount of time before "giving up" on the other end if no response is received. If this timeout is set to a very large period of time, it may appear to you that your server is hanging on to connections that are no longer valid. If this bothers you, you should look into ways to decrease the timeout interval.
Analogy: If you are on a phone call with someone, and there's a very real risk of them being hurt or killed, and you are talking to them and getting them to answer, and then the phone suddenly goes dead... Well, how long do you wait? At what point do you assume the other person has been hurt or killed? If you wait a couple of milliseconds, in most cases that's too short a "timeout", because the other person could just be listening and thinking of how to respond. If you wait for 50 years, the person might be long dead by then. So you have to set a reasonable timeout value that makes sense.
What you want is a KeepAlive, heartbeat, or ping.
As per #allquicatic's answer, there's no completely reliable built-in method to do this in TCP. You'll have to implement a method to explicitly ask the client "Are you still there?" and await an answer for a specified amount of time.
A keepalive (KA) is a message sent by one device to another to check that the link between the two is operating, or to prevent this link from being broken.
In computer science, a heartbeat is a periodic signal generated by hardware or software to indicate normal operation or to synchronize other parts of a system.[1] Usually a heartbeat is sent between machines at a regular interval in the order of seconds. If a heartbeat isn't received for a time—usually a few heartbeat intervals—the machine that should have sent the heartbeat is assumed to have failed.[2]
The easiest way to implement one is to periodically send an arbitrary piece of data, e.g. a null command. A properly programmed TCP stack will time out if an ACK is not received within its specified timeout period, and then you'll get an IOException such as 'Connection reset by peer'.
You may have to manually tune the TCP parameters, or implement your own functionality if you want more fine-grained control than the default timeout.
The TCP framework is not exposed to Java. And Java does not provide a means to edit TCP configuration that exists on the OS level.
This means we cannot use TCP keep alive in Java efficiently because we can't change its default configuration values. Furthermore we can't set the timeout for not receiving an ACK for a message sent. (Learn about TCP to discover that every message sent will wait for an ACK (acknowledgement) from the peer that the message has been successfully delivered.)
Java can only throw exceptions for cases such as a timeout for not completing the TCP handshake in a custom amount of time, a 'Connection Reset by Peer' exception when a RST is received from the peer, and an exception for an ACK timeout after whatever period of time that may be.
To dependably track connection status, you must implement your own Ping/Pong, Keep Alive, or Heartbeat system as #Dog suggested in his answer. (The server must poll the client to see if it's still there, or the client has to continuously let the server know it's still there.)
For example, configure your client to send a small packet every 10 seconds.
In MINA, you can set a session reader idle timeout, which will send an event when a session reader has been idle for a period of time. You can terminate that connection on delivery of this event. Setting the reader timeout to be a bit longer than the small packet interval will account for random high latency between the client and server. For example, a reader idle timeout of 15 seconds would be lenient in this case.
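The logic behind a reader idle timeout can be sketched without any framework. The class below is a hypothetical IdleTracker, not MINA code; in MINA itself the idle time is configured on the session config and delivered through the handler's sessionIdle callback, so treat this only as an illustration of the bookkeeping. Timestamps are passed in explicitly to keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper that mimics a reader idle timeout; not part of MINA. */
class IdleTracker {
    private final long readerIdleMillis;
    // Session id -> time (in milliseconds) of the last message read from that session.
    private final Map<String, Long> lastReadAt = new ConcurrentHashMap<>();

    IdleTracker(long readerIdleMillis) {
        this.readerIdleMillis = readerIdleMillis;
    }

    /** Record that a message (for example the 10-second ping) arrived. */
    void onMessage(String sessionId, long nowMillis) {
        lastReadAt.put(sessionId, nowMillis);
    }

    /** Returns and forgets every session that has been silent for too long. */
    List<String> expire(long nowMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastReadAt.entrySet()) {
            Long seen = entry.getValue();
            // Conditional remove: a session that just received a message is kept.
            if (nowMillis - seen > readerIdleMillis
                    && lastReadAt.remove(entry.getKey(), seen)) {
                expired.add(entry.getKey());
            }
        }
        return expired;
    }
}
```

With a 15-second limit and a client that pings every 10 seconds, a session is only expired after it misses a ping and the extra 5 seconds of slack have passed.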
If your server will rarely experience session idling, and you think you can save bandwidth by polling the client when the session has gone idle, look into using the Apache MINA Keep Alive Filter.
A short explanation of what I have.
I have a server and a client.
The client makes a GET request.
The stream of the GET request is used as a push stream.
The server pushes messages to the client via this stream in a single thread.
The problem is that when I don't send data for 30 seconds, the client seems to close the stream automatically.
I've already changed the timeout from 30 seconds to Long.MAX_VALUE with:
For now I've implemented a "heartbeat workaround" that pushes a simple String every 20 seconds to avoid the timeout.
I just want to know if this is the only way to do it, or if I need to change some settings I haven't found.
Thank you for every answer.
Seems you are doing reverse HTTP long-polling, which does require a "heart-beat" to avoid that streams or connections are closed by an idle timeout.
It is normally better to do regular HTTP long polling (i.e. the client sends the heart-beat), because it allows the server to detect disconnected clients much quicker.
However, you are better off using solutions like CometD if you want to perform server-push messaging.
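For reference, the 20-second heartbeat workaround described in the question could look roughly like the sketch below. The Heartbeat class and the push callback are hypothetical; the callback stands in for whatever code writes to the push stream, and the payload string is arbitrary.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical heartbeat sender; "push" stands in for the real stream writer. */
class Heartbeat implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final ScheduledFuture<?> task;

    Heartbeat(Consumer<String> push, long period, TimeUnit unit) {
        // The first beat fires after one full period, then repeats at a fixed rate.
        // If push throws, scheduleAtFixedRate suppresses all later runs.
        task = scheduler.scheduleAtFixedRate(
                () -> push.accept("heartbeat"), period, period, unit);
    }

    /** Stops the heartbeat and releases the scheduler thread. */
    @Override
    public void close() {
        task.cancel(false);
        scheduler.shutdownNow();
    }
}
```

Usage would be along the lines of new Heartbeat(stream::write, 20, TimeUnit.SECONDS), where stream::write is an assumed method. Because the heartbeat runs on its own thread, writes to the stream need to be synchronized with the thread that pushes the regular messages.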
I need to send a continuous flow of messages (simple TextMessages with a timestamp and x/y coordinates) over a wireless network from a moving computer. There will be a lot of these short messages (like 200 per sec) and unfortunately the network connection is most likely unreliable since the sending device will leave the WLAN area from time to time... When the connection is not available, all upcoming messages should be buffered until the connection is back up again. The order of the transmitted messages does not matter, since they contain a timestamp, but ALL messages must be transferred.
What would be a simple but reliable method for sending these telegrams? Would it be possible to just use a "plain" TCP or UDP socket connection? Would messages be buffered when the connection is temporarily down and sent afterwards automatically? Or is the connection loss directly detected and reported, so that I could buffer the messages and try to reconnect periodically on my own? Do libraries like Netty help here?
I also thought about using broker-to-broker communication (e.g. an ActiveMQ network of brokers) as an alternative. Would the overhead be too big here? Would you suggest another messaging middleware in this case?
TCP guarantees delivery (while it's connected, that is). You should check whether the connection went down and put messages in a queue while it retries the connection. Once it sees that the connection is back up, dump the queue into the TCP socket.
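A minimal sketch of that queue-while-disconnected idea follows. Transport and BufferingSender are hypothetical names; Transport stands in for whatever writes to the socket and is assumed to throw IOException when the link is down. Reconnecting is left out.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical abstraction over the socket; send is assumed to throw when the link is down. */
interface Transport {
    void send(String message) throws IOException;
}

/** Hypothetical sender that keeps messages until the transport accepts them. */
class BufferingSender {
    private final Deque<String> pending = new ArrayDeque<>();
    private final Transport transport;

    BufferingSender(Transport transport) {
        this.transport = transport;
    }

    /** Queue the message, then try to drain everything that is waiting. */
    synchronized void submit(String message) {
        pending.addLast(message);
        flush();
    }

    /** Sends queued messages oldest first; stops at the first failure. Returns the number sent. */
    synchronized int flush() {
        int sent = 0;
        while (!pending.isEmpty()) {
            try {
                transport.send(pending.peekFirst());
            } catch (IOException e) {
                break; // Link is down: keep the message and retry on the next flush.
            }
            pending.removeFirst();
            sent++;
        }
        return sent;
    }

    synchronized int pendingCount() {
        return pending.size();
    }
}
```

One caveat: a successful write to a TCP socket only means the data reached the local send buffer. If every message really must arrive, the receiver should acknowledge messages at the application level, and only acknowledged messages should be dropped from the queue.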
Also look into TCP keepalive for recognizing a down connection.
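In Java, TCP keepalive is switched on per socket with Socket.setKeepAlive. The helper class below is a hypothetical wrapper around that single call. How soon and how often probes are sent is decided by operating system settings, and the defaults are often measured in hours, so this alone may react too slowly for a device that keeps leaving the WLAN.

```java
import java.net.Socket;
import java.net.SocketException;

/** Hypothetical helper; it only wraps the standard Socket.setKeepAlive call. */
class KeepAliveConfig {
    /** Turns on SO_KEEPALIVE; probe timing is left to the operating system. */
    static Socket enableKeepAlive(Socket socket) throws SocketException {
        socket.setKeepAlive(true);
        return socket;
    }
}
```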
It seems you could use a message wrapper like Java JMS with an "assured persistent" reliability mode. I have not done this myself in the context of text messages, but this idea may lead you to the right answer. Also, there may be an Apache library already written that handles what you need, such as Qpid.
We are running a high-throughput system that uses Tibco EMS JMS to pass large numbers of messages between our main server and our client connections. We've gathered some statistics and determined that JMS is causing a lot of latency. How can we make Tibco JMS more performant? Are there any resources that give a good discussion of this topic?
Using non-persistent messages is one option if you don't need persistence.
Note that even if you do need persistence, sometimes it's better to use non-persistent messages and, in case of a crash, perform a different recovery action (like resending all messages).
This is relevant if:
crashes are rare (as the recovery takes time)
you can easily detect a crash
you can handle duplicate messages (you may not know exactly which messages were delivered before the crash)
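Handling duplicates on the consumer side usually means remembering which message IDs were already processed. The sketch below is one simple way to do that, assuming every message carries a unique ID; DuplicateFilter is a hypothetical class and not part of JMS or EMS. It remembers only the most recent IDs so that memory stays bounded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded memory of recently seen message IDs; not part of JMS or EMS. */
class DuplicateFilter {
    private final Map<String, Boolean> seen;

    DuplicateFilter(final int capacity) {
        // Insertion-ordered map that evicts the oldest ID once capacity is exceeded.
        this.seen = new LinkedHashMap<String, Boolean>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns true the first time an ID is seen, false while it is still remembered. */
    synchronized boolean firstTime(String messageId) {
        return seen.put(messageId, Boolean.TRUE) == null;
    }
}
```

The capacity has to cover at least the number of messages that could be resent after a crash; an ID that has already been evicted will be treated as new.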
EMS also provides some mechanisms that are persistent but less bulletproof than classic guaranteed delivery.
These include:
instead of "exactly once" message delivery you can use "at least once" or "at most once" delivery.
you may use the pre-fetch mechanism, which causes the client to fetch messages into memory before your application requests them.
EMS should not be the bottleneck. I've done testing and we have gotten a shitload of throughput on our server.
You need to determine where the bottleneck is. Is the problem in the producer of the message or the consumer? Are messages piling up on the queue?
What type of scenario are you running?
Pub/sub or request/reply?
Are temporary queues piling up? Too many temporary queues can cause performance issues (mostly when they linger because you didn't close something properly).
Are you publishing to a topic with durable subscribers? If so, try bridging the topic to queues and reading from those. Durable subscribers can cause a little hiccup in performance too, since EMS needs to track who has copies of all messages.
Ensure that your sending process has one session and multiple calls through that session. Don't open a complete session for each operation. Re-use where possible. Do the same for the consumer.
Make sure you CLOSE when you are done. EMS doesn't clean things up, so if you make a connection and just close your app, the connection is still there, sucking up resources.
Review your tolerance for lost messages in the event of a crash. If you are doing client ack and it doesn't matter if you crash while processing the message, then switch to auto. Also, I believe that if you are using TEMS (Tibco EMS for WCF) there's a problem with the session acknowledge. So a message is only acknowledged once the whole message has been processed; we switched from client ack to the dups-OK mode and it worked better.