Need to reassemble tcp packets with netty

Need to reassemble tcp packets with netty - java

currently I am working with an inhouse protocol where I send a request to our hardware and receive the answer with netty. In the message which I receive are several bytes which tell me how many bytes the answer will contain.
In my channelRead method I wait until the readable bytes of the recieved message are equal or greater than the expected bytes to make sure I get all data.
if (((ByteBuf) msg).readableBytes() >= dataSize) {
//do something with the bytes
ctx.close();
((ByteBuf) msg).release();
}
This works fine if I receive exactly one tcp package from the hardware. Sometimes the hardware splits the TCP frame into several packages and my channelRead waits for ever.
Is there a simple way in netty to reassemble these packets in the channelRead method?

Just extend ByteToMessageDecoder. This will handle all the buffering for you. Check the javadocs for more details and an example.

Related

How does Java NIO break up messages?

I'm writing a toy Java NIO server paired with a normal Java client. The client sends a string message to the server using plain Socket. The server receives the message and dumps the content to terminal.
I've noticed that the same message from client is broken up into bytebuffers differently every single time. I understand this is intended behaviour of NIO, but would like to find out roughly how the NIO decides to chop up a message?
Example: Sending string "this is a test message" to server. The following are excerpts of server loggings (each line represents 1 bytebuffer received).
Run 1:
Server receiving: this is a test message
Run 2:
Server receiving: t
Server receiving: his is a test message
Run 3:
Server receiving: this is
Server receiving: a test message
UPDATE - Issue Resolved
I have installed Wireshark to analyse the packets and it has become apparent that the random "break up" was due to me using DataOutputStream for the writer, which sends the message character by character! So there was a packet for each character...
After changing the writer to BufferedWriter, my short message is now sent as a single packet, as expected. So the truth is Java NIO actually did the clever thing and merged my tiny packets to 1 to 2 bytebuffers!
UPDATE2 - Clarification
Thank you all for your replies. Thank you #StephenC for pointing out that unless I encode the message myself(yes, I did call flush() after writing to BufferedWriter), there's always the possiblity of my message arriving across multiple packets.
So the truth is Java NIO actually did the clever thing and merged my tiny
Actually, no. The merging is happening in the BufferedWriter layer. The buffered writer will only deliver a "bunch" of bytes to the NIO layer when either the application flushes or closes the DataOutputStream or the BufferdWriters buffer fills up.
I was in fact referring to my first attempt with DataOutputStream (I got it from an example online, which obviously is incorrect use of the class now that you've pointed it out). BufferedWriter was not involved. My simple writer in that case went like
DataOutputStream out = new DataOutputStream(socket.getOutputStream());
out.writeBytes("this is a test message");
Wireshark confirmed that this message was sent(server on localhost) 1 character a packet(22 packets in total for the actual message not including all the ACK and etc).
I'm probably wrong, but this behaviour seems to suggest that the NIO server combined these 22 packets into 1-2 bytebuffers?
The end game I'm trying to achieve here is a simple Java NIO server capable of receiving request and data stream using TCP from various clients, some may be written in C++ or C# by third party. It's not time critical so the clients can send all data in one go and the server can process them at its own pace. That's why I've written a toy client in Java using plain Socket rather than a NIO client. Therefore the client in this case can't really manipulate the ByteBuffer directly, so I probably need some sort of message format. Could I make this work?

If you are sending data over a TCP/IP socket, then there are no "messages" as such. What you send and receive is a stream of bytes.
If you are asking if you can send a chunk of N bytes, and have the receiver get exactly N bytes in a single read call, then the answer is that there is no guarantee that will happen. However, it is the TCP/IP stack that is "breaking up" the "messages". Not NIO. Not Java.
Data sent over a TCP/IP connection is ultimately broken into network packets for transmission. This typically erases any "message" structure based on the original write request sizes.
If you want a reliable message structure over the top of the TCP/IP byte stream, you need to encode it in the stream itself; e.g. using an "end-of-message" marker or prefixing each message with a byte count. (If you want to use fancy words, you need to implement a "message protocol" over the top of the TCP/IP stream.)
Concerning your update, I think there are still some misconceptions:
... it became apparent that the random "break up" was due to me using DataOutputStream for the writer, which sends the message character by character! So there was a packet for each character...
Yes, lots of small writes to a socket stream may result in severe fragmentation at the network level. However, it won't always. If there is sufficient "back pressure" due to either network bandwidth constraints or the receiver reading slowly, then this will lead to larger packets.
After changing the writer to BufferedWriter, my short message is now sent as a single packet, as expected.
Yes. Adding buffering to the stack is good. However, you are probably doing something else; e.g. calling flush() after each message. If you didn't then I would expect a network packet to contain a sequence of messages and partial messages.
What is more, if the messages are too large to fit into a single network packet, or if there is severe back-pressure (see above) then you are liable to get multiple / partial messages in a packet anyway. Either way, the receiver should not rely on getting one (whole) message each time it reads.
In short, you may not have really resolved your issue!!
So the truth is Java NIO actually did the clever thing and merged my tiny
Actually, no. The merging is happening in the BufferedWriter layer. The buffered writer will only deliver a "bunch" of bytes to the NIO layer when either the application flushes or closes the DataOutputStream or the BufferdWriters buffer fills up.
FWIW - given your description of what you are doing, it is unlikely using NIO is helping performance. If you wanted to maximize performance, you should stop using BufferedWriter and DataOutputStream. Instead do your message encoding "by hand", putting the bytes or characters directly into the ByteBuffer or CharBuffer.
(Also DataOutputStream is for binary data, not text. Putting one in front of a Writer doesn't seem right ... if that is what you are really doing.)

Datagramsocket: how receive(...) handles fragmentation of a packet

I came to know from my Professor that, a datagram packet sent using UDP socket gets fragmented in the lower layers and may arrive as multiple packets at the receiver end. For e.g, if I send a 1000 bytes data in a datagram packet, at the receiving end it might arrive as, say 2 bytes, 500 bytes, 12 bytes, and so on. Therefore, he suggested to do multiple receive(...) to receive the entire 1000 byte packet sent by the sender.
Later when I went through the Java documentation for datagram socket receive(...) and there is a line that reads as follows: "This method blocks until a datagram is received. ..." Does it mean that entire datagram packet is received and don't need to do multiple receive (even though it's the case in theory) when we use Java?
Pls. clarify. If multiple receive(...) for each packet is the only option to get around this problem, pls. give suggestions on how to do this.

Any call to receive() will give you an entire packet - the fragment handling happens in two layers below the socket. The fragmentation and defragmentation happens in the Network/Internet layer (IP), so the socket will never see the fragments but only receive entire and full UDP/TCP packets (only full packets gets sent to the listening port).
So, no, you do not need multiple receive() to get a single packet, but you should be aware that UDP is not reliable so if one fragment gets lost in the Network layer (and in some cases if it arrives out of order), you won't be able to get the packet.
You might also want to check the methods getReceiveBufferSize() and setReceiveBufferSize() if you're having trouble receiving packets - if the buffer size is smaller than the packet size, it's not guaranteed that you can receive the packet.

How to detect retransmited packets to discard them in TCP Sockets in java

I sometimes receive already received packets (I used sniffer and system ACKs them). Now I read all data (until socket timeout) and then send new request, but this is ugly. I was thinking about using sequence numbers but i didn't find it in Socket interface. Any clues?

No you don't. If the receiving TCP stack misses a packet, it will re-request it, but it can't have delivered the original one to you, because it missed it. And if it gets a packet it has already received, it will drop it.
TCP will deliver all the bytes that are sent, in the order they are sent. Nothing else (well, except some edge cases around disconnects).
Something else is going on.
EDIT:
To be clear, I'm talking about the bytes that are delivered to your application through the socket's InputStream. What happens on the wire is largely irrelevant unless you have some horrific network retransmission problem that you're trying to investigate. And if the receiving stack does get a duplicate packet, it will ACK it, because if it didn't then the sender would re-send it... again.

It sounds like you're trying to account for things that TCP already takes care of. It has sequence numbers built in and will take care of any lost data for you, and from the receiving point you should be waiting until you receive all your expected data, rather than reissuing a request. If you don't want to wait for a response to complete before issuing a new request, consider pipe-lining requests with multiple connections.

"Lost" UDP packets (JBoss + DatagramSocket)

I develop part of some JBoss+EJB based enterprise application. My module needs to process huge amount of incoming UDP packets. I've done some load testing and it looks that in case of sending packets with 11ms interval everything is fine, but in case of 10ms interval some packets are lost. It's rather strange in my opinion, but I done 10/11ms interval load tests comparison several times and it is always the same result (10 ms - some "lost" packets, 11ms - everything's fine).
If it was something wrong with synchronization, I'd expect that it will also be visible in case of 11ms tests (at least one packet lost, or at least one wrong counter value).
So if it is not because of synchronization, then maybe DatagramSocket through which I receive packets doesn't work as expected.
I found that receive buffer size (SO_RCVBUF) has default 57344 value (probably it's underlying IO network buffers dependent). I suspect, that maybe when this buffer goes full, then new incoming UDP datagrams are rejected. I tried set this value to some higher, but I noticed that if I exaggerate, buffer returns to its default size. If it's underlying layer dependent how can I find out maximum buffer size for certain OS/network card from JBoss level?
Is it possible that it is caused by receive buffer size, or maybe 57344 value is big enough to handle most cases? Do you have any experience with such issues?
There is no timeout set on my DatagramSocket. My UDP datagrams contains about 70 bytes of data (value without datagram header included).
[Edited]
I have to use UDP because I receive Cisco Netflow data - it is protocol used by network devices to send some traffic statistics. Also, I have no influence on sent bytes format (e.g. I cannot add counters for packets and so on). It is not expected that all packets will be processed (some datagrams may be lost), but I'd expect that I will process most of packets. During 10ms interval tests, about 30% of packets were lost.
It is not very possible that slow processing causes this issue. Currently singleton component holds reference to DatagramSocket calling receive method in a loop. When packet is received, it is passed to the queue, and processed by picked from pool stateless component. "Facade" Singleton is only responsible for receiving packets and passing it on to the processing (it does not wait for processing complete event).
Thanks in advance,
Piotr

UDP does not guarantee delivery, so you can tweak parameters, but you can't guarantee that the message will get delivered, especially in the case of very large data transfers.
If you need to guarantee delivery, you should use TCP instead.
If you need (or want) to use UDP, you can encode each packet with a number, and also send the number of packets expected. For example, if you sent 10 large packets, you could include the information: packet 1/10, packet 2/10, etc. This way you can at least tell if you have not received all of the packets. If you have not received them, you could send a request to resend those missing packets.

UDP is inherently unreliable.
Datagrams can be thrown away at any point between sender and receiver, even within the receiver at a level below your code. Setting the recv buffer to a larger size is likely to help the networking code within your machine buffer more datagrams but you should expect that some datagrams will be lost anyway.
If your recv logic takes too long (i.e. longer than it takes for a new datagram to arrive) then you'll always be behind and you'll always miss datagrams eventually. All you can do is make sure that your recv code runs as fast as possible, perhaps move the inbound datagram to a queue and process it 'later' or on another thread but then that will just move your problem to being one where you have a queue that keeps growing.
[Re your edit...] And what's processing your queue and how does the locking work between the producer and the consumers? Change your code so that the recv logic simply increments a count and discards the data and loops back around and see if you're losing fewer datagrams; either way, UDP is unreliable, you WILL have datagrams that are discarded and you should just expect that and deal with it. Worrying about it means you're focusing on the wrong problem; make use of the data you DO get and assume that you wont get much of it and then your program will work even if the network gets congested and MOST of your datagrams get discarded.
In summary, that's just how is it with UDP.

It appears in your tests that only up to two packets can be in the buffer so if each packet is less than 28KB this should be fine.
As you know UDP is lossy, but you should be able to send more than one packet per 10 ms. I suggest you write a simple receiver which just listens to packets just to determine if its your application or something at the network/OS level. (I suspect the later)

I don't know Java but ... does the API allow you to invoke an asynch listen/receive for a datagram:
Use O/S API to do a receive (passing your application-level buffer as a paremeter)
(Wait while there's nothing to receive...)
(O/S receives something from the network...)
O/S puts the received packet into the buffer and completes/returns your API call
If that's true then I suggest that you do several concurrent instances of the API call, so that there are several concurrent application-level buffers into which multiple packets can be received.

What is a TCP window update?

I'm making my own custom server software for a game in Java (the game and original server software were written with Java). There isn't any protocol documentation available, so I am having to read the packets with Wireshark.
While a client is connecting the server sends it the level file in Gzip format. At about 94 packets into sending the level, my server crashes the client with an ArrayIndexOutOfBoundsException. According to the capture file from the original server, it sends a TCP Window Update at about that point. What is a TCP Window Update, and how would I send one using a SocketChannel?

TCP windows are used for flow control between the peers on a connection. With each ACK packet, a host will send a "window size" field. This field says how many bytes of data that host can receive before it's full. The sender is not supposed to send more than that amount of data.
The window might get full if the client isn't receiving data fast enough. In other words, the TCP buffers can fill up while the application is off doing something other than reading from its socket. When that happens, the client would send an ACK packet with the "window full" bit set. At that point, the server is supposed to stop sending data. Any packets sent to a machine with a full window will not be acknowledged. (This will cause a badly behaved sender to retransmit. A well-behaved sender will just buffer the outgoing data. If the buffer on the sending side fills up too, then the sending app will block when it tries to write more data to the socket!)
This is a TCP stall. It can happen for a lot of reasons, but ultimately it just means the sender is transmitting faster than the receiver is reading.
Once the app on the receiving end gets back around to reading from the socket, it will drain some of the buffered data, which frees up some space. The receiver will then send a "window update" packet to tell the sender how much data it can transmit. The sender starts transmitting its buffered data and traffic should flow normally.
Of course, you can get repeated stalls if the receiver is consistently slow.
I've worded this as if the sender and receiver are different, but in reality, both peers are exchanging window updates with every ACK packet, and either side can have its window fill up.
The overall message is that you don't need to send window update packets directly. It would actually be a bad idea to spoof one up.
Regarding the exception you're seeing... it's not likely to be either caused or prevented by the window update packet. However, if the client is not reading fast enough, you might be losing data. In your server, you should check the return value from your Socket.write() calls. It could be less than the number of bytes you're trying to write. This happens if the sender's transmit buffer gets full, which can happen during a TCP stall. You might be losing bytes.
For example, if you're trying to write 8192 bytes with each call to write, but one of the calls returns 5691, then you need to send the remaining 2501 bytes on the next call. Otherwise, the client won't see the remainder of that 8K block and your file will be shorter on the client side than on the server side.

This happens really deep in the TCP/IP stack; in your application (server and client) you don't have to worry about TCP windows. The error must be something else.

TCP WindowUpdate - This indicates that the segment was a pure WindowUpdate segment. A WindowUpdate occurs when the application on the receiving side has consumed already received data from the RX buffer causing the TCP layer to send a WindowUpdate to the other side to indicate that there is now more space available in the buffer. Typically seen after a TCP ZeroWindow condition has occurred. Once the application on the receiver retrieves data from the TCP buffer, thereby freeing up space, the receiver should notify the sender that the TCP ZeroWindow condition no longer exists by sending a TCP WindowUpdate that advertises the current window size.
https://wiki.wireshark.org/TCP_Analyze_Sequence_Numbers

A TCP Window Update has to do with communicating the available buffer size between the sender and the receiver. An ArrayIndexOutOfBoundsException is not the likely cause of this. Most likely is that the code is expecting some kind of data that it is not getting (quite possibly well before this point that it is only now referencing). Without seeing the code and the stack trace, it is really hard to say anything more.

You can dive into this web site http://www.tcpipguide.com/free/index.htm for lots of information on TCP/IP.

Do you get any details with the exception?
It is not likely related to the TCP Window Update packet
(have you seen it repeat exactly for multiple instances?)
More likely related to your processing code that works on the received data.

This is normally just a trigger, not the cause of your problem.
For example, if you use NIO selector, a window update may trigger the wake up of a writing channel. That in turn triggers the faulty logic in your code.
Get a stacktrace and it will show you the root cause.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.