Preventing split packets over TCP - Java

I am writing a program that transfers files over the network using TCP sockets.
Now I noticed that when I send a packet of, for example, 1024 bytes, I get it split up on the other side.
By "split" I mean I get some packets as if they were a part of a whole packet.
I tried reducing the packet size, and the algorithm only worked when the packet size was very small (about 30 bytes per packet), so the file transferred very slowly.
Is there anything I can do in order to prevent the splitting?
SOLVED: I switched the connection to UDP, and since UDP preserves packet boundaries it worked.

There is no such thing in TCP. TCP is a stream: what you write is what you get at the other end, but not necessarily in the chunks it was written in. TCP may split or coalesce your writes in order to do the job as efficiently as possible. You can send an 8 megabyte buffer in one write and TCP may break it into 10, 100 or 1000 segments; what you do know is that the other end will receive exactly 8 megabytes, no more, no less. To do a file transfer effectively you need to tell the receiver how many bytes you are going to send. The receiver may read it in one chunk or in 100 chunks, but it must keep track of the data it has read and how many bytes are still to come.
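For example, here is a minimal sketch of that approach (the class and method names are illustrative, not from the question's code): the sender announces the file length up front, and the receiver keeps reading until it has consumed exactly that many bytes, however many reads that takes.

```java
import java.io.*;
import java.net.Socket;

public class FileTransfer {
    // Sender: announce the length, then stream the bytes.
    static void sendFile(Socket socket, File file) throws IOException {
        DataOutputStream out = new DataOutputStream(socket.getOutputStream());
        out.writeLong(file.length());               // tell the receiver how much to expect
        try (InputStream in = new FileInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);               // TCP may split or coalesce this however it likes
            }
        }
        out.flush();
    }

    // Receiver: keep reading until exactly 'length' bytes have arrived.
    static void receiveFile(Socket socket, File target) throws IOException {
        DataInputStream in = new DataInputStream(socket.getInputStream());
        long remaining = in.readLong();
        try (OutputStream out = new FileOutputStream(target)) {
            byte[] buf = new byte[8192];
            while (remaining > 0) {
                int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
                if (n == -1) throw new EOFException("connection closed early");
                out.write(buf, 0, n);
                remaining -= n;
            }
        }
    }
}
```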

Because TCP is stream oriented, it does not carry 'packet boundary' information the way UDP and SCTP do.
So you must add packet-boundary information to the TCP payload yourself, if it is not there already. There are several ways to do it:
You can use a length field indicating how many bytes the following packet contains.
Or you can use a reserved delimiter symbol to separate packets.
Either way, the receiver must keep reading from the TCP input stream until a complete packet has been received; a sketch of the length-field approach is below.
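As a minimal sketch of the length-field approach (names here are illustrative): each message is preceded by a 4-byte length, and the receiver uses readFully() to block until the whole message has arrived, looping over however many TCP reads that takes internally.

```java
import java.io.*;

// Length-prefixed framing over a TCP stream.
class Framing {
    static void writeMessage(DataOutputStream out, byte[] message) throws IOException {
        out.writeInt(message.length);   // 4-byte length prefix
        out.write(message);             // payload; TCP may deliver it in any number of pieces
        out.flush();
    }

    static byte[] readMessage(DataInputStream in) throws IOException {
        int length = in.readInt();      // read the prefix first
        byte[] message = new byte[length];
        in.readFully(message);          // keeps reading until 'length' bytes have arrived
        return message;
    }
}
```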

You can control the TCP maximum segment size in some socket implementations. If you set it low enough, you can make the segment fit inside a single packet. The BSD Sockets API, which influenced almost every other implementation, has a setsockopt() function that lets you set various options on the socket. One of them, TCP_MAXSEG, controls the maximum segment size.
Unfortunately for you, the standard Java Socket class doesn't support this particular option.

Related

Sending a buffer of 10 MB through socket - Chunks or Whole 10MB?

I am converting the details that have to be sent from my C++ function to Java into strings and a char*, which will be sent through a socket.
My buffer size is 10 MB. Can I send the 10MB in one shot or should I split and send as chunks of smaller memory?
What is the difference between those two approaches? If I should send as smaller memory what should be the chunk size?
Can I send the 10MB in one shot
Yes.
or should I split and send as chunks of smaller memory?
No.
What is the difference between those two approaches?
The difference is that in case 1 you are letting TCP make all the decisions it is good at, with all the extra knowledge it has that you don't have, about the path MTU, the RTT, the receive window at the peer, ... whereas in case 2 you're trying to do TCP's job for it. Keeping a dog and barking yourself.
If I should send as smaller memory what should be the chunk size?
As big as possible.
When you call the write() function, you provide a buffer and the number of bytes you want to write. However, it is not guaranteed that the OS will send/write all the bytes you asked for in a single shot. (In the case of blocking sockets, the write() call blocks until it has copied the entire chunk into the TCP buffer. In the case of non-blocking sockets, write() does not block and writes just the bytes it is able to at that moment.)
The TCP/IP stack runs in the OS, and each OS has its own implementation of the stack. This stack determines the buffer sizes, and TCP/IP itself takes care of all the low-level details such as the MSS and the available receiver window size, which let TCP run its flow-control and congestion-control algorithms.
Therefore it is best to let TCP decide how it wants to send your data. Instead of breaking the data into chunks yourself, let the TCP stack do it for you.
Just be careful to always check the number of bytes actually written, which is returned by the write() call.
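A sketch of that check in Java, using a non-blocking SocketChannel (the channel setup is assumed): write() may accept only part of the buffer, so the remainder has to be retried.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

class PartialWrites {
    // Keep calling write() until the whole buffer has been accepted by the TCP stack.
    static void writeFully(SocketChannel channel, ByteBuffer buffer) throws IOException {
        while (buffer.hasRemaining()) {
            int written = channel.write(buffer);   // may write fewer bytes than remaining
            if (written == 0) {
                // Non-blocking channel with a full send buffer; real code would wait
                // on a Selector for OP_WRITE instead of spinning like this.
                Thread.yield();
            }
        }
    }
}
```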

Buffer and packets

I'm using an event which is activated when X bytes have arrived in the buffer. These are the typical buffer(), available() and read() serial port methods. My question is: when you send a packet via wireless (or whatever medium), can you expect the whole packet to arrive in one go, or do the bytes arrive sequentially through the buffer, gradually forming the packet? I don't know whether I need to use buffer() with the total packet length in mind, or with just the bytes that have arrived so far.
My guess is that the firmware first uses a checksum to ensure that the packet arrived completely and then moves it to the buffer. Is that right?
Serial ports and TCP connections are byte streams. There are no message-boundaries larger than one byte. You cannot transfer messages any larger than one byte without another protocol on top.

Dimension of packets sent through a socket

I have implemented a system similar to BitTorrent, and I would like to know what size I should make the packets for each chunk. I was not able to find out what packet size BitTorrent uses. I currently use 100 kilobyte packets; is that a lot?
TCP breaks data into packets automatically. You don't have to worry about the size of network packets.
The size of a TCP packet is constrained by the MTU (maximum transmission unit) of the network, typically around 1500 bytes. If you were making a game or a multimedia program where low latency is important, you might have to keep in mind that data is sent in packets, but for a file transfer program it doesn't matter.
There is no such thing as a TCP packet. It's a byte stream. Under the hood it is broken into segments, in a way that is entirely out of your control, and further under the hood those segments are wrapped in IP packets, ditto.
Just write as much as you like in each write, the more the better.

Read most recent UDP packet by disabling SO_RCVBUF in Java DatagramSocket?

I need to read the most recent incoming UDP packet, regardless of dropped packets in between reads. Incoming packets are coming in 3x faster than the maximum application processing speed. In an attempt to achieve this, I used setReceiveBufferSize(int size) of Java's DatagramSocket class to set the SO_RCVBUF to be the same size as my expected packet in bytes.
However, there is still a three packet delay before I get the most recent packet (and if the incoming rate is 10x the receive rate, there is a 10 packet delay). This suggests that SO_RCVBUF contains more than just the newest packet.
First, are the units of setReceiveBufferSize(int size) in bytes? It is not explicitly stated in the javadocs. Second, is there a way to disable SO_RCVBUF so that I only receive the most recent incoming packet? For example, zero is an illegal argument to the function, but I could theoretically set the receive buffer size to one.
This looks like an unusual problem ;)
I would recommend splitting your application into separate threads:
a receiver thread (minimal work, no parsing etc.), which handles the incoming packets and puts the last one read into a shared, asynchronously updated slot;
a processing thread (from what you wrote, this looks like it takes a long time), which reads the object from that shared slot and processes it, ignoring any packets it missed in between. A sketch of this is below.
If you need to hack things like SO_RCVBUF, I think you should step a bit closer to the I/O subsystem with C/C++.
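A minimal sketch of that two-thread layout (class and field names are illustrative): the receiver overwrites a single shared slot with the newest datagram, so the processor always works on the most recent one and older packets are simply dropped at the application level.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.util.concurrent.atomic.AtomicReference;

class LatestPacketReceiver {
    private final AtomicReference<byte[]> latest = new AtomicReference<>();

    // Receiver thread: read as fast as possible, keep only the newest packet.
    void startReceiver(DatagramSocket socket, int maxPacketSize) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[maxPacketSize];
            DatagramPacket packet = new DatagramPacket(buf, buf.length);
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    packet.setLength(buf.length);        // reset length before reusing the packet
                    socket.receive(packet);
                    byte[] copy = new byte[packet.getLength()];
                    System.arraycopy(packet.getData(), 0, copy, 0, packet.getLength());
                    latest.set(copy);                    // overwrite whatever the processor hasn't consumed yet
                }
            } catch (Exception e) {
                // socket closed or interrupted; let the thread exit
            }
        });
        t.setDaemon(true);
        t.start();
    }

    // Processing thread: grab the newest packet (or null if nothing new) and work on it.
    byte[] takeLatest() {
        return latest.getAndSet(null);
    }
}
```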
You've done exactly the wrong thing. Set the receive buffer as large as possible. 512k for example. Setting it low only increases the probability of dropped packets. And either speed up the receiving code or slow down the sending code. There's no point in sending packets that can't be received.

Java dropping half of UDP packets

I have a simple client/server setup. The server is in C and the client that is querying the server is Java.
My problem is that, when I send bandwidth-intensive data over the connection, such as video frames, it drops up to half the packets. I make sure that I properly fragment the UDP packets on the server side (UDP has a max payload length of 2^16 bytes). I verified that the server is sending the packets (by printing the result of sendto()). But Java doesn't seem to be getting half the data.
Furthermore, when I switch to TCP, all the video frames get through but the latency starts to build up, adding several seconds delay after a few seconds of runtime.
Is there anything obvious that I'm missing? I just can't seem to figure this out.
Get a network tool like Wireshark so you can see what is happening on the wire.
UDP makes no retransmission attempts, so if a packet is dropped somewhere, it is up to the program to deal with the loss. TCP will work hard to deliver all packets to the program in order, discarding dups and requesting lost packets on its own. If you are seeing high latency, I'd bet you'll see a lot of packet loss with TCP as well, which will show up as retransmissions from the server. If you don't see TCP retransmissions, perhaps the client isn't handling the data fast enough to keep up.
Any UDP based application protocol will inevitably be susceptible to packet loss, reordering and (in some circumstances) duplicates. The "U" in UDP could stand for "Unreliable", as in Unreliable Datagram Protocol. (OK, it really stands for "User" ... but it is certainly a good way to remember UDP's characteristics.)
UDP packet losses typically occur because your traffic is exceeding the buffering capacity of one or more of the "hops" between the server and client. When this happens, packets are dropped ... and since you are using UDP, there is no transport protocol-level notification that this is occurring.
If you use UDP in an application, the application needs to take account of UDP's unreliable nature, implementing its own mechanisms for dealing with dropped and out-of-order packets and for doing its own flow control. (An application that blasts out UDP packets with no thought to the effect that this may have on an already overloaded network is a bad network citizen.)
(In the TCP case, packets are probably being dropped as well, but TCP is detecting and resending the dropped packets, and the TCP flow control mechanism is kicking in to slow down the rate of data transmission. The net result is "latency".)
EDIT - based on the OP's comment, the cause of his problem was that the client was not "listening" for a period, causing the packets to (presumably) be dropped by the client's OS. The way to address this is to:
use a dedicated Java thread that just reads the packets and queues them for processing, and
increase the size of the kernel packet queue for the socket.
But even when you take these measures you can still get packets dropped. For example, if the machine is overloaded, the application may not get execution time-slices frequently enough to read and queue all packets before the kernel has to drop them.
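A sketch of those two measures (the 4 MB buffer size is an arbitrary choice, and the OS may grant less): a large SO_RCVBUF gives the kernel room to hold packets while the application catches up, and a dedicated reader thread drains the socket into a queue for a separate processing thread.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class QueueingReceiver {
    private final BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(1024);

    void start(int port) throws Exception {
        DatagramSocket socket = new DatagramSocket(port);
        socket.setReceiveBufferSize(4 * 1024 * 1024);   // ask for a large kernel buffer (OS may cap it)

        Thread reader = new Thread(() -> {
            byte[] buf = new byte[65535];
            DatagramPacket packet = new DatagramPacket(buf, buf.length);
            try {
                while (true) {
                    packet.setLength(buf.length);        // reset length before reusing the packet
                    socket.receive(packet);              // do nothing here except read and queue
                    byte[] copy = new byte[packet.getLength()];
                    System.arraycopy(packet.getData(), 0, copy, 0, packet.getLength());
                    queue.offer(copy);                   // drop the packet if the queue is full
                }
            } catch (Exception e) {
                // socket closed; exit
            }
        });
        reader.setDaemon(true);
        reader.start();
    }

    // Processing side: block until a packet is available.
    byte[] nextPacket() throws InterruptedException {
        return queue.take();
    }
}
```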
EDIT 2 - There is some debate about whether UDP is susceptible to duplicates. It is certainly true that UDP has no innate duplicate detection or prevention. But it is also true that the IP packet routing fabric that is the internet is unlikely to spontaneously duplicate packets. So duplicates, if they do occur, are likely to occur because the sender has decided to resend a UDP packet. Thus, to my mind while UDP is susceptible to problems with duplicates, it does not cause them per se ... unless there is a bug in the OS protocol stack or in the IP fabric.
Although UDP supports packets up to 65535 bytes in length (including the UDP header, which is 8 bytes - but see note 1), the underlying transports between you and the destination do not support IP packets that long. For example, Ethernet frames have a maximum size of 1500 bytes - taking into account overhead for the IP and UDP headers, that means that any UDP packet with a data payload length of more than about 1450 is likely to be fragmented into multiple IP datagrams.
A maximum size UDP packet is going to be fragmented into at least 45 separate IP datagrams, and if any one of those fragments is lost, the entire UDP packet is lost. If your underlying packet loss rate is 1%, your application will see a loss rate of about 36%: each of the 45 fragments must arrive, and 0.99^45 ≈ 0.64, so only about 64% of whole packets survive.
If you want to see fewer packets lost, don't send huge packets - limit the data in each packet to about 1400 bytes (or even do your own "Path MTU discovery" to figure out the maximum size you can safely send without fragmentation). A sketch of this kind of sender-side chunking is below.
Of course, UDP is also subject to the limitations of IP, and IP datagrams have a maximum size of 65535, including the IP header. The IP header ranges in size from 20 to 60 bytes, so the maximum amount of application data transportable within a UDP packet might be as low as 65467.
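A sketch of sender-side chunking along those lines (the 1400-byte limit is an assumption, and a real protocol would also need sequence numbers so the receiver can reassemble frames and detect loss):

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

class ChunkedUdpSender {
    private static final int MAX_PAYLOAD = 1400;  // stay under a typical 1500-byte Ethernet MTU

    // Split a large frame into datagrams small enough to avoid IP fragmentation.
    static void sendFrame(DatagramSocket socket, InetAddress dest, int port, byte[] frame) throws Exception {
        for (int offset = 0; offset < frame.length; offset += MAX_PAYLOAD) {
            int len = Math.min(MAX_PAYLOAD, frame.length - offset);
            DatagramPacket packet = new DatagramPacket(frame, offset, len, dest, port);
            socket.send(packet);
        }
    }
}
```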
The problem might have to do with your transmit buffer getting filled up in your UDPSocket. Only send, in one go, the number of bytes indicated by UDPSocket.getSendBufferSize(). Use setSendBufferSize(int size) to increase this value.
If #send() is used to send a DatagramPacket that is larger than the setting of SO_SNDBUF then it is implementation specific if the packet is sent or discarded.
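As a small sketch of that check (DatagramSocket here stands in for the poster's "UDPSocket"):

```java
import java.net.DatagramSocket;

class SendBufferCheck {
    // Make sure the datagram we are about to send fits in the socket's send buffer.
    static void ensureSendBuffer(DatagramSocket socket, int packetSize) throws Exception {
        if (socket.getSendBufferSize() < packetSize) {
            socket.setSendBufferSize(packetSize);   // request a larger buffer; the OS may still cap it
        }
    }
}
```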
IP supports packets up to 65535 bytes including a 20-byte IP header, and UDP supports datagrams of up to 65507 bytes of data once the 20-byte IP header and the 8-byte UDP header are accounted for. However the network MTU is the practical limit, and don't forget that it has to accommodate those 28 bytes of headers as well as your data. The conservative limit for unfragmented UDP data is the 576-byte minimum reassembly size less all the overheads, i.e. about 548 bytes.
