If I have two connections to a server which require multiple reads on a channel to complete a packet, how will I know which read goes with which packet?
For example two packets which are received as four interleaved buffers:
PacketA buffer part1
PacketB buffer part1
PacketA buffer part2
PacketB buffer part2
The first part MAY have a header but the second part could arrive as being split anywhere along the packet.
When receiving a partial packet, how do I know which buffer goes where?
I'm thinking about building a map to associate each channel object with its respective output buffer which will hold the reassembled packet. Is this the way it's supposed to be done?
Indeed, typical practice is to have separate buffers for each channel. You don't necessarily need a map. You could attach extra information to each SelectionKey. Every key can hold an object of your choice that your code can retrieve later. This is a convenient place to store a buffer, or a set of buffers.
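As a rough illustration of the attachment approach, here is a minimal sketch. It assumes a fixed 64-byte buffer per channel, and it uses two java.nio.channels.Pipe objects as stand-ins for two client connections; the class and method names (PerKeyBufferDemo, readInto, pump) are made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

final class PerKeyBufferDemo {

    // Append whatever the channel currently has to the buffer attached to its key.
    static void readInto(SelectionKey key) throws IOException {
        ReadableByteChannel channel = (ReadableByteChannel) key.channel();
        ByteBuffer buffer = (ByteBuffer) key.attachment();
        channel.read(buffer); // may deliver only part of a packet
    }

    // Select and read until the attached buffers hold `expected` bytes in total.
    static void pump(Selector selector, int expected) throws IOException {
        while (totalBuffered(selector) < expected) {
            selector.select();
            for (SelectionKey key : selector.selectedKeys()) {
                readInto(key);
            }
            selector.selectedKeys().clear();
        }
    }

    static int totalBuffered(Selector selector) {
        int total = 0;
        for (SelectionKey key : selector.keys()) {
            total += ((ByteBuffer) key.attachment()).position();
        }
        return total;
    }

    static int send(Pipe pipe, String text) throws IOException {
        ByteBuffer data = ByteBuffer.wrap(text.getBytes(StandardCharsets.US_ASCII));
        int length = data.remaining();
        while (data.hasRemaining()) {
            pipe.sink().write(data);
        }
        return length;
    }

    static String contents(SelectionKey key) {
        ByteBuffer buffer = (ByteBuffer) key.attachment();
        return new String(buffer.array(), 0, buffer.position(), StandardCharsets.US_ASCII);
    }

    static String demo() {
        try (Selector selector = Selector.open()) {
            Pipe a = Pipe.open();
            Pipe b = Pipe.open();
            a.source().configureBlocking(false);
            b.source().configureBlocking(false);
            // The third argument is the attachment: one buffer per channel.
            SelectionKey keyA = a.source().register(selector, SelectionKey.OP_READ, ByteBuffer.allocate(64));
            SelectionKey keyB = b.source().register(selector, SelectionKey.OP_READ, ByteBuffer.allocate(64));

            // Interleaved partial writes: A part 1, B part 1, A part 2, B part 2.
            int sent = send(a, "A1") + send(b, "B1");
            pump(selector, sent);
            sent += send(a, "A2") + send(b, "B2");
            pump(selector, sent);

            String result = contents(keyA) + " " + contents(keyB);
            a.sink().close();
            a.source().close();
            b.sink().close();
            b.source().close();
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling PerKeyBufferDemo.demo() returns "A1A2 B1B2": each key's buffer ends up holding only the bytes from its own channel, in order, even though the writes were interleaved. With real sockets the attachment would more likely be a small per-connection state object than a bare ByteBuffer.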
I am currently using java.net.Socket to send messages from the client and reading messages from the server. All my messages are fairly short so far, and I have never had any problems.
One of my friends noticed that I was not handling message fragmentation, where the data could come in pieces, and has advised that I should create a buffer to handle this. I insisted that TCP handles this for me, but I'm not 100% sure.
Who is right?
Also, I plan on creating a client in C as well in the future. Do Berkeley sockets handle message fragmentation?
Details: Currently, in Java, the server creates a socket and reads the first byte from the message with InputStream#read(). That first byte determines the length of the entire message, and creates a byte array of the appropriate length, and calls InputStream#read(byte[]) once and assumes that the entire message has been read.
If you are talking about WebSockets, you may be mixing up several different concepts.
One is TCP/IP message fragmentation.
Another is how buffering works. You read buffers of data, and you need a framing protocol that tells you when you have a complete "message" (or frame). Basically you:
1. Read buffer.
2. Has complete header? No -> go to 1. Yes -> continue.
3. Read until you have all the bytes that the header indicates as the message length.
4. Has complete message? No -> go to 3. Yes -> continue.
5. Yield message.
6. Go to 1.
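The loop above could be sketched roughly as follows. This assumes a header that is nothing more than a 4-byte big-endian length, which is an illustrative choice and not something the question specifies; FrameDecoder and feed are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FrameDecoder {
    static final int HEADER_SIZE = 4; // assumed header: 4-byte big-endian payload length

    // Bytes received so far that do not yet form a complete message (kept in write mode).
    private ByteBuffer pending = ByteBuffer.allocate(1024);

    // Feed one chunk exactly as it came off the wire; returns every message completed by it.
    List<byte[]> feed(byte[] chunk) {
        ensureCapacity(chunk.length);
        pending.put(chunk);
        pending.flip(); // switch to read mode
        List<byte[]> messages = new ArrayList<>();
        while (pending.remaining() >= HEADER_SIZE) {          // complete header?
            int length = pending.getInt(pending.position());  // peek without consuming
            if (length < 0) {
                throw new IllegalStateException("negative length in header");
            }
            if (pending.remaining() - HEADER_SIZE < length) { // complete message?
                break;
            }
            pending.getInt(); // consume the header
            byte[] message = new byte[length];
            pending.get(message);
            messages.add(message); // yield message
        }
        pending.compact(); // keep the leftover bytes, back to write mode
        return messages;
    }

    private void ensureCapacity(int extra) {
        if (pending.remaining() >= extra) {
            return;
        }
        int needed = Math.max(pending.capacity() * 2, pending.position() + extra);
        ByteBuffer bigger = ByteBuffer.allocate(needed);
        pending.flip();
        bigger.put(pending);
        pending = bigger;
    }

    // Two messages sent back to back, then cut at arbitrary points like a real stream.
    static List<String> demo() {
        byte[] hello = "hello".getBytes(StandardCharsets.US_ASCII);
        byte[] hi = "hi".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer wire = ByteBuffer.allocate(2 * HEADER_SIZE + hello.length + hi.length);
        wire.putInt(hello.length).put(hello).putInt(hi.length).put(hi);
        byte[] bytes = wire.array();
        int[] cuts = {0, 3, 10, bytes.length}; // splits inside a header and across messages
        FrameDecoder decoder = new FrameDecoder();
        List<String> decoded = new ArrayList<>();
        for (int i = 0; i + 1 < cuts.length; i++) {
            byte[] chunk = Arrays.copyOfRange(bytes, cuts[i], cuts[i + 1]);
            for (byte[] message : decoder.feed(chunk)) {
                decoded.add(new String(message, StandardCharsets.US_ASCII));
            }
        }
        return decoded;
    }
}
```

FrameDecoder.demo() returns the list [hello, hi], regardless of where the chunks were cut. A production decoder would also want an upper bound on the accepted length so that a corrupt header cannot force a huge allocation.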
A third, separate thing is WebSocket message fragmentation. WebSocket already has a framing protocol: messages can be split across several data frames, and control frames can be interleaved with data frames: https://developer.mozilla.org/en-US/docs/WebSockets/Writing_WebSocket_servers#Message_Fragmentation
If you are writing a WebSocket client or server, you have to be ready for this situation.
Expanding on what nos said, TCP will break up large messages into smaller chunks, if the message is large enough. Often, it isn't. Often, the data you write is already split into parts (by you), into meaningful chunks like discrete messages.
The stuff about the reads/writes taking different amounts of calls comes from how the data is written, how it travels over the wire, and how you read it.
If you write 2 bytes 100 times, and then 20 seconds later go to read, it will say there are 200 bytes to be read, which you can read all at once if you want. If you pass a massive 2 MB buffer to be written (I don't even know if that's possible), it would take longer to write out, giving the reading program more of a chance to see the data spread across several read calls.
Details: Currently, in Java, the server creates a socket and reads the first byte from the message with InputStream#read(). That first byte determines the length of the entire message, and creates a byte array of the appropriate length, and calls InputStream#read(byte[]) once and assumes that the entire message has been read.
That won't work. Have a look at the contract for InputStream.read(byte[]). It isn't obliged to transfer more than one byte. The correct technique is to read the length byte and then use DataInputStream.readFully(), which has the obligation to fill the buffer.
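A minimal sketch of that technique, assuming the one-byte unsigned length prefix described in the question. LengthByteReader is a made-up name, and the "trickling" stream in the demo only exists to imitate a socket that hands over one byte per read.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class LengthByteReader {

    // Reads one message: an unsigned length byte followed by that many payload bytes.
    static byte[] readMessage(InputStream in) throws IOException {
        // DataInputStream does no buffering of its own, so wrapping per call loses no data.
        DataInputStream data = new DataInputStream(in);
        int length = data.readUnsignedByte(); // throws EOFException at end of stream
        byte[] payload = new byte[length];
        data.readFully(payload);              // loops internally until the array is full
        return payload;
    }

    // Test stream that never returns more than one byte per read(byte[], int, int).
    static InputStream trickle(byte[] bytes) {
        return new ByteArrayInputStream(bytes) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 1));
            }
        };
    }

    static String demo() {
        byte[] wire = {3, 'a', 'b', 'c', 2, 'd', 'e'};
        InputStream in = trickle(wire);
        try {
            String first = new String(readMessage(in), StandardCharsets.US_ASCII);
            String second = new String(readMessage(in), StandardCharsets.US_ASCII);
            return first + "," + second;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

LengthByteReader.demo() returns "abc,de" even though the underlying stream delivers a single byte at a time. A single call to read(byte[]) on the same stream would have returned after the first byte.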
I build a client side (SocketChannel) which is getting big messages (the size of each message is ~1MB-~2MB).
How can I get the message ?
I'm using selector. When the key isReadable I want to read all the packets of the receiving message.
How can I know that the receiving packets belongs to one message and not to the other message ?
The safest way to do this is to know the size of each message in advance. If you can change the server protocol to send the size of the message just before the message itself, then all the client has to do is read the size first, allocate enough memory for that size (in a ByteBuffer, for instance), then read data until it has the desired number of bytes.
If you cannot change the server protocol, then there has to be some way to recognize the start or end of a message, like a specific header or footer. Then you need to keep reading data until you reach the footer or the next header, depending on what you have.
Also keep in mind that for large messages, you will likely not have all the data in a single read(). You'll need to keep your selection key interested in OP_READ operations, adding a chunk of data into your buffer with each read(), until all the data has been read from the channel.
I am trying to read some data from a network socket using the following code -
Socket s = new Socket(address, 502);
byte[] response = new byte[1024];
InputStream is = s.getInputStream();
int count = is.read(response, 0, 100);
The amount of data isn't large. It is 16 bytes in total. However the read() statement does not read all the data in one go. It reads only 8 bytes of data into my buffer.
I have to make multiple calls to read() like this in order to read the data -
Socket s = new Socket(address, 502);
byte[] response = new byte[1024];
InputStream is = s.getInputStream();
int count = is.read(response, 0, 100);
count += is.read(response, count, 100-count);
Why is this happening? Why does read() not read the entire stream in one go?
Please note that the data is not arriving gradually. If I wait for 2 seconds before reading the data by making a call to Thread.sleep(2000) the behavior remains the same.
Why does read() not read the entire stream in one go?
Because it isn't specified to do so. See the Javadoc. It blocks until at least one byte is available, then returns some number between 1 and the supplied length, inclusive.
That in turn is because the data doesn't necessarily arrive all in one go. You have no control over how TCP sends and receives data. You are obliged to just treat it as a byte stream.
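Treating the socket as a byte stream usually means looping until the expected number of bytes has arrived. Below is a minimal sketch, assuming the reply length (16 bytes here, as in the question) is known beforehand; readExactly is a hypothetical helper, and the in-memory stream capped at 8 bytes per call merely imitates the behaviour described.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ReadLoop {

    // Keeps calling read() until `wanted` bytes are in `dest`, or fails on early end of stream.
    static int readExactly(InputStream in, byte[] dest, int wanted) throws IOException {
        int total = 0;
        while (total < wanted) {
            int n = in.read(dest, total, wanted - total); // returns 1..(wanted - total), or -1
            if (n == -1) {
                throw new EOFException("stream ended after " + total + " of " + wanted + " bytes");
            }
            total += n;
        }
        return total;
    }

    static String demo() {
        int[] calls = {0};
        // Stand-in for a socket stream that hands over at most 8 bytes per read call.
        InputStream in = new ByteArrayInputStream(new byte[16]) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                calls[0]++;
                return super.read(b, off, Math.min(len, 8));
            }
        };
        try {
            byte[] response = new byte[1024];
            int total = readExactly(in, response, 16);
            return total + " bytes in " + calls[0] + " reads";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

ReadLoop.demo() returns "16 bytes in 2 reads". Note the -1 check: adding the return value of read() to a running count without it silently corrupts the count when the peer closes the connection. On Java 9 and later, InputStream.readNBytes(byte[], int, int) performs a similar loop.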
I understand that it blocks until data arrives. "That in turn is because the data doesn't necessarily arrive all in one go." Why not is my question.
The data doesn't necessarily all arrive in one go because the network typically breaks it up into packets. IP is a packet switching protocol.
Does TCP transmit in blocks of 8 bytes?
Possibly, but probably not. The packet size depends on the network / networks that the data has traversed, but a typical internet packet size is around 1500 bytes.
If you are getting 8 bytes at a time, either your data is coming through a network with an unusually small packet size, or (more likely) the sender is sending the data 8 bytes at a time. The second explanation more or less jibes with what your other comments say.
And since I explicitly specify 100, a number much larger than the data in the buffer, shouldn't it attempt to read up to at least 100 bytes?
Well no. It is not specified to work that way, and it doesn't work that way. You need to write your code according to what the spec says.
It is possible that this has something to do with the way the device is being "polled". But without looking at the specs for the device (or even knowing what it is exactly) this is only a guess.
Maybe the data is arriving gradually not because of your reading but because of the sender.
The sender should use a BufferedOutputStream (in the middle) to make big chunks before sending (and use flush only when it's needed).
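A small sketch of the sender-side idea, with an in-memory CountingStream (a made-up class) standing in for socket.getOutputStream() so the effect can be observed without a network:

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class CoalescedWrites {

    // Stand-in for socket.getOutputStream() that counts the writes reaching it.
    static final class CountingStream extends ByteArrayOutputStream {
        int writes;

        @Override
        public synchronized void write(byte[] b, int off, int len) {
            writes++;
            super.write(b, off, len);
        }

        @Override
        public synchronized void write(int b) {
            writes++;
            super.write(b);
        }
    }

    static String demo() {
        CountingStream wire = new CountingStream();
        try (OutputStream out = new BufferedOutputStream(wire)) {
            out.write(new byte[8]); // first half of a 16-byte message
            out.write(new byte[8]); // second half
            out.flush();            // hand the whole message to the underlying stream at once
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return wire.writes + ":" + wire.size();
    }
}
```

CoalescedWrites.demo() returns "1:16": both 8-byte pieces reach the underlying stream in one write of 16 bytes. This only reduces the chance of the receiver seeing the data in pieces; TCP still makes no promise that one write on the sender corresponds to one read on the receiver, so the receiving side needs its read loop either way.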
Most datagram-receiving functions, such as C's recv or read, Java's DatagramPacket class or Python's socketserver, include a way to find out the amount of data received.
c:
int amount = recv(sock, buf, n, MSG_WAITALL);
java:
int amount = datagramPacket.getLength();
python:
class MyUDPHandler(socketserver.BaseRequestHandler):
    def handle(self):
        amount = len(self.request[0])
Are these reliable? Or is it possible that only parts of the message are received, due to for example packet fragmentation or network delay?
In other words: When I send a variable length chunk of data via udp and receive it at the other end, are these amount values exactly equal to the size of the original chunk?
Edit:
ninjalj made a good point and I want to include it here. What happens when the receiving function is interrupted, for instance by a signal? What happens when two threads simultaneously try to receive from the same socket?
UDP datagrams cannot be partially delivered¹; they are delivered as-is or not at all. So yes, you can be sure that the received datagram was sent exactly as you see it on the receiver's end.
Edit to incorporate Will's comment which is the best kind of correct (i.e., technically):
¹They can be fragmented at the IP level, but the network stack on the receiver side will either fully reassemble a datagram and pass it to the listening process as sent, or will not acknowledge that any data at all has been received.
Partial datagrams are only permissible with UDP Lite.
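For the Java case, a minimal loopback sketch (the 300-byte payload, the 2048-byte receive buffer and the 2-second timeout are arbitrary choices for illustration):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

final class UdpLengthDemo {

    // Sends one 300-byte datagram over loopback and returns the length the receiver reports.
    static int demo() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback); // port 0: any free port
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000); // fail instead of blocking forever if the datagram is lost

            byte[] payload = new byte[300];
            sender.send(new DatagramPacket(payload, payload.length, loopback, receiver.getLocalPort()));

            byte[] buffer = new byte[2048]; // must be at least as large as the biggest datagram
            DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
            receiver.receive(packet);
            return packet.getLength(); // size of the datagram actually received
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

UdpLengthDemo.demo() returns 300, the size of the datagram as sent. One caveat on the receiving side: if the buffer handed to DatagramPacket is smaller than the incoming datagram, the excess is silently discarded, so the reported length only matches the original when the buffer is large enough.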
I am using the Java Socket API for communication, but sometimes I receive two packets joined together in a single read. How can I avoid this? Is there any way to resolve it in Java NIO or NIO.2? I am sure that the packets are sent separately, but both end up stored in a single buffer.
Please note that a packet here is nothing but a logical separation of data. The data is sent by a third-party system, which sends the packets one by one, but I receive two packets at the same time.
This is the way it's supposed to work. TCP uses packets to transfer data, but that is not visible from the high-level socket API: you open an output stream and send as much data as you want. This data is split into packets by the TCP/IP protocol stack. At the receiving side, you open an input stream and receive the data without knowing how it was split into packets.
If you want two application-level packets, then design a transfer protocol using separators between your packets, or fixed-size chunks of data, or anything else that lets you distinguish what is part of one logical packet from what is part of the next.
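As one possible shape for the separator approach, here is a minimal sketch. It assumes a newline byte as the separator and that the separator never occurs inside a packet's payload; DelimiterSplitter is a hypothetical name.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class DelimiterSplitter {
    private static final byte DELIMITER = '\n'; // assumed separator, must not appear in payloads

    // Bytes of the logical packet that is still incomplete.
    private final ByteArrayOutputStream partial = new ByteArrayOutputStream();

    // Feed one chunk as read from the socket; returns every logical packet completed by it.
    List<String> feed(byte[] chunk) {
        List<String> packets = new ArrayList<>();
        for (byte b : chunk) {
            if (b == DELIMITER) {
                packets.add(new String(partial.toByteArray(), StandardCharsets.UTF_8));
                partial.reset();
            } else {
                partial.write(b);
            }
        }
        return packets;
    }
}
```

With this, a single read that contains "first\nsecond\n" yields two packets, and a read that ends in the middle of a packet yields nothing for that packet until the rest arrives in a later call to feed.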