As some background:
I have a connection to a server with a SocketChannel, SelectionKey...etc. On the client end, if I want to send something to the server, I just write my data into a ByteBuffer and send it through the socket channel. If all of it was written, I'm done and can return to OP_READ. If not all of it was written, I take the bytes left over, store them in a "to send" buffer somewhere and mark OP_WRITE on the key (is it a good idea to replace OP_READ so it's write-only?).
Therefore, the next time I call selectNow(), I'm assuming it will recognize OP_WRITE and attempt to flush more data through (which I will attempt to do by entering another writing loop with the data to write, repeating the previous step if needed).
This leads me to two questions:
Am I supposed to leave it in OP_WRITE until all the data has been flushed through? Or should I change to OP_READ and attempt any reads in between?
If the writing channel is full and I can't write, do I just keep looping until I can start writing stuff through? If the connection suddenly gets choked, I'm unsure if I'm supposed to just write what I can, flip back to OP_READ, attempt to read, then flip back to OP_WRITE. From what I've read, this appears to not be the correct way to do things (and may cause large overhead constantly switching back and forth?).
What is the optimal way to handle reading and writing bulk data when the buffers both may become full?
Reading sounds easy because you just loop until the data is consumed, but with writing... the server may be only writing and not reading. This would leave you with quite a full send buffer, and cycling around forever on OP_WRITE without reading would be bad. How do you avoid that situation? Do you set a timer after which you just stop attempting to write and start reading again if the send buffer is not clearing up? If so, do you remove OP_WRITE and remember it for later?
Side question: Do you even need OP_READ to read from the network? I'm unsure if it's like OP_WRITE, where you only mark it in a specific case (just in case I'm doing it wrong, since I have it on OP_READ 99.9% of the time).
Currently I just set my key to OP_READ and then leave it in that mode, waiting for data, and then go to OP_WRITE if and only if writing fails to send all the data (with a write() value of 0).
Am I supposed to leave it in OP_WRITE until all the data has been flushed through? Or should I change to OP_READ and attempt any reads in between?
There are differing views about that. Mine is that the peer should be reading every part of the response you're sending before he sends a new request, and if he doesn't he is just misbehaving, and you shouldn't encourage that by reading ahead. Otherwise you just run out of memory eventually, and you shouldn't let a client do that to you. Of course that assumes you're the server in a request-response protocol. Other situations have their own requirements.
If the writing channel is full and I can't write, do I just keep looping until I can start writing stuff through?
No, you wait for OP_WRITE to fire.
If the connection suddenly gets choked, I'm unsure if I'm supposed to just write what I can, flip back to OP_READ, attempt to read, then flip back to OP_WRITE. From what I've read, this appears to not be the correct way to do things (and may cause large overhead constantly switching back and forth?).
The overhead isn't significant, but it's the wrong thing to do in the situation I described above.
What is the optimal way to handle reading and writing bulk data when the buffers both may become full?
In general, read when OP_READ fires; write whenever you need to; and use OP_WRITE to tell you when an outbound stall has relieved itself.
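To make that concrete, here is a minimal sketch of that pattern, assuming everything runs on the selector thread and that each connection keeps at most one pending ByteBuffer. The Connection class and its field names are made up for illustration, not taken from the question: write directly whenever you have something to send, ask for OP_WRITE only when a write comes up short, and drop back to OP_READ once the backlog drains.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.SocketChannel;

// Hypothetical per-connection state, stored e.g. as the key attachment.
final class Connection {
    ByteBuffer pending; // data that could not be written in one go, or null

    // Called whenever the application wants to send something.
    void send(SelectionKey key, ByteBuffer data) throws IOException {
        SocketChannel ch = (SocketChannel) key.channel();
        if (pending == null) {
            ch.write(data);                     // try to write immediately
            if (!data.hasRemaining()) {
                return;                         // everything went out, nothing else to do
            }
        }
        // Short write (or something already queued): keep the leftover and ask for OP_WRITE.
        ByteBuffer leftover = ByteBuffer.allocate(
                (pending == null ? 0 : pending.remaining()) + data.remaining());
        if (pending != null) leftover.put(pending);
        leftover.put(data);
        leftover.flip();
        pending = leftover;
        key.interestOps(SelectionKey.OP_READ | SelectionKey.OP_WRITE);
    }

    // Called from the selector loop when OP_WRITE fires: the stall has relieved itself.
    void flush(SelectionKey key) throws IOException {
        SocketChannel ch = (SocketChannel) key.channel();
        ch.write(pending);
        if (!pending.hasRemaining()) {
            pending = null;
            key.interestOps(SelectionKey.OP_READ);  // backlog drained, stop asking for OP_WRITE
        }
    }
}
```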
Do you even need OP_READ to read from the network?
Yes, otherwise you just smoke the CPU.
Whenever you need to write, just set the interest ops to (OP_READ | OP_WRITE). When you finish writing, just set the interest ops back to OP_READ.
That's all you have to do.
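As a minimal snippet of that recipe (where `key` is whatever SelectionKey you registered for the connection):

```java
// Something to send: ask for write readiness as well as read readiness.
key.interestOps(SelectionKey.OP_READ | SelectionKey.OP_WRITE);

// Everything has been written: go back to read-only interest.
key.interestOps(SelectionKey.OP_READ);
```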
Related
I'm experimenting with some network code, and I'm trying to figure out a good way to trigger writes to my DatagramChannel after having processed an event on a separate thread. This is quite simple for TCP, since you have a separate socket on which you can register your write interest (see reactor pattern, etc). For UDP, however, registering and deregistering interest with the datagram channel doesn't work so well, since I'm basically just modifying the same selection key.
I feel like you either block on the send in the event handler thread (wrong because then we're using blocking sends), or you block on a queue or something to take the responses and write them (also wrong because then we're blocking the selector thread).
I'd like to do something like switch the interest to write once I have something to write, and then back again to read, but then I run the risk of a race where I set it back to read after a write has been queued up, and then that write waits until I get the next read, which is also bad.
Yes, I know that there are other, potentially better suited threading models for these sorts of things, but I'm experimenting, so I'm curious. =)
You don't have to set the interest-ops to write when you want to write. Just write, from whatever thread you happen to be in. Only if that write returns zero do you need to worry about write interest-ops, as per many answers here.
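For a non-blocking DatagramChannel that might look like the sketch below; the method and variable names are illustrative, not from the question:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;

// Hypothetical method called from the event-handler thread, not the selector thread.
void sendResponse(DatagramChannel channel, ByteBuffer response, InetSocketAddress target)
        throws IOException {
    // On a non-blocking DatagramChannel, send() either sends the whole datagram or nothing.
    int sent = channel.send(response, target);
    if (sent == 0) {
        // The socket send buffer is full: only now do you need to queue the datagram,
        // register OP_WRITE with the selector, and wake it up so the send can be retried.
    }
}
```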
Let's say I have some UDP channels and some TCP channels registered with my selector. Once the selector wakes up, can I just keep looping and reading as much information as I can from ALL keys (not just the selected ones) without looping back and performing another select()?
For TCP this does not make much sense, since I can read as much as possible into my ByteBuffer with a call to channel.read(), but for UDP you can only read one packet at a time with a call to channel.receive(). So how many packets do I read?
Do you see a problem with just continuing to read (and not just reading, but writing, connecting and accepting, in other words ALL key operations) until there is nothing else to do, and only then performing the select again? That way a UDP channel would not starve the other channels. You would process all channels as much as you can, reading one packet at a time from the UDP channels. I am particularly concerned about:
1) Performance hit of doing too many selects if I can just keep processing my keys without it.
2) Does the select() do anything fundamental that I cannot bypass in order to keep reading/writing/accepting/connecting?
Again, keep in mind that I will be processing all keys and not just the ones selected. If there is nothing to do for a key (no data) I just do nothing and continue to the next key.
I think you have to try it both ways. You can construct a plausible argument that says you should read every readable channel until read() returns zero, or that you should process one event per channel and do just one read each time. I probably favour the first but I can remember when I didn't.
Again, keep in mind that I will be processing all keys and not just the ones selected.
Why? You should process the events on the selected channels, and you might then want to perform timeout processing on the non-selected channels. I wouldn't conflate the two things, they are quite different. Don't forget to remove keys from the selectedKeys set whichever way you do it.
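A bare-bones dispatch loop along those lines might look like this sketch, with the per-event handling elided:

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;

// Act only on the selected keys, and remove each one from the selected set as you go.
void loop(Selector selector) throws IOException {
    while (true) {
        selector.select();                    // blocks until at least one channel is ready
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove();                      // forget this and the key stays "selected" forever
            if (!key.isValid()) continue;
            if (key.isAcceptable()) { /* accept a new connection          */ }
            if (key.isReadable())   { /* read from key.channel()          */ }
            if (key.isWritable())   { /* flush whatever output is pending */ }
        }
        // Timeout handling for idle, non-selected channels would go here, as a separate pass.
    }
}
```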
I've a situation where a thread opens a telnet connection to a target machine and reads the data from a program which spits out all the data in its buffer. After all the data is flushed out, the target program prints a marker. My thread keeps looking for this marker to close the connection (successful read).
Sometimes the target program does not print any marker; it keeps on dumping the data and my thread keeps on reading it (no marker is printed by the target program).
So I want to read the data only for a specific period of time (say 15 minutes, configurable). Is there any way to do this at the Java API level?
Use another thread to close the connection after 15 minutes. Alternatively, you could check after each read whether 15 minutes have passed and then simply stop reading and clean up the connection, but this would only work if you're sure the remote server will continue to send data (if it doesn't, the read will block indefinitely).
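One simple way to arrange the first option is a scheduled task that closes the socket when the deadline passes; closing it makes a read() that is blocked in the reading thread fail. This is only a sketch, and the names (`watchdog`, `armDeadline`) are made up for illustration:

```java
import java.io.IOException;
import java.net.Socket;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical watchdog: closes the socket after the configured number of minutes,
// which makes a blocked read() in the reading thread fail with a SocketException.
ScheduledExecutorService watchdog = Executors.newSingleThreadScheduledExecutor();

// Cancel the returned future if the marker arrives before the deadline.
ScheduledFuture<?> armDeadline(Socket socket, long minutes) {
    return watchdog.schedule(() -> {
        try {
            socket.close();
        } catch (IOException ignored) {
        }
    }, minutes, TimeUnit.MINUTES);
}
```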
Generally, no. Input streams don't provide timeout functionality.
However, in your specific case, that is, reading data from a socket, yes. What you need to do is set the SO_TIMEOUT on your socket to a non-zero value (the timeout you need in millisecs). Any read operations that block for the amount of time specified will throw a SocketTimeoutException.
Watch out though: even though your socket connection is still valid after this, continuing to read from it may bring unexpected results, as you've already half consumed your data. The easiest way to handle this is to close the connection, but if you keep track of how much you've read already, you can choose to recover and continue reading.
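A sketch of that approach, with an illustrative host, port, buffer size and timeout (none of them from the question):

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;

void readUntilMarkerOrTimeout() throws IOException {
    try (Socket socket = new Socket("target-host", 23)) {   // hypothetical telnet target
        socket.setSoTimeout(30_000);                         // SO_TIMEOUT: per-read timeout in ms
        InputStream in = socket.getInputStream();
        byte[] buf = new byte[4096];
        int n;
        try {
            while ((n = in.read(buf)) != -1) {
                // ... scan buf[0..n) for the end marker and return when it is found ...
            }
        } catch (SocketTimeoutException e) {
            // No data arrived within 30 seconds of a single read() call. The connection
            // is still valid, but the simplest safe reaction is to give up and close it.
        }
    }
}
```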
If you're using a Java Socket for your communication, you should have a look at the setSoTimeout(int) method.
The read() operation on the socket will block only for the specified time. After that, if no information is received, a java.net.SocketTimeoutException will be raised and if treated correctly, the execution will continue.
If the server really dumps data forever, the client will never be blocked in a read operation. You might thus regularly check (between reads) if the current time minus the start time has exceeded your configurable delay, and stop reading if it has.
If the client can be blocked in a synchronous read, waiting for the server to output something, then you might use a SocketChannel, and start a timer thread that interrupts the main reading thread, or shuts down its input, or closes the channel.
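The "check between reads" variant might look like the following sketch (the method name and buffer size are arbitrary):

```java
import java.io.IOException;
import java.io.InputStream;

// Stop once a configurable overall deadline has passed. This only works if the server
// keeps sending, so that read() keeps returning instead of blocking forever.
void readForAtMost(InputStream in, long maxMillis) throws IOException {
    long deadline = System.currentTimeMillis() + maxMillis;
    byte[] buf = new byte[4096];
    int n;
    while ((n = in.read(buf)) != -1) {
        // ... process buf[0..n), look for the marker ...
        if (System.currentTimeMillis() >= deadline) {
            break;                              // give up waiting for the marker
        }
    }
}
```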
To be more specific, I have written a server with Java NIO, and it works quite well. After some testing I have found that, for some reason, a call to the SocketChannel's write method takes 1 ms on average; the read method, on the other hand, takes 0.22 ms on average.
Now at first I was thinking that setting the send/receive buffer values on the Socket might help a bit, but after thinking about it, all the messages are very short (a few bytes) and I send a message about every 2 seconds on a single connection. Both send and receive buffers are well over 1024 bytes in size, so this can't really be the problem. I do have several thousand clients connected at once, though.
Now I am a bit out of ideas on this. Is this normal, and if it is, why?
I would start by using Wireshark to eliminate variables.
@Nuoji I am using non-blocking I/O and yes, I am using a Selector. As for when I write to a channel, I do the following:
Since what I wrote in the second paragraph of my post is true, I assume that the channel is ready for writing in most cases; hence I do not at first set the interest set on the key to write, but rather try to write to the channel directly. If, however, I cannot write everything to the channel (or anything at all, for that matter), I set the interest set on the key to write (that way, the next time I try to write to the channel it is ready to write). In my testing, where I got the results mentioned in the original post, this happens very rarely though.
And yes, I can give you samples of the code, although I didn't really want to bother anyone with it. Which parts in particular would you like to see, the selector thread or the write thread?
If I am only WRITING to a socket on an output stream, will it ever block? Only reads can block, right? Someone told me writes can block but I only see a timeout feature for the read method of a socket - Socket.setSoTimeout().
It doesn't make sense to me that a write could block.
A write on a Socket can block too, especially if it is a TCP Socket. The OS will only buffer a certain amount of untransmitted (or transmitted but unacknowledged) data. If you write stuff faster than the remote app is able to read it, the socket will eventually back up and your write calls will block.
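You can see this for yourself with a small self-contained demo on the loopback interface (a sketch, not production code): the accepted socket is never read from, so the sender's write() eventually blocks once the OS send and receive buffers fill up.

```java
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class BlockingWriteDemo {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0);                       // ephemeral port
             Socket sender = new Socket("localhost", server.getLocalPort());
             Socket receiver = server.accept()) {                             // never read from

            OutputStream out = sender.getOutputStream();
            byte[] chunk = new byte[8192];
            long written = 0;
            while (true) {
                out.write(chunk);        // blocks indefinitely once the buffers are full
                written += chunk.length;
                System.out.println("written so far: " + written);
            }
        }
    }
}
```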
It doesn't make sense to me that a write could block.
An OS kernel is unable to provide an unlimited amount of memory for buffering unsent or unacknowledged data. Blocking in write is the simplest way to deal with that.
Responding to these followup questions:
So is there a mechanism to set a timeout for this? I'm not sure what behavior it'd have... maybe throw away data if buffers are full? Or possibly delete older data in the buffer?
There is no mechanism to set a write timeout on a java.net.Socket. There is a Socket.setSoTimeout() method, but it affects accept() and read() calls ... and not write() calls. Apparently, you can get write timeouts if you use NIO, non-blocking mode, and a Selector, but this is not as useful as you might imagine.
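For completeness, here is roughly what that NIO approach looks like. It is only a sketch: it assumes the channel has already been put in non-blocking mode and is not simultaneously registered with another selector, and, as noted below, when it times out you still can't tell how much data actually reached the other end.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

// Returns false if the whole buffer could not be written within timeoutMillis.
boolean writeWithTimeout(SocketChannel channel, ByteBuffer data, long timeoutMillis)
        throws IOException {
    try (Selector selector = Selector.open()) {
        SelectionKey key = channel.register(selector, SelectionKey.OP_WRITE);
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (data.hasRemaining()) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0 || selector.select(remaining) == 0) {
                return false;            // timed out; how much the peer received is unknown
            }
            channel.write(data);
            selector.selectedKeys().clear();
        }
        key.cancel();
        return true;
    }
}
```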
A properly implemented TCP stack does not discard buffered data unless the connection is closed. However, when you get a write timeout, it is uncertain whether the data that is currently in the OS-level buffers has been received by the other end ... or not. The other problem is that you don't know how much of the data from your last write was actually transferred to OS-level TCP stack buffers. Absent some application level protocol for resyncing the stream*, the only safe thing to do after a timeout on write is to shut down the connection.
By contrast, if you use a UDP socket, write() calls won't block for any significant length of time. But the downside is that if there are network problems or the remote application is not keeping up, messages will be dropped on the floor with no notification to either end. In addition, you may find that messages are sometimes delivered to the remote application out of order. It will be up to you (the developer) to deal with these issues.
* It is theoretically possible to do this, but for most applications it makes no sense to implement an additional resyncing mechanism on top of an already reliable (to a point) TCP/IP stream. And if it did make sense, you would also need to deal with the possibility that the connection closed ... so it would be simpler to assume it closed.
The only way to do this is to use NIO and selectors.
See the writeup from the Sun/Oracle engineer in this bug report:
https://bugs.java.com/bugdatabase/view_bug.do?bug_id=4031100