Java NIO: Relationship between OP_ACCEPT and OP_READ? - java

I am re-writing the core NIO server networking code for my project, and I'm trying to figure out when I should "store" connection information for future use. For example, once a client connects in the usual manner, I store and associate the SocketChannel object for that connected client so that I can write data to that client at any time. Generally I use the client's IP address (including port) as the key in a HashMap that maps to the SocketChannel object. That way, I can easily do a lookup on their IP address and asynchronously send data to them via that SocketChannel.
This might not be the best approach, but it works, and the project is too large to change its fundamental networking code, though I would consider suggestions. My main question, however, is this:
At what point should I "store" the SocketChannel for future use? I have been storing a reference to the SocketChannel once the connection is accepted (via OP_ACCEPT). I feel that this is an efficient approach, because I can assume that the map entry already exists when the OP_READ event comes in. Otherwise, I would need to do a computationally expensive check on the HashMap every time OP_READ occurs, and it is obvious that MANY more of those will occur for a client than OP_ACCEPT. My fear, I guess, is that there may be some connections that become accepted (OP_ACCEPT) but never send any data (OP_READ). Perhaps this is possible due to a firewall issue or a malfunctioning client or network adaptor. I think this could lead to "zombie" connections that are not active but also never receive a close message.
Part of my reason for re-writing my network code is that on rare occasions, I get a client connection that has gotten into a strange state. I'm thinking the way I've handled OP_ACCEPT versus OP_READ, including the information I use to assume a connection is "valid" and can be stored, could be wrong.
I'm sorry my question isn't more specific, I'm just looking for the best, most efficient way to determine if a SocketChannel is truly valid so I can store a reference to it. Thanks very much for any help!

If you're using Selectors and non-blocking IO, then you might want to consider letting NIO itself keep track of the association between a channel and its stateful data. When you call SelectableChannel.register() (for example, on your SocketChannel), you can use the three-argument form to pass in an "attachment". From then on, the SelectionKey returned by that call will hand back the attachment object that you provided whenever you call attachment() on it. (This is pretty clearly inspired by the "void *user_data" type of argument in OS-level APIs.)
That attachment stays with the key, so it's a convenient place to keep state data. The nice thing is that all the mapping from channel to key to attachment will already be handled by NIO, so you do less bookkeeping. Bookkeeping--like Map lookups--can really hurt inside of an IO responder loop.
As an added feature, you can also change the attachment later, so if you needed different state objects for different phases of your protocol, you can keep track of that on the SelectionKey, too.
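A minimal sketch of the attachment idea, assuming a loopback connection for demonstration. The ClientState class and its fields are hypothetical and only stand in for whatever per-connection data the server needs; the register() and attachment() calls are the standard java.nio.channels API.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class AttachmentDemo {
    /** Hypothetical per-connection state; the name and fields are illustrative only. */
    static final class ClientState {
        final String remote;
        long bytesRead;
        ClientState(String remote) { this.remote = remote; }
    }

    /** Accepts one loopback connection and returns the state retrieved from its key. */
    static ClientState acceptOne() throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {
                accepted.configureBlocking(false);
                // Three-argument register: the state object travels with the key.
                SelectionKey key = accepted.register(selector, SelectionKey.OP_READ,
                        new ClientState(accepted.getRemoteAddress().toString()));
                // Later, in the select loop, no map lookup is needed:
                return (ClientState) key.attachment();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("state for " + acceptOne().remote);
    }
}
```

In a real select loop the cast on key.attachment() would happen in the OP_READ branch, and key.attach(newState) can swap the object when the protocol moves to another phase.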
Regarding the odd state you find your connections in, there are some subtleties in using NIO and selectors that might be biting you. For example, once a SelectionKey signals that it's ready for read, it will continue to be ready for read the next time some other thread calls select(). So, it's easy to end up with multiple threads attempting to read the socket. On the other hand, if you attempt to deregister the key for reading while you're doing the read, then you can end up with threading bugs because SelectionKeys and their interest ops can only be manipulated by the thread that actually calls select(). So, overall, this API has some sharp edges, and it's tricky to get all the state handling correct.
Oh, and one more possibility: depending on who closes the socket first, you may or may not notice a closed socket until you explicitly ask. I can't recall the exact details off the top of my head, but it's something like this: the client half-closes its end of the socket, this does not signal any ready op on the selection key, so the SocketChannel never gets read. This can leave a bunch of sockets in TIME_WAIT status on the client.
As a final recommendation, if you're doing async IO, then I definitely recommend a couple of books in the "Pattern Oriented Software Architecture" (POSA) series. Volume 2 deals with a lot of IO patterns. (For instance, NIO lends itself very well to the Reactor pattern from Volume 2. It addresses a bunch of those state handling problems I mention above.) Volume 4 includes those patterns and embeds them in the larger context of distributed systems in general. Both of these books are a very valuable resource.

An alternative may be to look at an existing NIO socket framework, possible candidates are:
Apache MINA
Sun Grizzly
JBoss Netty

Related

Java nio SelectionKey.register and interestops

I have been working on Java NIO communications and reading various writeups regarding this. The document says that I could "or" ops that I am interested in. However, I haven't seen a single example of
channel.register(selector, SelectionKey.OP_ACCEPT | SelectionKey.OP_READ | SelectionKey.OP_WRITE)
Is this a bad idea?
Yep. It's wrong.
The only thing that can deliver you an OP_ACCEPT is a ServerSocketChannel.
The only thing that can deliver you an OP_READ or OP_WRITE is a SocketChannel or a DatagramChannel.
So there is no way a single channel can deliver you all three of those events. So there is no sense in registering for them all.
OP_WRITE is almost always ready. It rarely if ever makes sense to register for OP_READ and OP_WRITE at the same time.
The validOps() method tells you which operations are valid for a given channel, not that you should need to know at runtime.
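As a small illustration of the point about validOps(), the sketch below checks what a ServerSocketChannel reports and shows that register() rejects an interest set containing an operation the channel does not support. The class and method names are made up for this example.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class ValidOpsDemo {
    /** True if the listening channel supports OP_ACCEPT and nothing else. */
    static boolean serverOnlyAccepts() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            return server.validOps() == SelectionKey.OP_ACCEPT;
        }
    }

    /** True if registering a SocketChannel for OP_ACCEPT is rejected. */
    static boolean acceptRejectedOnSocketChannel() throws IOException {
        try (Selector selector = Selector.open();
             SocketChannel channel = SocketChannel.open()) {
            channel.configureBlocking(false);
            try {
                channel.register(selector, SelectionKey.OP_ACCEPT | SelectionKey.OP_READ);
                return false;
            } catch (IllegalArgumentException expected) {
                return true; // OP_ACCEPT is not in SocketChannel.validOps()
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("server only accepts: " + serverOnlyAccepts());
        System.out.println("OP_ACCEPT rejected on SocketChannel: " + acceptRejectedOnSocketChannel());
    }
}
```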

What's the proper way to continuously read socket messages through DataInputStream?

I'm trying to build a Java BitTorrent client. From what I understand, after peers handshake with one another they may start sending messages to each other, often sporadically.
Using a DataInputStream connection I can read messages, but if I call a read and nothing is on the stream, the call blocks. Is there a way I can tell if something is being sent over the stream? Or should I create a new thread that continuously reads the stream for messages from each peer until the client shuts them down?
I think you need to do some major experimenting so that you can start to learn the basics of socket I/O. Trying to answer your question "as is" is difficult, because you don't yet understand enough to ask the question in a manner that it can be answered.
If you want to be able to tell if there is data to read, then you should not use the blocking I/O approach. Instead, you will need to use the APIs known as "NIO", which allow you to "select" a socket that has data to read (i.e. a socket that is associated with a buffer that already has data in it).
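A rough sketch of "select, then read" follows. To keep it self-contained it uses a java.nio.channels.Pipe as a stand-in for a peer connection; a non-blocking SocketChannel would be registered with the selector in the same way. The class and method names are invented for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

public class ReadWhenReady {
    /** Waits up to timeoutMillis for data, then reads what is available. Returns "" on timeout. */
    static String readOnce(Selector selector, long timeoutMillis) throws IOException {
        StringBuilder out = new StringBuilder();
        if (selector.select(timeoutMillis) == 0) {
            return ""; // nothing became readable within the timeout
        }
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove(); // selected keys must be removed by hand
            if (key.isValid() && key.isReadable()) {
                ByteBuffer buf = ByteBuffer.allocate(1024);
                int n = ((ReadableByteChannel) key.channel()).read(buf);
                if (n < 0) { key.cancel(); continue; } // other side closed
                buf.flip();
                out.append(StandardCharsets.UTF_8.decode(buf));
            }
        }
        return out.toString();
    }

    /** Polls once before and once after a write; the two results are joined with '|'. */
    static String demo() throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SinkChannel sink = pipe.sink();
             Pipe.SourceChannel source = pipe.source()) {
            source.configureBlocking(false);
            source.register(selector, SelectionKey.OP_READ);
            String before = readOnce(selector, 50); // nothing written yet
            sink.write(ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8)));
            String after = readOnce(selector, 1000);
            return before + "|" + after;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

One selector can watch many peers this way, so a single thread only touches the channels that actually have data.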
This will make much more sense after you write a lot of code and mess it up a few times. The underlying I/O primitives are actually quite primitive (pun intended). In this industry, we just made up lots of complicated terms and function names and API descriptions so that people would think that network communication is magic. It's not. It's generally no more complicated than memcpy().
There is a function in C called select(). In the scenario you've described, you need an equivalent of select in Java. And that is, as cpurdy mentioned, non-blocking socket I/O, or NIO. Cursory googling returned the following links:
http://tutorials.jenkov.com/java-nio/socket-channel.html
http://www.owlmountain.com/tutorials/NonBlockingIo.htm
http://rox-xmlrpc.sourceforge.net/niotut/index.htm
You might want to take a look at the Netty project: http://netty.io/
It is very easy with Netty to get started on network programming.

Is it possible to use multiple java ObjectOutputStream objects to write to a single java ObjectInputStream object?

I have a standard client/server setup.
The program I'd like to build acts a lot like a mail office (which is my server). Multiple people (clients, each with an ObjectOutputStream) hand the office (the server, with a single ObjectInputStream) mail with an attached address, and the office sends the mail where it is supposed to go. If possible, I'd like to have one ObjectInputStream in the server that blocks, waiting for "mail" to come in from any ObjectOutputStream, then sends the "mail" where it's supposed to go. This way I can just have one thread that is completely dedicated to receiving data and sending it.
I will have a thread for each person's client with their ObjectOutputStream, but would like to not also need a matching thread in the server to communicate with each person. I am interested in this idea because I find it excessive to build tons of threads to separately handle connections, when it's possible that a single thread will only send data once in my case.
Is this feasible? or just silly?
Use a JMS (Java Message Service) queue; it is the design pattern for this case.
http://en.wikipedia.org/wiki/Java_Message_Service
If you have in the server app just one instance of ObjectInputStream and you have many clients then this instance needs to be shared by all threads thus you need to synchronize the access to it.
You can read more here. Hope this helps.
OR
You can have a pool of ObjectInputStream instances and, using an assignment algorithm such as Round Robin (doc), return the same instance to every x-th thread, for example. This will make the flow in the server app more parallel.
Your question doesn't make sense. You need a separate pair of ObjectInputStream and ObjectOutputStream per Socket. You also need a Thread per Socket, unless you are prepared to put up with the manifest limitations of polling via InputStream.available(), which won't prevent your reads from blocking. If you are using Object Serialization you are already committed to blocking I/O and therefore to a thread per Socket.
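One way to reconcile a thread per Socket with the asker's wish for a single dispatching thread is sketched below. This is an assumption about how the pieces could fit together, not something any of the answers spell out: each connection gets its own ObjectInputStream and reader thread, and all readers feed one BlockingQueue that a single consumer drains. MailOffice and its methods are hypothetical names.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class MailOffice {
    /** Starts a daemon thread that reads objects from one socket into the shared inbox. */
    static void startReader(Socket socket, BlockingQueue<Object> inbox) {
        Thread t = new Thread(() -> {
            try (Socket s = socket;
                 ObjectInputStream in = new ObjectInputStream(s.getInputStream())) {
                while (true) {
                    inbox.put(in.readObject()); // blocks only this connection's thread
                }
            } catch (IOException | ClassNotFoundException e) {
                // EOF or a broken stream ends this reader; other connections are unaffected.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.setDaemon(true);
        t.start();
    }

    /** Sends one message as a client would: its own socket and its own ObjectOutputStream. */
    static void sendOne(int port, String mail) throws IOException {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port);
             ObjectOutputStream out = new ObjectOutputStream(s.getOutputStream())) {
            out.writeObject(mail);
            out.flush();
        }
    }

    /** Accepts the given number of loopback clients and returns how many messages arrived. */
    static int demo(int clients) throws Exception {
        BlockingQueue<Object> inbox = new LinkedBlockingQueue<>();
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            for (int i = 0; i < clients; i++) {
                sendOne(server.getLocalPort(), "mail-" + i);
                startReader(server.accept(), inbox);
            }
            int received = 0;
            for (int i = 0; i < clients; i++) {
                // The single "office" thread would take from the inbox like this.
                if (inbox.poll(5, TimeUnit.SECONDS) != null) received++;
            }
            return received;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("received " + demo(3) + " messages");
    }
}
```

The reader threads spend nearly all their time blocked in readObject(), so their cost is mostly memory; the routing logic still lives in one place.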

Java NIO Threading issue with SocketChannel.write()

Sometimes, while sending a large amount of data via SocketChannel.write(), the underlying TCP buffer gets filled up, and I have to continually re-try the write() until the data is all sent.
So, I might have something like this:
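public void send(ByteBuffer bb, SocketChannel sc) throws IOException, InterruptedException {
    sc.write(bb);
    // keep retrying until the whole buffer has been written
    while (bb.remaining() > 0) {
        Thread.sleep(10);
        sc.write(bb);
    }
}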
The problem is that the occasional issue with a large ByteBuffer and an overflowing underlying TCP buffer means that this call to send() will block for an unexpected amount of time. In my project, there are hundreds of clients connected simultaneously, and one delay caused by one socket connection can bring the whole system to a crawl until this one delay with one SocketChannel is resolved. When a delay occurs, it can cause a chain reaction of slowing down in other areas of the project, and having low latency is important.
I need a solution that will take care of this TCP buffer overflow issue transparently and without causing everything to block when multiple calls to SocketChannel.write() are needed. I have considered putting send() into a separate class extending Thread so it runs as its own thread and does not block the calling code. However, I am concerned about the overhead necessary in creating a thread for EACH socket connection I am maintaining, especially when 99% of the time, SocketChannel.write() succeeds on the first try, meaning there's no need for the thread to be there. (In other words, putting send() in a separate thread is really only needed if the while() loop is used -- only in cases where there is a buffer issue, perhaps 1% of the time) If there is a buffer issue only 1% of the time, I don't need the overhead of a thread for the other 99% of calls to send().
I hope that makes sense... I could really use some suggestions. Thanks!
Prior to Java NIO, you had to use one Thread per socket to get good performance. This is a problem for all socket based applications, not just Java. Support for non-blocking IO was added to all operating systems to overcome this. The Java NIO implementation is based on Selectors.
See The definitive Java NIO book and this On Java article to get started. Note however, that this is a complex topic and it still brings some multithreading issues into your code. Google "non blocking NIO" for more information.
The more I read about Java NIO, the more it gives me the willies. Anyway, I think this article answers your problem...
http://weblogs.java.net/blog/2006/05/30/tricks-and-tips-nio-part-i-why-you-must-handle-opwrite
It sounds like this guy has a more elegant solution than the sleep loop.
Also I'm fast coming to the conclusion that using Java NIO by itself is too dangerous. Where I can, I think I'll probably use Apache MINA which provides a nice abstraction above Java NIO and its little 'surprises'.
You don't need the sleep() as the write will either return immediately or block.
You could have an executor which you pass the write to if it doesn't write the first time.
Another option is to have a small pool of threads to perform the writes.
However, the best option for you may be to use a Selector (as has been suggested) so you know when a socket is ready to perform another write.
For hundreds of connections, you probably don't need to bother with NIO. Good old fashioned blocking sockets and threads will do you.
With NIO, you can register interest in OP_WRITE for the selection key, and you will get notified when there is room to write more data.
There are a few things you need to do, assuming you already have a loop using Selector.select() to determine which sockets are ready for I/O.
Set the socket channel to non-blocking after you've created it, sc.configureBlocking(false);
Write (possibly parts of) the buffer and check if there's anything left. The buffer itself takes care of current position and how much is left.
Something like
sc.write(bb);
if (bb.remaining() == 0) {
    // we're done with this buffer; remove it from the select set if there's nothing else to send
} else {
    // do other stuff / return to select loop
}
Get rid of your while loop that sleeps
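The steps above could be fleshed out roughly as follows. This is a sketch under assumptions: WriteQueue is a hypothetical helper (it could be the key's attachment), and it is meant to be used only from the thread that runs the select loop.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.WritableByteChannel;
import java.util.ArrayDeque;
import java.util.Deque;

public class WriteQueue {
    private final Deque<ByteBuffer> pending = new ArrayDeque<>();

    /** Queues data for sending; follow up with flushAndUpdateInterest(). */
    void enqueue(ByteBuffer data) {
        pending.addLast(data);
    }

    /**
     * Writes as much as the channel accepts without blocking.
     * Returns true when nothing is left to send.
     */
    boolean flush(WritableByteChannel channel) throws IOException {
        while (!pending.isEmpty()) {
            ByteBuffer head = pending.peekFirst();
            channel.write(head);
            if (head.hasRemaining()) {
                return false; // send buffer is full; try again when the key is writable
            }
            pending.removeFirst();
        }
        return true;
    }

    /** Call after enqueue() and whenever key.isWritable() is true in the select loop. */
    void flushAndUpdateInterest(SelectionKey key) throws IOException {
        boolean done = flush((WritableByteChannel) key.channel());
        int ops = key.interestOps();
        // Ask for OP_WRITE only while data is pending; otherwise select() would spin.
        key.interestOps(done ? ops & ~SelectionKey.OP_WRITE : ops | SelectionKey.OP_WRITE);
    }
}
```

In the common case the first write sends everything and OP_WRITE is never registered, so the extra machinery only kicks in when the TCP send buffer is actually full.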
I am facing some of the same issues right now:
- If you have a small number of connections, but with large transfers, I would just create a thread pool, and let the writes block for the writer threads.
- If you have a lot of connections then you could use full Java NIO, and register OP_WRITE on your accept()ed sockets, and then wait for the selector to come in.
The O'Reilly Java NIO book has all this.
Also:
http://www.exampledepot.com/egs/java.nio/NbServer.html?l=rel
Some research online has led me to believe NIO is pretty overkill unless you have a lot of incoming connections. Otherwise, if it's just a few large transfers, then just use a write thread. It will probably have quicker response. A number of people have issues with NIO not responding as quickly as they want. Since your write thread blocks on its own, it won't hurt you.

Java - Multiple selectors in multiple threads for nonblocking sockets

I'm writing a Java application that will instantiate objects of a class to represent clients that have connected and registered with an external system on the other side of my application.
Each client object has two nested classes within it, representing front-end and back-end. The front-end class will continuously receive data from the actual client, and send indications and data to the back-end class, which will take that data from the front-end and send it to the external system using the proper format and protocol that system requires.
In the design, we're looking to have each instantiation of a client object be a thread. Then, within each thread will naturally be two sockets [EDIT]with their own NIO channels each[/EDIT], one client-side, one system-side residing in the front- and back-end respectively. However, this now introduces the need for nonblocking sockets. I have been reading the tutorial here that explains how to safely use a Selector in your main thread for handling all threads with connections.
But what I need are multiple selectors, each operating in its own thread. From reading the aforementioned tutorial, I've learned that the key sets in a Selector are not threadsafe. Does this mean that separate Selectors instantiated in their own respective threads may create conflicting keys if I try to give them each their own pair of sockets and channels? Moving the selector up to the main thread is a slight possibility, but far from ideal based on the software requirements I've been given. Thank you for your help.
Using multiple selectors would be fine as long as you do not register the same channel with the same interests (OP_READ / OP_WRITE etc) with both the selector instances. Registering the same channel with multiple selector instances could cause a problem where selector1.select() could consume an event that selector2.select() could be interested in.
The default selectors on most of the platforms are poll() [or epoll()] based.
Selector.select internally calls the int poll( ListPointer, Nfdsmsgs, Timeout) method.
where the ListPointer structure can then be initialized as follows:
list.fds[0].fd = file_descriptorA;
list.fds[0].events = requested_events;
list.msgs[0].msgid = message_id;
list.msgs[0].events = requested_events;
That said, I would recommend the usage of a single selecting thread as mentioned in the ROX RPC nio tutorial. NIO implementations are platform dependent, and it is quite possible that what works on one platform may not work on another. I have seen problems across minor versions too.
For instance, AIX JDK 1.6 SR2 used a poll() based selector - PollSelectorImpl and the corresponding selector provider as PollSelectorProvider, our server ran fine. When I moved to AIX JDK 1.6 SR5, which used a pollset interface based optimized selector (PollSetSelectorImpl), we encountered frequent hangs in our server in the select() and socketchannel.close(). One reason I see is that we open multiple selectors in our application (as opposed to the ideal one Selecting Thread model) and the implementation of the PollSetSelectorImpl as described here.
If you have to use this single socket connection, you have to separate the process of receiving and writing data from and to the channel from the data processing itself. You must not hand the channel itself to other threads. The channel is like a bus. The bus (the single thread that manages the channel) has to read the data and write it into a (thread-safe) input queue, including the information required, so your client thread(s) can pick up the correct datagram package from the queue. If a client thread wants to write data, that data is written to an output queue, which is then read by the channel's thread to write the data to the channel.
So, from a concept of sharing a connection between actors with unpredictable processing times (which is the main reason for blocking), you move to a concept of asynchronous data reading, data processing and data writing. It is no longer the processing time that is unpredictable, but the time at which your data is read or written. Non-blocking means that the stream of data stays as constant as possible, regardless of how much time is required to process that data.
