Let's say I'm running a server, I set the client SocketChannels that I accept to non-blocking mode, and I read them through a thread pool's threads. But what does that buy me? I still need to read the full client request before processing it, which means I need to make multiple read calls.
I've also come across articles saying that threads should block naturally so that other threads get a chance to run. However, this won't happen in the case described above, as these threads will not block.
So how would non-blocking IO be efficient? How do I make sense of all this? Is there some multi-core CPU angle to it, perhaps? But how?
EDIT: found a pretty good link that explains it programmatically:
http://rox-xmlrpc.sourceforge.net/niotut/
The problem with blocking IO starts when you want to scale your server program. You'd have to hold a blocked thread per request, and many, many requests introduce many, many threads. That becomes a real burden for a server application that serves thousands or more concurrent requests involving IO.
With NIO's non-blocking IO, this request-to-thread coupling is no longer necessary: any thread can complete the IO operation of any request. This lets you apply the pooling pattern to your IO-handling threads and significantly reduce the thread creation and management overhead. On the other hand, you'll have to work harder to maintain data consistency, but that is the price of scalability.
Unless you want busy waiting (which sounds unlikely), if you use non-blocking IO you usually use a small number of threads (maybe only one) together with a Selector.
If you are going to use blocking IO, that is when you dedicate one or two threads per connection.
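For a concrete picture, here is a minimal sketch of the single-selector approach (the port, buffer size, and lack of error handling and request framing are all simplifications):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class SelectorServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8080));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buffer = ByteBuffer.allocate(4096);
        while (true) {
            selector.select();                        // blocks until some channel is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {             // a new client connection
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {        // data arrived on an existing connection
                    SocketChannel client = (SocketChannel) key.channel();
                    buffer.clear();
                    int read = client.read(buffer);
                    if (read == -1) {                 // client closed the connection
                        key.cancel();
                        client.close();
                    } else {
                        buffer.flip();
                        // accumulate bytes per connection until a full request is assembled
                    }
                }
            }
        }
    }
}
```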
Related
I have a Java program with multiple sockets that occasionally have data that needs to be read and processed, but there are indeterminate stretches of time during which there is no data to read. I need a good way to constantly check whether there is data in the sockets and process it. Assigning one thread per socket is not a good idea since there could be too many sockets, which would use too much memory.
Currently, I have a couple of threads, each assigned to service its own list of sockets. If there is nothing to read in any of the sockets, sleep one second, then loop. If there is something to read in any of the sockets, loop without waiting and iterate through the sockets again.
The reason I do this is because I don't want to use up too many resources if there is nothing to read, and the one-second delay is not a problem. The only downside is that there is no flexibility for sockets to jump between threads, so the worst-case scenario is that a single thread is overloaded with work while the other threads are doing nothing.
Another idea I've had: create a thread pool, queue up all the sockets to be serviced, and re-add them once they are serviced. But there is no good way to know when none of the sockets need servicing and the threads can take a break to free up CPU cycles.
Is there a good way to assign threads tasks, but not overload computer resources if there is nothing to do?
Ideally, an event would be triggered each time there is data available in a socket, but as far as I know there is no way to do this, so I must poll the sockets.
To reiterate, I do not want a one to one relationship between socket and thread.
there could be too many sockets, which would use too much memory.
You can handle 1,000 to 10,000 sockets this way, one thread each. Memory is much cheaper than it was when NIO was introduced 12 years ago, and threads are more efficient and scalable than they used to be.
I have a couple of threads, each assigned to service its own list of sockets. If there is nothing to read in any of the sockets, sleep one second, then loop.
I use a pause which busy-waits for a short period, then yields, and finally sleeps for an escalating period of time.
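A rough sketch of what such an escalating pause might look like (the thresholds below are purely illustrative, not tuned values, and Thread.onSpinWait() needs Java 9+):

```java
// Sketch of an escalating pause: spin briefly, then yield, then sleep for
// increasing periods. Call reset() whenever useful work was done.
class EscalatingPause {
    private int idleCount = 0;

    void pause() throws InterruptedException {
        idleCount++;
        if (idleCount < 100) {
            Thread.onSpinWait();                    // busy-wait: lowest latency
        } else if (idleCount < 200) {
            Thread.yield();                         // let other runnable threads in
        } else {
            long millis = Math.min(10, (idleCount - 200) / 100 + 1);
            Thread.sleep(millis);                   // escalate up to a 10 ms sleep
        }
    }

    void reset() {
        idleCount = 0;
    }
}
```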
You can use Selectors, but these are not simple to use correctly. In this situation I would use a library like Netty, or at the very least read the code it uses.
The only downside is that there is no flexibility for sockets to jump between threads, so the worst-case scenario is that a single thread is overloaded with work while the other threads are doing nothing.
This is where using a thread per socket is better.
I must poll the sockets.
You can use Selectors, but these are single-threaded, and switching sockets between selectors is not simple.
I would reconsider using more threads for simplicity.
I am trying to understand SocketChannels, and NIO in general. I know how to work with regular sockets and how to make a simple thread-per-client server (using the regular blocking sockets).
So my questions:
What is a SocketChannel?
What extra do I get when working with a SocketChannel instead of a Socket?
What is the relationship between a channel and a buffer?
What is a selector?
The first sentence in the documentation is "A selectable channel for stream-oriented connecting sockets." What does that mean?
I have also read this documentation, but somehow I am not getting it...
A Socket is a blocking input/output device. It makes the Thread that is using it block on reads and potentially also block on writes if the underlying buffer is full. Therefore, you have to create a bunch of different threads if your server has a bunch of open Sockets.
A SocketChannel is a non-blocking way to read from sockets, so that you can have one thread communicate with a bunch of open connections at once. This works by adding a bunch of SocketChannels to a Selector, then looping on the selector's select() method, which can notify you if sockets have been accepted, received data, or closed. This allows you to communicate with multiple clients in one thread and not have the overhead of multiple threads and synchronization.
Buffers are another feature of NIO that allows you to access the underlying data from reads and writes to avoid the overhead of copying data into new arrays.
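As a small illustration, here is a minimal sketch of reading from a non-blocking SocketChannel into a ByteBuffer (the channel is assumed to be already connected and configured as non-blocking):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

class ReadExample {
    // Reads whatever bytes are currently available; returns immediately even if
    // nothing has arrived, because the channel is non-blocking.
    static void readOnce(SocketChannel channel) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(1024);
        int n = channel.read(buf);          // may read 0 bytes without blocking
        if (n > 0) {
            buf.flip();                     // switch the buffer from filling to draining
            while (buf.hasRemaining()) {
                byte b = buf.get();         // consume only the bytes that actually arrived
            }
        }
    }
}
```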
By now NIO is so old that few remember what Java was like before 1.4, which is what you need to know in order to understand the "why" of NIO.
In a nutshell, up to Java 1.3, all I/O was of the blocking type. And worse, there was no analog of the select() system call to multiplex I/O. As a result, a server implemented in Java had no choice but to employ a "one-thread-per-connection" service strategy.
The basic point of NIO, introduced in Java 1.4, was to make the functionality of traditional UNIX-style multiplexed non-blocking I/O available in Java. If you understand how to program with select() or poll() to detect I/O readiness on a set of file descriptors (sockets, usually), then you will find the services you need for that in NIO: you will use SocketChannels for non-blocking I/O endpoints, and Selectors for fdsets or pollfd arrays. Servers with threadpools, or with threads handling more than one connection each, now become possible. That's the "extra".
A Buffer is the kind of byte array you need for non-blocking socket I/O, especially on the output/write side. If only part of a buffer can be written immediately, with blocking I/O your thread will simply block until the entirety can be written. With non-blocking I/O, your thread gets a return value of how much was written, leaving it up to you to handle the left-over for the next round. A Buffer takes care of such mechanical details by explicitly implementing a producer/consumer pattern for filling and draining, it being understood that your threads and the JVM's kernel will not be in sync.
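To make the write side concrete, here is a sketch of handling a partial write with a Buffer and a SelectionKey (names and structure are illustrative, not a complete server):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.SocketChannel;

class WriteExample {
    // Drain as much of the pending buffer as the socket will take right now.
    // If bytes are left over, ask the Selector to tell us when the socket is
    // writable again; otherwise stop watching for writability.
    static void writeSome(SocketChannel channel, ByteBuffer pending, SelectionKey key)
            throws IOException {
        channel.write(pending);             // may write only part of the buffer
        if (pending.hasRemaining()) {
            key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
        } else {
            key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE);
        }
    }
}
```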
Even though you are using SocketChannels, it's still necessary to employ a thread pool to process the channels.
Think about the scenario where you use only one thread, responsible both for polling select() and for processing the SocketChannels selected from the Selector: if one channel takes 1 second to process and there are 10 channels in the queue, you have to wait 10 seconds before the next poll, which is intolerable. So there should be a thread pool for processing the channels.
In this sense, I don't see a tremendous difference from the thread-per-client blocking-sockets pattern. The major difference is that in the NIO pattern the task is smaller; it's more like thread-per-task, and tasks could be read, write, business processing, etc. For more detail, you can take a look at Netty's implementation of NioServerSocketChannelFactory, which uses one boss thread to accept connections and dispatches tasks to a pool of worker threads for processing.
If you really fancy a single thread, the bottom line is that you should at least have pooled I/O threads, because I/O operations are often orders of magnitude slower than instruction-processing cycles, and you would not want your one precious thread to be blocked by I/O. This is exactly what NodeJS does: one thread accepts connections, and all I/O is asynchronous and processed in parallel by a back-end pool of I/O threads.
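A minimal sketch of that boss/worker split might look like the following (the pool size and the per-read buffer are illustrative; a real server would accumulate bytes per connection until a full request is framed):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.SocketChannel;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ReadDispatcher {
    // The selector ("boss") thread only detects readiness and does the quick read;
    // slow per-request work is handed to the worker pool so select() is never starved.
    private final ExecutorService workers = Executors.newFixedThreadPool(8);

    void onReadable(SelectionKey key) throws IOException {
        SocketChannel channel = (SocketChannel) key.channel();
        ByteBuffer buf = ByteBuffer.allocate(4096);
        int n = channel.read(buf);                   // fast, non-blocking read
        if (n > 0) {
            buf.flip();
            workers.submit(() -> process(buf));      // business logic runs on a worker
        }
    }

    private void process(ByteBuffer request) {
        // parse the request and do the slow work here
    }
}
```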
Is the old-style thread-per-client dead?
I don't think so. NIO programming is complex, and multithreading is not inherently evil. Keep in mind that modern operating systems and CPUs get better and better at multitasking, so the overhead of multithreading becomes smaller over time.
I have a web service that writes files to disk and other stuff to a database. The entire operation takes 1-2 seconds for each write.
The service can, though it is unlikely, be called from several clients at the same time. Let's assume that 20 clients call the web service at the same time; the write operations must be synchronized. In that case, some clients can get a timeout exception because they have to wait too many seconds.
Are there any good practices to solve this kind of situation? As it is now, the methods are synchronized (and that can cause the starvation/timeouts).
Should I let all threads get into the write method by removing the synchronized keyword, and put their tasks into a task queue to avoid a timeout? Is that the correct way to get around this?
Removing the synchronized and putting the work into a task queue by itself will not help you (because that's effectively what the synchronized is doing for you). However, if you respond to the web request as soon as you put the task on the queue, then you will reduce your response time, but at the cost of some reliability: the user will get a confirmation that the work is done when it has not really been done yet (the system could crash before the work completes).
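For illustration, a minimal sketch of that queue approach (writeToDiskAndDb is a hypothetical stand-in for the existing slow operation):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class WriteService {
    // A single worker drains the queue, so the writes are still serialized,
    // but callers no longer wait for a write to complete before responding.
    private final ExecutorService queue = Executors.newSingleThreadExecutor();

    // Returns as soon as the task is queued: the caller gets an acknowledgement
    // that the work was accepted, not that it has finished.
    void submitWrite(byte[] payload) {
        queue.submit(() -> writeToDiskAndDb(payload));
    }

    private void writeToDiskAndDb(byte[] payload) {
        // the existing 1-2 second disk and database write goes here
    }
}
```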
Francis Upton's practice is indeed an accepted practice.
Another one is more fine-grained synchronization. Instead of synchronizing all the read/write methods of a class, you can synchronize access to just the exact invariants that need protecting.
Better yet, get rid of synchronization altogether. This is possible using the java.util.concurrent package, which introduces collections that use non-blocking algorithms (implemented in Java using compare-and-swap atomic instructions). These collections, such as ConcurrentHashMap, enable much better throughput when scaling.
You can read more about it in this article.
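As a small example of those lock-free collections (the map and its use here are illustrative, not taken from the question):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class WriteStats {
    // No synchronized blocks needed: ConcurrentHashMap resolves contention
    // internally with CAS-based, non-blocking techniques.
    private final Map<String, Long> bytesWrittenPerClient = new ConcurrentHashMap<>();

    void record(String clientId, long bytes) {
        bytesWrittenPerClient.merge(clientId, bytes, Long::sum);   // atomic update
    }
}
```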
In this type of implementation (a slow service under increasing load) you want to make as much as possible asynchronous, including the timeout processing (if server-based) and the required I/O. To preserve the server's responsiveness to new requests, don't hold up your client response threads waiting for either of these time-consuming operations; instead, fire off the required operations (maybe to a dynamic thread pool) and let callbacks process the results, whether timeout, completed I/O, or errors.
Send the appropriate response depending on what happens first, but be prepared to roll back I/O if you send an error/timeout message and then a completed I/O arrives (due to a race condition between I/O and timer). This implies transactional semantics are required in the server.
This is an area that gets increasingly complex as your load grows, but good design early on should allow you to scale as load grows. Ideally, the client-servicing threads should not block at all.
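One possible sketch of that fire-and-callback style uses CompletableFuture (this assumes Java 9+ for orTimeout(); the pool size and timeout are illustrative):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class AsyncHandler {
    private final ExecutorService ioPool = Executors.newFixedThreadPool(16);

    // Fire off the slow I/O, attach a timeout, and react in a callback instead
    // of blocking the thread that is servicing the client.
    void handle(byte[] request) {
        CompletableFuture
                .supplyAsync(() -> doSlowIo(request), ioPool)
                .orTimeout(5, TimeUnit.SECONDS)
                .whenComplete((result, error) -> {
                    if (error != null) {
                        // timeout or failure: send an error response, roll back if needed
                    } else {
                        // success: send the completed response
                    }
                });
    }

    private String doSlowIo(byte[] request) {
        return "ok";   // placeholder for the real disk/database work
    }
}
```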
I'm reading about Channels in the JDK 7 docs (here), and stumbled upon this:
Multiplexed, non-blocking I/O, which is much more scalable than thread-oriented, blocking I/O, [...]
Is there a simple explanation as to why this is so?
Because a thread stack is usually much larger than the data structure needed to support an async I/O connection. Also, scheduling thousands of threads is inefficient.
"Blocking" means that threads have to wait as long as necessary for a resource to become available...which means, by definition, threads will be sitting around waiting for resources. Non-blocking avoids this sort of thing.
Generally, non-blocking solutions are trickier, but they avoid resource contention, which makes it much easier to scale up. (That said, the point of Channel is to make this less tricky.)
I am troubled with the following concept:
Most books/docs describe how robust servers are multithreaded, and that the most common approach is to start a new thread to serve each new client, i.e. a thread is dedicated to each new connection. But how is this actually implemented in big systems? If we have a server that accepts requests from 100,000 clients, has it started 100,000 threads? Is this realistic? Aren't there limits on how many threads can run in a server? Additionally, doesn't the overhead of context switching and synchronization degrade performance? Is it implemented as a mix of queues and threads? In that case, is the number of queues fixed? Can anybody enlighten me on this, and perhaps give me a good reference that describes these approaches?
Thanks!
The common method is to use thread pools. A thread pool is a collection of already created threads. When a new request gets to the server it is assigned a spare thread from the pool. When the request is handled, the thread is returned to the pool.
The number of threads in a pool is configured depending on the characteristics of the application. For example, if you have an application that is CPU bound you will not want too many threads since context switches will decrease performance. On the other hand, if you have a DB or IO bound application you want more threads since much time is spent waiting. Hence, more threads will utilize the CPU better.
Google "thread pools" and you will for sure find much to read about the concept.
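A minimal sketch of such a thread-pool-backed server (the pool size and port are arbitrary choices, and the request handling is omitted):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PooledServer {
    public static void main(String[] args) throws IOException {
        ExecutorService pool = Executors.newFixedThreadPool(50);   // pre-created threads
        try (ServerSocket server = new ServerSocket(8080)) {
            while (true) {
                Socket client = server.accept();       // blocking accept
                pool.submit(() -> handle(client));     // a spare pooled thread serves it
            }
        }
    }

    private static void handle(Socket client) {
        // read the request, write the response, then close the socket
        try {
            client.close();
        } catch (IOException ignored) {
        }
    }
}
```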
Also read up on the SEDA pattern (link, link).
In addition to the answers above, I should note that really high-performance servers with many incoming connections try not to spawn a thread per connection, but instead use IO completion ports, select(), and other asynchronous techniques for working with multiple sockets in one thread. And of course, special attention must be paid to ensure that a problem with one request or one socket won't block other sockets in the same thread.
Also thread management consumes CPU time, so threads should not be spawned for each connection or each client request.
In most systems a thread pool is used. This is a pool of available threads that wait for incoming requests. The number of threads can grow to a configured maximum number, depending on the number of simultaneous requests that come in and the characteristics of the application.
If a request arrives, an unoccupied thread is requested from the thread pool. This thread is then dedicated to handling the request until the request finishes. When that happens, the thread is returned to the thread pool to handle another request.
Since there is only a limited number of threads, in most server systems one should attempt to make the lifetime of requests as short as possible. The less time a request needs to execute, the sooner a thread can be reused for a new request.
If requests come in while all threads are occupied, most servers implement a queueing mechanism for requests. Of course the size of the queue is also limited, so when more requests arrive than can be queued, new requests will be denied.
One other reason for having a thread pool instead of starting threads for each request is that starting a new thread is an expensive operation. It's better to have a number of threads started beforehand and to reuse them than to start new threads all the time.
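For illustration, here is a sketch of a bounded pool with a request queue and a rejection policy (all the sizes are illustrative):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedRequestPool {
    // When all threads are busy and the queue is full, AbortPolicy rejects the
    // task with a RejectedExecutionException, mirroring a server that denies
    // new requests under overload.
    private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
            10, 50,                                  // core and maximum thread counts
            60, TimeUnit.SECONDS,                    // idle threads above the core die off
            new ArrayBlockingQueue<>(100),           // bounded request queue
            new ThreadPoolExecutor.AbortPolicy());   // deny requests when saturated

    void submitRequest(Runnable request) {
        pool.execute(request);
    }
}
```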
To get network servers to handle lots of concurrent connections there are several approaches (mostly divided up in "one thread per connection" and "several connections per thread" categories), take a look at the C10K page, which is a great resource on this topic, discussing and comparing a lot of approaches and linking to further resources on them.
Creating 10k threads is not likely to be efficient in most cases, but can be done and would work.
If you needed to serve 10k clients at once, doing so on a single machine would be unlikely but possible.
Depending on the client side implementation, it may be that the 10,000 clients do not need to maintain an open TCP connection - depending on the purpose, the protocol design can greatly improve the efficiency of implementation.
I think the appropriate solution for high scale systems is probably extremely domain-specific, and if you wanted a suggestion you'd have to explain more about your problem domain.