In a server, there is a thread A listening for incoming connections, typically looping forever. When a connection is accepted, thread A creates a task (say, class Callable in Java) and submits it to an Executor.
All this really means is that A has lost its reference to the socket, and that there is now a thread B (created by the Executor) that manages the socket. If B hits any exception, it will close the socket, so there is no risk that the socket, as an operating system resource, will not be reclaimed.
This is all fine if thread B starts. But what if the executor was shut down before B had a chance to get scheduled?
Does anyone think this is an issue? If the reference to the socket is lost due to this, would the garbage collector close the socket?
Yes, it sounds like an issue.
The OS will probably free up the socket eventually (at least for TCP, as far as I can tell), but it can take a relatively long time.
I don't think the garbage collector plays a role in this case. At least not for threads, which, once started, will usually keep running even if there is no reference to them in the code (this is true at least for non-daemon threads). Sockets may behave in a similar manner.
If you cannot guarantee that the connection is going to be processed (by starting the handling Thread instance as soon as it is established), then you should keep a reference to the socket and make sure you close all of them as soon as possible, which probably means right after Executor.shutdown() or a similar method has been called.
Please note that depending on how you ask the Executor to shut down, tasks that have already been submitted but haven't started yet will either still be executed (shutdown()) or be discarded and handed back to you (shutdownNow()). So be sure to make your code behave accordingly.
Also, if you have limited resources (available threads) to process incoming socket connections and don't want them to pile up, consider closing a connection immediately after accepting it when no worker is free, if that is feasible in your project; the client can then retry connecting at a later time. If you still need to consume connections as soon as they come in, consider a non-blocking I/O approach, which tends to scale better (up to a point).
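A rough sketch of that "close it right away when you cannot handle it" idea, assuming a plain ThreadPoolExecutor with a bounded queue whose rejection handler closes the socket; SocketTask and BoundedDispatcher are made-up names:

import java.io.IOException;
import java.net.Socket;
import java.util.concurrent.*;

// Task that owns its socket and always closes it when it runs.
class SocketTask implements Runnable {
    final Socket socket;
    SocketTask(Socket socket) { this.socket = socket; }

    @Override public void run() {
        try {
            // ... actual handling code ...
        } finally {
            try { socket.close(); } catch (IOException ignored) { }
        }
    }
}

class BoundedDispatcher {
    private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
            8, 8, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(100),      // bounded wait queue, so connections cannot pile up forever
            (task, executor) -> {               // rejection handler: queue full or pool shut down
                if (task instanceof SocketTask) {
                    try { ((SocketTask) task).socket.close(); } catch (IOException ignored) { }
                }
            });

    void dispatch(Socket socket) {
        pool.execute(new SocketTask(socket));   // rejected connections are closed; the client retries later
    }
}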
If the reference to the socket is lost due to this, would the garbage collector close the socket?
Probably. But the garbage collector may not run until literally the end of next week: You can't rely on the GC running, pretty much ever, just because 'hey, java has a garbage collector'. It does, and it won't kick in until needed. It may simply never be needed.
Depending on the GC to close resources is a fine way to get your VM killed by the OS for using up too many system resources.
The real question is: What is the causal process that results in shutting down the executor?
If there is some sort of 'cancel all open connections' button, and you implemented that as a one-liner: queue.shutdown(), then, no - that is not a good idea: You'll now be leaning on the GC to clean up those sockets which is bad.
I assume your callables look like:
Socket socket = ....; // obtained from queue
Callable<Void> socketHandler = () -> {
    try {
        // all actual handling code is here.
    } finally {
        socket.close();
    }
    return null;
};
then yeah that is a problem: If the callable is never even started, that finally block won't run. (If you don't have finally you have an even bigger problem - that socket won't get cleaned up if an exception occurs during the handling of it!).
One way out is to have a list of sockets, abstract away the queue itself, and give that abstraction a shutdown method which both shuts down the queue and closes every socket. Guard every step (the queue shutdown as well as each socket.close call) with a try/catch block, so that a single exception in one of these steps won't just stop the shutdown process on the spot.
Note that a bunch of handlers are likely to still be chugging away, so closing the socket 'out from under them' like this will cause exceptions in the handlers. If you don't want that, shut down the queue, then await termination (guarded with try/catch stuff), and then close all the sockets.
Closing an already-closed socket is a no-op, so there is no need to check first and no need to worry about the impact of closing a ton of already-closed sockets.
But do worry about keeping a reference to an infinitely growing list of sockets. Once a socket is completely done with, get rid of it - also from this curated list of 'stuff you need to close if the queue is terminated'.
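A minimal sketch of such an abstraction (ConnectionPool and its methods are made-up names, and the 30-second wait is arbitrary):

import java.io.IOException;
import java.net.Socket;
import java.util.Set;
import java.util.concurrent.*;

class ConnectionPool {
    private final ExecutorService queue = Executors.newFixedThreadPool(16);
    private final Set<Socket> toClose = ConcurrentHashMap.newKeySet();

    void submit(Socket socket) {
        toClose.add(socket);
        queue.submit(() -> {
            try {
                // all actual handling code is here.
            } finally {
                try { socket.close(); } catch (IOException ignored) { }
                toClose.remove(socket);   // completely done with it: drop it from the curated list
            }
            return null;
        });
    }

    void shutdown() {
        try {
            queue.shutdown();                                  // stop accepting new tasks
            if (!queue.awaitTermination(30, TimeUnit.SECONDS)) {
                queue.shutdownNow();                           // give up on stragglers
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();                // guarded: don't stop the shutdown on the spot
        }
        for (Socket s : toClose) {
            try { s.close(); } catch (IOException | RuntimeException e) {
                // closing a closed socket is a no-op; ignore and keep going
            }
        }
    }
}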
Of course, if the only process that leads to early queue termination is that you want to shut down the VM, don't worry about it. The sockets go away with the VM. In fact, there is no need to shut down the queue. If you intend to end the VM, just end it, immediately: System.exit(0) is what you want. There is no such thing as 'but.. I should ask all the things to shut down nicely!'. That IS how you ask. Systems that need resource cleanup on VM shutdown are mostly badly designed (design them so that they don't need cleanup on VM shutdown; OS-level resources such as sockets already work that way), and if you must, register a shutdown hook.
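If you do end up registering a shutdown hook, it can be as small as this (reusing the hypothetical ConnectionPool from the sketch above):

ConnectionPool pool = new ConnectionPool();
// Runs on normal exit and on SIGTERM, but not on kill -9 or a hard JVM crash.
Runtime.getRuntime().addShutdownHook(new Thread(pool::shutdown));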
Related
I have a general question. If a CPU has one core and I run multiple threads on it, each thread being used for a GET request, how will the network connections survive the thread switching?
What happens if one thread starts receiving a response from the server and suddenly a thread switch happens? Considering that HTTP uses TCP communication, how would things end up?
Thanks.
TL;DR: The connection will survive unless the thread gets control back too late, after the server has already terminated it by timeout.
To understand why it works this way, consider how data gets from a wire (or air) to an application.
The network interface collects data from the medium (the wire) into an internal hardware buffer, and when some chunk of data is complete it raises a so-called hardware interrupt (which is just a low-level event). The OS handles the interrupt using the network interface's driver, and that chunk of data ends up in a buffer in the computer's main memory. That buffer is controlled by the OS. When the application reads data from the connection, it actually reads data from that buffer.
When a thread switch happens, the content of main memory is never lost. So when the thread gets control back, it just proceeds with its task from the point where it was suspended.
If the thread gets back to work only after the server has already closed the connection by timeout, an I/O error (in Java, typically an IOException) is thrown by the method that tries to read data from the connection.
This explanation is oversimplified and may even be wrong in details, but it should give an overall impression of how things work.
I would like to use Java Netty to create a TCP server for a large number of persistent connections from clients. In other words, imagine that there are 1000 client devices out there, and all of them create and maintain a persistent connection to the TCP server. There will be a reasonable amount of traffic (mostly lines of text) going back and forth across each of these persistent connections. How can I determine the best number of threads to use in the boss and worker groups for NioEventLoopGroup?
My understanding is that when the connection is created, Netty creates a SimpleChannelInboundHandler<String> object to handle the connection. When the connection is created then the handler channelActive method is called, and every time it gets a new message from the client, the method messageReceived gets called (or channelRead0 method in Netty 4.0.24).
Is my understanding correct?
What happens if I have long running code to run in messageReceived? Do I need to launch this code in yet another thread (java.util.Thread)?
What happens if my messageReceived method blocks on something or takes a long time to complete? Does that bring Netty to a grinding halt?
Basically I need to write a TCP socket server that can serve a large number of persistent connections as quickly as possible.
Is there any guidance available on number of threads for NioEventLoopGroup and on how to use any threads inside the handler?
Any help would be greatly appreciated.
How can I determine the best number of threads to use in the boss and worker groups for NioEventLoopGroup?
About boss threads: since you say you need persistent connections, there is no sense in using a lot of boss threads, because boss threads are only responsible for accepting new connections. So I would use only one boss thread.
The number of worker threads should depend on the number of your processor cores.
Don't forget to add -XmsYYYYM and -XmxYYYYM as JVM options, because without them you can run into the situation where your JVM is not using all cores.
What happens if I have long running code to run in messageReceived - do I need to launch this code in yet another thread (java.util.Thread)?
Do you really need to? You should probably think about structuring your logic another way; if that's not possible, then consider OIO with a new thread for each connection.
What happens if my messageReceived method blocks on something or takes a long time to complete?
You should avoid using thread blocking actions in your handlers.
Does that bring Netty to a grinding halt?
Yep, it does.
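For what it's worth, here is a minimal sketch of that setup under the assumptions above: one boss thread, the default worker count, and blocking work pushed off the event loop onto a separate DefaultEventExecutorGroup. The class names, the handler, the pool size and the port are illustrative, not prescriptive.

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.*;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.codec.LineBasedFrameDecoder;
import io.netty.handler.codec.string.StringDecoder;
import io.netty.handler.codec.string.StringEncoder;
import io.netty.util.concurrent.DefaultEventExecutorGroup;
import io.netty.util.concurrent.EventExecutorGroup;

public class LineServer {
    public static void main(String[] args) throws Exception {
        EventLoopGroup boss = new NioEventLoopGroup(1);        // one thread is enough just to accept
        EventLoopGroup workers = new NioEventLoopGroup();      // defaults to 2 * available cores
        EventExecutorGroup blockingPool = new DefaultEventExecutorGroup(16); // for slow handlers

        try {
            ServerBootstrap bootstrap = new ServerBootstrap()
                    .group(boss, workers)
                    .channel(NioServerSocketChannel.class)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        protected void initChannel(SocketChannel ch) {
                            ch.pipeline().addLast(new LineBasedFrameDecoder(8192),
                                                  new StringDecoder(),
                                                  new StringEncoder());
                            // Bind this handler to the blocking pool so long-running
                            // channelRead0 calls do not stall the worker event loop.
                            ch.pipeline().addLast(blockingPool, new LineHandler());
                        }
                    });
            bootstrap.bind(9000).sync().channel().closeFuture().sync();
        } finally {
            boss.shutdownGracefully();
            workers.shutdownGracefully();
            blockingPool.shutdownGracefully();
        }
    }
}

class LineHandler extends SimpleChannelInboundHandler<String> {
    @Override
    protected void channelRead0(ChannelHandlerContext ctx, String msg) {
        // Work here may block; it runs on blockingPool, not on the NIO event loop.
        ctx.writeAndFlush("echo: " + msg + "\n");
    }
}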
I'm reading TCP/IP Sockets in Java; about ServerSocket, it says:
When we call accept() on that ServerSocket instance, if a new connection is pending, accept() returns immediately; otherwise it blocks until either a connection comes in or the timer expires, whichever comes first. This allows a single thread to handle multiple connections. Unfortunately, the approach requires that we constantly poll all sources of I/O, and that kind of “busy waiting” approach again introduces a lot of overhead from cycling through connections just to find out that they have nothing to do.
As I understand it, shouldn't the thread be "notified" when a connection comes in, and thus not be "busy waiting"? Did I misunderstand something?
-----------------EDIT----------------------
The whole paragraph is as below:
Because of these complications, some programmers prefer to stick with a single-threaded approach, in which the server has only one thread, which deals with all clients—not sequentially, but all at once. Such a server cannot afford to block on an I/O operation with any one client, and must use nonblocking I/O exclusively. Recall that with nonblocking I/O, we specify the maximum amount of time that a call to an I/O method may block (including zero). We saw an example of this in Chapter 4, where we set a timeout on the accept operation (via the setSoTimeout() method of ServerSocket). When we call accept() on that ServerSocket instance, if a new connection is pending, accept() returns immediately; otherwise it blocks until either a connection comes in or the timer expires, whichever comes first. This allows a single thread to handle multiple connections. Unfortunately, the approach requires that we constantly poll all sources of I/O, and that kind of “busy waiting” approach again introduces a lot of overhead from cycling through connections just to find out that they have nothing to do.
It's mostly nonsense, even with the entire quotation. Either you are using blocking I/O, in which case you need a thread per connection and another per accept() loop; or you are using non-blocking I/O, in which case you have select(); or, from Java 7, you are using asynchronous I/O, in which case it is all callbacks into completion handlers. In none of these cases do you need to poll or busy-wait.
I think he must be referring to using blocking mode with very short timeouts, but it's really most unclear.
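For reference, a minimal sketch of the select()-based, non-blocking alternative in Java NIO (the port and buffer size are arbitrary): the single thread blocks in select() until a channel is actually ready, so there is no busy-waiting.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.util.Iterator;

public class SelectorEchoServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(9000));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buf = ByteBuffer.allocate(4096);
        while (true) {
            selector.select();                        // blocks until something is ready; no polling
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buf.clear();
                    int n = client.read(buf);
                    if (n == -1) {                    // peer closed the connection
                        key.cancel();
                        client.close();
                    } else {
                        buf.flip();
                        client.write(buf);            // naive echo; may write only partially
                    }
                }
            }
        }
    }
}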
I have a thread pool (executor) which I would like to monitor for excessive resource usage (time, since CPU and memory seem to be quite a bit harder to track). I would like to 'kill' threads that are running too long, like killing an OS process. The workers spend most of their time calculating, but significant time is also spent waiting for I/O, mostly database...
I have been reading up on stopping threads in Java and how it is deprecated for resource-cleanup reasons (not properly releasing locks, closing sockets and files, and so on). The recommended way is to periodically check in the worker thread whether it should stop, and then exit. This obviously expects that worker threads be written in a certain way and that they are not blocked waiting on some external I/O. There are also ThreadDeath and InterruptedException, which might be able to do the job, but they may actually be circumvented by improperly/maliciously written worker threads, and I also got the impression (though no testing yet) that InterruptedException might not work properly in some (or even all) cases when the worker thread is waiting for I/O.
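(For reference, the "periodically check whether to stop" pattern described above looks roughly like this; CooperativeWorker and the sleep stand-in are purely illustrative.)

import java.util.concurrent.TimeUnit;

// The worker checks whether it has been asked to stop, instead of being killed from outside.
class CooperativeWorker implements Runnable {
    @Override public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                // ... one bounded chunk of real work goes here ...
                TimeUnit.MILLISECONDS.sleep(10);   // stand-in for blocking work; reacts to interrupt()
            }
        } catch (InterruptedException e) {
            // interrupt() while blocked lands here; fall through to cleanup
        } finally {
            // release locks, close sockets/files, return DB connections, etc.
        }
    }
}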
Another way to mitigate it would be to use multiple OS processes to isolate parts of the system, but it brings some unwanted increases in resource consumption.
That led me to the old story about isolates and/or the MVM from more than five years ago, but nothing seems to have happened on that front; maybe in Java 8 or 9...
So, actually, all this has made me wonder whether some poor man's simulation of processes could be achieved by using threads that each have their own classloader. Could that be used to simulate processes if each thread (or group) were loaded in its own classloader? I am not sure how much of an increase in resource consumption that would bring (as there would not be much code sharing, and the code is not tiny). At least processes' copy-on-write semantics enable code sharing..
Any recommendations/ideas?
EDIT:
I am asking out of general interest and a kind of disappointment that no solution for this exists in the JVM to date (I mean shared application servers are not really possible - application domains, or something like that, in .NET seem to address exactly this kind of problem). I understand that killing a process does not guarantee reverting all system state to some initial condition, but at least all resources like handles, memory and CPU are released. I was thinking of using classloaders since they might help with releasing locks held by the thread, which is one of the reasons Thread.stop is deprecated. In my current situation the only other thing that should be released (that I can think of currently) is the database connection, which could be handled separately/externally (by a watchdog thread) if needed..
Though, really, in my case Thread.stop might actually be workable; I just dislike using deprecated methods..
Also, I am considering this as a safety net for misbehaving processes; ideally they should behave nicely, and they are to quite a high degree under my control.
So, to clarify, I am asking how, for example, Java people on the server side handle runaway threads. I suspect by using many machines in the cluster to offset the problem and restarting misbehaving ones - when the application is stateless, at least..
The difference between a thread and a process is that threads implicitly share memory and resources like sockets and files (making thread-local memory a workaround). Processes implicitly have private memory and resources.
Killing the thread is not the problem. The problem is that a poorly behaving thread, or even a reasonably behaving thread, can leave resources in an inconsistent state. Using a class loader will not help you track this, or solve the problem for you. For processes it's easier to track what resources they are using, as most of their resources are isolated. Even processes can leave locks, temporary files and shared IPC resources in an incorrect state if killed.
The real solution is to write code which behaves properly so it can be managed; working around and trying to handle every possible kind of poorly behaving code is next to impossible. If you have a bad third-party library you have to use, you can try killing it and cleaning up after it, and you can come up with an OK solution, but you can't expect it to be a clean one.
EDIT: Here is a simple program which will deadlock between two processes or machines because it has a bug in it. The way to stop deadlocks is to fix the code.
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class DeadlockDemo {
    public static void main(String... args) throws IOException {
        switch (args.length) {
            case 1: {
                // server
                ServerSocket ss = new ServerSocket(Integer.parseInt(args[0]));
                Socket s = ss.accept();
                // Deadlocks here: the ObjectInputStream constructor blocks until it reads the
                // stream header written by the peer's ObjectOutputStream, but both sides
                // construct their input stream first, so neither ever writes a header.
                ObjectInputStream ois = new ObjectInputStream(s.getInputStream());
                ObjectOutputStream oos = new ObjectOutputStream(s.getOutputStream());
                // never gets here
                break;
            }
            case 2: {
                // client
                Socket s = new Socket(args[0], Integer.parseInt(args[1]));
                // Same bug on the client side: blocks waiting for a header the server never sends.
                ObjectInputStream ois = new ObjectInputStream(s.getInputStream());
                ObjectOutputStream oos = new ObjectOutputStream(s.getOutputStream());
                // never gets here
                break;
            }
            default:
                System.err.println("Must provide either a port as server or hostname port as client");
        }
    }
}
In a servlet-based app I'm currently writing we have separate thread classes for readers and writers. Data is transmitted from a writer to multiple readers by using LinkedBlockingQueue<byte[]>, so a reader safely blocks if there is no new data to get from the writer. The problem is that if remote clients served by these reader threads terminate connection, Tomcat won't throw a broken pipe unless the writer sends in new data and attempts to transmit this new chunk to the remote clients. In other words, the following attack can be performed against our service:
Start a streaming write request and don't write any data to it at all.
Keep creating and dropping read connections. Since the writer doesn't produce any data, reading threads attached to it remain blocked and consume memory and other resources.
Observe the server run out of RAM quickly.
Should I create a single maintenance thread that would monitor sockets belonging to blocked reader threads and send interrupt() to those that appear to have lost connection to their respective clients? Are there any major flaws in the architecture described above? Thank you.
It sounds to me like the vulnerability lies in the fact that your readers wait forever, regardless of the state of the incoming connection (which, of course, you can't know about).
Thus a straightforward way to address this, if appropriate, would be to use the poll method on BlockingQueue rather than take. Calling poll allows you to specify a timeout, after which it will return null if no data has been added to the queue.
In this way the readers won't stay blocked forever, and should relatively quickly fall back into the normal processing loop, allowing their resources to be freed as appropriate.
(This isn't a panacea of course; while the timeout is still running, the readers will consume resources. But ultimately a server with finite resources will have some vulnerability to a DDOS attack - and this reduces its impact to a customisably small window, instead of leaving your server permanently crippled, at least.)
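A minimal sketch of that change (the 30-second window is arbitrary, and the OutputStream stands in for whatever the servlet response exposes to the reader thread):

import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Reader that uses poll(timeout) instead of take(), so it cannot stay blocked
// forever when a client connects and no data ever arrives from the writer.
class StreamingReader implements Runnable {
    private final BlockingQueue<byte[]> queue;
    private final OutputStream out;

    StreamingReader(BlockingQueue<byte[]> queue, OutputStream out) {
        this.queue = queue;
        this.out = out;
    }

    @Override public void run() {
        try {
            while (true) {
                byte[] chunk = queue.poll(30, TimeUnit.SECONDS);  // returns null on timeout
                if (chunk == null) {
                    break;           // nothing arrived in the window: give up and free the thread
                }
                out.write(chunk);    // a dropped client eventually surfaces here as a broken pipe
                out.flush();
            }
        } catch (IOException | InterruptedException e) {
            // broken pipe or interrupt: let the thread finish and release its resources
        }
    }
}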
The approach I have taken in the past is to have blocking connections and only reader threads. When you want to write to multiple connections, you do that in the current thread. If you are concerned about a write blocking forever, you can have a single monitoring thread which closes blocked connections.
You can still have resources tied up in unused sockets, but I would have another thread which finds unused sockets and closes them.
This leaves you with one thread per connection, plus a couple of monitoring threads.
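A rough sketch of such a monitoring thread (the idle threshold and the lastActivity bookkeeping are assumptions, not part of the original design):

import java.io.IOException;
import java.net.Socket;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Watchdog that closes sockets which have shown no activity for a while.
// Callers are expected to call touch(socket) whenever they read or write.
class SocketWatchdog {
    private final Map<Socket, Long> lastActivity = new ConcurrentHashMap<>();
    private final long idleLimitMillis = 60_000;   // arbitrary threshold

    void touch(Socket s) {
        lastActivity.put(s, System.currentTimeMillis());
    }

    void start() {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(() -> {
            long now = System.currentTimeMillis();
            lastActivity.forEach((socket, last) -> {
                if (now - last > idleLimitMillis) {
                    try { socket.close(); } catch (IOException ignored) { }
                    lastActivity.remove(socket);   // closing unblocks any thread stuck on this socket
                }
            });
        }, 10, 10, TimeUnit.SECONDS);
    }
}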