Socket timeout not respected with multiple threads in Java? - java

I have a java program that spawns out 4 threads. In each thread, I have multiple socket timeouts. However, it seems that these timeouts are not respected i.e. the readLine() function might block for a longer period of time.
I want the following behavior: If I set a socket timeout to 300 ms, then I want the readLine() function to return within 300 ms from when the readLine() (i.e. the underlying select call) was invoked, no matter what. I understand that the OS scheduler would be putting the threads to sleep while doing processor sharing, but is there any way in Java to force the threads to always be woken up to ensure this behavior? Or is this just not the right way to think when doing multi-threaded programming?
Ideally, since I am spawning out 4 threads and running on a 6-core machine, each thread should be able to get its own CPU and run in parallel, and respect the select timeout ... but that's perhaps too much to expect ...
PS: I actually do use Thread.interrupt() to ensure that each of my threads exits within a certain time (I check the time elapsed in the master thread, and interrupt the child threads if it's been too long). In each of my threads, I connect to a (different) server, make a request, and wait for a response. I do not know how long the response will be, so I keep calling the readLine() method until it times out with a SocketTimeoutException. I enforce a timeout of 300 ms, since I expect the server to start responding within this time. The reason I want to enforce this timeout is that the server behaves in a broadcast fashion, and sends responses to a request by a single client to all clients. So if I don't have a timeout, I will keep on getting data in response to requests by some other clients.

If I really understood your problem, you can always try to invoke Thread.interrupt() in the thread that is performing the readLine() operation. As you didn't provide any code, I leave this link for you to read. It's general, but it provides enough information about interrupting threads.
These two links may also be of use to you: How do you kill a thread in Java? and How to abort a thread in a fast and clean way in java?.
Regarding your question about the OS scheduler, you should be aware that in general purpose OSes you do not have total control on the way the OS schedules tasks. For instance, in Linux the interrupts are the highest priority tasks, then there are scheduling policies that enable you to put "some" determinism on how the tasks are scheduled. In Java, you can use the setPriority() method to change the priority of the thread, but in fact it is the same as using the nice command, and still you don't get any guarantees that this thread will be scheduled ahead of other OS threads.
I hope this helps.

You are making some wrong assumptions here:
The timeout will not be exactly 300 ms. It is a lower bound: the call times out after at least 300 ms.
The OS scheduler does not do anything within the Java threads, other than scheduling the Java OS processes.
Having 6 cores does not mean that each of your threads will run on a separate core; standard Java gives you no way to bind a thread to a core.
Finally, you assume that the JVM is running only your 4 threads, but in fact there are more, for example the garbage collector thread(s).
To answer your question: "is there any way in Java to force the threads to always be woken up to ensure this behavior?"
Yes, depending on how your code is written. If the thread is in Thread.sleep(), you can call thread.interrupt() (as you already do around readLine()) and handle the InterruptedException; if it is in object.wait(), you can use object.notify() or object.notifyAll().
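The point that the socket timeout is a lower bound can be checked with a small experiment. The sketch below is only an illustration, not the asker's code: the class name ReadTimeoutDemo and the silent local server are assumptions made up for the demo. It connects to a server that never sends anything and measures how long readLine() blocks before SocketTimeoutException is thrown.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original question.
class ReadTimeoutDemo {

    /**
     * Connects to a local server that never sends anything, calls readLine()
     * with the given SO_TIMEOUT, and returns how long the call blocked (ms).
     */
    static long measureBlockedMillis(int timeoutMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, silentServer.getLocalPort())) {

            client.setSoTimeout(timeoutMillis); // applies to each blocking read
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8));

            long start = System.nanoTime();
            try {
                in.readLine();
                throw new IllegalStateException("expected a timeout, but the read returned");
            } catch (SocketTimeoutException expected) {
                return (System.nanoTime() - start) / 1_000_000;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("readLine() blocked for about "
                + measureBlockedMillis(300) + " ms");
    }
}
```

On a busy machine the measured value can be noticeably larger than 300 ms, because the thread still has to be scheduled again after the timeout expires. If a hard overall limit matters, one option is to track a deadline yourself and shrink the value passed to setSoTimeout() before each read.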

Related

Couldn't Spring Webflux or Non blocking pattern be bad for scaling

I get that with threads being nonblocking, we don't need to have Thread sprawl depending on N concurrent requests, but rather we put our tasks in a single event loop in our reactive web programming pattern.
Yes, that can help, but since the event loop is a queue, what if the first task to be processed blocks forever? Then the event loop will never progress, so no responses are sent and nothing happens other than queueing more tasks. Yes, timeouts are probably possible, but I can't wrap my head around how the event loop can be a good solution.
Say you have 3 tasks that each wait 3 seconds for IO before executing, and they are all submitted to the event queue. Then they will still take 9 seconds in total to be processed and to execute once their IO resolves. With threads that block, this would have resolved in 3 seconds since the threads run concurrently.
Where I can see a benefit is if the event loop is not really a queue and upon signal that a task is ready to be processed, it dispatches that task to be processed. In that case though, this would mean that order of task execution is not maintained and also each task has to still be running a thread in order to be able to tell when IO is resolved.
Maybe I am not understanding the event loop and thread handling correctly. Can someone correct me please because it seems like this Reactor pattern seems to make things possibly worse.
Lastly, upon X requests in Spring Reactor, does only 1 thread get created to run handlers instead of the traditional X threads? In that case, if someone accidentally wrote blocking code, doesn't that mean each subsequent request gets queued?
It is not a good idea to use the event loop for long running tasks. This is considered an anti-pattern. Usually it is merely used for quickly picking up imminent events, but not actually doing the work associated with these events if the work would block the event loop noticeably. You would want to use a separate thread pool for executing long running tasks. So the event loop would usually only initiate work using asynchronous and hence non-blocking structures (or actually doing the work only if it can be done very quickly) and pass the heavier and possibly blocking tasks to a separate thread pool (for CPU intensive computations) or to the operating system (such as data buffers to be sent over the network).
Also, don't be fooled by the fact that only one thread is dealing with the events, it is very fast and is usually enough for even demanding applications. Platforms like NodeJS or frameworks like Netty (used in Akka, Play framework, Apache Cassandra, etc.) are using an event loop at their heart with great success. One should just be aware of the fact, that performing blocking operations inside the event loop is generally a bad idea.
Please have a look at some of these posts for more information:
The reactor pattern and non blocking IO
Unix Network Programming
Kotlin Webflux
Slightly off topic but still a very prominent example: Don't Block the Event Loop (NodeJS)
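The division of labour described in this answer (a single event-loop thread that only dispatches, plus a separate pool for blocking work) can be sketched in plain Java. This is a simplified illustration and not how Reactor or Netty are implemented internally; the names EventLoopOffloadDemo, blockingCall and runTasks are invented for the example, and Thread.sleep() stands in for blocking I/O.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class; a sketch of "don't block the event loop".
class EventLoopOffloadDemo {

    /** Stand-in for a blocking I/O call: it just sleeps. */
    static String blockingCall(int id, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result-" + id;
    }

    /** Runs `tasks` blocking calls and returns the total wall-clock time in ms. */
    static long runTasks(int tasks, long millisPerTask) {
        // One thread handles completion events; it never blocks on I/O itself.
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        // A separate pool absorbs the blocking work.
        ExecutorService blockingPool = Executors.newFixedThreadPool(tasks);
        List<String> handled = new ArrayList<>(); // touched only on the event-loop thread
        long start = System.nanoTime();
        try {
            List<CompletableFuture<Void>> pending = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final int id = i;
                pending.add(CompletableFuture
                        .supplyAsync(() -> blockingCall(id, millisPerTask), blockingPool)
                        .thenAcceptAsync(handled::add, eventLoop)); // quick handler
            }
            CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            blockingPool.shutdown();
            eventLoop.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("3 tasks of 300 ms took about " + runTasks(3, 300) + " ms");
    }
}
```

With three tasks of 300 ms each, the total is close to 300 ms rather than 900 ms, because the waits overlap in the blocking pool while the event-loop thread only runs the short handlers. With truly non-blocking I/O the extra pool is not needed for the waiting at all, since the operating system notifies the loop when data is ready.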

Is it possible to control the amount of time that each thread executes in Java?

I want to control the amount of time that each thread uses.
One thread does some processing and another processes data in the database, but the insertion is slower than processing because of the amount of generated data. I want to give more processor time to insert that data.
Is it possible to do this with threads? At the moment, I'm putting a sleep in the thread doing the processing, but the time of insertion changes according to the machine. Is there another way I can do this? Does it involve using thread synchronization inside my program?
You can increase the priority of a thread using Thread.setPriority(...) but this is not ideal.
Perhaps you can use some form of blocking queue from the java.util.concurrent package to make one Thread wait while another Thread is doing something. For example, a SynchronousQueue can be used to send a message from one Thread to another Thread that it can now do something.
Another approach is to use Runnables instead of Threads, and submit the Runnables to an Executor, such as ThreadPoolExecutor. This executor will have the role of making sure Runnables are using a fair amount of time.
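As a rough illustration of the blocking-queue idea, the sketch below uses a bounded ArrayBlockingQueue rather than the SynchronousQueue mentioned above, so the processing thread can run a few items ahead and is then paused automatically until the inserter catches up. The class name, the POISON end marker, and the 2 ms sleep standing in for a database insert are all assumptions made for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class; a sketch of producer/consumer back-pressure.
class BoundedHandoffDemo {

    private static final int POISON = -1; // hypothetical end-of-stream marker

    /** Produces `items` values, "inserts" them slowly, and returns how many were inserted. */
    static int run(int items, int capacity) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        int[] inserted = {0};

        Thread inserter = new Thread(() -> {
            try {
                while (true) {
                    int value = queue.take();   // blocks while the queue is empty
                    if (value == POISON) {
                        return;
                    }
                    Thread.sleep(2);            // stand-in for a slow database insert
                    inserted[0]++;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "inserter");
        inserter.start();

        try {
            for (int i = 0; i < items; i++) {
                queue.put(i);                   // blocks while the queue is full
            }
            queue.put(POISON);
            inserter.join();                    // join() makes inserted[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return inserted[0];
    }

    public static void main(String[] args) {
        System.out.println("inserted " + run(50, 5) + " items");
    }
}
```

The queue capacity replaces the hand-tuned sleep: the producer slows down only when it is actually ahead of the inserter, so the behaviour adapts to the machine it runs on.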
The first thing to mention is that thread priority doesn't per se mean "share of the CPU". There seems to be a lot of confusion about what thread priority actually means, partly because it actually means different things under different OS's. If you're working in Linux, it actually does mean something close to relative share of CPU. But under Windows, it definitely doesn't. So in case it's of any help, you may firstly want to look at some information I compiled a little while ago about thread priorities in Java, which explains what Thread Priorities Actually Mean on different systems.
The general answer to your question is that if you want a thread to take a particular share of CPU, it's better to implicitly do that programmatically: periodically, for each "chunk" of processing, measure how much time elapsed (or how much CPU was used-- they're not strictly speaking the same thing), then sleep an appropriate amount of time so that the processing/sleep ratio comes to roughly the % of processing time you intended.
However, I'm not sure that will actually help your task here.
As I understand, basically you have an insertion task which is the rate determining step. Under average circumstances, it's unlikely that the system is "deliberately dedicating less CPU than it can or needs to" to the thread running that insertion.
So there's probably more mileage in looking at that insertion task and seeing if programmatically you can change how that insertion task functions. For example: can you insert in larger batches? if the insertion process really is CPU bound for some reason (which I am suspicious of), can you multi-thread it? why does your application actually care about waiting for the insertion to finish, and can you change that dependency?
If the insertion is to a standard DB system, I wonder if that insertion is terribly CPU bound anyway?
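The measure-then-sleep approach described in this answer comes down to a small formula: if a chunk of work took b milliseconds and the thread should be busy for a fraction s of the time, it should then sleep for b * (1 - s) / s milliseconds. A minimal sketch follows; the class and method names are hypothetical, and wall-clock time is used as the measure (the answer notes that CPU time is not strictly the same thing).

```java
// Hypothetical helper; a sketch of throttling a thread to a share of wall-clock time.
class DutyCycleThrottle {

    /** Sleep needed after `busyMillis` of work so that work is `share` of the total time. */
    static long sleepMillisFor(long busyMillis, double share) {
        if (share <= 0.0 || share > 1.0) {
            throw new IllegalArgumentException("share must be in (0, 1]");
        }
        return Math.round(busyMillis * (1.0 - share) / share);
    }

    /** Runs each chunk, then sleeps so the thread is busy for roughly `share` of the time. */
    static void runThrottled(Iterable<Runnable> chunks, double share) throws InterruptedException {
        for (Runnable chunk : chunks) {
            long start = System.nanoTime();
            chunk.run();
            long busyMillis = (System.nanoTime() - start) / 1_000_000;
            Thread.sleep(sleepMillisFor(busyMillis, share));
        }
    }
}
```

For example, a chunk that took 30 ms with a target share of 0.25 is followed by a 90 ms sleep, giving 30 ms of work in every 120 ms.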
One way would be to set the priority of the processing thread to be lower than the other. But beware: this is not recommended, as it won't keep your code platform independent. (Different thread priorities behave differently on different platforms.)
Another way would be to use a service where the database thread keeps sending messages about its current status (probably some flag such as "aboutToOver").
Or use synchronization, say a binary semaphore. When the database thread is working, the other thread would be blocked, and hence the database thread would be using all the resources. But again, the processing thread would be blocked in the meantime. Actually, this will be the best solution, as the processing thread can perform, say, 3-4 tasks and will then be blocked by the semaphore until later, when it can wake up again and do more tasks.

Is it dangerous to start threads in Java and not to wait for them (with .join())?

When writing a multithreaded internet server in Java, the main thread starts new ones to serve incoming requests in parallel.
Is there any problem if the main thread does not wait (with .join()) for them? (It is obviously absurd to create a new thread and then wait for it.)
I know that, in a practical situation, you should (or "you must"?) implement a pool of threads to "re-use" them for new requests when they become idle.
But for small applications, should we use a pool of threads?
You don't need to wait for threads.
They can either complete running on their own (if they've been spawned to perform one particular task), or run indefinitely (e.g. in a server-type environment).
They should handle interrupts and respond to shutdown requests, however. See this article on how to do this correctly.
If you need a set of threads I would use a pool and executor methods since they'll look after thread resource management for you. If you're writing a multi-threaded network server then I would investigate using (say) a servlet container or a framework such as Mina.
The only problem in your approach is that it does not scale well beyond a certain request rate. If the requests are coming in faster than your server is able to handle them, the number of threads will rise continuously. As each thread adds some overhead and uses CPU time, the time for handling each request will get longer, so the problem will get worse (because the number of threads rises even faster). Eventually no request will be able to get handled anymore because all of the CPU time is wasted with overhead. Probably your application will crash.
The alternative is to use a ThreadPool with a fixed upper bound of threads (which depends on the power of the hardware). If there are more requests than the threads are able to handle, some requests will have to wait too long in the request queue, and will fail due to a timeout. But the application will still be able to handle the rest of the incoming requests.
Fortunately the Java API already provides a nice and flexible ThreadPool implementation, see ThreadPoolExecutor. Using this is probably even easier than implementing everything with your original approach, so no reason not to use it.
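To make the "fixed upper bound" concrete, here is a small sketch of a ThreadPoolExecutor with a bounded queue. The sizes (2 threads, 3 queue slots, 10 requests) and the class name are arbitrary choices for the demonstration, and the tasks simply block on a latch so that none of them finishes while requests are being submitted.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; sizes are illustrative, not a recommendation.
class BoundedPoolDemo {

    /** Submits `requests` tasks to a bounded pool and returns how many were rejected. */
    static int countRejected(int threads, int queueSize, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads,                      // fixed number of worker threads
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize),   // bounded request queue
                new ThreadPoolExecutor.AbortPolicy()); // reject when saturated
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await();           // simulate a request still in progress
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;                        // server is saturated: fail fast
                }
            }
        } finally {
            release.countDown();                       // let the accepted tasks finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println(countRejected(2, 3, 10) + " of 10 requests were rejected");
    }
}
```

With 2 threads and 3 queue slots, 5 requests are accepted and the other 5 are rejected immediately, so the number of threads never grows. A real server would typically answer a rejected request with an error or use a different rejection policy such as CallerRunsPolicy.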
Thread.join() lets you wait for the Thread to end, which is mostly contrary to what you want when starting a new Thread. After all, you start the new thread to do stuff in parallel to the original Thread.
Only if you really need to wait for the spawned thread to finish, you should join() it.
You should wait for your threads if you need their results or need to do some cleanup which is only possible after all of them are dead, otherwise not.
For the Thread-Pool: I would use it whenever you have some non-fixed number of tasks to run, i.e. if the number depends on the input.
I would like to collect the main ideas of this interesting (for me) question.
I can't totally agree with "you don't need to wait for threads". Only in the sense that if you don't join a thread (and don't have a pointer to it), its resources are freed once the thread is done (right? I'm not sure).
The use of a thread pool is only necessary to avoid the overhead of thread creation, because ...
You can limit the number of threads running in parallel by keeping count, with shared variables (and without a thread pool), of how many of them were started but not yet finished.

Java daemon - handling shutdown requests

I'm currently working on a daemon that will be doing A LOT of different tasks. It's multi threaded and is being built to handle almost any kind of internal-error without crashing. Well I'm getting to the point of handling a shutdown request and I'm not sure how I should go about doing it.
I have a shutdown hook setup, and when it's called it sets a variable telling the main daemon loop to stop running. The problem is, this daemon spawns multiple threads and they can take a long time. For instance, one of these threads could be converting a document. Most of them will be quick (I'm guessing under 10 seconds), but there will be threads that can last as long as 10+ minutes.
What I'm thinking of doing right now is this: when the shutdown hook is called, first send a notification to all threads telling them a shutdown has been requested, then loop for about 5 seconds on ThreadGroup.activeCount() with a 500 ms (or so) sleep (all these threads are in a ThreadGroup). The threads will then have to clean up and shut down immediately, no matter what they're doing.
Does anyone have any suggestions? I'm interested in what a daemon like MySQL, for instance, does when it gets told to stop: it stops instantly. What happens if, say, 10 very slow queries are running? Does it wait, or does it just end them? I mean, servers are really quick, so there really isn't any kind of operation that I shouldn't be able to do in less than a second. You can do A LOT in 1000 ms nowadays.
Thanks
The java.util.concurrent package provides a number of utilities, such as ThreadPoolExecutor (along with various specialized types of other Executor implementations from the Executors class) and ThreadPoolExecutor.awaitTermination(), which you might want to look into - as they provide the same exact functionality you are looking to implement. This way you can concentrate on implementing the actual functionality of your application/tasks instead of worrying about things like thread and task scheduling.
Are your thread jobs amenable to interruption via Thread#interrupt()? Do they mostly call on functions that themselves advertise throwing InterruptedException? If so, then the aforementioned java.util.concurrent.ExecutorService#shutdownNow() is the way to go. It will interrupt any running threads and return the list of jobs that were never started.
Similarly, if you hang on to the Futures produced by ExecutorService#submit(), you can use Future#cancel(boolean) and pass true to request that a running job be interrupted.
Unless you're calling on code out of your control that swallows interrupt signals (say, by catching InterruptedException without calling Thread.currentThread().interrupt()), using the built-in cooperative interruption facility is a better choice than introducing your own flags to approximate what's already there.
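The shutdown sequence these answers point to (stop accepting work, wait for a grace period, then interrupt whatever is still running) might look roughly like the sketch below. The class name, the 200 ms grace period, and the sleeping task standing in for a long document conversion are assumptions made for the example.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; a sketch of cooperative shutdown with an ExecutorService.
class GracefulShutdownDemo {

    /** Returns true if the pool terminated within the grace periods. */
    static boolean shutDown(ExecutorService pool, long graceMillis) {
        pool.shutdown();                                          // stop accepting new tasks
        try {
            if (!pool.awaitTermination(graceMillis, TimeUnit.MILLISECONDS)) {
                List<Runnable> neverStarted = pool.shutdownNow(); // interrupts running tasks
                System.out.println(neverStarted.size() + " queued task(s) were dropped");
                return pool.awaitTermination(graceMillis, TimeUnit.MILLISECONDS);
            }
            return true;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Starts one long, interruptible task, shuts down, and reports whether it saw the interrupt. */
    static boolean demo() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicBoolean sawInterrupt = new AtomicBoolean(false);
        pool.execute(() -> {
            try {
                Thread.sleep(60_000);         // stand-in for a long document conversion
            } catch (InterruptedException e) {
                sawInterrupt.set(true);       // clean up here, then let the task end
            }
        });
        boolean terminated = shutDown(pool, 200);
        return terminated && sawInterrupt.get();
    }

    public static void main(String[] args) {
        System.out.println("clean shutdown: " + demo());
    }
}
```

In a daemon, shutDown() could be called from the shutdown hook. This only works if the tasks respond to interruption, which is the caveat about swallowed InterruptedException raised above.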

Java Thread won't pause on I/O operation

I was under the impression that in Java, a Thread will pause and give other threads a chance to do some work during blocking I/O operations (like Socket.read() or DatagramSocket.receive()). For some reason, in my multi-threaded network application, a call to receive() is causing all my other threads to starve (the thread that called receive() is becoming a big boss and never giving up control, thus blocking forever!)
Why would this happen? I used to have the same exact application, but instead of UDP it was TCP based. Socket.read() always paused the Thread and allowed others to work for a bit if it blocked for too long.
-- extra info --
The TCP version of my custom thread was this code:
http://www.dwgold.com/code/proxy/proxyThread.java.txt
My new code ( UDP version ) is pretty much the same, but modified a bit to use UDP methods and style. I then created two of these threads and call start on both. The first thread always blocks and never lets the other do work in the UDP version.
It seems that you're using an InputStream, and according to the JDK docs for InputStream.read(), the read blocks exactly as you've described -- until the data is received, an end-of-file is reached, or an exception is thrown.
So for me, the question is: why does the TCP version of your code allow the block to get interrupted? That doesn't seem to make any sense. Perhaps your TCP code breaks the reads up into discrete, short-enough bursts that the thread has a chance to jump in between separate calls to read()?
Looking further, I see that the difference is that the TCP version of your code receives data via the InputStream that's provided by Socket, whereas the UDP version of your code receives data directly from the DatagramSocket's own receive() method. So I'd posit that this is just a fundamental difference between the functionality offered by the two (InputStream.read and DatagramSocket.receive). Have you thought about using DatagramSocket.setSoTimeout to set a timeout on the socket's receive block, and then catching the SocketTimeoutException when it's thrown by the call to receive() that times out? You should be able to implement this to achieve what you want.
UPDATE: looking further still, it appears that the DatagramSocket.receive method is synchronized. So the way to explain the behavior you're seeing is that you're starting two threads that are attempting to use the same DatagramSocket instance -- if two threads are attempting to execute receive() on the same DatagramSocket instance, one will block the instance until it's done, and the other will be blocked. (There's a post on the Sun forums that describes this, as well.) Might this be what's happening -- you're reusing the same DatagramSocket for both threads?
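A minimal sketch of the timeout suggestion, assuming each thread owns its own DatagramSocket so that the synchronized receive() of one thread cannot hold up another. The class name and the 1500-byte buffer are arbitrary choices for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;

// Hypothetical demo class; not the asker's proxy code.
class UdpReceiveTimeoutDemo {

    /** Returns true if receive() gave up with SocketTimeoutException because no datagram arrived. */
    static boolean receiveTimesOut(int timeoutMillis) {
        // Port 0 lets the OS pick a free port; this socket is used by one thread only.
        try (DatagramSocket socket = new DatagramSocket(0, InetAddress.getLoopbackAddress())) {
            socket.setSoTimeout(timeoutMillis);    // bound the time spent in receive()
            DatagramPacket packet = new DatagramPacket(new byte[1500], 1500);
            try {
                socket.receive(packet);
                return false;                      // a datagram arrived before the timeout
            } catch (SocketTimeoutException e) {
                return true;                       // nothing arrived; the thread is free again
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + receiveTimesOut(200));
    }
}
```

After the timeout the socket is still usable, so a receive loop can simply check a stop flag and call receive() again.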
It can be because the Java threads are user level threads (opposed to kernel level threads).
One kernel level thread can contain multiple user level threads. However if one user level thread waits for io, the kernel has no way to put only that user level thread on a wait queue, it can only put the whole kernel thread on the wait queue (user level threads are scheduled at the user level, not at the kernel level). As a result, a single user level thread waiting for io can/will block all the other user level threads (from that kernel thread).
There's really no explanation for the behaviour you're describing, that a read on an InputStream from a Socket should cause "all other threads" to pause. Are you by any chance starting several threads, which all are reading from the same socket and what you actually mean is that all these threads are hanging? In that case, I would expect that the data read from the socket is divided arbitrarily among the threads.
