Java Thread won't pause on I/O operation

I was under the impression that in Java, a Thread will pause and give other threads a chance to do some work during blocking I/O operations (like Socket.read() or DatagramSocket.receive()). For some reason, in my multi-threaded network application, a call to receive() is causing all my other threads to starve (the thread that called receive() is becoming a big boss and never giving up control, thus blocking forever!)
Why would this happen? I used to have the same exact application, but instead of UDP it was TCP based. Socket.read() always paused the Thread and allowed others to work for a bit if it blocked for too long.
-- extra info --
The TCP version of my custom thread was this code:
http://www.dwgold.com/code/proxy/proxyThread.java.txt
My new code ( UDP version ) is pretty much the same, but modified a bit to use UDP methods and style. I then created two of these threads and call start on both. The first thread always blocks and never lets the other do work in the UDP version.

It seems that you're using an InputStream, and according to the JDK docs for InputStream.read(), the read blocks exactly as you've described -- until the data is received, an end-of-file is reached, or an exception is thrown.
So for me, the question is: why does the TCP version of your code allow the block to get interrupted? That doesn't seem to make any sense. Perhaps your TCP code breaks the reads up into discrete, short-enough bursts that the thread has a chance to jump in between separate calls to read()?
Looking further, I see that the difference is that the TCP version of your code receives data via the InputStream that's provided by Socket, whereas the UDP version of your code receives data directly from the DatagramSocket's own receive() method. So I'd posit that this is just a fundamental difference between the functionality offered by the two (InputStream.read and DatagramSocket.receive). Have you thought about using DatagramSocket.setSoTimeout to set a timeout on the socket's receive block, and then catching the SocketTimeoutException when it's thrown by the call to receive() that times out? You should be able to implement this to achieve what you want.
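A minimal sketch of the timeout approach suggested above. The class name UdpReceiveTimeout, the helper receiveOnce, the 1500-byte buffer and the 200 ms timeout are all assumptions made for illustration; only setSoTimeout(), receive() and SocketTimeoutException come from the JDK.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;

// Hypothetical helper class, not part of the original poster's code.
final class UdpReceiveTimeout {

    /**
     * Waits up to timeoutMillis for one datagram.
     * Returns the payload length, or -1 if the timeout expired first.
     */
    static int receiveOnce(DatagramSocket socket, int timeoutMillis) throws IOException {
        socket.setSoTimeout(timeoutMillis);   // a value of 0 would mean "block forever"
        DatagramPacket packet = new DatagramPacket(new byte[1500], 1500);
        try {
            socket.receive(packet);           // blocks for at most timeoutMillis
            return packet.getLength();
        } catch (SocketTimeoutException e) {
            return -1;                        // nothing arrived; the socket is still usable
        }
    }

    public static void main(String[] args) throws IOException {
        // Bind to an ephemeral loopback port; nothing is sent to it in this demo.
        try (DatagramSocket socket = new DatagramSocket(0, InetAddress.getLoopbackAddress())) {
            System.out.println("receiveOnce returned " + receiveOnce(socket, 200));
        }
    }
}
```

Calling receiveOnce in a loop lets the thread check a shutdown flag or do other work between attempts instead of sitting in receive() indefinitely.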
UPDATE: looking further still, it appears that the DatagramSocket.receive method is synchronized. So the way to explain the behavior you're seeing is that you're starting two threads that are attempting to use the same DatagramSocket instance -- if two threads are attempting to execute receive() on the same DatagramSocket instance, one will block the instance until it's done, and the other will be blocked. (There's a post on the Sun forums that describes this, as well.) Might this be what's happening -- you're reusing the same DatagramSocket for both threads?

It could be because the Java threads are user-level threads (as opposed to kernel-level threads). This applies only to JVMs that implement threads in user space, such as the "green threads" of early JVMs; mainstream JVMs map each Java thread to a native thread.
One kernel-level thread can contain multiple user-level threads. However, if one user-level thread waits for I/O, the kernel has no way to put only that user-level thread on a wait queue; it can only put the whole kernel thread on the wait queue (user-level threads are scheduled at the user level, not at the kernel level). As a result, a single user-level thread waiting for I/O can block all the other user-level threads in that kernel thread.

There's really no explanation for the behaviour you're describing, that a read on an InputStream from a Socket should cause "all other threads" to pause. Are you by any chance starting several threads, which all are reading from the same socket and what you actually mean is that all these threads are hanging? In that case, I would expect that the data read from the socket is divided arbitrarily among the threads.

Related

When to use selector and when to use blocking channels (performance)

Assume that there are 1 to 30~ channels (UDP & TCP channels)
Assume that we use NIO channels
Assume we are running on Multi core CPU
There are 2 options:
Define 1 thread per channel (each thread will be blocked until there is data to read, so all the threads sit in the wait queue until they wake up)
or:
Define 1 thread (with a selector) which will read the data (each time from a different channel)
Which is the best way?
Which will give me the best performance?
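For reference, the second option (one thread with a selector) can be sketched roughly as follows. The class SelectorSketch, its two helper methods, the loopback ephemeral ports and the buffer size are assumptions for illustration, and only UDP channels are shown.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;

// Hypothetical sketch of "1 thread with a selector"; names are illustrative only.
final class SelectorSketch {

    /** Opens a non-blocking UDP channel on an ephemeral loopback port and registers it for reads. */
    static DatagramChannel openAndRegister(Selector selector) throws IOException {
        DatagramChannel channel = DatagramChannel.open();
        channel.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        channel.configureBlocking(false);               // required before register()
        channel.register(selector, SelectionKey.OP_READ);
        return channel;
    }

    /**
     * Waits up to timeoutMillis (must be > 0) for readable channels, reads one
     * datagram from each ready channel, and returns how many datagrams were read.
     */
    static int pollOnce(Selector selector, long timeoutMillis) throws IOException {
        if (selector.select(timeoutMillis) == 0) {
            return 0;                                   // timed out, nothing ready
        }
        int datagrams = 0;
        Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
        while (keys.hasNext()) {
            SelectionKey key = keys.next();
            keys.remove();                              // the selected-key set is not cleared automatically
            if (key.isReadable()) {
                DatagramChannel channel = (DatagramChannel) key.channel();
                ByteBuffer buffer = ByteBuffer.allocate(1500);
                if (channel.receive(buffer) != null) {  // null means no datagram was pending
                    datagrams++;
                }
            }
        }
        return datagrams;
    }

    public static void main(String[] args) throws IOException {
        try (Selector selector = Selector.open();
             DatagramChannel a = openAndRegister(selector);
             DatagramChannel b = openAndRegister(selector)) {
            System.out.println("datagrams read: " + pollOnce(selector, 200));
        }
    }
}
```

In a real server, pollOnce would run in a loop on the single reader thread, and any slow processing of the received data would be handed off so that it does not delay the other channels.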

In Java you don't have much control over the concrete threading mechanisms, so you can't really bind a thread to a certain core (by setting its affinity, for example). You therefore cannot expect much performance difference between a single thread processing via a selector and multiple per-channel threads.
Given the introduction that I just gave, let's talk on a higher level, ok?
When you have multiple threads, one per channel, on a multi-core processor, up to N threads (where N is the number of cores) can execute at the same time. If you have a single thread processing a queue of requests, it could be reasonably fast if we could guarantee that once a certain request is made, another request does not arrive at the same time. Either way, the threads will end up going through context switches, which take a thread off the core, check whether there is something else that has to be done, and if not, put some thread back on the core to execute.
What happens sometimes is that your thread keeps moving from one core to another, which invalidates the cache and ends up decreasing performance (many other factors also affect it).
So, depending on how many requests will be arriving at the same time, I would prefer the multiple-threads approach.
Cheers.

Is there a way for a thread to know it has been "interleaved"?

In Java, is there a way for a thread to know it has been "interleaved"?
I would like to send a certain update to my clients (who are handled by individual threads) after their thread has been interleaved by another thread.
In case my use of the term "interleaved" is incorrect, I'm referring to the process where the processor stops running one thread and moves to another one.
So when the processor eventually returns to my thread, I would like a certain update to be sent to my client via the thread.

Apparently there is no simple way to detect that a thread has been interleaved.
Instead, I decided to use an atomic integer to track the amount of updates that were executed by all threads.
I then changed the code within my threads to monitor the amount of changes that had been done (since last notifying the client) and, once a certain threshold had been exceeded, I updated the client.
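A rough sketch of that counter-based approach. The class name ClientUpdateTracker, its methods and the choice to notify when the count reaches the threshold are assumptions; the original code was not posted.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical per-client helper; one instance is owned by each client thread.
final class ClientUpdateTracker {
    private final AtomicInteger sharedUpdateCount; // incremented by every thread that applies an update
    private final int threshold;
    private int lastNotifiedCount;                 // only read and written by the owning client thread

    ClientUpdateTracker(AtomicInteger sharedUpdateCount, int threshold) {
        this.sharedUpdateCount = sharedUpdateCount;
        this.threshold = threshold;
        this.lastNotifiedCount = sharedUpdateCount.get();
    }

    /** Returns true when at least `threshold` updates happened since the last notification. */
    boolean shouldNotifyClient() {
        int current = sharedUpdateCount.get();
        if (current - lastNotifiedCount >= threshold) {
            lastNotifiedCount = current;           // remember where this client caught up
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        AtomicInteger updates = new AtomicInteger();
        ClientUpdateTracker tracker = new ClientUpdateTracker(updates, 3);
        for (int i = 0; i < 7; i++) {
            updates.incrementAndGet();             // stands in for an update made by any thread
            if (tracker.shouldNotifyClient()) {
                System.out.println("notify client after update " + updates.get());
            }
        }
    }
}
```

Each client thread calls shouldNotifyClient() at a convenient point in its own loop, so no thread needs to know when it was descheduled.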

Socket timeout not respected with multiple threads in Java?

I have a java program that spawns out 4 threads. In each thread, I have multiple socket timeouts. However, it seems that these timeouts are not respected i.e. the readLine() function might block for a longer period of time.
I want the following behavior: If I set a socket timeout to 300 ms, then I want the readLine() function to return within 300 ms from when the readLine() (i.e. the underlying select call) was invoked, no matter what. I understand that the OS scheduler would be putting the threads to sleep while doing processor sharing, but is there any way in Java to force the threads to always be woken up to ensure this behavior? Or is this just not the right way to think when doing multi-threaded programming?
Ideally, since I am spawning out 4 threads and running on a 6-core machine, each thread should be able to get its own CPU and run in parallel, and respect the select timeout ... but that's perhaps too much to expect ...
PS: I actually do use Thread.interrupt() to ensure that each of my threads exits within a certain time (I check the time elapsed in the master thread, and interrupt the child threads if it's been too long). In each of my threads, I connect to a (different) server, make a request, and wait for a response. I do not know how long the response will be, so I keep on calling the readLine() method until it times out with a SocketTimeoutException. I enforce a timeout of 300 ms, since I expect the server to start responding within this time. The reason I want to enforce this timeout is that the server behaves in a broadcast fashion, and sends responses to a request by a single client to all clients. So if I don't have a timeout, I will keep on getting data in response to requests by some other clients.

If I really understood your problem, you can always try to invoke Thread.interrupt() in the thread that is performing the readLine() operation. As you didn't provide any code, I leave this link for you to read. It's general, but it provides enough information about interrupting threads.
These two links may also be of use to you: How do you kill a thread in Java? and How to abort a thread in a fast and clean way in java?.
Regarding your question about the OS scheduler, you should be aware that in general purpose OSes you do not have total control on the way the OS schedules tasks. For instance, in Linux the interrupts are the highest priority tasks, then there are scheduling policies that enable you to put "some" determinism on how the tasks are scheduled. In Java, you can use the setPriority() method to change the priority of the thread, but in fact it is the same as using the nice command, and still you don't get any guarantees that this thread will be scheduled ahead of other OS threads.
I hope this helps.

You are making some wrong assumptions here:
That the timeout will be exactly 300 ms. In fact, the timeout will be at least 300 ms.
The OS scheduler does not do anything within the Java threads (other than scheduling the Java OS processes).
Having 6 cores does not mean that each of your threads will run on a separate core; it is not possible to bind a thread to a core in Java.
Lastly, you assume that the JVM has only your 4 threads running, but in fact there are more threads, for example the garbage collector thread(s).
As for your question: "is there any way in Java to force the threads to always be woken up to ensure this behavior?"
Yes, depending on your code: if the thread is in Thread.sleep(), you can use Thread.interrupt() (use it for readLine() as well) and handle the InterruptedException; or, if the threads are in Object.wait(), you can use Object.notify() or Object.notifyAll().
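A small sketch of the sleep()/interrupt() case mentioned above. The class InterruptSketch and its method are hypothetical names. Whether interrupt() also unblocks a blocked socket read depends on the I/O API in use, so this sketch deliberately covers only the sleep() case.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; not taken from the question.
final class InterruptSketch {

    /** Starts a thread that sleeps for a minute, interrupts it, and reports whether it woke early. */
    static boolean interruptSleeper() throws InterruptedException {
        AtomicBoolean wokenByInterrupt = new AtomicBoolean(false);
        Thread sleeper = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                wokenByInterrupt.set(true);
                Thread.currentThread().interrupt(); // restore the flag for any code further up the stack
            }
        });
        sleeper.start();
        sleeper.interrupt();  // if sleep() has not started yet, it throws as soon as it is entered
        sleeper.join(5_000);  // wait for the sleeper to finish, but not forever
        return wokenByInterrupt.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("woken by interrupt: " + interruptSleeper());
    }
}
```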

Is it dangerous to start threads in Java and not wait for them (with .join())?

When writing a multithreaded internet server in Java, the main thread starts new ones to serve incoming requests in parallel.
Is there any problem if the main thread does not wait (with .join()) for them?
(It is obviously absurd to create a new thread and then wait for it.)
I know that, in a practical situation, you should (or "you must"?) implement a pool of threads to "re-use" them for new requests when they become idle.
But for small applications, should we use a pool of threads?

You don't need to wait for threads.
They can either complete running on their own (if they've been spawned to perform one particular task), or run indefinitely (e.g. in a server-type environment).
They should handle interrupts and respond to shutdown requests, however. See this article on how to do this correctly.
If you need a set of threads I would use a pool and executor methods since they'll look after thread resource management for you. If you're writing a multi-threaded network server then I would investigate using (say) a servlet container or a framework such as Mina.

The only problem in your approach is that it does not scale well beyond a certain request rate. If the requests are coming in faster than your server is able to handle them, the number of threads will rise continuously. As each thread adds some overhead and uses CPU time, the time for handling each request will get longer, so the problem will get worse (because the number of threads rises even faster). Eventually no request will be able to get handled anymore because all of the CPU time is wasted with overhead. Probably your application will crash.
The alternative is to use a ThreadPool with a fixed upper bound of threads (which depends on the power of the hardware). If there are more requests than the threads are able to handle, some requests will have to wait too long in the request queue, and will fail due to a timeout. But the application will still be able to handle the rest of the incoming requests.
Fortunately the Java API already provides a nice and flexible ThreadPool implementation, see ThreadPoolExecutor. Using this is probably even easier than implementing everything with your original approach, so no reason not to use it.
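As an illustration of a pool with a fixed upper bound, here is a hedged sketch. The class BoundedServerPool, the pool size of 4, the queue capacity of 100 and the choice of AbortPolicy are assumptions; the answer above only names ThreadPoolExecutor.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; sizes and names are illustrative only.
final class BoundedServerPool {

    /** Creates a pool with at most maxThreads workers and at most queueCapacity waiting requests. */
    static ThreadPoolExecutor newPool(int maxThreads, int queueCapacity) {
        return new ThreadPoolExecutor(
                maxThreads, maxThreads,                  // core size == max size: fixed number of threads
                0L, TimeUnit.MILLISECONDS,               // keep-alive is irrelevant when core == max
                new ArrayBlockingQueue<>(queueCapacity), // bounded queue, so a backlog cannot grow without limit
                new ThreadPoolExecutor.AbortPolicy());   // reject instead of queueing when full
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newPool(4, 100);
        for (int i = 0; i < 10; i++) {
            final int requestId = i;
            try {
                pool.execute(() -> System.out.println("handling request " + requestId));
            } catch (RejectedExecutionException e) {
                System.out.println("server busy, dropping request " + requestId);
            }
        }
        pool.shutdown();                                 // stop accepting work, let queued tasks finish
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

With this configuration an overloaded server rejects excess requests immediately rather than letting them time out in the queue; a different rejection policy or queue type would change that trade-off.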

Thread.join() lets you wait for the Thread to end, which is mostly contrary to what you want when starting a new Thread. After all, you start the new thread to do stuff in parallel to the original Thread.
Only if you really need to wait for the spawned thread to finish, you should join() it.

You should wait for your threads if you need their results or need to do some cleanup which is only possible after all of them are dead, otherwise not.
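A minimal sketch of the "join because you need the results" case. The class JoinForResults and its summing task are made up for illustration.

```java
// Hypothetical example: two worker threads each sum one array, and the caller needs both sums.
final class JoinForResults {

    static long sumInParallel(int[] left, int[] right) throws InterruptedException {
        long[] partial = new long[2];          // each worker writes only its own slot
        Thread a = new Thread(() -> { for (int v : left) partial[0] += v; });
        Thread b = new Thread(() -> { for (int v : right) partial[1] += v; });
        a.start();
        b.start();
        a.join();                              // join() guarantees the worker's writes are visible here
        b.join();
        return partial[0] + partial[1];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sumInParallel(new int[] {1, 2, 3}, new int[] {4, 5}));
    }
}
```

If the caller did not need the sums, it could simply start the threads and return without joining.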
For the Thread-Pool: I would use it whenever you have some non-fixed number of tasks to run, i.e. if the number depends on the input.

I would like to collect the main ideas of this interesting (for me) question.
I can't totally agree with "you don't need to wait for threads". Only in the sense that if you don't join a thread (and don't have a pointer to it), once the thread is done, its resources are freed (right? I'm not sure).
The use of a thread pool is only necessary to avoid the overhead of thread creation, because ...
You can limit the number of threads running in parallel by keeping count, with shared variables (and without a thread pool), of how many of them were started but not yet finished.

Java daemon - handling shutdown requests

I'm currently working on a daemon that will be doing A LOT of different tasks. It's multi threaded and is being built to handle almost any kind of internal-error without crashing. Well I'm getting to the point of handling a shutdown request and I'm not sure how I should go about doing it.
I have a shutdown hook setup, and when it's called it sets a variable telling the main daemon loop to stop running. The problem is, this daemon spawns multiple threads and they can take a long time. For instance, one of these threads could be converting a document. Most of them will be quick (I'm guessing under 10 seconds), but there will be threads that can last as long as 10+ minutes.
What I'm thinking of doing right now is this: when the shutdown hook is called, first send a notification to all threads telling them a shutdown has been requested, then loop for about 5 seconds on ThreadGroup.activeCount() with a 500 ms (or so) sleep (all these threads are in a ThreadGroup). The threads will then have to clean up and shut down instantly, no matter what they're doing.
Does anyone else have any suggestions? I'm interested in what a daemon like MySQL, for instance, does when it gets told to stop: it stops instantly. What happens if, say, 10 very slow queries are running? Does it wait, or does it just end them? I mean, servers are really quick, so there really isn't any kind of operation that I shouldn't be able to do in less than a second. You can do A LOT in 1000 ms nowadays.
Thanks

The java.util.concurrent package provides a number of utilities, such as ThreadPoolExecutor (along with various specialized types of other Executor implementations from the Executors class) and ThreadPoolExecutor.awaitTermination(), which you might want to look into - as they provide the same exact functionality you are looking to implement. This way you can concentrate on implementing the actual functionality of your application/tasks instead of worrying about things like thread and task scheduling.
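One way this could look, as a sketch only. The class GracefulShutdown, the method stop and the grace period are assumptions, not something taken from the daemon in the question.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper that a shutdown hook could call.
final class GracefulShutdown {

    /**
     * Stops accepting new tasks and waits up to graceMillis for running ones.
     * Returns true on a clean stop; otherwise interrupts the remaining tasks and returns false.
     */
    static boolean stop(ExecutorService pool, long graceMillis) {
        pool.shutdown();                              // no new tasks; already submitted ones keep running
        try {
            if (pool.awaitTermination(graceMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
            pool.shutdownNow();                       // grace period over: interrupt whatever is left
            return false;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();       // preserve the caller's interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> System.out.println("short task done"));
        System.out.println("clean stop: " + stop(pool, 5_000));
    }
}
```

In the daemon described in the question, the existing shutdown hook could call something like stop(pool, 5_000) in place of the hand-written polling loop over ThreadGroup.activeCount().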

Are your thread jobs amenable to interruption via Thread#interrupt()? Do they mostly call on functions that themselves advertise throwing InterruptedException? If so, then the aforementioned java.util.concurrent.ExecutorService#shutdownNow() is the way to go. It will interrupt any running threads and return the list of jobs that were never started.
Similarly, if you hang on to the Futures produced by ExecutorService#submit(), you can use Future#cancel(boolean) and pass true to request that a running job be interrupted.
Unless you're calling on code out of your control that swallows interrupt signals (say, by catching InterruptedException without calling Thread.currentThread().interrupt()), using the built-in cooperative interruption facility is a better choice than introducing your own flags to approximate what's already there.
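The cooperative interruption described above might be sketched like this. The class CancellableJob and the looping task are hypothetical; the point is only that the job reacts to Future#cancel(true) by catching InterruptedException, cleaning up, and restoring the interrupt flag instead of swallowing it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical demo of cancelling a running job through its Future.
final class CancellableJob {

    /** Starts a long-running job, cancels it, and reports whether it stopped and cleaned up. */
    static boolean cancelRunningJob() throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch cleanedUp = new CountDownLatch(1);

        Runnable work = () -> {
            started.countDown();
            try {
                while (true) {
                    TimeUnit.MILLISECONDS.sleep(50);   // stands in for one slice of real work
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();    // do not swallow the interrupt
            } finally {
                cleanedUp.countDown();                 // cleanup runs however the job ends
            }
        };

        Future<?> job = pool.submit(work);
        started.await();                               // make sure the job is actually running
        job.cancel(true);                              // true = interrupt the thread running the job
        boolean stopped = cleanedUp.await(5, TimeUnit.SECONDS);
        pool.shutdown();
        return stopped && job.isCancelled();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("job cancelled cleanly: " + cancelRunningJob());
    }
}
```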
