In my Java web app I have a method which sends out about 200 emails. Because of email server delays the whole process takes about 7 minutes. This bulk email sending has to take place as the result of a user action. I obviously don't want the user to have to wait that long before they are forwarded to the next page, not to mention that Apache times out anyway, so I am attempting to use FutureTask to run the process in a separate thread while the rest of the code proceeds, like this:
// some code

Runnable r = (Runnable) new sendEmails(ids);
FutureTask<Void> task = new FutureTask<>(r, null);
Thread t = new Thread(task);
t.start();

// some more code
The app, however, still waits for the FutureTask to finish before proceeding. I am open to the idea that this is also not the best way to run some code on the side in another thread while continuing with the rest of the code. Are there better ways, or how do I make this one work?
It looks like you are spinning up 200+ threads in a for loop. That places a high burden on the machine, and because each thread allocates its own stack, it will not take many threads before the JVM runs out of memory, initially causing heavy GC and JVM lock-ups and then, under high enough load, potentially a crash.
Sadly, this may or may not explain why your code is waiting for the FutureTasks to complete. It may only appear to be waiting due to the thrashing caused by creating and scheduling so many threads; but then again it may not. There could very well be something else synchronizing your code that has been cut out of the snippet above.
A way to find out whether there is a tricky synchronisation hiding somewhere is to hit Ctrl-Break while the code is running (assuming you are running from a command line; IntelliJ and Eclipse both have a handy stack-dump icon). This will produce a stack dump for every thread in the system. By looking at it you will be able to find the user thread that is waiting for the future tasks to complete, and it will say which monitor it is waiting on. If it is not waiting, then you have a different problem, for example the system thrashing so badly while creating that many threads in short order that it appears to lock up for a while.
But first I would avoid the excessive Thread creation part, as that could be masking the issue. I suggest using code similar to the following:
ExecutorService scheduler = Executors.newCachedThreadPool();
scheduler.submit(task);
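As a fuller sketch, assuming sendEmails implements Runnable as in your snippet and that ids is a List<Long> (both assumptions, since the real types are not shown):

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class EmailController {

    // Stand-in for the sendEmails class from the question; assumed to implement Runnable.
    static class sendEmails implements Runnable {
        private final List<Long> ids;
        sendEmails(List<Long> ids) { this.ids = ids; }
        @Override public void run() { /* send the ~200 emails here */ }
    }

    // One shared pool for the whole application; reuse it instead of creating
    // new threads (or pools) per request.
    private static final ExecutorService EMAIL_POOL = Executors.newCachedThreadPool();

    public void handleRequest(List<Long> ids) {
        // The request thread returns immediately; a pool thread does the slow work.
        EMAIL_POOL.submit(new sendEmails(ids));
        // ... forward the user to the next page here; do NOT call get() on the
        // returned Future, as that would block until all the emails are sent.
    }
}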
Related
I have a game server written in Java.
Each user needs two threads, one to send and one to receive data. But once there are 200-300 threads, the function that processes the data stops working. CPU and RAM on the server are not full, only around 15-20%.
I tried to trigger the garbage collector when a user disconnects, but this still happens.
Thanks for helping. Sorry for my bad English.
Your service should ideally never be creating "too many threads".
Opt for a thread-pool using ExecutorService.
The number of threads you create the pool with depends on the kind of underlying task you have.
As a general practice (a short sketch follows the list):
1: For a CPU-intensive task the number of threads should equal the number of available processors:
Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
2: For an IO-intensive task you can create more threads than the number of available processors, since most of your threads will spend their time waiting on IO anyway.
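A minimal sketch of both rules of thumb; the pool sizes here (one thread per core, and an arbitrary factor of 4 for IO-bound work) are illustrative assumptions, not recommendations:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolSizing {
    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();

        // CPU-bound work: roughly one thread per core.
        ExecutorService cpuPool = Executors.newFixedThreadPool(cores);

        // IO-bound work (e.g. per-user send/receive over sockets): threads spend
        // most of their time blocked, so a larger pool is acceptable. The factor
        // of 4 is an arbitrary example.
        ExecutorService ioPool = Executors.newFixedThreadPool(cores * 4);

        // ... submit tasks to the appropriate pool ...

        cpuPool.shutdown();
        ioPool.shutdown();
    }
}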
300 Java threads does not sound like too many (compare with the default settings of application servers like Wildfly). If your application is getting stuck but neither CPU nor RAM is the bottleneck, try to figure out what is actually happening. You may well be facing threads waiting for each other to finish.
So I recommend looking at a thread dump to see where the threads might be stuck. Check out Generate a Java thread dump without restarting.
Try checking out Java's ExecutorService to create a thread pool with a fixed number of threads.
In an open source application I'm participating in, we've got a bug where the application doesn't always close properly. That's what I'd like to solve.
Experience has shown that this happens most of the time when threads and processes are started but not managed correctly (e.g. a thread is waiting on a socket connection, the application is shut down, and the thread keeps on waiting).
With this in mind I've searched for '.start()' in the entire source and found 53 occurrences (which scared me a bit).
As a first step, I wanted to create a helper class (ThreadExecutor) where the current code 'thread.start()' would be replaced by 'ThreadExecutor.Execute(thread)' to have a) only a few changes in the existing source and b) a single class where I can easily check which threads don't end as they should. To do this I wanted to
add the thread to be executed to a list called activeThreads when calling the Execute method
start the thread
remove it from the activeThreads list when it ends.
This way I'd have an up-to-date list of all executing threads, and when the app hangs on shutdown I could see in there which thread(s) is (are) causing it.
Questions:
What do you think about the concept? I usually code in C# and know how I'd do it using .NET with workers, but am not too sure what's best in Java (I'd like to modify as few lines of code as possible in the existing source).
If the concept seems OK, how can I get notified of a thread terminating? I'd like to avoid having an additional thread that checks every once in a while what the state of all threads contained in activeThreads is, to remove them if they have terminated.
Just to clarify: before figuring out how to terminate the application properly, what I'm asking here is what's the best/easiest way to find which threads are the cause, for certain test cases which are pretty hard to reproduce.
I would attempt to analyze your application's behavior before changing any code. Code changes can introduce new problems - not what you want to do if you're trying to solve problems.
The easiest way to introspect the state of your application with regard to which threads are currently running is to obtain a thread dump. You said that your problem is that the application hangs on shutdown. This is the perfect scenario to apply a thread dump. You'll be able to see which threads are blocked.
You can read more about thread dumps here.
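If it is useful, you can also take a dump programmatically from inside the application with the standard Thread.getAllStackTraces() API; a small sketch:

import java.util.Map;

public class ThreadDumper {
    // Prints a stack trace for every live thread, similar to what a jstack
    // dump shows; handy to log when shutdown appears to hang.
    public static void dumpAllThreads() {
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            System.out.printf("Thread %s (daemon=%b, state=%s)%n", t.getName(), t.isDaemon(), t.getState());
            for (StackTraceElement frame : entry.getValue()) {
                System.out.println("    at " + frame);
            }
        }
    }
}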
Try to make all threads daemon threads (when only daemon threads remain, the JVM terminates). Call thread.setDaemon(true) before starting each thread.
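For example (the body of the worker is just a placeholder):

// Daemon threads do not keep the JVM alive; once only daemon threads remain, the JVM exits.
Thread worker = new Thread(() -> {
    // hypothetical blocking work, e.g. waiting on a socket
});
worker.setDaemon(true);   // must be set before start(), otherwise IllegalThreadStateException
worker.start();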
You could try to look into your application using JVisualVM (which is shipped with the JDK; find it in the bin folder of your JDK). JVisualVM can connect to your application and display a lot of interesting information, including which processes are still running. I'd give that a shot before starting down the road you describe.
Here is some documentation on JVisualVM should you need it.
The best way in Java is to use thread pools instead of threads directly (although using threads directly is acceptable). Thread pools accept Runnable objects, which you can see as tasks. The idea is that most threads should do a small task and then end; because creating a Thread is expensive and harder to manage, you can use a thread pool, which gives you things like ThreadPoolExecutor.awaitTermination(). If you have more tasks than threads in the pool, the remaining tasks are simply queued.
Changing a Thread into a Runnable is easy, and you can even execute a Runnable on a Thread you create yourself.
Note that this might not work for threads that run for a long time, but your question seems to suggest that they will eventually finish.
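A rough sketch of that pattern, with an arbitrary pool size and timeout:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PooledShutdownExample {
    private final ExecutorService pool = Executors.newFixedThreadPool(8);

    public void execute(Runnable task) {
        pool.submit(task);   // queued if all 8 threads are busy
    }

    // Called from the application's close/shutdown code.
    public void shutdown() throws InterruptedException {
        pool.shutdown();                                   // stop accepting new tasks
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            System.out.println("Some tasks did not finish in time");
        }
    }
}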
As for your second question, the best way to find out which threads are running at a certain point is to run the application in a debugger (such as Eclipse) and pause all threads on a breakpoint in the close function.
I would try the trial edition of JProfiler or something similar, which gives you a lot of insight into what your application and its threads actually do.
Don't change the code yet, but try to reproduce and understand when this happens.
Create yourself a static thread pool.
static ExecutorService threads = Executors.newCachedThreadPool();
For every start of a thread, change:
new Thread(new AThread()).start();
to
threads.submit(new AThread ());
When your code exits, shut the pool down and list what is left:
List<Runnable> queuedTasks = threads.shutdownNow();
for (Runnable t : queuedTasks) {
    System.out.println("Task still queued at shutdown: " + t);
}
shutdownNow() will attempt to stop the tasks that are currently executing (by interrupting them) and return the tasks that were still waiting in the queue, so you can see what had not finished and start investigating why.
EDIT: Added
If you want to keep track of all running threads use:
Future<?> f = threads.submit(new AThread());
and store it in a list somewhere. You can then find out about its state with calls like:
f.isDone();
... etc.
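For example, a small sketch of tracking submitted tasks by name so you can report what is unfinished at shutdown (the name-based bookkeeping is just one possible approach):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TrackedExecutor {
    private static final ExecutorService threads = Executors.newCachedThreadPool();
    private static final Map<String, Future<?>> submitted = new ConcurrentHashMap<>();

    public static void submit(String name, Runnable task) {
        submitted.put(name, threads.submit(task));
    }

    // At shutdown, report which tasks have not completed yet.
    public static void reportUnfinished() {
        submitted.forEach((name, future) -> {
            if (!future.isDone()) {
                System.out.println("Task still running or queued: " + name);
            }
        });
    }
}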
When writing a multithreaded internet server in Java, the main thread starts new ones to serve incoming requests in parallel.
Is there any problem if the main thread does not wait (with .join()) for them?
(It is obviously absurd to create a new thread and then wait for it.)
I know that, in a practical situation, you should (or "must"?) implement a pool of threads to "re-use" them for new requests when they become idle.
But for small applications, should we use a thread pool?
You don't need to wait for threads.
They can either complete running on their own (if they've been spawned to perform one particular task), or run indefinitely (e.g. in a server-type environment).
They should handle interrupts and respond to shutdown requests, however. See this article on how to do this correctly.
If you need a set of threads I would use a pool and the executor methods, since they'll look after thread resource management for you. If you're writing a multi-threaded network server then I would investigate using (say) a servlet container or a framework such as Mina.
The only problem in your approach is that it does not scale well beyond a certain request rate. If the requests are coming in faster than your server is able to handle them, the number of threads will rise continuously. As each thread adds some overhead and uses CPU time, the time for handling each request will get longer, so the problem will get worse (because the number of threads rises even faster). Eventually no request will be able to get handled anymore because all of the CPU time is wasted with overhead. Probably your application will crash.
The alternative is to use a ThreadPool with a fixed upper bound of threads (which depends on the power of the hardware). If there are more requests than the threads are able to handle, some requests will have to wait too long in the request queue, and will fail due to a timeout. But the application will still be able to handle the rest of the incoming requests.
Fortunately the Java API already provides a nice and flexible thread pool implementation; see ThreadPoolExecutor. Using it is probably even easier than implementing everything with your original approach, so there is no reason not to use it.
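A minimal sketch of that approach for a socket server; the pool size of 50 and port 8080 are arbitrary placeholders:

import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PooledServer {
    public static void main(String[] args) throws IOException {
        // Upper bound on concurrent request handlers; tune to the hardware.
        ExecutorService pool = Executors.newFixedThreadPool(50);

        try (ServerSocket server = new ServerSocket(8080)) {
            while (true) {
                Socket client = server.accept();
                // The task waits in the pool's queue if all 50 threads are busy.
                pool.submit(() -> handle(client));
            }
        }
    }

    private static void handle(Socket client) {
        // hypothetical request handling; close the socket when done
        try { client.close(); } catch (IOException ignored) { }
    }
}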
Thread.join() lets you wait for a Thread to end, which is mostly the opposite of what you want when starting a new Thread. After all, you start the new thread precisely to do work in parallel with the original thread.
Only if you really need to wait for the spawned thread to finish should you join() it.
You should wait for your threads if you need their results or need to do some cleanup which is only possible after all of them are dead; otherwise not.
As for the thread pool: I would use it whenever you have a non-fixed number of tasks to run, i.e. if the number depends on the input.
I would like to collect the main ideas of this interesting (for me) question.
1. I can't totally agree with "you don't need to wait for threads". Only in the sense that if you don't join a thread (and don't keep a reference to it), its resources are freed once the thread is done (right? I'm not sure).
2. The use of a thread pool is only necessary to avoid the overhead of thread creation, because ...
3. You can limit the number of parallel running threads by accounting, with shared variables (and without a thread pool), how many of them were started but not yet finished.
I'm currently working on a daemon that will be doing A LOT of different tasks. It's multi-threaded and is being built to handle almost any kind of internal error without crashing. Well, I'm getting to the point of handling a shutdown request and I'm not sure how I should go about doing it.
I have a shutdown hook set up, and when it's called it sets a variable telling the main daemon loop to stop running. The problem is, this daemon spawns multiple threads and they can take a long time. For instance, one of these threads could be converting a document. Most of them will be quick (I'm guessing under 10 seconds), but there will be threads that can last as long as 10+ minutes.
What I'm thinking of doing right now is: when the shutdown hook is called, loop for about 5 seconds on ThreadGroup.activeCount() with a 500ms (or so) sleep (all these threads are in a ThreadGroup), and before this loop send a notification to all threads telling them a shutdown request has been made. Then they will have to clean up and shut down immediately, no matter what they're doing.
Does anyone else have any suggestions? I'm interested in what a daemon like MySQL, for instance, does when it is told to stop: it stops instantly. What happens if 10 very slow queries are running? Does it wait, or does it just end them? I mean, servers are really quick, so there really isn't any kind of operation that I shouldn't be able to do in less than a second. You can do A LOT in 1000ms these days.
Thanks
The java.util.concurrent package provides a number of utilities, such as ThreadPoolExecutor (along with various specialized Executor implementations from the Executors class) and ThreadPoolExecutor.awaitTermination(), which you might want to look into, as they provide exactly the functionality you are looking to implement. This way you can concentrate on the actual functionality of your application/tasks instead of worrying about things like thread and task scheduling.
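As a rough sketch of how those pieces fit together in a shutdown hook (the pool size and the 5-second grace period are arbitrary assumptions):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class DaemonShutdown {
    private static final ExecutorService workers = Executors.newFixedThreadPool(16);

    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            workers.shutdown();                    // no new tasks accepted
            try {
                // Give quick tasks a grace period, then interrupt the long ones.
                if (!workers.awaitTermination(5, TimeUnit.SECONDS)) {
                    workers.shutdownNow();
                }
            } catch (InterruptedException e) {
                workers.shutdownNow();
                Thread.currentThread().interrupt();
            }
        }));

        // ... main daemon loop submits tasks to 'workers' here ...
    }
}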
Are your thread jobs amenable to interruption via Thread#interrupt()? Do they mostly call on functions that themselves advertise throwing InterruptedException? If so, then the aforementioned java.util.concurrent.ExecutorService#shutdownNow() is the way to go. It will interrupt any running threads and return the list of jobs that were never started.
Similarly, if you hang on to the Futures produced by ExecutorService#submit(), you can use Future#cancel(boolean) and pass true to request that a running job be interrupted.
Unless you're calling code outside your control that swallows interrupt signals (say, by catching InterruptedException without calling Thread.currentThread().interrupt()), using the built-in cooperative interruption facility is a better choice than introducing your own flags to approximate what's already there.
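For what it's worth, a rough sketch of a task that cooperates with interruption; ConvertDocumentTask, convertNextPage() and cleanup() are hypothetical placeholders for the document-conversion work mentioned in the question:

public class ConvertDocumentTask implements Runnable {
    @Override
    public void run() {
        try {
            // Check the interrupt flag between units of work.
            while (!Thread.currentThread().isInterrupted()) {
                convertNextPage();          // hypothetical unit of work
            }
        } catch (InterruptedException e) {
            // Thrown by blocking calls inside convertNextPage(); restore the flag.
            Thread.currentThread().interrupt();
        }
        cleanup();                          // hypothetical cleanup before exiting
    }

    private void convertNextPage() throws InterruptedException { /* ... */ }

    private void cleanup() { /* ... */ }
}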
I'm using a java.util.concurrent.ExecutorService that I obtained by calling Executors.newSingleThreadExecutor(). This ExecutorService can sometimes stop processing tasks, even though it has not been shut down, and it continues to accept new tasks without throwing exceptions. Eventually, it builds up enough of a queue that my app shuts down with OutOfMemoryError exceptions.
The documentation seems to indicate that this single-thread executor should survive task processing errors by firing up a new worker thread if necessary to replace one that has died. Am I missing something?
It sounds like you have two different issues:
1) You're over-feeding the work queue. You can't just keep stuffing new tasks into the queue with no regard for the consumption rate of the task executors. You need to figure out some logic for knowing when to block new additions to the work queue (see the sketch after this list).
2) Any uncaught exception in a task's thread can completely kill the thread. When that happens, the ExecutorService spins up a new thread to replace it. But that doesn't mean you can ignore whatever problem is causing the thread to die in the first place! Find those uncaught exceptions and catch them!
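On point 1, one way (just a sketch, with an arbitrary queue capacity) to put a bound on the queue is to build the single-thread executor yourself with ThreadPoolExecutor:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedSubmission {
    public static ThreadPoolExecutor newBoundedSingleThreadExecutor() {
        // A single worker thread with a bounded queue. CallerRunsPolicy makes the
        // submitting thread run the task itself when the queue is full, which
        // throttles producers instead of letting the queue grow without bound.
        return new ThreadPoolExecutor(
                1, 1,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1000),                // capacity is an arbitrary example
                new ThreadPoolExecutor.CallerRunsPolicy());
    }
}

The point is that the queue has a fixed capacity and the rejection policy decides what happens when it fills up.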
This is just a hunch (because there's not enough info in your post to know otherwise), but I don't think your problem is that the task executor stops processing tasks. My guess is that it just doesn't process tasks as fast as you're creating them. (And the fact that your tasks sometimes die prematurely is probably orthogonal to the problem.)
At least, that's been my experience working with thread pools and task executors.
Okay, here's another possibility that sounds feasible based on your comment (that everything will run smoothly for hours until suddenly coming to a crashing halt)...
You might have a rare deadlock between your task threads. Most of the time, you get lucky, and the deadlock doesn't manifest itself. But occasionally, two or more of your task threads get into a state where they're waiting for the release of a lock held by the other thread. At that point, no more task processing can take place, and your work queue will pile up until you get the OutOfMemoryError.
Here's how I'd diagnose that problem:
Eliminate ALL shared state between your task threads. At first, this might require each task thread making a defensive copy of all shared data structures it requires. Once you've done that, it should be completely impossible to experience a deadlock.
At this point, gradually reintroduce the shared data structures, one at a time (with appropriate synchronization). Re-run your application after each tiny modification to test for the deadlock. When you get that crashing situation again, take a close look at the access patterns for the shared resource and determine whether you really need to share it.
As for me, whenever I write code that processes parallel tasks with thread pools and executors, I always try to eliminate ALL shared state between those tasks. As far as the application is concerned, they may as well be completely autonomous applications. Hunting down deadlocks is a drag, and in my experience, the best way to eliminate deadlocks is for each thread to have its own local state rather than sharing any state with other task threads.
Good luck!
My guess would be that your tasks are blocking indefinitely rather than dying. Do you have evidence, such as a log statement at the end of your task, suggesting that your tasks are successfully completing?
This could be a deadlock, or an interaction with some external process that is blocking.
Although you don't give enough detail to be sure, the first thing I'd try is to have your tasks catch Exception at the top level and log the message.
I know it doesn't seem right, but occasionally (depending on a lot of variables) I've worked on code where something happening in a thread throws an exception that is never logged, or just doesn't show up on the console, yet the "executing" code exits its top-level loop or whatever code is causing your task to run.
I guess I'm just saying: make sure your tasks are not letting an exception escape.
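One way to make sure of that is to wrap every task in a logging wrapper before submitting it; a small sketch (the printStackTrace call is just a placeholder for whatever logging framework you use):

public final class LoggingTask implements Runnable {
    private final Runnable delegate;

    public LoggingTask(Runnable delegate) {
        this.delegate = delegate;
    }

    @Override
    public void run() {
        try {
            delegate.run();
        } catch (Exception e) {
            // Log the exception so it never escapes silently.
            e.printStackTrace();
        }
    }
}

Then submit with executor.submit(new LoggingTask(realTask)); instead of submitting realTask directly.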