Analysing a multi-threaded Java application - java

In an open source application I contribute to, we have a bug where the application doesn't always shut down properly. That's what I'd like to solve.
Experience has shown that this happens most of the time when threads and processes are being started, but not managed correctly (e.g. a thread is waiting on a socket connection, the application is being shut down and the thread keeps on waiting).
With this in mind I've searched for '.start()' in the entire source and found 53 occurrences (which scared me a bit).
As a first step, I wanted to create a helper class (ThreadExecutor) where the current code 'thread.start()' would be replaced by 'ThreadExecutor.Execute(thread)', so that a) only a few changes are needed in the existing source and b) there is a single class where I can easily check which threads don't end as they should. To do this I wanted to:
add the thread to be executed to a list called activeThreads when calling the Execute method
start the thread
remove it from the activeThreads list when it ends.
This way I'd have an up to date list of all executing threads and when the app hangs on shutdown I could see in there which thread(s) is(are) causing it.
Questions:
What do you think about the concept? I'm usually coding c# and know how I'd do it using .NET with workers, but am not too sure what's best in Java (I'd like to modify as few lines of code as possible in the existing source).
If the concept seems ok, how can I get notified of a thread terminating. I'd like to avoid having an additional thread checking every once in a while what the state of all threads contained in activeThreads is, to remove them if they terminated.
Just to clarify: before figuring out how to terminate the application properly, what I'm asking here is the best/easiest way to find which threads are responsible, given that the test cases are pretty hard to reproduce.

I would attempt to analyze your application's behavior before changing any code. Code changes can introduce new problems - not what you want to do if you're trying to solve problems.
The easiest way to introspect the state of your application with regard to which threads are currently running is to obtain a thread dump. You said that your problem is that the application hangs on shutdown. This is the perfect scenario to apply a thread dump. You'll be able to see which threads are blocked.
You can read more about thread dumps here.
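Besides external tools, a dump can also be produced from inside the JVM, for example from a shutdown routine. The following is a minimal sketch using Thread.getAllStackTraces(); the class name ThreadDumper is made up for illustration and is not part of the application in question.

```java
import java.util.Map;

final class ThreadDumper {
    /** Builds a text dump of every live thread, marking daemon and non-daemon threads. */
    static String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            sb.append(t.getName())
              .append(t.isDaemon() ? " (daemon) " : " (non-daemon) ")
              .append(t.getState())
              .append('\n');
            // One line per stack frame, innermost frame first.
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }
}
```

Non-daemon threads that are still listed when the application should already have exited are the likely culprits.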

Try making all threads daemon threads (the JVM terminates once all remaining threads are daemon threads). Call thread.setDaemon(true) before starting each thread.
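As a small sketch of that suggestion (the class and thread names here are placeholders, not taken from the application):

```java
final class DaemonExample {
    /** Starts the given work on a daemon thread and returns the thread. */
    static Thread startDaemon(Runnable work) {
        Thread t = new Thread(work, "background-worker");
        // Must be called before start(); afterwards it throws IllegalThreadStateException.
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

Keep in mind that daemon threads are simply abandoned when the JVM exits, without their finally blocks running, so this only suits threads that hold nothing that needs a clean shutdown.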

You could try to look into your application using jvisualvm (which is shipped with the JDK; find it in the bin folder of your JDK). JVisualVM can connect to your application and display a lot of interesting information, including which threads are still running. I'd give that a shot before starting down the road you describe.
Here is some documentation on JVisualVM should you need it.

The best way in Java is to use thread pools instead of threads directly (although using threads directly is accepted). Thread pools accept Runnable objects, which you can think of as tasks. The idea is that most threads do a small task and then end. Because creating a Thread is expensive and threads are harder to manage by hand, you can use a thread pool, which gives you things like 'ThreadPoolExecutor.awaitTermination()'. If you have more tasks than threads in the pool, the remaining tasks are simply queued.
Changing a Thread into a Runnable is easy, and you can even execute a runnable on a Thread you make yourself.
Note that this might not work for threads that run a long time, but your question seems to suggest that they will eventually finish.
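The thread-pool approach described above could look roughly like this. It is only a sketch: the pool size, the timeout and the class name PoolShutdownExample are arbitrary assumptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class PoolShutdownExample {
    /** Runs taskCount small tasks on a 2-thread pool and returns how many completed. */
    static int runTasks(int taskCount) {
        AtomicInteger completed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < taskCount; i++) {
            // With more tasks than threads, the extra tasks wait in the pool's queue.
            pool.execute(completed::incrementAndGet);
        }
        pool.shutdown(); // no new tasks accepted; queued tasks still run
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                pool.shutdownNow(); // interrupt whatever is still running
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return completed.get();
    }
}
```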
As for your second question, the best way to find out which threads are running at a certain point is to run the application in a debugger (such as Eclipse) and pause all threads on a breakpoint in the close function.

I would try the trial edition of jprofiler or something similar, which gives you a lot of insight into what your application and its threads actually do.
Don't change the code yet, but try to reproduce and understand when this happens.

Create yourself a static thread pool.
static ExecutorService threads = Executors.newCachedThreadPool();
For every start of thread change:
new Thread(new AThread()).start();
to
threads.submit(new AThread ());
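When your code exits, shut the pool down and list the tasks that never got to run:
List<Runnable> neverStarted = threads.shutdownNow();
for (Runnable t : neverStarted) {
    System.out.println("Task not started at shutdown: " + t);
}
This interrupts all running tasks. Note that shutdownNow() returns only the tasks that were still queued, not the ones that were running, and a cached thread pool hands every task straight to a thread, so the list is normally empty. To see which tasks are hanging, keep track of them yourself as described below, or take a thread dump.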
EDIT: Added
If you want to keep track of all running threads use:
Future<?> f = threads.submit(new AThread());
and store it in a list somewhere. You can then find out about its state with calls like:
f.isDone();
... etc.
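A sketch of that bookkeeping, assuming each task is given a human-readable name when it is submitted. The TrackedTasks class and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class TrackedTasks {
    private final ExecutorService pool = Executors.newCachedThreadPool();
    private final Map<String, Future<?>> futures = new ConcurrentHashMap<>();

    /** Submits the task and remembers its Future under the given name. */
    void submit(String name, Runnable task) {
        futures.put(name, pool.submit(task));
    }

    /** Names of all tasks that have not finished yet (normally, exceptionally or by cancellation). */
    List<String> unfinished() {
        List<String> names = new ArrayList<>();
        futures.forEach((name, future) -> {
            if (!future.isDone()) {
                names.add(name);
            }
        });
        return names;
    }

    /** Interrupts running tasks; tasks that ignore interruption will keep running. */
    void shutdownNow() {
        pool.shutdownNow();
    }
}
```

Calling unfinished() just before exit shows which tasks are still hanging, without a separate polling thread.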

Related

What is a preferred way of closing a third-application thread without waiting for it to complete?

I am currently running a JAR that I cannot change, and sometimes it simply gets stuck for no good reason. I have tried finding ways to interrupt the thread, stop the thread, etc., but no luck.
Each solution offered was about doing the complete exit or waiting for a thread to complete.
What I want to do is to simply close the thread, exactly when the timeout completes, and carry on with the program.
What I do not want to do is use the while loop with a timeout, java.util.concurrent.Future, System.exit, and make a Thread.interrupt call.
None of these will help!
You can't forcibly stop a thread in mid-execution. The Thread.destroy() method would have done that, but it was never implemented, and its documentation explains why it would be unsafe to use even if it worked.
There are some other deprecated methods like Thread.stop() and Thread.suspend() which may actually work on older JVMs (recent JDK releases have disabled or removed them), but they're also unsafe to use; again, their documentation explains why.
Telling the thread that it should terminate itself, and then waiting for it to do so, is the only safe way to stop a thread.
As a workaround, you could run your task in an entirely separate process, so that you can destroy it when you want it to stop. That is safe, since processes are isolated from each other and destroying the child process can't leave the parent process in an unstable state.
Interacting with a separate process is more difficult, though, since you can't share variables between processes like you can with threads. You'd need to send messages through the process's input and output streams.
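A minimal sketch of the separate-process idea, using ProcessBuilder and Process.waitFor with a timeout (both available since Java 8). The class name ChildProcessRunner is invented here, and the command to run is whatever launches the third-party JAR in your setup.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

final class ChildProcessRunner {
    /** Returns true if the command exited within the timeout, false if it had to be killed. */
    static boolean runWithTimeout(List<String> command, long timeout, TimeUnit unit) {
        Process child;
        try {
            // inheritIO() avoids the child blocking on a full output pipe nobody reads.
            child = new ProcessBuilder(command).inheritIO().start();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try {
            if (child.waitFor(timeout, unit)) {
                return true; // finished on its own
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // fall through and kill the child
        }
        child.destroyForcibly(); // the parent JVM's state is unaffected
        return false;
    }
}
```

Note that destroying the child does not necessarily stop processes the child itself started.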
Actually, you can't really solve this!
What I mean is: even if you would manage to kill "your" thread that you used to trigger the 3rd party code - you have no way of killing threads or processes created by the code you are invoking.
If you want to be absolutely sure to kill all and anything, you might have to look into rather complex solutions like:
instead of just using a thread, you create a new process with a new JVM B
in that JVM B, you can call that library
but of course, that requires that you put additional code around; so that "your" code in JVM A can talk to "your" code in JVM B
And now you might be able to tear down that process, and all artifacts belonging to it. Maybe.
And seriously: to be really sure that the 3rd party library didn't kick off anything that you can't stop, you might even have to run that JVM inside some kind of container (for example a docker instance). That you could tear down and be sure that everything is gone.
Long story short: I think there is no way to absolutely control the threads created in a thread. If you need that level of control, you need to look into "outsourcing" those calls.
You can use an ExecutorService for this. It lets you submit tasks (e.g. a Runnable) and executes them in parallel. After you call shutdown(), awaitTermination() lets you wait up to a timeout, and shutdownNow() then interrupts any workers that have not finished by that time. An example would look like this:
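ExecutorService executor = Executors.newFixedThreadPool(1);
executor.execute(() -> {
    // logic to call the method of the third-party JAR
});
// other business logic
executor.shutdown(); // stop accepting new tasks so awaitTermination can return early
executor.awaitTermination(1, TimeUnit.MINUTES); // wait up to a minute for the task
executor.shutdownNow(); // interrupt the task if it is still running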
TimeUnit is an enum with values like SECONDS, MINUTES, HOURS etc. (here's javadoc), so you can configure different time units. A couple of points:
Once shutdown or shutdownNow is called, no new tasks are accepted (i.e. you can't call execute or submit). shutdownNow additionally interrupts the running tasks. So we wait up to a minute for the task to complete, and if it has not, we interrupt it. This is not a kill: a task that never checks for interruption, which may well be the case for third-party code that is stuck, keeps running.
awaitTermination throws InterruptedException if the calling thread is interrupted while waiting, so you will have to wrap it in a try-catch block.
Here's javadoc for Executor.

Is Thread to be favoured over Executor here?

As far as I understand Executors help handling the execution of runnables. E.g. I would choose using an executor when I have several worker threads that do their job and then terminate.
The executor would handle the creation and the termination of the Threads needed to execute the worker runnables.
However now I am facing another situation. A fixed number of classes/objects shall encapsulate their own thread. So the thread is started at the creation of those objects and the Thread shall continue running for the whole life time of these objects.
These few objects in turn are created at the start of the program and exist for the whole run time.
I guess Threads are preferable over Executors in this situation, however when I read the internet everybody seems to suggest using Executors over Threads in any possible situation.
Can somebody please tell me if I want to choose Executors or Threads here and why?
Thanks
You're somewhat mixing things. Executor is just an interface. Thread is a core class. There's nothing which directly implies that Executor implementations execute tasks in separate threads.
Read the first few lines of the JavaDoc.
Executor
So if you want full control, just use Thread and do things on your own.
Without knowing more about the context, it's hard to give a good answer, but generally speaking I'd say that the situations that call for using Thread are pretty few and far between. If you start trying to synchronize your program "manually" using synchronized, I bet things will get out of hand quickly. (Not to mention how hard it will be to debug the code.)
Last time I used a thread was when I wanted to record some audio in the background. It was a "start"/"stop" kind of thing, and not "task oriented". (I tried long and hard to try to find an audio library that would encapsulate that for me but failed.)
If you choose to go for a thread solution, I suggest you limit the scope of the thread so that it only executes within the associated object. As far as possible, this saves you from having to think about happens-before relations, thread-safe publishing of values etc. throughout the code.
ExecutorService can have a thread pool: it optimizes performance, because creating a Thread is expensive.
ExecutorService has life cycle control: shutdown(), shutdownNow() etc. are provided.
ExecutorService is flexible: you can get a variety of behaviors by customizing the ThreadFactory, setting the thread pool size, scheduling delayed execution with ScheduledThreadPoolExecutor, etc.
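As an illustration of the ThreadFactory point, here is a sketch of a factory that names its threads and marks them as daemons. The class name NamedThreadFactory and the naming scheme are assumptions made for this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger();

    NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        // Recognizable names make thread dumps much easier to read.
        Thread t = new Thread(r, prefix + "-" + counter.incrementAndGet());
        t.setDaemon(true); // pool threads will not keep the JVM alive
        return t;
    }

    /** Convenience: a fixed-size pool whose threads come from this factory. */
    static ExecutorService newPool(String prefix, int size) {
        return Executors.newFixedThreadPool(size, new NamedThreadFactory(prefix));
    }
}
```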

Trouble using FutureTask for Asynchronous procedures

In my Java web app I have a method which sends out about 200 emails. Because of email server delay the whole process takes about 7 minutes. This bulk email sending has to take place as the result of a user action. I of course don't want the user to have to wait that long before they are forwarded to the next page, not to mention that Apache times out anyway, so I am attempting to use FutureTask to run the process in a separate thread while proceeding with the rest of the code, like this:
// some code
Runnable r = new sendEmails(ids);
FutureTask<Void> task = new FutureTask<>(r, null);
Thread t = new Thread(task);
t.start();
// some more code
The app, however, still waits for the FutureTask to finish before proceeding. I am open to the idea that this is also not the best way to run some code on the side in another thread while continuing with the rest of the script. Are there better ways? How do I make this one work?
It looks like you are spinning up 200+ threads in a for loop. That places a heavy burden on the machine: because a stack is allocated for each thread, it does not take too many threads before the JVM runs low on memory, which first causes heavy GC and JVM stalls and then, under high enough load, potentially a crash.
Sadly this may or may not explain why your code is waiting for the FutureTasks to complete. It may only appear to be waiting due to thrashing from creating/scheduling so many threads; but then again it may not. There could very well be something else synchronizing your code that has been cut out of the snippet above.
A way for you to find out whether there is a tricksy synchronisation hiding somewhere would be to hit ctrl-break while running the code (assuming that you are running from a command line; IntelliJ and Eclipse both have a handy stack dump icon). This will cause a stack dump for every thread in the system to appear. By doing this you will be able to find the user thread that is waiting for the future tasks to complete, and it will say which monitor it is waiting on. If it is not waiting, then you have a different problem, for example the system thrashing from creating so many threads in short order that it appears to lock up for a short period of time.
But first I would avoid the excessive Thread creation part, as that could be masking the issue. I suggest using code similar to the following:
ExecutorService scheduler = Executors.newCachedThreadPool();
scheduler.submit(task);
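Put together, handing the whole batch to one shared pool could be sketched as follows. The class BackgroundMailer, the pool size of 4 and the Consumer standing in for the real mail call are all assumptions for illustration; the original sendEmails class is not shown in the question.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

final class BackgroundMailer {
    // One small shared pool for the whole web app instead of new threads per request.
    private static final ExecutorService MAIL_POOL = Executors.newFixedThreadPool(4);

    /** Queues the whole batch and returns immediately; the Future yields the number sent. */
    static Future<Integer> sendAll(List<String> recipients, Consumer<String> sender) {
        return MAIL_POOL.submit(() -> {
            int sent = 0;
            for (String recipient : recipients) {
                sender.accept(recipient); // stands in for the real, slow mail call
                sent++;
            }
            return sent;
        });
    }

    /** Call when the application shuts down, e.g. from a context listener. */
    static void shutdown() {
        MAIL_POOL.shutdown();
    }
}
```

The request thread only pays for the submit call, so the user can be forwarded right away; call get() on the Future only if the result is actually needed.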

How do I suspend java threads on demand?

I am working on a multithreaded game in java. I have several worker threads that fetch modules from a central thread manager, which then executes it on its own. Now I would like to be able to pause such a thread if it temporarily has nothing to execute. I have tried calling the wait() method on it from the thread manager, but that only resulted in it ignoring the notify() call that followed it.
I googled a bit on it too, only finding that most sites refer to functions like suspend(), pause(), etc., which are now marked as deprecated on the Java documentation pages.
So in general, what is the way to pause or continue a thread on demand?
You can use an if block in the thread with a sentinel variable that is set to false if you want to halt the thread's action. This works best if the thread is performing loops.
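A sketch of the flag approach for a looping worker. The class name StoppableWorker is hypothetical, and the loop body is a placeholder for real work.

```java
final class StoppableWorker implements Runnable {
    // volatile so that a write from another thread becomes visible to the worker.
    private volatile boolean running = true;

    /** Asks the worker to finish after its current iteration. */
    void stop() {
        running = false;
    }

    @Override
    public void run() {
        while (running) {
            // One unit of real work would go here.
            Thread.yield();
        }
    }
}
```

The flag is only checked between iterations, so a worker that blocks inside an iteration will not notice it until that call returns.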
Maybe I'm missing the point, but if they have nothing to do, why not just let them die? Then spawn a new thread when you have work for one to do again.
It sounds to me like you're trying to have the conversation both ways. In my (humble) opinion, you should either have the worker threads responsible for asking the central thread manager for work (or 'modules'), or you should have the central thread manager responsible for doling out work and kicking off the worker threads.
What it sounds like is that most of the time the worker threads are responsible for asking for work. Then, sometimes, the responsibility flips round to the thread manager to tell the workers not to ask for a while. I think the system will stay simpler if this responsibility stays on only one side.
So, given this, and with my limited knowledge of what you're developing, I would suggest either:
Have the thread manager kick off worker threads when there's stuff to do and keep track of their progress, letting them die when they're done and only creating new ones when there's new stuff to do. Or
Have a set number of always existing worker threads that poll the thread manager for work and (if there isn't any) sleep for a period of time using Thread.sleep() before trying again. This seems pretty wasteful to me so I would lean towards option 1 unless you've a good reason not to?
In the grand tradition of not answering your question and suggesting that You Are Doing It Wrong, I offer this :-)
Maybe you should refactor your code to use an ExecutorService; it's a rather good design.
http://download.oracle.com/javase/6/docs/api/java/util/concurrent/ExecutorService.html
There are many ways to do this, but in the commonest (IMO), the worker thread calls wait() on the work queue, while the work generator should call notify(). This causes the worker thread to stop, without the thread manager doing anything. See e.g. this article on thread pools and work queues.
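A minimal sketch of such a work queue follows; the class name WorkQueue is made up. The important detail is that wait() is called by the worker thread itself, on the queue's monitor. Calling wait() on a worker's Thread object from the manager makes the manager wait, not the worker, which may be why the original attempt appeared to ignore notify().

```java
import java.util.ArrayDeque;
import java.util.Queue;

final class WorkQueue<T> {
    private final Queue<T> items = new ArrayDeque<>();

    /** Called by the work generator. */
    synchronized void put(T item) {
        items.add(item);
        notifyAll(); // wake any worker waiting in take()
    }

    /** Called by a worker; blocks until an item is available. */
    synchronized T take() throws InterruptedException {
        // Loop, not if: wait() can return spuriously or after another worker took the item.
        while (items.isEmpty()) {
            wait();
        }
        return items.remove();
    }
}
```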
Use a blocking queue and fetch those modules with take(), or with poll(time, unit) for a timed wait so you can shut down cleanly. Both block the current thread until a module is available (or, for poll, until the timeout expires).
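With java.util.concurrent this could be sketched as below. ModuleWorker, the 100 ms poll interval and the use of Runnable as the module type are assumptions for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class ModuleWorker implements Runnable {
    private final BlockingQueue<Runnable> modules = new LinkedBlockingQueue<>();
    private volatile boolean shuttingDown;

    /** Called by the thread manager to hand over work. */
    void offer(Runnable module) {
        modules.add(module);
    }

    /** Asks the worker to exit once the queue has been drained. */
    void shutdown() {
        shuttingDown = true;
    }

    @Override
    public void run() {
        try {
            while (true) {
                // Sleeps while idle, but wakes at least every 100 ms to check the flag.
                Runnable module = modules.poll(100, TimeUnit.MILLISECONDS);
                if (module != null) {
                    module.run();
                } else if (shuttingDown) {
                    return;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // treat interruption as a stop request
        }
    }
}
```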

what would make a single task executor stop processing tasks?

I'm using a java.util.concurrent.ExecutorService that I obtained by calling Executors.newSingleThreadExecutor(). This ExecutorService can sometimes stop processing tasks, even though it has not been shutdown and continues to accept new tasks without throwing exceptions. Eventually, it builds up enough of a queue that my app shuts down with OutOfMemoryError exceptions.
The documentation seems to indicate that this single thread executor should survive task processing errors by firing up a new worker thread if necessary to replace one that has died. Am I missing something?
It sounds like you have two different issues:
1) You're over-feeding the work queue. You can't just keep stuffing new tasks into the queue, with no regard for the consumption rate of the task executors. You need to figure out some logic for knowing when to block new additions to the work queue.
2) Any uncaught exception in a task's thread can completely kill the thread. When that happens, the ExecutorService spins up a new thread to replace it. But that doesn't mean you can ignore whatever problem is causing the thread to die in the first place! Find those uncaught exceptions and catch them!
This is just a hunch (cuz there's not enough info in your post to know otherwise), but I don't think your problem is that the task executor stops processing tasks. My guess is that it just doesn't process tasks as fast as you're creating them. (And the fact that your tasks sometimes die prematurely is probably orthogonal to the problem.)
At least, that's been my experience working with thread pools and task executors.
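One way to get that blocking behaviour, sketched here under the assumption that it is acceptable for the submitting thread to occasionally run a task itself, is a single-threaded ThreadPoolExecutor with a bounded queue and CallerRunsPolicy. The class name BoundedExecutors is hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedExecutors {
    /** A single-worker executor whose queue holds at most queueCapacity pending tasks. */
    static ExecutorService newBoundedSingleThreadExecutor(int queueCapacity) {
        return new ThreadPoolExecutor(
                1, 1,                       // exactly one worker thread
                0L, TimeUnit.MILLISECONDS,  // keep-alive is irrelevant for a fixed size
                new ArrayBlockingQueue<>(queueCapacity),
                // When the queue is full, the submitting thread runs the task itself,
                // which slows producers down instead of growing the queue without bound.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }
}
```

Unlike Executors.newSingleThreadExecutor(), tasks run by the caller are not serialized with the worker thread, so this is only suitable if the tasks do not rely on strictly sequential execution.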
Okay, here's another possibility that sounds feasible based on your comment (that everything will run smoothly for hours until suddenly coming to a crashing halt)...
You might have a rare deadlock between your task threads. Most of the time, you get lucky, and the deadlock doesn't manifest itself. But occasionally, two or more of your task threads get into a state where they're waiting for the release of a lock held by the other thread. At that point, no more task processing can take place, and your work queue will pile up until you get the OutOfMemoryError.
Here's how I'd diagnose that problem:
Eliminate ALL shared state between your task threads. At first, this might require each task thread making a defensive copy of all shared data structures it requires. Once you've done that, it should be completely impossible to experience a deadlock.
At this point, gradually reintroduce the shared data structures, one at a time (with appropriate synchronization). Re-run your application after each tiny modification to test for the deadlock. When you get that crashing situation again, take a close look at the access patterns for the shared resource and determine whether you really need to share it.
As for me, whenever I write code that processes parallel tasks with thread pools and executors, I always try to eliminate ALL shared state between those tasks. As far as the application is concerned, they may as well be completely autonomous applications. Hunting down deadlocks is a drag, and in my experience, the best way to eliminate deadlocks is for each thread to have its own local state rather than sharing any state with other task threads.
Good luck!
My guess would be that your tasks are blocking indefinitely, rather than dying. Do you have evidence, such as a log statement at the end of your task, suggesting that your tasks are successfully completing?
This could be a deadlock, or an interaction with some external process that is blocking.
Although you don't leave enough detail to be sure, the first thing I'd try is to have your tasks catch "Exception" at the top level and log the message.
I know it doesn't seem right, but occasionally (depending on a lot of variables) I've worked on code where stuff happening in a thread throws an exception and it is never logged, or it just doesn't show up on the console, yet the "executing" code exits out of its top level loop or whatever code is causing your task to run.
I guess I'm just saying, make sure your tasks are not throwing an exception out.
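A sketch of such a top-level catch, written as a wrapper so existing tasks do not need to change. LoggingTask is a hypothetical name, and java.util.logging stands in for whatever logging the application uses. This matters especially with submit(): an exception thrown by a submitted task is stored in its Future and never shows up anywhere unless someone calls get().

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class LoggingTask implements Runnable {
    private static final Logger LOG = Logger.getLogger(LoggingTask.class.getName());
    private final Runnable delegate;

    LoggingTask(Runnable delegate) {
        this.delegate = delegate;
    }

    @Override
    public void run() {
        try {
            delegate.run();
        } catch (Exception e) {
            // Log instead of letting the exception vanish or kill the worker thread.
            LOG.log(Level.SEVERE, "Task failed", e);
        }
    }
}
```

Usage would be along the lines of executor.submit(new LoggingTask(realTask)).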
