How Do You Test Method For Thread Safety With JUnit [duplicate] - java
I have thus far avoided the nightmare that is testing multi-threaded code since it just seems like too much of a minefield. I'd like to ask how people have gone about testing code that relies on threads for successful execution, or just how people have gone about testing those kinds of issues that only show up when two threads interact in a given manner?
This seems like a really key problem for programmers today, it would be useful to pool our knowledge on this one imho.
Look, there's no easy way to do this. I'm working on a project that is inherently multithreaded. Events come in from the operating system and I have to process them concurrently.
The simplest way to deal with testing complex, multithreaded application code is this: If it's too complex to test, you're doing it wrong. If you have a single instance that has multiple threads acting upon it, and you can't test situations where these threads step all over each other, then your design needs to be redone. It's both as simple and as complex as this.
There are many ways to program for multithreading that avoid having threads run through the same instances at the same time. The simplest is to make all your objects immutable. Of course, that's not usually possible. So you have to identify those places in your design where threads interact with the same instance and reduce the number of those places. By doing this, you isolate a few classes where multithreading actually occurs, reducing the overall complexity of testing your system.
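As a minimal sketch of the immutability option (the class `ImmutablePoint` and its methods are hypothetical names invented for illustration, not from any library), an object whose fields are all final can be handed to any number of threads without locking:

```java
// Hypothetical example: an immutable value object that is safe to share
// between threads without any locking.
final class ImmutablePoint {
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    // "Mutation" returns a new instance; the original never changes.
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }

    public static void main(String[] args) {
        ImmutablePoint p = new ImmutablePoint(1, 2);
        ImmutablePoint q = p.withX(10);
        System.out.println(p.x() + "," + p.y() + " -> " + q.x() + "," + q.y());
    }
}
```

Because no thread can change an existing instance, such a class can be tested like ordinary single-threaded code.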
But you have to realize that even by doing this, you still can't test every situation where two threads step on each other. To do that, you'd have to run two threads concurrently in the same test, then control exactly what lines they are executing at any given moment. The best you can do is simulate this situation. But this might require you to code specifically for testing, and that's at best a half step towards a true solution.
Probably the best way to test code for threading issues is through static analysis of the code. If your threaded code doesn't follow a finite set of thread safe patterns, then you might have a problem. I believe Code Analysis in VS does contain some knowledge of threading, but probably not much.
Look, as things stand currently (and probably will stand for a good time to come), the best way to test multithreaded apps is to reduce the complexity of threaded code as much as possible. Minimize areas where threads interact, test as best as possible, and use code analysis to identify danger areas.
It's been a while since this question was posted, but it's still not answered ...
kleolb02's answer is a good one. I'll try going into more details.
There is a way, which I practice for C# code. For unit tests you should be able to program reproducible tests, which is the biggest challenge in multithreaded code. So my answer aims toward forcing asynchronous code into a test harness, which works synchronously.
It's an idea from Gerard Meszaros's book "xUnit Test Patterns" and is called "Humble Object" (p. 695): You have to separate the core logic code from anything that smells like asynchronous code. This results in a class for the core logic that works synchronously.
This puts you into the position to test the core logic code in a synchronous way. You have absolute control over the timing of the calls you are doing on the core logic and thus can make reproducible tests. And this is your gain from separating core logic and asynchronous logic.
This core logic needs to be wrapped by another class, which is responsible for receiving calls to the core logic asynchronously and delegating these calls to the core logic. Production code will only access the core logic via that class. Because this class should only delegate calls, it's a very "dumb" class without much logic. So you can keep your unit tests for this asynchronously working class to a minimum.
Anything above that (testing interaction between classes) is a component test. In this case too, you should be able to have absolute control over timing if you stick to the "Humble Object" pattern.
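The answer above is about C#, but the same split can be sketched in Java. This is only an illustration under assumptions: `PriceCalculator`, `AsyncPriceCalculator` and `HumbleObjectDemo` are hypothetical names, and a single-thread `ExecutorService` stands in for whatever asynchronous mechanism the production code really uses.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Core logic: plain synchronous code with no knowledge of threads.
final class PriceCalculator {
    int totalInCents(int unitPriceInCents, int quantity) {
        if (quantity < 0) {
            throw new IllegalArgumentException("quantity must not be negative");
        }
        return unitPriceInCents * quantity;
    }
}

// Humble object: hands each call to another thread and delegates, nothing more.
final class AsyncPriceCalculator {
    private final PriceCalculator core = new PriceCalculator();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    Future<Integer> totalInCents(int unitPriceInCents, int quantity) {
        return executor.submit(() -> core.totalInCents(unitPriceInCents, quantity));
    }

    void shutdown() {
        executor.shutdown();
    }
}

final class HumbleObjectDemo {
    // Thin smoke test of the wrapper: one call, wait for the result.
    static int computeViaWrapper(int unitPriceInCents, int quantity) {
        AsyncPriceCalculator async = new AsyncPriceCalculator();
        try {
            return async.totalInCents(unitPriceInCents, quantity).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("asynchronous call failed", e);
        } finally {
            async.shutdown();
        }
    }

    public static void main(String[] args) {
        // The core logic is exercised synchronously and reproducibly.
        System.out.println(new PriceCalculator().totalInCents(250, 4));
        System.out.println(computeViaWrapper(250, 4));
    }
}
```

All interesting cases (negative quantities, boundary values and so on) would be tested against `PriceCalculator` directly; the wrapper only needs enough testing to show that it delegates.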
Tough one indeed! In my (C++) unit tests, I've broken this down into several categories along the lines of the concurrency pattern used:
1. Unit tests for classes that operate in a single thread and aren't thread aware -- easy, test as usual.
2. Unit tests for Monitor objects (those that execute synchronized methods in the callers' thread of control) that expose a synchronized public API -- instantiate multiple mock threads that exercise the API. Construct scenarios that exercise internal conditions of the passive object. Include one longer running test that basically beats the heck out of it from multiple threads for a long period of time. This is unscientific I know but it does build confidence.
3. Unit tests for Active objects (those that encapsulate their own thread or threads of control) -- similar to #2 above with variations depending on the class design. Public API may be blocking or non-blocking, callers may obtain futures, data may arrive at queues or need to be dequeued. There are many combinations possible here; white box away. Still requires multiple mock threads to make calls to the object under test.
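The answer above concerns C++, but the "beat the heck out of it" test for a monitor object might look roughly like this in Java. It is a sketch only: `SynchronizedCounter` is a hypothetical monitor object, and the thread and iteration counts are arbitrary.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical monitor object: every public method is synchronized.
final class SynchronizedCounter {
    private long value;

    synchronized void increment() { value++; }
    synchronized long get() { return value; }
}

final class MonitorStressTest {
    // Hammers the counter from several threads and returns the final value.
    static long hammer(int threadCount, int incrementsPerThread) {
        SynchronizedCounter counter = new SynchronizedCounter();
        CountDownLatch startGate = new CountDownLatch(1);
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                try {
                    startGate.await(); // hold every worker until all are started
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int n = 0; n < incrementsPerThread; n++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        startGate.countDown(); // release all workers at once to maximise contention
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(hammer(8, 100_000));
    }
}
```

If the `synchronized` keywords were removed, the final value would usually (but not always) fall short of `threadCount * incrementsPerThread`, which is why such a test builds confidence rather than proving anything.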
As an aside:
In internal developer training that I do, I teach the Pillars of Concurrency and these two patterns as the primary framework for thinking about and decomposing concurrency problems. There are obviously more advanced concepts out there, but I've found that this set of basics helps keep engineers out of the soup. It also leads to code that is more unit testable, as described above.
I have faced this issue several times in recent years when writing thread handling code for several projects. I'm providing a late answer because most of the other answers, while providing alternatives, do not actually answer the question about testing. My answer is addressed to the cases where there is no alternative to multithreaded code; I do cover code design issues for completeness, but also discuss unit testing.
Writing testable multithreaded code
The first thing to do is to separate your production thread handling code from all the code that does actual data processing. That way, the data processing can be tested as singly threaded code, and the only thing the multithreaded code does is to coordinate threads.
The second thing to remember is that bugs in multithreaded code are probabilistic; the bugs that manifest themselves least frequently are the bugs that will sneak through into production, will be difficult to reproduce even in production, and will thus cause the biggest problems. For this reason, the standard coding approach of writing the code quickly and then debugging it until it works is a bad idea for multithreaded code; it will result in code where the easy bugs are fixed and the dangerous bugs are still there.
Instead, when writing multithreaded code, you must write the code with the attitude that you are going to avoid writing the bugs in the first place. If you have properly removed the data processing code, the thread handling code should be small enough - preferably a few lines, at worst a few dozen lines - that you have a chance of writing it without writing a bug, and certainly without writing many bugs, if you understand threading, take your time, and are careful.
Writing unit tests for multithreaded code
Once the multithreaded code is written as carefully as possible, it is still worthwhile writing tests for that code. The primary purpose of the tests is not so much to test for highly timing dependent race condition bugs - it's impossible to test for such race conditions repeatably - but rather to test that your locking strategy for preventing such bugs allows for multiple threads to interact as intended.
To properly test correct locking behavior, a test must start multiple threads. To make the test repeatable, we want the interactions between the threads to happen in a predictable order. We don't want to externally synchronize the threads in the test, because that will mask bugs that could happen in production where the threads are not externally synchronized. That leaves the use of timing delays for thread synchronization, which is the technique that I have used successfully whenever I've had to write tests of multithreaded code.
If the delays are too short, then the test becomes fragile, because minor timing differences - say between different machines on which the tests may be run - may cause the timing to be off and the test to fail. What I've typically done is start with delays that cause test failures, increase the delays so that the test passes reliably on my development machine, and then double the delays beyond that so the test has a good chance of passing on other machines. This does mean that the test will take a macroscopic amount of time, though in my experience, careful test design can limit that time to no more than a dozen seconds. Since you shouldn't have very many places requiring thread coordination code in your application, that should be acceptable for your test suite.
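A delay-driven test of this kind could be sketched as follows. This is an illustration, not the author's code: a `ReentrantLock` stands in for the locking strategy under test, and `DelayDrivenLockTest`, `observedOrder` and the event names are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.locks.ReentrantLock;

final class DelayDrivenLockTest {
    // Runs two threads whose interaction is ordered only by delays and
    // returns the order in which the events were observed.
    static List<String> observedOrder(long delayMs) {
        ReentrantLock lock = new ReentrantLock(); // stands in for the code under test
        List<String> events = new CopyOnWriteArrayList<>();

        Thread first = new Thread(() -> {
            lock.lock();
            try {
                events.add("first acquired");
                pause(2 * delayMs); // keep the lock while the second thread arrives
                events.add("first releasing");
            } finally {
                lock.unlock();
            }
        });
        Thread second = new Thread(() -> {
            pause(delayMs); // delay so that the first thread already holds the lock
            lock.lock();
            try {
                events.add("second acquired");
            } finally {
                lock.unlock();
            }
        });

        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return events;
    }

    private static void pause(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(observedOrder(200));
    }
}
```

The test passes only if the second thread was kept out until the first released the lock. The value of `delayMs` is exactly the quantity that has to be tuned and then doubled, as described above.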
Finally, keep track of the number of bugs caught by your test. If your test has 80% code coverage, it can be expected to catch about 80% of your bugs. If your test is well designed but finds no bugs, there's a reasonable chance that you don't have additional bugs that will only show up in production. If the test catches one or two bugs, you might still get lucky. Beyond that, and you may want to consider a careful review of or even a complete rewrite of your thread handling code, since it is likely that code still contains hidden bugs that will be very difficult to find until the code is in production, and very difficult to fix then.
I also had serious problems testing multi-threaded code. Then I found a really cool solution in "xUnit Test Patterns" by Gerard Meszaros. The pattern he describes is called Humble Object.
Basically it describes how you can extract the logic into a separate, easy-to-test component that is decoupled from its environment. After you have tested this logic, you can test the complicated behaviour (multi-threading, asynchronous execution, etc...)
There are a few tools around that are quite good. Here is a summary of some of the Java ones.
Some good static analysis tools include FindBugs (gives some useful hints), JLint, Java Pathfinder (JPF & JPF2), and Bogor.
MultithreadedTC is quite a good dynamic analysis tool (integrated into JUnit) where you have to set up your own test cases.
ConTest from IBM Research is interesting. It instruments your code by inserting all kinds of thread modifying behaviours (e.g. sleep & yield) to try to uncover bugs randomly.
SPIN is a really cool tool for modelling your Java (and other) components, but you need to have some useful framework. It is hard to use as is, but extremely powerful if you know how to use it. Quite a few tools use SPIN underneath the hood.
MultithreadedTC is probably the most mainstream, but some of the static analysis tools listed above are definitely worth looking at.
Awaitility can also be useful to help you write deterministic unit tests. It allows you to wait until some state somewhere in your system is updated. For example:
await().untilCall( to(myService).myMethod(), greaterThan(3) );
or
await().atMost(5,SECONDS).until(fieldIn(myObject).ofType(int.class), equalTo(1));
It also has Scala and Groovy support.
await until { something() > 4 } // Scala example
Another way to (kinda) test threaded code, and very complex systems in general is through Fuzz Testing.
It's not great, and it won't find everything, but it's likely to be useful and it's simple to do.
Quote:
Fuzz testing or fuzzing is a software testing technique that provides random data("fuzz") to the inputs of a program. If the program fails (for example, by crashing, or by failing built-in code assertions), the defects can be noted. The great advantage of fuzz testing is that the test design is extremely simple, and free of preconceptions about system behavior.
...
Fuzz testing is often used in large software development projects that employ black box testing. These projects usually have a budget to develop test tools, and fuzz testing is one of the techniques which offers a high benefit to cost ratio.
...
However, fuzz testing is not a substitute for exhaustive testing or formal methods: it can only provide a random sample of the system's behavior, and in many cases passing a fuzz test may only demonstrate that a piece of software handles exceptions without crashing, rather than behaving correctly. Thus, fuzz testing can only be regarded as a bug-finding tool rather than an assurance of quality.
Testing MT code for correctness is, as already stated, quite a hard problem. In the end it boils down to ensuring that there are no data races caused by incorrect synchronisation in your code. The problem with this is that there are infinitely many possibilities of thread execution (interleavings) over which you do not have much control (be sure to read this article, though). In simple scenarios it might be possible to actually prove correctness by reasoning, but this is usually not the case, especially if you want to avoid or minimize synchronization and not go for the most obvious and easiest synchronization option.
An approach that I follow is to write highly concurrent test code in order to make potentially undetected data races likely to occur. And then I run those tests for some time :) I once stumbled upon a talk where some computer scientists were showing off a tool that kind of does this (randomly devising tests from specs and then running them wildly, concurrently, checking for the defined invariants to be broken).
By the way, I think this aspect of testing MT code has not been mentioned here: identify invariants of the code that you can check for randomly. Unfortunately, finding those invariants is quite a hard problem, too. Also, they might not hold all the time during execution, so you have to find or enforce execution points where you can expect them to be true. Bringing the code execution to such a state is also a hard problem (and might itself incur concurrency issues). Whew, it's damn hard!
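A rough sketch of checking an invariant under concurrent load follows. Everything here is hypothetical: `TwoAccounts` is an invented class whose invariant is that the two balances always sum to 2000, and the synchronized `total()` method is the "execution point" at which the invariant is expected to hold.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical class under test: two balances guarded by one lock.
final class TwoAccounts {
    private long a = 1_000;
    private long b = 1_000;

    synchronized void transferFromAToB(long amount) {
        a -= amount;
        b += amount;
    }

    // Check point: both balances are read under the same lock as the writes.
    synchronized long total() {
        return a + b;
    }
}

final class InvariantStressTest {
    // Returns true if a + b == 2000 at every sample taken during the run.
    static boolean invariantHolds(int writerThreads, int transfersPerThread) {
        TwoAccounts accounts = new TwoAccounts();
        AtomicBoolean violated = new AtomicBoolean(false);
        AtomicBoolean done = new AtomicBoolean(false);

        // The checker samples the invariant continuously while writers run.
        Thread checker = new Thread(() -> {
            while (!done.get()) {
                if (accounts.total() != 2_000) {
                    violated.set(true);
                }
            }
        });
        Thread[] writers = new Thread[writerThreads];
        for (int i = 0; i < writerThreads; i++) {
            writers[i] = new Thread(() -> {
                for (int n = 0; n < transfersPerThread; n++) {
                    // Random amounts in [-50, 50] move money in both directions.
                    accounts.transferFromAToB(ThreadLocalRandom.current().nextLong(-50, 51));
                }
            });
        }
        checker.start();
        for (Thread writer : writers) {
            writer.start();
        }
        try {
            for (Thread writer : writers) {
                writer.join();
            }
            done.set(true);
            checker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return !violated.get() && accounts.total() == 2_000;
    }

    public static void main(String[] args) {
        System.out.println(invariantHolds(4, 100_000));
    }
}
```

A passing run does not prove the class correct; it only means that no violation was observed among the interleavings that happened to occur.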
Some interesting links to read:
Deterministic interleaving: A framework that allows you to force certain thread interleavings and then check for invariants
jMock Blitzer: Stress test synchronization
assertConcurrent: JUnit version of stress testing synchronization
Testing concurrent code : Short overview of the two primary methods of brute force (stress test) or deterministic (going for the invariants)
I've done a lot of this, and yes it sucks.
Some tips:
GroboUtils for running multiple test threads
alphaWorks ConTest to instrument classes to cause interleavings to vary between iterations
Create a throwable field and check it in tearDown (see Listing 1). If you catch a bad exception in another thread, just assign it to throwable.
I created the utils class in Listing 2 and have found it invaluable, especially waitForVerify and waitForCondition, which will greatly increase the performance of your tests.
Make good use of AtomicBoolean in your tests. It is thread safe, and you'll often need a final reference type to store values from callback classes and suchlike. See example in Listing 3.
Make sure to always give your test a timeout (e.g., @Test(timeout=60*1000)), as concurrency tests can sometimes hang forever when they're broken.
Listing 1:
@After
public void tearDown() throws Throwable {
    // Rethrow any failure that a worker thread stored in the throwable field.
    if ( throwable != null )
        throw throwable;
}
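Listing 1 shows only the checking side. How a worker thread might fill the field can be sketched as below. This is an assumption-laden illustration: `WorkerFailureCapture` and its methods are hypothetical, and an `AtomicReference` is used in place of a plain field so that the value is safely visible across threads.

```java
import java.util.concurrent.atomic.AtomicReference;

final class WorkerFailureCapture {
    // Shared slot; the first failure seen on any worker thread wins.
    private final AtomicReference<Throwable> throwable = new AtomicReference<>();

    // Runs the task on a separate thread and records anything it throws.
    void runOnWorker(Runnable task) {
        Thread worker = new Thread(() -> {
            try {
                task.run();
            } catch (Throwable t) {
                throwable.compareAndSet(null, t);
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
    }

    // What a tearDown method would inspect and rethrow.
    Throwable failure() {
        return throwable.get();
    }

    public static void main(String[] args) {
        WorkerFailureCapture capture = new WorkerFailureCapture();
        capture.runOnWorker(() -> { throw new IllegalStateException("boom"); });
        System.out.println(capture.failure());
    }
}
```

Without something like this, an assertion that fails on a worker thread only kills that thread, and the test itself still passes.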
Listing 2:
import static org.junit.Assert.fail;
import java.io.File;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Random;
import org.apache.commons.collections.Closure;
import org.apache.commons.collections.Predicate;
import org.apache.commons.lang.time.StopWatch;
import org.easymock.EasyMock;
import org.easymock.classextension.internal.ClassExtensionHelper;
import static org.easymock.classextension.EasyMock.*;
import ca.digitalrapids.io.DRFileUtils;
/**
* Various utilities for testing
*/
public abstract class DRTestUtils
{
static private Random random = new Random();
/** Calls {@link #waitForCondition(Integer, Integer, Predicate, String)} with
* default max wait and check period values.
*/
static public void waitForCondition(Predicate predicate, String errorMessage)
throws Throwable
{
waitForCondition(null, null, predicate, errorMessage);
}
/** Blocks until a condition is true, throwing an {@link AssertionError} if
* it does not become true during a given max time.
* @param maxWait_ms max time to wait for true condition. Optional; defaults
* to 30 * 1000 ms (30 seconds).
* @param checkPeriod_ms period at which to try the condition. Optional; defaults
* to 100 ms.
* @param predicate the condition
* @param errorMessage message to use in the {@link AssertionError}
* @throws Throwable on {@link AssertionError} or any other exception/error
*/
static public void waitForCondition(Integer maxWait_ms, Integer checkPeriod_ms,
Predicate predicate, String errorMessage) throws Throwable
{
waitForCondition(maxWait_ms, checkPeriod_ms, predicate, new Closure() {
public void execute(Object errorMessage)
{
fail((String)errorMessage);
}
}, errorMessage);
}
/** Blocks until a condition is true, running a closure if
* it does not become true during a given max time.
* @param maxWait_ms max time to wait for true condition. Optional; defaults
* to 30 * 1000 ms (30 seconds).
* @param checkPeriod_ms period at which to try the condition. Optional; defaults
* to 100 ms.
* @param predicate the condition
* @param closure closure to run
* @param argument argument for closure
* @throws Throwable on {@link AssertionError} or any other exception/error
*/
static public void waitForCondition(Integer maxWait_ms, Integer checkPeriod_ms,
Predicate predicate, Closure closure, Object argument) throws Throwable
{
if ( maxWait_ms == null )
maxWait_ms = 30 * 1000;
if ( checkPeriod_ms == null )
checkPeriod_ms = 100;
StopWatch stopWatch = new StopWatch();
stopWatch.start();
while ( !predicate.evaluate(null) ) {
Thread.sleep(checkPeriod_ms);
if ( stopWatch.getTime() > maxWait_ms ) {
closure.execute(argument);
}
}
}
/** Calls {@link #waitForVerify(Integer, Object)} with <code>null</code>
* for {@code maxWait_ms}
*/
static public void waitForVerify(Object easyMockProxy)
throws Throwable
{
waitForVerify(null, easyMockProxy);
}
/** Repeatedly calls {@link EasyMock#verify(Object[])} until it succeeds, or a
* max wait time has elapsed.
* @param maxWait_ms Max wait time. <code>null</code> defaults to 30s.
* @param easyMockProxy Proxy to call verify on
* @throws Throwable
*/
static public void waitForVerify(Integer maxWait_ms, Object easyMockProxy)
throws Throwable
{
if ( maxWait_ms == null )
maxWait_ms = 30 * 1000;
StopWatch stopWatch = new StopWatch();
stopWatch.start();
for(;;) {
try
{
verify(easyMockProxy);
break;
}
catch (AssertionError e)
{
if ( stopWatch.getTime() > maxWait_ms )
throw e;
Thread.sleep(100);
}
}
}
/** Returns a path to a directory in the temp dir with the name of the given
* class. This is useful for temporary test files.
* @param object test class (or an instance of it) for which to create dir
* @return the path
*/
static public String getTestDirPathForTestClass(Object object)
{
String filename = object instanceof Class ?
((Class)object).getName() :
object.getClass().getName();
return DRFileUtils.getTempDir() + File.separator +
filename;
}
static public byte[] createRandomByteArray(int bytesLength)
{
byte[] sourceBytes = new byte[bytesLength];
random.nextBytes(sourceBytes);
return sourceBytes;
}
/** Returns <code>true</code> if the given object is an EasyMock mock object
*/
static public boolean isEasyMockMock(Object object) {
try {
InvocationHandler invocationHandler = Proxy
.getInvocationHandler(object);
return invocationHandler.getClass().getName().contains("easymock");
} catch (IllegalArgumentException e) {
return false;
}
}
}
Listing 3:
@Test
public void testSomething() {
    final AtomicBoolean called = new AtomicBoolean(false);
    subject.setCallback(new SomeCallback() {
        public void callback(Object arg) {
            // check arg here
            called.set(true);
        }
    });
    subject.run();
    assertTrue(called.get());
}
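Listing 3 works as written when the callback has fired by the time `subject.run()` returns. If the callback arrives later on another thread, the assertion can run too early. One possible way to handle that case, sketched with a `CountDownLatch` (the class `AsyncCallbackWait` and its `runAsync` method are hypothetical stand-ins for the subject, not part of the listing above):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class AsyncCallbackWait {
    // Hypothetical subject: invokes the callback on a different thread.
    static void runAsync(Runnable callback) {
        new Thread(callback).start();
    }

    // Returns true if the callback ran within the timeout.
    static boolean callbackWasCalled(long timeoutMs) {
        AtomicBoolean called = new AtomicBoolean(false);
        CountDownLatch finished = new CountDownLatch(1);
        runAsync(() -> {
            called.set(true);
            finished.countDown(); // signal the waiting test thread
        });
        try {
            // Wait for the signal instead of sleeping for a fixed time.
            boolean completed = finished.await(timeoutMs, TimeUnit.MILLISECONDS);
            return completed && called.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(callbackWasCalled(5_000));
    }
}
```

The latch returns as soon as the callback fires, so the timeout is only paid in full when the test is going to fail anyway.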
I handle unit tests of threaded components the same way I handle any unit test, that is, with inversion of control and isolation frameworks. I develop in the .Net-arena and, out of the box, the threading (among other things) is very hard (I'd say nearly impossible) to fully isolate.
Therefore, I've written wrappers that look something like this (simplified):
public interface IThread
{
void Start();
...
}
public class ThreadWrapper : IThread
{
private readonly Thread _thread;
public ThreadWrapper(ThreadStart threadStart)
{
_thread = new Thread(threadStart);
}
public void Start()
{
_thread.Start();
}
}
public interface IThreadingManager
{
IThread CreateThread(ThreadStart threadStart);
}
public class ThreadingManager : IThreadingManager
{
public IThread CreateThread(ThreadStart threadStart)
{
return new ThreadWrapper(threadStart);
}
}
From there, I can easily inject the IThreadingManager into my components and use my isolation framework of choice to make the thread behave as I expect during the test.
That has so far worked great for me, and I use the same approach for the thread pool, things in System.Environment, Sleep etc. etc.
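The wrappers above are C#. In Java, a comparable seam already exists in the standard `java.util.concurrent.Executor` interface, so the same idea can be sketched without custom wrapper types. `EventRecorder` is a hypothetical component invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;

// Hypothetical component that never creates threads itself; the caller
// decides how its work is scheduled by injecting an Executor.
final class EventRecorder {
    private final Executor executor;
    private final List<String> recorded = new ArrayList<>();

    EventRecorder(Executor executor) {
        this.executor = executor;
    }

    void record(String event) {
        executor.execute(() -> {
            synchronized (recorded) {
                recorded.add(event.toUpperCase());
            }
        });
    }

    List<String> recorded() {
        synchronized (recorded) {
            return new ArrayList<>(recorded);
        }
    }
}

final class InjectedExecutorDemo {
    public static void main(String[] args) {
        // Test double: run every task immediately on the calling thread.
        Executor sameThread = Runnable::run;
        EventRecorder recorder = new EventRecorder(sameThread);
        recorder.record("started");
        recorder.record("stopped");
        System.out.println(recorder.recorded());
    }
}
```

Production code would pass a real thread pool. A test passes `Runnable::run`, which makes the component's behaviour fully synchronous and therefore deterministic.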
Pete Goodliffe has a series on the unit testing of threaded code.
It's hard. I take the easier way out and try to keep the threading code abstracted from the actual test. Pete does mention that the way I do it is wrong but I've either got the separation right or I've just been lucky.
For Java, check out chapter 12 of JCIP (Java Concurrency in Practice). There are some concrete examples of writing deterministic, multi-threaded unit tests to at least test the correctness and invariants of concurrent code.
"Proving" thread-safety with unit tests is much dicier. My belief is that this is better served by automated integration testing on a variety of platforms/configurations.
Have a look at my related answer at
Designing a Test class for a custom Barrier
It's biased towards Java but has a reasonable summary of the options.
In summary though (IMO), it's not the use of some fancy framework that will ensure correctness but how you go about designing your multithreaded code. Splitting the concerns (concurrency and functionality) goes a huge way towards raising confidence. Growing Object-Oriented Software, Guided by Tests explains some options better than I can.
Static analysis and formal methods (see Concurrency: State Models and Java Programs) are an option, but I've found them to be of limited use in commercial development.
Don't forget that any load/soak style tests are rarely guaranteed to highlight problems.
Good luck!
I like to write two or more test methods to execute on parallel threads, and each of them make calls into the object under test. I've been using Sleep() calls to coordinate the order of the calls from the different threads, but that's not really reliable. It's also a lot slower because you have to sleep long enough that the timing usually works.
I found the Multithreaded TC Java library from the same group that wrote FindBugs. It lets you specify the order of events without using Sleep(), and it's reliable. I haven't tried it yet.
The biggest limitation to this approach is that it only lets you test the scenarios you suspect will cause trouble. As others have said, you really need to isolate your multithreaded code into a small number of simple classes to have any hope of thoroughly testing them.
Once you've carefully tested the scenarios you expect to cause trouble, an unscientific test that throws a bunch of simultaneous requests at the class for a while is a good way to look for unexpected trouble.
Update: I've played a bit with the Multithreaded TC Java library, and it works well. I've also ported some of its features to a .NET version I call TickingTest.
I just recently discovered (for Java) a tool called Threadsafe. It is a static analysis tool much like findbugs but specifically to spot multi-threading issues. It is not a replacement for testing but I can recommend it as part of writing reliable multi-threaded Java.
It even catches some very subtle potential issues around things like class subsumption, accessing unsafe objects through concurrent classes and spotting missing volatile modifiers when using the double checked locking paradigm.
If you write multithreaded Java give it a shot.
The following article suggests two solutions. The first wraps a semaphore (CountDownLatch) and adds functionality such as externalizing data from the internal thread. The other way of achieving this is to use a thread pool (see Points of Interest).
Sprinkler - Advanced synchronization object
I spent most of last week at a university library studying debugging of concurrent code. The central problem is concurrent code is non-deterministic. Typically, academic debugging has fallen into one of three camps here:
Event-trace/replay. This requires an event monitor and then reviewing the events that were sent. In a UT framework, this would involve manually sending the events as part of a test, and then doing post-mortem reviews.
Scriptable. This is where you interact with the running code with a set of triggers. "On x > foo, baz()". This could be interpreted into a UT framework where you have a run-time system triggering a given test on a certain condition.
Interactive. This obviously won't work in an automatic testing situation. ;)
Now, as above commentators have noticed, you can design your concurrent system into a more deterministic state. However, if you don't do that properly, you're just back to designing a sequential system again.
My suggestion would be to focus on having a very strict design protocol about what gets threaded and what doesn't get threaded. If you constrain your interface so that there are minimal dependencies between elements, it is much easier.
Good luck, and keep working on the problem.
I have had the unfortunate task of testing threaded code and they are definitely the hardest tests I have ever written.
When writing my tests, I used a combination of delegates and events. Basically it is all about using PropertyNotifyChanged events with a WaitCallback or some kind of ConditionalWaiter that polls.
I am not sure if this was the best approach, but it has worked out for me.
I assume that "multi-threaded" code here means something that is
stateful and mutable
AND accessed/modified by multiple threads concurrently
In other words, we are talking about testing a custom stateful thread-safe class/method/unit, which should be a very rare beast nowadays.
Because this beast is rare, first of all we need to make sure that there really is a valid excuse to write it.
Step 1. Consider modifying state in same synchronization context.
Today it is easy to write composable concurrent and asynchronous code in which I/O and other slow operations are offloaded to the background while shared state is updated and queried in a single synchronization context, e.g. async/await tasks and Rx in .NET. These are all testable by design: "real" Tasks and schedulers can be substituted to make testing deterministic (however, this is out of the scope of the question).
It may sound very constrained but this approach works surprisingly well. It is possible to write whole apps in this style without need to make any state thread-safe (I do).
Step 2. If manipulating shared state in a single synchronization context is absolutely not possible.
Make sure the wheel is not being reinvented, i.e. that there is definitely no standard alternative that can be adapted for the job. The code is likely to be very cohesive and contained within one unit; there is a good chance that it is a special case of some standard thread-safe data structure such as a hash map or a collection.
Note: if the code is large or spans multiple classes AND needs multi-threaded state manipulation, then there is a very high chance that the design is not good; reconsider Step 1.
Step 3. If this step is reached then we need to test our own custom stateful thread-safe class/method/unit.
I'll be dead honest: I have never had to write proper tests for such code. Most of the time I get away with Step 1, sometimes with Step 2. The last time I had to write custom thread-safe code was so many years ago that it was before I adopted unit testing, and I probably wouldn't have to write it with my current knowledge anyway.
If I really had to test such code (finally, the actual answer), then I would try a couple of the things below.
Non-deterministic stress testing. e.g. run 100 threads simultaneously and check that end result is consistent.
This is more typical for higher level / integration testing of multiple users scenarios but also can be used at the unit level.
Expose some test 'hooks' where test can inject some code to help make deterministic scenarios where one thread must perform operation before the other.
As ugly as it is, I can't think of anything better.
Delay-driven testing to make threads run and perform operations in a particular order. Strictly speaking, such tests are non-deterministic too (a system freeze or a stop-the-world GC pause can distort otherwise orchestrated delays). It is also ugly, but it avoids the need for hooks.
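Of the options above, the test-hook approach is the one that can make a race reproducible on every run. A sketch under assumptions follows: `UnsafeLazyValue` is a hypothetical class with a deliberate check-then-act bug, and `afterCheckHook` is an invented hook that does nothing in production.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class under test with a deliberate check-then-act race.
final class UnsafeLazyValue {
    private volatile Object value;
    private final AtomicInteger creations = new AtomicInteger();

    // Test hook: runs between the null check and the assignment.
    Runnable afterCheckHook = () -> {};

    Object get() {
        if (value == null) {           // check
            afterCheckHook.run();      // test-only pause point
            creations.incrementAndGet();
            value = new Object();      // act
        }
        return value;
    }

    int creations() {
        return creations.get();
    }
}

final class HookDrivenRaceTest {
    // Forces both threads past the null check before either one assigns.
    static int creationsWhenBothThreadsPassTheCheck() {
        UnsafeLazyValue lazy = new UnsafeLazyValue();
        CountDownLatch bothChecked = new CountDownLatch(2);
        lazy.afterCheckHook = () -> {
            bothChecked.countDown();
            try {
                bothChecked.await(); // wait until the other thread has checked too
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread t1 = new Thread(lazy::get);
        Thread t2 = new Thread(lazy::get);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return lazy.creations();
    }

    public static void main(String[] args) {
        System.out.println(creationsWhenBothThreadsPassTheCheck());
    }
}
```

With the hook in place, the value is created twice on every run, so the bug is demonstrated deterministically. The same test run against a properly synchronized `get()` would deadlock at the latch, so a real test would use `await` with a timeout and treat the timeout as the passing case.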
Running multiple threads is not difficult; it is a piece of cake. Unfortunately, threads usually need to communicate with each other; that's what's difficult.
The mechanism that was originally invented to allow communication between modules was function calls; when module A wants to communicate with module B, it just invokes a function in module B. Unfortunately, this does not work with threads, because when you call a function, that function still runs in the current thread.
To overcome this problem, people decided to fall back to an even more primitive mechanism of communication: just declare a certain variable, and let both threads have access to that variable. In other words, allow the threads to share data. Sharing data is literally the first thing that naturally comes to mind, and it appears like a good choice because it seems very simple. I mean, how hard can it be, right? What could possibly go wrong?
Race conditions. That's what can, and will, go wrong.
When people realized their software was suffering from random, non-reproducible catastrophic failures due to race conditions, they started inventing elaborate mechanisms such as locks and compare-and-swap, aiming to protect against such things happening. These mechanisms fall under the broad category of "synchronization". Unfortunately, synchronization has two problems:
It is very difficult to get it right, so it is very prone to bugs.
It is completely untestable, because you cannot test for a race condition.
The astute reader might notice that "Very prone to bugs" and "Completely untestable" is a deadly combination.
Now, the mechanisms I mentioned above were invented and adopted by large parts of the industry before the concept of automated software testing became prevalent, so nobody could see how deadly the problem was; they just regarded it as a difficult topic which requires guru programmers, and everyone was okay with that.
Nowadays, whatever we do, we put testing first. So, if some mechanism is untestable, then the use of that mechanism is just out of the question, period. Thus, synchronization has fallen out of grace; very few people still practice it, and they are becoming fewer and fewer every day.
Without synchronization threads cannot share data; however, the original requirement was not to share data; it was to allow threads to communicate with each other. Besides sharing data, there exist other, more elegant mechanisms for inter-thread communication.
One such mechanism is message-passing, otherwise known as events.
With message passing, there is only one place in the entire software system which utilizes synchronization, and that is the concurrent blocking queue collection class that we use for storing messages. (The idea is that we should be able to get at least that little part right.)
The great thing about message passing is that it does not suffer from race conditions and is fully testable.
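As a rough illustration of the idea, here is a minimal Java sketch built on `java.util.concurrent.LinkedBlockingQueue`. The names `CounterActor`, `send`, `stopAndGetTotal` and the `STOP` sentinel are hypothetical. The queue is the only shared, synchronized object; the running total is touched by a single worker thread.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class CounterActor {
    private static final int STOP = Integer.MIN_VALUE;   // hypothetical "shut down" message
    private final BlockingQueue<Integer> inbox = new LinkedBlockingQueue<>();
    private long total = 0;                               // only the worker thread writes this
    private final Thread worker = new Thread(this::drainInbox);

    CounterActor() { worker.start(); }

    // Worker loop: handle one message at a time until STOP arrives.
    private void drainInbox() {
        try {
            for (int msg = inbox.take(); msg != STOP; msg = inbox.take()) {
                total += msg;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    void send(int amount) { inbox.add(amount); }

    long stopAndGetTotal() {
        inbox.add(STOP);
        try {
            worker.join();                                // join makes the worker's writes visible here
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return total;
    }

    // Several sender threads post messages concurrently; the result is still deterministic.
    static long demo(int senders, int perSender) {
        CounterActor actor = new CounterActor();
        Thread[] threads = new Thread[senders];
        for (int s = 0; s < senders; s++) {
            threads[s] = new Thread(() -> {
                for (int i = 0; i < perSender; i++) actor.send(1);
            });
            threads[s].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return actor.stopAndGetTotal();
    }
}
```

The senders never touch `total` directly, so there is no interleaving of reads and writes on it to get wrong.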
For J2EE code, I've used SilkPerformer, LoadRunner and JMeter for concurrency testing of threads. They all do the same thing. Basically, they give you a relatively simple interface for administering their version of the proxy server, which is required in order to analyze the TCP/IP data stream and simulate multiple users making simultaneous requests to your app server. The proxy server lets you analyze the requests made, by presenting the whole page and URL sent to the server, as well as the response from the server after processing the request.
You can find some bugs in insecure HTTP mode, where you can at least analyze the form data that is being sent, and systematically alter that for each user. But the true tests are when you run in HTTPS (Secure Sockets Layer). Then, you also have to contend with systematically altering the session and cookie data, which can be a little more convoluted.
The best bug I ever found while testing concurrency was when I discovered that the developer had relied upon Java garbage collection to close the connection to the LDAP server that was established at login. This resulted in users being exposed to other users' sessions, and in very confusing results when trying to analyze what happened when the server was brought to its knees, barely able to complete one transaction every few seconds.
In the end, you or someone will probably have to buckle down and analyze the code for blunders like the one I just mentioned. And an open discussion across departments, like the one that occurred when we unfolded the problem described above, is most useful. But these tools are the best solution to testing multi-threaded code. JMeter is open source. SilkPerformer and LoadRunner are proprietary. If you really want to know whether your app is thread safe, that's how the big boys do it. I've done this for very large companies professionally, so I'm not guessing. I'm speaking from personal experience.
A word of caution: it does take some time to understand these tools. It will not be a matter of simply installing the software and firing up the GUI, unless you've already had some exposure to multi-threaded programming. I've tried to identify the 3 critical categories of areas to understand (forms, session and cookie data), with the hope that at least starting with understanding these topics will help you focus on quick results, as opposed to having to read through the entire documentation.
Concurrency is a complex interplay between the memory model, hardware, caches and our code. In the case of Java, at least, such tests are partly addressed, mainly by jcstress. The creators of that library are known to be authors of many JVM, GC and Java concurrency features.
But even this library requires good knowledge of the Java Memory Model specification so that we know exactly what we are testing. I also think the focus of this effort is microbenchmarks, not huge business applications.
There is an article on the topic, using Rust as the language in the example code:
https://medium.com/@polyglot_factotum/rust-concurrency-five-easy-pieces-871f1c62906a
In summary, the trick is to write your concurrent logic so that it is robust to the non-determinism involved with multiple threads of execution, using tools like channels and condvars.
Then, if that is how you've structured your "components", the easiest way to test them is by using channels to send messages to them, and then block on other channels to assert that the component sends certain expected messages.
The linked-to article is fully written using unit-tests.
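The linked article uses Rust channels; the same testing style can be approximated in Java with blocking queues standing in for channels. The sketch below is an assumption-laden illustration, not code from the article: `UppercaseComponent`, `awaitMessage` and `roundTrip` are hypothetical names. The test sends a message on the input queue and then blocks, with a timeout, on the output queue.

```java
import java.util.Locale;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class UppercaseComponent {
    final BlockingQueue<String> in = new LinkedBlockingQueue<>();   // "channel" into the component
    final BlockingQueue<String> out = new LinkedBlockingQueue<>();  // "channel" out of the component

    // Starts the component's own thread; it only talks to the outside via the two queues.
    void start() {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    out.put(in.take().toUpperCase(Locale.ROOT));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.setDaemon(true);   // do not keep the JVM alive after the test
        t.start();
    }

    // Test helper: block up to five seconds for the next message; null means timeout.
    static String awaitMessage(BlockingQueue<String> queue) {
        try {
            return queue.poll(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    // Sends one message and waits for the component's reply.
    static String roundTrip(String message) {
        UppercaseComponent component = new UppercaseComponent();
        component.start();
        component.in.add(message);
        return awaitMessage(component.out);
    }
}
```

The timeout turns a hung component into a failed assertion instead of a hung test run.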
It's not perfect, but I wrote this helper for my tests in C#:
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
namespace Proto.Promises.Tests.Threading
{
public class ThreadHelper
{
public static readonly int multiThreadCount = Environment.ProcessorCount * 100;
private static readonly int[] offsets = new int[] { 0, 10, 100, 1000 };
private readonly Stack<Task> _executingTasks = new Stack<Task>(multiThreadCount);
private readonly Barrier _barrier = new Barrier(1);
private int _currentParticipants = 0;
private readonly TimeSpan _timeout;
public ThreadHelper() : this(TimeSpan.FromSeconds(10)) { } // 10 second timeout should be enough for most cases.
public ThreadHelper(TimeSpan timeout)
{
_timeout = timeout;
}
/// <summary>
/// Execute the action multiple times in parallel threads.
/// </summary>
public void ExecuteMultiActionParallel(Action action)
{
for (int i = 0; i < multiThreadCount; ++i)
{
AddParallelAction(action);
}
ExecutePendingParallelActions();
}
/// <summary>
/// Execute the action once in a separate thread.
/// </summary>
public void ExecuteSingleAction(Action action)
{
AddParallelAction(action);
ExecutePendingParallelActions();
}
/// <summary>
/// Add an action to be run in parallel.
/// </summary>
public void AddParallelAction(Action action)
{
var taskSource = new TaskCompletionSource<bool>();
lock (_executingTasks)
{
++_currentParticipants;
_barrier.AddParticipant();
_executingTasks.Push(taskSource.Task);
}
new Thread(() =>
{
try
{
_barrier.SignalAndWait(); // Try to make actions run in lock-step to increase likelihood of breaking race conditions.
action.Invoke();
taskSource.SetResult(true);
}
catch (Exception e)
{
taskSource.SetException(e);
}
}).Start();
}
/// <summary>
/// Runs the pending actions in parallel, attempting to run them in lock-step.
/// </summary>
public void ExecutePendingParallelActions()
{
Task[] tasks;
lock (_executingTasks)
{
_barrier.SignalAndWait();
_barrier.RemoveParticipants(_currentParticipants);
_currentParticipants = 0;
tasks = _executingTasks.ToArray();
_executingTasks.Clear();
}
try
{
if (!Task.WaitAll(tasks, _timeout))
{
throw new TimeoutException($"Action(s) timed out after {_timeout}, there may be a deadlock.");
}
}
catch (AggregateException e)
{
// Only throw one exception instead of aggregate to try to avoid overloading the test error output.
throw e.Flatten().InnerException;
}
}
/// <summary>
/// Run each action in parallel multiple times with differing offsets for each run.
/// <para/>The number of runs is 4^actions.Length, so be careful if you don't want the test to run too long.
/// </summary>
/// <param name="expandToProcessorCount">If true, copies each action on additional threads up to the processor count. This can help test more without increasing the time it takes to complete.
/// <para/>Example: 2 actions with 6 processors, runs each action 3 times in parallel.</param>
/// <param name="setup">The action to run before each parallel run.</param>
/// <param name="teardown">The action to run after each parallel run.</param>
/// <param name="actions">The actions to run in parallel.</param>
public void ExecuteParallelActionsWithOffsets(bool expandToProcessorCount, Action setup, Action teardown, params Action[] actions)
{
setup += () => { };
teardown += () => { };
int actionCount = actions.Length;
int expandCount = expandToProcessorCount ? Math.Max(Environment.ProcessorCount / actionCount, 1) : 1;
foreach (var combo in GenerateCombinations(offsets, actionCount))
{
setup.Invoke();
for (int k = 0; k < expandCount; ++k)
{
for (int i = 0; i < actionCount; ++i)
{
int offset = combo[i];
Action action = actions[i];
AddParallelAction(() =>
{
for (int j = offset; j > 0; --j) { } // Just spin in a loop for the offset.
action.Invoke();
});
}
}
ExecutePendingParallelActions();
teardown.Invoke();
}
}
// Input: [1, 2, 3], 3
// Output: [
// [1, 1, 1],
// [2, 1, 1],
// [3, 1, 1],
// [1, 2, 1],
// [2, 2, 1],
// [3, 2, 1],
// [1, 3, 1],
// [2, 3, 1],
// [3, 3, 1],
// [1, 1, 2],
// [2, 1, 2],
// [3, 1, 2],
// [1, 2, 2],
// [2, 2, 2],
// [3, 2, 2],
// [1, 3, 2],
// [2, 3, 2],
// [3, 3, 2],
// [1, 1, 3],
// [2, 1, 3],
// [3, 1, 3],
// [1, 2, 3],
// [2, 2, 3],
// [3, 2, 3],
// [1, 3, 3],
// [2, 3, 3],
// [3, 3, 3]
// ]
private static IEnumerable<int[]> GenerateCombinations(int[] options, int count)
{
int[] indexTracker = new int[count];
int[] combo = new int[count];
for (int i = 0; i < count; ++i)
{
combo[i] = options[0];
}
// Same algorithm as picking a combination lock.
int rollovers = 0;
while (rollovers < count)
{
yield return combo; // No need to duplicate the array since we're just reading it.
for (int i = 0; i < count; ++i)
{
int index = ++indexTracker[i];
if (index == options.Length)
{
indexTracker[i] = 0;
combo[i] = options[0];
if (i == rollovers)
{
++rollovers;
}
}
else
{
combo[i] = options[index];
break;
}
}
}
}
}
}
Example usage:
[Test]
public void DeferredMayBeBeResolvedAndPromiseAwaitedConcurrently_void0()
{
Promise.Deferred deferred = default(Promise.Deferred);
Promise promise = default(Promise);
int invokedCount = 0;
var threadHelper = new ThreadHelper();
threadHelper.ExecuteParallelActionsWithOffsets(false,
// Setup
() =>
{
invokedCount = 0;
deferred = Promise.NewDeferred();
promise = deferred.Promise;
},
// Teardown
() => Assert.AreEqual(1, invokedCount),
// Parallel Actions
() => deferred.Resolve(),
() => promise.Then(() => { Interlocked.Increment(ref invokedCount); }).Forget()
);
}
One simple test pattern that can work for some (not all!) cases is to repeat the same test many times. For example, suppose you have a method:
def process(input):
# Spawns several threads to do the job
# ...
return output
Create a bunch of tests:
process(input1) -> expect to return output1
process(input2) -> expect to return output2
...
Now run each of those tests many times.
If the implementation of process contains a subtle bug (e.g. deadlock, race condition, etc.) that has a 0.1% chance of emerging on any single run, running the test 1000 times gives roughly a 63% probability (1 - 0.999^1000) that the bug emerges at least once. Running the test 10000 times gives >99% probability.
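The arithmetic behind those percentages, together with a tiny repeat harness, can be sketched in Java as follows. `RepeatOdds`, `atLeastOnce` and `countFailures` are hypothetical names, and the formula assumes the runs are independent, which real test runs on a shared machine only approximate.

```java
class RepeatOdds {
    // Probability that a failure with per-run chance p shows up at least once in n independent runs.
    static double atLeastOnce(double p, int n) {
        return 1.0 - Math.pow(1.0 - p, n);
    }

    // Runs the check `times` times and counts how many runs threw.
    static int countFailures(Runnable check, int times) {
        int failures = 0;
        for (int i = 0; i < times; i++) {
            try {
                check.run();
            } catch (RuntimeException | AssertionError e) {
                failures++;
            }
        }
        return failures;
    }
}
```

`RepeatOdds.atLeastOnce(0.001, 1000)` evaluates to about 0.632, and `RepeatOdds.atLeastOnce(0.001, 10000)` to more than 0.9999.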
If you are testing code that simply calls new Thread(runnable).start(),
you can mock Thread so that the runnable runs sequentially.
For instance, if the code of the tested object starts a new thread like this:
class TestedClass {
public void doAsychOp() {
new Thread(new myRunnable()).start();
}
}
Then mocking the creation of new Threads and running the Runnable argument sequentially can help:
@Mock
private Thread threadMock;
@Test
public void myTest() throws Exception {
PowerMockito.mockStatic(Thread.class);
//when new thread is created execute runnable immediately
PowerMockito.whenNew(Thread.class).withAnyArguments().then(new Answer<Thread>() {
@Override
public Thread answer(InvocationOnMock invocation) throws Throwable {
// immediately run the runnable
// (getArgumentAt is the Mockito 1.x API; Mockito 2+ uses getArgument(0) instead)
Runnable runnable = invocation.getArgumentAt(0, Runnable.class);
if(runnable != null) {
runnable.run();
}
return threadMock;//return a mock so Thread.start() will do nothing
}
});
TestedClass testcls = new TestedClass();
testcls.doAsychOp(); //will invoke myRunnable.run in current thread
//.... check expected
}
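A different way to get the same sequential behaviour, without mocking constructors, is to have the class accept a `java.util.concurrent.Executor` and pass a same-thread executor in tests. This is a swapped-in technique, not the PowerMock approach shown above; `AsyncService`, `production` and `forTests` are hypothetical names used for illustration.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;

class AsyncService {
    private final Executor executor;
    final AtomicBoolean done = new AtomicBoolean(false);   // stands in for the real side effect

    AsyncService(Executor executor) {
        this.executor = executor;
    }

    // The code under test never creates a Thread itself; it only hands work to the executor.
    void doAsyncOp() {
        executor.execute(() -> done.set(true));
    }

    // Production wiring: each task gets its own thread, as in the original example.
    static AsyncService production() {
        return new AsyncService(task -> new Thread(task).start());
    }

    // Test wiring: Runnable::run executes the task immediately on the calling thread.
    static AsyncService forTests() {
        return new AsyncService(Runnable::run);
    }
}
```

With `AsyncService.forTests()`, the effect of `doAsyncOp()` is visible as soon as the call returns, so the test needs no sleeps or latches. The trade-off is that such a test checks the logic only, not the threading.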
(if possible) don't use threads, use actors / active objects. Easy to test.
You may use EasyMock.makeThreadSafe to make the mock instance used in the test thread-safe.
Related
Without volatile, a thread can still see the changes made by another thread
public class VisibleDemo {
    private boolean flag;

    public VisibleDemo setFlag(boolean flag) {
        this.flag = flag;
        return this;
    }

    public static void main(String[] args) throws InterruptedException {
        VisibleDemo t = new VisibleDemo();
        new Thread(() -> {
            long l = System.currentTimeMillis();
            while (true) {
                if (System.currentTimeMillis() - l > 600) {
                    break;
                }
            }
            t.setFlag(true);
        }).start();
        new Thread(() -> {
            long l = System.currentTimeMillis();
            while (true) {
                if (System.currentTimeMillis() - l > 500) {
                    break;
                }
            }
            while (!t.flag) {
                // if (System.currentTimeMillis() - l > 598) {
                //
                // }
            }
            System.out.println("end");
        }).start();
    }
}
If the loop does not contain the following code, the program never prints "end":
if (System.currentTimeMillis() - l > 598) { }
If it does contain this code, it will probably print "end", although sometimes it does not.
When the value is less than 598 (for example 550), or the code is absent, it does not print "end".
When the value is 598, it will probably print "end".
When the value is greater than 598, it prints "end" every time.
Note: 598 is the number on my computer; it may be a different number on yours.
The flag is not volatile, so why can the thread see the newest value?
First: I want to know why.
Second: I need help, I want to know the scenarios: when the worker cache of jvm thread will refresh to/from main memory.
OS: Windows 10
Java: jdk8u231
Your code is suffering from a data-race and that is why it is behaving unreliably. The JMM is defined in terms of the happens-before relation. So if you have 2 actions A and B, and A happens-before B, then B should see A and everything before A. It is very important to understand that happens-before doesn't imply happening-before (so ordering based on physical time) and vice versa. The 'flag' field is accessed concurrently; one thread is reading it while another thread is writing it. In JMM terms this is called conflicting access. Conflicting accesses are fine as long as it is done using some form of synchronization because the synchronization will induce happens-before edges. But since the 'flag' accesses are plain loads/stores, there is no synchronization, and as a consequence, there will not be a happens-before edge to order the load and the store. A conflicting access, that isn't ordered by a happens-before edge, is called a data-race and that is the problem you are suffering from. When there is a data-race; funny things can happen but it will not lead to undefined behavior like is possible under C++ (undefined behavior can effectively lead to any possible outcome including crashes and super weird behavior). So load still needs to see a value that is written and can't see a value coming out of thin air. If we look at your code: while (!t.flag) { ... } Because the flag field isn't updated within the loop and is just a plain load, the compiler is allowed to optimize this code to: if(!t.flag){ while(true){...} } This particular optimization is called loop hoisting (or loop invariant code motion). So this explains why the loop doesn't need to complete. Why does it complete when you access the System.currentTimeMillis? Because you got lucky; apparently this prevents the JIT from applying the above optimization. But keep in mind that System.currentTimeMillis doesn't have any formal synchronization semantics and therefore doesn't induce happens-before edges. 
How to fix your code?
The simplest way to fix your code would be to make 'flag' volatile, or to perform both the read and the write inside synchronized blocks on the same lock.
If you want to go really hardcore: use VarHandle get/set opaque. Officially it is still a data race because opaque doesn't induce happens-before edges, but it will prevent the compiler from optimizing out the load/store. It is a benign data race. The primary advantage is slightly better performance because it doesn't prevent the reordering of surrounding loads/stores.
"I want to know the scenarios: when the worker cache of jvm thread will refresh to/from main memory."
This is a fallacy. Caches on modern CPUs are always coherent; this is taken care of by a cache coherence protocol such as MESI. Writing to main memory for every volatile read/write would be extremely slow. For more information see the following excellent post. If you want to know more about cache coherence and memory ordering, please check this excellent book which you can download for free.
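The volatile fix can be sketched as follows. `VolatileFlagDemo` and `readerTerminates` are hypothetical names, and the 50 ms sleep is an arbitrary choice. Note that a passing run does not prove the absence of a data race; the guarantee comes from the `volatile` keyword, not from the observation.

```java
class VolatileFlagDemo {
    // volatile: the write happens-before any later read that observes it,
    // and the JIT may not hoist the read out of the loop.
    private volatile boolean flag;

    // Returns true if the spinning reader saw the flag and exited within the timeout.
    static boolean readerTerminates(long timeoutMillis) {
        VolatileFlagDemo demo = new VolatileFlagDemo();
        Thread reader = new Thread(() -> {
            while (!demo.flag) {
                // spin until the writer flips the flag
            }
        });
        reader.setDaemon(true);          // a stuck reader must not keep the JVM alive
        reader.start();
        try {
            Thread.sleep(50);            // let the reader enter its loop first
            demo.flag = true;            // volatile write
            reader.join(timeoutMillis);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return !reader.isAlive();
    }
}
```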
"I want to know the scenarios: when the worker cache of jvm thread will refresh to/from main memory."
When Taylor Swift is playing on your music player, it'll be 598, unless it's Tuesday, then it'll be 599.
No, really. It's that arbitrary. The JVM spec gives the JVM the right to come up with any old number for any reason if your code isn't properly guarded.
The problem is JVM diversity. There is a crazy combinatorial explosion:
There are about 8 OSes give or take.
There are like 20 different 'chip lines', with different pipelining behaviour.
These chips can be in various modes to mitigate against attacks like Spectre. Let's call it 3.
There are about 8 different major JVM vendors.
These come in ~10 or so different versions (java 8, java 9, java 10, java 11, etc).
That gives us about 38400 different combinations (8 x 20 x 3 x 8 x 10).
The point of the JMM (Java Memory Model) is to remove the handcuffs from a JVM implementation. A JVM implementation is looking for this optimal case:
It wants the freedom to use the various tricks that CPUs use to run code as fast as possible. For example, it wants the freedom to be capable of 're-ordering' (given a(); b(), to run b() first, and a() later. Which is okay, if a and b are utterly independent and are not in any way looking at each other's modifications). The reason it wants to do this is because CPUs are pipelines: Even processing a single instruction is in fact a chain of many separate steps, and the 'parse the instruction' step can get cracking on parsing another instruction the very moment it is done, even if that instruction is still being processed by the rest of the pipe. In fact, the CPU could have 4 separate 'instruction parser units' and they can be parsing 4 instructions in parallel. This is NOT the kind of parallelism that multiple cores do: This is a single core that will parse 4 consecutive instructions in parallel because parsing instructions is slightly slower than running them. For example.
But that's just Intel chips of the Z-whatever line. That's the point. If the memory model of the java specification indicates that a JVM simply can't use this stuff then that would mean JVMs on that particular Intel chip run slow as molasses. We don't want that.
Nevertheless, the memory model rules can't be so preferential to giving the JVM the right to re-order and do all sorts of crazy things that it becomes impossible to write reliable code for JVMs. Imagine the java lang spec says that the JVM can re-order any 2 instructions in one method at any time even if these 2 instructions are touching the same field. That'd be great for JVM engineers, they can go nuts with optimizing code on the fly to re-order it optimally. But it would be impossible to write java code.
So, a balance has been struck. This balance takes the following form:
The JMM gives you specific rules - these rules take the form of: "If you do X, then the JVM guarantees Y".
But that is all. In particular, there is nothing written about what happens if you do not do X. All you know is, that then Y is not guaranteed. But 'not guaranteed' does not mean: Will definitely NOT happen.
Here is an example:
class Data {
    static int a = 0;
    static int b = 0;
}
class Thread1 extends Thread {
    public void run() {
        Data.a = 5;
        Data.b = 10;
    }
}
class Thread2 extends Thread {
    public void run() {
        int a = Data.a;
        int b = Data.b;
        System.out.println(a);
        System.out.println(b);
    }
}
class Main {
    public static void main(String[] args) {
        new Thread1().start();
        new Thread2().start();
    }
}
This code:
Makes 2 fields, which start out at 0 and 0.
Runs one thread that first sets a to 5 and then sets b to 10.
Starts a second thread that reads these 2 fields into local vars and then prints these.
The JVM spec says that it is valid for a JVM to:
Print 0/0
Print 5/0
Print 0/10
Print 5/10
But it would not be legal for a JVM to e.g. print '20/20', or '10/5'.
Let's zoom in on the 0/10 case because that is utterly bizarre - how could a JVM possibly do that? Well, reordering!
WILL a JVM print 0/10? On some combinations of JVM vendor and version+Architecture+OS+phase of the moon, YES IT WILL. On most, no it won't. Ever. Still, imagine you wrote this code, you rely on 0/10 NEVER occurring, and you test the heck out of your code, and you verify that indeed, even running the test a million times, it never happens. You ship it to the production server and it runs fine for a week and then just as you are giving the demo to the really important potential customer, all heck breaks loose: Your app is broken, as from time to time the 0/10 case does occur.
You file a bug with your JVM vendor. And they close it as 'intended behaviour - wontfix'. That will really happen, because that really is the intended behaviour. If you write code that relies on a thing being true that is NOT guaranteed by the JMM, then YOU wrote a bug, even if on your particular hardware on this particular day it is completely impossible for you to make this bug occur right now.
This means one simple and very nasty conclusion is the only correct one: You cannot test this stuff. So, if you adhere to the rule that if there are no tests then you can't know if your code works, guess what? You cannot ever know if your code is fine. Ever.
That then leads to the conclusion that you don't want to write any such code.
This sounds crazy (how can you simply not ever, ever write anything multicore?) but it's not as nuts as you think. This only comes up if 2 threads are dependent on ordering relative to each other for some in-process action. For example, if two threads are both accessing the same field of the same instance. Simply... don't do that. It's easier than you think: If all 'communication' between threads goes via the database and you know how to use transactions in databases, voila. Or you use a message bus service like RabbitMQ.
If for some job you really must write multithread code where the threads interact with each other, don't shoot the messenger: It is NOT POSSIBLE to test that you did it right. So write it very carefully.
A second conclusion is that the JMM doesn't explain how things work or what happens. It merely says: IF you follow these rules, I guarantee you that THIS will happen. If you don't follow these rules, anything can happen. A JVM is free to do all sorts of crazy shenanigans, and neither this documentation nor any other will ever enumerate all the crazy things that could happen. After all, there are at least 38400 different combinations and it's crazy to attempt to document all 38400!
So, what are the core rules?
The core rules are so-called happens-before relationships. The basic rule is simply this:
There are various ways to establish H-B relationships. Such a relationship is always between 2 lines of code. 2 lines of code might be unrelated, H-B wise. Or, the rules state that line A 'happens-before' line B. If and only if the rules state this, then it will be impossible to observe a state of the universe (the values of all fields of all instances in the entire JVM) at line B as it was before line A ran.
That's it. For example, if line A 'happens before' line B, but line B does not attempt to witness any field change A made, then the JVM is still free to reorder and have B run before A. The point is that this shouldn't matter - you're not observing, so why does it matter?
We can 'fix' our weird 0/0, 5/0 and 0/10 outcomes by setting up H-B: If the 'grab the static field values and save them to local a/b vars' code happens-after thread1's setting of it, then we can be sure that the code will always print 5/10, and the JMM's guarantees mean that a JVM that doesn't print that is broken.
H-B relationships are also transitive (if HB(A, B) is true, and HB(B, C) is true, then HB(A, C) is also true).
How do you set up HB?
If line B would run after line A as per the usual understanding of how things run, and both are being run by the same thread, HB(A, B). This is obvious: If you just write x(); y();, then y cannot observe state as it was before x ran. HB(thread.start(), X) where X is the very first line in the started thread. HB(EndS, StartS), where EndS is the exiting of a synchronized block on object ref Z, and StartS is another thread entering a synchronized block (on ref Z as well) later. HB(V, V) where V is 'accessing volatile variable Z', but it is hard to know which way the HB goes with volatiles. There are a few more exotic ways. There's also a separate HB relationship for constructors and final variables that they initialize, but generally this one is real easy to understand (once a constructor returns, whatever final fields it initialized are definitely set and cannot be observed to not be set, even if otherwise no actual HB relationship has been established. This applies only to final fields). This explains why you observe weird values. This also explains why your question of 'I want to know when a JVM thread will refresh to/from main memory' is not answerable: Because the java memory model spec and the java virtual machine spec intentionally and specifically make no promises on how that works. One JVM can work one way, another JVM can do it completely differently. The reason I started off making a seeming joke about playing Taylor Swift is: A CPU has cores, and the cores are limited. A modern computer, especially a desktop, is doing thousands of things at once, and will therefore be rotating apps through cores all the time. Whether a field update is 'flushed out' to main memory (NOTE: THAT IS DANGEROUS THINKING - THE DOCS DO NOT ACTUALLY ENFORCE THAT JVMS CAN BE UNDERSTOOD IN THOSE TERMS!) might depend on whether it gets rotated out of a core or not. 
And that in turn might depend on your music player dealing with a particular compressed music file that takes a few more cores to decompress the next block so that it can be queued up in the audio buffer. Hence, and this is no joke, the song you are playing on your music player can in fact change the number you get. Hence, why you have to give up: You CANNOT enumerate 'if my computer is in this state, then this code will always produce Y number'. There are billions of states you'd have to enumerate. Impossible.
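As a small illustration of setting up happens-before for the 5/10 example, the sketch below uses the HB(thread.start(), X) rule from the list above, plus `Thread.join()` to hand the result back; the JLS (17.4.5) specifies that all actions in a thread happen-before another thread's successful return from `join()` on it. `HappensBeforeDemo` and `writeThenStartReader` are hypothetical names.

```java
class HappensBeforeDemo {
    static int a = 0;
    static int b = 0;

    // Returns {a, b} exactly as the reader thread observed them.
    static int[] writeThenStartReader() {
        a = 5;                           // plain writes, made BEFORE start()
        b = 10;
        int[] seen = new int[2];
        Thread reader = new Thread(() -> {
            seen[0] = a;                 // HB(start(), first action of the started thread)
            seen[1] = b;
        });
        reader.start();
        try {
            reader.join();               // HB(last action of reader, return from join())
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return seen;                     // always {5, 10} under the JMM
    }
}
```

Because both edges are guaranteed by the memory model, the 5/10 result does not depend on the JVM vendor, the chip, or the music player.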
Are there performance implications to creating a Thread and never starting it?
I'm working on an existing Java codebase which has an object that extends Thread, and also contains a number of other properties and methods that pertain to the operation of the thread itself. In former versions of the codebase, the thread was always an actual, heavyweight Thread that was started with Thread.start, waited for with Thread.join, and the like. I'm currently refactoring the codebase, and in the present version, the object's Thread functionality is not always needed (but the object itself is, due to the other functionality contained in the object; in many cases, it's usable even when the thread itself is not running). So there are situations in which the application creates these objects (which extend Thread) and never calls .start() on them, purely using them for their other properties and methods. In the future, the application may need to create many more of these objects than previously, to the point where I potentially need to worry about performance. Obviously, creating and starting a large number of actual threads would be a performance nightmare. Does the same thing apply to Thread objects that are never started? That is, are any operating system resources, or large Java resources, required purely to create a Thread? Or are the resources used only when the Thread is actually .started, making unstarted Thread objects safe to use in quantity? It would be possible to refactor the code to split the non-threading-related functionality into a separate function, but I don't want to do a large refactoring if it's entirely pointless to do so. I've attempted to determine the answer to this with a few web searches, but it's hard to aim the query because search engines can't normally distinguish a Thread object from an actual Java thread.
You could implement Runnable instead of extending Thread.
public class MyRunnableClass implements Runnable {
    // Your stuff...
    @Override
    public void run() {
        // Thread-related stuff...
    }
}
Whenever you need your object to behave as a Thread, simply use:
Thread t = new Thread(new MyRunnableClass());
t.start();
As the others have pointed out: performance isn't a problem here. I would focus much more on the "good design" approach. It simply doesn't make (much, any?) sense to extend Thread when you do not intend to ever invoke start(). And you see: you write code to communicate your intentions. Extending Thread without using it as thread, that only communicates confusion. Every new future reader of your code will wonder "why is that"? Therefore, focus on getting to a straight forward design. And I would go one step further: don't just turn to Runnable, and continuing to use threads. Instead: learn about ExecutorServices, and how to submit tasks, and Futures, and all that. "Bare iron" Threads (and Runnables) are like 20 year old concepts. Java has better things to offer by now. So, if you are really serious about improving your code base: look into these new abstraction concepts to figure where they would make sense to be used.
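For readers who have not used these abstractions, here is a minimal sketch of submitting tasks to an `ExecutorService` and collecting results through `Future`s. The class `SumOfSquares` and the task itself are hypothetical, chosen only to keep the example small.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SumOfSquares {
    // Computes 1^2 + 2^2 + ... + upTo^2, one task per term, on a fixed-size pool.
    static long compute(int upTo, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 1; i <= upTo; i++) {
                final long n = i;
                futures.add(pool.submit(() -> n * n));   // submitted as a Callable<Long>
            }
            long sum = 0;
            for (Future<Long> future : futures) {
                sum += future.get();                      // blocks until that task has finished
            }
            return sum;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();                              // let the worker threads exit
        }
    }
}
```

`SumOfSquares.compute(10, 4)` returns 385. No class here extends Thread, and the number of threads is a pool setting, not a property of the task objects.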
You can create about 1.5 million of these objects per GB of memory.
import java.util.LinkedList;
import java.util.List;
class A {
    public static void main(String[] args) {
        int count = 0;
        try {
            List<Thread> threads = new LinkedList<>();
            while (true) {
                threads.add(new Thread());
                if (++count % 10000 == 0)
                    System.out.println(count);
            }
        } catch (Error e) {
            System.out.println("Got " + e + " after " + count + " threads");
        }
    }
}
Using -Xms1g -Xmx1g for Oracle Java 8, the process grinds to a halt at around:
1 GB - 1780000
2 GB - 3560000
6 GB - 10690000
The object uses a bit more than you might expect from reading the source code, but it's still about 600 bytes each.
NOTE: Throwable also uses more memory than you might expect from reading the Java source. It can be 500 - 2000 bytes more depending on the size of the stack at the time it was created.
How to test multi-thread logic in java [duplicate]
I have thus far avoided the nightmare that is testing multi-threaded code since it just seems like too much of a minefield. I'd like to ask how people have gone about testing code that relies on threads for successful execution, or just how people have gone about testing those kinds of issues that only show up when two threads interact in a given manner? This seems like a really key problem for programmers today, it would be useful to pool our knowledge on this one imho.
Look, there's no easy way to do this. I'm working on a project that is inherently multithreaded. Events come in from the operating system and I have to process them concurrently. The simplest way to deal with testing complex, multithreaded application code is this: If it's too complex to test, you're doing it wrong. If you have a single instance that has multiple threads acting upon it, and you can't test situations where these threads step all over each other, then your design needs to be redone. It's both as simple and as complex as this. There are many ways to program for multithreading that avoids threads running through instances at the same time. The simplest is to make all your objects immutable. Of course, that's not usually possible. So you have to identify those places in your design where threads interact with the same instance and reduce the number of those places. By doing this, you isolate a few classes where multithreading actually occurs, reducing the overall complexity of testing your system. But you have to realize that even by doing this, you still can't test every situation where two threads step on each other. To do that, you'd have to run two threads concurrently in the same test, then control exactly what lines they are executing at any given moment. The best you can do is simulate this situation. But this might require you to code specifically for testing, and that's at best a half step towards a true solution. Probably the best way to test code for threading issues is through static analysis of the code. If your threaded code doesn't follow a finite set of thread safe patterns, then you might have a problem. I believe Code Analysis in VS does contain some knowledge of threading, but probably not much. Look, as things stand currently (and probably will stand for a good time to come), the best way to test multithreaded apps is to reduce the complexity of threaded code as much as possible. 
Minimize areas where threads interact, test as best as possible, and use code analysis to identify danger areas.
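To make the "immutable objects, few interaction points" advice concrete, here is a minimal Java sketch. The names Total, SharedTotal and hammer are hypothetical and not from the answer above; the idea is only that the value itself is immutable and that threads meet in exactly one place, a single AtomicReference.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Immutable value: safe to hand to any thread without locking. (Hypothetical example class.) */
final class Total {
    private final long value;
    Total(long value) { this.value = value; }
    long value() { return value; }
    Total plus(long amount) { return new Total(value + amount); }
}

/** The single place where threads interact: one atomic reference to the current immutable value. */
final class SharedTotal {
    private final AtomicReference<Total> current = new AtomicReference<>(new Total(0));

    void add(long amount) {
        // Atomically swaps in a new immutable Total; retried internally on contention.
        current.updateAndGet(t -> t.plus(amount));
    }

    long value() { return current.get().value(); }

    /** Starts `threads` workers that each add 1 `perThread` times and returns the final total. */
    static long hammer(int threads, int perThread) {
        SharedTotal shared = new SharedTotal();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) shared.add(1);
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.value();
    }
}
```

With this shape, only SharedTotal needs any concurrent testing at all; Total can be tested like any other single-threaded class.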
It's been a while since this question was posted, but it's still not answered ... kleolb02's answer is a good one. I'll try to go into more detail. There is a way, which I practice for C# code. For unit tests you should be able to write reproducible tests, which is the biggest challenge in multithreaded code. So my answer aims at forcing asynchronous code into a test harness that works synchronously. It's an idea from Gerard Meszaros's book "xUnit Test Patterns" and is called "Humble Object" (p. 695): You have to separate the core logic from anything that smells like asynchronous code. This results in a class for the core logic, which works synchronously. That puts you in a position to test the core logic in a synchronous way. You have absolute control over the timing of the calls you make on the core logic and thus can write reproducible tests. This is your gain from separating core logic and asynchronous logic. The core logic needs to be wrapped by another class, which is responsible for receiving calls asynchronously and delegating them to the core logic. Production code only accesses the core logic via that class. Because this class should only delegate calls, it's a very "dumb" class without much logic, so you can keep your unit tests for this asynchronously working class to a minimum. Anything above that (testing interaction between classes) is a component test. In this case too, you should be able to have absolute control over timing if you stick to the "Humble Object" pattern.
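The answer above is about C#, but the split it describes can be sketched in Java as well. In this minimal, assumed example (PriceCalculator and AsyncPriceCalculator are made-up names), the core logic is an ordinary synchronous class and the humble wrapper does nothing except forward the call to another thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Core logic: plain synchronous code, testable with ordinary unit tests. */
final class PriceCalculator {
    long totalCents(long unitCents, int quantity) {
        if (quantity < 0) throw new IllegalArgumentException("negative quantity");
        return unitCents * quantity;
    }
}

/** Humble object: contains no logic of its own, it only delegates on another thread. */
final class AsyncPriceCalculator implements AutoCloseable {
    private final PriceCalculator core;
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    AsyncPriceCalculator(PriceCalculator core) { this.core = core; }

    CompletableFuture<Long> totalCentsAsync(long unitCents, int quantity) {
        return CompletableFuture.supplyAsync(() -> core.totalCents(unitCents, quantity), executor);
    }

    @Override
    public void close() { executor.shutdown(); }
}
```

Nearly all test cases then target PriceCalculator directly; the wrapper needs little more than one smoke test confirming that a call arrives and a result comes back.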
Tough one indeed! In my (C++) unit tests, I've broken this down into several categories along the lines of the concurrency pattern used:
1. Unit tests for classes that operate in a single thread and aren't thread aware -- easy, test as usual.
2. Unit tests for Monitor objects (those that execute synchronized methods in the callers' thread of control) that expose a synchronized public API -- instantiate multiple mock threads that exercise the API. Construct scenarios that exercise internal conditions of the passive object. Include one longer running test that basically beats the heck out of it from multiple threads for a long period of time. This is unscientific, I know, but it does build confidence.
3. Unit tests for Active objects (those that encapsulate their own thread or threads of control) -- similar to #2 above, with variations depending on the class design. The public API may be blocking or non-blocking, callers may obtain futures, data may arrive at queues or need to be dequeued. There are many combinations possible here; white box away. This still requires multiple mock threads to make calls to the object under test.
As an aside: In internal developer training that I do, I teach the Pillars of Concurrency and these two patterns as the primary framework for thinking about and decomposing concurrency problems. There are obviously more advanced concepts out there, but I've found that this set of basics helps keep engineers out of the soup. It also leads to code that is more unit testable, as described above.
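The "beat the heck out of a Monitor object" test from the answer above could look roughly like this in Java (the answer itself is about C++). MonitorCounter and MonitorStress are hypothetical names; the start gate is an added assumption, used so that all threads begin calling the API at about the same moment.

```java
import java.util.concurrent.CountDownLatch;

/** A monitor-style object: every public method is synchronized. (Hypothetical example.) */
final class MonitorCounter {
    private int count;
    synchronized void increment() { count++; }
    synchronized int get() { return count; }
}

final class MonitorStress {
    /** Hammers one MonitorCounter from `threads` threads and returns the final count. */
    static int run(int threads, int callsPerThread) {
        MonitorCounter counter = new MonitorCounter();
        CountDownLatch startGate = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    startGate.await();               // hold every worker until all are ready
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int j = 0; j < callsPerThread; j++) counter.increment();
            });
            workers[i].start();
        }
        startGate.countDown();                       // release all workers at once
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```

A passing run proves nothing on its own, but removing `synchronized` from increment() and watching the count come up short is a quick way to check that the test is able to fail.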
I have faced this issue several times in recent years when writing thread handling code for several projects. I'm providing a late answer because most of the other answers, while providing alternatives, do not actually answer the question about testing. My answer is addressed to the cases where there is no alternative to multithreaded code; I do cover code design issues for completeness, but also discuss unit testing.
Writing testable multithreaded code
The first thing to do is to separate your production thread handling code from all the code that does actual data processing. That way, the data processing can be tested as singly threaded code, and the only thing the multithreaded code does is to coordinate threads. The second thing to remember is that bugs in multithreaded code are probabilistic; the bugs that manifest themselves least frequently are the bugs that will sneak through into production, will be difficult to reproduce even in production, and will thus cause the biggest problems. For this reason, the standard coding approach of writing the code quickly and then debugging it until it works is a bad idea for multithreaded code; it will result in code where the easy bugs are fixed and the dangerous bugs are still there. Instead, when writing multithreaded code, you must write the code with the attitude that you are going to avoid writing the bugs in the first place. If you have properly removed the data processing code, the thread handling code should be small enough - preferably a few lines, at worst a few dozen lines - that you have a chance of writing it without writing a bug, and certainly without writing many bugs, if you understand threading, take your time, and are careful.
Writing unit tests for multithreaded code
Once the multithreaded code is written as carefully as possible, it is still worthwhile writing tests for that code. 
The primary purpose of the tests is not so much to test for highly timing dependent race condition bugs - it's impossible to test for such race conditions repeatably - but rather to test that your locking strategy for preventing such bugs allows for multiple threads to interact as intended. To properly test correct locking behavior, a test must start multiple threads. To make the test repeatable, we want the interactions between the threads to happen in a predictable order. We don't want to externally synchronize the threads in the test, because that will mask bugs that could happen in production where the threads are not externally synchronized. That leaves the use of timing delays for thread synchronization, which is the technique that I have used successfully whenever I've had to write tests of multithreaded code. If the delays are too short, then the test becomes fragile, because minor timing differences - say between different machines on which the tests may be run - may cause the timing to be off and the test to fail. What I've typically done is start with delays that cause test failures, increase the delays so that the test passes reliably on my development machine, and then double the delays beyond that so the test has a good chance of passing on other machines. This does mean that the test will take a macroscopic amount of time, though in my experience, careful test design can limit that time to no more than a dozen seconds. Since you shouldn't have very many places requiring thread coordination code in your application, that should be acceptable for your test suite. Finally, keep track of the number of bugs caught by your test. If your test has 80% code coverage, it can be expected to catch about 80% of your bugs. If your test is well designed but finds no bugs, there's a reasonable chance that you don't have additional bugs that will only show up in production. If the test catches one or two bugs, you might still get lucky. 
Beyond that, you may want to consider a careful review, or even a complete rewrite, of your thread handling code, since it is likely that the code still contains hidden bugs that will be very difficult to find until the code is in production, and very difficult to fix then.
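As a rough illustration of the delay-based approach described in this answer, the following Java sketch checks one locking property: a second thread must not get in while the lock is held, and must get in once it is released. The class name, the method name and the delay are assumptions for illustration; the delay only gives a broken implementation time to misbehave.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

final class LockingBehaviourCheck {
    /**
     * Returns true if a second thread stayed out of the critical section while
     * the lock was held and entered it after the lock was released.
     */
    static boolean secondThreadWaitsForLock(long delayMs) {
        ReentrantLock lock = new ReentrantLock();
        AtomicBoolean entered = new AtomicBoolean(false);
        Thread second = new Thread(() -> {
            lock.lock();
            try {
                entered.set(true);
            } finally {
                lock.unlock();
            }
        });
        try {
            boolean blockedWhileHeld;
            lock.lock();
            try {
                second.start();
                Thread.sleep(delayMs);               // time for the second thread to (wrongly) get in
                blockedWhileHeld = !entered.get();
            } finally {
                lock.unlock();
            }
            second.join(10 * delayMs + 1000);        // bounded wait so a broken lock cannot hang the test
            return blockedWhileHeld && entered.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

In real code the ReentrantLock would be replaced by the thread handling class under test, and the delay tuned as the answer describes: raise it until the test is reliable locally, then double it.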
I also had serious problems testing multi-threaded code. Then I found a really cool solution in "xUnit Test Patterns" by Gerard Meszaros. The pattern he describes is called Humble Object. Basically it describes how you can extract the logic into a separate, easy-to-test component that is decoupled from its environment. After you have tested this logic, you can test the complicated behaviour (multi-threading, asynchronous execution, etc.).
There are a few tools around that are quite good. Here is a summary of some of the Java ones. Some good static analysis tools include FindBugs (gives some useful hints), JLint, Java Pathfinder (JPF & JPF2), and Bogor. MultithreadedTC is quite a good dynamic analysis tool (integrated into JUnit) where you have to set up your own test cases. ConTest from IBM Research is interesting. It instruments your code by inserting all kinds of thread modifying behaviours (e.g. sleep & yield) to try to uncover bugs randomly. SPIN is a really cool tool for modelling your Java (and other) components, but you need to have some useful framework. It is hard to use as is, but extremely powerful if you know how to use it. Quite a few tools use SPIN underneath the hood. MultithreadedTC is probably the most mainstream, but some of the static analysis tools listed above are definitely worth looking at.
Awaitility can also be useful to help you write deterministic unit tests. It allows you to wait until some state somewhere in your system is updated. For example: await().untilCall( to(myService).myMethod(), greaterThan(3) ); or await().atMost(5,SECONDS).until(fieldIn(myObject).ofType(int.class), equalTo(1)); It also has Scala and Groovy support. await until { something() > 4 } // Scala example
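For readers who cannot add a library, the underlying idea (poll a condition until it holds or a timeout expires) can be sketched with the JDK alone. This is not Awaitility's API; Poll and waitUntil are hypothetical names chosen for this example.

```java
import java.util.function.BooleanSupplier;

final class Poll {
    /**
     * Re-checks `condition` every `periodMs` milliseconds until it is true or
     * `timeoutMs` has elapsed. Returns whether the condition became true.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long periodMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) return false;   // timed out
            try {
                Thread.sleep(periodMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

A test would then assert on the return value, for example `assertTrue(Poll.waitUntil(() -> service.count() > 3, 5000, 50))`, where `service` stands for whatever object the test is observing.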
Another way to (kinda) test threaded code, and very complex systems in general, is through Fuzz Testing. It's not great, and it won't find everything, but it's likely to be useful and it's simple to do. Quote: Fuzz testing or fuzzing is a software testing technique that provides random data ("fuzz") to the inputs of a program. If the program fails (for example, by crashing, or by failing built-in code assertions), the defects can be noted. The great advantage of fuzz testing is that the test design is extremely simple, and free of preconceptions about system behavior. ... Fuzz testing is often used in large software development projects that employ black box testing. These projects usually have a budget to develop test tools, and fuzz testing is one of the techniques which offers a high benefit to cost ratio. ... However, fuzz testing is not a substitute for exhaustive testing or formal methods: it can only provide a random sample of the system's behavior, and in many cases passing a fuzz test may only demonstrate that a piece of software handles exceptions without crashing, rather than behaving correctly. Thus, fuzz testing can only be regarded as a bug-finding tool rather than an assurance of quality.
Testing MT code for correctness is, as already stated, quite a hard problem. In the end it boils down to ensuring that there are no incorrectly synchronised data races in your code. The problem with this is that there are infinitely many possible thread executions (interleavings) over which you do not have much control (be sure to read this article, though). In simple scenarios it might be possible to actually prove correctness by reasoning, but this is usually not the case, especially if you want to avoid/minimize synchronization and not go for the most obvious/easiest synchronization option. An approach that I follow is to write highly concurrent test code in order to make potentially undetected data races likely to occur. And then I run those tests for some time :) I once stumbled upon a talk where some computer scientists were showing off a tool that kind of does this (randomly devising tests from specs and then running them wildly, concurrently, checking for the defined invariants to be broken). By the way, I think this aspect of testing MT code has not been mentioned here: identify invariants of the code that you can check for randomly. Unfortunately, finding those invariants is quite a hard problem, too. Also, they might not hold all the time during execution, so you have to find/enforce execution points where you can expect them to be true. Bringing the code execution to such a state is also a hard problem (and might itself incur concurrency issues). Whew, it's damn hard! Some interesting links to read:
Deterministic interleaving: A framework that allows you to force certain thread interleavings and then check for invariants
jMock Blitzer: Stress test synchronization
assertConcurrent: JUnit version of stress testing synchronization
Testing concurrent code: Short overview of the two primary methods of brute force (stress test) or deterministic (going for the invariants)
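The invariant idea from this answer can be sketched as follows. The example is an assumption made for illustration (two account balances whose sum must stay at 1000): writer threads move money back and forth while a checker thread samples the invariant at arbitrary moments, but only at a point where it is expected to hold, namely while holding the same lock as the writers.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical class under test: the sum of both balances must always be 1000. */
final class TwoAccounts {
    private long a = 500;
    private long b = 500;

    synchronized void transfer(long amount) { a -= amount; b += amount; }

    /** Checks the invariant at a safe point: it takes the same lock as transfer(). */
    synchronized boolean invariantHolds() { return a + b == 1000; }
}

final class InvariantStress {
    /** Runs writers plus one checker concurrently; returns the number of violations seen. */
    static int run(int writers, int transfersPerWriter) {
        TwoAccounts accounts = new TwoAccounts();
        AtomicInteger violations = new AtomicInteger();
        Thread[] threads = new Thread[writers + 1];
        for (int i = 0; i < writers; i++) {
            final long amount = i + 1;
            threads[i] = new Thread(() -> {
                for (int j = 0; j < transfersPerWriter; j++) {
                    accounts.transfer(j % 2 == 0 ? amount : -amount);
                }
            });
        }
        threads[writers] = new Thread(() -> {
            for (int j = 0; j < transfersPerWriter; j++) {
                if (!accounts.invariantHolds()) violations.incrementAndGet();
            }
        });
        for (Thread t : threads) t.start();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return violations.get();
    }
}
```

If `synchronized` is dropped from either method, the checker can observe the state between the two assignments in transfer(), which is exactly the kind of broken invariant this style of test is meant to surface.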
I've done a lot of this, and yes it sucks. Some tips:
GroboUtils for running multiple test threads
alphaWorks ConTest to instrument classes to cause interleavings to vary between iterations
Create a throwable field and check it in tearDown (see Listing 1). If you catch a bad exception in another thread, just assign it to throwable.
I created the utils class in Listing 2 and have found it invaluable, especially waitForVerify and waitForCondition, which will greatly increase the performance of your tests.
Make good use of AtomicBoolean in your tests. It is thread safe, and you'll often need a final reference type to store values from callback classes and suchlike. See example in Listing 3.
Make sure to always give your test a timeout (e.g., @Test(timeout=60*1000)), as concurrency tests can sometimes hang forever when they're broken.
Listing 1:
@After
public void tearDown() throws Throwable {
    if ( throwable != null )
        throw throwable;
}
Listing 2:
import static org.junit.Assert.fail;
import java.io.File;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Random;
import org.apache.commons.collections.Closure;
import org.apache.commons.collections.Predicate;
import org.apache.commons.lang.time.StopWatch;
import org.easymock.EasyMock;
import org.easymock.classextension.internal.ClassExtensionHelper;
import static org.easymock.classextension.EasyMock.*;
import ca.digitalrapids.io.DRFileUtils;

/**
 * Various utilities for testing
 */
public abstract class DRTestUtils
{
    static private Random random = new Random();

    /** Calls {@link #waitForCondition(Integer, Integer, Predicate, String)} with
     * default max wait and check period values.
     */
    static public void waitForCondition(Predicate predicate, String errorMessage)
        throws Throwable
    {
        waitForCondition(null, null, predicate, errorMessage);
    }

    /** Blocks until a condition is true, throwing an {@link AssertionError} if
     * it does not become true during a given max time.
     * @param maxWait_ms max time to wait for true condition. Optional; defaults
     * to 30 * 1000 ms (30 seconds).
     * @param checkPeriod_ms period at which to try the condition. Optional; defaults
     * to 100 ms.
     * @param predicate the condition
     * @param errorMessage message use in the {@link AssertionError}
     * @throws Throwable on {@link AssertionError} or any other exception/error
     */
    static public void waitForCondition(Integer maxWait_ms, Integer checkPeriod_ms,
        Predicate predicate, String errorMessage) throws Throwable
    {
        waitForCondition(maxWait_ms, checkPeriod_ms, predicate, new Closure() {
            public void execute(Object errorMessage)
            {
                fail((String)errorMessage);
            }
        }, errorMessage);
    }

    /** Blocks until a condition is true, running a closure if
     * it does not become true during a given max time.
     * @param maxWait_ms max time to wait for true condition. Optional; defaults
     * to 30 * 1000 ms (30 seconds).
     * @param checkPeriod_ms period at which to try the condition. Optional; defaults
     * to 100 ms.
     * @param predicate the condition
     * @param closure closure to run
     * @param argument argument for closure
     * @throws Throwable on {@link AssertionError} or any other exception/error
     */
    static public void waitForCondition(Integer maxWait_ms, Integer checkPeriod_ms,
        Predicate predicate, Closure closure, Object argument) throws Throwable
    {
        if ( maxWait_ms == null )
            maxWait_ms = 30 * 1000;
        if ( checkPeriod_ms == null )
            checkPeriod_ms = 100;
        StopWatch stopWatch = new StopWatch();
        stopWatch.start();
        while ( !predicate.evaluate(null) ) {
            Thread.sleep(checkPeriod_ms);
            if ( stopWatch.getTime() > maxWait_ms ) {
                closure.execute(argument);
            }
        }
    }

    /** Calls {@link #waitForVerify(Integer, Object)} with <code>null</code>
     * for {@code maxWait_ms}
     */
    static public void waitForVerify(Object easyMockProxy)
        throws Throwable
    {
        waitForVerify(null, easyMockProxy);
    }

    /** Repeatedly calls {@link EasyMock#verify(Object[])} until it succeeds, or a
     * max wait time has elapsed.
     * @param maxWait_ms Max wait time. <code>null</code> defaults to 30s.
     * @param easyMockProxy Proxy to call verify on
     * @throws Throwable
     */
    static public void waitForVerify(Integer maxWait_ms, Object easyMockProxy)
        throws Throwable
    {
        if ( maxWait_ms == null )
            maxWait_ms = 30 * 1000;
        StopWatch stopWatch = new StopWatch();
        stopWatch.start();
        for(;;) {
            try
            {
                verify(easyMockProxy);
                break;
            }
            catch (AssertionError e)
            {
                if ( stopWatch.getTime() > maxWait_ms )
                    throw e;
                Thread.sleep(100);
            }
        }
    }

    /** Returns a path to a directory in the temp dir with the name of the given
     * class. This is useful for temporary test files.
     * @param object test class (or an instance of it) for which to create dir
     * @return the path
     */
    static public String getTestDirPathForTestClass(Object object)
    {
        String filename = object instanceof Class ?
            ((Class)object).getName() :
            object.getClass().getName();
        return DRFileUtils.getTempDir() + File.separator + filename;
    }

    static public byte[] createRandomByteArray(int bytesLength)
    {
        byte[] sourceBytes = new byte[bytesLength];
        random.nextBytes(sourceBytes);
        return sourceBytes;
    }

    /** Returns <code>true</code> if the given object is an EasyMock mock object
     */
    static public boolean isEasyMockMock(Object object) {
        try {
            InvocationHandler invocationHandler = Proxy
                    .getInvocationHandler(object);
            return invocationHandler.getClass().getName().contains("easymock");
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
Listing 3:
@Test
public void testSomething() {
    final AtomicBoolean called = new AtomicBoolean(false);
    subject.setCallback(new SomeCallback() {
        public void callback(Object arg) {
            // check arg here
            called.set(true);
        }
    });
    subject.run();
    assertTrue(called.get());
}
I handle unit tests of threaded components the same way I handle any unit test, that is, with inversion of control and isolation frameworks. I develop in the .NET arena and, out of the box, the threading (among other things) is very hard (I'd say nearly impossible) to fully isolate. Therefore, I've written wrappers that look something like this (simplified):
public interface IThread
{
    void Start();
    ...
}

public class ThreadWrapper : IThread
{
    private readonly Thread _thread;

    public ThreadWrapper(ThreadStart threadStart)
    {
        _thread = new Thread(threadStart);
    }

    public void Start()
    {
        _thread.Start();
    }
}

public interface IThreadingManager
{
    IThread CreateThread(ThreadStart threadStart);
}

public class ThreadingManager : IThreadingManager
{
    public IThread CreateThread(ThreadStart threadStart)
    {
        return new ThreadWrapper(threadStart);
    }
}
From there, I can easily inject the IThreadingManager into my components and use my isolation framework of choice to make the thread behave as I expect during the test. That has so far worked great for me, and I use the same approach for the thread pool, things in System.Environment, Sleep, etc.
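A rough Java analogue of this injection idea, offered as a sketch rather than a port of the C# wrappers above: the JDK already has an abstraction for "something that runs a Runnable", java.util.concurrent.Executor, so the component can take one in its constructor. Notifier is a hypothetical class used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;

/** Component that needs background work but is handed its Executor from outside. */
final class Notifier {
    private final Executor executor;
    private final List<String> sent = new ArrayList<>();

    Notifier(Executor executor) { this.executor = executor; }

    /** Hands the work to whatever Executor was injected. */
    void notifyAsync(String message) {
        executor.execute(() -> {
            synchronized (sent) { sent.add(message); }
        });
    }

    /** Returns a snapshot copy of the messages sent so far. */
    List<String> sent() {
        synchronized (sent) { return new ArrayList<>(sent); }
    }
}
```

Production code could pass `r -> new Thread(r).start()` or a thread pool, while a test passes `Runnable::run`, which executes the task on the calling thread and so makes the test fully deterministic.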
Pete Goodliffe has a series on the unit testing of threaded code. It's hard. I take the easier way out and try to keep the threading code abstracted from the actual test. Pete does mention that the way I do it is wrong but I've either got the separation right or I've just been lucky.
For Java, check out chapter 12 of JCIP. There are some concrete examples of writing deterministic, multi-threaded unit tests to at least test the correctness and invariants of concurrent code. "Proving" thread-safety with unit tests is much dicier. My belief is that this is better served by automated integration testing on a variety of platforms/configurations.
Have a look at my related answer at Designing a Test class for a custom Barrier. It's biased towards Java but has a reasonable summary of the options. In summary though (IMO), it's not the use of some fancy framework that will ensure correctness but how you go about designing your multithreaded code. Splitting the concerns (concurrency and functionality) goes a huge way towards raising confidence. Growing Object-Oriented Software, Guided by Tests explains some options better than I can. Static analysis and formal methods (see Concurrency: State Models and Java Programs) are an option, but I've found them to be of limited use in commercial development. Don't forget that load/soak style tests are rarely guaranteed to highlight problems. Good luck!
I like to write two or more test methods to execute on parallel threads, and each of them make calls into the object under test. I've been using Sleep() calls to coordinate the order of the calls from the different threads, but that's not really reliable. It's also a lot slower because you have to sleep long enough that the timing usually works. I found the Multithreaded TC Java library from the same group that wrote FindBugs. It lets you specify the order of events without using Sleep(), and it's reliable. I haven't tried it yet. The biggest limitation to this approach is that it only lets you test the scenarios you suspect will cause trouble. As others have said, you really need to isolate your multithreaded code into a small number of simple classes to have any hope of thoroughly testing them. Once you've carefully tested the scenarios you expect to cause trouble, an unscientific test that throws a bunch of simultaneous requests at the class for a while is a good way to look for unexpected trouble. Update: I've played a bit with the Multithreaded TC Java library, and it works well. I've also ported some of its features to a .NET version I call TickingTest.
I just recently discovered (for Java) a tool called Threadsafe. It is a static analysis tool much like findbugs but specifically to spot multi-threading issues. It is not a replacement for testing but I can recommend it as part of writing reliable multi-threaded Java. It even catches some very subtle potential issues around things like class subsumption, accessing unsafe objects through concurrent classes and spotting missing volatile modifiers when using the double checked locking paradigm. If you write multithreaded Java give it a shot.
The following article suggests two solutions. The first wraps a semaphore (CountDownLatch) and adds functionality such as externalizing data from the internal thread. The second is to use a thread pool (see Points of Interest). Sprinkler - Advanced synchronization object
I spent most of last week at a university library studying debugging of concurrent code. The central problem is that concurrent code is non-deterministic. Typically, academic debugging has fallen into one of three camps here:
Event-trace/replay. This requires an event monitor and then reviewing the events that were sent. In a UT framework, this would involve manually sending the events as part of a test, and then doing post-mortem reviews.
Scriptable. This is where you interact with the running code with a set of triggers. "On x > foo, baz()". This could be interpreted into a UT framework where you have a run-time system triggering a given test on a certain condition.
Interactive. This obviously won't work in an automatic testing situation. ;)
Now, as the commenters above have noted, you can design your concurrent system into a more deterministic state. However, if you don't do that properly, you're just back to designing a sequential system again. My suggestion would be to focus on having a very strict design protocol about what gets threaded and what doesn't get threaded. If you constrain your interface so that there are minimal dependencies between elements, it is much easier. Good luck, and keep working on the problem.
I have had the unfortunate task of testing threaded code and they are definitely the hardest tests I have ever written. When writing my tests, I used a combination of delegates and events. Basically it is all about using PropertyNotifyChanged events with a WaitCallback or some kind of ConditionalWaiter that polls. I am not sure if this was the best approach, but it has worked out for me.
I assume that "multi-threaded" code here means something that is stateful and mutable AND accessed/modified by multiple threads concurrently. In other words, we are talking about testing a custom stateful thread-safe class/method/unit, which should be a very rare beast nowadays. Because this beast is rare, first of all we need to make sure that there are valid excuses to write it.
Step 1. Consider modifying state in the same synchronization context. Today it is easy to write composable concurrent and asynchronous code where IO or other slow operations are offloaded to the background but shared state is updated and queried in one synchronization context, e.g. async/await tasks and Rx in .NET. They are all testable by design; "real" Tasks and schedulers can be substituted to make testing deterministic (however, this is out of scope of the question). It may sound very constrained, but this approach works surprisingly well. It is possible to write whole apps in this style without needing to make any state thread-safe (I do).
Step 2. If manipulating shared state in a single synchronization context is absolutely not possible, make sure the wheel is not being reinvented and that there's definitely no standard alternative that can be adapted for the job. The code is likely to be very cohesive and contained within one unit; there's a good chance it is a special case of some standard thread-safe data structure like a hash map or collection. Note: if the code is large / spans multiple classes AND needs multi-threaded state manipulation, then there's a very high chance that the design is not good; reconsider Step 1.
Step 3. If this step is reached, then we need to test our own custom stateful thread-safe class/method/unit. I'll be dead honest: I never had to write proper tests for such code. Most of the time I get away at Step 1, sometimes at Step 2. The last time I had to write custom thread-safe code was so many years ago that it was before I adopted unit testing, and I probably wouldn't have to write it with my current knowledge anyway. If I really had to test such code (finally, the actual answer), then I would try a couple of the things below:
Non-deterministic stress testing, e.g. run 100 threads simultaneously and check that the end result is consistent. This is more typical for higher level / integration testing of multi-user scenarios, but it can also be used at the unit level.
Expose some test 'hooks' where the test can inject code to help create deterministic scenarios in which one thread must perform an operation before the other. As ugly as it is, I can't think of anything better.
Delay-driven testing to make threads run and perform operations in a particular order. Strictly speaking, such tests are non-deterministic too (there's a chance of a system freeze / stop-the-world GC that can distort otherwise orchestrated delays). It is also ugly, but it avoids the hooks.
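The "test hooks" option can be sketched like this. Everything here is hypothetical: LazyHolder is a deliberately unsafe lazy initializer, and afterCheckHook is the seam a test uses to hold both threads between the null check and the assignment, so the racy interleaving happens every time instead of once in a blue moon.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Deliberately NOT thread-safe lazy initializer with a test hook. (Hypothetical example.) */
final class LazyHolder {
    private Object instance;
    final AtomicInteger creations = new AtomicInteger();
    /** Test hook: runs between the null check and the assignment. No-op in production. */
    Runnable afterCheckHook = () -> {};

    Object get() {
        if (instance == null) {
            afterCheckHook.run();
            creations.incrementAndGet();
            instance = new Object();
        }
        return instance;
    }
}

final class HookDemo {
    /** Forces both threads past the null check before either assigns; returns creation count. */
    static int forceRace() {
        LazyHolder holder = new LazyHolder();
        CyclicBarrier bothPastCheck = new CyclicBarrier(2);
        holder.afterCheckHook = () -> {
            try {
                bothPastCheck.await(5, TimeUnit.SECONDS);   // wait until the other thread is here too
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        };
        Thread t1 = new Thread(holder::get);
        Thread t2 = new Thread(holder::get);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return holder.creations.get();
    }
}
```

Here the forced interleaving shows two creations, which demonstrates the bug. After making get() synchronized, the same hook would block the first thread inside the lock until the barrier times out, so a test for the fixed class needs a different hook, for example one that only records which thread entered.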
Running multiple threads is not difficult; it is piece of cake. Unfortunately, threads usually need to communicate with each other; that's what's difficult. The mechanism that was originally invented to allow communication between modules was function calls; when module A wants to communicate with module B, it just invokes a function in module B. Unfortunately, this does not work with threads, because when you call a function, that function still runs in the current thread. To overcome this problem, people decided to fall back to an even more primitive mechanism of communication: just declare a certain variable, and let both threads have access to that variable. In other words, allow the threads to share data. Sharing data is literally the first thing that naturally comes to mind, and it appears like a good choice because it seems very simple. I mean, how hard can it be, right? What could possibly go wrong? Race conditions. That's what can, and will, go wrong. When people realized their software was suffering from random, non-reproducible catastrophic failures due to race conditions, they started inventing elaborate mechanisms such as locks and compare-and-swap, aiming to protect against such things happening. These mechanisms fall under the broad category of "synchronization". Unfortunately, synchronization has two problems: It is very difficult to get it right, so it is very prone to bugs. It is completely untestable, because you cannot test for a race condition. The astute reader might notice that "Very prone to bugs" and "Completely untestable" is a deadly combination. Now, the mechanisms I mentioned above were being invented and adopted by large parts of the industry before the concept of automated software testing became prevalent; So, nobody could see how deadly the problem was; they just regarded it as a difficult topic which requires guru programmers, and everyone was okay with that. Nowadays, whatever we do, we put testing first. 
So, if some mechanism is untestable, then the use of that mechanism is just out of the question, period. Thus, synchronization has fallen out of grace; very few people still practice it, and they are becoming fewer and fewer every day. Without synchronization threads cannot share data; however, the original requirement was not to share data; it was to allow threads to communicate with each other. Besides sharing data, there exist other, more elegant mechanisms for inter-thread communication. One such mechanism is message-passing, otherwise known as events. With message passing, there is only one place in the entire software system which utilizes synchronization, and that is the concurrent blocking queue collection class that we use for storing messages. (The idea is that we should be able to get at least that little part right.) The great thing about message passing is that it does not suffer from race conditions and is fully testable.
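A minimal sketch of what testing a message-passing component can look like in Java, under the assumption that the component owns its state and is reachable only through queues. Doubler, its STOP marker and roundTrip are made-up names; the only synchronization lives inside the JDK's LinkedBlockingQueue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** A component that talks to the rest of the system only through two queues. */
final class Doubler implements Runnable {
    static final int STOP = Integer.MIN_VALUE;   // hypothetical "poison pill" message
    final BlockingQueue<Integer> inbox = new LinkedBlockingQueue<>();
    final BlockingQueue<Integer> outbox = new LinkedBlockingQueue<>();

    @Override
    public void run() {
        try {
            // Process messages one at a time until the stop marker arrives.
            for (int msg = inbox.take(); msg != STOP; msg = inbox.take()) {
                outbox.put(msg * 2);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Test helper: sends one message and waits (bounded) for the reply; null on timeout. */
    static Integer roundTrip(int value) {
        Doubler doubler = new Doubler();
        Thread worker = new Thread(doubler);
        worker.start();
        try {
            doubler.inbox.put(value);
            Integer reply = doubler.outbox.poll(5, TimeUnit.SECONDS);
            doubler.inbox.put(STOP);
            worker.join();
            return reply;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The test never inspects the component's internals or relies on timing: it sends a message in, blocks on the outbox, and asserts on the message that comes out.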
For J2E code, I've used SilkPerformer, LoadRunner and JMeter for concurrency testing of threads. They all do the same thing. Basically, they give you a relatively simple interface for administrating their version of the proxy server, required, in order to analyze the TCP/IP data stream, and simulate multiple users making simultaneous requests to your app server. The proxy server can give you the ability to do things like analyze the requests made, by presenting the whole page and URL sent to the server, as well as the response from the server, after processing the request. You can find some bugs in insecure http mode, where you can at least analyze the form data that is being sent, and systematically alter that for each user. But the true tests are when you run in https (Secured Socket Layers). Then, you also have to contend with systematically altering the session and cookie data, which can be a little more convoluted. The best bug I ever found, while testing concurrency, was when I discovered that the developer had relied upon Java garbage collection to close the connection request that was established at login, to the LDAP server, when logging in. This resulted in users being exposed to other users' sessions and very confusing results, when trying to analyze what happened when the server was brought to it's knees, barely able to complete one transaction, every few seconds. In the end, you or someone will probably have to buckle down and analyze the code for blunders like the one I just mentioned. And an open discussion across departments, like the one that occurred, when we unfolded the problem described above, are most useful. But these tools are the best solution to testing multi-threaded code. JMeter is open source. SilkPerformer and LoadRunner are proprietary. If you really want to know whether your app is thread safe, that's how the big boys do it. I've done this for very large companies professionally, so I'm not guessing. 
I'm speaking from personal experience. A word of caution: it does take some time to understand these tools. It will not be a matter of simply installing the software and firing up the GUI, unless you've already had some exposure to multi-threaded programming. I've tried to identify the 3 critical categories of areas to understand (forms, session and cookie data), with the hope that at least starting with understanding these topics will help you focus on quick results, as opposed to having to read through the entire documentation.
Concurrency is a complex interplay between the memory model, hardware, caches and our code. In the case of Java, at least, such tests have been partly addressed, mainly by jcstress. The creators of that library are known to be the authors of many JVM, GC and Java concurrency features. But even this library requires good knowledge of the Java Memory Model specification so that we know exactly what we are testing. I think the focus of this effort is microbenchmarks, though, not huge business applications.
There is an article on the topic, using Rust as the language in the example code: https://medium.com/@polyglot_factotum/rust-concurrency-five-easy-pieces-871f1c62906a In summary, the trick is to write your concurrent logic so that it is robust to the non-determinism involved with multiple threads of execution, using tools like channels and condvars. Then, if that is how you've structured your "components", the easiest way to test them is by using channels to send messages to them, and then block on other channels to assert that the component sends certain expected messages. The linked-to article is fully written using unit-tests.
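The linked article's examples are in Rust. As a rough Java analogue of the same idea, the sketch below uses `java.util.concurrent.BlockingQueue` as a stand-in for a channel. The `Doubler` component, its `STOP` sentinel and the `ChannelTestSketch` helper are all hypothetical names invented for this illustration; they do not come from the article.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical component: receives numbers on one queue, replies on another.
class Doubler implements Runnable {
    static final int STOP = Integer.MIN_VALUE; // sentinel message that ends the loop

    private final BlockingQueue<Integer> inbox;
    private final BlockingQueue<Integer> outbox;

    Doubler(BlockingQueue<Integer> inbox, BlockingQueue<Integer> outbox) {
        this.inbox = inbox;
        this.outbox = outbox;
    }

    @Override
    public void run() {
        try {
            while (true) {
                int message = inbox.take(); // blocks until a message arrives
                if (message == STOP) {
                    return;
                }
                outbox.put(message * 2);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

class ChannelTestSketch {
    // Sends one message to the component and blocks (with a timeout) for the reply.
    // Returns null if no reply arrived in time.
    static Integer sendAndReceive(int input) {
        BlockingQueue<Integer> inbox = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> outbox = new LinkedBlockingQueue<>();
        Thread worker = new Thread(new Doubler(inbox, outbox));
        worker.start();
        try {
            inbox.put(input);
            Integer reply = outbox.poll(2, TimeUnit.SECONDS);
            inbox.put(Doubler.STOP); // ask the component to shut down
            worker.join(2000);
            return reply;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndReceive(21));
    }
}
```

The test never inspects the component's internals or sleeps for an arbitrary time; it only blocks on the outgoing queue, with a timeout so that a hung component fails the test instead of hanging it.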
It's not perfect, but I wrote this helper for my tests in C#:

    using System;
    using System.Collections.Generic;
    using System.Threading;
    using System.Threading.Tasks;

    namespace Proto.Promises.Tests.Threading
    {
        public class ThreadHelper
        {
            public static readonly int multiThreadCount = Environment.ProcessorCount * 100;
            private static readonly int[] offsets = new int[] { 0, 10, 100, 1000 };

            private readonly Stack<Task> _executingTasks = new Stack<Task>(multiThreadCount);
            private readonly Barrier _barrier = new Barrier(1);
            private int _currentParticipants = 0;
            private readonly TimeSpan _timeout;

            public ThreadHelper() : this(TimeSpan.FromSeconds(10)) { } // 10 second timeout should be enough for most cases.

            public ThreadHelper(TimeSpan timeout)
            {
                _timeout = timeout;
            }

            /// <summary>
            /// Execute the action multiple times in parallel threads.
            /// </summary>
            public void ExecuteMultiActionParallel(Action action)
            {
                for (int i = 0; i < multiThreadCount; ++i)
                {
                    AddParallelAction(action);
                }
                ExecutePendingParallelActions();
            }

            /// <summary>
            /// Execute the action once in a separate thread.
            /// </summary>
            public void ExecuteSingleAction(Action action)
            {
                AddParallelAction(action);
                ExecutePendingParallelActions();
            }

            /// <summary>
            /// Add an action to be run in parallel.
            /// </summary>
            public void AddParallelAction(Action action)
            {
                var taskSource = new TaskCompletionSource<bool>();
                lock (_executingTasks)
                {
                    ++_currentParticipants;
                    _barrier.AddParticipant();
                    _executingTasks.Push(taskSource.Task);
                }
                new Thread(() =>
                {
                    try
                    {
                        _barrier.SignalAndWait(); // Try to make actions run in lock-step to increase likelihood of breaking race conditions.
                        action.Invoke();
                        taskSource.SetResult(true);
                    }
                    catch (Exception e)
                    {
                        taskSource.SetException(e);
                    }
                }).Start();
            }

            /// <summary>
            /// Runs the pending actions in parallel, attempting to run them in lock-step.
            /// </summary>
            public void ExecutePendingParallelActions()
            {
                Task[] tasks;
                lock (_executingTasks)
                {
                    _barrier.SignalAndWait();
                    _barrier.RemoveParticipants(_currentParticipants);
                    _currentParticipants = 0;
                    tasks = _executingTasks.ToArray();
                    _executingTasks.Clear();
                }
                try
                {
                    if (!Task.WaitAll(tasks, _timeout))
                    {
                        throw new TimeoutException($"Action(s) timed out after {_timeout}, there may be a deadlock.");
                    }
                }
                catch (AggregateException e)
                {
                    // Only throw one exception instead of aggregate to try to avoid overloading the test error output.
                    throw e.Flatten().InnerException;
                }
            }

            /// <summary>
            /// Run each action in parallel multiple times with differing offsets for each run.
            /// <para/>The number of runs is 4^actions.Length, so be careful if you don't want the test to run too long.
            /// </summary>
            /// <param name="expandToProcessorCount">If true, copies each action on additional threads up to the processor count. This can help test more without increasing the time it takes to complete.
            /// <para/>Example: 2 actions with 6 processors, runs each action 3 times in parallel.</param>
            /// <param name="setup">The action to run before each parallel run.</param>
            /// <param name="teardown">The action to run after each parallel run.</param>
            /// <param name="actions">The actions to run in parallel.</param>
            public void ExecuteParallelActionsWithOffsets(bool expandToProcessorCount, Action setup, Action teardown, params Action[] actions)
            {
                setup += () => { };
                teardown += () => { };
                int actionCount = actions.Length;
                int expandCount = expandToProcessorCount ? Math.Max(Environment.ProcessorCount / actionCount, 1) : 1;

                foreach (var combo in GenerateCombinations(offsets, actionCount))
                {
                    setup.Invoke();
                    for (int k = 0; k < expandCount; ++k)
                    {
                        for (int i = 0; i < actionCount; ++i)
                        {
                            int offset = combo[i];
                            Action action = actions[i];
                            AddParallelAction(() =>
                            {
                                for (int j = offset; j > 0; --j) { } // Just spin in a loop for the offset.
                                action.Invoke();
                            });
                        }
                    }
                    ExecutePendingParallelActions();
                    teardown.Invoke();
                }
            }

            // Input: [1, 2, 3], 3
            // Output: [
            //   [1, 1, 1],
            //   [2, 1, 1],
            //   [3, 1, 1],
            //   [1, 2, 1],
            //   [2, 2, 1],
            //   [3, 2, 1],
            //   [1, 3, 1],
            //   [2, 3, 1],
            //   [3, 3, 1],
            //   [1, 1, 2],
            //   [2, 1, 2],
            //   [3, 1, 2],
            //   [1, 2, 2],
            //   [2, 2, 2],
            //   [3, 2, 2],
            //   [1, 3, 2],
            //   [2, 3, 2],
            //   [3, 3, 2],
            //   [1, 1, 3],
            //   [2, 1, 3],
            //   [3, 1, 3],
            //   [1, 2, 3],
            //   [2, 2, 3],
            //   [3, 2, 3],
            //   [1, 3, 3],
            //   [2, 3, 3],
            //   [3, 3, 3]
            // ]
            private static IEnumerable<int[]> GenerateCombinations(int[] options, int count)
            {
                int[] indexTracker = new int[count];
                int[] combo = new int[count];
                for (int i = 0; i < count; ++i)
                {
                    combo[i] = options[0];
                }
                // Same algorithm as picking a combination lock.
                int rollovers = 0;
                while (rollovers < count)
                {
                    yield return combo; // No need to duplicate the array since we're just reading it.
                    for (int i = 0; i < count; ++i)
                    {
                        int index = ++indexTracker[i];
                        if (index == options.Length)
                        {
                            indexTracker[i] = 0;
                            combo[i] = options[0];
                            if (i == rollovers)
                            {
                                ++rollovers;
                            }
                        }
                        else
                        {
                            combo[i] = options[index];
                            break;
                        }
                    }
                }
            }
        }
    }

Example usage:

    [Test]
    public void DeferredMayBeBeResolvedAndPromiseAwaitedConcurrently_void0()
    {
        Promise.Deferred deferred = default(Promise.Deferred);
        Promise promise = default(Promise);
        int invokedCount = 0;

        var threadHelper = new ThreadHelper();
        threadHelper.ExecuteParallelActionsWithOffsets(false,
            // Setup
            () =>
            {
                invokedCount = 0;
                deferred = Promise.NewDeferred();
                promise = deferred.Promise;
            },
            // Teardown
            () => Assert.AreEqual(1, invokedCount),
            // Parallel Actions
            () => deferred.Resolve(),
            () => promise.Then(() => { Interlocked.Increment(ref invokedCount); }).Forget()
        );
    }
One simple test pattern that can work for some (not all!) cases is to repeat the same test many times. For example, suppose you have a method:

    def process(input):
        # Spawns several threads to do the job
        # ...
        return output

Create a bunch of tests:

    process(input1) -> expect to return output1
    process(input2) -> expect to return output2
    ...

Now run each of those tests many times. If the implementation of process contains a subtle bug (e.g. deadlock, race condition, etc.) that has a 0.1% chance of emerging on any given run, running the test 1000 times gives about a 63% probability that the bug emerges at least once. Running the test 10000 times gives a >99% probability.
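The arithmetic behind those percentages can be checked directly: if each run fails independently with probability p, the chance of seeing at least one failure in n runs is 1 - (1 - p)^n, which works out to roughly 63.2% for p = 0.001 and n = 1000. The Java sketch below computes that and also shows a bare-bones repeat loop. The class and method names are made up for this illustration, and the independence of runs is an assumption that real thread-scheduling bugs may not satisfy.

```java
import java.util.function.BooleanSupplier;

class RepeatedRuns {
    // Chance that a failure with the given per-run probability appears
    // at least once in `runs` independent runs: 1 - (1 - p)^n.
    static double chanceOfAtLeastOneFailure(double failureChancePerRun, int runs) {
        return 1.0 - Math.pow(1.0 - failureChancePerRun, runs);
    }

    // Runs the check the given number of times and counts how often it returned false.
    static int countFailures(int runs, BooleanSupplier check) {
        int failures = 0;
        for (int i = 0; i < runs; i++) {
            if (!check.getAsBoolean()) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.printf("1000 runs:  %.4f%n", chanceOfAtLeastOneFailure(0.001, 1000));
        System.out.printf("10000 runs: %.5f%n", chanceOfAtLeastOneFailure(0.001, 10000));
        // Placeholder check that always passes; a real test would call the
        // method under test and compare its output with the expected value.
        System.out.println("failures: " + countFailures(1000, () -> true));
    }
}
```

With JUnit 5, the `@RepeatedTest` annotation does the same job as the hand-written loop.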
If you are testing a simple new Thread(runnable).start(), you can mock Thread so that the runnable runs sequentially. For instance, if the code of the tested object starts a new thread like this:

    class TestedClass {
        public void doAsychOp() {
            new Thread(new myRunnable()).start();
        }
    }

then mocking the construction of new Threads and running the runnable argument sequentially can help:

    @RunWith(PowerMockRunner.class)
    @PrepareForTest(TestedClass.class) // whenNew needs the class that calls new Thread(...) to be prepared
    public class TestedClassTest {
        @Mock
        private Thread threadMock;

        @Test
        public void myTest() throws Exception {
            PowerMockito.mockStatic(Thread.class);
            // when a new thread is created, execute the runnable immediately
            PowerMockito.whenNew(Thread.class).withAnyArguments().then(new Answer<Thread>() {
                @Override
                public Thread answer(InvocationOnMock invocation) throws Throwable {
                    // immediately run the runnable
                    // (getArgumentAt is the Mockito 1.x API; Mockito 2+ uses getArgument(0))
                    Runnable runnable = invocation.getArgumentAt(0, Runnable.class);
                    if (runnable != null) {
                        runnable.run();
                    }
                    return threadMock; // return a mock so Thread.start() will do nothing
                }
            });
            TestedClass testcls = new TestedClass();
            testcls.doAsychOp(); // will invoke myRunnable.run in the current thread
            // .... check expected
        }
    }
If possible, don't use threads directly; use actors or active objects instead. They are easy to test.
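To make the active-object idea concrete, here is a minimal Java sketch. The answer does not name a library, so this one simply uses a single-thread `ExecutorService` from the JDK; `ActiveCounter` and `ActiveCounterDemo` are hypothetical names for illustration only. Every state change runs on the object's one private thread, so the state itself needs no locking, and a test only has to wait on the returned `Future`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical active object: callers enqueue requests, one worker thread executes them.
class ActiveCounter implements AutoCloseable {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private int count = 0; // only ever touched by the worker thread

    Future<Integer> increment() {
        Callable<Integer> request = () -> ++count;
        return worker.submit(request);
    }

    @Override
    public void close() {
        worker.shutdown(); // already queued requests still run
    }
}

class ActiveCounterDemo {
    // Calls increment() from several threads and returns the total that was counted.
    static int hammer(int callers, int callsEach) {
        try (ActiveCounter counter = new ActiveCounter()) {
            Thread[] threads = new Thread[callers];
            for (int i = 0; i < callers; i++) {
                threads[i] = new Thread(() -> {
                    for (int j = 0; j < callsEach; j++) {
                        counter.increment();
                    }
                });
                threads[i].start();
            }
            for (Thread thread : threads) {
                thread.join();
            }
            // Requests run in FIFO order, so this one runs after all the others.
            return counter.increment().get() - 1;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hammer(8, 1000));
    }
}
```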
You can use EasyMock.makeThreadSafe to make the mock instance used in the test thread-safe.
Using asynchronous methods in enterprise system
I have been working on a very large enterprise system for a financial institution for quite some time. I have noticed only a few usages of asynchronous methods (frankly speaking, maybe 2 or 3). Let's say I have 3 methods: doSomething1(), doSomething2(), doSomething3();

    // X = {1,2,3}
    SomeResult doSomethingX() {
        // execution of this method takes 5-15 secs
    }

    xxx foo() {
        SomeResult result1 = doSomething1();
        SomeResult result2 = doSomething2();
        SomeResult result3 = doSomething3();
        // some code
    }

So the execution of foo takes about 3 x (5-15) sec = ~30 sec. There are a lot of methods similar to foo in our system, and I am wondering why there are no async methods. Wouldn't just adding @Async to the doSomething methods make it much faster? Or is it just a case of 'we don't use threads explicitly in enterprise systems'?
It is always worth remembering that code written before you joined a project may have been written by someone who had more experience, or who had to solve a unique issue you have not seen and, after trying smarter ways, had to do something that seems strange to you. Maybe there is some state you're missing that would not be in place if it were done asynchronously. But of course, it could just be the case that either: a) the developers didn't know about it or didn't use it, or b) it wasn't available at the time for whatever reason. Enterprises certainly aren't allergic to asynchronous code, multi-threading, or anything else you may think of.
If you are using Spring, you can add the @Async annotation to doSomething(), but that's not all you have to do: you have to return an AsyncResult from the method, and you have to use Future to manage your return values. The following "code" is taken more or less whole cloth from the Spring example at https://spring.io/guides/gs/async-method/:

    Future res1 = doSomething("one");
    Future res2 = doSomething("two");
    Future res3 = doSomething("three");
    // Wait until they are all done
    while (!(res1.isDone() && res2.isDone() && res3.isDone())) {
        Thread.sleep(10); // 10-millisecond pause between each check
    }
    System.out.println(res1.get());

That's already a fair amount of orchestration (perhaps there are better ways), but it gives you an idea of the amount of labor that will go into handling concurrency at a low level. With complexity comes risk. It seems to me that most folks have come to the conclusion that it's better to let the container handle such scaling issues rather than to handle them by hand. You're supposed to let the container scale your EJBs and your queue workers. There are plenty of Java implementations that let you scale in this way. Nonetheless, if you made something that took 60 seconds take 5 using a low-level method like the above, go for it. You'll be a hero.
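For comparison, here is the same fan-out written with plain `java.util.concurrent` instead of Spring. This is only a sketch under stated assumptions: `doSomething` is a hypothetical stand-in that sleeps for 200 ms, `ParallelCalls` and `runAll` are invented names, and `List.of` needs Java 9 or later. Because `Future.get()` blocks until the result is ready, the polling loop is not needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelCalls {
    // Hypothetical stand-in for one of the slow doSomethingX() calls.
    static String doSomething(String name) {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result-" + name;
    }

    // Starts all calls at once, then collects the results in submission order.
    // Assumes `names` is not empty (a fixed pool needs at least one thread).
    static List<String> runAll(List<String> names) {
        ExecutorService pool = Executors.newFixedThreadPool(names.size());
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String name : names) {
                Callable<String> call = () -> doSomething(name);
                futures.add(pool.submit(call));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until that call has finished
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        System.out.println(runAll(List.of("one", "two", "three")));
        System.out.println("elapsed ms: " + (System.nanoTime() - start) / 1_000_000);
    }
}
```

The three calls overlap, so the elapsed time should be close to that of one call rather than the sum of all three. That only holds if the calls really are independent of each other, which is exactly the kind of hidden state the previous answer warns about.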
Thread Safety Vs Performance
I am trying to decide on the best way to get high performance while still achieving thread safety (synchronization) at the points that require it. Consider the following case. There are two entry points in the system, and I want to make sure that no two threads update cashAccounts and itemStore at the same time. So I created an object called Lock and use it as follows.

    public class ForwardPath {
        public void fdWay() {
            synchronized (Lock.class) {
                // here I am updating both cashAccount object and
                // itemStore object
            }
        }
    }

    public class BackWardPath {
        public void bwdWay() {
            synchronized (Lock.class) {
                // here I am updating both cashAccount object and
                // itemStore object
            }
        }
    }

But this implementation will greatly decrease performance if both ForwardPath and BackWardPath are triggered frequently. In this case it is also somewhat difficult to lock only cashAccount and itemStore, because both of these objects get updated several times inside both paths. Is there a good way to achieve both performance and thread safety in this scenario?
The example is far too abstract, and the little you describe leaves no alternative to synchronization in the methods. To obtain high scalability (that's not necessarily the highest performance in all situations, mind you), work is usually subdivided into units of work that are completely independent of each other, so they can be processed without any synchronization. Let's assume a simple example, summing up numbers (purely to demonstrate the principle): the naive solution would be to have one accumulator for the sum and walk the numbers, adding each to the accumulator. Obviously, if you wanted to use multiple threads, the accumulator would need to be synchronized and would become the major point of contention. To eliminate the contention, you can partition the numbers into multiple slices, which are separate units of work. Each unit of work can be summed independently (one thread per unit of work, for example). To get the final sum, add up the partial sums of each unit of work. The only point where synchronization is now needed is when combining the partial results. If you had, for example, 10 billion numbers and divided them into 10 units of work, you would need to synchronize only 10 times, instead of 10 billion times in the naive solution. The principle is always the same here: make sure you can do as much work as possible without synchronization, then combine the partial results to obtain the final result. Thinking at the level of individual operations is too fine a granularity to lend itself well to multi-threading.
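The summing example can be sketched in Java as follows. This is an illustration of the principle only, not a tuned implementation: `PartitionedSum`, `sumSlice` and `parallelSum` are names invented here, and a fixed thread pool with one thread per slice is just one of several ways to run the units of work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PartitionedSum {
    // One unit of work: sums numbers[from, to) using only local state, so no locking is needed.
    static long sumSlice(int[] numbers, int from, int to) {
        long sum = 0;
        for (int i = from; i < to; i++) {
            sum += numbers[i];
        }
        return sum;
    }

    // Splits the array into `slices` independent units of work (slices must be >= 1).
    static long parallelSum(int[] numbers, int slices) {
        ExecutorService pool = Executors.newFixedThreadPool(slices);
        try {
            int sliceSize = (numbers.length + slices - 1) / slices; // rounded up
            List<Future<Long>> partials = new ArrayList<>();
            for (int s = 0; s < slices; s++) {
                final int from = Math.min(s * sliceSize, numbers.length);
                final int to = Math.min(from + sliceSize, numbers.length);
                Callable<Long> unitOfWork = () -> sumSlice(numbers, from, to);
                partials.add(pool.submit(unitOfWork));
            }
            // The only coordination point: one wait per slice, not one per number.
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] numbers = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println(parallelSum(numbers, 3));
    }
}
```

Applied to the question, the same thinking would mean looking for updates to cashAccount and itemStore that can be computed independently and merged once, rather than taking a shared lock for every single update. Whether that is possible depends on details the question does not give.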
Gaining performance by using threads is an architectural question. Just adding some threads and synchronized blocks won't do the trick; it usually just messes up your code without making it any faster than before. Your code example is therefore not enough to help with the actual problem you seem to be facing, as each threaded solution is specific to the actual code.