Okay, I know it's dangerous, I know it's deprecated, and I know using it would make baby Jesus cry. I think I'm aware of the implications of calling it and have read this related question.
Here's my scenario. I would like to test a data processing library. It runs multiple jobs, one per thread. Each job only communicates with other jobs via an out-of-process queueing system. Otherwise, jobs are independent: there is no shared state between threads, at least not in my code base.
I would like to test that if some terrible thing such as an OutOfMemoryError or a cosmic ray killing the VM happens at some random point in a job, that the rest of the system is okay. Therefore I want to stop a thread at a completely arbitrary point, and killing the thread should not leave resources accessible by other threads in an undefined state. The job logic is part of a framework that I don't want to compromise for the purposes of this test so it's not viable to intersperse random exits throughout the job code.
Is this an appropriate use of Thread.stop()? And so that this is not an XY question, is there any other practical way to accomplish my goal? (I suppose it could be done with bytecode instrumentation but I think that would be tremendously difficult.)
I am working on a platform that hosts small Java applications, all of which currently use a single thread, live inside a Docker engine, consume data from a Kafka server and log to a central DB.
Now, I need to put another Java application to this platform. This app at hand uses multithreading relatively heavily, I already tested it inside a Docker container and it works perfectly there, so I'm ready to deploy it on the platform where it would be scaled manually, that is, some human would define the number of containers that would be started, each of them containing an instance of this app.
My Architect has an objection, saying that "In a distributed environment we never use multithreading". So now, I have to refactor my application eliminating any thread related logic from it, making it single threaded. I requested a more detailed reasoning from him, but he yells "If you are not aware of this principle, you have no place near Java".
Is it really a mistake to use a multithreaded Java application in a distributed system: a simple cluster with ten or twenty physical machines, each hosting a number of virtual machines, which in turn run Docker containers with Java applications inside them?
Honestly, I don't see the problem of multithreading inside a container.
Is it really a mistake or somehow "forbidden"?
Thanks.
When you write for example a web application that will run in a Java EE application server, then normally you should not start up your own threads in your web application. The application server will manage threads, and will allocate threads to process incoming requests on the server.
However, there is no hard rule or reason why it is never a good idea to use multi-threading in a distributed environment.
There are advantages to making applications single-threaded: the code will be simpler and you won't have to deal with difficult concurrency issues.
But "in a distributed environment we never use multithreading" is not necessarily always true and "if you are not aware of this principle, you have no place near Java" sounds arrogant and condescending.
I guess he only tells you this as using a single thread eliminates multi threading and data ordering issues.
There is nothing wrong with multithreading though.
Distributed systems usually have tasks that are heavily I/O bound.
If I/O calls are blocking in your system
The only way to achieve concurrency within the process is spawning new threads to do other useful work. (Multi-threading).
The caveat with this approach is that, if there are too many threads in flight, the operating system will spend too much time context switching between threads, which is wasteful work.
If I/O calls are Non-Blocking in your system
Then you can avoid the Multi-threading approach and use a single thread to service all your requests. (read about event-loops or Java's Netty Framework or NodeJS)
The upside for single thread approach
The OS does not do any wasteful thread context switches.
You will NOT run into any concurrency problems like dead locks or race conditions.
The downside is that
It is often harder to code/think in a non-blocking fashion
You typically end up using more memory in the form of blocking queues.
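To make the blocking case concrete, here is a minimal sketch of keeping the number of threads in flight bounded with a fixed-size pool, so blocking calls overlap without unbounded context switching. The class name, blockingFetch and the 20 ms sleep are made up for illustration; a real job would block on a socket or a file instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BoundedBlockingIo {

    // Stand-in for a blocking I/O call (hypothetical; sleeps instead of doing real I/O).
    static int blockingFetch(int requestId) throws InterruptedException {
        Thread.sleep(20);
        return requestId * 2;
    }

    // Runs `requests` blocking calls on at most `maxThreads` threads and sums the results.
    static int fetchAll(int requests, int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int id = i;
                futures.add(pool.submit(() -> blockingFetch(id)));
            }
            int sum = 0;
            for (Future<Integer> f : futures) {
                sum += f.get(); // waits for that task to finish
            }
            return sum;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(10, 4)); // prints 90
    }
}
```

The pool size is the knob: it caps how many threads the OS has to switch between, whatever the number of pending requests.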
What? We use RxJava and Spring Reactor pretty heavily in our application and it works pretty fine. You can't work with threads across two JVMs anyway. So just make sure that your logic is working as you expect on a single JVM.
I have an application which is scheduler running different threads.
The application may load new Runnable classes and run them.
Currently the application is in production, that is it's running on remote server.
My team consists of 3 people developing Runnable classes.
When the class is ready, it's uploaded to server and loaded to scheduler.
I would like to give my team the ability to debug specific threads.
That is: person A may debug the threads of Runnable A, person B those of Runnable B, and so on.
Giving them full access to the remote JVM is not a solution, because
the developers are not allowed to see the system core or each other's solutions.
So my question is: how to allow multiple remote debugging with thread specific connections?
Preferable IDE: Eclipse
EDIT:
It's possible to connect remotely to specific thread with jdb
http://docs.oracle.com/javase/7/docs/technotes/tools/windows/jdb.html
Here is an example: http://www.itec.uni-klu.ac.at/~harald/CSE/Content/debugging.html
1) Find your thread with jdb threads
2) Put breakpoint and enter the wanted thread
Still the security issue stays.
One solution was to compile the protected code without debug symbols, but that only protects the core; developers could still see each other's threads.
So, next step - digging Security Manager. Maybe there's privilege layer suitable for my situation.
I'm not sure I've got a good answer to your question, but let's see how it pans out.
As I understand it you want to allow different developers to debug their class alone, and their class runs as a thread as part of a single Java process.
On the face of it that sort of runs counter to the nature of debugging in that normally you have access to everything in the process. I don't imagine that Java is any different to any other language in this respect (I'm no Java programmer).
So how about running the classes in separate Java processes. That way I presume the standard Eclipse tools would allow each developer to remote attach and debug their class.
However I presume that these classes need to interact with each other in some way, otherwise you wouldn't be asking your question in the first place. And running each class in a separate process (JVM) sounds like a bad thing as far as interaction is concerned.
So how about a different form of interaction where the process boundary between each class doesn't really matter that much? You could look at using JCSP which, as far as I can tell, doesn't really care if two threads are in the same process or not.
It's a completely different interaction model, based solely on synchronous message passing. You get some nice fringe benefits - scalability is suddenly no longer a massive problem, and it allows you to dodge many pitfalls normally associated with multithreaded programs (deadlock, etc). However if you've already written a large amount of code, adopting JCSP is probably a significant rewrite.
Is that anywhere near the mark? Good luck.
I am developing an application in Java that uses threads to continuously retrieve data from a website. I would like to use Junit to test them but this is not straightforward. How is it possible to test these threads that do not even have a termination point?
One possibility is to pull out the work that the threads do into helper methods or classes that can be tested separately in single-threaded unit tests.
Another is to provide mock objects that are invoked by the threads, and can check that the expected behaviour occurs.
Another is to spawn the worker threads, and get your test to poll something that will tell it whether the threads worked OK (preferably with a timeout so your tests don't run forever). The problem here is that your tests can be slow and non-reproducible.
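A rough sketch of the first and third options together, using only the JDK. Everything here (PollingWorker, normalize, firstResult, the injected Supplier standing in for the website) is a made-up example, not anyone's actual code: the pure helper can be tested single-threaded, and the driver starts the endless worker, polls for a result with a timeout, then interrupts it.

```java
import java.util.Locale;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class PollingWorker implements Runnable {
    private final Supplier<String> source;       // injected so a test can pass a fake instead of a real website
    private final BlockingQueue<String> results; // where the worker publishes what it retrieved

    PollingWorker(Supplier<String> source, BlockingQueue<String> results) {
        this.source = source;
        this.results = results;
    }

    // Pure helper: can be covered by an ordinary single-threaded unit test.
    static String normalize(String raw) {
        return raw.trim().toUpperCase(Locale.ROOT);
    }

    @Override
    public void run() {
        // Loops until interrupted, like a thread with no natural termination point.
        while (!Thread.currentThread().isInterrupted()) {
            results.offer(normalize(source.get()));
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag so the loop exits
            }
        }
    }

    // Test-style driver: waits up to timeoutMillis for the first result, then stops the worker.
    static String firstResult(Supplier<String> source, long timeoutMillis) {
        BlockingQueue<String> results = new LinkedBlockingQueue<>();
        Thread worker = new Thread(new PollingWorker(source, results));
        worker.setDaemon(true);
        worker.start();
        try {
            return results.poll(timeoutMillis, TimeUnit.MILLISECONDS); // null on timeout
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } finally {
            worker.interrupt();
        }
    }
}
```

In a JUnit test you would assert on normalize directly, and on firstResult(() -> " data ", 2000) for the threaded path.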
Why not use jvisualvm (which comes packaged with JDK 6 and up) to monitor the threads?
http://docs.oracle.com/javase/6/docs/technotes/guides/visualvm/threads.html
It is not clear what you mean by ‘test them’. It’s hard for me to see what your thread is – how much functionality it has etc. A classic unit test would test the functions in your class, each on its own. But it seems that is not what you want. I assume you want to test whether many of your threads run in parallel and still do the right thing. This kind of integration test is indeed difficult.
A threaded test is in order here. You have to decide how much of the environment you want to mock – run your tests against the real web site or not. The first may not be appreciated by the site's operators, the latter might introduce errors. I would recommend TestNG instead of JUnit, as it will easily allow you to run tests in parallel in any number of threads.
Well I think it depends on exactly what you're trying to test.
If you're just trying to test whether or not threads can be spawned, well that's silly - it's baked into the JVM, and isn't going to fail any time soon. (If you have some particular resource condition like low memory that would cause it to fail, I guess that makes sense, but in most cases I'd say not.)
I would break the test up into two components. Have a test that just does the data retrieval, regardless of whether it is in its own thread or not. Then have your 'black box' test that tells your central component "Go get this data" - it spawns its threads as it feels it needs to.
I am building a tool and providing an API to the outside world, but I am not sure whether it is thread safe, and users may want to use it in a multi-threaded environment. Is there any way or tool that I can use to verify whether my API is thread safe in Java?
No. There is no such tool. Proving that a complex program is thread safe is very hard.
You have to analyze your program very carefully to ensure that it is thread safe. Consider buying "Java Concurrency in Practice" (a very good explanation of concurrency in Java).
Stress tests, or static analysis tools like PMD and FindBugs can uncover some concurrency bugs in your code. So these can show if your code is not thread-safe. However they can never prove if it is thread-safe.
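As an illustration of what such a stress test can look like, here is a small sketch that hammers a counter from several threads released at the same moment. The classes are invented for the example. The unsafe counter will often (not always) lose updates, which shows the "can reveal a bug, cannot prove its absence" point; the AtomicInteger version always ends at the expected total.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class CounterStress {

    // Not thread-safe: count++ is a read-modify-write sequence.
    static class UnsafeCounter {
        private int count;
        void increment() { count++; }
        int get() { return count; }
    }

    // Thread-safe variant backed by an atomic.
    static class SafeCounter {
        private final AtomicInteger count = new AtomicInteger();
        void increment() { count.incrementAndGet(); }
        int get() { return count.get(); }
    }

    // Runs `action` perThread times on each of `threads` threads, all released together.
    static void hammer(int threads, int perThread, Runnable action) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // line everyone up to maximise contention
                } catch (InterruptedException e) {
                    return;
                }
                for (int n = 0; n < perThread; n++) {
                    action.run();
                }
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        UnsafeCounter unsafe = new UnsafeCounter();
        SafeCounter safe = new SafeCounter();
        hammer(8, 100_000, unsafe::increment);
        hammer(8, 100_000, safe::increment);
        System.out.println("unsafe: " + unsafe.get() + " (expected 800000, often lower)");
        System.out.println("safe:   " + safe.get());
    }
}
```

A run where the unsafe counter happens to reach 800000 proves nothing, which is exactly the limitation described above.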
The most effective method is a thorough code review by developer(s) experienced in concurrency.
You can always stress-test it with tools like jmeter.
But the main problem with threads is that they're mostly unpredictable, so even with stress-tests etc. you can't be 100% sure that it will be totally thread safe.
Resources :
Wikipedia - Thread-safety
This is a variant (a so-called "reduction") of the Halting Problem. Therefore it is provably unsolvable for all non-trivial cases. (Yes, that's an edit.)
That means you can find errors by any usual means (statistics, logic) but you can never completely prove that there are none.
I suppose those people saying proving an arbitrary multithreaded program is thread-safe is impossible are, in a way, correct. An arbitrary multithreaded program, coded without following strict guidelines, simply will have threading bugs, and you can't validly prove something that isn't true.
The trick is not to write an arbitrary program, but one with threading logic simple enough to possibly be correct. This then can be unambiguously validated by a tool.
The best such tool I'm aware of is CheckThread. It works on the basis of either annotations or XML config files. If you mark a method as '@ThreadSafe' and it isn't, then you get a compile-time error. This is checked by looking at the byte code for thread-unsafe operations, e.g. read/write sequences on unsynchronised data fields.
It also handles those APIs that require methods to be called on specific threads, e.g. Swing.
It doesn't actually handle deadlocks, but those can be statically eliminated without even requiring annotation, by using a tool such as Jlint. You just need to follow some minimal standards like ensuring locks are acquired according to a DAG, not willy-nilly.
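For what "acquire locks according to a DAG" can mean in practice, here is a minimal sketch that assumes every lockable object carries a unique id and that locks are always taken in ascending id order. Account and transfer are illustrative names only and have nothing to do with Jlint or CheckThread themselves.

```java
class OrderedLocking {

    static class Account {
        final int id;   // unique id that defines the global lock order
        long balance;   // guarded by this account's monitor

        Account(int id, long balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    // Always locks the account with the smaller id first, so no cycle of waiting threads can form.
    static void transfer(Account from, Account to, long amount) {
        Account first = from.id < to.id ? from : to;
        Account second = first == from ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    // Two threads transfer in opposite directions; with unordered locking this could deadlock.
    static long runOpposingTransfers(int rounds) {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(b, a, 1);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return a.balance + b.balance; // money is conserved
    }

    public static void main(String[] args) {
        System.out.println(runOpposingTransfers(100_000)); // prints 2000
    }
}
```

If transfer locked `from` and then `to` directly, the two threads above would take the same pair of locks in opposite orders, which is the classic recipe for a deadlock.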
You cannot, and never will be able to, automatically prove that a program is thread-safe, any more than you can prove that a program is correct (unless you think you solved the halting problem, which you didn't).
So, no, you cannot verify that an API is threadsafe.
However, in quite a few cases you can prove that it is broken, which is great!
You may also be interested in automatic deadlock detection, which in quite a few cases simply "just works". I'm shipping a Java program on hundreds of desktops with such a deadlock detector installed and it is a wonderful tool. For example:
http://www.javaspecialists.eu/archive/Issue130.html
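For a rough idea of how a detector can be built with the JDK alone, here is a sketch using ThreadMXBean.findMonitorDeadlockedThreads(), which reports threads stuck in a cycle on object monitors. It is not necessarily what the linked article does, and the DeadlockProbe class and its helpers are invented for the demo, which deadlocks two daemon threads on purpose.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockProbe {

    // Number of threads currently deadlocked on object monitors (0 if none).
    static int deadlockedThreadCount() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long[] ids = bean.findMonitorDeadlockedThreads(); // null when no deadlock is found
        return ids == null ? 0 : ids.length;
    }

    // Deliberately deadlocks two daemon threads by taking two locks in opposite order.
    static void createDeadlock() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        startDaemon(lockA, lockB, bothHoldFirstLock);
        startDaemon(lockB, lockA, bothHoldFirstLock);
    }

    private static void startDaemon(Object first, Object second, CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await(); // wait until the other thread holds its first lock too
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // never reached: the other thread owns `second` and waits for `first`
                }
            }
        });
        t.setDaemon(true); // daemon threads do not keep the JVM alive
        t.start();
    }

    // Polls the detector until it reports a deadlock or the timeout elapses.
    static int waitForDeadlock(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        int count = deadlockedThreadCount();
        while (count == 0 && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            count = deadlockedThreadCount();
        }
        return count;
    }

    public static void main(String[] args) {
        createDeadlock();
        System.out.println("deadlocked threads: " + waitForDeadlock(5000));
    }
}
```

In a shipped application the polling loop would typically run on a background timer and log or report the thread ids instead of printing a count.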
You can also stress test your application in various ways.
Bogus multi-threaded programs tend to not work very well when a high load is present on the system.
Here's a question I asked about how to easily create a high CPU load on a Un*x system, for example:
Bash: easy way to put a configurable load on a system?
The classical unit testing is basically just putting x in and expecting y out, and automating that process. So it's good for testing anything that doesn't involve time. But then, most of the nontrivial bugs I've come across have had something to do with timing. Threads corrupt each others' data, or cause deadlocks. Nondeterministic behavior happens – in one run out of million. Hard stuff.
Is there anything useful out there for "unit testing" parts of multithreaded, concurrent systems? How do such tests work? Isn't it necessary to run the subject of such test for a long time and vary the environment in some clever manner, to become reasonably confident that it works correctly?
Most of the work I do these days involves multi-threaded and/or distributed systems. The majority of bugs involve "happens-before" type errors, where the developer assumes (wrongly) that event A will always happen before event B. But every 1000000th time the program is run, event B happens first, and this causes unpredictable behavior.
Additionally, there aren't really any good tools to detect timing issues, or even data corruption caused by race conditions. Tools like Helgrind and drd from the Valgrind toolkit work great for trivial programs, but they are not very useful in diagnosing large, complex systems. For one thing, they report false positives quite frequently (Helgrind especially). For another thing, it's difficult to actually detect certain errors while running under Helgrind/drd simply because programs running under Helgrind run almost 1000x slower, and you often need to run a program for quite a long time to even reproduce the race condition. Additionally, since running under Helgrind totally changes the timing of the program, it may become impossible to reproduce a certain timing issue. That's the problem with subtle timing issues; they're almost Heisenbergian in the sense that altering a program to detect timing issues may obscure the original issue.
The sad fact is, the human race still isn't adequately prepared to deal with complex, concurrent software. So unfortunately, there's no easy way to unit-test it. For distributed systems especially, you should plan your program carefully using Lamport's happens-before diagrams to help you identify the necessary order of events in your program. But ultimately, you can't really get away from brute-force unit testing with randomly varying inputs. It also helps to vary the frequency of thread context-switching during your unit-test by, e.g. running another background process which just takes up CPU cycles. Also, if you have access to a cluster, you can run multiple unit-tests in parallel, which can detect bugs much quicker and save you a lot of time.
If you can run your tests under Linux, valgrind includes a tool called helgrind which purports to detect race conditions and potential deadlocks in programs that use pthreads; you might get some benefit from running your multithreaded code under that, since it will report potential errors even if they didn't actually occur in that particular test run.
I have never heard of anything that can.
I guess if someone was to design one, it would have to have exact control over the execution of the threads and execute all possible combinations of stepping of the threads.
Sounds like a major task, not to mention the mathematical combinations for non-trivial sized threads when there are a handful or more of them...
Although, a quick search of stackoverflow... Unit testing a multithreaded application?
If the tested system is simple enough you could control the concurrency quite well by blocking operations in external mockup systems. This blocking can be done for example by waiting for some other operation to be started. If you can control all external calls this might work quite well by implementing different blocking sequences. I have tried this and it does reveal lock-level bugs quite well if you know the possible problematic sequences well. And compared to many other kinds of concurrency testing it is quite deterministic. However this approach doesn't detect low-level race conditions too well. I usually just go for load testing to find those, but I guess that isn't exactly unit testing.
I have seen these concurrency testing frameworks for .NET; I'd assume it's only a matter of time before someone writes one for Java (hopefully).
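Here is roughly what the blocking-mock idea can look like with plain JDK classes. The names are hypothetical: LazyValue is a deliberately buggy check-then-act class under test, and the mock loader blocks each caller until the other one has also arrived, which forces the problematic interleaving instead of hoping for it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class BlockingMockDemo {

    // Code under test (hypothetical): check-then-act with no synchronization.
    static class LazyValue {
        private final Supplier<String> loader;
        private volatile String value;

        LazyValue(Supplier<String> loader) {
            this.loader = loader;
        }

        String get() {
            if (value == null) {
                value = loader.get(); // two threads can both get here
            }
            return value;
        }
    }

    // Returns how many times the external loader ran when two threads race on get().
    static int loadsUnderForcedRace() {
        CountDownLatch bothInsideLoader = new CountDownLatch(2);
        AtomicInteger loads = new AtomicInteger();
        // Mock external system: each call blocks until the other caller has also started.
        Supplier<String> blockingLoader = () -> {
            loads.incrementAndGet();
            bothInsideLoader.countDown();
            try {
                bothInsideLoader.await(2, TimeUnit.SECONDS); // the timeout keeps a fixed class from hanging
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "loaded";
        };
        LazyValue lazy = new LazyValue(blockingLoader);
        Thread t1 = new Thread(lazy::get);
        Thread t2 = new Thread(lazy::get);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println("loader calls: " + loadsUnderForcedRace()); // 2 exposes the race
    }
}
```

If get() were made synchronized, the first caller would wait out the timeout alone and the second would see the cached value, so the count would drop to 1.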
And not to forget good old code reading. One of the best ways to find concurrency bugs is to just read through the code once again giving it your full concentration.
Perhaps the answer is that you shouldn't. In concurrent systems, there may not always be a single deterministic answer that is correct.
Take the example of people boarding a train and choosing a seat. You are going to end up with different results every time.
Awaitility is a useful framework when you need to deal with asynchronicity in your tests. It allows you to wait until some state somewhere in your system is updated. For example:
await().untilCall( to(myService).myMethod(), equalTo(3) );
or
await().until( fieldIn(myObject).ofType(int.class), greaterThan(1));
It also has Scala and Groovy support.