"Cannot reproduce" - is Java deterministic multithreading possible? - java

Is it possible to run a multithreaded Java application deterministically? I mean having exactly the same thread switching in two different runs of my application.
The reason is that I want to run a simulation under exactly the same conditions on every run.
A similar case is passing a fixed seed to a random number generator so that it always produces the same "random" sequence.

I am not aware of any practical way to do this.
In theory, it would be possible to implement a bytecode interpreter with entirely deterministic behavior under certain assumptions [1]. You would need to simulate the multiple threads by implementing the threads and the thread scheduling entirely in software and using a single native thread.
[1] For example, no I/O and no use of the system clock.
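On a much smaller scale than a bytecode interpreter, the same idea can be sketched with cooperative tasks. The following is only an illustration under stated assumptions: the "threads" are simulated tasks made of discrete steps, the class name DeterministicScheduler is hypothetical, and a seeded Random stands in for the scheduler. Everything runs on one native thread, so the same seed should always give the same interleaving.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: simulated "threads" advance one step at a time, and a
// seeded Random decides which one runs next. Only one native thread is used.
class DeterministicScheduler {

    // Runs taskCount simulated tasks of stepsPerTask steps each and returns
    // the interleaving as a trace such as "T0 T2 T0 T1".
    static String run(long seed, int taskCount, int stepsPerTask) {
        Random scheduler = new Random(seed);   // the only source of "randomness"
        int[] remaining = new int[taskCount];
        List<Integer> runnable = new ArrayList<>();
        for (int i = 0; i < taskCount; i++) {
            remaining[i] = stepsPerTask;
            runnable.add(i);
        }
        StringBuilder trace = new StringBuilder();
        while (!runnable.isEmpty()) {
            int pick = scheduler.nextInt(runnable.size());   // simulated context switch
            int task = runnable.get(pick);
            trace.append('T').append(task).append(' ');      // "execute" one step
            if (--remaining[task] == 0) {
                runnable.remove(pick);                       // remove by index: task finished
            }
        }
        return trace.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(run(42L, 3, 4));
        System.out.println(run(42L, 3, 4));   // same seed, same interleaving
    }
}
```

As the footnote says, this only stays deterministic as long as the steps themselves do no I/O and do not read the system clock.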

No, it is not possible (other than by simulating it yourself) to make multiple threads interleave in the same way on every run. Threads are not designed to do that.
If you want deterministic results, don't use threads.

As OldCurmudgeon said, it's not possible with multithreading.
If you decide to use a single thread, I prefer newSingleThreadExecutor to a plain Thread because of its flexibility.
Use newSingleThreadExecutor from Executors:
public static ExecutorService newSingleThreadExecutor()
Creates an Executor that uses a single worker thread operating off an unbounded queue. (Note however that if this single thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.)
Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time. Unlike the otherwise equivalent newFixedThreadPool(1) the returned executor is guaranteed not to be reconfigurable to use additional threads.
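A minimal sketch of the sequential guarantee quoted above. The class and method names (SingleThreadDemo, runInOrder) are made up for illustration; only the java.util.concurrent calls are real API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SingleThreadDemo {

    // Submits n tasks from one thread and returns the order in which they ran.
    static List<Integer> runInOrder(int n) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < n; i++) {
            final int id = i;
            executor.execute(() -> order.add(id));   // queued, then run one at a time
        }
        executor.shutdown();                         // no new tasks; queued ones still run
        try {
            if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(runInOrder(5));
    }
}
```

The order is reproducible here only because all tasks are submitted from a single thread and executed by a single worker, so there is no real concurrency between them.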
Related SE questions:
Difference between Executors.newFixedThreadPool(1) and Executors.newSingleThreadExecutor()
ExecutorService vs Casual Thread Spawner

Related

How to globally set thread pool for all CompletableFuture

I am trying to mimic JavaScript's single-threaded async programming in Java using the async/await library by EA (ea-async). This is mainly because I do not have long-running CPU-bound computations in my program and I want to write single-threaded, lock-free code in Java.
The ea-async library relies heavily on CompletableFuture, and underneath Java seems to use a ForkJoinPool to run the async callbacks. This puts me into a multithreaded environment, as my CPU is multi-core. It seems that for every CompletableFuture task I can supply async with my custom thread pool executor. I could supply Executors.newSingleThreadExecutor() for this, but I need a way to set it globally so that all CompletableFutures in the JVM process use this executor. How do I do this?
"ea-async library heavily relies on the CompletableFuture in Java and underneath Java seems to use ForkJoinPool to run the async callbacks."
That is the default behavior of CompletableFuture:
All async methods without an explicit Executor argument are performed
using the ForkJoinPool.commonPool() (unless it does not support a
parallelism level of at least two, in which case, a new Thread is
created to run each task). This may be overridden for non-static
methods in subclasses by defining method defaultExecutor().
That's a defined characteristic of the class, so if you're using class CompletableFuture, not a subclass, and generating instances without specifying an Executor explicitly, then a ForkJoinPool is what you're going to get.
Of course, if you are in control of the CompletableFutures provided to ea-async, then you have the option to provide instances of a subclass that defines defaultExecutor() however you like. Alternatively, you can create your CompletableFuture objects via the static factory methods that allow you to explicitly specify the Executor to use, such as runAsync(Runnable, Executor).
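For illustration, here is a small sketch of the second option, passing an Executor explicitly to every async stage. The class name, method name and thread name are assumptions made for the example; it does not set anything globally, it only shows that stages given the same single-thread executor run on that one thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ExplicitExecutorDemo {

    // Runs a small pipeline on an explicitly supplied single-thread executor
    // and reports the result together with the thread that produced it.
    static String resultAndThread() {
        ExecutorService single = Executors.newSingleThreadExecutor(
                r -> new Thread(r, "my-single-thread"));   // hypothetical thread name
        try {
            return CompletableFuture
                    .supplyAsync(() -> 21, single)          // executor passed explicitly
                    .thenApplyAsync(x -> x * 2, single)
                    .thenApplyAsync(x -> x + " on " + Thread.currentThread().getName(), single)
                    .join();
        } finally {
            single.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(resultAndThread());
    }
}
```

Any stage created without the executor argument would fall back to the default executor described in the quoted Javadoc.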
But that's probably not what you really want to do.
If you use an executor with only one thread, then your tasks can be executed asynchronously with respect to the thread that submits them, yes, but they will be serialized with respect to each other. You do get only one thread working on them, but it will at any time be working on a specific one, sticking with that one only until it finishes, regardless of the order in which the responses actually arrive. If that's satisfactory, then it's unclear why you want async operations at all.
"This puts me into multi threaded environment as my CPU is multi-core."
It puts you in multiple threads regardless of how many cores your CPU has. That's what Executors do, even Executors.newSingleThreadExecutor(). That's the sense of "asynchronous" they provide.
If I understand correctly, you are instead looking to use one thread to multiplex I/O to multiple remote web applications. That is what java.nio.channels.Selector is for, but using that generally requires either managing the I/O operations yourself or using interfaces designed to interoperate with selectors. If you are locked in to third-party interfaces that do not afford use of a Selector, then multithreading and multiprocessing are your only viable alternatives.
In comments you wrote:
"I'm starting to think maybe BlockingQueue might do the job in consolidating all API responses into one queue as tasks where a single thread will work on them."
Again, I don't think that you want everything that comes with that, and if in fact you do, then I don't see why it wouldn't be even better and easier to work synchronously instead of asynchronously.

Is there a default thread pool in java

I can create a new thread pool in Java and execute tasks on it using the Executors.newFixedThreadPool and ExecutorService.submit methods.
Is there a 'default' threadpool that I can reuse for all executor services in my java program? Or do I just have to create a singleton that contains a default threadpool? C# has a default threadpool that runs tasks when the Task.Factory.StartNew method is called.
Since Java 8 there is ForkJoinPool.commonPool(), which is used by default by many methods involving parallel or asynchronous execution. For example, Arrays.parallelSort() and parallel Stream API operations use this pool. You can submit your own tasks to this pool using many methods of the CompletableFuture class, such as CompletableFuture.supplyAsync().
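A brief sketch of using that default pool through CompletableFuture.supplyAsync() without naming any executor. The class and method names are illustrative only.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ForkJoinPool;

class CommonPoolDemo {

    // Computes a*a and b*b as two async tasks on the default executor and
    // combines the results.
    static int sumOfSquares(int a, int b) {
        CompletableFuture<Integer> squareA = CompletableFuture.supplyAsync(() -> a * a);
        CompletableFuture<Integer> squareB = CompletableFuture.supplyAsync(() -> b * b);
        return squareA.thenCombine(squareB, Integer::sum).join();
    }

    public static void main(String[] args) {
        // Parallelism of the common pool depends on the machine it runs on.
        System.out.println("common pool parallelism: "
                + ForkJoinPool.commonPool().getParallelism());
        System.out.println(sumOfSquares(3, 4));
    }
}
```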
Using separate threadpools is good, default practice, and sharing threadpools is a (possibly premature) optimization.
Through Java 7 the answer is no, there is not a default threadpool, and the recommendation is to have many threadpools. It's good separation and will prevent blocking behavior on one collection of tasks from interfering with another.
If you share threadpools you should ask questions like:
Will the logging framework be able to distinguish tasks? (Threads are one way to distinguish them.)
If task pool A accidentally requests way too many threads and gets cut off, should task pool B starve? When you notice task pool B is failing will you be able to diagnose the problem in task pool A?
If pool A blocks should B starve?
Maybe you create something like a LightweightThreadpool. And the first 5 tasks you write use it in a lightweight fashion. And the 6th task... does, except it also writes errors to disk, and those errors are surprisingly big, and sometimes there's many of them, and they're not throttled. Suddenly the first 5 tasks are starved and have no idea what hit them, and furthermore, when you wrote those tasks, you really believed they were secure and might not have prepared for this type of incident.
So sharing threadpools is about as okay as having two different processes run on the same server is okay. You should think about resource management very carefully first and understand that the tasks are resource-coupled now. The lack of a default threadpool is trying to force you to use separate ones by default, and think about these questions carefully before sharing one.
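To make the isolation argument concrete, here is a small sketch with two separate pools. The names (SeparatePoolsDemo, poolA, poolB) and the pool sizes are assumptions for the example. Pool A is deliberately saturated with blocked tasks, and a task on pool B should still complete.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class SeparatePoolsDemo {

    // Returns true if a task on pool B finishes while every thread of pool A is blocked.
    static boolean poolBStillResponsive() {
        ExecutorService poolA = Executors.newFixedThreadPool(2);
        ExecutorService poolB = Executors.newFixedThreadPool(2);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Saturate pool A with tasks that block until released.
            for (int i = 0; i < 2; i++) {
                poolA.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Pool B has its own threads, so this task is not queued behind pool A's work.
            Future<String> result = poolB.submit(() -> "B finished");
            return "B finished".equals(result.get(5, TimeUnit.SECONDS));
        } catch (Exception e) {
            return false;
        } finally {
            release.countDown();   // unblock pool A so both pools can shut down
            poolA.shutdown();
            poolB.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(poolBStillResponsive());
    }
}
```

Had both kinds of task shared one two-thread pool, the "B" task would have waited in the queue behind the blocked tasks.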
As of Java 8 the answer is "yes" (per Tagir's answer on this question). But you will notice everything start to fail horribly if you submit blocking tasks to that thread pool.

Is Thread to be favoured over Executor here?

As far as I understand Executors help handling the execution of runnables. E.g. I would choose using an executor when I have several worker threads that do their job and then terminate.
The executor would handle the creation and the termination of the Threads needed to execute the worker runnables.
However now I am facing another situation. A fixed number of classes/objects shall encapsulate their own thread. So the thread is started at the creation of those objects and the Thread shall continue running for the whole life time of these objects.
These few objects are in turn created at the start of the program and exist for its whole run time.
I guess Threads are preferable over Executors in this situation, however when I read the internet everybody seems to suggest using Executors over Threads in any possible situation.
Can somebody please tell me if I want to choose Executors or Threads here and why?
Thanks
You're somewhat mixing things. Executor is just an interface. Thread is a core class. There's nothing which directly implies that Executor implementations execute tasks in separate threads.
Read the first few lines of the Executor Javadoc.
So if you want full control, just use Thread and do things on your own.
Without knowing more about the context, it's hard to give a good answer, but generally speaking I'd say that the situations that call for using Thread are few and far between. If you start trying to synchronize your program "manually" using synchronized, I bet things will get out of hand quickly. (Not to mention how hard it will be to debug the code.)
Last time I used a thread was when I wanted to record some audio in the background. It was a "start"/"stop" kind of thing, and not "task oriented". (I tried long and hard to try to find an audio library that would encapsulate that for me but failed.)
If you choose a thread-based solution, I suggest you try to limit the scope of the thread so that it only executes within the associated object. As far as possible, this spares you from having to think about happens-before relations, thread-safe publication of values, etc. throughout the code.
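One possible shape for such an object is sketched below. This is an assumption about what "limiting the scope of the thread" could look like, not the answerer's code: the hypothetical OwnedThreadWorker owns its thread, other code talks to it only through a queue, and an empty Optional is used as a made-up stop marker.

```java
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical long-lived object that encapsulates its own thread.
class OwnedThreadWorker implements AutoCloseable {
    private final BlockingQueue<Optional<Integer>> inbox = new LinkedBlockingQueue<>();
    private final AtomicInteger total = new AtomicInteger();
    private final Thread thread;

    private OwnedThreadWorker(String name) {
        this.thread = new Thread(this::loop, name);
    }

    // Factory method: the thread is started only after construction has completed.
    static OwnedThreadWorker start(String name) {
        OwnedThreadWorker worker = new OwnedThreadWorker(name);
        worker.thread.start();
        return worker;
    }

    void send(int value) {
        inbox.add(Optional.of(value));
    }

    int total() {
        return total.get();
    }

    // Runs only on the owned thread.
    private void loop() {
        try {
            while (true) {
                Optional<Integer> message = inbox.take();
                if (!message.isPresent()) {
                    return;                       // stop marker received
                }
                total.addAndGet(message.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    @Override
    public void close() {
        inbox.add(Optional.empty());              // ask the thread to stop after queued work
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        OwnedThreadWorker worker = OwnedThreadWorker.start("worker-1");
        worker.send(1);
        worker.send(2);
        worker.close();
        System.out.println(worker.total());
    }
}
```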
ExecutorService can have a thread pool. This improves performance, because creating a Thread is expensive.
ExecutorService has life cycle control: shutdown(), shutdownNow(), etc. are provided.
ExecutorService is flexible: you can get a variety of behaviors by customizing the ThreadFactory, setting the thread pool size, scheduling delayed execution with ScheduledThreadPoolExecutor, etc.

Advantages of Executors over new Thread

What benefit is there in using Executors over plain Threads in a Java program?
For example:
ExecutorService pool = Executors.newFixedThreadPool(2);

void someMethod() {
    // Thread
    new Thread(new SomeRunnable()).start();
    // vs
    // Executor
    pool.execute(new SomeRunnable());
}
Does an executor just limit the number of threads it allows to have running at once (Thread Pooling)? Does it actually multiplex runnables onto the threads it creates instead? If not is it just a way to avoid having to write new Thread(runnable).start() every time?
Yes, executors will generally multiplex runnables onto the threads they create; they'll constrain and manage the number of threads running at once; they'll make it much easier to customize concurrency levels. Generally, executors should be preferred over just creating bare threads.
Creating new threads is expensive. Because executors typically use a thread pool, you get to reuse threads easily, resulting in better performance.
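The multiplexing and reuse described in these answers can be observed with a small experiment. This is a sketch with made-up names (PoolReuseDemo, distinctThreads): many runnables are handed to a small fixed pool, and the number of distinct worker threads that actually ran them is counted.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PoolReuseDemo {

    // Runs `tasks` runnables on a pool of `poolSize` threads and returns how
    // many distinct threads executed them.
    static int distinctThreads(int poolSize, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> threadNames.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return threadNames.size();
    }

    public static void main(String[] args) {
        // 100 tasks, but never more than 2 worker threads.
        System.out.println(distinctThreads(2, 100));
    }
}
```

With new Thread(runnable).start() the same 100 tasks would have created 100 threads.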
"Does an executor just limit the number of threads it allows to have running at once (Thread Pooling)?"
Executors#newFixedThreadPool(int) and Executors#newSingleThreadExecutor do this, each under different terms (read their Javadoc to learn more).
"Does it actually multiplex runnables onto the threads it creates instead?"
Yes.
"If not is it just a way to avoid having to write new Thread(runnable).start() every time?"
ExecutorService helps you to control the way you handle threads. Of course, you can do this manually, but there's no need to reinvent the wheel. Also, there are other functionalities that ExecutorService provides you like executing asynchronous tasks through the usage of Future instances.
There are multiple concerns related to threads:
managing threads
resource utilization
creation of threads
Executors provides different kinds of implementations for creating a pool of threads. Thread creation is also costly. Executors create and manage these threads internally. Details can be found at the link below.
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html
As I said over in a related question, Threads are pretty bad. Executors (and the related concurrency classes) are pretty good:
Caveat: Around here, I strongly discourage the use of raw Threads. I
much prefer the use of Callables and FutureTasks (From the javadoc: "A
cancellable asynchronous computation"). The integration of timeouts,
proper cancelling and the thread pooling of the modern concurrency
support are all much more useful to me than piles of raw Threads.
For example, I'm currently replacing a legacy piece of code that used a disjoint Thread running in a loop with a self-timer to determine how long it should Thread.sleep() after each iteration. My replacement will use a very simple Runnable (to hold a single iteration), a ScheduledExecutorService to run one of the iterations and the Future resulting from the scheduleAtFixedRate method to tune the timing between iterations.
While you could argue that replacement will be effectively equivalent to the legacy code, I'll have replaced an arcane snarl of Thread management and wishful thinking with a compartmentalized set of functionality that separates the concerns of the GUI (are we currently running?) from data processing (playback at 5x speed) and file management (cancel this run and choose another file).
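A rough sketch of that replacement pattern, under the assumption that a fixed number of iterations is enough to illustrate it. The names (FixedRateDemo, runIterations) are invented here, and the iteration body is only a counter; the answer's actual code is not shown in the post.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class FixedRateDemo {

    // Schedules a single-iteration Runnable every periodMillis, waits until it
    // has run at least `iterations` times, then cancels it via its Future.
    static int runIterations(int iterations, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger count = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(iterations);
        ScheduledFuture<?> handle = scheduler.scheduleAtFixedRate(() -> {
            count.incrementAndGet();              // one iteration of the former loop body
            done.countDown();
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        try {
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            handle.cancel(false);                 // stop future iterations, let a running one finish
            scheduler.shutdown();
        }
        return count.get();                       // may exceed `iterations` slightly
    }

    public static void main(String[] args) {
        System.out.println(runIterations(3, 20));
    }
}
```

No Thread.sleep() bookkeeping is needed; the timing lives in the scheduler and cancellation lives in the returned Future.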

Parallel processing in Java; advice needed, e.g. on Runnable/Callable interfaces

Assume that I have a set of objects that need to be analyzed in two different ways, both of which take a relatively long time and involve I/O calls. I am trying to figure out how, or whether, I could optimize this part of my software, especially by utilizing multiple processors (the machine I am working on, for example, is an 8-core i7 that almost never goes above 10% load during execution).
I am quite new to parallel programming or multithreading (not sure what the right term is), so I have read some of the prior questions, paying particular attention to highly voted and informative answers. I am also in the process of going through the Oracle/Sun tutorial on concurrency.
Here's what I have thought out so far:
A thread-safe collection holds the objects to be analyzed
As soon as there are objects in the collection (they come a couple at a time from a series of queries), a thread per object is started
Each specific thread takes care of the initial pre-analysis preparations; and then calls on the analyses.
The two analyses are implemented as Runnables/Callables, and thus called on by the thread when necessary.
And my questions are:
Is this a reasonable scheme, if not, how would you go about doing this?
In order to make sure things don't get out of hand, should I implement a ThreadManager or something of that sort, which starts and stops threads and redistributes them when they are complete? For example, if I have 256 objects to be analyzed and 16 threads in total, the ThreadManager assigns the first finished thread to the 17th object to be analyzed, etc.
Is there a dramatic difference between Runnable/Callable other than the fact that Callable can return a result? Otherwise should I try to implement my own interface, in that case why?
Thanks,
You could use a BlockingQueue implementation to hold your objects and spawn your threads from there. This interface is based on the producer-consumer principle. The put() method will block if your queue is full until there is some more space and the take() method will block if the queue is empty until there are some objects again in the queue.
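The blocking behavior of put() and take() can be sketched as follows. The class name, the queue capacity of 4 and the STOP marker are assumptions made for this example, not part of the BlockingQueue API.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class ProducerConsumerDemo {
    private static final int STOP = -1;   // hypothetical end-of-stream marker

    // Produces the numbers 1..items on the calling thread, consumes them on a
    // second thread, and returns the sum the consumer computed.
    static int produceAndConsume(int items) {
        // Small capacity, so put() really does block when the consumer falls behind.
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(4);
        int[] sum = new int[1];
        Thread consumer = new Thread(() -> {
            try {
                // take() blocks while the queue is empty.
                for (int value = queue.take(); value != STOP; value = queue.take()) {
                    sum[0] += value;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            for (int i = 1; i <= items; i++) {
                queue.put(i);             // blocks while the queue is full
            }
            queue.put(STOP);
            consumer.join();              // join() makes the consumer's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println(produceAndConsume(100));
    }
}
```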
An ExecutorService can help you manage your pool of threads.
If you are awaiting a result from your spawned threads, then the Callable interface is a good choice, since you can start the computation early and continue working while the results are pending in Futures. As for the differences from the Runnable interface, from the Callable Javadoc:
The Callable interface is similar to Runnable, in that both are designed for classes whose instances are potentially executed by another thread. A Runnable, however, does not return a result and cannot throw a checked exception.
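Both differences named in the Javadoc quote, returning a result and throwing a checked exception, show up in this short sketch. The class name, the messages and the use of IOException are illustrative assumptions.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CallableVsRunnableDemo {

    // Returns the Callable's result, or a message describing the checked
    // exception it threw.
    static String call(boolean fail) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> analysis = () -> {
                if (fail) {
                    // A checked exception: allowed in call(), not in Runnable.run().
                    throw new IOException("disk unavailable");
                }
                return "analysis done";
            };
            Future<String> future = pool.submit(analysis);
            return future.get();          // blocks until the result is ready
        } catch (ExecutionException e) {
            // The task's exception arrives wrapped in an ExecutionException.
            return "failed: " + e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(call(false));
        System.out.println(call(true));
    }
}
```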
Some general things you need to consider in your quest for java concurrency:
Visibility is not guaranteed by default. volatile, AtomicReference and the other classes in the java.util.concurrent.atomic package are your friends.
You need to carefully ensure atomicity of compound actions using synchronization and locks.
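As a small illustration of the atomicity point: counter++ on a plain int is a compound read-modify-write and can lose updates under contention, whereas AtomicInteger.incrementAndGet() performs it as one atomic step. The class and method names below are made up for the sketch.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterDemo {

    // Increments a shared counter from several threads and returns the final value.
    static int countWithAtomic(int threads, int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.incrementAndGet();   // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithAtomic(4, 10_000));
    }
}
```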
Your idea is basically sound. However, rather than creating threads directly, or indirectly through some kind of ThreadManager of your own design, use an Executor from Java's concurrency package. It does everything you need, and other people have already taken the time to write and debug it. An executor manages a queue of tasks, so you don't need to worry about providing the threadsafe queue yourself either.
The main difference between Callable and Runnable is that the former returns a value (it may also throw a checked exception). Executors will handle both, and treat them the same.
It's not clear to me whether you're planning to make the preparation step a separate task to the analyses, or fold it into one of them, with that task spawning the other analysis task halfway through. I can't think of any reason to strongly prefer one to the other, but it's a choice you should think about.
The Executors class provides factory methods for creating thread pools. Specifically, Executors#newFixedThreadPool(int nThreads) creates a thread pool of fixed size that uses an unbounded queue. Also, if a thread terminates due to a failure, a new thread will take its place. So in your specific example of 256 tasks and 16 threads you would call
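import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// create pool
ExecutorService threadPool = Executors.newFixedThreadPool(16);
// submit task (the empty body is a placeholder for the real analysis)
Runnable task = () -> { /* analyze one object here */ };
threadPool.submit(task);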
The important question is determining the proper number of threads for your thread pool. See if this helps: "Efficient Number of Threads".
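Putting the 256-objects, 16-threads example together, a fuller sketch might look like this. The analyze method is a hypothetical stand-in for the real analysis, and invokeAll is just one of several ways to wait for all results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class AnalyzeAllDemo {

    // Hypothetical stand-in for one real analysis.
    static int analyze(int objectId) {
        return objectId * 2;
    }

    // Analyzes `objects` items on a fixed pool of `threads` workers and sums the results.
    static int analyzeAll(int objects, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 0; i < objects; i++) {
                final int id = i;
                tasks.add(() -> analyze(id));
            }
            int total = 0;
            // invokeAll queues every task; free workers pick up the next one as they finish.
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(analyzeAll(256, 16));
    }
}
```

The pool plays the role of the ThreadManager from the question: no code here assigns the 17th object to a particular thread.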
Sounds reasonable, but it's not as trivial to implement as it may seem.
Maybe you should check the jsr166y project.
That's probably the easiest solution to your problem.
