I started reading the Vert.x framework documentation, but I didn't understand how it works or what the Reactor pattern is. I read this article https://dzone.com/articles/understanding-reactor-pattern-thread-based-and-eve and noticed that, instead of the usual servlet-based approach (one request, one thread), the Reactor pattern uses an event-driven architecture: a single thread called the event loop takes a request, puts it into some sort of job queue, and registers a handler that will be executed once the task has finished. The code in that handler is executed by the same event loop, so the golden rule is: don't block the event loop.
What I don't understand is this part of the article:
Those handlers/callbacks may utilize a thread pool in multi-core environments.
So handlers use a thread pool. How is this pool different from a standard thread pool, for example the one in a servlet container like Tomcat? How do the two concepts differ in the case of an HTTP server, if both use a thread pool to manage requests?
Thanks in advance.
Forget that DZone article. Forget the Reactor pattern. Learn about asynchronous procedure calls.
There are two ways to split the work in a computer into parts: threads and tasks (in Java, tasks are Runnables). Tasks execute on a thread pool when they are ready. When they are not ready, they do not occupy a thread with its large stack, so we can afford millions of tasks in a single JVM instance, while 10,000 threads in a single JVM instance is already problematic.
The main problem with tasks is when a task needs data that is not ready yet (not yet computed by another task, or not yet arrived over the network). In the thread world, a thread waiting for data executes a blocking operation like InputStream.read(), but tasks are not allowed to do this: they would occupy too many threads from the pool and all the advantages of task-based programming would be lost. So tasks are augmented with a mechanism that submits the task to the thread pool exactly when all of its inputs have arrived. A task with such a mechanism is called an asynchronous procedure call. All the event-driven architectures are variants of asynchronous procedure calls: Vert.x, RxJava, Project Reactor, Akka actors, etc. They just present themselves as something original and do not always spell this out.
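To make that concrete, here is a minimal sketch of the idea using plain CompletableFuture (no framework; the class name and the fetchUser method are made up for illustration). The second stage is not sitting on a blocked thread waiting for the first one; it is only handed to the pool once the first result has arrived.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncCallDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);

        // fetchUser() stands in for a slow operation (network, disk, ...)
        CompletableFuture<String> user =
                CompletableFuture.supplyAsync(AsyncCallDemo::fetchUser, pool);

        // this continuation does not occupy a thread while waiting;
        // it is submitted to the pool only when fetchUser() has produced its result
        CompletableFuture<Integer> length = user.thenApplyAsync(String::length, pool);

        System.out.println("length = " + length.get()); // blocking get() only for the demo
        pool.shutdown();
    }

    private static String fetchUser() {
        try { Thread.sleep(500); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return "alice";
    }
}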
Related
Given a thread pool from Executors.newSingleThreadExecutor(), will it execute all of the requests on the same CPU core/thread, or could it pick a new one for every task?
A core and a thread are two different things. As far as physical cores are concerned, the documentation of ThreadPoolExecutor makes no guarantee about binding a thread to a core, and the JVM does not provide such low-level support out of the box.
Binding a process or thread to a processor is known as processor affinity. If, for whatever reason, you wish to bind a thread to a particular core, have a look at https://github.com/OpenHFT/Java-Thread-Affinity.
If your question is whether the same thread will execute all the tasks submitted to the pool: not necessarily. If the single worker thread terminates (for example because a task's failure killed it), the pool will create a new thread to execute subsequent tasks. I hope this clears up your query.
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executors.html#newSingleThreadExecutor()
Creates an Executor that uses a single worker thread operating off an
unbounded queue. (Note however that if this single thread terminates
due to a failure during execution prior to shutdown, a new one will
take its place if needed to execute subsequent tasks.)...
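A small throwaway sketch shows both behaviours (the class name is mine; the printed thread names are just what a typical JVM produces, not guaranteed): tasks normally run on the same worker thread, but if a task started via execute() throws and kills that thread, the pool quietly creates a replacement for the next task. Note that execute() is used on purpose; with submit() the exception would be captured in the returned Future and the worker thread would survive.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SingleThreadDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService exec = Executors.newSingleThreadExecutor();

        Runnable report = () ->
                System.out.println("running on " + Thread.currentThread().getName());

        exec.execute(report);                                        // e.g. pool-1-thread-1
        exec.execute(report);                                        // same thread again
        exec.execute(() -> { throw new RuntimeException("boom"); }); // kills the worker thread
        exec.execute(report);                                        // e.g. pool-1-thread-2 (the replacement)

        Thread.sleep(1000); // crude wait so the output appears before we shut down
        exec.shutdown();
    }
}

Which physical core any of these threads runs on is still entirely up to the OS scheduler.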
Can I create a single ThreadPoolExecutor for my whole application and run everything on it, for different kinds of work like network operations, bitmap manipulation, image loading and so on?
If the answer is no: why not, and how should I create a thread pool for every kind of operation?
If the answer is yes: does it cause a performance problem?
Thanks in advance.
Both approaches have advantages and disadvantages.
In the case of a single thread pool (a singleton implementation, I suppose):
➕ you have one entry point for submitting background tasks
➕ it is easy to implement and to control its life cycle
➖ if you have a lot of quick tasks plus some long-running tasks, the long-running tasks may hold all the threads of the limited pool while the user waits for some quick action in the UI
Different thread pools (one pool for one type of task):
➕ a pool for long-running tasks can accumulate work while quick tasks are executed independently in their own pool
➕ you know everything about the tasks in your application, so you can fine-tune the pool size for every type of task and set up thread priorities, initial stack size etc. with a thread factory
➕ if you define a thread group and thread names, it helps when debugging
➖ having several thread pools makes their life cycles harder to control
➖ this approach will not give much benefit if the tasks are poorly separated into classes
In any case you need a compromise and an assessment of the trade-offs; a sketch of the second option follows.
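Here is a rough sketch of two dedicated pools with named threads (the pool sizes, prefixes and the AppExecutors class are arbitrary choices for illustration); the naming is what makes them easy to spot in a debugger or thread dump:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class AppExecutors {

    private static ThreadFactory named(String prefix) {
        AtomicInteger n = new AtomicInteger();
        return r -> {
            Thread t = new Thread(r, prefix + "-" + n.incrementAndGet());
            t.setPriority(Thread.NORM_PRIORITY); // tune per pool if needed
            return t;
        };
    }

    // small pool for quick tasks the UI is waiting on
    public static final ExecutorService QUICK =
            Executors.newFixedThreadPool(2, named("quick"));

    // separate pool where long-running work can queue up without starving QUICK
    public static final ExecutorService LONG_RUNNING =
            Executors.newFixedThreadPool(4, named("long"));
}

Callers then pick the pool that matches the task, e.g. AppExecutors.QUICK.submit(...) for something the UI is waiting on and AppExecutors.LONG_RUNNING.submit(...) for a big upload or image decode.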
Theoretically speaking, I think you can do that, and according to the Oracle documentation it should improve your performance:
Thread pools address two different problems: they usually provide
improved performance when executing large numbers of asynchronous
tasks, due to reduced per-task invocation overhead, and they provide a
means of bounding and managing the resources, including threads,
consumed when executing a collection of tasks. Each ThreadPoolExecutor
also maintains some basic statistics, such as the number of completed
tasks.
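If you do go with one shared pool, a common starting point (just a convention, not an official recommendation) is to size it from the number of available processors and expose it as a singleton:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public final class AppThreadPool {
    // one pool for the whole application, sized to the machine
    private static final ExecutorService POOL =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    private AppThreadPool() {}

    public static ExecutorService get() {
        return POOL;
    }
}

// elsewhere (loadImage is a hypothetical method of yours):
// AppThreadPool.get().execute(() -> loadImage(url));

The caveat is the one from the other answer: if most of your tasks spend their time blocked on IO or running for a long time, a single CPU-sized pool can end up with all its threads tied up, starving the quick tasks, which is exactly the argument for separate pools per task type.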
I get that with non-blocking IO we don't need thread sprawl proportional to N concurrent requests; instead we put our tasks on a single event loop in the reactive web programming pattern.
Yes, that can help, but since the event loop is a queue, what if the first task to be processed blocks forever? Then the event loop never progresses, so nothing gets responded to or processed other than queueing up more tasks. Yes, timeouts are probably possible, but I can't wrap my head around how the event loop can be a good solution.
Say you have 3 tasks that each wait 3 seconds for IO before executing, and they get submitted to the event queue. Then they will still take 9 seconds in total to be processed and executed once their IO resolves. With blocking threads this would have taken 3 seconds, since they run concurrently.
Where I can see a benefit is if the event loop is not really a queue, and upon a signal that a task is ready it dispatches that task to be processed. In that case, though, the order of task execution is not maintained, and each task would still seem to need a running thread just to tell when its IO has resolved.
Maybe I am not understanding the event loop and thread handling correctly. Can someone correct me, please? It seems to me that this Reactor pattern could make things worse.
Lastly, upon X requests in Spring Reactor, does only 1 thread get created to run the handlers instead of the traditional X threads? In that case, if someone accidentally wrote blocking code, doesn't that mean every subsequent request gets queued up?
It is not a good idea to use the event loop for long running tasks. This is considered an anti-pattern. Usually it is merely used for quickly picking up imminent events, but not actually doing the work associated with these events if the work would block the event loop noticeably. You would want to use a separate thread pool for executing long running tasks. So the event loop would usually only initiate work using asynchronous and hence non-blocking structures (or actually doing the work only if it can be done very quickly) and pass the heavier and possibly blocking tasks to a separate thread pool (for CPU intensive computations) or to the operating system (such as data buffers to be sent over the network).
Also, don't be fooled by the fact that only one thread is dealing with the events: it is very fast and usually enough even for demanding applications. Platforms like NodeJS or frameworks like Netty (used in Akka, the Play framework, Apache Cassandra, etc.) use an event loop at their heart with great success. One should just be aware that performing blocking operations inside the event loop is generally a bad idea.
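To illustrate the "initiate on the loop, block elsewhere" idea outside of any particular framework, here is a bare-bones sketch (class, method names and timings are made up; real event loops use non-blocking IO via selectors rather than a worker pool, but the division of labour is the same): a single-threaded loop that never blocks, plus workers that do, with the result posted back to the loop as a new event. The three simulated requests all complete after roughly 3 seconds, not 9, because the waiting happens on the workers and never on the loop.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class EventLoopSketch {
    // the "event loop": one thread, short tasks only
    private static final ExecutorService LOOP = Executors.newSingleThreadExecutor();
    // workers for anything that blocks (file IO, JDBC, ...)
    private static final ExecutorService WORKERS = Executors.newFixedThreadPool(8);

    static void handleRequest(String request) {
        // runs on the loop: must return quickly, so it only *initiates* the slow work
        WORKERS.execute(() -> {
            String result = slowBlockingCall(request);      // fine here, never on the loop
            LOOP.execute(() -> reply(request, result));     // hop back to the loop
        });
    }

    static String slowBlockingCall(String request) {
        try { Thread.sleep(3000); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return "response for " + request;
    }

    static void reply(String request, String result) {
        System.out.println(Thread.currentThread().getName() + " -> " + result);
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 3; i++) {
            final String req = "req-" + i;
            LOOP.execute(() -> handleRequest(req));
        }
        // pools are left running for brevity; a real application would shut them down
    }
}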
Please have a look at some of these posts for more information:
The reactor pattern and non blocking IO
Unix Network Programming
Kotlin Webflux
Slightly off topic but still a very prominent example: Don't Block the Event Loop (NodeJS)
A question on using threads in Java (disclaimer: I am not very experienced with threads, so please allow some leeway).
Overview:
I was wondering whether there is a way for multiple threads to add actions to a queue which another thread would take care of. The order does not really matter; what matters is that the actions in the queue are handled one at a time.
Explanation:
I plan to host a small server (using servlets). I want each connection to a client to be handled by a separate thread (so far ok). However, each of these threads/clients will be making changes to a single XML file, and those changes cannot happen at the same time.
Question:
Could I have each thread submit the changes to be made to a queue which another thread continuously manages? As I said, the order of the changes does not matter, only that they do not happen at the same time.
Also, please advise if this is not the best way to do this.
Thank you very much.
This is a reasonable approach. Use an unbounded BlockingQueue (e.g. a LinkedBlockingQueue): the thread performing IO on the XML file calls take on the queue to remove the next message (blocking if the queue is empty) and then processes the message to modify the XML file, while the threads submitting changes call offer on the queue to add their messages to it. The BlockingQueue is thread-safe, so there's no need for your threads to perform any synchronization on it.
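A minimal sketch of that setup (class and method names are made up; applyToXmlFile stands in for whatever your XML update actually does):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class XmlUpdateQueue {
    private final BlockingQueue<String> changes = new LinkedBlockingQueue<>();

    // called from the per-client servlet threads (the producers)
    public void submitChange(String change) {
        changes.offer(change); // never blocks on an unbounded queue
    }

    // starts the single writer thread (the consumer)
    public void startWriter() {
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    String change = changes.take(); // blocks while the queue is empty
                    applyToXmlFile(change);         // only this thread ever touches the file
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // allows shutdown via interrupt
            }
        }, "xml-writer");
        writer.setDaemon(true);
        writer.start();
    }

    private void applyToXmlFile(String change) {
        // placeholder for the real XML modification
        System.out.println("applying: " + change);
    }
}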
You could have the threads submit tasks to an ExecutorService that has only one thread. Or you could have a lock that allows only one thread at a time to alter the file. The latter seems more natural, as the file is a shared resource; the queue is then the implicit queue of threads awaiting the lock.
The Executor interface provides the abstraction you need:
An object that executes submitted Runnable tasks. This interface provides a way of decoupling task submission from the mechanics of how each task will be run, including details of thread use, scheduling, etc. An Executor is normally used instead of explicitly creating threads.
A single-threaded executor service seems like exactly the right tool for the job. See Executors.newSingleThreadExecutor(), whose javadoc says:
Creates an Executor that uses a single worker thread operating off an
unbounded queue. (Note however that if this single thread terminates
due to a failure during execution prior to shutdown, a new one will
take its place if needed to execute subsequent tasks.) Tasks are
guaranteed to execute sequentially, and no more than one task will be
active at any given time. Unlike the otherwise equivalent
newFixedThreadPool(1) the returned executor is guaranteed not to be
reconfigurable to use additional threads.
Note that in a JavaEE context, you need to take into consideration how to terminate the worker thread when your webapp is unloaded. There are other questions here on SO that deal with this.
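One hedged way to do that with the classic javax.servlet API (jakarta.servlet in newer containers) is a ServletContextListener that owns the single-thread executor and shuts it down when the context is destroyed; the class below is only a sketch of the wiring:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import javax.servlet.annotation.WebListener;

@WebListener
public class XmlWriterLifecycle implements ServletContextListener {

    private ExecutorService xmlWriter;

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        xmlWriter = Executors.newSingleThreadExecutor();
        // servlets look the executor up from the context and submit change tasks to it
        sce.getServletContext().setAttribute("xmlWriter", xmlWriter);
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        xmlWriter.shutdown(); // stop accepting new tasks
        try {
            xmlWriter.awaitTermination(10, TimeUnit.SECONDS); // let queued changes finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}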
I'm working on a project where execution time is critical. In one of the algorithms I have, I need to save some data into a database.
What I did is call a method that does that; it fires a new thread every time it's called. I ran into an out-of-memory problem, since more than 20,000 threads were created...
My question now is: I want to start only one thread. When the method is called, it adds the job to a queue and notifies that thread, which sleeps when no jobs are available, and so on. Are there any design patterns or examples available online for this?
Run, do not walk, to your friendly Javadoc and look up ExecutorService, especially Executors.newSingleThreadExecutor().
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

ExecutorService myXS = Executors.newSingleThreadExecutor();
// then, as needed (myRunnable being whatever job you need run)...
myXS.submit(myRunnable);
And it will handle the rest.
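One detail the snippet leaves out: when the application is finished submitting jobs, call myXS.shutdown() (or shutdownNow()); the pool's single worker thread is a regular non-daemon thread and will otherwise keep the JVM alive.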
Yes, you want a worker thread or thread pool pattern.
http://en.wikipedia.org/wiki/Thread_pool_pattern
See http://www.ibm.com/developerworks/library/j-jtp0730/index.html for Java examples
I believe the pattern you're looking for is called producer-consumer. In Java, you can use the blocking methods on a BlockingQueue to pass tasks from the producers (that create the jobs) to the consumer (the single worker thread). This will make the worker thread automatically sleep when no jobs are available in the queue, and wake up when one is added. The concurrent collections should also handle using multiple worker threads.
Are you looking for java.util.concurrent.Executor?
That said, if you have 20,000 concurrent inserts into the database, using a thread pool will probably not save you: if the database can't keep up, the queue will get longer and longer until you run out of memory again. Also note that an executor's queue lives only in memory, i.e. if the server crashes, the data in it will be gone.
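If that worries you, one hedged variation (parameter values are arbitrary) is to bound the queue and use CallerRunsPolicy, so that when the queue is full the submitting thread performs the insert itself; that naturally slows the producers down and keeps memory bounded even when the database falls behind. Persistence across crashes, of course, still needs something more than an in-memory queue.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// one writer thread, at most 10_000 queued inserts; when the queue is full,
// the submitting thread runs the insert itself (built-in backpressure)
ThreadPoolExecutor dbWriter = new ThreadPoolExecutor(
        1, 1,
        0L, TimeUnit.MILLISECONDS,
        new ArrayBlockingQueue<>(10_000),
        new ThreadPoolExecutor.CallerRunsPolicy());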