Our current course assignment specifies that we are supposed to create a manager for a thread pool using the "Object Pool Manager" design pattern which spawns a set amount of threads. The ownership of these threads shall be transferred to the client and then back to the pool after the client has finished using it. If no thread exists in the pool then the client has to wait.
My confusion comes from the fact that a thread is supposedly not reusable, which defeats the purpose of pooling them. Have I understood the assignment incorrectly?
Threads are reusable as long as they have not ended. A pool of threads generally involves threads that do work as it is given to them, and then wait for more work. Thus, they never end until explicitly told to do so. The trick is designing them in a way such that the work they are given ends, but the thread itself does not. Thread pools are useful because it is often relatively expensive to create/destroy threads.
#Kaliatech has already explained the concept behind re-use of threads. Also "The ownership of these threads shall be transferred to the client" is slightly misleading as the ownership of threads generally remain with the thread-pool/object-pool as it is the manager of this pool and the client should simply submits the task to the pool which can either complete successfully or fail. The thread continues to run ready to pick the next task submitted to the pool. As a design too the separation of task object ( Runnable/Callable) and the object representing thread execution (Thread) are designed to be different. Should the need arise the thread-pool is responsible for ramping up/down the number of threads as they are expensive to create and manage. Java ThreadPoolExecutor will be a good example to refer to how typically such a thread pool works.
Related
I read it is not possible to restart the same Thread. If it should be restarted, then you have to create a new Thread.
But the thing is, the threads are limited, you can not create 40,000 threads because the operating system can only create 15,000 threads. (This is just an example).
Is there another possibility?
But the thing is, the threads are limited, you can not create 40,000 threads because the operating system can only create 15,000 threads. (This is just an example).
Yes, that's confusing - but you're mixing terms.
A java java.lang.Thread object is not an OS-level thead.
You can make a million or so Thread objects. Go ahead, try it.
You would not be able to simultaneously start all 1 million of em, however - start() causes things to occur, eventually leading to an 'OS level thread' to be created.
Once a thread ends, the java.lang.Thread object will stick around, but the OS level thread won't. Hence, your central problem thesis, which I'll summarize as:
Restarting an existing java.lang.Thread object is better than creating a new java.lang.Thread object, because I read that you can only have 40k or so java.lang.Thread objects
Is just flat out wrong. There is zero benefit between a hypothetical re-use of a j.l.Thread object by 'restarting' it (which implies it has ended), vs. just making a new object instead once the 'old' one has ended.
What you'd have to do if you want to 'restart' a thread is to have a run() method (the one that the thread ends up running in a new thread) that is a frameworky thing: It checks a queue of jobs, pulls one queue off the top, runs it, and then goes back to checking the queue. This is tantamount to 'reusing a thread', though you're now just reinventing what ExecutorPool (baked into core java itself) and friends already do.
ExecutorPools are probably a good idea. However, they don't 'solve' your problem. Your problem isn't actually a problem.
Use an Executor or ExecutorService. Class Executors has some factory methods for those. By using one that uses a thread pool you can submit actions without threads necessarily being created for each separate action.
You can use a thread pool. Database connections are often managed in a thread pool, for example. After it has done its work (e.g., queried some data), the thread returns to the pool, and can be reused.
The other thing you may be missing is: while you cannot have e.g., 40,000 threads at a time, nothing prevents you from creating new threads after old ones have finished.
like - network operation and bitmap manipulating an image loading and other kinds of work can I create a single TheadPoolExecuter for my whole application and execute on it.
if the answer is no -> why? and how to create thread pool for every single operation?
or if yes -> is performance problem occurs?
thanks in advance.
Both of approach have advantages and disadvantages.
In case of single thread pool (singleton implementation, I suppose):
➕ you have one entry point to submit background task
➕ it easily to implement and control life cycle
➖ if you have a lot of different quick tasks and some long running task, long running tasks may hold all thread in limited pool while user wait some quick action in UI
Different thread pools (one pool for one type of task):
➕ thread pool of long-running tasks can accumulate task while quick task can be executed in their own thread pool in-depend
➕ you know everything about tasks in your application - you can fine-tune pool size for every type of task, setup threads priority, initial stack size etc. with thread factory
➕ if you define thread group and thread name, it can help you in debug
➖ have different thread pools involve to hard control their life cycle
➖ this implementation will not give a lot of benefits in poor separation by tasks classes
Any case, you need some compromise and an assessment of the advantages
Talking teorically i think you can do that and according to oracle documentation should be improve your performance:
Thread pools address two different problems: they usually provide
improved performance when executing large numbers of asynchronous
tasks, due to reduced per-task invocation overhead, and they provide a
means of bounding and managing the resources, including threads,
consumed when executing a collection of tasks. Each ThreadPoolExecutor
also maintains some basic statistics, such as the number of completed
tasks.
When using a thread pool, is it beneficial to still use singular thread objects for a specific task. I'm wondering in terms of a server in Java, whether or not the thread which is listening for connections, should share its resources with any other threads which are then allocated from this one listening thread? I may also be missing the point as I'm not familiar with this concept.
Yes, singular tasks that have to run concurrently can have their own threads outside of the thread pool. Forcing every thread to be part of the pool might obscure your design because you need all kinds of machinery to make concurrent tasks look like worker threads.
I'd create two pools, one for listening and one for internal tasks. This way you're never putting your server at risk of not being able to listen for connections.
The pool for internal tasks can be small if it's only a thread now and then, but at least it's safely isolated.
Resource sharing might be necessary in cases where your server needs to maintain a global application state (e.g. using an AtomicLong for the number of requests served by your server etc.). Your main thread would typically wait, ready to accept incoming connections/requests. You then update the global state (like hit counter), create a new "job" based on the new request (typically a Runnable or Callable) and submit it to a thread pool (java.util.concurrent) provides them.
The purpose of a thread pool is just to help you manage your threads. In other words, a thread pool handles the creation and termination of threads for you as well as giving work to idle threads. Threads that are blocked or waiting will not receive new tasks.
Your connection listener will probably be in an infinite loop waiting for connections and thus never be idle (although it could be in a wait state). Since this is the case, the connection listener thread will never be able to receive new tasks so it wouldn't make sense to pool it with the other threads.
Connection listening and connection handling are also two different things. From that perspective the connection listener shouldn't be pooled with the connection handlers either.
SImilar to #larsman's comment, I would do what ever you feel is simpler and clearer. I have tended to use one thread pool for everything because it appeared to be easier to manage. You don't have to do it that way and the listening task can be its own thread.
Can any one guide me with example about Thread and ThreadPool what is difference between them? which is best to use...? what are the drawback on its
Since a thread can only run once, you'd have to use a thread per task. However, creating and starting threads is somewhat expensive and can lead to a situation where too many threads are waiting for execution (don't remember the exact name for this right now) - which further reduces performance.
A thread pool is - as the name suggests - a pool of worker threads which are always running. Those threads then normally take tasks from a list, execute them, then try to take the next task. If there's no task, the thread will wait.
Using a thread pool has several advantages:
you don't have to create a thread per task
you normally have the optimal number of threads for your system (depending on the JVM too)
you can concentrate on writing tasks and use the thread pool to manage the infrastructure
Edit: Here are some quite good articles on concurrency in general: Sutter's Mill, look at the bottom for more links. Although they're primarily written for C/C++ the general concepts are the same, since it also describes the interdependence between concurrency solutions and hardware. A good article to understand concurrency performance issues is this article on drdobbs.com.
A thread pool is a collection of threads which are assigned to perform uniformed tasks.
The advantages of using thread pool pattern is that you can define how many threads is allowed to execute simultaneously. This is to avoid server crashing due to high CPU load or out of memory condition, e.g. the server's hardware capacity can support up to 100 requests per second only.
Database pooling has the similar concept with thread pool.
This pattern is widely used in most of the back-end servers' application process.
While a thread, is a unit which execute a task.
When writing a multithread internet server in java, the main-thread starts new
ones to serve incoming requests in parallel.
Is any problem if the main-thread does not wait ( with .join()) for them?
(It is obviously absurd create a new thread and then, wait for it).
I know that, in a practical situation, you should (or "you must"?) implement a pool
of threads to "re-use" them for new requests when they become idle.
But for small applications, should we use a pool of threads?
You don't need to wait for threads.
They can either complete running on their own (if they've been spawned to perform one particular task), or run indefinitely (e.g. in a server-type environment).
They should handle interrupts and respond to shutdown requests, however. See this article on how to do this correctly.
If you need a set of threads I would use a pool and executor methods since they'll look after thread resource management for you. If you're writing a multi-threaded network server then I would investigating using (say) a servlet container or a framework such as Mina.
The only problem in your approach is that it does not scale well beyond a certain request rate. If the requests are coming in faster than your server is able to handle them, the number of threads will rise continuously. As each thread adds some overhead and uses CPU time, the time for handling each request will get longer, so the problem will get worse (because the number of threads rises even faster). Eventually no request will be able to get handled anymore because all of the CPU time is wasted with overhead. Probably your application will crash.
The alternative is to use a ThreadPool with a fixed upper bound of threads (which depends on the power of the hardware). If there are more requests than the threads are able to handle, some requests will have to wait too long in the request queue, and will fail due to a timeout. But the application will still be able to handle the rest of the incoming requests.
Fortunately the Java API already provides a nice and flexible ThreadPool implementation, see ThreadPoolExecutor. Using this is probably even easier than implementing everything with your original approach, so no reason not to use it.
Thread.join() lets you wait for the Thread to end, which is mostly contrary to what you want when starting a new Thread. At all, you start the new thread to do stuff in parallel to the original Thread.
Only if you really need to wait for the spawned thread to finish, you should join() it.
You should wait for your threads if you need their results or need to do some cleanup which is only possible after all of them are dead, otherwise not.
For the Thread-Pool: I would use it whenever you have some non-fixed number of tasks to run, i.e. if the number depends on the input.
I would like to collect the main ideas of this interesting (for me) question.
I can't totally agree with "you
don't need to wait for threads".
Only in the sense that if you don't
join a thread (and don't have a
pointer to it) once the thread is
done, its resources are freed
(right? I'm not sure).
The use of a thread pool is only
necessary to avoid the overhead of
thread creation, because ...
You can limit the number of parallel
running threads by accounting, with shared variables (and without a thread pool), how many of then
were started but not yet finished.